[honu] checkpoint for docs
parent
13e16d2b12
commit
34689f1711
|
@ -463,8 +463,7 @@ Then, in the pattern above for 'if', 'then' would be bound to the following synt
|
|||
(syntax->datum unparsed))
|
||||
;; if parsed is #f then we don't want to expand to anything that will print
|
||||
;; so use an empty form, begin, `parsed' could be #f because there was no expression
|
||||
;; in the input such as parsing just ";". hygiene should ensure that this variable
|
||||
;; will not collide with anything else
|
||||
;; in the input such as parsing just ";".
|
||||
(with-syntax ([parsed (if (not parsed) #'(begin) parsed)]
|
||||
[(unparsed ...) unparsed])
|
||||
(if (null? (syntax->datum #'(unparsed ...)))
|
||||
|
|
|
@ -1,236 +1,181 @@
|
|||
#lang scribble/doc
|
||||
@(require scribble/manual
|
||||
scribble/bnf
|
||||
(for-label scheme))
|
||||
honu/core/read
|
||||
(for-label honu/core/read))
|
||||
|
||||
@(define lcomma (litchar ", "))
|
||||
|
||||
@title{Honu}
|
||||
|
||||
@defterm{Honu} is a family of languages built on top of Racket. Honu
|
||||
syntax resembles Java. Like Racket, however, Honu has no fixed syntax,
|
||||
because Honu supports extensibility through macros and a base syntax
|
||||
of @as-index{H-expressions}, which are analogous to S-expressions.
|
||||
|
||||
The Honu language currently exists only as an undocumented
|
||||
prototype. Racket's parsing and printing of H-expressions is
|
||||
independent of the Honu language, however, so it is documented here.
|
||||
@defterm{Honu} is a language with Java-like syntax built on top of Racket.
|
||||
Honu's main goal is to support syntactic abstraction mechanisms similar to
|
||||
those in Racket. Currently, Honu is a prototype and may change without notice.
|
||||
|
||||
@table-of-contents[]
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@section{H-expressions}
|
||||
@defmodulelang[honu]
|
||||
|
||||
The Racket reader incorporates an H-expression reader, and Racket's
|
||||
printer also supports printing values in Honu syntax. The reader can
|
||||
be put into H-expression mode either by including @litchar{#hx} in the
|
||||
input stream, or by calling @racket[read-honu] or
|
||||
@racket[read-honu-syntax] instead of @racket[read] or
|
||||
@racket[read-syntax]. Similarly, @racket[print] (or, more precisely,
|
||||
the default print handler) produces Honu output when the
|
||||
@racket[print-honu] parameter is set to @racket[#t].
|
||||
@section{Get started}
|
||||
To use Honu in a module, write the following line at the top of the file.
|
||||
|
||||
When the reader encounters @litchar{#hx}, it reads a single
|
||||
H-expression, and it produces an S-expression that encodes the
|
||||
H-expression. Except for atomic H-expressions, evaluating this
|
||||
S-expression as Racket is unlikely to succeed. In other words,
|
||||
H-expressions are not intended as a replacement for S-expressions to
|
||||
represent Racket code.
|
||||
@racketmod[honu]
|
||||
|
||||
Honu syntax is normally used via @litchar{#lang honu}, which reads
|
||||
H-expressions repeatedly until an end-of-file is encountered, and
|
||||
processes the result as a module in the Honu language.
|
||||
|
||||
Ignoring whitespace, an H-expression is either
|
||||
|
||||
@itemize[
|
||||
|
||||
@item{a number (see @secref["honu:numbers"]);}
|
||||
|
||||
@item{an identifier (see @secref["honu:identifiers"]);}
|
||||
|
||||
@item{a string (see @secref["honu:strings"]);}
|
||||
|
||||
@item{a character (see @secref["honu:chars"]);}
|
||||
|
||||
@item{a sequence of H-expressions between parentheses (see @secref["honu:parens"]);}
|
||||
|
||||
@item{a sequence of H-expressions between square brackets (see @secref["honu:parens"]);}
|
||||
|
||||
@item{a sequence of H-expressions between curly braces (see @secref["honu:parens"]);}
|
||||
|
||||
@item{a comment followed by an H-expression (see @secref["honu:comments"]);}
|
||||
|
||||
@item{@litchar{#;} followed by two H-expressions (see @secref["honu:comments"]);}
|
||||
|
||||
@item{@litchar{#hx} followed by an H-expression;}
|
||||
|
||||
@item{@litchar{#sx} followed by an S-expression (see @secref[#:doc
|
||||
'(lib "scribblings/reference/reference.scrbl") "reader"]).}
|
||||
|
||||
]
|
||||
|
||||
Within a sequence of H-expressions, a sub-sequence between angle
|
||||
brackets is represented specially (see @secref["honu:parens"]).
|
||||
|
||||
Whitespace for H-expressions is as in Racket: any character for which
|
||||
@racket[char-whitespace?] returns true counts as whitespace.
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@subsection[#:tag "honu:numbers"]{Numbers}
|
||||
|
||||
The syntax for Honu numbers is the same as for Java. The S-expression
|
||||
encoding of a particular H-expression number is the obvious Racket
|
||||
number.
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@subsection[#:tag "honu:identifiers"]{Identifiers}
|
||||
|
||||
The syntax for Honu identifiers is the union of Java identifiers plus
|
||||
@litchar{;}, @litchar{,}, and a set of operator identifiers. An
|
||||
@defterm{operator identifier} is any combination of the following
|
||||
characters:
|
||||
|
||||
@t{
|
||||
@hspace[2] @litchar{+} @litchar{-} @litchar{=} @litchar{?}
|
||||
@litchar{:} @litchar{<} @litchar{>} @litchar{.} @litchar{!} @litchar{%}
|
||||
@litchar{^} @litchar{&} @litchar{*} @litchar{/} @litchar{~} @litchar{|}
|
||||
You can use Honu at the REPL on the command line by invoking racket as follows:
|
||||
@verbatim{
|
||||
racket -Iq honu
|
||||
}
|
||||
|
||||
The S-expression encoding of an H-expression identifier is the obvious
|
||||
Racket symbol.
|
||||
@section{Reader}
|
||||
|
||||
Input is parsed to form maximally long identifiers. For example, the
|
||||
input @litchar{int->int;} is parsed as four H-expressions represented
|
||||
by symbols: @racket['int], @racket['->], @racket['int], and
|
||||
@racket['|;|].
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@subsection[#:tag "honu:strings"]{Strings}
|
||||
|
||||
The syntax for an H-expression string is exactly the same as for an
|
||||
S-expression string, and an H-expression string is represented by the
|
||||
obvious Racket string.
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@subsection[#:tag "honu:chars"]{Characters}
|
||||
|
||||
The syntax for an H-expression character is the same as for an
|
||||
H-expression string that has a single content character, except that a
|
||||
@litchar{'} surrounds the character instead of @litchar{"}. The
|
||||
S-expression representation of an H-expression character is the
|
||||
obvious Racket character.
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@subsection[#:tag "honu:parens"]{Parentheses, Brackets, and Braces}
|
||||
|
||||
An H-expression between @litchar{(} and @litchar{)}, @litchar{[} and
|
||||
@litchar{]}, or @litchar["{"] and @litchar["}"] is represented by a
|
||||
Racket list. The first element of the list is @racket['#%parens] for a
|
||||
@litchar{(}...@litchar{)} sequence, @racket['#%brackets] for a
|
||||
@litchar{[}...@litchar{]} sequence, or @racket['#%braces] for a
|
||||
@litchar["{"]...@litchar["}"] sequence. The remaining elements are the
|
||||
Racket representations for the grouped H-expressions in order.
|
||||
|
||||
In an H-expression sequence, when a @litchar{<} is followed by a
|
||||
@litchar{>}, and when nothing between the @litchar{<} and @litchar{>}
|
||||
is an immediate symbol containing a @litchar{=}, @litchar{&}, or
|
||||
@litchar{|}, then the sub-sequence is represented by a Racket list
|
||||
that starts with @racket['#%angles] and continues with the elements of
|
||||
the sub-sequence between the @litchar{<} and @litchar{>}
|
||||
(exclusive). This representation is applied recursively, so that angle
|
||||
brackets can be nested.
|
||||
|
||||
An angle-bracketed sequence by itself is not a single H-expression,
|
||||
since the @litchar{<} by itself is a single H-expression; the
|
||||
angle-bracket conversion is performed only when representing sequences
|
||||
of H-expressions.
|
||||
|
||||
Symbols with a @litchar{=}, @litchar{&}, or @litchar{|} prevent
|
||||
angle-bracket formation because they correspond to operators that
|
||||
normally have lower or equal precedence compared to less-than and
|
||||
greater-than operators.
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@subsection[#:tag "honu:comments"]{Comments}
|
||||
|
||||
An H-expression comment starts with either @litchar{//} or
|
||||
@litchar{/*}. In the former case, the comment runs until a linefeed or
|
||||
return. In the second case, the comment runs until @litchar{*/}, but
|
||||
@litchar{/*}...@litchar{*/} comments can be nested. Comments are
|
||||
treated like whitespace.
|
||||
|
||||
A @litchar{#;} starts an H-expression comment, as in S-expressions. It
|
||||
is followed by an H-expression to be treated as whitespace. Note that
|
||||
@litchar{#;} is equivalent to @litchar{#sx#;#hx}.
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@subsection{Honu Output Printing}
|
||||
|
||||
Some Racket values have a standard H-expression representation. For
|
||||
values with no H-expression representation but with a
|
||||
@racket[read]able S-expression form, the Racket printer produces an
|
||||
S-expression prefixed with @litchar{#sx}. For values with neither an
|
||||
H-expression form nor a @racket[read]able S-expression form, the
|
||||
printer produces output of the form @litchar{#<}...@litchar{>}, as in
|
||||
Racket mode. The @racket[print-honu] parameter controls whether
|
||||
Racket's printer produces Racket or Honu output.
|
||||
|
||||
The values with H-expression forms are as follows:
|
||||
@subsection{Tokens}
|
||||
The Honu reader, @racket[honu-read], tokenizes the input stream according to
|
||||
the following regular expressions.
|
||||
|
||||
@itemize[
|
||||
|
||||
@item{Every real number has an H-expression form, although the
|
||||
representation for an exact, non-integer rational number is
|
||||
actually three H-expressions, where the middle H-expression is
|
||||
@racket[/].}
|
||||
|
||||
@item{Every character string is represented the same in H-expression
|
||||
form as its S-expression form.}
|
||||
|
||||
@item{Every character is represented like a single-character string,
|
||||
but (1) using a @litchar{'} as the delimiter instead of
|
||||
@litchar{"}, and (2) protecting a @litchar{'} character content
|
||||
with a @litchar{\} instead of protecting @litchar{"} character
|
||||
content.}
|
||||
|
||||
@item{A list is represented with the H-expression sequence
|
||||
@litchar{list(}@nonterm{v}@|lcomma|...@litchar{)},
|
||||
where each @nonterm{v} is the representation of each element of
|
||||
the list.}
|
||||
|
||||
@item{A pair that is not a list is represented with the H-expression
|
||||
sequence
|
||||
@litchar{cons(}@nonterm{v1}@|lcomma|@nonterm{v2}@litchar{)},
|
||||
where @nonterm{v1} and @nonterm{v2} are the representations of
|
||||
the pair elements.}
|
||||
|
||||
@item{A vector's representation depends on the value of the
|
||||
@racket[print-vector-length] parameter. If it is @racket[#f],
|
||||
the vector is represented with the H-expression sequence
|
||||
@litchar{vectorN(}@nonterm{v}@|lcomma|...@litchar{)}, where
|
||||
each @nonterm{v} is the representation of each element of the
|
||||
vector. If @racket[print-vector-length] is set to @racket[#t],
|
||||
the vector is represented with the H-expression sequence
|
||||
@litchar{vectorN(}@nonterm{n}@|lcomma|@nonterm{v}@|lcomma|...@litchar{)},
|
||||
where @nonterm{n} is the length of the vector and each
|
||||
@nonterm{v} is the representation of each element of the
|
||||
vector, and multiple instances of the same value at the end of
|
||||
the vector are represented by a single @nonterm{v}.}
|
||||
|
||||
@item{The empty list is represented as the H-expression
|
||||
@litchar{null}.}
|
||||
|
||||
@item{True is represented as the H-expression @litchar{true}.}
|
||||
|
||||
@item{False is represented as the H-expression @litchar{false}.}
|
||||
|
||||
@item{Identifiers are [a-zA-Z_?][a-zA-Z_?0-9]*}
|
||||
@item{Strings are "[^"]*"}
|
||||
@item{Numbers are \d+(\.\d+)?}
|
||||
@item{And the following tokens + = * / - ^ || | && <= >= <- < > !
|
||||
:: := : ; ` ' . , ( ) { } [ ]}
|
||||
]
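
The token classes listed above can be sketched as a small standalone tokenizer. This is only an illustration under stated assumptions, not the code behind @racket[honu-read]: the names @tt{operator-tokens}, @tt{token-patterns}, and @tt{tokenize} are hypothetical, the input is a string rather than a port, and two-character tokens are tried before their one-character prefixes.

```racket
#lang racket/base
;; Sketch only: these names are hypothetical and not part of honu/core/read.
(require racket/string)

;; Literal tokens from the list above. Two-character tokens come first so
;; that "<=" is not split into "<" and "=".
(define operator-tokens
  '("||" "&&" "<=" ">=" "<-" "::" ":="
    "+" "=" "*" "/" "-" "^" "|" "<" ">" "!" ":" ";"
    "`" "'" "." "," "(" ")" "{" "}" "[" "]"))

;; Each pattern is anchored with ^ so it only matches at the scan position.
(define token-patterns
  (list (cons 'identifier #px"^[a-zA-Z_?][a-zA-Z_?0-9]*")
        (cons 'string     #px"^\"[^\"]*\"")
        (cons 'number     #px"^\\d+(\\.\\d+)?")
        (cons 'operator
              (regexp (string-append
                       "^(?:"
                       (string-join (map regexp-quote operator-tokens) "|")
                       ")")))))

;; Returns a list of (kind . text) pairs, skipping whitespace.
(define (tokenize str)
  (let loop ([pos 0] [tokens '()])
    (cond
      [(= pos (string-length str)) (reverse tokens)]
      [(regexp-match-positions #px"^\\s+" str pos)
       => (lambda (m) (loop (cdar m) tokens))]
      [else
       ;; Try each token class in order; keep the first one that matches.
       (define hit
         (for/or ([p (in-list token-patterns)])
           (define m (regexp-match-positions (cdr p) str pos))
           (and m (cons (car p) (car m)))))
       (unless hit
         (error 'tokenize "unexpected character at position ~a" pos))
       (define end (cddr hit))
       (loop end (cons (cons (car hit) (substring str pos end)) tokens))])))

(tokenize "x <= 5.5")
;; => '((identifier . "x") (operator . "<=") (number . "5.5"))
```

Trying the token classes in a fixed order keeps the sketch short; a production lexer would more likely be generated from a lexer specification.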
|
||||
|
||||
@subsection{Structure}
|
||||
|
||||
After tokenization, a Honu program is converted into a tree with minimal
|
||||
structure. Enclosing tokens will be grouped into a single object represented as
|
||||
an s-expression. Enclosing tokens are pairs of (), {}, and [].
|
||||
|
||||
Consider the following stream of tokens
|
||||
|
||||
@codeblock|{
|
||||
x ( 5 + 2 )
|
||||
}|
|
||||
|
||||
This will be converted into
|
||||
@codeblock|{
|
||||
(x (#%parens 5 + 2))
|
||||
}|
|
||||
|
||||
{} will be converted to (#%braces ...) and [] will be converted to (#%brackets
|
||||
...)
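
The grouping step can be sketched as follows. The helper names @tt{read-items} and @tt{group-tokens} are hypothetical, and tokens are modeled as plain symbols and numbers rather than the position tokens produced by the lexer, so this illustrates the idea rather than the reader's actual code.

```racket
#lang racket/base
;; Sketch only: `delimiters`, `read-items`, and `group-tokens` are
;; hypothetical names, not part of the Honu reader.

;; Maps each opening delimiter to its closing delimiter and its tag.
(define delimiters
  (hash '|(| '(|)| . #%parens)
        '|[| '(|]| . #%brackets)
        '|{| '(|}| . #%braces)))

(define closers '(|)| |]| |}|))

;; Reads tokens until `close` (or until end of input when `close` is #f).
;; Returns the grouped items and the tokens that follow the closer.
(define (read-items tokens close)
  (let loop ([tokens tokens] [items '()])
    (cond
      [(null? tokens)
       (when close (error 'group-tokens "missing ~a" close))
       (values (reverse items) '())]
      [(eq? (car tokens) close)
       (values (reverse items) (cdr tokens))]
      [(hash-ref delimiters (car tokens) #f)
       => (lambda (info)
            ;; Recur for the enclosed tokens, then tag the group.
            (define-values (inner rest)
              (read-items (cdr tokens) (car info)))
            (loop rest (cons (cons (cdr info) inner) items)))]
      [(memq (car tokens) closers)
       (error 'group-tokens "unexpected ~a" (car tokens))]
      [else (loop (cdr tokens) (cons (car tokens) items))])))

(define (group-tokens tokens)
  (define-values (items rest) (read-items tokens #f))
  items)

(group-tokens '(x |(| 5 + 2 |)|))
;; => '(x (#%parens 5 + 2))
```

Nested delimiters fall out of the recursion: each opener starts a new call that consumes tokens up to its matching closer.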
|
||||
|
||||
@defproc[(honu-read (port port?)) any]{
|
||||
Read an s-expression from the given port.
|
||||
}
|
||||
|
||||
@defproc[(honu-read-syntax (name any) (port port?)) any]{
|
||||
Read a syntax object from the given port.
|
||||
}
|
||||
|
||||
@defproc[(honu-lexer (port port?)) (listof position-token?)]{
|
||||
Tokenize a port into a stream of honu tokens.
|
||||
}
|
||||
|
||||
@section{Parsing}
|
||||
|
||||
Honu is parsed using an algorithm based primarily on operator precedence. The
|
||||
main focus of the operator precedence algorithm is to support infix operators.
|
||||
In short, the algorithm operates in the following way
|
||||
|
||||
@itemlist[
|
||||
@item{1. Parse an @tech{expression}.}
@item{2. Check for a binary operator. If one is found, continue to step 3;
otherwise return the expression from step 1 immediately.}
@item{3. Parse another @tech{expression}.}
@item{4. Check for a binary operator. If one is found and its precedence is
higher than that of the operator found in step 2, continue parsing from
step 3. If the precedence is not higher, or no operator is found, build an
infix expression from the left-hand expression from step 1, the binary operator
from step 2, and the right-hand expression from step 3.}
|
||||
]
|
||||
|
||||
Parsing maintains the following registers:
|
||||
@itemlist[
|
||||
@item{@bold{left} - a function that takes the right hand side of an expression and
|
||||
returns the infix expression by combining the left hand side and the
|
||||
operator.}
|
||||
@item{@bold{current} - the current right hand side}
|
||||
@item{@bold{precedence} - represents the current precedence level}
|
||||
@item{@bold{stream} - stream of tokens to parse}
|
||||
]
|
||||
|
||||
This algorithm is illustrated with the following example. Consider the raw
|
||||
stream of tokens
|
||||
|
||||
@codeblock|{ 1 + 2 * 3 - 9 }|
|
||||
|
||||
@tabular[
|
||||
@list[
|
||||
@list["left" (hspace 1) "current" (hspace 1) "precedence" (hspace 1) "stream"]
|
||||
@list[@racket[(lambda (x) x)] (hspace 1)
|
||||
@racket[#f] (hspace 1)
|
||||
@racket[0] (hspace 1)
|
||||
@codeblock|{1 + 2 * 3 - 9}|]
|
||||
@list[@racket[(lambda (x) x)] (hspace 1)
|
||||
@racket[1] (hspace 1)
|
||||
@racket[0] (hspace 1)
|
||||
@codeblock|{+ 2 * 3 - 9}|]
|
||||
@list[@racket[(lambda (x) #'(+ 1 x))] (hspace 1)
|
||||
@racket[#f] (hspace 1)
|
||||
@racket[1] (hspace 1)
|
||||
@codeblock|{2 * 3 - 9}|]
|
||||
@list[@racket[(lambda (x) #'(+ 1 x))] (hspace 1)
|
||||
@racket[2] (hspace 1)
|
||||
@racket[1] (hspace 1)
|
||||
@codeblock|{* 3 - 9}|]
|
||||
@list[@racket[(lambda (x) (left #'(* 2 x)))] (hspace 1)
|
||||
@racket[2] (hspace 1)
|
||||
@racket[2] (hspace 1)
|
||||
@codeblock|{3 - 9}|]
|
||||
@list[@racket[(lambda (x) (left #'(* 2 x)))] (hspace 1)
|
||||
@racket[3] (hspace 1)
|
||||
@racket[2] (hspace 1)
|
||||
@codeblock|{- 9}|]
|
||||
@list[@racket[(lambda (x) #'(- (+ 1 (* 2 3)) x))] (hspace 1)
|
||||
@racket[#f] (hspace 1)
|
||||
@racket[1] (hspace 1)
|
||||
@codeblock|{9}|]
|
||||
@list[@racket[(lambda (x) #'(- (+ 1 (* 2 3)) x))] (hspace 1)
|
||||
@racket[9] (hspace 1)
|
||||
@racket[1] (hspace 1)
|
||||
@codeblock|{}|]
|
||||
]
|
||||
]
|
||||
|
||||
When the stream of tokens is empty the @bold{current} register is passed as an
|
||||
argument to the @bold{left} function which ultimately produces the expression
|
||||
@codeblock|{(- (+ 1 (* 2 3)) 9)}|
|
||||
|
||||
In this example @racket[+] and @racket[-] both have a precedence of 1 while
|
||||
@racket[*] has a precedence of 2. Currently, precedences can be any number that
|
||||
can be compared with @racket[<=].
|
||||
|
||||
The example takes some liberties with respect to how the actual implementation
|
||||
works. In particular the binary operators are syntax transformers that accept
|
||||
the left and right hand expressions as parameters and return new syntax objects.
|
||||
Also when the @racket[*] operator is parsed the @bold{left} function for
|
||||
@racket[+] is nested inside the new function for @racket[*].
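
The register-based description can be sketched as a small recursive function. This is an illustration under stated assumptions, not Honu's parser: operators are plain symbols instead of syntax transformers, the result is a quoted S-expression instead of a syntax object, and the names @tt{precedences}, @tt{parse-expression}, and @tt{parse} are hypothetical. Recursion plays the role of nesting one @bold{left} function inside another.

```racket
#lang racket/base
;; Sketch only: operators are plain symbols here, not syntax transformers,
;; and `precedences`, `parse-expression`, and `parse` are hypothetical names.

(define precedences (hash '+ 1 '- 1 '* 2))

;; stream     - remaining tokens (numbers and operator symbols)
;; precedence - precedence of the operator whose right-hand side is being parsed
;; left       - function from a right-hand side to the finished infix expression
;; current    - the right-hand side parsed so far, or #f
;; Returns the parsed expression and the unconsumed tokens.
(define (parse-expression stream precedence left current)
  (cond
    [(null? stream) (values (left current) '())]
    [(hash-ref precedences (car stream) #f)
     => (lambda (new-precedence)
          (define operator (car stream))
          (cond
            [(> new-precedence precedence)
             ;; Tighter operator: it takes `current` as its left-hand side.
             (define-values (parsed rest)
               (parse-expression (cdr stream) new-precedence
                                 (lambda (right) (list operator current right))
                                 #f))
             ;; Continue at the outer level with the combined expression.
             (parse-expression rest precedence left parsed)]
            [else
             ;; Same or looser operator: finish this level, leave it unread.
             (values (left current) stream)]))]
    [else
     (parse-expression (cdr stream) precedence left (car stream))]))

(define (parse stream)
  (define-values (parsed rest)
    (parse-expression stream 0 (lambda (x) x) #f))
  parsed)

(parse '(1 + 2 * 3 - 9))
;; => '(- (+ 1 (* 2 3)) 9)
```

Returning the unread tokens to the caller is what lets an operator of intermediate precedence attach to the correct left-hand side once a tighter sub-expression has been finished.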
|
||||
|
||||
An @deftech{expression} can be one of the following
|
||||
@itemlist[
|
||||
@item{@bold{datum} - number, string, or symbol. @codeblock|{5}|}
|
||||
@item{@bold{macro} - a symbol bound to a syntax transformer.
|
||||
@codeblock|{cond x = 5: true, else: false}|}
|
||||
@item{@bold{stop} - a symbol which immediately ends the current expression.
|
||||
Currently these are , ; and :}
|
||||
@item{@bold{lambda expression} - an identifier followed by @racket[(id ...)]
|
||||
followed by a block of code in braces. @codeblock|{add(x, y){ x + y }}|}
|
||||
@item{@bold{function application} - an expression followed by @racket[(arg
|
||||
...)]. @codeblock|{f(2, 2)}|}
|
||||
@item{@bold{list comprehension} - @codeblock|{[x + 1: x <- [1, 2, 3]]}|}
|
||||
@item{@bold{block of code} - a series of expressions wrapped in braces.}
|
||||
@item{@bold{expression grouping} - any expression inside a set of parentheses.
|
||||
@codeblock|{(1 + 1) * 2}|}
|
||||
]
|
||||
|
||||
@section{Macros}
|
||||
@section{Language}
|
||||
@section{Examples}
|
||||
|
|