[honu] checkpoint for docs
parent
13e16d2b12
commit
34689f1711
|
@ -463,8 +463,7 @@ Then, in the pattern above for 'if', 'then' would be bound to the following synt
|
||||||
(syntax->datum unparsed))
|
(syntax->datum unparsed))
|
||||||
;; if parsed is #f then we don't want to expand to anything that will print
|
;; if parsed is #f then we don't want to expand to anything that will print
|
||||||
;; so use an empty form, begin, `parsed' could be #f because there was no expression
|
;; so use an empty form, begin, `parsed' could be #f because there was no expression
|
||||||
;; in the input such as parsing just ";". hygiene should ensure that this variable
|
;; in the input such as parsing just ";".
|
||||||
;; will not collide with anything else
|
|
||||||
(with-syntax ([parsed (if (not parsed) #'(begin) parsed)]
|
(with-syntax ([parsed (if (not parsed) #'(begin) parsed)]
|
||||||
[(unparsed ...) unparsed])
|
[(unparsed ...) unparsed])
|
||||||
(if (null? (syntax->datum #'(unparsed ...)))
|
(if (null? (syntax->datum #'(unparsed ...)))
|
||||||
|
|
|
@ -1,236 +1,181 @@
|
||||||
#lang scribble/doc
|
#lang scribble/doc
|
||||||
@(require scribble/manual
|
@(require scribble/manual
|
||||||
scribble/bnf
|
scribble/bnf
|
||||||
(for-label scheme))
|
honu/core/read
|
||||||
|
(for-label honu/core/read))
|
||||||
|
|
||||||
@(define lcomma (litchar ", "))
|
@(define lcomma (litchar ", "))
|
||||||
|
|
||||||
@title{Honu}
|
@title{Honu}
|
||||||
|
|
||||||
@defterm{Honu} is a family of languages built on top of Racket. Honu
|
@defterm{Honu} is a language with Java-like syntax built on top of Racket.
|
||||||
syntax resembles Java. Like Racket, however, Honu has no fixed syntax,
|
Honu's main goal is to support syntactic abstraction mechanisms similar to
|
||||||
because Honu supports extensibility through macros and a base syntax
|
Racket's. Currently, Honu is a prototype and may change without notice.
|
||||||
of @as-index{H-expressions}, which are analogous to S-expressions.
|
|
||||||
|
|
||||||
The Honu language currently exists only as an undocumented
|
|
||||||
prototype. Racket's parsing and printing of H-expressions is
|
|
||||||
independent of the Honu language, however, so it is documented here.
|
|
||||||
|
|
||||||
@table-of-contents[]
|
@table-of-contents[]
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
@; ----------------------------------------------------------------------
|
||||||
|
|
||||||
@section{H-expressions}
|
@defmodulelang[honu]
|
||||||
|
|
||||||
The Racket reader incorporates an H-expression reader, and Racket's
|
@section{Get started}
|
||||||
printer also supports printing values in Honu syntax. The reader can
|
To use Honu in a module, write the following line at the top of the file.
|
||||||
be put into H-expression mode either by including @litchar{#hx} in the
|
|
||||||
input stream, or by calling @racket[read-honu] or
|
|
||||||
@racket[read-honu-syntax] instead of @racket[read] or
|
|
||||||
@racket[read-syntax]. Similarly, @racket[print] (or, more precisely,
|
|
||||||
the default print handler) produces Honu output when the
|
|
||||||
@racket[print-honu] parameter is set to @racket[#t].
|
|
||||||
|
|
||||||
When the reader encounters @litchar{#hx}, it reads a single
|
@racketmod[honu]
|
||||||
H-expression, and it produces an S-expression that encodes the
|
|
||||||
H-expression. Except for atomic H-expressions, evaluating this
|
|
||||||
S-expression as Racket is unlikely to succeed. In other words,
|
|
||||||
H-expressions are not intended as a replacement for S-expressions to
|
|
||||||
represent Racket code.
|
|
||||||
|
|
||||||
Honu syntax is normally used via @litchar{#lang honu}, which reads
|
You can use Honu at the REPL on the command line by invoking racket like so:
|
||||||
H-expressions repeatedly until an end-of-file is encountered, and
|
@verbatim{
|
||||||
processes the result as a module in the Honu language.
|
racket -Iq honu
|
||||||
|
|
||||||
Ignoring whitespace, an H-expression is either
|
|
||||||
|
|
||||||
@itemize[
|
|
||||||
|
|
||||||
@item{a number (see @secref["honu:numbers"]);}
|
|
||||||
|
|
||||||
@item{an identifier (see @secref["honu:identifiers"]);}
|
|
||||||
|
|
||||||
@item{a string (see @secref["honu:strings"]);}
|
|
||||||
|
|
||||||
@item{a character (see @secref["honu:chars"]);}
|
|
||||||
|
|
||||||
@item{a sequence of H-expressions between parentheses (see @secref["honu:parens"]);}
|
|
||||||
|
|
||||||
@item{a sequence of H-expressions between square brackets (see @secref["honu:parens"]);}
|
|
||||||
|
|
||||||
@item{a sequence of H-expressions between curly braces (see @secref["honu:parens"]);}
|
|
||||||
|
|
||||||
@item{a comment followed by an H-expression (see @secref["honu:comments"]);}
|
|
||||||
|
|
||||||
@item{@litchar{#;} followed by two H-expressions (see @secref["honu:comments"]);}
|
|
||||||
|
|
||||||
@item{@litchar{#hx} followed by an H-expression;}
|
|
||||||
|
|
||||||
@item{@litchar{#sx} followed by an S-expression (see @secref[#:doc
|
|
||||||
'(lib "scribblings/reference/reference.scrbl") "reader"]).}
|
|
||||||
|
|
||||||
]
|
|
||||||
|
|
||||||
Within a sequence of H-expressions, a sub-sequence between angle
|
|
||||||
brackets is represented specially (see @secref["honu:parens"]).
|
|
||||||
|
|
||||||
Whitespace for H-expressions is as in Racket: any character for which
|
|
||||||
@racket[char-whitespace?] returns true counts as whitespace.
|
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
|
||||||
|
|
||||||
@subsection[#:tag "honu:numbers"]{Numbers}
|
|
||||||
|
|
||||||
The syntax for Honu numbers is the same as for Java. The S-expression
|
|
||||||
encoding of a particular H-expression number is the obvious Racket
|
|
||||||
number.
|
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
|
||||||
|
|
||||||
@subsection[#:tag "honu:identifiers"]{Identifiers}
|
|
||||||
|
|
||||||
The syntax for Honu identifiers is the union of Java identifiers plus
|
|
||||||
@litchar{;}, @litchar{,}, and a set of operator identifiers. An
|
|
||||||
@defterm{operator identifier} is any combination of the following
|
|
||||||
characters:
|
|
||||||
|
|
||||||
@t{
|
|
||||||
@hspace[2] @litchar{+} @litchar{-} @litchar{=} @litchar{?}
|
|
||||||
@litchar{:} @litchar{<} @litchar{>} @litchar{.} @litchar{!} @litchar{%}
|
|
||||||
@litchar{^} @litchar{&} @litchar{*} @litchar{/} @litchar{~} @litchar{|}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
The S-expression encoding of an H-expression identifier is the obvious
|
@section{Reader}
|
||||||
Racket symbol.
|
|
||||||
|
|
||||||
Input is parsed to form maximally long identifiers. For example, the
|
@subsection{Tokens}
|
||||||
input @litchar{int->int;} is parsed as four H-expressions represented
|
The Honu reader, @racket[honu-read], will tokenize the input stream according to
|
||||||
by symbols: @racket['int], @racket['->], @racket['int], and
|
the following regular expressions.
|
||||||
@racket['|;|].
|
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
|
||||||
|
|
||||||
@subsection[#:tag "honu:strings"]{Strings}
|
|
||||||
|
|
||||||
The syntax for an H-expression string is exactly the same as for an
|
|
||||||
S-expression string, and an H-expression string is represented by the
|
|
||||||
obvious Racket string.
|
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
|
||||||
|
|
||||||
@subsection[#:tag "honu:chars"]{Characters}
|
|
||||||
|
|
||||||
The syntax for an H-expression character is the same as for an
|
|
||||||
H-expression string that has a single content character, except that a
|
|
||||||
@litchar{'} surrounds the character instead of @litchar{"}. The
|
|
||||||
S-expression representation of an H-expression character is the
|
|
||||||
obvious Racket character.
|
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
|
||||||
|
|
||||||
@subsection[#:tag "honu:parens"]{Parentheses, Brackets, and Braces}
|
|
||||||
|
|
||||||
An H-expression between @litchar{(} and @litchar{)}, @litchar{[} and
|
|
||||||
@litchar{]}, or @litchar["{"] and @litchar["}"] is represented by a
|
|
||||||
Racket list. The first element of the list is @racket['#%parens] for a
|
|
||||||
@litchar{(}...@litchar{)} sequence, @racket['#%brackets] for a
|
|
||||||
@litchar{[}...@litchar{]} sequence, or @racket['#%braces] for a
|
|
||||||
@litchar["{"]...@litchar["}"] sequence. The remaining elements are the
|
|
||||||
Racket representations for the grouped H-expressions in order.
|
|
||||||
|
|
||||||
In an H-expression sequence, when a @litchar{<} is followed by a
|
|
||||||
@litchar{>}, and when nothing between the @litchar{<} and @litchar{>}
|
|
||||||
is an immediate symbol containing a @litchar{=}, @litchar{&}, or
|
|
||||||
@litchar{|}, then the sub-sequence is represented by a Racket list
|
|
||||||
that starts with @racket['#%angles] and continues with the elements of
|
|
||||||
the sub-sequence between the @litchar{<} and @litchar{>}
|
|
||||||
(exclusive). This representation is applied recursively, so that angle
|
|
||||||
brackets can be nested.
|
|
||||||
|
|
||||||
An angle-bracketed sequence by itself is not a single H-expression,
|
|
||||||
since the @litchar{<} by itself is a single H-expression; the
|
|
||||||
angle-bracket conversion is performed only when representing sequences
|
|
||||||
of H-expressions.
|
|
||||||
|
|
||||||
Symbols with a @litchar{=}, @litchar{&}, or @litchar{|} prevent
|
|
||||||
angle-bracket formation because they correspond to operators that
|
|
||||||
normally have lower or equal precedence compared to less-than and
|
|
||||||
greater-than operators.
|
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
|
||||||
|
|
||||||
@subsection[#:tag "honu:comments"]{Comments}
|
|
||||||
|
|
||||||
An H-expression comment starts with either @litchar{//} or
|
|
||||||
@litchar{/*}. In the former case, the comment runs until a linefeed or
|
|
||||||
return. In the second case, the comment runs until @litchar{*/}, but
|
|
||||||
@litchar{/*}...@litchar{*/} comments can be nested. Comments are
|
|
||||||
treated like whitespace.
|
|
||||||
|
|
||||||
A @litchar{#;} starts an H-expression comment, as in S-expressions. It
|
|
||||||
is followed by an H-expression to be treated as whitespace. Note that
|
|
||||||
@litchar{#;} is equivalent to @litchar{#sx#;#hx}.
|
|
||||||
|
|
||||||
@; ----------------------------------------------------------------------
|
|
||||||
|
|
||||||
@subsection{Honu Output Printing}
|
|
||||||
|
|
||||||
Some Racket values have a standard H-expression representation. For
|
|
||||||
values with no H-expression representation but with a
|
|
||||||
@racket[read]able S-expression form, the Racket printer produces an
|
|
||||||
S-expression prefixed with @litchar{#sx}. For values with neither an
|
|
||||||
H-expression form nor a @racket[read]able S-expression form, the
|
|
||||||
printer produces output of the form @litchar{#<}...@litchar{>}, as in
|
|
||||||
Racket mode. The @racket[print-honu] parameter controls whether
|
|
||||||
Racket's printer produces Racket or Honu output.
|
|
||||||
|
|
||||||
The values with H-expression forms are as follows:
|
|
||||||
|
|
||||||
@itemize[
|
@itemize[
|
||||||
|
@item{Identifiers are [a-zA-Z_?][a-zA-Z_?0-9]*}
|
||||||
@item{Every real number has an H-expression form, although the
|
@item{Strings are "[^"]*"}
|
||||||
representation for an exact, non-integer rational number is
|
@item{Numbers are \d+(\.\d+)?}
|
||||||
actually three H-expressions, where the middle H-expression is
|
@item{And the following literal tokens: + = * / - ^ || | && <= >= <- < > !
|
||||||
@racket[/].}
|
:: := : ; ` ' . , ( ) { } [ ]}
|
||||||
|
|
||||||
@item{Every character string is represented the same in H-expression
|
|
||||||
form as its S-expression form.}
|
|
||||||
|
|
||||||
@item{Every character is represented like a single-character string,
|
|
||||||
but (1) using a @litchar{'} as the delimiter instead of
|
|
||||||
@litchar{"}, and (2) protecting a @litchar{'} character content
|
|
||||||
with a @litchar{\} instead of protecting @litchar{"} character
|
|
||||||
content.}
|
|
||||||
|
|
||||||
@item{A list is represented with the H-expression sequence
|
|
||||||
@litchar{list(}@nonterm{v}@|lcomma|...@litchar{)},
|
|
||||||
where each @nonterm{v} is the representation of each element of
|
|
||||||
the list.}
|
|
||||||
|
|
||||||
@item{A pair that is not a list is represented with the H-expression
|
|
||||||
sequence
|
|
||||||
@litchar{cons(}@nonterm{v1}@|lcomma|@nonterm{v2}@litchar{)},
|
|
||||||
where @nonterm{v1} and @nonterm{v2} are the representations of
|
|
||||||
the pair elements.}
|
|
||||||
|
|
||||||
@item{A vector's representation depends on the value of the
|
|
||||||
@racket[print-vector-length] parameter. If it is @racket[#f],
|
|
||||||
the vector is represented with the H-expression sequence
|
|
||||||
@litchar{vectorN(}@nonterm{v}@|lcomma|...@litchar{)}, where
|
|
||||||
each @nonterm{v} is the representation of each element of the
|
|
||||||
vector. If @racket[print-vector-length] is set to @racket[#t],
|
|
||||||
the vector is represented with the H-expression sequence
|
|
||||||
@litchar{vectorN(}@nonterm{n}@|lcomma|@nonterm{v}@|lcomma|...@litchar{)},
|
|
||||||
where @nonterm{n} is the length of the vector and each
|
|
||||||
@nonterm{v} is the representation of each element of the
|
|
||||||
vector, and multiple instances of the same value at the end of
|
|
||||||
the vector are represented by a single @nonterm{v}.}
|
|
||||||
|
|
||||||
@item{The empty list is represented as the H-expression
|
|
||||||
@litchar{null}.}
|
|
||||||
|
|
||||||
@item{True is represented as the H-expression @litchar{true}.}
|
|
||||||
|
|
||||||
@item{False is represented as the H-expression @litchar{false}.}
|
|
||||||
|
|
||||||
]
|
]
|
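The token rules listed in the new text can be tried out with a small standalone tokenizer. This is only a sketch in Python, not the Racket code behind `honu-read`: the names `tokenize`, `TOKEN_RE`, and `OPERATORS` are hypothetical, and skipping whitespace between tokens is an assumption the list above does not spell out.

```python
import re

# Literal tokens from the list above. Multi-character tokens come first so
# that "<=" is matched as one token rather than "<" followed by "=".
OPERATORS = ["||", "&&", "<=", ">=", "<-", "::", ":=",
             "+", "=", "*", "/", "-", "^", "|", "<", ">", "!",
             ":", ";", "`", "'", ".", ",", "(", ")", "{", "}", "[", "]"]

OPERATOR_PATTERN = "|".join(re.escape(op) for op in OPERATORS)

TOKEN_RE = re.compile(
    r"(?P<identifier>[a-zA-Z_?][a-zA-Z_?0-9]*)"
    r'|(?P<string>"[^"]*")'
    r"|(?P<number>\d+(?:\.\d+)?)"
    r"|(?P<operator>" + OPERATOR_PATTERN + r")"
    r"|(?P<space>\s+)"  # assumption: whitespace only separates tokens
)


def tokenize(text):
    """Split text into (kind, lexeme) pairs using the rules above."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x <= 5.25"))
```

Listing the multi-character tokens before their single-character prefixes is what keeps `<-` and `:=` intact, since regular-expression alternation takes the first alternative that matches.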
||||||
|
|
||||||
|
@subsection{Structure}
|
||||||
|
|
||||||
|
After tokenization, a Honu program will be converted into a tree with minimal
|
||||||
|
structure. Enclosing tokens will be grouped into a single object represented as
|
||||||
|
an s-expression. Enclosing tokens are pairs of (), {}, and [].
|
||||||
|
|
||||||
|
Consider the following stream of tokens
|
||||||
|
|
||||||
|
@codeblock|{
|
||||||
|
x ( 5 + 2 )
|
||||||
|
}|
|
||||||
|
|
||||||
|
This will be converted into
|
||||||
|
@codeblock|{
|
||||||
|
(x (#%parens 5 + 2))
|
||||||
|
}|
|
||||||
|
|
||||||
|
{} will be converted to (#%braces ...) and [] will be converted to
|
||||||
|
(#%brackets ...).
|
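The grouping step described above can be sketched as a short recursive function. This is an illustration in Python, not the reader's actual Racket code: the name `group` is hypothetical, and raising an error on unbalanced input is an assumption, since the text does not say how the reader reacts.

```python
# Opening tokens, their matching closers, and the tag that heads each group.
CLOSERS = {"(": ")", "{": "}", "[": "]"}
TAGS = {"(": "#%parens", "{": "#%braces", "[": "#%brackets"}


def group(tokens):
    """Nest a flat list of token strings into lists tagged by bracket kind."""

    def parse(index, closer):
        items = []
        while index < len(tokens):
            token = tokens[index]
            if token in CLOSERS:
                # Recurse to collect everything up to the matching closer.
                inner, index = parse(index + 1, CLOSERS[token])
                items.append([TAGS[token]] + inner)
            elif token in CLOSERS.values():
                if token != closer:
                    raise ValueError(f"unexpected {token!r}")
                return items, index + 1
            else:
                items.append(token)
                index += 1
        if closer is not None:
            raise ValueError(f"missing {closer!r}")
        return items, index

    tree, _ = parse(0, None)
    return tree


print(group(["x", "(", "5", "+", "2", ")"]))
```

Python lists stand in for s-expressions here, so the example stream `x ( 5 + 2 )` comes out as `['x', ['#%parens', '5', '+', '2']]`.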
||||||
|
|
||||||
|
@defproc[(honu-read (port port?)) any]{
|
||||||
|
Read an s-expression from the given port.
|
||||||
|
}
|
||||||
|
|
||||||
|
@defproc[(honu-read-syntax (name any) (port port?)) any]{
|
||||||
|
Read a syntax object from the given port.
|
||||||
|
}
|
||||||
|
|
||||||
|
@defproc[(honu-lexer (port port?)) (listof position-token?)]{
|
||||||
|
Tokenize a port into a stream of Honu tokens.
|
||||||
|
}
|
||||||
|
|
||||||
|
@section{Parsing}
|
||||||
|
|
||||||
|
Honu is parsed using an algorithm based primarily on operator precedence. The
|
||||||
|
main focus of the operator precedence algorithm is to support infix operators.
|
||||||
|
In short, the algorithm operates in the following way:
|
||||||
|
|
||||||
|
@itemlist[
|
||||||
|
@item{1. Parse an @tech{expression}.}
|
||||||
|
@item{2. Check for a binary operator. If one is found, continue to step 3;
|
||||||
|
otherwise, return the expression from step 1 immediately.}
|
||||||
|
@item{3. Parse another @tech{expression}.}
|
||||||
|
@item{4. Check for a binary operator. If one is found, check whether its precedence is
|
||||||
|
higher than that of the operator found in step 2, and if so, continue parsing from
|
||||||
|
step 3. If the precedence is lower, or no operator is found, build an
|
||||||
|
infix expression from the left-hand expression from step 1, the binary operator
|
||||||
|
from step 2, and the right-hand expression from step 3.}
|
||||||
|
]
|
||||||
|
|
||||||
|
Parsing will maintain the following registers:
|
||||||
|
@itemlist[
|
||||||
|
@item{@bold{left} - a function that takes the right hand side of an expression and
|
||||||
|
returns the infix expression by combining the left hand side and the
|
||||||
|
operator.}
|
||||||
|
@item{@bold{current} - the current right hand side}
|
||||||
|
@item{@bold{precedence} - represents the current precedence level}
|
||||||
|
@item{@bold{stream} - stream of tokens to parse}
|
||||||
|
]
|
||||||
|
|
||||||
|
This algorithm is illustrated with the following example. Consider the raw
|
||||||
|
stream of tokens
|
||||||
|
|
||||||
|
@codeblock|{ 1 + 2 * 3 - 9 }|
|
||||||
|
|
||||||
|
@tabular[
|
||||||
|
@list[
|
||||||
|
@list["left" (hspace 1) "current" (hspace 1) "precedence" (hspace 1) "stream"]
|
||||||
|
@list[@racket[(lambda (x) x)] (hspace 1)
|
||||||
|
@racket[#f] (hspace 1)
|
||||||
|
@racket[0] (hspace 1)
|
||||||
|
@codeblock|{1 + 2 * 3 - 9}|]
|
||||||
|
@list[@racket[(lambda (x) x)] (hspace 1)
|
||||||
|
@racket[1] (hspace 1)
|
||||||
|
@racket[0] (hspace 1)
|
||||||
|
@codeblock|{+ 2 * 3 - 9}|]
|
||||||
|
@list[@racket[(lambda (x) #'(+ 1 x))] (hspace 1)
|
||||||
|
@racket[#f] (hspace 1)
|
||||||
|
@racket[1] (hspace 1)
|
||||||
|
@codeblock|{2 * 3 - 9}|]
|
||||||
|
@list[@racket[(lambda (x) #'(+ 1 x))] (hspace 1)
|
||||||
|
@racket[2] (hspace 1)
|
||||||
|
@racket[1] (hspace 1)
|
||||||
|
@codeblock|{* 3 - 9}|]
|
||||||
|
@list[@racket[(lambda (x) (left #'(* 2 x)))] (hspace 1)
|
||||||
|
@racket[#f] (hspace 1)
|
||||||
|
@racket[2] (hspace 1)
|
||||||
|
@codeblock|{3 - 9}|]
|
||||||
|
@list[@racket[(lambda (x) (left #'(* 2 x)))] (hspace 1)
|
||||||
|
@racket[3] (hspace 1)
|
||||||
|
@racket[2] (hspace 1)
|
||||||
|
@codeblock|{- 9}|]
|
||||||
|
@list[@racket[(lambda (x) #'(- (+ 1 (* 2 3)) x))] (hspace 1)
|
||||||
|
@racket[#f] (hspace 1)
|
||||||
|
@racket[1] (hspace 1)
|
||||||
|
@codeblock|{9}|]
|
||||||
|
@list[@racket[(lambda (x) #'(- (+ 1 (* 2 3)) x))] (hspace 1)
|
||||||
|
@racket[9] (hspace 1)
|
||||||
|
@racket[1] (hspace 1)
|
||||||
|
@codeblock|{}|]
|
||||||
|
]
|
||||||
|
]
|
||||||
|
|
||||||
|
When the stream of tokens is empty the @bold{current} register is passed as an
|
||||||
|
argument to the @bold{left} function which ultimately produces the expression
|
||||||
|
@codeblock|{(- (+ 1 (* 2 3)) 9)}|
|
||||||
|
|
||||||
|
In this example @racket[+] and @racket[-] both have a precedence of 1 while
|
||||||
|
@racket[*] has a precedence of 2. Currently, precedences can be any number that
|
||||||
|
can be compared with @racket[<=].
|
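The register-based walk through `1 + 2 * 3 - 9` can be reproduced with a compact recursive sketch. It is written in Python for illustration and is not Honu's implementation: the names `parse`, `make_left`, and `PRECEDENCE` are hypothetical, operators build plain tuples instead of syntax objects, and treating an operator of equal precedence like a lower one (which gives left associativity) is an assumption based on the `<=` remark above.

```python
# Precedence levels used in the example: + and - are 1, * is 2.
PRECEDENCE = {"+": 1, "-": 1, "*": 2}


def make_left(operator, lhs):
    """Build the `left` register: a function awaiting the right-hand side."""
    return lambda rhs: (operator, lhs, rhs)


def parse(stream, precedence=0, left=lambda x: x, current=None):
    """Parse a list of tokens; return (expression, unparsed tokens)."""
    while stream:
        token = stream[0]
        if token in PRECEDENCE:
            new_precedence = PRECEDENCE[token]
            if new_precedence <= precedence:
                # Lower or equal precedence: finish this level and leave the
                # operator in the stream for the enclosing level to handle.
                return left(current), stream
            # Higher precedence: parse the right-hand side at the new level.
            current, stream = parse(stream[1:], new_precedence,
                                    make_left(token, current))
        else:
            # Any other token is treated as a datum.
            current = token
            stream = stream[1:]
    # The stream is empty: pass `current` to `left`, as described above.
    return left(current), stream


expression, rest = parse([1, "+", 2, "*", 3, "-", 9])
print(expression)
```

The result is the tuple `('-', ('+', 1, ('*', 2, 3)), 9)`, the same shape as `(- (+ 1 (* 2 3)) 9)`. The sketch only handles data and binary operators, so macros, stops, and the other expression forms are out of scope.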
||||||
|
|
||||||
|
The example takes some liberties with respect to how the actual implementation
|
||||||
|
works. In particular the binary operators are syntax transformers that accept
|
||||||
|
the left and right hand expressions as parameters and return new syntax objects.
|
||||||
|
Also when the @racket[*] operator is parsed the @bold{left} function for
|
||||||
|
@racket[+] is nested inside the new function for @racket[*].
|
||||||
|
|
||||||
|
An @deftech{expression} can be one of the following:
|
||||||
|
@itemlist[
|
||||||
|
@item{@bold{datum} - number, string, or symbol. @codeblock|{5}|}
|
||||||
|
@item{@bold{macro} - a symbol bound to a syntax transformer.
|
||||||
|
@codeblock|{cond x = 5: true, else: false}|}
|
||||||
|
@item{@bold{stop} - a symbol that immediately ends the current expression.
|
||||||
|
Currently these are , ; and :}
|
||||||
|
@item{@bold{lambda expression} - an identifier followed by @racket[(id ...)]
|
||||||
|
followed by a block of code in braces. @codeblock|{add(x, y){ x + y }}|}
|
||||||
|
@item{@bold{function application} - an expression followed by @racket[(arg
|
||||||
|
...)]. @codeblock|{f(2, 2)}|}
|
||||||
|
@item{@bold{list comprehension} - @codeblock|{[x + 1: x <- [1, 2, 3]]}|}
|
||||||
|
@item{@bold{block of code} - a series of expressions wrapped in braces.}
|
||||||
|
@item{@bold{expression grouping} - any expression inside a set of parentheses
|
||||||
|
@codeblock|{(1 + 1) * 2}|}
|
||||||
|
]
|
||||||
|
|
||||||
|
@section{Macros}
|
||||||
|
@section{Language}
|
||||||
|
@section{Examples}
|
||||||
|
|