
#lang scribble/doc
@(require scribble/manual
          scribble/struct
          scribble/decode
          scribble/eval
          "parse-common.rkt"
          (for-label (only-in syntax/parse ...+)))

@(define-syntax-rule (defdummy id)
   (defidentifier (quote-syntax id)
                  #:form? #t #:index? #f #:show-libs? #f))

@title[#:tag "stxparse-intro"]{Introduction}

@;{Dummy declaration}
@declare-exporting[syntax/scribblings/parse/parse-dummy-bindings]

This section provides an introduction to writing robust macros with
@racket[syntax-parse] and syntax classes.

As a running example we use the following task: write a macro named
@racket[mylet] that has the same syntax and behavior as Racket's
@racket[let] form. The macro should produce good error messages when
used incorrectly.

Here is the specification of @racket[mylet]'s syntax:

@;{bleh!}
@specform[#:literals (mylet)
          (code:line (@#,(defdummy mylet) ([var-id rhs-expr] ...) body ...+)
                     (mylet loop-id ([var-id rhs-expr] ...) body ...+))]

For simplicity, we handle only the first case for now. We return to
the second case later in the introduction.

The macro can be implemented very simply using
@racket[define-syntax-rule]:

@myinteraction[
(define-syntax-rule (mylet ([var rhs] ...) body ...)
  ((lambda (var ...) body ...) rhs ...))
]

When used correctly, the macro works, but it behaves very badly in the
presence of errors. In some cases, the macro merely fails with an
uninformative error message; in others, it blithely accepts illegal
syntax and passes it along to @racket[lambda], with strange
consequences:

@myinteraction[
(mylet ([a 1] [b 2]) (+ a b))
(mylet (b 2) (sub1 b))
(mylet ([1 a]) (add1 a))
(mylet ([#:x 1] [y 2]) (* x y))
]

These examples of illegal syntax are not meant to suggest that a
typical programmer would make such mistakes when attempting to use
@racket[mylet]. At least, not often, not after an initial learning
curve. But macros are also used by inexpert programmers and as targets
of other macros (or code generators), and many macros are far more
complex than @racket[mylet]. Macros must validate their syntax and
report appropriate errors. Furthermore, the macro writer benefits from
the @emph{machine-checked} specification of syntax in the form of more
readable, maintainable code.

We can improve the error behavior of the macro by using
@racket[syntax-parse]. First, we import @racket[syntax-parse] into the
@tech[#:doc '(lib
"scribblings/reference/reference.scrbl")]{transformer environment},
since we will use it to implement a macro transformer.

@myinteraction[(require (for-syntax syntax/parse))]

The following is the syntax specification above transliterated into a
@racket[syntax-parse] macro definition. It behaves no better than the
version using @racket[define-syntax-rule] above.

@myinteraction[
(define-syntax (mylet stx)
  (syntax-parse stx
    [(_ ([var-id rhs-expr] ...) body ...+)
     #'((lambda (var-id ...) body ...) rhs-expr ...)]))
]

One minor difference is the use of @racket[...+] in the pattern;
@racket[...] means match zero or more repetitions of the preceding
pattern; @racket[...+] means match one or more. Only @racket[...] may
be used in the template, however.

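As a small illustration of the @racket[...+] requirement, consider a
use with no body at all. With the pattern above, such a use should be
rejected by @racket[syntax-parse] itself rather than being passed
along to @racket[lambda]:
@myinteraction[
(mylet ([a 1]))
]
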
The first step toward validation and high-quality error reporting is
annotating each of the macro's pattern variables with the @tech{syntax
class} that describes its acceptable syntax. In @racket[mylet], each
variable must be an @racket[identifier] (@racket[id] for short) and
each right-hand side must be an @racket[expr] (expression). An
@tech{annotated pattern variable} is written by concatenating the
pattern variable name, a colon character, and the syntax class
name.@margin-note*{For an alternative to the ``colon'' syntax, see the
@racket[~var] pattern form.}

@myinteraction[
(define-syntax (mylet stx)
  (syntax-parse stx
    [(_ ((var:id rhs:expr) ...) body ...+)
     #'((lambda (var ...) body ...) rhs ...)]))
]
Note that the syntax class annotations do not appear in the template
(i.e., @racket[var], not @racket[var:id]).

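For comparison, here is a sketch of the same definition written with
@racket[~var] in place of the colon notation. It is meant only to show
how the two spellings correspond; the rest of this introduction keeps
using the colon form.
@racketblock[
(define-syntax (mylet stx)
  (syntax-parse stx
    [(_ (((~var var id) (~var rhs expr)) ...) body ...+)
     #'((lambda (var ...) body ...) rhs ...)]))
]
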
The syntax class annotations are checked when we use the macro.
@myinteraction[
(mylet ([a 1] [b 2]) (+ a b))
(mylet (["a" 1]) (add1 a))
]
The @racket[expr] syntax class does not actually check that the term
it matches is a valid expression---that would require calling the
macro expander. Instead, @racket[expr] just means not a keyword.
@myinteraction[
(mylet ([a #:whoops]) 1)
]
Also, @racket[syntax-parse] knows how to report a few kinds of errors
without any help:
@myinteraction[
(mylet ([a 1 2]) (* a a))
]
There are other kinds of errors, however, that this macro does not
handle gracefully:
@myinteraction[
(mylet (a 1) (+ a 2))
]
It's too much to ask for the macro to respond, ``This expression is
missing a pair of parentheses around @racket[(a 1)].'' The pattern
matcher is not that smart. But it can pinpoint the source of the
error: when it encountered @racket[a] it was expecting what we might
call a ``binding pair,'' but that term is not in its vocabulary yet.

To allow @racket[syntax-parse] to synthesize better errors, we must
attach @emph{descriptions} to the patterns we recognize as discrete
syntactic categories. One way of doing that is by defining new syntax
classes:@margin-note*{Another way is the @racket[~describe] pattern
form.}

@myinteraction[
(define-syntax (mylet stx)

  (define-syntax-class binding
    #:description "binding pair"
    (pattern (var:id rhs:expr)))

  (syntax-parse stx
    [(_ (b:binding ...) body ...+)
     #'((lambda (b.var ...) body ...) b.rhs ...)]))
]

Note that we write @racket[b.var] and @racket[b.rhs] now. They are the
@tech{nested attributes} formed from the annotated pattern variable
@racket[b] and the attributes @racket[var] and @racket[rhs] of the
syntax class @racket[binding].

Now the error messages can talk about ``binding pairs.''
@myinteraction[
(mylet (a 1) (+ a 2))
]
Errors are still reported in more specific terms when possible:
@myinteraction[
(mylet (["a" 1]) (+ a 2))
]

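For readers curious about the @racket[~describe] alternative mentioned
in the margin, the following is a minimal sketch of how the same
description might be attached inline, without a separate syntax
class. Because no syntax class is involved, the template refers to
@racket[var] and @racket[rhs] directly instead of @racket[b.var] and
@racket[b.rhs].
@racketblock[
(define-syntax (mylet stx)
  (syntax-parse stx
    [(_ ((~describe "binding pair" [var:id rhs:expr]) ...) body ...+)
     #'((lambda (var ...) body ...) rhs ...)]))
]
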
There is one other constraint on the legal syntax of
@racket[mylet]. The variables bound by the different binding pairs
must be distinct. Otherwise the macro creates an illegal
@racket[lambda] form:
@myinteraction[
(mylet ([a 1] [a 2]) (+ a a))
]

Constraints such as the distinctness requirement are expressed as side
conditions, thus:
@myinteraction[
(define-syntax (mylet stx)

  (define-syntax-class binding
    #:description "binding pair"
    (pattern (var:id rhs:expr)))

  (syntax-parse stx
    [(_ (b:binding ...) body ...+)
     #:fail-when (check-duplicate-identifier
                  (syntax->list #'(b.var ...)))
                 "duplicate variable name"
     #'((lambda (b.var ...) body ...) b.rhs ...)]))
]
@myinteraction[
(mylet ([a 1] [a 2]) (+ a a))
]
The @racket[#:fail-when] keyword is followed by two expressions: the
condition and the error message. When the condition evaluates to
anything but @racket[#f], the pattern fails. Additionally, if the
condition evaluates to a syntax object, that syntax object is used to
pinpoint the cause of the failure.

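To see why the condition above can double as the error location, it
may help to try @racket[check-duplicate-identifier] on its own. In
this sketch, which assumes only the bindings of
@racketmodname[racket/base], the first call returns one of the
duplicated identifiers (a syntax object, hence not @racket[#f]) and
the second returns @racket[#f]:
@racketblock[
(check-duplicate-identifier (list #'a #'b #'a))
(check-duplicate-identifier (list #'a #'b))
]
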
Syntax classes can have side conditions, too. Here is the macro
rewritten to include another syntax class representing a ``sequence of
distinct binding pairs.''
@myinteraction[
(define-syntax (mylet stx)

  (define-syntax-class binding
    #:description "binding pair"
    (pattern (var:id rhs:expr)))

  (define-syntax-class distinct-bindings
    #:description "sequence of distinct binding pairs"
    (pattern (b:binding ...)
             #:fail-when (check-duplicate-identifier
                          (syntax->list #'(b.var ...)))
                         "duplicate variable name"
             #:with (var ...) #'(b.var ...)
             #:with (rhs ...) #'(b.rhs ...)))

  (syntax-parse stx
    [(_ bs:distinct-bindings . body)
     #'((lambda (bs.var ...) . body) bs.rhs ...)]))
]
Here we've introduced the @racket[#:with] clause. A @racket[#:with]
clause matches a pattern with a computed term. Here we use it to bind
@racket[var] and @racket[rhs] as attributes of
@racket[distinct-bindings]. By default, a syntax class only exports
its patterns' pattern variables as attributes, not their nested
attributes.@margin-note*{The alternative would be to explicitly declare
the attributes of @racket[distinct-bindings] to include the nested
attributes @racket[b.var] and @racket[b.rhs], using the
@racket[#:attributes] option. Then the macro would refer to
@racket[bs.b.var] and @racket[bs.b.rhs].}

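The alternative described in the margin might look roughly like the
following. This is only a sketch, on the assumption that the nested
attributes are declared with an ellipsis depth of 1 through the
@racket[#:attributes] option; the two @racket[#:with] clauses are then
unnecessary, and the template uses the longer names.
@racketblock[
(define-syntax (mylet stx)

  (define-syntax-class binding
    #:description "binding pair"
    (pattern (var:id rhs:expr)))

  (define-syntax-class distinct-bindings
    #:description "sequence of distinct binding pairs"
    #:attributes ([b.var 1] [b.rhs 1])
    (pattern (b:binding ...)
             #:fail-when (check-duplicate-identifier
                          (syntax->list #'(b.var ...)))
                         "duplicate variable name"))

  (syntax-parse stx
    [(_ bs:distinct-bindings . body)
     #'((lambda (bs.b.var ...) . body) bs.b.rhs ...)]))
]
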
Alas, so far the macro only implements half of the functionality
offered by Racket's @racket[let]. We must add the
``named-@racket[let]'' form. That turns out to be as simple as adding
a new clause:

@myinteraction[
(define-syntax (mylet stx)

  (define-syntax-class binding
    #:description "binding pair"
    (pattern (var:id rhs:expr)))

  (define-syntax-class distinct-bindings
    #:description "sequence of distinct binding pairs"
    (pattern (b:binding ...)
             #:fail-when (check-duplicate-identifier
                          (syntax->list #'(b.var ...)))
                         "duplicate variable name"
             #:with (var ...) #'(b.var ...)
             #:with (rhs ...) #'(b.rhs ...)))

  (syntax-parse stx
    [(_ bs:distinct-bindings body ...+)
     #'((lambda (bs.var ...) body ...) bs.rhs ...)]
    [(_ loop:id bs:distinct-bindings body ...+)
     #'(letrec ([loop (lambda (bs.var ...) body ...)])
         (loop bs.rhs ...))]))
]
We are able to reuse the @racket[distinct-bindings] syntax class, so
the addition of the ``named-@racket[let]'' syntax requires only three
lines.

But does adding this new case affect @racket[syntax-parse]'s ability
to pinpoint and report errors?
@myinteraction[
(mylet ([a 1] [b 2]) (+ a b))
(mylet (["a" 1]) (add1 a))
(mylet ([a #:whoops]) 1)
(mylet ([a 1 2]) (* a a))
(mylet (a 1) (+ a 2))
(mylet ([a 1] [a 2]) (+ a a))
]
The error reporting for the original syntax seems intact. We should
verify that the named-@racket[let] syntax is working, that
@racket[syntax-parse] is not simply ignoring that clause.
@myinteraction[
(mylet loop ([a 1] [b 2]) (+ a b))
(mylet loop (["a" 1]) (add1 a))
(mylet loop ([a #:whoops]) 1)
(mylet loop ([a 1 2]) (* a a))
(mylet loop (a 1) (+ a 2))
(mylet loop ([a 1] [a 2]) (+ a a))
]

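None of the uses above actually calls @racket[loop]. As a further
sanity check, here is a sketch of a use that does recur, computing the
factorial of 5 (the names @racket[n] and @racket[acc] are chosen only
for illustration):
@myinteraction[
(mylet loop ([n 5] [acc 1])
  (if (zero? n)
      acc
      (loop (sub1 n) (* n acc))))
]
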
How does @racket[syntax-parse] decide which clause the programmer was
attempting, so it can use it as a basis for error reporting? After
all, each of the bad uses of the named-@racket[let] syntax is also a
bad use of the normal syntax, and vice versa. And yet the macro does
not produce errors like ``@racket[mylet]: expected sequence of
distinct binding pairs at: @racket[loop].''

The answer is that @racket[syntax-parse] records a list of all the
potential errors (including ones like @racket[loop] not matching
@racket[distinct-bindings]) along with the @emph{progress} made before
each error. Only the error with the most progress is reported.

For example, in this bad use of the macro,
@myinteraction[
(mylet loop (["a" 1]) (add1 a))
]
there are two potential errors: expected @racket[distinct-bindings] at
@racket[loop] and expected @racket[identifier] at @racket["a"]. The
second error occurs further in the term than the first, so it is
reported.

For another example, consider this term:
@myinteraction[
(mylet (["a" 1]) (add1 a))
]
Again, there are two potential errors: expected @racket[identifier] at
@racket[(["a" 1])] and expected @racket[identifier] at
@racket["a"]. They both occur at the second term (or first argument,
if you prefer), but the second error occurs deeper in the
term. Progress is based on a left-to-right traversal of the syntax.

As a final example, consider the following:
@myinteraction[
(mylet ([a 1] [a 2]) (+ a a))
]
There are two errors again: duplicate variable name at @racket[([a 1]
[a 2])] and expected @racket[identifier] at @racket[([a 1] [a
2])]. Note that as far as @racket[syntax-parse] is concerned, the
progress associated with the duplicate error message is the second
term (first argument), not the second occurrence of @racket[a]. That's
because the check is associated with the entire
@racket[distinct-bindings] pattern. It would seem that both errors
have the same progress, and yet only the first one is reported. The
difference between the two is that the first error is from a
@emph{post-traversal} check, whereas the second is from a normal
(i.e., pre-traversal) check. A post-traversal check is considered to
have made more progress than a pre-traversal check of the same term;
indeed, it also has greater progress than any failure @emph{within}
the term.

It is, however, possible for multiple potential errors to occur with
the same progress. Here's one example:
@myinteraction[
(mylet "not-even-close")
]
In this case @racket[syntax-parse] reports both errors.

Even with all of the annotations we have added to our macro, there are
still some misuses that defy @racket[syntax-parse]'s error reporting
capabilities, such as this example:
@myinteraction[
(mylet)
]
The philosophy behind @racket[syntax-parse] is that in these
situations, a generic error such as ``bad syntax'' is justified. The
use of @racket[mylet] here is so far off that the only informative
error message would include a complete recapitulation of the syntax of
@racket[mylet]. That is not the role of error messages, however; it is
the role of documentation.

This section has provided an introduction to syntax classes, side
conditions, and progress-ordered error reporting. But
@racket[syntax-parse] has many more features. Continue to the
@secref{stxparse-examples} section for samples of other features in
working code, or skip to the subsequent sections for the complete
reference documentation.