4 Pattern matching: syntax-case and syntax-rules
Most useful syntax transformers work by taking some input syntax, and
+rearranging the pieces into something else. As we saw, this is
+possible but tedious using list accessors such as
+cadddr. It’s more convenient and less error-prone to use
+match to do pattern-matching.
Historically, syntax-case and
+syntax-rules pattern matching came first. match was
+added to Racket later.
It turns out that pattern-matching was one of the first improvements
+to be added to the Racket macro system. It’s called
+syntax-case, and has a shorthand for simple situations called
+define-syntax-rule.
Recall our previous example:
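As a reminder, a match-based transformer along these lines might look like the following sketch. The name our-if-using-match and the pattern variable names are assumptions made for illustration:

```racket
#lang racket
(require (for-syntax racket/match))

;; Take the input syntax apart with `match`, then rebuild it by
;; quasi-quoting a list and converting it back with datum->syntax.
(define-syntax (our-if-using-match stx)
  (match (syntax->list stx)
    [(list _ condition true-expr false-expr)
     (datum->syntax stx `(cond [,condition ,true-expr]
                               [else ,false-expr]))]))

(our-if-using-match #t "true" "false")
```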
Here’s what it looks like using syntax-case:
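A minimal sketch of such a definition, assuming the macro name our-if-using-syntax-case used in the interaction that follows (the pattern variable names are illustrative):

```racket
#lang racket

;; The pattern destructures the input; the #' template builds the output
;; directly from the pattern variables.
(define-syntax (our-if-using-syntax-case stx)
  (syntax-case stx ()
    [(_ condition true-expr false-expr)
     #'(cond [condition true-expr]
             [else false-expr])]))

(our-if-using-syntax-case #t "true" "false")
```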
> (our-if-using-syntax-case #t "true" "false")
"true"
Pretty similar, huh? The pattern matching part looks almost exactly
+the same. The way we specify the new syntax is simpler. We don’t need
+to do quasi-quoting and unquoting. We don’t need to use
+datum->syntax. Instead, we supply a "template", which uses
+variables from the pattern.
There is a shorthand for simple pattern-matching cases, which expands
+into syntax-case. It’s called define-syntax-rule:
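A sketch of the same macro written with define-syntax-rule, again assuming the name used in the interaction that follows:

```racket
#lang racket

;; The pattern and the template sit side by side; there is no explicit
;; transformer function and no #' in sight.
(define-syntax-rule (our-if-using-syntax-rule condition true-expr false-expr)
  (cond [condition true-expr]
        [else false-expr]))

(our-if-using-syntax-rule #t "true" "false")
```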
> (our-if-using-syntax-rule #t "true" "false")
"true"
Here’s the thing about define-syntax-rule. Because it’s so
+simple, define-syntax-rule is often the first thing people are
+taught about macros. But it’s almost deceptively simple. It looks so
+much like defining a normal run time function—yet it’s not. It’s
+working at compile time, not run time. Worse, the moment you want to
+do more than define-syntax-rule can handle, you can fall off
+a cliff into what feels like complicated and confusing
+territory. Hopefully, because we started with a basic syntax
+transformer, and worked up from that, we won’t have that problem. We
+can appreciate define-syntax-rule as a convenient shorthand,
+but not be scared of, or confused about, that for which it’s
+shorthand.
Most of the materials I found for learning macros, including the
+Racket Guide, do a very good job explaining how patterns and
+templates work. So I won’t regurgitate that here.
Sometimes, we need to go a step beyond the pattern and template. Let’s
+look at some examples, how we can get confused, and how to get it
+working.
4.1 Pattern variable vs. template—fight!
Let’s say we want to define a function with a hyphenated name, a-b,
+but we supply the a and b parts separately. The Racket struct
+macro does something like this: (struct foo (field1 field2))
+automatically defines a number of functions whose names are variations
+on the name foo—such as foo-field1,
+foo-field2, foo?, and so on.
So let’s pretend we’re doing something like that. We want to transform
+the syntax (hyphen-define a b (args) body) to the syntax
+(define (a-b args) body).
A wrong first attempt is:
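The attempt plausibly looks something like the sketch below. The name hyphen-define/wrong1 is an assumption, chosen to match the later names. The definition is commented out with #; because it does not compile:

```racket
#lang racket

;; Disabled with #; on purpose: `a` and `b` are pattern variables, but
;; here they are used in the `let`, which is outside of any template.
#;
(define-syntax (hyphen-define/wrong1 stx)
  (syntax-case stx ()
    [(_ a b (args ...) body0 body ...)
     (let ([name (string->symbol (format "~a-~a" a b))])
       #'(define (name args ...)
           body0 body ...))]))
```

Removing the #; and compiling should produce an error like the one shown next.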
eval:47:0: a: pattern variable cannot be used outside of a template
  in: a
Huh. We have no idea what this error message means. Well, let’s try to
+work it out. The "template" the error message refers to is the
+#'(define (name args ...) body0 body ...) portion. The
+let isn’t part of that template. It sounds like we can’t use
+a (or b) in the let part.
In fact, syntax-case can have as many templates as you
+want. The obvious, required template is the final expression supplying
+the output syntax. But you can use syntax (a.k.a. #') on a
+pattern variable. This makes another template, albeit a small, "fun
+size" template. Let’s try that:
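A sketch of that change, assuming the macro name hyphen-define/wrong1.1 from the interaction that follows. The only difference from the first attempt is that a and b are wrapped in #':

```racket
#lang racket

;; #'a and #'b are tiny templates, so using the pattern variables there
;; is allowed. This compiles, but it still does not do what we want.
(define-syntax (hyphen-define/wrong1.1 stx)
  (syntax-case stx ()
    [(_ a b (args ...) body0 body ...)
     (let ([name (string->symbol (format "~a-~a" #'a #'b))])
       #'(define (name args ...)
           body0 body ...))]))

;; Expands without complaint, yet no function named foo-bar results.
(hyphen-define/wrong1.1 foo bar () #t)
```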
No more error—good! Let’s try to use it:
> (hyphen-define/wrong1.1 foo bar () #t)
> (foo-bar)
foo-bar: undefined;
 cannot reference an identifier before its definition
  in module: 'program
Apparently our macro is defining a function with some name other than
+foo-bar. Huh.
This is where the Macro Stepper in DrRacket is
+invaluable.
Even if you prefer mostly to use Emacs, this
+is a situation where it’s definitely worth temporarily using DrRacket
+for its Macro Stepper.

The Macro Stepper says that the use of our macro:
(hyphen-define/wrong1.1 foo bar () #t)
expanded to:
(define (name) #t)
Well that explains it. Instead, we wanted to expand to:
(define (foo-bar) #t)
Our template is using the symbol name but we wanted its
+value, such as foo-bar in this use of our macro.
Is there anything we already know that behaves like this—where using
+a variable in the template yields its value? Yes: Pattern
+variables. Our pattern doesn’t include name because we don’t
+expect it in the original syntax—indeed the whole point of this
+macro is to create it. So name can’t be in the main
+pattern. Fine—let’s make an additional pattern. We can do
+that using an additional, nested syntax-case:
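A sketch of the nested version, assuming the macro name hyphen-define/wrong1.2 from the interaction further on. Using #'a as the lexical context argument to datum->syntax is an assumption of this sketch:

```racket
#lang racket

;; The inner syntax-case matches syntax we build ourselves, which gives
;; us a new pattern variable, `name`, to use in the final template.
(define-syntax (hyphen-define/wrong1.2 stx)
  (syntax-case stx ()
    [(_ a b (args ...) body0 body ...)
     (syntax-case (datum->syntax #'a
                                 (string->symbol (format "~a-~a" #'a #'b)))
                  ()
       [name #'(define (name args ...)
                 body0 body ...)])]))

;; Still flawed: the generated name is not foo-bar.
(hyphen-define/wrong1.2 foo bar () #t)
```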
Looks weird? Let’s take a deep breath. Normally our transformer
+function is given syntax by Racket, and we pass that syntax to
+syntax-case. But we can also create some syntax of our own,
+on the fly, and pass that to syntax-case. That’s all
+we’re doing here. The whole (datum->syntax ...) expression is
+syntax that we’re creating on the fly. We can give that to
+syntax-case, and match it using a pattern variable named
+name. Voila, we have a new pattern variable. We can use it in
+a template, and its value will go in the template.
We might have one more—just one, I promise!—small problem left.
+Let’s try to use our new version:
> (hyphen-define/wrong1.2 foo bar () #t)
> (foo-bar)
foo-bar: undefined;
 cannot reference an identifier before its definition
  in module: 'program
Hmm. foo-bar is still not defined. Back to the Macro
+Stepper. It says now we’re expanding to:
(define (|#<syntax:11:24 foo>-#<syntax:11:28 bar>|) #t)
Oh right: #'a and #'b are syntax objects. Therefore
(string->symbol (format "~a-~a" #'a #'b))
is the printed form of both syntax objects, joined by a hyphen:
|#<syntax:11:24 foo>-#<syntax:11:28 bar>|
Instead we want the datum in the syntax objects, such as the symbols
+foo and bar. Which we get using
+syntax->datum:
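A sketch of the corrected macro, assuming the name hyphen-define/ok1 from the interaction that follows:

```racket
#lang racket

;; syntax->datum extracts the symbols foo and bar from the syntax
;; objects before they are formatted into the new name.
(define-syntax (hyphen-define/ok1 stx)
  (syntax-case stx ()
    [(_ a b (args ...) body0 body ...)
     (syntax-case (datum->syntax #'a
                                 (string->symbol
                                  (format "~a-~a"
                                          (syntax->datum #'a)
                                          (syntax->datum #'b))))
                  ()
       [name #'(define (name args ...)
                 body0 body ...)])]))

(hyphen-define/ok1 foo bar () #t)
(foo-bar)
```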
> (hyphen-define/ok1 foo bar () #t)
> (foo-bar)
#t
And now it works!
Next, some shortcuts.
Instead of an additional, nested syntax-case, we could use
+with-syntax. (Another name for with-syntax could be "with new
+pattern variable".) This rearranges the syntax-case to look more
+like a let form: first the name, then the value. It’s also more
+convenient if we need to define more than one pattern variable.
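A sketch using with-syntax, assuming the macro name hyphen-define/ok2 from the interaction that follows:

```racket
#lang racket

;; with-syntax binds the new pattern variable `name` first, then the
;; body supplies the template that uses it.
(define-syntax (hyphen-define/ok2 stx)
  (syntax-case stx ()
    [(_ a b (args ...) body0 body ...)
     (with-syntax ([name (datum->syntax #'a
                                        (string->symbol
                                         (format "~a-~a"
                                                 (syntax->datum #'a)
                                                 (syntax->datum #'b))))])
       #'(define (name args ...)
           body0 body ...))]))

(hyphen-define/ok2 foo bar () #t)
(foo-bar)
```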
> (hyphen-define/ok2 foo bar () #t)
> (foo-bar)
#t
Again, with-syntax is simply syntax-case rearranged:
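One way to see the correspondence is to run both forms on the same syntax at run time. In this sketch the names via-syntax-case and via-with-syntax are hypothetical, and the two expressions should produce the same result:

```racket
#lang racket

;; syntax-case: the syntax comes first, then the pattern, then the body.
(define via-syntax-case
  (syntax-case #'(1 2) ()
    [(x y) (syntax->datum #'(y x))]))

;; with-syntax: the pattern comes first, then the syntax, then the body.
(define via-with-syntax
  (with-syntax ([(x y) #'(1 2)])
    (syntax->datum #'(y x))))

(equal? via-syntax-case via-with-syntax)
```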
Whether you use an additional syntax-case or use
+with-syntax, either way you are simply defining additional
+pattern variables. Don’t let the terminology and structure make it
+seem mysterious.
We know that let doesn’t let us use a binding in a subsequent
+one:
> (let ([a 0]
        [b a])
    b)
a: undefined;
 cannot reference an identifier before its definition
  in module: 'program
Instead we can nest lets:
> (let ([a 0])
    (let ([b a])
      b))
0
Or use a shorthand for nesting, let*:
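For instance, the nested lets above can be written as this small sketch:

```racket
#lang racket

;; let* binds sequentially, so the binding for b can refer to a.
(let* ([a 0]
       [b a])
  b)
```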
Similarly, instead of writing nested with-syntaxs, we can use
+with-syntax*:
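A small sketch of with-syntax* in use. The macro name pass-through is hypothetical, and the macro does nothing useful except show that a later binding can refer to an earlier one:

```racket
#lang racket
(require (for-syntax racket/syntax)) ; provides with-syntax*

;; c is defined in terms of b, which is defined in terms of a.
(define-syntax (pass-through stx)
  (syntax-case stx ()
    [(_ a)
     (with-syntax* ([b #'a]
                    [c #'b])
       #'c)]))

(pass-through 42)
```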
One gotcha is that with-syntax* isn’t provided by
+racket/base. We must (require (for-syntax racket/syntax)). Otherwise we may get a rather bewildering error
+message:
...: ellipses not allowed as an expression in: ....
There is a utility function in racket/syntax called
+format-id that lets us format identifier names more
+succinctly than what we did above:
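A sketch using format-id, assuming the macro name hyphen-define/ok3 from the interaction that follows. The first argument to format-id supplies the lexical context for the new identifier:

```racket
#lang racket
(require (for-syntax racket/syntax)) ; provides format-id

;; format-id accepts identifiers directly, so no syntax->datum,
;; string->symbol, or datum->syntax round trip is needed.
(define-syntax (hyphen-define/ok3 stx)
  (syntax-case stx ()
    [(_ a b (args ...) body0 body ...)
     (with-syntax ([name (format-id #'a "~a-~a" #'a #'b)])
       #'(define (name args ...)
           body0 body ...))]))

(hyphen-define/ok3 bar baz () #t)
(bar-baz)
```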
> (require (for-syntax racket/syntax))
> (hyphen-define/ok3 bar baz () #t)
> (bar-baz)
#t
Using format-id is convenient as it handles the tedium of
+converting from syntax to symbol datum to string ... and all the way
+back.
4.1.4 Another example
Finally, here’s a variation that accepts an arbitrary number of name
+parts to be joined with hyphens:
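A sketch of such a macro, assuming the name hyphen-define* and the calling convention shown in the interaction that follows. The helper variable name-stxs is illustrative:

```racket
#lang racket
(require (for-syntax racket/string racket/syntax))

;; Collect the name parts as a list of identifiers, join their symbol
;; names with hyphens, and turn the result back into an identifier.
(define-syntax (hyphen-define* stx)
  (syntax-case stx ()
    [(_ (names ...) (args ...) body0 body ...)
     (let ([name-stxs (syntax->list #'(names ...))])
       (with-syntax ([name (datum->syntax
                            (car name-stxs)
                            (string->symbol
                             (string-join
                              (for/list ([name-stx name-stxs])
                                (symbol->string (syntax-e name-stx)))
                              "-")))])
         #'(define (name args ...)
             body0 body ...)))]))

(hyphen-define* (foo bar baz) (v) (* 2 v))
(foo-bar-baz 50)
```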
> (require (for-syntax racket/string racket/syntax))
> (hyphen-define* (foo bar baz) (v) (* 2 v))
> (foo-bar-baz 50)
100
To review:
You can’t use a pattern variable outside of a template. But
+you can use syntax or #' on a pattern variable to make
+an ad hoc, "fun size" template.
If you want to munge pattern variables for use in the
+template, with-syntax is your friend, because it lets you
+create new pattern variables.
Usually you’ll need to use syntax->datum to get the
+interesting value inside.
format-id is convenient for formatting identifier
+names.
4.2 Making our own struct
Let’s apply what we just learned to a more-realistic example. We’ll
+pretend that Racket doesn’t already have a struct
+capability. Fortunately, we can write a macro to provide our own
+system for defining and using structures. To keep things simple, our
+structure will be immutable (read-only) and it won’t support
+inheritance.
Given a structure declaration like:
(our-struct name (field1 field2 ...))
We need to define some procedures:
A constructor procedure whose name is the struct name. We’ll
+represent structures as a vector. The structure name will be
+element zero. The fields will be elements one onward.
A predicate, whose name is the struct name with ?
+appended.
For each field, an accessor procedure to get its value. These
+will be named struct-field (the name of the struct, a hyphen, and the
+field name).
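The three requirements above can be sketched as follows. This is one possible implementation under the stated design (a vector with the struct name at element zero); the helper names pred-id, acc-id, and ix are illustrative:

```racket
#lang racket
(require (for-syntax racket/syntax))

(define-syntax (our-struct stx)
  (syntax-case stx ()
    [(_ id (fields ...))
     (with-syntax ([pred-id (format-id #'id "~a?" #'id)])
       #`(begin
           ;; Constructor: a vector holding the struct name, then the fields.
           (define (id fields ...)
             (apply vector (cons 'id (list fields ...))))
           ;; Predicate: a vector whose element zero is the struct name.
           (define (pred-id v)
             (and (vector? v)
                  (eq? (vector-ref v 0) 'id)))
           ;; One accessor per field; field n lives at vector index n.
           #,@(for/list ([x (syntax->list #'(fields ...))]
                         [n (in-naturals 1)])
                (with-syntax ([acc-id (format-id #'id "~a-~a" #'id x)]
                              [ix n])
                  #`(define (acc-id v)
                      (unless (pred-id v)
                        (error 'acc-id "~a is not a ~a struct" v 'id))
                      (vector-ref v ix))))))]))

(our-struct foo (a b))
(define s (foo 1 2))
(foo-a s)
```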
> (require (for-syntax racket/syntax))
; Test it out
> (require rackunit)
> (our-struct foo (a b))
> (define s (foo 1 2))
> (check-true (foo? s))
> (check-false (foo? 1))
> (check-equal? (foo-a s) 1)
> (check-equal? (foo-b s) 2)
; The tests passed.
; Next, what if someone tries to declare:
> (our-struct "blah" ("blah" "blah"))
format-id: contract violation
  expected: (or/c string? symbol? identifier? keyword? char? number?)
  given: #<syntax:83:0 "blah">
The error message is not very helpful. It’s coming from
+format-id, which is a private implementation detail of our macro.
You may know that a syntax-case clause can take an
+optional "guard" or "fender" expression. Instead of
[pattern template]
It can be:
[pattern guard template]
Let’s add a guard expression to our clause:
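A sketch of the guarded version. The guard here raises a syntax error itself when it finds something that is not an identifier; otherwise for-each returns a true (void) value, so the clause proceeds. The message text matches the interaction that follows, and the rest of the body is an assumed implementation:

```racket
#lang racket
(require (for-syntax racket/syntax))

(define-syntax (our-struct stx)
  (syntax-case stx ()
    [(_ id (fields ...))
     ;; Guard ("fender") expression: every name must be an identifier.
     (for-each (lambda (x)
                 (unless (identifier? x)
                   (raise-syntax-error #f "not an identifier" stx x)))
               (cons #'id (syntax->list #'(fields ...))))
     ;; Template: reached only when the guard did not raise.
     (with-syntax ([pred-id (format-id #'id "~a?" #'id)])
       #`(begin
           (define (id fields ...)
             (apply vector (cons 'id (list fields ...))))
           (define (pred-id v)
             (and (vector? v)
                  (eq? (vector-ref v 0) 'id)))
           #,@(for/list ([x (syntax->list #'(fields ...))]
                         [n (in-naturals 1)])
                (with-syntax ([acc-id (format-id #'id "~a-~a" #'id x)]
                              [ix n])
                  #`(define (acc-id v)
                      (unless (pred-id v)
                        (error 'acc-id "~a is not a ~a struct" v 'id))
                      (vector-ref v ix))))))]))

;; Correct uses behave as before.
(our-struct foo (a b))
(foo-b (foo 1 2))
```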
> (require (for-syntax racket/syntax))
; Now the same misuse gives a better error message:
> (our-struct "blah" ("blah" "blah"))
eval:86:0: our-struct: not an identifier
  at: "blah"
  in: (our-struct "blah" ("blah" "blah"))
Later, we’ll see how syntax-parse makes it even easier to
+check usage and provide helpful messages about mistakes.
4.3 Using dot notation for nested hash lookups
The previous two examples used a macro to define functions whose names
+were made by joining identifiers provided to the macro. This example
+does the opposite: The identifier given to the macro is split into
+pieces.
If you write programs for web services you deal with JSON, which is
+represented in Racket by a jsexpr?. JSON often has
+dictionaries that contain other dictionaries. In a jsexpr?
+these are represented by nested hasheq tables:
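For example, a value like the following sketch, where the keys a, b, c and the string "value" are chosen to match the lookups shown later:

```racket
#lang racket

;; Three levels of nesting: js -> a -> b -> c.
(define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))
```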
In JavaScript you can use dot notation, such as js.a.b.c.
In Racket it’s not so convenient:
(hash-ref (hash-ref (hash-ref js 'a) 'b) 'c)
We can write a helper function to make this a bit cleaner:
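A sketch of such a helper, assuming the name hash-refs and the argument order used in the interaction that follows. The optional default argument def, and the choice to call it when it is a procedure, are assumptions of this sketch:

```racket
#lang racket

(define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

;; Walk down the nested hashes one key at a time. If any lookup fails,
;; return the default (calling it first if it is a procedure).
(define/contract (hash-refs h ks [def #f])
  ((hash? (listof any/c)) (any/c) . ->* . any)
  (with-handlers ([exn:fail? (const (cond [(procedure? def) (def)]
                                          [else def]))])
    (for/fold ([h h])
              ([k (in-list ks)])
      (hash-ref h k))))

(hash-refs js '(a b c))
```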
; This helper function:
; Lets us say:
> (hash-refs js '(a b c))
"value"
That’s better. Can we go even further and use a dot notation somewhat
+like JavaScript?
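A sketch of one way to do it, assuming the macro name hash.refs from the interaction that follows and a hash-refs helper like the one described above (repeated here so the example stands alone):

```racket
#lang racket
(require (for-syntax racket/syntax))

(define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

;; Assumed helper: nested lookup with an optional default.
(define (hash-refs h ks [def #f])
  (with-handlers ([exn:fail? (const (if (procedure? def) (def) def))])
    (for/fold ([h h]) ([k (in-list ks)])
      (hash-ref h k))))

(define-syntax (hash.refs stx)
  (syntax-case stx ()
    ;; If the optional default is missing, use #f.
    [(_ chain)
     #'(hash.refs chain #f)]
    [(_ chain default)
     ;; Split the identifier js.a.b.c on dots into js, a, b, and c.
     (let* ([chain-str (symbol->string (syntax->datum #'chain))]
            [ids (for/list ([str (in-list (regexp-split #rx"\\." chain-str))])
                   (format-id #'chain "~a" str))])
       ;; The first piece names the hash; the rest are the keys.
       (with-syntax ([hash-table (car ids)]
                     [keys       (cdr ids)])
         #'(hash-refs hash-table 'keys default)))]))

(hash.refs js.a.b.c)
```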
; This macro:
> (require (for-syntax racket/syntax))
; Gives us "sugar" to say this:
> (hash.refs js.a.b.c)
"value"
; Try finding a key that doesn't exist:
> (hash.refs js.blah)
#f
; Try finding a key that doesn't exist, specifying the default:
> (hash.refs js.blah 'did-not-exist)
'did-not-exist
It works!
We’ve started to appreciate that our macros should give helpful
+messages when used in error. Let’s try to do that here.
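A sketch of a version with checks, using the message strings from the interaction that follows. The structure of the checks is an assumption, and the hash-refs helper is repeated so the example stands alone:

```racket
#lang racket
(require (for-syntax racket/syntax))

(define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

;; Assumed helper: nested lookup with an optional default.
(define (hash-refs h ks [def #f])
  (with-handlers ([exn:fail? (const (if (procedure? def) (def) def))])
    (for/fold ([h h]) ([k (in-list ks)])
      (hash-ref h k))))

(define-syntax (hash.refs stx)
  (syntax-case stx ()
    ;; No arguments at all. `chain` is not bound by this pattern, so
    ;; #'chain here is just the literal identifier chain.
    [(_)
     (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                         stx #'chain)]
    [(_ chain)
     #'(hash.refs chain #f)]
    [(_ chain default)
     ;; Guard: the chain must be an identifier such as js.a.b.c.
     (unless (identifier? #'chain)
       (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                           stx #'chain))
     (let* ([chain-str (symbol->string (syntax->datum #'chain))]
            [ids (for/list ([str (in-list (regexp-split #rx"\\." chain-str))])
                   (format-id #'chain "~a" str))])
       ;; Require at least hash.key, with a non-empty key.
       (unless (and (>= (length ids) 2)
                    (not (eq? (syntax-e (cadr ids)) '||)))
         (raise-syntax-error #f "Expected hash.key" stx #'chain))
       (with-syntax ([hash-table (car ids)]
                     [keys       (cdr ids)])
         #'(hash-refs hash-table 'keys default)))]))

;; Correct uses behave as before.
(hash.refs js.a.b.c)
```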
> (require (for-syntax racket/syntax))
; See if we catch each of the misuses
> (hash.refs)
eval:96:0: hash.refs: Expected (hash.key0[.key1 ...] [default])
  at: chain
  in: (hash.refs)
> (hash.refs 0)
eval:98:0: hash.refs: Expected (hash.key0[.key1 ...] [default])
  at: 0
  in: (hash.refs 0 #f)
> (hash.refs js)
eval:99:0: hash.refs: Expected hash.key
  at: js
  in: (hash.refs js #f)
> (hash.refs js.)
eval:100:0: hash.refs: Expected hash.key
  at: js.
  in: (hash.refs js. #f)
Not too bad. Of course, the version with error-checking is quite a bit
+longer. Error-checking code generally tends to obscure the logic, and
+does here. Fortunately we’ll soon see how syntax-parse can
+help mitigate that, in much the same way as contracts in normal
+Racket or types in Typed Racket.
Maybe we’re not convinced that writing (hash.refs js.a.b.c)
+is really clearer than (hash-refs js '(a b c)). Maybe we
+won’t actually use this approach. But the Racket macro system makes it
+a possible choice.