Fear of Macros

Contents:
4.1 "A pattern variable can’t be used outside of a template"
1 Preface
I learned Racket after 25 years of mostly using C and C++.
Some psychic whiplash resulted.
"All the parentheses" was actually not a big deal. Instead, the first mind warp was functional programming. Before long I wrapped my brain around it, and went on to become comfortable and effective with many other aspects and features of Racket.
But two final frontiers remained: Macros and continuations.
I found that simple macros were easy and understandable, plus there
@@ -188,37 +188,62 @@ work it out. The "template" the error message refers to is the
want. The obvious, required template is the final expression supplying the output syntax. But you can use syntax (a.k.a. #') on a pattern variable. This makes another template, albeit a small, "fun
-size" template. Let’s try that:
> (define-syntax (hyphen-define/wrong1.1 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (let ([name (string->symbol (format "~a-~a" #'a #'b))]) #'(define (name args ...) body0 body ...))]))
No more error. Good! Let’s try to use it:
> (hyphen-define/wrong1.1 foo bar () #t) > (foo-bar) foo-bar: undefined;
cannot reference an identifier before its definition
in module: 'program
It seems we’re defining a function with a name other than -foo-bar?
This is where the Macro Stepper in DrRacket is invaluable. Even if you -prefer mostly to use Emacs, this is a situation where it’s worth using -DrRacket at least temporarily for its Macro Stepper.
The Macro Stepper says that the use of our macro:
(hyphen-define/wrong1.1 foo bar () #t)
expanded to:
(define (name) #t)
Well that explains it. Instead, we wanted to expand to:
(define (foo-bar) #t)
Our template is using the symbol name but we wanted its -value, such as foo-bar in this use of our macro.
A solution here is with-syntax (you could
-consider with-syntax to mean "define pattern variables"),
-which lets us say that name is something whose value can be
-used in our output template. In effect, it lets us say that
-name is an additional pattern variable.
> (define-syntax (hyphen-define/wrong1.3 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (with-syntax ([name (datum->syntax stx (string->symbol (format "~a-~a" #'a #'b)))]) #'(define (name args ...) body0 body ...))])) > (hyphen-define/wrong1.3 foo bar () #t) > (foo-bar) foo-bar: undefined;
cannot reference an identifier before its definition
in module: 'program
Hmm. foo-bar is still not defined. Back to the Macro -Stepper. It says now we’re expanding to:
(define (|#<syntax:11:24foo>-#<syntax:11:28 bar>|) #t)
Oh right: #'a and #'b are syntax objects, and -format is printing them as such. Instead we want the datum in -the syntax objects (the symbols foo and bar). Let’s -use syntax->datum:
> (define-syntax (hyphen-define/ok1 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (with-syntax ([name (datum->syntax stx (string->symbol (format "~a-~a" (syntax->datum #'a) (syntax->datum #'b))))]) #'(define (name args ...) body0 body ...))])) > (hyphen-define/ok1 foo bar () #t) > (foo-bar) #t
And now it works!
By the way, there is a utility function in racket/syntax
-called format-id that lets us format identifier names more
+size" template. Let’s try that:
> (define-syntax (hyphen-define/wrong1.1 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (let ([name (string->symbol (format "~a-~a" #'a #'b))]) #'(define (name args ...) body0 body ...))]))
No more error. Good! Let’s try to use it:
> (hyphen-define/wrong1.1 foo bar () #t) > (foo-bar) foo-bar: undefined;
cannot reference an identifier before its definition
in module: 'program
Apparently our macro is defining a function with some name other than +foo-bar. Huh.
This is where the Macro Stepper in DrRacket is
+invaluable. (Even if you prefer mostly to use Emacs, this
+is a situation where it’s definitely worth temporarily using DrRacket
+for its Macro Stepper.)
The Macro Stepper says that the use of our macro:
(hyphen-define/wrong1.1 foo bar () #t)
expanded to:
(define (name) #t)
Well that explains it. Instead, we wanted to expand to:
(define (foo-bar) #t)
Our template is using the symbol name but we wanted its +value, such as foo-bar in this use of our macro.
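Can we think of something we already know that behaves like
+this, where using a variable in the template yields its value? Sure
+we do: Pattern variables. Our pattern doesn’t include name
+because we don’t expect it in the original syntax; indeed the whole
+point of this macro is to create it. So name can’t be in the
+main pattern. Fine, let’s make an additional pattern. We can
+do that using an additional, nested syntax-case: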
> (define-syntax (hyphen-define/wrong1.2 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (syntax-case (datum->syntax stx (string->symbol (format "~a-~a" #'a #'b))) () [name #'(define (name args ...) body0 body ...)])]))
Looks weird? Let’s take a deep breath. Normally our transformer +function is given syntax by Racket, and we pass that syntax to +syntax-case. But we can also create some syntax of our own, +on the fly, and pass that to syntax-case. That’s all +we’re doing here. The whole (datum->syntax ...) expression is +syntax that we’re creating on the fly. We can give that to +syntax-case, and match it using a pattern variable named +name. Voila, we have a new pattern variable. We can use it in +a template, and its value will go in the template.
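The same trick can be tried outside of any macro, which may make it feel less magical. The following is only a sketch: define-form-for is a hypothetical helper invented for this illustration, and it passes #f as the lexical context because nothing here gets evaluated as code.

```racket
#lang racket

;; Hypothetical helper, for illustration only: build some syntax on the
;; fly from a symbol, then hand it to syntax-case so that the pattern
;; variable `name` is bound to it.
(define (define-form-for sym)
  (syntax-case (datum->syntax #f sym) ()
    [name
     ;; `name` is a pattern variable here, so the template receives its
     ;; value rather than the literal symbol `name`.
     (syntax->datum #'(define (name) #t))]))

(define-form-for 'foo-bar) ; => '(define (foo-bar) #t)
```

Converting the result back with syntax->datum is just a convenience for printing; a real transformer would return the syntax object itself.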
We might have one more (just one, I promise!) small problem left.
+Let’s try to use our new version:
> (hyphen-define/wrong1.2 foo bar () #t) > (foo-bar) foo-bar: undefined;
cannot reference an identifier before its definition
in module: 'program
Hmm. foo-bar is still not defined. Back to the Macro +Stepper. It says now we’re expanding to:
(define (|#<syntax:11:24foo>-#<syntax:11:28 bar>|) #t)
Oh right: #'a and #'b are syntax objects. Therefore
(string->symbol (format "~a-~a" #'a #'b))
is something like
|#<syntax:11:24foo>-#<syntax:11:28 bar>|
—
Instead we want the datum in the syntax objects (such as the symbols +foo and bar). Let’s use syntax->datum to +get it:
> (define-syntax (hyphen-define/ok1 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (syntax-case (datum->syntax stx (string->symbol (format "~a-~a" (syntax->datum #'a) (syntax->datum #'b)))) () [name #'(define (name args ...) body0 body ...)])])) > (hyphen-define/ok1 foo bar () #t) > (foo-bar) #t
And now it works!
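The difference between a syntax object and the datum inside it can also be seen in isolation. This is a small sketch, assuming a hypothetical helper named hyphenate that is not part of the macro above:

```racket
#lang racket

;; Hypothetical helper, for illustration only: join the symbols inside
;; two syntax objects with a hyphen.
(define (hyphenate a b)
  (string->symbol
   (format "~a-~a" (syntax->datum a) (syntax->datum b))))

(hyphenate #'foo #'bar) ; => 'foo-bar

;; Without syntax->datum, format prints the syntax objects themselves
;; (source location included), so the resulting symbol is not foo-bar.
(define wrong (string->symbol (format "~a-~a" #'foo #'bar)))
(eq? wrong 'foo-bar) ; => #f
```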
Now for two shortcuts.
Instead of an additional, nested syntax-case we could use
+with-syntax (another name for
+with-syntax could be "define pattern variable"). This
+rearranges the syntax-case to look more like a let
+statement—
> (define-syntax (hyphen-define/ok2 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (with-syntax ([name (datum->syntax stx (string->symbol (format "~a-~a" (syntax->datum #'a) (syntax->datum #'b))))]) #'(define (name args ...) body0 body ...))])) > (hyphen-define/ok2 foo bar () #t) > (foo-bar) #t
Whether you use an additional syntax-case or use +with-syntax, either way you are simply defining an additional +pattern variable. Don’t let the terminology and structure make it seem +mysterious.
Also, there is a utility function in racket/syntax called
+format-id that lets us format identifier names more
succinctly. As we’ve learned, we need to require the module
-using for-syntax, since we need it at compile time:
> (require (for-syntax racket/syntax))
> (define-syntax (hyphen-define/ok2 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (with-syntax ([name (format-id stx "~a-~a" #'a #'b)]) #'(define (name args ...) body0 body ...))])) > (hyphen-define/ok2 bar baz () #t) > (bar-baz) #t
Using format-id is convenient as it handles the tedium of -converting from syntax to datum and back again.
To review:
If you want to munge pattern variables for use in the -template, with-syntax is your friend.
You will need to use syntax or #' on the pattern -variables to turn them into "fun size" templates.
Usually you’ll also need to use syntax->datum to get -the interesting value inside.
format-id is convenient for formatting identifier -names.
4.2 Making our own struct
In this example we’ll pretend that Racket doesn’t already have a -struct capability. Fortunately, we can define a macro to -provide this feature. To keep things simple, our structure will be -immutable (read-only) and it won’t support inheritance.
Given a structure declaration like:
(our-struct name (field1 field2 ...))
We need to define some procedures.
A constructor procedure whose name is the struct name. We’ll
+using for-syntax, since we need it at compile time:
> (require (for-syntax racket/syntax)) > (define-syntax (hyphen-define/ok3 stx) (syntax-case stx () [(_ a b (args ...) body0 body ...) (with-syntax ([name (format-id stx "~a-~a" #'a #'b)]) #'(define (name args ...) body0 body ...))])) > (hyphen-define/ok3 bar baz () #t) > (bar-baz) #t
Using format-id is convenient as it handles the tedium of +converting from syntax to symbol datum to string ... and all the way +back.
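To see what format-id hands back, we can call it at run time instead of inside a transformer. This is a minimal sketch; using #'here as the lexical context is an assumption made only so the example stands alone (in the macro above the context is stx).

```racket
#lang racket
(require racket/syntax) ; plain require: we call format-id at run time here

;; format-id accepts identifiers directly and returns a new identifier.
(define id (format-id #'here "~a-~a" #'bar #'baz))

(identifier? id) ; => #t
(syntax-e id)    ; => 'bar-baz
```

Inside a macro the require would need for-syntax, as shown above, because there format-id runs at compile time.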
Finally, here’s a variation that accepts any number of name parts that +are joined with hyphens:
> (require (for-syntax racket/string racket/syntax)) > (define-syntax (hyphen-define* stx) (syntax-case stx () [(_ (names ...) (args ...) body0 body ...) (let* ([names/sym (map syntax-e (syntax->list #'(names ...)))] [names/str (map symbol->string names/sym)] [name/str (string-join names/str "-")] [name/sym (string->symbol name/str)]) (with-syntax ([name (datum->syntax stx name/sym)]) #`(define (name args ...) body0 body ...)))])) > (hyphen-define* (foo bar baz) (v) (* 2 v)) > (foo-bar-baz 50) 100
To review:
You can’t use a pattern variable outside of a template. But +you can use syntax or #' on a pattern variable to make +an ad hoc "fun size" template.
If you want to munge pattern variables for use in the +template, with-syntax is your friend, because it lets you +create new pattern variables.
Usually you’ll need to use syntax->datum to get the +interesting value inside.
format-id is convenient for formatting identifier +names.
4.2 Making our own struct
Let’s apply what we just learned to a more-realistic example. We’ll +pretend that Racket doesn’t already have a struct +capability. Fortunately, we can write a macro to provide our own +system for defining and using structures. To keep things simple, our +structure will be immutable (read-only) and it won’t support +inheritance.
Given a structure declaration like:
(our-struct name (field1 field2 ...))
We need to define some procedures:
A constructor procedure whose name is the struct name. We’ll represent structures as a vector. The structure name will be element zero. The fields will be elements one onward.
A predicate, whose name is the struct name with ? appended.
For each field, an accessor procedure to get its value. These will be named struct-field (the name of the struct, a hyphen, and the
-field name).
> (require (for-syntax racket/syntax)) > (define-syntax (our-struct stx) (syntax-case stx () [(_ id (fields ...)) (with-syntax ([pred-id (format-id stx "~a?" #'id)]) #`(begin ; Define a constructor. (define (id fields ...) (apply vector (cons 'id (list fields ...)))) ; Define a predicate. (define (pred-id v) (and (vector? v) (eq? (vector-ref v 0) 'id))) ; Define an accessor for each field. #,@(for/list ([x (syntax->list #'(fields ...))] [n (in-naturals 1)]) (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)] [ix n]) #`(define (acc-id v) (unless (pred-id v) (error 'acc-id "~a is not a ~a struct" v 'id)) (vector-ref v ix))))))])) ; Test it out > (require rackunit) > (our-struct foo (a b)) > (define s (foo 1 2)) > (check-true (foo? s)) > (check-false (foo? 1)) > (check-equal? (foo-a s) 1) > (check-equal? (foo-b s) 2) > (check-exn exn:fail? (lambda () (foo-a "furble"))) ; The tests passed. ; Next, what if someone tries to declare: > (our-struct "blah" ("blah" "blah")) format-id: contract violation
expected: (or/c string? symbol? identifier? keyword? char?
number?)
given: #<syntax:71:0 "blah">
The error message is not very helpful. It’s coming from
+field name).
> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx) (syntax-case stx () [(_ id (fields ...)) (with-syntax ([pred-id (format-id stx "~a?" #'id)]) #`(begin ; Define a constructor. (define (id fields ...) (apply vector (cons 'id (list fields ...)))) ; Define a predicate. (define (pred-id v) (and (vector? v) (eq? (vector-ref v 0) 'id))) ; Define an accessor for each field. #,@(for/list ([x (syntax->list #'(fields ...))] [n (in-naturals 1)]) (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)] [ix n]) #`(define (acc-id v) (unless (pred-id v) (error 'acc-id "~a is not a ~a struct" v 'id)) (vector-ref v ix))))))])) ; Test it out > (require rackunit) > (our-struct foo (a b)) > (define s (foo 1 2)) > (check-true (foo? s)) > (check-false (foo? 1)) > (check-equal? (foo-a s) 1) > (check-equal? (foo-b s) 2)
> (check-exn exn:fail? (lambda () (foo-a "furble"))) ; The tests passed. ; Next, what if someone tries to declare: > (our-struct "blah" ("blah" "blah")) format-id: contract violation
expected: (or/c string? symbol? identifier? keyword? char?
number?)
given: #<syntax:78:0 "blah">
The error message is not very helpful. It’s coming from format-id, which is a private implementation detail of our macro.
You may know that a syntax-case clause can take an
-optional "guard" or "fender" expression. Instead of
[pattern template]
It can be:
[pattern guard template]
Let’s add a guard expression to our clause:
> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx) (syntax-case stx () [(_ id (fields ...)) ; Guard or "fender" expression: (for-each (lambda (x) (unless (identifier? x) (raise-syntax-error #f "not an identifier" stx x))) (cons #'id (syntax->list #'(fields ...)))) (with-syntax ([pred-id (format-id stx "~a?" #'id)]) #`(begin ; Define a constructor. (define (id fields ...) (apply vector (cons 'id (list fields ...)))) ; Define a predicate. (define (pred-id v) (and (vector? v) (eq? (vector-ref v 0) 'id))) ; Define an accessor for each field. #,@(for/list ([x (syntax->list #'(fields ...))] [n (in-naturals 1)]) (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)] [ix n]) #`(define (acc-id v) (unless (pred-id v) (error 'acc-id "~a is not a ~a struct" v 'id)) (vector-ref v ix))))))])) ; Now the same misuse gives a better error message: > (our-struct "blah" ("blah" "blah")) eval:74:0: our-struct: not an identifier
at: "blah"
in: (our-struct "blah" ("blah" "blah"))
Later, we’ll see how syntax-parse makes it even easier to
+optional "guard" or "fender" expression. Instead of
[pattern template]
It can be:
[pattern guard template]
Let’s add a guard expression to our clause:
> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx) (syntax-case stx () [(_ id (fields ...)) ; Guard or "fender" expression: (for-each (lambda (x) (unless (identifier? x) (raise-syntax-error #f "not an identifier" stx x))) (cons #'id (syntax->list #'(fields ...)))) (with-syntax ([pred-id (format-id stx "~a?" #'id)]) #`(begin ; Define a constructor. (define (id fields ...) (apply vector (cons 'id (list fields ...)))) ; Define a predicate. (define (pred-id v) (and (vector? v) (eq? (vector-ref v 0) 'id))) ; Define an accessor for each field. #,@(for/list ([x (syntax->list #'(fields ...))] [n (in-naturals 1)]) (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)] [ix n]) #`(define (acc-id v) (unless (pred-id v) (error 'acc-id "~a is not a ~a struct" v 'id)) (vector-ref v ix))))))])) ; Now the same misuse gives a better error message: > (our-struct "blah" ("blah" "blah")) eval:81:0: our-struct: not an identifier
at: "blah"
in: (our-struct "blah" ("blah" "blah"))
Later, we’ll see how syntax-parse makes it even easier to check usage and provide helpful messages about mistakes.
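The guard in our-struct works by raising an error itself. A guard can also just return #f, in which case the clause is treated as not matching and syntax-case moves on to the next clause. Here is a small sketch of that behavior; the macro name describe is made up for this illustration.

```racket
#lang racket

;; Hypothetical macro, for illustration only. The first clause has a
;; guard; when the guard returns #f the clause is skipped and the
;; second clause gets a chance to match.
(define-syntax (describe stx)
  (syntax-case stx ()
    [(_ x)
     (identifier? #'x)   ; guard or "fender" expression
     #'"identifier"]
    [(_ x)
     #'"something else"]))

(describe foo) ; => "identifier"   (foo is never evaluated)
(describe 42)  ; => "something else"
```

Which style to use is a design choice: falling through suits macros with several legitimate shapes, while raising from the guard, as our-struct does, suits reporting a definite misuse.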
4.3 Using dot notation for nested hash lookups
The previous two examples used a macro to define functions whose names were made by joining identifiers provided to the macro. This example does the opposite: The identifier given to the macro is split into
@@ -228,7 +253,7 @@ dictionaries that contain other dictionaries. In a jsexpr? these are represented by nested hasheq tables.
In JavaScript you can use dot notation:
foo = js.a.b.c;
In Racket it’s not so convenient:
; Nested hasheqs typical of a jsexpr: > (define js (hasheq 'a (hasheq 'b (hasheq 'c "value")))) ; Typical annoying code to get something: > (hash-ref (hash-ref (hash-ref js 'a) 'b) 'c) "value"
We can write a helper function to make this a bit cleaner:
; This helper function:
> (define/contract (hash-refs h ks [def #f]) ((hash? (listof any/c)) (any/c) . ->* . any) (with-handlers ([exn:fail? (const (cond [(procedure? def) (def)] [else def]))]) (for/fold ([h h]) ([k (in-list ks)]) (hash-ref h k)))) ; Lets us say: > (hash-refs js '(a b c)) "value"
That’s not bad. Can we go even further and use a dot notation somewhat like JavaScript?
; This macro: > (require (for-syntax racket/syntax))
> (define-syntax (hash.refs stx) (syntax-case stx () [(_) (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])" stx #'chain)] [(_ chain) #'(hash.refs chain #f)] [(_ chain default) (unless (symbol? (syntax-e #'chain)) (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])" stx #'chain)) (let ([xs (map (lambda (x) (datum->syntax stx (string->symbol x))) (regexp-split #rx"\\." (symbol->string (syntax->datum #'chain))))]) (unless (and (>= (length xs) 2) (not (eq? (syntax-e (cadr xs)) '||))) (raise-syntax-error #f "Expected hash.key" stx #'chain)) (with-syntax ([h (car xs)] [ks (cdr xs)]) #'(hash-refs h 'ks default)))])) ; Gives us "sugar" to say this: > (hash.refs js.a.b.c) "value"
It works!
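The heart of hash.refs is splitting one identifier on its dots. That step can be sketched on its own with plain symbols; split-dotted is a hypothetical helper name, not something the macro defines.

```racket
#lang racket

;; Hypothetical helper, for illustration only: split a dotted symbol
;; into its parts, the way hash.refs does with the chain identifier.
(define (split-dotted sym)
  (map string->symbol
       (regexp-split #rx"\\." (symbol->string sym))))

(split-dotted 'js.a.b.c) ; => '(js a b c)
(split-dotted 'js)       ; => '(js)
;; A trailing dot leaves an empty string behind, which becomes the
;; empty symbol ||. That is what the macro's second check looks for.
(cadr (split-dotted 'js.)) ; => '||
```

The last two cases correspond to the two conditions the macro tests before expanding: at least two parts, and a second part that is not the empty symbol.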
We’ve started to appreciate that our macros should give helpful messages when used in error. We tried to do that here. Let’s
-deliberately elicit various errors. Are the messages helpful?
> (hash.refs) eval:80:0: hash.refs: Expected (hash.key0[.key1 ...]
[default])
at: chain
in: (hash.refs)
> (hash.refs 0) eval:83:0: hash.refs: Expected (hash.key0[.key1 ...]
[default])
at: 0
in: (hash.refs 0 #f)
> (hash.refs js) eval:84:0: hash.refs: Expected hash.key
at: js
in: (hash.refs js #f)
> (hash.refs js.) eval:85:0: hash.refs: Expected hash.key
at: js.
in: (hash.refs js. #f)
Not too bad.
Maybe we’re not convinced that writing (hash.refs js.a.b.c)
+deliberately elicit various errors. Are the messages helpful?
> (hash.refs) eval:87:0: hash.refs: Expected (hash.key0[.key1 ...]
[default])
at: chain
in: (hash.refs)
> (hash.refs 0) eval:90:0: hash.refs: Expected (hash.key0[.key1 ...]
[default])
at: 0
in: (hash.refs 0 #f)
> (hash.refs js) eval:91:0: hash.refs: Expected hash.key
at: js
in: (hash.refs js #f)
> (hash.refs js.) eval:92:0: hash.refs: Expected hash.key
at: js.
in: (hash.refs js. #f)
Not too bad.
Maybe we’re not convinced that writing (hash.refs js.a.b.c) is really clearer than (hash-refs js '(a b c)). Maybe we won’t actually use this approach. But the Racket macro system makes it a possible choice.
5 Syntax parameters
"Anaphoric if" or "aif" is a popular macro example. Instead of writing:
(let ([tmp (big-long-calculation)]) (if tmp (foo tmp) #f))
You could write:
(aif (big-long-calculation) (foo it) #f)
In other words, when the condition is true, an it identifier
diff --git a/main.rkt b/main.rkt
index 0a8d62a..eea43bb 100644
--- a/main.rkt
+++ b/main.rkt
@@ -723,12 +723,13 @@ No more error---good! Let's try to use it:
(foo-bar)
]
-It seems we're defining a function with a name other than
-@racket[foo-bar]?
+Apparently our macro is defining a function with some name other than
+@racket[foo-bar]. Huh.
-This is where the Macro Stepper in DrRacket is invaluable. Even if you
-prefer mostly to use Emacs, this is a situation where it's worth using
-DrRacket at least temporarily for its Macro Stepper.
+This is where the Macro Stepper in DrRacket is
+invaluable. @margin-note{Even if you prefer mostly to use Emacs, this
+is a situation where it's definitely worth temporarily using DrRacket
+for its Macro Stepper.}
@image[#:scale 0.5 "macro-stepper.png"]
@@ -753,23 +754,39 @@ Well that explains it. Instead, we wanted to expand to:
Our template is using the symbol @racket[name] but we wanted its
value, such as @racket[foo-bar] in this use of our macro.
-A solution here is @racket[with-syntax]@margin-note*{You could
-consider @racket[with-syntax] to mean, "define pattern variables".},
-which lets us say that @racket[name] is something whose value can be
-used in our output template. In effect, it lets us say that
-@racket[name] is an additional pattern variable.
+Can we think of something we already know that behaves like
+this---where using a variable in the template yields its value? Sure
+we do: Pattern variables. Our pattern doesn't include @racket[name]
+because we don't expect it in the original syntax---indeed the whole
+point of this macro is to create it. So @racket[name] can't be in the
+main pattern. Fine---let's make an @italic{additional} pattern. We can
+do that using an additional, nested @racket[syntax-case]:
@i[
-(define-syntax (hyphen-define/wrong1.3 stx)
+(define-syntax (hyphen-define/wrong1.2 stx)
(syntax-case stx ()
[(_ a b (args ...) body0 body ...)
- (with-syntax ([name (datum->syntax stx
- (string->symbol (format "~a-~a"
- #'a
- #'b)))])
- #'(define (name args ...)
- body0 body ...))]))
-(hyphen-define/wrong1.3 foo bar () #t)
+ (syntax-case (datum->syntax stx
+ (string->symbol (format "~a-~a" #'a #'b))) ()
+ [name #'(define (name args ...)
+ body0 body ...)])]))
+]
+
+Looks weird? Let's take a deep breath. Normally our transformer
+function is given syntax by Racket, and we pass that syntax to
+@racket[syntax-case]. But we can also create some syntax of our own,
+on the fly, and pass @italic{that} to @racket[syntax-case]. That's all
+we're doing here. The whole @racket[(datum->syntax ...)] expression is
+syntax that we're creating on the fly. We can give that to
+@racket[syntax-case], and match it using a pattern variable named
+@racket[name]. Voila, we have a new pattern variable. We can use it in
+a template, and its value will go in the template.
+
+We might have one more---just one, I promise!---small problem left.
+Let's try to use our new version:
+
+@i[
+(hyphen-define/wrong1.2 foo bar () #t)
(foo-bar)
]
@@ -780,13 +797,47 @@ Stepper. It says now we're expanding to:
(define (|#