Version: 5.3

Fear of Macros

Copyright (c) 2012 by Greg Hendershott. All rights reserved.

    1 Introduction

    2 The plan of attack

    3 Transformers

      3.1 What is a syntax transformer?

      3.2 What is the input?

      3.3 Actually transforming the input

      3.4 Compile time vs. run time

    4 Pattern matching: syntax-case and syntax-rules

      4.1 Patterns and templates

        4.1.1 "A pattern variable can’t be used outside of a template"

    5 Syntax parameters

    6 Robust macros: syntax-parse

    7 Other questions

      7.1 What’s the point of with-syntax?

      7.2 What’s the point of begin-for-syntax?

      7.3 What’s the point of racket/splicing?

    8 References/Acknowledgements

    9 Epilog

1 Introduction

I learned Racket after 25 years of doing C/C++ imperative programming.

Some psychic whiplash resulted.

"All the parentheses" was actually not a big deal. Instead, the first mind warp was functional programming. Before long I wrapped my brain around it, and went on to become comfortable and effective with many other aspects and features of Racket.

But two final frontiers remained: Macros and continuations.

I found that simple macros were easy and understandable, plus there were many good tutorials available. But the moment I stepped past routine pattern-matching, I kind of fell off a cliff into a terminology soup. I marinaded myself in material, hoping it would eventually sink in after enough re-readings. I even found myself using trial and error, rather than having a clear mental model of what was going on. Gah.

I’m starting to write this at the point where the shapes are slowly emerging from the fog.

My primary motive is selfish. Explaining something forces me to learn it more thoroughly. Plus I expect that if I write something with mistakes, other people will be eager to point them out and correct me. Is that a social-engineering variation of meta-programming? Next question, please. :)

Finally, I do hope it may help other people who have a background and/or learning style similar to mine.

I want to show how Racket macro features have evolved as solutions to problems or annoyances. I learn more quickly and deeply when I discover the answer to a question I already have, or find the solution to a problem whose pain I already feel. Therefore I’ll give you the questions and problems first, so that you can better appreciate and understand the answers and solutions.

2 The plan of attack

The macro system you will mostly want to use for production-quality macros is called syntax-parse. And don’t worry, we’ll get to that soon.

But if we start there, you’re likely to feel overwhelmed by concepts and terminology, and get very confused. I did.

1. Instead let’s start with the basics: A syntax object and a function to change it (a "transformer"). We’ll work at that level for a while to get comfortable and to de-mythologize this whole macro business.

2. Next, we’ll realize that some pattern-matching would make life easier. We’ll learn about syntax-case, and its shorthand cousin, define-syntax-rule. We’ll discover we can get confused if we want to munge pattern variables before sticking them back in the template, and learn how to do that.

3. At this point we’ll be able to write many useful macros. But, what if we want to write the ever-popular anaphoric if, with a "magic variable"? It turns out we’ve been protected from making certain kinds of mistakes. When we want to do this kind of thing on purpose, we use a syntax parameter. [There are other, older ways to do this. We won’t look at them. We also won’t spend a lot of time talking about "hygiene".]

4. Finally, we’ll realize that our macros could be smarter when they’re used in error. Normal Racket functions can optionally have contracts and types. These can catch mistakes and provide clear, useful error messages. It would be great if there were something similar for macros, and there is. One of the more recent Racket macro enhancements is syntax-parse.

3 Transformers

  YOU ARE INSIDE A ROOM.

  THERE ARE KEYS ON THE GROUND.

  THERE IS A SHINY BRASS LAMP NEARBY.

  

  IF YOU GO THE WRONG WAY, YOU WILL BECOME

  HOPELESSLY LOST AND CONFUSED.

  

  > pick up the keys

  

  YOU HAVE A SYNTAX TRANSFORMER

3.1 What is a syntax transformer?

A syntax transformer is not one of the トランスフォーマ transformers.

Instead, it is quite simple. It is a function. The function takes syntax and returns syntax. It transforms syntax.

Here’s a transformer function that ignores its input syntax, and always outputs syntax for a string literal:

> (define-syntax foo
    (lambda (stx)
      (syntax "I am foo")))
> (foo)

"I am foo"

When we use define-syntax, we’re making a transformer binding. This tells the Racket compiler, "Whenever you encounter a chunk of syntax starting with foo, please give it to my transformer function, and replace it with the syntax I give back to you." So Racket will give anything that looks like (foo ...) to our function, and we can change it. Much like a search-and-replace.

Maybe you know that the usual way to define a function in Racket:

(define (f x) ...)

is shorthand for:

(define f (lambda (x) ...))

That shorthand lets you avoid typing lambda and some parentheses.

Well there is a similar shorthand for define-syntax:

> (define-syntax (also-foo stx)
    (syntax "I am also foo"))
> (also-foo)

"I am also foo"

What we want to remember is that this is simply shorthand. We are still defining a transformer function, which takes syntax and returns syntax. Everything we do with macros will be built on top of this basic idea. It’s not magic.

Speaking of shorthand, there is also a shorthand for syntax, which is #':

> (define-syntax (quoted-foo stx)
    #'"I am also foo, using #' instead of syntax")
> (quoted-foo)

"I am also foo, using #' instead of syntax"

Of course, we can emit syntax that is more interesting than a string literal. How about returning (displayln "hi")?

> (define-syntax (say-hi stx)
    #'(displayln "hi"))
> (say-hi)

hi

When Racket expands our program, it sees the occurrence of (say-hi), and sees it has a transformer function for that. It calls our function with the old syntax, and we return the new syntax, which is used to evaluate and run our program.

3.2 What is the input?

Our examples so far have been ignoring the input syntax, and outputting a fixed syntax. Usually, we want to transform the input to something else.

But let’s start by looking at what the input is:

> (define-syntax (show-me stx)
    (print stx)
    #'(void))
> (show-me '(i am a list))

#<syntax:10:0 (show-me (quote (i am a list)))>

The (print stx) shows what our transformer is given: a syntax object.

A syntax object consists of several things. The first part is the s-expression representing the code, such as '(i am a list). Racket (and Scheme and Lisp) expressions are s-expressions: code and data have the same structure, and this makes it vastly easier to rewrite syntax, i.e. write macros.

Racket syntax is also decorated with some interesting information such as the source file, line number, and column. Finally, it has information about lexical scoping (which you don’t need to worry about now, but which will turn out to be important later).

There are a variety of functions available to access a syntax object:

> (define stx #'(if x (list "true") #f))
> (syntax->datum stx)

'(if x (list "true") #f)

> (syntax-e stx)

'(#<syntax:11:0 if> #<syntax:11:0 x> #<syntax:11:0 (list "true")> #<syntax:11:0 #f>)

> (syntax->list stx)

'(#<syntax:11:0 if> #<syntax:11:0 x> #<syntax:11:0 (list "true")> #<syntax:11:0 #f>)

> (syntax-source stx)

'eval

> (syntax-line stx)

11

> (syntax-column stx)

0

When we want to transform syntax, we’ll generally take the pieces we were given, maybe rearrange their order, perhaps change some of the pieces, and often introduce brand-new pieces.
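
To make that concrete before we write a real transformer, here is a minimal sketch that does all three things to a syntax object using an ordinary run time function. The helper name infix->display is made up for this illustration; it is not part of Racket or of anything earlier in this text.

```racket
#lang racket

;; Hypothetical helper (an assumption for illustration): rewrite infix
;; syntax such as (1 + 2) into (displayln (+ 1 2)).
(define (infix->display stx)
  (define pieces (syntax->list stx))   ; a list of three syntax objects
  (define left   (first pieces))
  (define op     (second pieces))
  (define right  (third pieces))
  ;; Rearrange the old pieces and introduce one brand-new piece, displayln.
  (datum->syntax stx (list 'displayln (list op left right))))

(define rewritten (infix->display #'(1 + 2)))
(syntax->datum rewritten) ; '(displayln (+ 1 2))
```

Nothing here is a macro yet. It is only a function from syntax to syntax, which is the shape every transformer has.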

3.3 Actually transforming the input

Let’s write a transformer function that reverses the syntax it was given:

> (define-syntax (reverse-me stx)
    (datum->syntax stx (reverse (cdr (syntax->datum stx)))))
> (reverse-me "backwards" "am" "i" values)

"i"

"am"

"backwards"

What’s going on here? First we take the input syntax, and give it to syntax->datum. This converts the syntax into a plain old list:

> (syntax->datum #'(reverse-me "backwards" "am" "i" values))

'(reverse-me "backwards" "am" "i" values)

Using cdr slices off the first item of the list, reverse-me, leaving the remainder: ("backwards" "am" "i" values). Passing that to reverse changes it to (values "i" "am" "backwards"):

> (reverse (cdr '(reverse-me "backwards" "am" "i" values)))

'(values "i" "am" "backwards")

Finally we use datum->syntax to convert this back to syntax:

> (datum->syntax #f '(values "i" "am" "backwards"))

#<syntax (values "i" "am" "backwards")>

That’s what our transformer function gives back to the Racket compiler, and that syntax is evaluated:

> (values "i" "am" "backwards")

"i"

"am"

"backwards"

3.4 Compile time vs. run time

Normal Racket code runs at ... run time. Duh.

Instead of "compile time vs. run time", you may hear it described as "syntax phase vs. runtime phase". Same difference.

But a syntax transformer is run by the Racket compiler, as part of the process of parsing, expanding and understanding your code. In other words, your syntax transformer function is evaluated at compile time.
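
One way to convince yourself of this is a macro that does its real work inside the transformer. The following is only a sketch, and compile-time-sum is a name I made up for it: the addition happens while the code is being expanded, and what reaches run time is just the literal result.

```racket
#lang racket

;; Hypothetical macro (an assumption for illustration): add literal
;; numbers while the code is being compiled.
(define-syntax (compile-time-sum stx)
  ;; Here the arguments are plain data pulled out of the syntax object.
  (define numbers (cdr (syntax->datum stx)))
  ;; The sum is computed by the transformer, at compile time...
  (datum->syntax stx (apply + numbers)))

;; ...so this use is replaced by the literal 6 before the program runs.
(define six (compile-time-sum 1 2 3))
```

This only works because the arguments are literal numbers. A variable like x would just be the symbol x at compile time, with no value to add.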

This aspect of macros lets you do things that simply aren’t possible in normal code. One of the classic examples is something like the Racket if form:

(if <condition> <true-expression> <false-expression>)

If if were implemented as a function, all of the arguments would be evaluated before being provided to the function.

> (define (our-if condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if #t
          "true"
          "false")

"true"

That seems to work. However, how about this:

> (define (display-and-return x)
    (displayln x)
    x)
> (our-if #t
          (display-and-return "true")
          (display-and-return "false"))

true

false

"true"

Oops. Because the expressions have a side-effect, it’s obvious that they are both evaluated. And that could be a problem: what if the side-effect includes deleting a file on disk? You wouldn’t want (if user-wants-file-deleted? (delete-file) (void)) to delete a file even when user-wants-file-deleted? is #f.

One answer is that functional programming is good, and side-effects are bad. But avoiding side-effects isn’t always practical.

So this simply can’t work as a plain function. However a syntax transformer can rearrange the syntax – rewrite the code – at compile time. The pieces of syntax are moved around, but they aren’t actually evaluated until run time.

Here is one way to do this:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))
> (our-if-v2 #t
             (display-and-return "true")
             (display-and-return "false"))

true

"true"

> (our-if-v2 #f
             (display-and-return "true")
             (display-and-return "false"))

false

"false"

That gave the right answer. But how? Let’s pull out the transformer function itself, and see what it did. We start with an example of some input syntax:

> (define stx #'(our-if-v2 #t "true" "false"))
> (displayln stx)

#<syntax:31:0 (our-if-v2 #t "true" "false")>

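1. We take the original syntax and change it into a list. The transformer itself used syntax->list, which leaves each element as a syntax object; here we use syntax->datum instead, which gives a plain Racket list that is easier to read when displayed: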

> (define xs (syntax->datum stx))
> (displayln xs)

(our-if-v2 #t true false)

2. To change this into a Racket cond form, we need to take the three interesting pieces—the condition, true-expression, and false-expression—from the list using cadr, caddr, and cadddr and arrange them into a cond form:

`(cond [,(cadr xs) ,(caddr xs)]
       [else ,(cadddr xs)])

3. Finally, we change that into syntax using datum->syntax:

> (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                            [else ,(cadddr xs)]))

#<syntax (cond (#t "true") (else "fals...>

So that works, but using cadddr etc. to destructure a list is painful and error-prone. Maybe you know Racket’s match? Using that would let us do pattern-matching.

Notice that we don’t care about the first item in the syntax list. We didn’t take (car xs) in our-if-v2, and the pattern-matching version below binds it to name but never uses it. In general, a syntax transformer won’t care about that, because it is the name of the transformer binding. In other words, a macro usually doesn’t care about its own name.

Instead of:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))

We can write:

> (define-syntax (our-if-using-match stx)
    (match (syntax->list stx)
      [(list name condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))
> (our-if-using-match #t "true" "false")

match: undefined;

 cannot reference an identifier before its definition

  in module: 'program

  phase: 1

But wait, we can’t. It’s complaining that match isn’t defined. We haven’t required the racket/match module?

It turns out we haven’t. Remember, this transformer function is working at compile time, not run time. And at compile time, only racket/base is required for you automatically. If we want something like racket/match, we have to require it ourselves, and require it for compile time. Instead of using plain (require racket/match), the way to say this is to use (require (for-syntax racket/match)), the for-syntax part meaning "for compile time".

So let’s try that:

> (require (for-syntax racket/match))
> (define-syntax (our-if-using-match-v2 stx)
    (match (syntax->list stx)
      [(list _ condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))
> (our-if-using-match-v2 #t "true" "false")

"true"

To review:

Syntax transformers work at compile time, not run time. The good news is this means we can do things like delay evaluation, and implement forms like if which simply couldn’t work properly as run time functions.

Some other good news is that there isn’t some special, weird language for writing syntax transformers. We can write these transformer functions using familiar Racket code. The semi-bad news is that the familiarity can make it easy to forget that we’re not working at run time. Sometimes that’s important to remember. For example, only racket/base is required for us automatically. If we need other modules, we have to require them, and we have to require them for compile time using (require (for-syntax)).
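
The same rule applies to any module, not just racket/match. As a small sketch, here is a made-up macro, keep-last, whose transformer uses last from racket/list. Take out the for-syntax require and the transformer can no longer see last, even though #lang racket makes it available at run time.

```racket
#lang racket
(require (for-syntax racket/list)) ; makes `last` available at compile time

;; Hypothetical macro (an assumption for illustration): expand to only
;; the final sub-form and throw the others away unevaluated.
(define-syntax (keep-last stx)
  ;; Drop the macro's own name, then pick the last syntax object.
  (last (cdr (syntax->list stx))))

(define kept (keep-last (error "never evaluated") "second" "third"))
```

The (error ...) form never runs, because the transformer discards it before run time.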

4 Pattern matching: syntax-case and syntax-rules

Most useful syntax transformers work by taking some input syntax, and rearranging the pieces into something else. As we saw, this is possible but tedious using list accessors such as cadddr. It’s more convenient and less error-prone to use pattern-matching.

It turns out that pattern-matching was one of the first improvements to be added to the Racket macro system. It’s called syntax-case, and has a shorthand for simple situations called define-syntax-rule.

Historically, syntax-case and syntax-rules pattern matching came first. match was added to Racket later.

Recall our previous example:

(require (for-syntax racket/match))
(define-syntax (our-if-using-match-v2 stx)
  (match (syntax->list stx)
    [(list _ condition true-expr false-expr)
     (datum->syntax stx `(cond [,condition ,true-expr]
                               [else ,false-expr]))]))

Here’s what it looks like using syntax-case:

> (define-syntax (our-if-using-syntax-case stx)
    (syntax-case stx ()
      [(_ condition true-expr false-expr)
       #'(cond [condition true-expr]
               [else false-expr])]))
> (our-if-using-syntax-case #t "true" "false")

"true"

Pretty similar, huh? The pattern part looks almost exactly the same. The "template" part, where we specify the new syntax, is simpler. We don’t need to do quasiquoting and unquoting. We don’t need to use datum->syntax. We simply supply a template, which uses variables from the pattern.
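
Like match, syntax-case can have more than one clause, and the first pattern that fits wins. Here is a sketch of a variation I made up for illustration, our-if/optional, where the false branch may be left out:

```racket
#lang racket

;; Hypothetical variation on our-if (an assumption for illustration):
;; the false branch is optional.
(define-syntax (our-if/optional stx)
  (syntax-case stx ()
    ;; Two sub-forms after the name: no false branch was supplied.
    [(_ condition true-expr)
     #'(cond [condition true-expr]
             [else (void)])]
    ;; Three sub-forms after the name: same template as before.
    [(_ condition true-expr false-expr)
     #'(cond [condition true-expr]
             [else false-expr])]))

(define with-two   (our-if/optional #t "true"))
(define with-three (our-if/optional #f "true" "false"))
```

Each clause pairs a pattern with its own template, so supporting another shape of input is a matter of adding a clause.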

There is a shorthand for simple pattern-matching cases, which expands into syntax-case. It’s called define-syntax-rule:

> (define-syntax-rule (our-if-using-syntax-rule condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if-using-syntax-rule #t "true" "false")

"true"

Here’s the thing about define-syntax-rule. Because it’s so simple, define-syntax-rule is often the first thing people are taught about macros. But it’s almost deceptively simple. It looks so much like defining a normal run time function, yet it’s not. It’s working at compile time, not run time. Worse, the moment you want to do more than define-syntax-rule can handle, you can fall off a cliff into what feels like complicated and confusing territory. Hopefully, because we started with a basic syntax transformer, and worked up from that, we won’t have that problem. We can appreciate define-syntax-rule as a convenient shorthand, but not be scared of, or confused about, that for which it’s shorthand.
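
To see how different two nearly identical definitions can be, here is a sketch that puts them side by side. The names if-as-macro, if-as-function, seen and note! are all made up for this illustration. Both definitions have the same body, but only the function evaluates every argument.

```racket
#lang racket

(define seen '())                      ; records which expressions ran
(define (note! x)
  (set! seen (cons x seen))
  x)

;; Looks like a function definition, but it is a compile time rewrite.
(define-syntax-rule (if-as-macro condition true-expr false-expr)
  (cond [condition true-expr]
        [else false-expr]))

;; Same body, but a real run time function.
(define (if-as-function condition true-expr false-expr)
  (cond [condition true-expr]
        [else false-expr]))

(if-as-macro    #t (note! 'macro-true) (note! 'macro-false))
(if-as-function #t (note! 'fn-true)    (note! 'fn-false))

(reverse seen) ; '(macro-true fn-true fn-false)
```

The macro version never runs (note! 'macro-false), while the function version runs both of its branch arguments before it is even called.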

4.1 Patterns and templates

Most of the materials I found for learning macros, including the Racket Guide, do a very good job explaining how the patterns work. I’m not going to regurgitate that here.

Instead, let’s look at some ways we’re likely to get tripped up.

4.1.1 "A pattern variable can’t be used outside of a template"

Let’s say we want to define a function with a hyphenated name, a-b, but we supply the a and b parts separately. The Racket struct form does something like this: if we define a struct named foo, it defines a number of functions whose names are variations on the name foo, such as foo-field1, foo-field2, foo?, and so on.

So let’s pretend we’re doing something like that. We want to transform the syntax (hyphen-define a b (args) body) to the syntax (define (a-b args) body).

A wrong first attempt is:

> (define-syntax (hyphen-define/wrong stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (let ([name (string->symbol (format "~a-~a" a b))])
         #'(define (name args ...)
             body0 body ...))]))

eval:46:0: a: pattern variable cannot be used outside of a

template

  in: a

Huh. We have no idea what this error message means. Well, let’s see. The "template" is the #'(define (name args ...) body0 body ...) portion. The let isn’t part of that template. It sounds like we can’t use a (or b) in the let part.

It turns out we can use a pattern variable in another pattern—by using syntax-case again:

> (define-syntax (hyphen-define/wrong stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx (string->symbol (format "~a-~a"
                                                               (syntax->datum a)
                                                               (syntax->datum b)))) ()
         [name #'(define (name args ...)
                   body0 body ...)])]))

eval:47:0: a: pattern variable cannot be used outside of a

template

  in: a

Well, not quite. We can’t use a and b directly. We have to wrap each one in syntax, or use its reader alias, #'. (I don’t have a clear explanation for why they need to be #'a and #'b. Can anyone help?) Here is the corrected version:

> (define-syntax (hyphen-define/ok1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx
                                   (string->symbol (format "~a-~a"
                                                           (syntax->datum #'a)
                                                           (syntax->datum #'b)))) ()
         [name #'(define (name args ...)
                   body0 body ...)])]))
> (hyphen-define/ok1 first second () #t)
> (first-second)

#t

And now it works!

There is a shorthand for using syntax-case this way. It’s called with-syntax. This makes it a little simpler:

> (define-syntax (hyphen-define/ok2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (datum->syntax stx
                                          (string->symbol (format "~a-~a"
                                                                  (syntax->datum #'a)
                                                                  (syntax->datum #'b))))])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok2 foo bar () #t)
> (foo-bar)

#t

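Another handy thing is that with-syntax will convert the expression to syntax automatically. (One caveat: an identifier converted this way does not get the lexical context of stx. It works at the REPL as shown here, but inside a module the generated name may not be visible to the code using the macro, so there the explicit datum->syntax is the safer choice.) So at the REPL we don’t need the datum->syntax stuff, and now it becomes even simpler: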

> (define-syntax (hyphen-define/ok3 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (string->symbol (format "~a-~a"
                                                   (syntax->datum #'a)
                                                   (syntax->datum #'b)))])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok3 foo bar () #t)
> (foo-bar)

#t

Recap: If you want to munge pattern variables for use in the template, with-syntax is your friend. Just remember you have to use syntax or #' on the pattern variables.
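
Here is the same recipe applied to a different, made-up macro, define-getter, written as a module rather than typed at the REPL. It is only a sketch. It keeps the explicit datum->syntax with stx so that the generated name gets the lexical context of the code that used the macro.

```racket
#lang racket

;; Hypothetical macro (an assumption for illustration):
;; (define-getter config port 8080) defines a zero-argument function
;; named get-config-port that returns 8080.
(define-syntax (define-getter stx)
  (syntax-case stx ()
    [(_ prefix field value)
     (with-syntax ([name (datum->syntax
                          stx
                          (string->symbol
                           (format "get-~a-~a"
                                   (syntax->datum #'prefix)
                                   (syntax->datum #'field))))])
       ;; name is now a pattern variable, so it can go in the template.
       #'(define (name) value))]))

(define-getter config port 8080)
(get-config-port) ; 8080
```

The steps are the ones from the recap: wrap the pattern variables in #', munge the resulting data, and hand the result to with-syntax so it can be used in the template.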

5 Syntax parameters

"Anaphoric if" or "aif" is a popular macro example. Instead of writing:

(let ([tmp (big-long-calculation)])
  (if tmp
      (foo tmp)
      #f))

It would be great to write:

(aif (big-long-calculation)
     (foo it)
     #f)

In other words, when the condition is true, an it identifier is automatically created and set to the value of the condition. This should be easy:

> (define-syntax-rule (aif condition true-expr false-expr)
    (let ([it condition])
      (if it
          true-expr
          false-expr)))
> (aif #t (displayln it) (void))

it: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Wait, what—it is undefined?

It turns out that all along we have been protected from making a certain kind of mistake in our macros. The mistake is to introduce a variable that accidentally conflicts with one in the code that is using our macro.

Section 1.2.3.5, Transformer Bindings, of the Racket Reference has a good explanation of this, and an example. (You can stop when you reach the part about set! transformers.) Basically, the input syntax has "marks" to preserve lexical scope. This makes your macro behave like a normal function. If a normal function defines a variable named x, it won’t conflict with a variable named x in an outer scope.
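
A small sketch of that protection at work, using a swap! macro that is my own example and not taken from the Reference. The macro uses a temporary named tmp, and so does the code calling it, yet the two never collide:

```racket
#lang racket

(define-syntax-rule (swap! a b)
  (let ([tmp a])        ; this tmp belongs to the macro
    (set! a b)
    (set! b tmp)))

(define tmp 1)          ; this tmp belongs to the code using the macro
(define other 2)
(swap! tmp other)

(list tmp other) ; '(2 1)
```

If the expansion were a naive textual search-and-replace, the macro’s tmp would shadow ours and the swap would silently do nothing.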

This makes it easy to write reliable macros that behave predictably. Unfortunately, once in a while, we want to introduce a magic variable like it for aif on purpose.

The way to do this is with define-syntax-parameter and syntax-parameterize. You’re probably familiar with regular parameters in Racket.

> (define current-foo (make-parameter "some default value"))
> (current-foo)

"some default value"

> (parameterize ([current-foo "I have a new value, for now"])
    (current-foo))

"I have a new value, for now"

> (current-foo)

"some default value"

Historically, there are other ways to do this. If you know them, you will want to unlearn them. But if you’re the target audience I’m writing for, you don’t know them yet. You can skip learning them now. (Someday if you want to understand someone else’s older macros, you can learn about them then.)

The syntax variation of parameters works similarly. The idea is, we’ll define it to mean an error by default. Only inside of our aif will it have a meaningful value:

> (require racket/stxparam)
> (define-syntax-parameter it
    (lambda (stx)
      (raise-syntax-error (syntax-e stx) "can only be used inside aif")))
> (define-syntax-rule (aif condition true-expr false-expr)
    (let ([tmp condition])
      (if tmp
          (syntax-parameterize ([it (make-rename-transformer #'tmp)])
            true-expr)
          false-expr)))
> (aif 10 (displayln it) (void))

10

> (aif #f (displayln it) (void))

If we try to use it outside of an aif form, and it isn’t otherwise defined, we get an error like we want:

> (displayln it)

it: can only be used inside aif

We can still use it as a normal variable name:

> (define it 10)
> it

10

Perfect.
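
The same syntax parameter approach carries over to other anaphoric forms. As a sketch, here is a made-up sibling of aif called awhen, written as a standalone module with its own it:

```racket
#lang racket
(require racket/stxparam)

;; By default, using `it` is a syntax error.
(define-syntax-parameter it
  (lambda (stx)
    (raise-syntax-error (syntax-e stx) "can only be used inside awhen")))

;; Hypothetical sibling of aif (an assumption for illustration): run the
;; body only when condition is true, with `it` bound to its value.
(define-syntax-rule (awhen condition body0 body ...)
  (let ([tmp condition])
    (when tmp
      (syntax-parameterize ([it (make-rename-transformer #'tmp)])
        body0 body ...))))

(awhen (assoc 'b '((a . 1) (b . 2)))
  (cdr it)) ; 2
```

Only the code inside the syntax-parameterize sees it as a rename for tmp; everywhere else it keeps its default error-raising meaning.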

6 Robust macros: syntax-parse

TO-DO.

7 Other questions

Hopefully I will answer these in the course of the other sections. But just in case:

7.1 What’s the point of with-syntax?

Done.

7.2 What’s the point of begin-for-syntax?

TO-DO.

7.3 What’s the point of racket/splicing?

TO-DO.

8 References/Acknowledgements

Eli Barzilay wrote a blog post, Writing ‘syntax-case’ Macros, which explains many key details. However it’s written especially for people already familiar with "un-hygienic" "defmacro" style macros. If you’re not familiar with those, it may seem slightly weird to the extent it’s trying to convince you to change an opinion you don’t have. Even so, many key details are presented in Eli’s typically concise, clear fashion.

Eli Barzilay wrote another blog post, Dirty Looking Hygiene, which explains syntax-parameterize. I relied heavily on that, mostly just updating it since his post was written before PLT Scheme was renamed to Racket.

9 Epilog

"Before I had studied Chan (Zen) for thirty years, I saw mountains as mountains, and rivers as rivers. When I arrived at a more intimate knowledge, I came to the point where I saw that mountains are not mountains, and rivers are not rivers. But now that I have got its very substance I am at rest. For it’s just that I see mountains once again as mountains, and rivers once again as rivers"

–Buddhist saying originally formulated by Qingyuan Weixin, later translated by D.T. Suzuki in his Essays in Zen Buddhism.

Translated into Racket:

(dynamic-wind (lambda ()
                (and (eq? 'mountains 'mountains)
                     (eq? 'rivers 'rivers)))
              (lambda ()
                (not (and (eq? 'mountains 'mountains)
                          (eq? 'rivers 'rivers))))
              (lambda ()
                (and (eq? 'mountains 'mountains)
                     (eq? 'rivers 'rivers))))