diff --git a/.gitignore b/.gitignore index f8a06ea..e86bf95 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,2 @@ .DS_Store -main.html +index/ diff --git a/Epilogue.html b/Epilogue.html new file mode 100644 index 0000000..acf7478 --- /dev/null +++ b/Epilogue.html @@ -0,0 +1,9 @@ + +9 Epilogue

9 Epilogue

"Before I had studied Chan (Zen) for thirty years, I saw mountains as +mountains, and rivers as rivers. When I arrived at a more intimate +knowledge, I came to the point where I saw that mountains are not +mountains, and rivers are not rivers. But now that I have got its very +substance I am at rest. For it’s just that I see mountains once again +as mountains, and rivers once again as rivers"

–Buddhist saying originally formulated by Qingyuan Weixin, +later translated by D.T. Suzuki in his Essays in Zen +Buddhism.

Translated into Racket:

(dynamic-wind (lambda ()
                (and (eq? 'mountains 'mountains)
                     (eq? 'rivers 'rivers)))
              (lambda ()
                (not (and (eq? 'mountains 'mountains)
                          (eq? 'rivers 'rivers))))
              (lambda ()
                (and (eq? 'mountains 'mountains)
                     (eq? 'rivers 'rivers))))
 
\ No newline at end of file diff --git a/Our_plan_of_attack.html b/Our_plan_of_attack.html new file mode 100644 index 0000000..74bbe82 --- /dev/null +++ b/Our_plan_of_attack.html @@ -0,0 +1,22 @@ + +2 Our plan of attack

2 Our plan of attack

The macro system you will mostly want to use for production-quality +macros is called syntax-parse. And don’t worry, we’ll get to +that soon.

But if we start there, you’re likely to feel overwhelmed by concepts +and terminology, and get very confused. I did.

1. Instead let’s start with the basics: A syntax object and a function +to change it, a "transformer". We’ll work at that level for a while to +get comfortable and to de-mythologize this whole macro business.

2. Soon we’ll realize that pattern-matching would make life +easier. We’ll learn about syntax-case and its shorthand +cousin, define-syntax-rule. We’ll discover we can get +confused if we want to munge pattern variables before sticking them +back in the template, and learn how to do that.

3. At this point we’ll be able to write many useful macros. But, what +if we want to write the ever-popular anaphoric if, with a "magic +variable"? It turns out we’ve been protected from making certain kinds +of mistakes. When we want to do this kind of thing on purpose, we use +a syntax parameter. [There are other, older ways to do this. We won’t +look at them. We also won’t spend a lot of time +advocating "hygiene"; we’ll just stipulate that it’s good.]

4. Finally, we’ll realize that our macros could be smarter when +they’re used in error. Normal Racket functions optionally can have +contracts and types. These catch usage mistakes and provide clear, +useful error messages. It would be great if there were something +similar for macros. There is. One of the more-recent Racket macro +enhancements is syntax-parse.

 
\ No newline at end of file diff --git a/Preface.html b/Preface.html new file mode 100644 index 0000000..07de303 --- /dev/null +++ b/Preface.html @@ -0,0 +1,24 @@ + +1 Preface

1 Preface

I learned Racket after 25 years of mostly using C and C++.

Some psychic whiplash resulted.

"All the parentheses" was actually not a big deal. Instead, the first +mind warp was functional programming. Before long I wrapped my brain +around it, and went on to become comfortable and effective with many +other aspects and features of Racket.

But two final frontiers remained: Macros and continuations.

I found that simple macros were easy and understandable, plus there +were many good tutorials available. But the moment I stepped past +routine pattern-matching, I kind of fell off a cliff into a +terminology soup. I marinaded myself in material, hoping it would +eventually sink in after enough re-readings. I even found myself using +trial and error, rather than having a clear mental model of what was +going on. Gah.

I’m starting to write this at the point where the shapes are slowly +emerging from the fog.

If you have any corrections, criticisms, complaints, or whatever, +please +let me know.

My primary motive is selfish. Explaining something forces me to learn +it more thoroughly. Plus if I write something with mistakes, other +people will be eager to point them out and correct me. Is that a +social-engineering variation of meta-programming? Next question, +please. :)

Finally I do hope it may help other people who have a similar +background and/or learning style as me.

I want to show how Racket macro features have evolved as solutions to +problems or annoyances. I learn more quickly and deeply when I +discover the answer to a question I already have, or find the solution +to a problem whose pain I already feel. Therefore I’ll give you the +questions and problems first, so that you can better appreciate and +understand the answers and solutions.

 
\ No newline at end of file diff --git a/References_and_Acknowledgments.html b/References_and_Acknowledgments.html new file mode 100644 index 0000000..5bab925 --- /dev/null +++ b/References_and_Acknowledgments.html @@ -0,0 +1,34 @@ + +8 References and Acknowledgments

8 References and Acknowledgments

Eli Barzilay’s blog post, +Writing +‘syntax-case’ Macros, helped me understand many key details and +concepts. It also inspired me to use a "bottom-up" approach. However +he wrote for a specific audience. If you’re not already familiar with +un-hygienic defmacro style macros, it may seem slightly weird to the +extent it’s trying to convince you to change an opinion you don’t +have. I’m writing for people who don’t have any opinion about macros +at all, except maybe that macros seem scary and daunting.

Eli wrote another blog post, +Dirty +Looking Hygiene, which explains syntax-parameterize. I relied +heavily on that, mostly just updating it since his post was written +before PLT Scheme was renamed to Racket.

Matthew Flatt’s +Composable +and Compilable Macros: You Want it When? explains how Racket handles +compile time vs. run time.

Chapter +8 of The Scheme Programming Language by Kent Dybvig +explains syntax-rules and syntax-case. Although +more "formal" in tone, you may find it helpful to read it. You never +know which explanation or examples of something will click for you.

After initially wondering if I was asking the wrong question and +conflating two different issues :), Shriram Krishnamurthi looked at an +early draft and encouraged me to keep going. Sam Tobin-Hochstadt and +Robby Findler also encouraged me. Matthew Flatt showed me how to make +a Scribble interaction print syntax as +"syntax" rather than as "#'". Jay McCarthy helped me +catch some mistakes and confusions. Jon Rafkind pointed out some +problems. Kieron Hardy reported a font issue and some typos.

Finally, I noticed something strange. After writing much of this, when +I returned to some parts of the Racket documentation, I noticed it had +improved since I last read it. Of course, it was the same. I’d +changed. It’s interesting how much of what we already know is +projected between the lines. My point is, the Racket documentation is +very good. The Guide provides helpful examples and +tutorials. The Reference is very clear and precise.

 
\ No newline at end of file diff --git a/Robust_macros__syntax-parse.html b/Robust_macros__syntax-parse.html new file mode 100644 index 0000000..89e6241 --- /dev/null +++ b/Robust_macros__syntax-parse.html @@ -0,0 +1,31 @@ + +7 Robust macros: syntax-parse
On this page:
7.1 Error-handling strategies for functions
7.2 Error-handling strategies for macros
7.3 Using syntax/parse

7 Robust macros: syntax-parse

Functions can be used in error. So can macros.

7.1 Error-handling strategies for functions

With plain old functions, we have several choices how to handle +misuse.

1. Don’t check at all.

> (define (misuse s)
    (string-append s " snazzy suffix"))
; User of the function:
> (misuse 0)

string-append: contract violation

  expected: string?

  given: 0

  argument position: 1st

  other arguments...:

   " snazzy suffix"

; I guess I goofed, but what is this "string-append" of which you
; speak??

The problem is that the resulting error message will be confusing. Our +user thinks they’re calling misuse, but is getting an error +message from string-append. In this simple example they +could probably guess what’s happening, but in most cases they won’t.

2. Write some error handling code.

> (define (misuse s)
    (unless (string? s)
      (error 'misuse "expected a string, but got ~a" s))
    (string-append s " snazzy suffix"))
; User of the function:
> (misuse 0)

misuse: expected a string, but got 0

; I goofed, and understand why! It's a shame the writer of the
; function had to work so hard to tell me.

Unfortunately the error code tends to overwhelm and/or obscure our +function definition. Also, the error message is good but not +great. Improving it would require even more error code.

3. Use a contract.

> (define/contract (misuse s)
    (string? . -> . string?)
    (string-append s " snazzy suffix"))
; User of the function:
> (misuse 0)

misuse: contract violation

  expected: string?, given: 0

  in: the 1st argument of

      (-> string? string?)

  contract from: (function misuse)

  blaming: program

  at: eval:130.0

; I goofed, and understand why! I'm happier, and I hear the writer of
; the function is happier, too.

This is the best of both worlds.

The contract is simple and concise. Even better, it’s +declarative. We say what we want, without needing to spell out what to +do.

On the other hand the user of our function gets a very detailed error +message. Plus, the message is in a standard, familiar format.

4. Use Typed Racket.

> (: misuse (String -> String))
> (define (misuse s)
    (string-append s " snazzy suffix"))
> (misuse 0)

eval:3:0: Type Checker: Expected String, but got Zero

  in: (quote 0)

With respect to error handling, Typed Racket has the same benefits as +contracts. Good.

7.2 Error-handling strategies for macros

For macros, we have similar choices.

1. Ignore the possibility of misuse. This choice is even worse for +macros. The default error messages are even less likely to make sense, +much less help our user know what to do.

2. Write error-handling code. We saw how much this complicated our +macros in our example of Using dot notation for nested hash lookups. And while we’re still +learning how to write macros, we especially don’t want more cognitive +load and obfuscation.

3. Use syntax/parse. For macros, this is the equivalent of +using contracts or types for functions. We can declare that input +pattern elements must be certain kinds of things, such as an +identifier. Instead of "types", the kinds are referred to as "syntax +classes". There are predefined syntax classes, plus we can define our +own.
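As a rough sketch of choice 3 (this example is not from the original text): the hypothetical macro define-doubled below uses two of the predefined syntax classes from syntax/parse, id and number, to declare what each piece of its input must be. The macro name and its behavior are assumptions made purely for illustration.

```racket
#lang racket/base
;; Hypothetical example; `define-doubled` is an illustrative name only.
(require (for-syntax racket/base syntax/parse))

(define-syntax (define-doubled stx)
  (syntax-parse stx
    ;; name:id  -> `name` must be an identifier
    ;; n:number -> `n` must be a literal number
    [(_ name:id n:number)
     #'(define name (* 2 n))]))

(define-doubled ten 5)
ten

;; A misuse such as (define-doubled "oops" 5) is rejected at compile
;; time, and the error is reported against define-doubled itself
;; rather than against something in its expansion.
```

The annotations play the same role that the contract played for the function: they say what is expected, and the checking code is generated for us.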

7.3 Using syntax/parse

November 1, 2012: So here’s the deal. After writing everything up to +this point, I sat down to re-read the documentation for +syntax/parse. It was...very understandable. I didn’t feel +confused.

<span style='accent: "Kenau-Reeves"'>
Whoa.
</span>

Why? The documentation is written very well. Also, everything up to +this point prepared me to appreciate what syntax/parse does, +and why. That leaves the "how" of using it, which seems pretty +straightforward, so far.

This might well be a temporary state of me "not knowing what I don’t +know". As I dig in and use it more, maybe I’ll discover something +confusing or tricky. If/when I do, I’ll come back here and update +this.

But for now I’ll focus on improving the previous parts.

 
\ No newline at end of file diff --git a/Syntax_parameters.html b/Syntax_parameters.html new file mode 100644 index 0000000..91ee083 --- /dev/null +++ b/Syntax_parameters.html @@ -0,0 +1,24 @@ + +5 Syntax parameters

5 Syntax parameters

"Anaphoric if" or "aif" is a popular macro example. Instead of writing:

(let ([tmp (big-long-calculation)])
  (if tmp
      (foo tmp)
      #f))

You could write:

(aif (big-long-calculation)
     (foo it)
     #f)

In other words, when the condition is true, an it identifier +is automatically created and set to the value of the condition. This +should be easy:

> (define-syntax-rule (aif condition true-expr false-expr)
    (let ([it condition])
      (if it
          true-expr
          false-expr)))
> (aif #t (displayln it) (void))

it: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Wait, what? it is undefined?

It turns out that all along we have been protected from making a +certain kind of mistake in our macros. The mistake is if our new +syntax introduces a variable that accidentally conflicts with one in +the code surrounding our macro.

The Racket Reference section, +Transformer +Bindings, has a good explanation and example. Basically, syntax +has "marks" to preserve lexical scope. This makes your macro behave +like a normal function, for lexical scoping.

If a normal function defines a variable named x, it won’t +conflict with a variable named x in an outer scope:

> (let ([x "outer"])
    (let ([x "inner"])
      (printf "The inner `x' is ~s\n" x))
    (printf "The outer `x' is ~s\n" x))

The inner `x' is "inner"

The outer `x' is "outer"

When our macros also respect lexical scoping, it’s easier to write +reliable macros that behave predictably.
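To make that protection concrete, here is a small sketch using a hypothetical swap! macro (not part of the original text). The macro introduces its own tmp, and the code using the macro also happens to have a variable named tmp. Because the two are kept apart, the swap still behaves as expected.

```racket
#lang racket/base
;; Hypothetical example; `swap!` is an illustrative name only.
(define-syntax-rule (swap! a b)
  (let ([tmp a])   ; this `tmp` belongs to the macro
    (set! a b)
    (set! b tmp)))

(define tmp 1)     ; this `tmp` belongs to the user of the macro
(define other 2)

(swap! tmp other)

(list tmp other)   ; the values have been exchanged
```

With a naive textual substitution, the macro's tmp would have collided with the user's tmp and the swap would silently have done the wrong thing.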

So that’s wonderful default behavior. But sometimes we want +to introduce a magic variable on purpose—such as it for +aif.

The way to do this is with a "syntax parameter", using +define-syntax-parameter and +syntax-parameterize. You’re probably familiar with regular +parameters in Racket:

> (define current-foo (make-parameter "some default value"))
> (current-foo)

"some default value"

> (parameterize ([current-foo "I have a new value, for now"])
    (current-foo))

"I have a new value, for now"

> (current-foo)

"some default value"

That’s a normal parameter. The syntax variation works similarly. The +idea is that we’ll define it to mean an error by +default. Only inside of our aif will it have a meaningful +value:

> (require racket/stxparam)
> (define-syntax-parameter it
    (lambda (stx)
      (raise-syntax-error (syntax-e stx) "can only be used inside aif")))
> (define-syntax-rule (aif condition true-expr false-expr)
    (let ([tmp condition])
      (if tmp
          (syntax-parameterize ([it (make-rename-transformer #'tmp)])
            true-expr)
          false-expr)))
> (aif 10 (displayln it) (void))

10

> (aif #f (displayln it) (void))

Inside the syntax-parameterize, it acts as an alias +for tmp. The alias behavior is created by +make-rename-transformer.
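One consequence worth seeing, sketched here under the same definitions as above (repeated so that the block stands alone as a module): when aif forms are nested, each syntax-parameterize re-aliases it, so it refers to the condition of the innermost enclosing aif.

```racket
#lang racket/base
(require racket/stxparam
         (for-syntax racket/base))

;; Same definitions as in the text above.
(define-syntax-parameter it
  (lambda (stx)
    (raise-syntax-error (syntax-e stx) "can only be used inside aif")))

(define-syntax-rule (aif condition true-expr false-expr)
  (let ([tmp condition])
    (if tmp
        (syntax-parameterize ([it (make-rename-transformer #'tmp)])
          true-expr)
        false-expr)))

;; In the inner form, `it` is an alias for the inner condition's value.
(aif 1
     (aif 2 it #f)
     #f)
```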

If we try to use it outside of an aif form, and +it isn’t otherwise defined, we get an error like we want:

> (displayln it)

it: can only be used inside aif

But we can still define it as a normal variable:

> (define it 10)
> it

10

For a deeper look, see Keeping it Clean with Syntax Parameters.

 
\ No newline at end of file diff --git a/Transform_.html b/Transform_.html new file mode 100644 index 0000000..5399275 --- /dev/null +++ b/Transform_.html @@ -0,0 +1,102 @@ + +3 Transform!
On this page:
3.1 What is a syntax transformer?
3.2 What’s the input?
3.3 Actually transforming the input
3.4 Compile time vs. run time
3.5 begin-for-syntax

3 Transform!

  YOU ARE INSIDE A ROOM.

  THERE ARE KEYS ON THE GROUND.

  THERE IS A SHINY BRASS LAMP NEARBY.

  

  IF YOU GO THE WRONG WAY, YOU WILL BECOME

  HOPELESSLY LOST AND CONFUSED.

  

  > pick up the keys

  

  YOU HAVE A SYNTAX TRANSFORMER

3.1 What is a syntax transformer?

A syntax transformer is not one of the トランスフォーマ +transformers.

Instead, it is simply a function. The function takes syntax and +returns syntax. It transforms syntax.

Here’s a transformer function that ignores its input syntax, and +always outputs syntax for a string literal:

> (define-syntax foo
    (lambda (stx)
      (syntax "I am foo")))

Using it:

> (foo)

"I am foo"

When we use define-syntax, we’re making a transformer +binding. This tells the Racket compiler, "Whenever you +encounter a chunk of syntax starting with foo, please give it +to my transformer function, and replace it with the syntax I give back +to you." So Racket will give anything that looks like (foo ...) to our function, and we can return new syntax to use +instead. Much like a search-and-replace.

Maybe you know that the usual way to define a function in Racket:

(define (f x) ...)

is shorthand for:

(define f (lambda (x) ...))

That shorthand lets you avoid typing lambda and some parentheses.

Well there is a similar shorthand for define-syntax:

> (define-syntax (also-foo stx)
    (syntax "I am also foo"))
> (also-foo)

"I am also foo"

What we want to remember is that this is simply shorthand. We are +still defining a transformer function, which takes syntax and returns +syntax. Everything we do with macros will be built on top of this +basic idea. It’s not magic.

Speaking of shorthand, there is also a shorthand for syntax, +which is #':

#' is short for syntax much like +' is short for quote.

> (define-syntax (quoted-foo stx)
    #'"I am also foo, using #' instead of syntax")
> (quoted-foo)

"I am also foo, using #' instead of syntax"

We’ll use the #' shorthand from now on.

Of course, we can emit syntax that is more interesting than a +string literal. How about returning (displayln "hi")?

> (define-syntax (say-hi stx)
    #'(displayln "hi"))
> (say-hi)

hi

When Racket expands our program, it sees the occurrence of +(say-hi), and sees it has a transformer function for that. It +calls our function with the old syntax, and we return the new syntax, +which is used to evaluate and run our program.

3.2 What’s the input?

Our examples so far have ignored the input syntax and output some +fixed syntax. But typically we will want to transform the input +syntax into something else.

Let’s start by looking closely at what the input actually is:

> (define-syntax (show-me stx)
    (print stx)
    #'(void))
> (show-me '(+ 1 2))

#<syntax:10:0 (show-me (quote (+ 1 2)))>

The (print stx) shows what our transformer is given: a syntax +object.

A syntax object consists of several things. The first part is the +S-expression representing the code, such as '(+ 1 2).

Racket syntax is also decorated with some interesting information such +as the source file, line number, and column. Finally, it has +information about lexical scoping (which you don’t need to worry about +now, but will turn out to be important later).

There are a variety of functions available to access a syntax object. +Let’s define a piece of syntax:

> (define stx #'(if x (list "true") #f))
> stx

#<syntax:11:0 (if x (list "true") #f)>

Now let’s use functions that access the syntax object. The source +information functions are:

(syntax-source stx) is returning 'eval, +only because of how I’m generating this documentation, using an +evaluator to run code snippets in Scribble. Normally this would be +something like "my-file.rkt".

> (syntax-source stx)

'eval

> (syntax-line stx)

11

> (syntax-column stx)

0

More interesting is the syntax "stuff" itself. syntax->datum +converts it completely into an S-expression:

> (syntax->datum stx)

'(if x (list "true") #f)

Whereas syntax-e only goes "one level down". It may return a +list that has syntax objects:

> (syntax-e stx)

'(#<syntax:11:0 if> #<syntax:11:0 x> #<syntax:11:0 (list "true")> #<syntax:11:0 #f>)

Each of those syntax objects could be converted by syntax-e, +and so on recursively—which is what syntax->datum does.
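To illustrate that relationship, here is a simplified sketch of such a recursive conversion. The function name strip is made up for this example, and it only handles atoms and pairs; the real syntax->datum also descends into vectors, boxes, and other compound data.

```racket
#lang racket/base
;; Hypothetical helper: a simplified imitation of syntax->datum,
;; built by applying syntax-e repeatedly.
(define (strip stx)
  ;; Peel off one layer of syntax wrapper, if there is one.
  (define v (if (syntax? stx) (syntax-e stx) stx))
  (cond
    ;; A pair: convert both halves. The cdr may be a plain list
    ;; or (for some improper lists) another syntax object.
    [(pair? v) (cons (strip (car v)) (strip (cdr v)))]
    ;; Anything else (symbol, string, number, boolean, '()) is
    ;; already a plain datum.
    [else v]))

(strip #'(if x (list "true") #f))
```

Comparing the result against (syntax->datum #'(if x (list "true") #f)) shows the two agree for this input.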

In most cases, syntax->list gives the same result as +syntax-e:

> (syntax->list stx)

'(#<syntax:11:0 if> #<syntax:11:0 x> #<syntax:11:0 (list "true")> #<syntax:11:0 #f>)

(When would syntax-e and syntax->list differ? Let’s +not get side-tracked now.)

When we want to transform syntax, we’ll generally take the pieces we +were given, maybe rearrange their order, perhaps change some of the +pieces, and often introduce brand-new pieces.

3.3 Actually transforming the input

Let’s write a transformer function that reverses the syntax it was +given:

> (define-syntax (reverse-me stx)
    (datum->syntax stx (reverse (cdr (syntax->datum stx)))))
> (reverse-me "backwards" "am" "i" values)

"i"

"am"

"backwards"

Understand Yoda, can we. Great, but how does this work?

First we take the input syntax, and give it to +syntax->datum. This converts the syntax into a plain old +list:

> (syntax->datum #'(reverse-me "backwards" "am" "i" values))

'(reverse-me "backwards" "am" "i" values)

Using cdr slices off the first item of the list, +reverse-me, leaving the remainder: +("backwards" "am" "i" values). Passing that to +reverse changes it to (values "i" "am" "backwards"):

> (reverse (cdr '(reverse-me "backwards" "am" "i" values)))

'(values "i" "am" "backwards")

Finally we use datum->syntax to convert this back to +syntax:

> (datum->syntax #f '(values "i" "am" "backwards"))

#<syntax (values "i" "am" "backwards")>

That’s what our transformer function gives back to the Racket +compiler, and that syntax is evaluated:

> (values "i" "am" "backwards")

"i"

"am"

"backwards"

3.4 Compile time vs. run time

(define-syntax (foo stx)
  (make-pipe) ;Ce n'est pas le temps d'exécution
  #'(void))

Normal Racket code runs at ... run time. Duh.

But a syntax transformer is called by Racket as part of the process of +parsing, expanding, and compiling our program. In other words, our +syntax transformer function is evaluated at compile time.

Instead of "compile time vs. run time", you may hear it +described as "syntax phase vs. runtime phase". Same difference.

This aspect of macros lets you do things that simply aren’t possible +in normal code. One of the classic examples is something like the +Racket form, if:

(if <condition> <true-expression> <false-expression>)

If we implemented if as a function, all of the arguments +would be evaluated before being provided to the function.

> (define (our-if condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if #t
          "true"
          "false")

"true"

That seems to work. However, how about this:

> (define (display-and-return x)
    (displayln x)
    x)
> (our-if #t
          (display-and-return "true")
          (display-and-return "false"))

true

false

"true"

Oops. Because the expressions have a side-effect, it’s obvious that +they are both evaluated. And that could be a problem: what if the +side-effect includes deleting a file on disk? You wouldn’t want +(if user-wants-file-deleted? (delete-file) (void)) to delete +a file even when user-wants-file-deleted? is #f.

One answer is that functional programming is good, and +side-effects are bad. But avoiding side-effects isn’t always +practical.

So this simply can’t work as a plain function. However a syntax +transformer can rearrange the syntax – rewrite the code – at compile +time. The pieces of syntax are moved around, but they aren’t actually +evaluated until run time.

Here is one way to do this:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))
> (our-if-v2 #t
             (display-and-return "true")
             (display-and-return "false"))

true

"true"

> (our-if-v2 #f
             (display-and-return "true")
             (display-and-return "false"))

false

"false"

That gave the right answer. But how? Let’s pull out the transformer +function itself, and see what it did. We start with an example of some +input syntax:

> (define stx #'(our-if-v2 #t "true" "false"))
> (displayln stx)

#<syntax:32:0 (our-if-v2 #t "true" "false")>

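1. We take the original syntax, and use syntax->datum to +change it into a plain Racket list. (The macro itself uses syntax->list, which leaves each element as a syntax object; syntax->datum is used here only because its output is easier to read.)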

> (define xs (syntax->datum stx))
> (displayln xs)

(our-if-v2 #t true false)

2. To change this into a Racket cond form, we need to take +the three interesting pieces—the condition, true-expression, and +false-expression—from the list using cadr, caddr, +and cadddr and arrange them into a cond form:

`(cond [,(cadr xs) ,(caddr xs)]
       [else ,(cadddr xs)])

3. Finally, we change that into syntax using +datum->syntax:

> (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                            [else ,(cadddr xs)]))

#<syntax (cond (#t "true") (else "fals...>

So that works, but using cadddr etc. to destructure a list is +painful and error-prone. Maybe you know Racket’s match? +Using that would let us do pattern-matching.

Notice that we don’t care about the first item in the +syntax list. We didn’t take (car xs) in our-if-v2, and we +didn’t use name when we used pattern-matching. In general, a +syntax transformer won’t care about that, because it is the name of +the transformer binding. In other words, a macro usually doesn’t care +about its own name.

Instead of:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))

We can write:

> (define-syntax (our-if-using-match stx)
    (match (syntax->list stx)
      [(list name condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))

Great. Now let’s try using it:

> (our-if-using-match #t "true" "false")

match: undefined;

 cannot reference an identifier before its definition

  in module: 'program

  phase: 1

Oops. It’s complaining that match isn’t defined.

Our transformer function is working at compile time, not run time. And +at compile time, only racket/base is required for you +automatically—not the full racket.

Anything beyond racket/base, we have to require +ourselves—and require it for compile time using the +for-syntax form of require.

In this case, instead of using plain (require racket/match), +we want (require (for-syntax racket/match)), the +for-syntax part meaning "for compile time".

So let’s try that:

> (require (for-syntax racket/match))
> (define-syntax (our-if-using-match-v2 stx)
    (match (syntax->list stx)
      [(list _ condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))
> (our-if-using-match-v2 #t "true" "false")

"true"

Joy.

3.5 begin-for-syntax

We used for-syntax to require the +racket/match module because we needed to use match +at compile time.

What if we wanted to define our own helper function to be used by a +macro? One way to do that is put it in another module, and +require it using for-syntax, just like we did with +the racket/match module.

If instead we want to put the helper in the same module, we can’t +simply define it and use it—the definition would exist at +run time, but we need it at compile time. The answer is to put the +definition of the helper function(s) inside begin-for-syntax:

(begin-for-syntax
 (define (my-helper-function ....)
   ....))
(define-syntax (macro-using-my-helper-function stx)
  (my-helper-function ....)
  ....)

In the simple case, we can also use define-for-syntax, which +composes begin-for-syntax and define:

(define-for-syntax (my-helper-function ....)
  ....)
(define-syntax (macro-using-my-helper-function stx)
  (my-helper-function ....)
  ....)
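Filling in the skeleton above with something concrete, as a sketch only: the helper reverse-args and the macro reverse-call are hypothetical names chosen for this example. The helper is defined with define-for-syntax, so it exists at compile time, where the transformer can call it. The module uses #lang racket/base, so racket/base is required for-syntax explicitly.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical compile-time helper: drop the macro's own name and
;; reverse the remaining pieces of syntax.
(define-for-syntax (reverse-args stx)
  (reverse (cdr (syntax->list stx))))

;; Hypothetical macro that uses the helper at compile time.
(define-syntax (reverse-call stx)
  (datum->syntax stx (reverse-args stx)))

;; Expands into (list "i" "am" "backwards").
(reverse-call "backwards" "am" "i" list)
```

If reverse-args were defined with a plain define instead, the transformer would fail to find it, for the same reason match was not found earlier.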

To review:

 
\ No newline at end of file diff --git a/What_s_the_point_of_splicing-let_.html b/What_s_the_point_of_splicing-let_.html new file mode 100644 index 0000000..6a6ae69 --- /dev/null +++ b/What_s_the_point_of_splicing-let_.html @@ -0,0 +1,11 @@ + +6 What's the point of splicing-let?

6 What’s the point of splicing-let?

I stared at racket/splicing for the longest time. What does +it do? Why would I use it? Why is it in the Macros section of the +reference?

Step one, cut a hole in the box +de-mythologize it. For example, using splicing-let like this:

> (require racket/splicing)
> (splicing-let ([x 0])
    (define (get-x)
      x))
; get-x is visible out here:
> (get-x)

0

; but x is not:
> x

x: undefined;

 cannot reference an identifier before its definition

  in module: 'program

is equivalent to:

> (define get-y
    (let ([y 0])
      (lambda ()
        y)))
; get-y is visible out here:
> (get-y)

0

; but y is not:
> y

y: undefined;

 cannot reference an identifier before its definition

  in module: 'program

This is the classic Lisp/Scheme/Racket idiom sometimes called "let +over lambda". A +koan +about closures and objects. A closure hides y, which can +only be accessed via get-y.

So why would we care about the splicing forms? They can be more +concise, especially when there are multiple body forms:

> (require racket/splicing)
> (splicing-let ([x 0])
    (define (inc)
      (set! x (+ x 1)))
    (define (dec)
      (set! x (- x 1)))
    (define (get)
      x))

The splicing variation is more convenient than the usual way:

> (define-values (inc dec get)
    (let ([x 0])
      (values (lambda ()  ; inc
                (set! x (+ 1 x)))
              (lambda ()  ; dec
                (set! x (- x 1)))
              (lambda ()  ; get
                x))))

When there are many body forms—and we’re generating them in a +macro—the splicing variations can be much easier.
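A small sketch of that situation, assuming a hypothetical define-counter macro (not from the original text): the macro generates three definitions that share one hidden variable. With splicing-let the template stays a flat list of define forms, and the user chooses the names.

```racket
#lang racket/base
(require racket/splicing)

;; Hypothetical macro: defines three functions that close over a
;; private counter `x`. The caller supplies the function names.
(define-syntax-rule (define-counter inc dec get)
  (splicing-let ([x 0])
    (define (inc) (set! x (+ x 1)))
    (define (dec) (set! x (- x 1)))
    (define (get) x)))

(define-counter up! down! current)

(up!)
(up!)
(down!)
(current)
```

Writing the same macro with define-values and let would mean generating a values expression whose lambdas line up positionally with the names, which gets harder to keep straight as the number of body forms grows.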

 
\ No newline at end of file diff --git a/add-to-head.rkt b/add-to-head.rkt index 2ec9ec5..c21096f 100644 --- a/add-to-head.rkt +++ b/add-to-head.rkt @@ -41,9 +41,15 @@ EOF (define all (string-append metas web-font ga-code )) (define subst (regexp-replace* "\n" all "")) ;minify -(define old (file->string "main.html")) -(define new (regexp-replace "" old subst)) -(with-output-to-file (build-path 'same "index.html") - (lambda () (display new)) - #:mode 'text - #:exists 'replace) +(define (do-file path) + (define old (file->string path)) + (define new (regexp-replace old subst)) + (with-output-to-file path + (lambda () (display new)) + #:mode 'text + #:exists 'replace)) + +(for ([path (find-files (lambda (path) + (regexp-match? #rx"\\.html" path)) + (build-path 'same "index"))]) + (do-file path)) diff --git a/index.html b/index.html index 006ba21..93ad055 100644 --- a/index.html +++ b/index.html @@ -1,363 +1,3 @@ -Fear of Macros
1 Preface
2 Our plan of attack
3 Transform!
3.1 What is a syntax transformer?
3.2 What’s the input?
3.3 Actually transforming the input
3.4 Compile time vs. run time
3.5 begin-for-syntax
4 Pattern matching: syntax-case and syntax-rules
4.1 Pattern variable vs. template—fight!
4.1.1 with-syntax
4.1.2 with-syntax*
4.1.3 format-id
4.1.4 Another example
4.2 Making our own struct
4.3 Using dot notation for nested hash lookups
5 Syntax parameters
6 What’s the point of splicing-let?
7 Robust macros: syntax-parse
7.1 Error-handling strategies for functions
7.2 Error-handling strategies for macros
7.3 Using syntax/parse
8 References and Acknowledgments
9 Epilogue

Fear of Macros

-
Copyright (c) 2012 by Greg Hendershott. All rights reserved.
Last updated 2012-11-13 12:56:57
Feedback and corrections are welcome here.

Contents:

    1 Preface

    2 Our plan of attack

    3 Transform!

      3.1 What is a syntax transformer?

      3.2 What’s the input?

      3.3 Actually transforming the input

      3.4 Compile time vs. run time

      3.5 begin-for-syntax

    4 Pattern matching: syntax-case and syntax-rules

      4.1 Pattern variable vs. template—fight!

        4.1.1 with-syntax

        4.1.2 with-syntax*

        4.1.3 format-id

        4.1.4 Another example

      4.2 Making our own struct

      4.3 Using dot notation for nested hash lookups

    5 Syntax parameters

    6 What’s the point of splicing-let?

    7 Robust macros: syntax-parse

      7.1 Error-handling strategies for functions

      7.2 Error-handling strategies for macros

      7.3 Using syntax/parse

    8 References and Acknowledgments

    9 Epilogue

1 Preface

I learned Racket after 25 years of mostly using C and C++.

Some psychic whiplash resulted.

"All the parentheses" was actually not a big deal. Instead, the first -mind warp was functional programming. Before long I wrapped my brain -around it, and went on to become comfortable and effective with many -other aspects and features of Racket.

But two final frontiers remained: Macros and continuations.

I found that simple macros were easy and understandable, plus there -were many good tutorials available. But the moment I stepped past -routine pattern-matching, I kind of fell off a cliff into a -terminology soup. I marinated myself in material, hoping it would -eventually sink in after enough re-readings. I even found myself using -trial and error, rather than having a clear mental model of what was -going on. Gah.

I’m starting to write this at the point where the shapes are slowly -emerging from the fog.

If you have any corrections, criticisms, complaints, or whatever, -please -let me know.

My primary motive is selfish. Explaining something forces me to learn -it more thoroughly. Plus if I write something with mistakes, other -people will be eager to point them out and correct me. Is that a -social-engineering variation of meta-programming? Next question, -please. :)

Finally, I do hope it may help other people whose -background and/or learning style is similar to mine.

I want to show how Racket macro features have evolved as solutions to -problems or annoyances. I learn more quickly and deeply when I -discover the answer to a question I already have, or find the solution -to a problem whose pain I already feel. Therefore I’ll give you the -questions and problems first, so that you can better appreciate and -understand the answers and solutions.

2 Our plan of attack

The macro system you will mostly want to use for production-quality -macros is called syntax-parse. And don’t worry, we’ll get to -that soon.

But if we start there, you’re likely to feel overwhelmed by concepts -and terminology, and get very confused. I did.

1. Instead let’s start with the basics: A syntax object and a function -to change it, a "transformer". We’ll work at that level for a while to -get comfortable and to de-mythologize this whole macro business.

2. Soon we’ll realize that pattern-matching would make life -easier. We’ll learn about syntax-case and its shorthand -cousin, define-syntax-rule. We’ll discover we can get -confused if we want to munge pattern variables before sticking them -back in the template, and learn how to do that.

3. At this point we’ll be able to write many useful macros. But, what -if we want to write the ever-popular anaphoric if, with a "magic -variable"? It turns out we’ve been protected from making certain kinds -of mistakes. When we want to do this kind of thing on purpose, we use -a syntax parameter. [There are other, older ways to do this. We won’t -look at them. We also won’t spend a lot of time -advocating "hygiene"; we’ll just stipulate that it’s good.]

4. Finally, we’ll realize that our macros could be smarter when -they’re used in error. Normal Racket functions optionally can have -contracts and types. These catch usage mistakes and provide clear, -useful error messages. It would be great if there were something -similar for macros. There is. One of the more-recent Racket macro -enhancements is syntax-parse.

3 Transform!

  YOU ARE INSIDE A ROOM.

  THERE ARE KEYS ON THE GROUND.

  THERE IS A SHINY BRASS LAMP NEARBY.

  

  IF YOU GO THE WRONG WAY, YOU WILL BECOME

  HOPELESSLY LOST AND CONFUSED.

  

  > pick up the keys

  

  YOU HAVE A SYNTAX TRANSFORMER

3.1 What is a syntax transformer?

A syntax transformer is not one of the トランスフォーマ -transformers.

Instead, it is simply a function. The function takes syntax and -returns syntax. It transforms syntax.

Here’s a transformer function that ignores its input syntax, and -always outputs syntax for a string literal:

> (define-syntax foo
    (lambda (stx)
      (syntax "I am foo")))

Using it:

> (foo)

"I am foo"

When we use define-syntax, we’re making a transformer -binding. This tells the Racket compiler, "Whenever you -encounter a chunk of syntax starting with foo, please give it -to my transformer function, and replace it with the syntax I give back -to you." So Racket will give anything that looks like (foo ...) to our function, and we can return new syntax to use -instead. Much like a search-and-replace.

Maybe you know that the usual way to define a function in Racket:

(define (f x) ...)

is shorthand for:

(define f (lambda (x) ...))

That shorthand lets you avoid typing lambda and some parentheses.

Well there is a similar shorthand for define-syntax:

> (define-syntax (also-foo stx)
    (syntax "I am also foo"))
> (also-foo)

"I am also foo"

What we want to remember is that this is simply shorthand. We are -still defining a transformer function, which takes syntax and returns -syntax. Everything we do with macros will be built on top of this -basic idea. It’s not magic.

Speaking of shorthand, there is also a shorthand for syntax, -which is #':

#' is short for syntax much like -' is short for quote.

> (define-syntax (quoted-foo stx)
    #'"I am also foo, using #' instead of syntax")
> (quoted-foo)

"I am also foo, using #' instead of syntax"

We’ll use the #’ shorthand from now on.

Of course, we can emit syntax that is more interesting than a -string literal. How about returning (displayln "hi")?

> (define-syntax (say-hi stx)
    #'(displayln "hi"))
> (say-hi)

hi

When Racket expands our program, it sees the occurrence of -(say-hi), and sees it has a transformer function for that. It -calls our function with the old syntax, and we return the new syntax, -which is used to evaluate and run our program.

3.2 What’s the input?

Our examples so far have ignored the input syntax and output some -fixed syntax. But typically we will want to transform the input -syntax into something else.

Let’s start by looking closely at what the input actually is:

> (define-syntax (show-me stx)
    (print stx)
    #'(void))
> (show-me '(+ 1 2))

#<syntax:10:0 (show-me (quote (+ 1 2)))>

The (print stx) shows what our transformer is given: a syntax -object.

A syntax object consists of several things. The first part is the -S-expression representing the code, such as '(+ 1 2).

Racket syntax is also decorated with some interesting information such -as the source file, line number, and column. Finally, it has -information about lexical scoping (which you don’t need to worry about -now, but will turn out to be important later.)

There are a variety of functions available to access a syntax object. -Let’s define a piece of syntax:

> (define stx #'(if x (list "true") #f))
> stx

#<syntax:11:0 (if x (list "true") #f)>

Now let’s use functions that access the syntax object. The source -information functions are:

(syntax-source stx) is returning 'eval, -only because of how I’m generating this documentation, using an -evaluator to run code snippets in Scribble. Normally this would be -something like "my-file.rkt".

> (syntax-source stx)

'eval

> (syntax-line stx)

11

> (syntax-column stx)

0

More interesting is the syntax "stuff" itself. syntax->datum -converts it completely into an S-expression:

> (syntax->datum stx)

'(if x (list "true") #f)

Whereas syntax-e only goes "one level down". It may return a -list that has syntax objects:

> (syntax-e stx)

'(#<syntax:11:0 if> #<syntax:11:0 x> #<syntax:11:0 (list "true")> #<syntax:11:0 #f>)

Each of those syntax objects could be converted by syntax-e, -and so on recursively—which is what syntax->datum does.

In most cases, syntax->list gives the same result as -syntax-e:

> (syntax->list stx)

'(#<syntax:11:0 if> #<syntax:11:0 x> #<syntax:11:0 (list "true")> #<syntax:11:0 #f>)

(When would syntax-e and syntax->list differ? Let’s -not get side-tracked now.)

When we want to transform syntax, we’ll generally take the pieces we -were given, maybe rearrange their order, perhaps change some of the -pieces, and often introduce brand-new pieces.
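
To make the accessors a bit more concrete, here is a small sketch that takes the same piece of syntax apart at run time. The names pieces, head, rest-data, not-a-list and unwrapped are made up for illustration. The last two definitions show one case where syntax->list and syntax-e give different answers: on syntax that isn't a list at all.

```racket
#lang racket

;; A piece of syntax to inspect. `x` need not be bound, because it is only quoted.
(define stx #'(if x (list "true") #f))

;; syntax->list keeps each element as a syntax object.
(define pieces (syntax->list stx))

;; Convert individual pieces to plain data.
(define head (syntax->datum (car pieces)))           ; the symbol if
(define rest-data (map syntax->datum (cdr pieces)))  ; the remaining pieces as data

;; For syntax that is not a list, syntax->list returns #f,
;; while syntax-e still unwraps one level.
(define not-a-list (syntax->list #'x))
(define unwrapped (syntax-e #'x))                    ; the symbol x
```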

3.3 Actually transforming the input

Let’s write a transformer function that reverses the syntax it was -given:

> (define-syntax (reverse-me stx)
    (datum->syntax stx (reverse (cdr (syntax->datum stx)))))
> (reverse-me "backwards" "am" "i" values)

"i"

"am"

"backwards"

Understand Yoda, can we. Great, but how does this work?

First we take the input syntax, and give it to -syntax->datum. This converts the syntax into a plain old -list:

> (syntax->datum #'(reverse-me "backwards" "am" "i" values))

'(reverse-me "backwards" "am" "i" values)

Using cdr slices off the first item of the list, -reverse-me, leaving the remainder: -("backwards" "am" "i" values). Passing that to -reverse changes it to (values "i" "am" "backwards"):

> (reverse (cdr '(reverse-me "backwards" "am" "i" values)))

'(values "i" "am" "backwards")

Finally we use datum->syntax to convert this back to -syntax:

> (datum->syntax #f '(values "i" "am" "backwards"))

#<syntax (values "i" "am" "backwards")>

That’s what our transformer function gives back to the Racket -compiler, and that syntax is evaluated:

> (values "i" "am" "backwards")

"i"

"am"

"backwards"

3.4 Compile time vs. run time

(define-syntax (foo stx)
  (make-pipe) ;Ce n'est pas le temps d'exécution
  #'(void))

Normal Racket code runs at ... run time. Duh.

Instead of "compile time vs. run time", you may hear it -described as "syntax phase vs. runtime phase". Same difference.

But a syntax transformer is called by Racket as part of the process of -parsing, expanding, and compiling our program. In other words, our -syntax transformer function is evaluated at compile time.
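
One way to see the two phases is to print from both. This is only a sketch, and noisy and f are hypothetical names. The transformer body runs while the module is being expanded, so its message should appear once per use of the macro. The syntax it returns runs later, every time f is called.

```racket
#lang racket

(define-syntax (noisy stx)
  ;; This line runs inside the transformer, while the program is expanded.
  (displayln "expanding a use of noisy")
  ;; This is the syntax handed back. It is evaluated later, at run time.
  #'(displayln "running the expansion of noisy"))

;; One use of the macro, so the transformer is called once.
(define (f) (noisy))

;; Two calls at run time, so the expansion runs twice.
(f)
(f)
```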

This aspect of macros lets you do things that simply aren’t possible -in normal code. One of the classic examples is something like the -Racket form, if:

(if <condition> <true-expression> <false-expression>)

If we implemented if as a function, all of the arguments -would be evaluated before being provided to the function.

> (define (our-if condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if #t
          "true"
          "false")

"true"

That seems to work. However, how about this:

> (define (display-and-return x)
    (displayln x)
    x)
> (our-if #t
          (display-and-return "true")
          (display-and-return "false"))

true

false

"true"

Oops. Because the expressions have a side-effect, it’s obvious that -they are both evaluated. And that could be a problem: what if the -side-effect includes deleting a file on disk? You wouldn’t want -(if user-wants-file-deleted? (delete-file) (void)) to delete -a file even when user-wants-file-deleted? is #f.

One answer is that functional programming is good, and -side-effects are bad. But avoiding side-effects isn’t always -practical.

So this simply can’t work as a plain function. However a syntax -transformer can rearrange the syntax – rewrite the code – at compile -time. The pieces of syntax are moved around, but they aren’t actually -evaluated until run time.

Here is one way to do this:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))
> (our-if-v2 #t
             (display-and-return "true")
             (display-and-return "false"))

true

"true"

> (our-if-v2 #f
             (display-and-return "true")
             (display-and-return "false"))

false

"false"

That gave the right answer. But how? Let’s pull out the transformer -function itself, and see what it did. We start with an example of some -input syntax:

> (define stx #'(our-if-v2 #t "true" "false"))
> (displayln stx)

#<syntax:32:0 (our-if-v2 #t "true" "false")>

1. We take the original syntax, and use syntax->datum to -change it into a plain Racket list:

> (define xs (syntax->datum stx))
> (displayln xs)

(our-if-v2 #t true false)

2. To change this into a Racket cond form, we need to take -the three interesting pieces—the condition, true-expression, and -false-expression—from the list using cadr, caddr, -and cadddr and arrange them into a cond form:

`(cond [,(cadr xs) ,(caddr xs)]
       [else ,(cadddr xs)])

3. Finally, we change that into syntax using -datum->syntax:

> (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                            [else ,(cadddr xs)]))

#<syntax (cond (#t "true") (else "fals...>

So that works, but using cadddr etc. to destructure a list is -painful and error-prone. Maybe you know Racket’s match? -Using that would let us do pattern-matching.

Notice that we don’t care about the first item in the -syntax list. We didn’t take (car xs) in our-if-v2, and we -didn’t use name when we used pattern-matching. In general, a -syntax transformer won’t care about that, because it is the name of -the transformer binding. In other words, a macro usually doesn’t care -about its own name.

Instead of:

> (define-syntax (our-if-v2 stx)
    (define xs (syntax->list stx))
    (datum->syntax stx `(cond [,(cadr xs) ,(caddr xs)]
                              [else ,(cadddr xs)])))

We can write:

> (define-syntax (our-if-using-match stx)
    (match (syntax->list stx)
      [(list name condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))

Great. Now let’s try using it:

> (our-if-using-match #t "true" "false")

match: undefined;

 cannot reference an identifier before its definition

  in module: 'program

  phase: 1

Oops. It’s complaining that match isn’t defined.

Our transformer function is working at compile time, not run time. And -at compile time, only racket/base is required for you -automatically—not the full racket.

Anything beyond racket/base, we have to require -ourselves—and require it for compile time using the -for-syntax form of require.

In this case, instead of using plain (require racket/match), -we want (require (for-syntax racket/match)), the -for-syntax part meaning "for compile time".

So let’s try that:

> (require (for-syntax racket/match))
> (define-syntax (our-if-using-match-v2 stx)
    (match (syntax->list stx)
      [(list _ condition true-expr false-expr)
       (datum->syntax stx `(cond [,condition ,true-expr]
                                 [else ,false-expr]))]))
> (our-if-using-match-v2 #t "true" "false")

"true"

Joy.

3.5 begin-for-syntax

We used for-syntax to require the -racket/match module because we needed to use match -at compile time.

What if we wanted to define our own helper function to be used by a -macro? One way to do that is put it in another module, and -require it using for-syntax, just like we did with -the racket/match module.

If instead we want to put the helper in the same module, we can’t -simply define it and use it—the definition would exist at -run time, but we need it at compile time. The answer is to put the -definition of the helper function(s) inside begin-for-syntax:

(begin-for-syntax
 (define (my-helper-function ....)
   ....))
(define-syntax (macro-using-my-helper-function stx)
  (my-helper-function ....)
  ....)

In the simple case, we can also use define-for-syntax, which -composes begin-for-syntax and define:

(define-for-syntax (my-helper-function ....)
  ....)
(define-syntax (macro-using-my-helper-function stx)
  (my-helper-function ....)
  ....)
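
Here is a filled-in sketch of the same idea. The helper make-cond and the macro our-if-v3 are hypothetical names, not something from Racket. The helper does the cond-building from our-if-v2, and because it is defined with define-for-syntax the macro can call it at compile time.

```racket
#lang racket

;; Hypothetical helper, defined for compile time so the macro below can call it.
;; ctx supplies the lexical context for the new syntax, as stx did in our-if-v2.
(define-for-syntax (make-cond ctx condition true-expr false-expr)
  (datum->syntax ctx `(cond [,condition ,true-expr]
                            [else ,false-expr])))

(define-syntax (our-if-v3 stx)
  (define xs (syntax->list stx))
  (make-cond stx (cadr xs) (caddr xs) (cadddr xs)))

(define result (our-if-v3 #f "true" "false"))
```

If make-cond were defined with a plain define instead, the macro would fail to find it, for the same reason match was missing earlier.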

To review:

4 Pattern matching: syntax-case and syntax-rules

Most useful syntax transformers work by taking some input syntax, and -rearranging the pieces into something else. As we saw, this is -possible but tedious using list accessors such as -cadddr. It’s more convenient and less error-prone to use -match to do pattern-matching.

Historically, syntax-case and -syntax-rules pattern matching came first. match was -added to Racket later.

It turns out that pattern-matching was one of the first improvements -to be added to the Racket macro system. It’s called -syntax-case, and has a shorthand for simple situations called -define-syntax-rule.

Recall our previous example:

(require (for-syntax racket/match))
(define-syntax (our-if-using-match-v2 stx)
  (match (syntax->list stx)
    [(list _ condition true-expr false-expr)
     (datum->syntax stx `(cond [,condition ,true-expr]
                               [else ,false-expr]))]))

Here’s what it looks like using syntax-case:

> (define-syntax (our-if-using-syntax-case stx)
    (syntax-case stx ()
      [(_ condition true-expr false-expr)
       #'(cond [condition true-expr]
               [else false-expr])]))
> (our-if-using-syntax-case #t "true" "false")

"true"

Pretty similar, huh? The pattern matching part looks almost exactly -the same. The way we specify the new syntax is simpler. We don’t need -to do quasi-quoting and unquoting. We don’t need to use -datum->syntax. Instead, we supply a "template", which uses -variables from the pattern.
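
Like match, syntax-case can have several clauses, tried in order. As a rough sketch (our-or is a made-up name, and this is not how Racket's own or is defined), here is a macro with three patterns, the last one using ... to match any number of remaining expressions:

```racket
#lang racket

(define-syntax (our-or stx)
  (syntax-case stx ()
    ;; No expressions at all.
    [(_) #'#f]
    ;; Exactly one expression.
    [(_ e) #'e]
    ;; One expression followed by at least one more.
    [(_ e more ...)
     #'(let ([v e])
         (if v v (our-or more ...)))]))
```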

There is a shorthand for simple pattern-matching cases, which expands -into syntax-case. It’s called define-syntax-rule:

> (define-syntax-rule (our-if-using-syntax-rule condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if-using-syntax-rule #t "true" "false")

"true"

Here’s the thing about define-syntax-rule. Because it’s so -simple, define-syntax-rule is often the first thing people are -taught about macros. But it’s almost deceptively simple. It looks so -much like defining a normal run time function—yet it’s not. It’s -working at compile time, not run time. Worse, the moment you want to -do more than define-syntax-rule can handle, you can fall off -a cliff into what feels like complicated and confusing -territory. Hopefully, because we started with a basic syntax -transformer, and worked up from that, we won’t have that problem. We -can appreciate define-syntax-rule as a convenient shorthand, -but not be scared of, or confused about, that for which it’s -shorthand.

Most of the materials I found for learning macros, including the -Racket Guide, do a very good job explaining -how -patterns and templates work. So I won’t regurgitate that here.

Sometimes, we need to go a step beyond the pattern and template. Let’s -look at some examples, how we can get confused, and how to get it -working.

4.1 Pattern variable vs. template—fight!

Let’s say we want to define a function with a hyphenated name, a-b, -but we supply the a and b parts separately. The Racket struct -macro does something like this: (struct foo (field1 field2)) -automatically defines a number of functions whose names are variations -on the name foo, such as foo-field1, -foo-field2, foo?, and so on.

So let’s pretend we’re doing something like that. We want to transform -the syntax (hyphen-define a b (args) body) to the syntax -(define (a-b args) body).

A wrong first attempt is:

> (define-syntax (hyphen-define/wrong1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (let ([name (string->symbol (format "~a-~a" a b))])
         #'(define (name args ...)
             body0 body ...))]))

eval:47:0: a: pattern variable cannot be used outside of a

template

  in: a

Huh. We have no idea what this error message means. Well, let’s try to -work it out. The "template" the error message refers to is the -#'(define (name args ...) body0 body ...) portion. The -let isn’t part of that template. It sounds like we can’t use -a (or b) in the let part.

In fact, syntax-case can have as many templates as you -want. The obvious, required template is the final expression supplying -the output syntax. But you can use syntax (a.k.a. #’) on a -pattern variable. This makes another template, albeit a small, "fun -size" template. Let’s try that:

> (define-syntax (hyphen-define/wrong1.1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (let ([name (string->symbol (format "~a-~a" #'a #'b))])
         #'(define (name args ...)
             body0 body ...))]))

No more error—good! Let’s try to use it:

> (hyphen-define/wrong1.1 foo bar () #t)
> (foo-bar)

foo-bar: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Apparently our macro is defining a function with some name other than -foo-bar. Huh.

This is where the Macro Stepper in DrRacket is -invaluable.

Even if you prefer mostly to use Emacs, this -is a situation where it’s definitely worth temporarily using DrRacket -for its Macro Stepper.

The Macro Stepper says that the use of our macro:

(hyphen-define/wrong1.1 foo bar () #t)

expanded to:

(define (name) #t)

Well that explains it. Instead, we wanted to expand to:

(define (foo-bar) #t)

Our template is using the symbol name but we wanted its -value, such as foo-bar in this use of our macro.

Is there anything we already know that behaves like this—where using -a variable in the template yields its value? Yes: Pattern -variables. Our pattern doesn’t include name because we don’t -expect it in the original syntax—indeed the whole point of this -macro is to create it. So name can’t be in the main -pattern. Fine—let’s make an additional pattern. We can do -that using an additional, nested syntax-case:

> (define-syntax (hyphen-define/wrong1.2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx
                                   (string->symbol (format "~a-~a" #'a #'b)))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))

Looks weird? Let’s take a deep breath. Normally our transformer -function is given syntax by Racket, and we pass that syntax to -syntax-case. But we can also create some syntax of our own, -on the fly, and pass that to syntax-case. That’s all -we’re doing here. The whole (datum->syntax ...) expression is -syntax that we’re creating on the fly. We can give that to -syntax-case, and match it using a pattern variable named -name. Voila, we have a new pattern variable. We can use it in -a template, and its value will go in the template.

We might have one more—just one, I promise!—small problem left. -Let’s try to use our new version:

> (hyphen-define/wrong1.2 foo bar () #t)
> (foo-bar)

foo-bar: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Hmm. foo-bar is still not defined. Back to the Macro -Stepper. It says now we’re expanding to:

(define (|#<syntax:11:24 foo>-#<syntax:11:28 bar>|) #t)

Oh right: #'a and #'b are syntax objects. Therefore

(string->symbol (format "~a-~a" #'a #'b))

is the printed form of both syntax objects, joined by a hyphen:

|#<syntax:11:24 foo>-#<syntax:11:28 bar>|

Instead we want the datum in the syntax objects, such as the symbols -foo and bar. Which we get using -syntax->datum:

> (define-syntax (hyphen-define/ok1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx
                                   (string->symbol (format "~a-~a"
                                                           (syntax->datum #'a)
                                                           (syntax->datum #'b))))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))
> (hyphen-define/ok1 foo bar () #t)
> (foo-bar)

#t

And now it works!

Next, some shortcuts.

4.1.1 with-syntax

Instead of an additional, nested syntax-case, we could use -with-syntax. (Another name for -with-syntax could be "with new pattern variable".) This -rearranges the syntax-case to look more like a let -statement: first the name, then the value. Also it’s more convenient -if we need to define more than one pattern variable.

> (define-syntax (hyphen-define/ok2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (datum->syntax stx
                                          (string->symbol (format "~a-~a"
                                                                  (syntax->datum #'a)
                                                                  (syntax->datum #'b))))])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok2 foo bar () #t)
> (foo-bar)

#t

Again, with-syntax is simply syntax-case rearranged:

(syntax-case <syntax> () [<pattern> <body>])
(with-syntax ([<pattern> <syntax>]) <body>)

Whether you use an additional syntax-case or use -with-syntax, either way you are simply defining additional -pattern variables. Don’t let the terminology and structure make it -seem mysterious.
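
To illustrate the "more than one pattern variable" case, here is a sketch with two new pattern variables in a single with-syntax. The macro define-get/set and the get-/set-...! naming scheme are invented for this example.

```racket
#lang racket

(define-syntax (define-get/set stx)
  (syntax-case stx ()
    [(_ id init)
     ;; Two new pattern variables, getter and setter, defined side by side.
     (with-syntax ([getter (datum->syntax
                            stx
                            (string->symbol
                             (format "get-~a" (syntax->datum #'id))))]
                   [setter (datum->syntax
                            stx
                            (string->symbol
                             (format "set-~a!" (syntax->datum #'id))))])
       #'(begin
           (define id init)
           (define (getter) id)
           (define (setter v) (set! id v))))]))

;; Defines counter, get-counter and set-counter!.
(define-get/set counter 0)
(set-counter! 5)
```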

4.1.2 with-syntax*

We know that let doesn’t let us use a binding in a subsequent -one:

> (let ([a 0]
        [b a])
    b)

a: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Instead we can nest lets:

> (let ([a 0])
    (let ([b a])
      b))

0

Or use a shorthand for nesting, let*:

> (let* ([a 0]
         [b a])
    b)

0

Similarly, instead of writing nested with-syntaxs, we can use -with-syntax*:

> (require (for-syntax racket/syntax))
> (define-syntax (foo stx)
    (syntax-case stx ()
      [(_ a)
        (with-syntax* ([b #'a]
                       [c #'b])
          #'c)]))

One gotcha is that with-syntax* isn’t provided by -racket/base. We must (require (for-syntax racket/syntax)). Otherwise we may get a rather bewildering error -message:

...: ellipses not allowed as an expression in: ....
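
For something we can actually run, here is a small sketch (pair-up is a hypothetical name) where each pattern variable is built from the one before it, which is exactly what with-syntax* allows:

```racket
#lang racket
(require (for-syntax racket/syntax))

(define-syntax (pair-up stx)
  (syntax-case stx ()
    [(_ a)
     (with-syntax* ([b #'(a a)]            ; b is built from a
                    [c #'(list 'b 'b)])    ; c is built from b
       #'c)]))

;; x is only ever quoted, so it does not need to be defined.
(define result (pair-up x))
```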

4.1.3 format-id

There is a utility function in racket/syntax called -format-id that lets us format identifier names more -succinctly than what we did above:

> (require (for-syntax racket/syntax))
> (define-syntax (hyphen-define/ok3 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (format-id stx "~a-~a" #'a #'b)])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok3 bar baz () #t)
> (bar-baz)

#t

Using format-id is convenient as it handles the tedium of -converting from syntax to symbol datum to string ... and all the way -back.
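
Since format-id is an ordinary function, we can also try it outside of a macro to see what it produces. A quick sketch, where #'here is just an arbitrary piece of syntax used to supply lexical context and the other names are made up:

```racket
#lang racket
(require racket/syntax)

;; The values for the ~a slots may be identifiers, symbols, strings and so on.
(define id1 (format-id #'here "~a-~a" #'foo #'bar))
(define id2 (format-id #'here "~a?" 'foo))

;; Unwrap the resulting identifiers to see the names that were built.
(define name1 (syntax-e id1)) ; the symbol foo-bar
(define name2 (syntax-e id2)) ; the symbol foo?
```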

4.1.4 Another example

Finally, here’s a variation that accepts an arbitrary number of name -parts to be joined with hyphens:

> (require (for-syntax racket/string racket/syntax))
> (define-syntax (hyphen-define* stx)
    (syntax-case stx ()
      [(_ (names ...) (args ...) body0 body ...)
       (let* ([names/sym (map syntax-e (syntax->list #'(names ...)))]
              [names/str (map symbol->string names/sym)]
              [name/str (string-join names/str "-")]
              [name/sym (string->symbol name/str)])
         (with-syntax ([name (datum->syntax stx name/sym)])
           #`(define (name args ...)
               body0 body ...)))]))
> (hyphen-define* (foo bar baz) (v) (* 2 v))
> (foo-bar-baz 50)

100

To review:

4.2 Making our own struct

Let’s apply what we just learned to a more-realistic example. We’ll -pretend that Racket doesn’t already have a struct -capability. Fortunately, we can write a macro to provide our own -system for defining and using structures. To keep things simple, our -structure will be immutable (read-only) and it won’t support -inheritance.

Given a structure declaration like:

(our-struct name (field1 field2 ...))

We need to define some procedures:

> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx)
    (syntax-case stx ()
      [(_ id (fields ...))
       (with-syntax ([pred-id (format-id stx "~a?" #'id)])
         #`(begin
             ; Define a constructor.
             (define (id fields ...)
               (apply vector (cons 'id  (list fields ...))))
             ; Define a predicate.
             (define (pred-id v)
               (and (vector? v)
                    (eq? (vector-ref v 0) 'id)))
             ; Define an accessor for each field.
             #,@(for/list ([x (syntax->list #'(fields ...))]
                           [n (in-naturals 1)])
                  (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                [ix n])
                    #`(define (acc-id v)
                        (unless (pred-id v)
                          (error 'acc-id "~a is not a ~a struct" v 'id))
                        (vector-ref v ix))))))]))
; Test it out
> (require rackunit)
> (our-struct foo (a b))
> (define s (foo 1 2))
> (check-true (foo? s))
> (check-false (foo? 1))
> (check-equal? (foo-a s) 1)
> (check-equal? (foo-b s) 2)
> (check-exn exn:fail?
             (lambda () (foo-a "furble")))
; The tests passed.
; Next, what if someone tries to declare:
> (our-struct "blah" ("blah" "blah"))

format-id: contract violation

  expected: (or/c string? symbol? identifier? keyword? char?

number?)

  given: #<syntax:83:0 "blah">

The error message is not very helpful. It’s coming from -format-id, which is a private implementation detail of our macro.

You may know that a syntax-case clause can take an -optional "guard" or "fender" expression. Instead of

[pattern template]

It can be:

[pattern guard template]
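
As a small sketch of how a guard behaves (describe is a made-up macro): when the guard returns #f, the clause is skipped and the next clause is tried, even though the pattern itself matched.

```racket
#lang racket

(define-syntax (describe stx)
  (syntax-case stx ()
    ;; Same pattern in all three clauses. The guards decide which one is used.
    [(_ x) (identifier? #'x) #''identifier]
    [(_ x) (string? (syntax-e #'x)) #''string]
    ;; No guard: the fallback.
    [(_ x) #''something-else]))
```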

Let’s add a guard expression to our clause:

> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx)
    (syntax-case stx ()
      [(_ id (fields ...))
       ; Guard or "fender" expression:
       (for-each (lambda (x)
                   (unless (identifier? x)
                     (raise-syntax-error #f "not an identifier" stx x)))
                 (cons #'id (syntax->list #'(fields ...))))
       (with-syntax ([pred-id (format-id stx "~a?" #'id)])
         #`(begin
             ; Define a constructor.
             (define (id fields ...)
               (apply vector (cons 'id  (list fields ...))))
             ; Define a predicate.
             (define (pred-id v)
               (and (vector? v)
                    (eq? (vector-ref v 0) 'id)))
             ; Define an accessor for each field.
             #,@(for/list ([x (syntax->list #'(fields ...))]
                           [n (in-naturals 1)])
                  (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                [ix n])
                    #`(define (acc-id v)
                        (unless (pred-id v)
                          (error 'acc-id "~a is not a ~a struct" v 'id))
                        (vector-ref v ix))))))]))
; Now the same misuse gives a better error message:
> (our-struct "blah" ("blah" "blah"))

eval:86:0: our-struct: not an identifier

  at: "blah"

  in: (our-struct "blah" ("blah" "blah"))

Later, we’ll see how syntax-parse makes it even easier to -check usage and provide helpful messages about mistakes.

4.3 Using dot notation for nested hash lookups

The previous two examples used a macro to define functions whose names -were made by joining identifiers provided to the macro. This example -does the opposite: The identifier given to the macro is split into -pieces.

If you write programs for web services you deal with JSON, which is -represented in Racket by a jsexpr?. JSON often has -dictionaries that contain other dictionaries. In a jsexpr? -these are represented by nested hasheq tables:

; Nested `hasheq's typical of a jsexpr:
> (define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

In JavaScript you can use dot notation:

foo = js.a.b.c;

In Racket it’s not so convenient:

(hash-ref (hash-ref (hash-ref js 'a) 'b) 'c)

We can write a helper function to make this a bit cleaner:

; This helper function:
> (define/contract (hash-refs h ks [def #f])
    ((hash? (listof any/c)) (any/c) . ->* . any)
    (with-handlers ([exn:fail? (const (cond [(procedure? def) (def)]
                                            [else def]))])
      (for/fold ([h h])
        ([k (in-list ks)])
        (hash-ref h k))))
; Lets us say:
> (hash-refs js '(a b c))

"value"

That’s better. Can we go even further and use a dot notation somewhat -like JavaScript?

; This macro:
> (require (for-syntax racket/syntax))
> (define-syntax (hash.refs stx)
    (syntax-case stx ()
      ; If the optional `default' is missing, assume it's #f.
      [(_ chain)
       #'(hash.refs chain #f)]
      [(_ chain default)
       (let ([xs (map (lambda (x)
                        (datum->syntax stx (string->symbol x)))
                      (regexp-split #rx"\\."
                                    (symbol->string (syntax->datum #'chain))))])
         (with-syntax ([h (car xs)]
                       [ks (cdr xs)])
           #'(hash-refs h 'ks default)))]))
; Gives us "sugar" to say this:
> (hash.refs js.a.b.c)

"value"

; Try finding a key that doesn't exist:
> (hash.refs js.blah)

#f

; Try finding a key that doesn't exist, specifying the default:
> (hash.refs js.blah 'did-not-exist)

'did-not-exist

It works!
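The splitting step is the heart of this macro, and it can be tried on its own. Here is a minimal sketch, assuming a hypothetical helper named split-dotted (it is not part of the macro above) that does at run time what hash.refs does at compile time:

```racket
#lang racket

;; split-dotted is a hypothetical helper, for illustration only.
;; It turns one dotted symbol into a list of symbols, using the same
;; symbol->string / regexp-split / string->symbol steps as hash.refs.
(define (split-dotted sym)
  (map string->symbol
       (regexp-split #rx"\\." (symbol->string sym))))

(split-dotted 'js.a.b.c) ; the list of symbols js, a, b, c
(split-dotted 'js)       ; a one-element list: no keys at all
(split-dotted 'js.)      ; js followed by the empty symbol
```

In the macro, the first piece becomes h and the remaining pieces become ks. A trailing dot produces an empty symbol, which Racket prints as ||.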

We’ve started to appreciate that our macros should give helpful messages when used in error. Let’s try to do that here.

> (require (for-syntax racket/syntax))
> (define-syntax (hash.refs stx)
    (syntax-case stx ()
      ; Check for no args at all
      [(_)
       (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                           stx #'chain)]
      [(_ chain)
       #'(hash.refs chain #f)]
      [(_ chain default)
       ; Check that chain is a symbol, not e.g. a number or string
       (unless (symbol? (syntax-e #'chain))
         (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                             stx #'chain))
       (let ([xs (map (lambda (x)
                        (datum->syntax stx (string->symbol x)))
                      (regexp-split #rx"\\."
                                    (symbol->string (syntax->datum #'chain))))])
         ; Check that we have at least hash.key
         (unless (and (>= (length xs) 2)
                      (not (eq? (syntax-e (cadr xs)) '||)))
           (raise-syntax-error #f "Expected hash.key" stx #'chain))
         (with-syntax ([h (car xs)]
                       [ks (cdr xs)])
           #'(hash-refs h 'ks default)))]))
; See if we catch each of the misuses
> (hash.refs)

eval:96:0: hash.refs: Expected (hash.key0[.key1 ...]

[default])

  at: chain

  in: (hash.refs)

> (hash.refs 0)

eval:98:0: hash.refs: Expected (hash.key0[.key1 ...]

[default])

  at: 0

  in: (hash.refs 0 #f)

> (hash.refs js)

eval:99:0: hash.refs: Expected hash.key

  at: js

  in: (hash.refs js #f)

> (hash.refs js.)

eval:100:0: hash.refs: Expected hash.key

  at: js.

  in: (hash.refs js. #f)

Not too bad. Of course, the version with error-checking is quite a bit longer. Error-checking code generally tends to obscure the logic, and does here. Fortunately we’ll soon see how syntax-parse can help mitigate that, in much the same way as contracts in normal Racket or types in Typed Racket.

Maybe we’re not convinced that writing (hash.refs js.a.b.c) is really clearer than (hash-refs js '(a b c)). Maybe we won’t actually use this approach. But the Racket macro system makes it a possible choice.

5 Syntax parameters

"Anaphoric if" or "aif" is a popular macro example. Instead of writing:

(let ([tmp (big-long-calculation)])
  (if tmp
      (foo tmp)
      #f))

You could write:

(aif (big-long-calculation)
     (foo it)
     #f)

In other words, when the condition is true, an it identifier is automatically created and set to the value of the condition. This should be easy:

> (define-syntax-rule (aif condition true-expr false-expr)
    (let ([it condition])
      (if it
          true-expr
          false-expr)))
> (aif #t (displayln it) (void))

it: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Wait, what? it is undefined?

It turns out that all along we have been protected from making a certain kind of mistake in our macros. The mistake happens when our new syntax introduces a variable that accidentally conflicts with one in the code surrounding our macro.

The Racket Reference section, Transformer Bindings, has a good explanation and example. Basically, syntax has "marks" to preserve lexical scope. This makes your macro behave like a normal function, for lexical scoping.

If a normal function defines a variable named x, it won’t conflict with a variable named x in an outer scope:

> (let ([x "outer"])
    (let ([x "inner"])
      (printf "The inner `x' is ~s\n" x))
    (printf "The outer `x' is ~s\n" x))

The inner `x' is "inner"

The outer `x' is "outer"

When our macros also respect lexical scoping, it’s easier to write reliable macros that behave predictably.
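To see that protection at work, here is a minimal sketch. The macro name my-or is hypothetical, chosen only for illustration. Its template introduces a temporary variable named tmp, and the code using the macro happens to have its own tmp:

```racket
#lang racket

;; my-or is a hypothetical macro, for illustration only.
;; Its template introduces a temporary variable named tmp.
(define-syntax-rule (my-or a b)
  (let ([tmp a])
    (if tmp tmp b)))

;; The user's code has its own, unrelated tmp.
(define result
  (let ([tmp 5])
    (my-or #f tmp)))

result ; the user's tmp, 5, and not the macro's tmp
```

If the expansion were a plain textual substitution, the user's tmp would be captured by the macro's let and the result would be #f. Because the two tmp identifiers are kept distinct, the user gets the value they expect.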

So that’s wonderful default behavior. But sometimes we want to introduce a magic variable on purpose, such as it for aif.

The way to do this is with a "syntax parameter", using define-syntax-parameter and syntax-parameterize. You’re probably familiar with regular parameters in Racket:

> (define current-foo (make-parameter "some default value"))
> (current-foo)

"some default value"

> (parameterize ([current-foo "I have a new value, for now"])
    (current-foo))

"I have a new value, for now"

> (current-foo)

"some default value"

That’s a normal parameter. The syntax variation works similarly. The idea is that we’ll define it to mean an error by default. Only inside of our aif will it have a meaningful value:

> (require racket/stxparam)
> (define-syntax-parameter it
    (lambda (stx)
      (raise-syntax-error (syntax-e stx) "can only be used inside aif")))
> (define-syntax-rule (aif condition true-expr false-expr)
    (let ([tmp condition])
      (if tmp
          (syntax-parameterize ([it (make-rename-transformer #'tmp)])
            true-expr)
          false-expr)))
> (aif 10 (displayln it) (void))

10

> (aif #f (displayln it) (void))

Inside the syntax-parameterize, it acts as an alias for tmp. The alias behavior is created by make-rename-transformer.
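A slightly more realistic use may help. The following sketch repeats the definitions above so that it runs on its own; the names table, found, absent, and nested are assumptions made up for this example:

```racket
#lang racket
(require racket/stxparam)

;; Same definitions as above, repeated so this block stands alone.
(define-syntax-parameter it
  (lambda (stx)
    (raise-syntax-error (syntax-e stx) "can only be used inside aif")))

(define-syntax-rule (aif condition true-expr false-expr)
  (let ([tmp condition])
    (if tmp
        (syntax-parameterize ([it (make-rename-transformer #'tmp)])
          true-expr)
        false-expr)))

;; Hypothetical data for the example.
(define table '((a . 1) (b . 2)))

;; assq returns the matching pair or #f, so `it' is the pair.
(define found  (aif (assq 'b table) (cdr it) 'missing))
(define absent (aif (assq 'z table) (cdr it) 'missing))

;; In a nested aif, `it' is an alias for the innermost tmp.
(define nested (aif 'outer (aif 'inner it 'no-inner) 'no-outer))
```

Here found is 2, absent is 'missing, and nested is 'inner, since each syntax-parameterize applies only to the true-expr it wraps.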

If we try to use it outside of an aif form, and it isn’t otherwise defined, we get an error like we want:

> (displayln it)

it: can only be used inside aif

But we can still define it as a normal variable:

> (define it 10)
> it

10

For a deeper look, see Keeping it Clean with Syntax Parameters.

6 What’s the point of splicing-let?

I stared at racket/splicing for the longest time. What does it do? Why would I use it? Why is it in the Macros section of the reference?

Step one: de-mythologize it. For example, using splicing-let like this:

> (require racket/splicing)
> (splicing-let ([x 0])
    (define (get-x)
      x))
; get-x is visible out here:
> (get-x)

0

; but x is not:
> x

x: undefined;

 cannot reference an identifier before its definition

  in module: 'program

is equivalent to:

> (define get-y
    (let ([y 0])
      (lambda ()
        y)))
; get-y is visible out here:
> (get-y)

0

; but y is not:
> y

y: undefined;

 cannot reference an identifier before its definition

  in module: 'program

This is the classic Lisp/Scheme/Racket idiom sometimes called "let over lambda". A koan about closures and objects. A closure hides y, which can only be accessed via get-y.

So why would we care about the splicing forms? They can be more concise, especially when there are multiple body forms:

> (require racket/splicing)
> (splicing-let ([x 0])
    (define (inc)
      (set! x (+ x 1)))
    (define (dec)
      (set! x (- x 1)))
    (define (get)
      x))

The splicing variation is more convenient than the usual way:

> (define-values (inc dec get)
    (let ([x 0])
      (values (lambda ()  ; inc
                (set! x (+ 1 x)))
              (lambda ()  ; dec
                (set! x (- x 1)))
              (lambda ()  ; get
                x))))

When there are many body forms, and we’re generating them in a macro, the splicing variations can be much easier.
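As a minimal sketch of that last point, here is a macro that generates its definitions inside a splicing-let. The macro name define-counter and the generated names are hypothetical, chosen only for illustration:

```racket
#lang racket
(require racket/splicing
         (for-syntax racket/syntax))

;; define-counter is a hypothetical macro, for illustration only.
;; (define-counter hits) defines hits-inc! and hits-get, which share
;; a hidden variable.
(define-syntax (define-counter stx)
  (syntax-case stx ()
    [(_ name)
     (with-syntax ([inc-id (format-id #'name "~a-inc!" #'name)]
                   [get-id (format-id #'name "~a-get" #'name)])
       ;; The definitions are spliced into the surrounding context,
       ;; but count stays private to them.
       #'(splicing-let ([count 0])
           (define (inc-id) (set! count (+ count 1)))
           (define (get-id) count)))]))

(define-counter hits)
(define-counter misses)
(hits-inc!)
(hits-inc!)
(misses-inc!)
```

After this, (hits-get) is 2 and (misses-get) is 1. The macro only has to emit ordinary define forms; it does not need to build a define-values with a matching list of lambdas.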

7 Robust macros: syntax-parse

Functions can be used in error. So can macros.

7.1 Error-handling strategies for functions

With plain old functions, we have several choices how to handle misuse.

1. Don’t check at all.

> (define (misuse s)
    (string-append s " snazzy suffix"))
; User of the function:
> (misuse 0)

string-append: contract violation

  expected: string?

  given: 0

  argument position: 1st

  other arguments...:

   " snazzy suffix"

; I guess I goofed, but what is this "string-append" of which you
; speak??

The problem is that the resulting error message will be confusing. Our user thinks they’re calling misuse, but is getting an error message from string-append. In this simple example they could probably guess what’s happening, but in most cases they won’t.

2. Write some error handling code.

> (define (misuse s)
    (unless (string? s)
      (error 'misuse "expected a string, but got ~a" s))
    (string-append s " snazzy suffix"))
; User of the function:
> (misuse 0)

misuse: expected a string, but got 0

; I goofed, and understand why! It's a shame the writer of the
; function had to work so hard to tell me.

Unfortunately the error code tends to overwhelm and/or obscure our function definition. Also, the error message is good but not great. Improving it would require even more error code.

3. Use a contract.

> (define/contract (misuse s)
    (string? . -> . string?)
    (string-append s " snazzy suffix"))
; User of the function:
> (misuse 0)

misuse: contract violation

  expected: string?, given: 0

  in: the 1st argument of

      (-> string? string?)

  contract from: (function misuse)

  blaming: program

  at: eval:130.0

; I goofed, and understand why! I hear the writer of the function is
; happier.

This is the best of both worlds.

The contract is simple and concise. Even better, it’s declarative. We say what we want, without needing to spell out what to do.

On the other hand the user of our function gets a very detailed error message. Plus, the message is in a standard, familiar format.

4. Use Typed Racket.

> (: misuse (String -> String))
> (define (misuse s)
    (string-append s " snazzy suffix"))
> (misuse 0)

eval:3:0: Type Checker: Expected String, but got Zero

  in: (quote 0)

With respect to error handling, Typed Racket has the same benefits as contracts. Good.

7.2 Error-handling strategies for macros

For macros, we have similar choices.

1. Ignore the possibility of misuse. This choice is even worse for macros. The default error messages are even less likely to make sense, much less help our user know what to do.

2. Write error-handling code. We saw how much this complicated our macros in our example of Using dot notation for nested hash lookups. And while we’re still learning how to write macros, we especially don’t want more cognitive load and obfuscation.

3. Use syntax/parse. For macros, this is the equivalent of using contracts or types for functions. We can declare that input pattern elements must be certain kinds of things, such as an identifier. Instead of "types", the kinds are referred to as "syntax classes". There are predefined syntax classes, plus we can define our own.
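To give a taste of the third choice, here is a minimal sketch using the predefined syntax classes id and expr. The macro name define-constant is hypothetical, chosen only for illustration:

```racket
#lang racket
(require (for-syntax syntax/parse))

;; define-constant is a hypothetical macro, for illustration only.
;; name:id declares that name must be an identifier;
;; value:expr declares that value must be an expression.
(define-syntax (define-constant stx)
  (syntax-parse stx
    [(_ name:id value:expr)
     #'(define name value)]))

(define-constant answer 42)
```

With this definition, a misuse such as (define-constant "answer" 42) should be rejected at compile time with a message saying that an identifier was expected, and we did not write any checking code ourselves.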

7.3 Using syntax/parse

November 1, 2012: So here’s the deal. After writing everything up to this point, I sat down to re-read the documentation for syntax/parse. It was...very understandable. I didn’t feel confused.

<span style='accent: "Keanu-Reeves"'>
Whoa.
</span>

Why? The documentation is written very well. Also, everything up to this point prepared me to appreciate what syntax/parse does, and why. That leaves the "how" of using it, which seems pretty straightforward, so far.

This might well be a temporary state of me "not knowing what I don’t know". As I dig in and use it more, maybe I’ll discover something confusing or tricky. If/when I do, I’ll come back here and update this.

But for now I’ll focus on improving the previous parts.

8 References and Acknowledgments

Eli Barzilay’s blog post, Writing ‘syntax-case’ Macros, helped me understand many key details and concepts. It also inspired me to use a "bottom-up" approach. However he wrote for a specific audience. If you’re not already familiar with un-hygienic defmacro style macros, it may seem slightly weird to the extent it’s trying to convince you to change an opinion you don’t have. I’m writing for people who don’t have any opinion about macros at all, except maybe that macros seem scary and daunting.

Eli wrote another blog post, Dirty Looking Hygiene, which explains syntax-parameterize. I relied heavily on that, mostly just updating it since his post was written before PLT Scheme was renamed to Racket.

Matthew Flatt’s Composable and Compilable Macros: You Want it When? explains how Racket handles compile time vs. run time.

Chapter 8 of The Scheme Programming Language by Kent Dybvig explains syntax-rules and syntax-case. Although more "formal" in tone, you may find it helpful to read it. You never know which explanation or examples of something will click for you.

After initially wondering if I was asking the wrong question and conflating two different issues :), Shriram Krishnamurthi looked at an early draft and encouraged me to keep going. Sam Tobin-Hochstadt and Robby Findler also encouraged me. Matthew Flatt showed me how to make a Scribble interaction print syntax as "syntax" rather than as "#'". Jay McCarthy helped me catch some mistakes and confusions. Jon Rafkind pointed out some problems. Kieron Hardy reported a font issue and some typos.

Finally, I noticed something strange. After writing much of this, when I returned to some parts of the Racket documentation, I noticed it had improved since I last read it. Of course, it was the same. I’d changed. It’s interesting how much of what we already know is projected between the lines. My point is, the Racket documentation is very good. The Guide provides helpful examples and tutorials. The Reference is very clear and precise.

9 Epilogue

"Before I had studied Chan (Zen) for thirty years, I saw mountains as -mountains, and rivers as rivers. When I arrived at a more intimate -knowledge, I came to the point where I saw that mountains are not -mountains, and rivers are not rivers. But now that I have got its very -substance I am at rest. For it’s just that I see mountains once again -as mountains, and rivers once again as rivers"

–Buddhist saying originally formulated by Qingyuan Weixin, -later translated by D.T. Suzuki in his Essays in Zen -Buddhism.

Translated into Racket:

(dynamic-wind (lambda ()
                (and (eq? 'mountains 'mountains)
                     (eq? 'rivers 'rivers)))
              (lambda ()
                (not (and (eq? 'mountains 'mountains)
                          (eq? 'rivers 'rivers))))
              (lambda ()
                (and (eq? 'mountains 'mountains)
                     (eq? 'rivers 'rivers))))
 
\ No newline at end of file +Fear of Macros

Fear of Macros

+
Copyright (c) 2012 by Greg Hendershott. All rights reserved.
Last updated 2012-11-13 13:54:25
Feedback and corrections are welcome here.

Contents:

    1 Preface

    2 Our plan of attack

    3 Transform!

      3.1 What is a syntax transformer?

      3.2 What’s the input?

      3.3 Actually transforming the input

      3.4 Compile time vs. run time

      3.5 begin-for-syntax

    4 Pattern matching: syntax-case and syntax-rules

      4.1 Pattern variable vs. template—fight!

        4.1.1 with-syntax

        4.1.2 with-syntax*

        4.1.3 format-id

        4.1.4 Another example

      4.2 Making our own struct

      4.3 Using dot notation for nested hash lookups

    5 Syntax parameters

    6 What’s the point of splicing-let?

    7 Robust macros: syntax-parse

      7.1 Error-handling strategies for functions

      7.2 Error-handling strategies for macros

      7.3 Using syntax/parse

    8 References and Acknowledgments

    9 Epilogue

 
\ No newline at end of file diff --git a/main.rkt b/index.rkt similarity index 99% rename from main.rkt rename to index.rkt index e4b97f3..e860969 100644 --- a/main.rkt +++ b/index.rkt @@ -606,7 +606,7 @@ them available at compile time.} @; ---------------------------------------------------------------------------- @; ---------------------------------------------------------------------------- -@section{Pattern matching: syntax-case and syntax-rules} +@section[#:tag "pattern-matching"]{Pattern matching: syntax-case and syntax-rules} Most useful syntax transformers work by taking some input syntax, and rearranging the pieces into something else. As we saw, this is @@ -1493,8 +1493,8 @@ great. Improving it would require even more error code. (string-append s " snazzy suffix")) ;; User of the function: (misuse 0) -;; I goofed, and understand why! I hear the writer of the function is -;; happier. +;; I goofed, and understand why! I'm happier, and I hear the writer of +;; the function is happier, too. ) This is the best of both worlds. diff --git a/make-doc.sh b/make-doc.sh index ea5aad4..b5596f0 100755 --- a/make-doc.sh +++ b/make-doc.sh @@ -1,2 +1,3 @@ -scribble --html ++style gh.css ++xref-in setup/xref load-collections-xref --redirect-main "http://docs.racket-lang.org/" main.rkt +scribble --htmls ++style gh.css ++xref-in setup/xref load-collections-xref --redirect-main "http://docs.racket-lang.org/" index.rkt racket add-to-head.rkt +cp index/ ./ diff --git a/pattern-matching.html b/pattern-matching.html new file mode 100644 index 0000000..0b7b77e --- /dev/null +++ b/pattern-matching.html @@ -0,0 +1,121 @@ + +4 Pattern matching: syntax-case and syntax-rules
On this page:
4.1 Pattern variable vs. template—fight!
4.1.1 with-syntax
4.1.2 with-syntax*
4.1.3 format-id
4.1.4 Another example
4.2 Making our own struct
4.3 Using dot notation for nested hash lookups

4 Pattern matching: syntax-case and syntax-rules

Most useful syntax transformers work by taking some input syntax, and rearranging the pieces into something else. As we saw, this is possible but tedious using list accessors such as cadddr. It’s more convenient and less error-prone to use match to do pattern-matching.

Historically, syntax-case and syntax-rules pattern matching came first. match was added to Racket later.

It turns out that pattern-matching was one of the first improvements to be added to the Racket macro system. It’s called syntax-case, and has a shorthand for simple situations called define-syntax-rule.

Recall our previous example:

(require (for-syntax racket/match))
(define-syntax (our-if-using-match-v2 stx)
  (match (syntax->list stx)
    [(list _ condition true-expr false-expr)
     (datum->syntax stx `(cond [,condition ,true-expr]
                               [else ,false-expr]))]))

Here’s what it looks like using syntax-case:

> (define-syntax (our-if-using-syntax-case stx)
    (syntax-case stx ()
      [(_ condition true-expr false-expr)
       #'(cond [condition true-expr]
               [else false-expr])]))
> (our-if-using-syntax-case #t "true" "false")

"true"

Pretty similar, huh? The pattern matching part looks almost exactly the same. The way we specify the new syntax is simpler. We don’t need to do quasi-quoting and unquoting. We don’t need to use datum->syntax. Instead, we supply a "template", which uses variables from the pattern.

There is a shorthand for simple pattern-matching cases, which expands into syntax-case. It’s called define-syntax-rule:

> (define-syntax-rule (our-if-using-syntax-rule condition true-expr false-expr)
    (cond [condition true-expr]
          [else false-expr]))
> (our-if-using-syntax-rule #t "true" "false")

"true"

Here’s the thing about define-syntax-rule. Because it’s so simple, define-syntax-rule is often the first thing people are taught about macros. But it’s almost deceptively simple. It looks so much like defining a normal run time function, yet it’s not. It’s working at compile time, not run time. Worse, the moment you want to do more than define-syntax-rule can handle, you can fall off a cliff into what feels like complicated and confusing territory. Hopefully, because we started with a basic syntax transformer, and worked up from that, we won’t have that problem. We can appreciate define-syntax-rule as a convenient shorthand, but not be scared of, or confused about, that for which it’s shorthand.

Most of the materials I found for learning macros, including the Racket Guide, do a very good job explaining how patterns and templates work. So I won’t regurgitate that here.

Sometimes, we need to go a step beyond the pattern and template. Let’s look at some examples, how we can get confused, and how to get it working.

4.1 Pattern variable vs. template—fight!

Let’s say we want to define a function with a hyphenated name, a-b, but we supply the a and b parts separately. The Racket struct macro does something like this: (struct foo (field1 field2)) automatically defines a number of functions whose names are variations on the name foo, such as foo-field1, foo-field2, foo?, and so on.

So let’s pretend we’re doing something like that. We want to transform the syntax (hyphen-define a b (args) body) to the syntax (define (a-b args) body).

A wrong first attempt is:

> (define-syntax (hyphen-define/wrong1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (let ([name (string->symbol (format "~a-~a" a b))])
         #'(define (name args ...)
             body0 body ...))]))

eval:47:0: a: pattern variable cannot be used outside of a

template

  in: a

Huh. We have no idea what this error message means. Well, let’s try to work it out. The "template" the error message refers to is the #'(define (name args ...) body0 body ...) portion. The let isn’t part of that template. It sounds like we can’t use a (or b) in the let part.

In fact, syntax-case can have as many templates as you want. The obvious, required template is the final expression supplying the output syntax. But you can use syntax (a.k.a. #') on a pattern variable. This makes another template, albeit a small, "fun size" template. Let’s try that:

> (define-syntax (hyphen-define/wrong1.1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (let ([name (string->symbol (format "~a-~a" #'a #'b))])
         #'(define (name args ...)
             body0 body ...))]))

No more error—good! Let’s try to use it:

> (hyphen-define/wrong1.1 foo bar () #t)
> (foo-bar)

foo-bar: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Apparently our macro is defining a function with some name other than foo-bar. Huh.

This is where the Macro Stepper in DrRacket is invaluable.

Even if you prefer mostly to use Emacs, this is a situation where it’s definitely worth temporarily using DrRacket for its Macro Stepper.

The Macro Stepper says that the use of our macro:

(hyphen-define/wrong1.1 foo bar () #t)

expanded to:

(define (name) #t)

Well that explains it. Instead, we wanted to expand to:

(define (foo-bar) #t)

Our template is using the symbol name but we wanted its value, such as foo-bar in this use of our macro.

Is there anything we already know that behaves like this, where using a variable in the template yields its value? Yes: Pattern variables. Our pattern doesn’t include name because we don’t expect it in the original syntax. Indeed the whole point of this macro is to create it. So name can’t be in the main pattern. Fine, let’s make an additional pattern. We can do that using an additional, nested syntax-case:

> (define-syntax (hyphen-define/wrong1.2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx
                                   (string->symbol (format "~a-~a" #'a #'b)))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))

Looks weird? Let’s take a deep breath. Normally our transformer function is given syntax by Racket, and we pass that syntax to syntax-case. But we can also create some syntax of our own, on the fly, and pass that to syntax-case. That’s all we’re doing here. The whole (datum->syntax ...) expression is syntax that we’re creating on the fly. We can give that to syntax-case, and match it using a pattern variable named name. Voila, we have a new pattern variable. We can use it in a template, and its value will go in the template.

We might have one more (just one, I promise!) small problem left. Let’s try to use our new version:

> (hyphen-define/wrong1.2 foo bar () #t)
> (foo-bar)

foo-bar: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Hmm. foo-bar is still not defined. Back to the Macro Stepper. It says now we’re expanding to:

(define (|#<syntax:11:24foo>-#<syntax:11:28 bar>|) #t)

Oh right: #'a and #'b are syntax objects. Therefore

(string->symbol (format "~a-~a" #'a #'b))

is the printed form of both syntax objects, joined by a hyphen:

|#<syntax:11:24foo>-#<syntax:11:28 bar>|

Instead we want the datum in the syntax objects, such as the symbols foo and bar. Which we get using syntax->datum:

> (define-syntax (hyphen-define/ok1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx
                                   (string->symbol (format "~a-~a"
                                                           (syntax->datum #'a)
                                                           (syntax->datum #'b))))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))
> (hyphen-define/ok1 foo bar () #t)
> (foo-bar)

#t

And now it works!

Next, some shortcuts.

4.1.1 with-syntax

Instead of an additional, nested syntax-case, we could use with-syntax. (Another name for with-syntax could be "with new pattern variable".) This rearranges the syntax-case to look more like a let statement: first the name, then the value. Also it’s more convenient if we need to define more than one pattern variable.

> (define-syntax (hyphen-define/ok2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (datum->syntax stx
                                          (string->symbol (format "~a-~a"
                                                                  (syntax->datum #'a)
                                                                  (syntax->datum #'b))))])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok2 foo bar () #t)
> (foo-bar)

#t

Again, with-syntax is simply syntax-case rearranged:

(syntax-case <syntax> () [<pattern> <body>])
(with-syntax ([<pattern> <syntax>]) <body>)

Whether you use an additional syntax-case or use with-syntax, either way you are simply defining additional pattern variables. Don’t let the terminology and structure make it seem mysterious.
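The convenience shows up once a macro needs more than one new pattern variable. Here is a minimal sketch; the macro name define-get/set and the get-/set- naming scheme are hypothetical, chosen only for illustration:

```racket
#lang racket

;; define-get/set is a hypothetical macro, for illustration only.
;; (define-get/set counter 0) defines counter, get-counter and
;; set-counter!.
(define-syntax (define-get/set stx)
  (syntax-case stx ()
    [(_ name init)
     ;; One with-syntax creates both new pattern variables, instead
     ;; of two nested syntax-case forms.
     (with-syntax ([getter (datum->syntax
                            stx
                            (string->symbol
                             (format "get-~a" (syntax->datum #'name))))]
                   [setter (datum->syntax
                            stx
                            (string->symbol
                             (format "set-~a!" (syntax->datum #'name))))])
       #'(begin
           (define name init)
           (define (getter) name)
           (define (setter v) (set! name v))))]))

(define-get/set counter 0)
(set-counter! 5)
```

After this, (get-counter) returns 5. Each [pattern syntax] pair in the with-syntax plays the same role as one nested syntax-case clause.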

4.1.2 with-syntax*

We know that let doesn’t let us use a binding in a subsequent one:

> (let ([a 0]
        [b a])
    b)

a: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Instead we can nest lets:

> (let ([a 0])
    (let ([b a])
      b))

0

Or use a shorthand for nesting, let*:

> (let* ([a 0]
         [b a])
    b)

0

Similarly, instead of writing nested with-syntaxs, we can use with-syntax*:

> (require (for-syntax racket/syntax))
> (define-syntax (foo stx)
    (syntax-case stx ()
      [(_ a)
        (with-syntax* ([b #'a]
                       [c #'b])
          #'c)]))

One gotcha is that with-syntax* isn’t provided by racket/base. We must (require (for-syntax racket/syntax)). Otherwise we may get a rather bewildering error message:

...: ellipses not allowed as an expression in: ....
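Here is a minimal sketch of with-syntax* where the second pattern variable really does depend on the first. The macro name define-doubled and the naming scheme are hypothetical, chosen only for illustration:

```racket
#lang racket
(require (for-syntax racket/syntax))

;; define-doubled is a hypothetical macro, for illustration only.
;; (define-doubled size 21) defines size-42 as 42.
(define-syntax (define-doubled stx)
  (syntax-case stx ()
    [(_ name n)
     (with-syntax* ([doubled (* 2 (syntax->datum #'n))]
                    ;; This binding uses doubled from the line above,
                    ;; which plain with-syntax would not allow.
                    [new-name (format-id #'name "~a-~a"
                                         #'name
                                         (syntax->datum #'doubled))])
       #'(define new-name doubled))]))

(define-doubled size 21)
```

After this, size-42 is defined as 42. This sketch assumes n is a literal number, since the doubling happens at compile time.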

4.1.3 format-id

There is a utility function in racket/syntax called format-id that lets us format identifier names more succinctly than what we did above:

> (require (for-syntax racket/syntax))
> (define-syntax (hyphen-define/ok3 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (format-id stx "~a-~a" #'a #'b)])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok3 bar baz () #t)
> (bar-baz)

#t

Using format-id is convenient as it handles the tedium of converting from syntax to symbol datum to string ... and all the way back.

4.1.4 Another example

Finally, here’s a variation that accepts an arbitrary number of name parts to be joined with hyphens:

> (require (for-syntax racket/string racket/syntax))
> (define-syntax (hyphen-define* stx)
    (syntax-case stx ()
      [(_ (names ...) (args ...) body0 body ...)
       (let* ([names/sym (map syntax-e (syntax->list #'(names ...)))]
              [names/str (map symbol->string names/sym)]
              [name/str (string-join names/str "-")]
              [name/sym (string->symbol name/str)])
         (with-syntax ([name (datum->syntax stx name/sym)])
           #`(define (name args ...)
               body0 body ...)))]))
> (hyphen-define* (foo bar baz) (v) (* 2 v))
> (foo-bar-baz 50)

100

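To review: We can’t use a pattern variable outside of a template, but applying syntax (a.k.a. #') to a pattern variable makes a small template on the spot. To munge pattern variables before using them in the main template, create new pattern variables with a nested syntax-case, with-syntax, or with-syntax*. Use syntax->datum to get at the datum inside a syntax object. And format-id is a convenient shortcut for building identifier names.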

4.2 Making our own struct

Let’s apply what we just learned to a more-realistic example. We’ll pretend that Racket doesn’t already have a struct capability. Fortunately, we can write a macro to provide our own system for defining and using structures. To keep things simple, our structure will be immutable (read-only) and it won’t support inheritance.

Given a structure declaration like:

(our-struct name (field1 field2 ...))

We need to define some procedures:

> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx)
    (syntax-case stx ()
      [(_ id (fields ...))
       (with-syntax ([pred-id (format-id stx "~a?" #'id)])
         #`(begin
             ; Define a constructor.
             (define (id fields ...)
               (apply vector (cons 'id  (list fields ...))))
             ; Define a predicate.
             (define (pred-id v)
               (and (vector? v)
                    (eq? (vector-ref v 0) 'id)))
             ; Define an accessor for each field.
             #,@(for/list ([x (syntax->list #'(fields ...))]
                           [n (in-naturals 1)])
                  (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                [ix n])
                    #`(define (acc-id v)
                        (unless (pred-id v)
                          (error 'acc-id "~a is not a ~a struct" v 'id))
                        (vector-ref v ix))))))]))
; Test it out
> (require rackunit)
> (our-struct foo (a b))
> (define s (foo 1 2))
> (check-true (foo? s))
> (check-false (foo? 1))
> (check-equal? (foo-a s) 1)
> (check-equal? (foo-b s) 2)
> (check-exn exn:fail?
             (lambda () (foo-a "furble")))
; The tests passed.
; Next, what if someone tries to declare:
> (our-struct "blah" ("blah" "blah"))

format-id: contract violation

  expected: (or/c string? symbol? identifier? keyword? char?

number?)

  given: #<syntax:83:0 "blah">

The error message is not very helpful. It’s coming from format-id, which is a private implementation detail of our macro.

You may know that a syntax-case clause can take an optional "guard" or "fender" expression. Instead of

[pattern template]

It can be:

[pattern guard template]

Let’s add a guard expression to our clause:

> (require (for-syntax racket/syntax))
> (define-syntax (our-struct stx)
    (syntax-case stx ()
      [(_ id (fields ...))
       ; Guard or "fender" expression:
       (for-each (lambda (x)
                   (unless (identifier? x)
                     (raise-syntax-error #f "not an identifier" stx x)))
                 (cons #'id (syntax->list #'(fields ...))))
       (with-syntax ([pred-id (format-id stx "~a?" #'id)])
         #`(begin
             ; Define a constructor.
             (define (id fields ...)
               (apply vector (cons 'id  (list fields ...))))
             ; Define a predicate.
             (define (pred-id v)
               (and (vector? v)
                    (eq? (vector-ref v 0) 'id)))
             ; Define an accessor for each field.
             #,@(for/list ([x (syntax->list #'(fields ...))]
                           [n (in-naturals 1)])
                  (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                [ix n])
                    #`(define (acc-id v)
                        (unless (pred-id v)
                          (error 'acc-id "~a is not a ~a struct" v 'id))
                        (vector-ref v ix))))))]))
; Now the same misuse gives a better error message:
> (our-struct "blah" ("blah" "blah"))

eval:86:0: our-struct: not an identifier

  at: "blah"

  in: (our-struct "blah" ("blah" "blah"))

Later, we’ll see how syntax-parse makes it even easier to check usage and provide helpful messages about mistakes.

4.3 Using dot notation for nested hash lookups

The previous two examples used a macro to define functions whose names were made by joining identifiers provided to the macro. This example does the opposite: The identifier given to the macro is split into pieces.

If you write programs for web services you deal with JSON, which is represented in Racket by a jsexpr?. JSON often has dictionaries that contain other dictionaries. In a jsexpr? these are represented by nested hasheq tables:

; Nested `hasheq's typical of a jsexpr:
> (define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

In JavaScript you can use dot notation:

foo = js.a.b.c;

In Racket it’s not so convenient:

(hash-ref (hash-ref (hash-ref js 'a) 'b) 'c)

We can write a helper function to make this a bit cleaner:

; This helper function:
> (define/contract (hash-refs h ks [def #f])
    ((hash? (listof any/c)) (any/c) . ->* . any)
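    ; Use a lambda rather than `const' so a procedure default is
    ; called only when a lookup actually fails.
    (with-handlers ([exn:fail? (lambda (_) (cond [(procedure? def) (def)]
                                                 [else def]))])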
      (for/fold ([h h])
        ([k (in-list ks)])
        (hash-ref h k))))
; Lets us say:
> (hash-refs js '(a b c))

"value"
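
The optional default can be tried out the same way. This is a sketch that repeats the helper so the block runs on its own; it is written with a `lambda` handler so that a procedure default is called only when a lookup fails. The key `x` and the default values are made up for the example.

```racket
#lang racket

;; The helper, repeated here so this block is self-contained.
(define/contract (hash-refs h ks [def #f])
  ((hash? (listof any/c)) (any/c) . ->* . any)
  (with-handlers ([exn:fail? (lambda (_) (cond [(procedure? def) (def)]
                                               [else def]))])
    (for/fold ([h h])
              ([k (in-list ks)])
      (hash-ref h k))))

(define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

(hash-refs js '(a b c))                     ; every key exists
(hash-refs js '(a x c))                     ; missing key, default is #f
(hash-refs js '(a x c) 'missing)            ; missing key, explicit default
(hash-refs js '(a x c) (lambda () "thunk")) ; a procedure default is called
```

Because the handler catches any exn:fail?, a chain that runs past a non-hash value (for example `'(a b c d)`) also yields the default rather than an error.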

That’s better. Can we go even further and use a dot notation somewhat like JavaScript?

; This macro:
> (require (for-syntax racket/syntax))
> (define-syntax (hash.refs stx)
    (syntax-case stx ()
      ; If the optional `default' is missing, assume it's #f.
      [(_ chain)
       #'(hash.refs chain #f)]
      [(_ chain default)
       (let ([xs (map (lambda (x)
                        (datum->syntax stx (string->symbol x)))
                      (regexp-split #rx"\\."
                                    (symbol->string (syntax->datum #'chain))))])
         (with-syntax ([h (car xs)]
                       [ks (cdr xs)])
           #'(hash-refs h 'ks default)))]))
; Gives us "sugar" to say this:
> (hash.refs js.a.b.c)

"value"

; Try finding a key that doesn't exist:
> (hash.refs js.blah)

#f

; Try finding a key that doesn't exist, specifying the default:
> (hash.refs js.blah 'did-not-exist)

'did-not-exist

It works!
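
The splitting that the macro does at compile time can be tried on plain symbols at run time. This is only a sketch: `split-dotted` is a hypothetical helper, not part of the macro, but it uses the same regexp-split and string->symbol steps.

```racket
#lang racket

;; `split-dotted` is a hypothetical helper for illustration only.
;; It splits a symbol on dots, as the macro does with its `chain` argument.
(define (split-dotted sym)
  (map string->symbol
       (regexp-split #rx"\\." (symbol->string sym))))

(split-dotted 'js.a.b.c) ; => '(js a b c): the hash name, then the keys
(split-dotted 'js)       ; => '(js): no dot, so there are no keys
(split-dotted 'js.)      ; => '(js ||): a trailing dot yields the empty symbol
```

In the macro, the first piece becomes the hash expression and the remaining pieces become the quoted key list, so (hash.refs js.a.b.c) expands to (hash-refs js '(a b c) #f). The last two cases are inputs the macro does not yet check for.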

We’ve started to appreciate that our macros should give helpful messages when used in error. Let’s try to do that here.

> (require (for-syntax racket/syntax))
> (define-syntax (hash.refs stx)
    (syntax-case stx ()
      ; Check for no args at all
      [(_)
       (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                           stx #'chain)]
      [(_ chain)
       #'(hash.refs chain #f)]
      [(_ chain default)
       ; Check that chain is a symbol, not e.g. a number or string
       (unless (symbol? (syntax-e #'chain))
         (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                             stx #'chain))
       (let ([xs (map (lambda (x)
                        (datum->syntax stx (string->symbol x)))
                      (regexp-split #rx"\\."
                                    (symbol->string (syntax->datum #'chain))))])
         ; Check that we have at least hash.key
         (unless (and (>= (length xs) 2)
                      (not (eq? (syntax-e (cadr xs)) '||)))
           (raise-syntax-error #f "Expected hash.key" stx #'chain))
         (with-syntax ([h (car xs)]
                       [ks (cdr xs)])
           #'(hash-refs h 'ks default)))]))
; See if we catch each of the misuses
> (hash.refs)

eval:96:0: hash.refs: Expected (hash.key0[.key1 ...] [default])

  at: chain

  in: (hash.refs)

> (hash.refs 0)

eval:98:0: hash.refs: Expected (hash.key0[.key1 ...] [default])

  at: 0

  in: (hash.refs 0 #f)

> (hash.refs js)

eval:99:0: hash.refs: Expected hash.key

  at: js

  in: (hash.refs js #f)

> (hash.refs js.)

eval:100:0: hash.refs: Expected hash.key

  at: js.

  in: (hash.refs js. #f)

Not too bad. Of course, the version with error-checking is quite a bit longer. Error-checking code generally tends to obscure the logic, and does here. Fortunately we’ll soon see how syntax-parse can help mitigate that, in much the same way as contracts in normal Racket or types in Typed Racket.

Maybe we’re not convinced that writing (hash.refs js.a.b.c) is really clearer than (hash-refs js '(a b c)). Maybe we won’t actually use this approach. But the Racket macro system makes it a possible choice.

 
\ No newline at end of file