From f5a1fd6d381d04e8e786a9b70b76c0aae790d503 Mon Sep 17 00:00:00 2001
From: Greg Hendershott
Date: Mon, 12 Nov 2012 23:15:08 -0500
Subject: [PATCH] Regen html

---
 index.html | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/index.html b/index.html
index 2726a3f..a74159c 100644
--- a/index.html
+++ b/index.html
@@ -1,6 +1,6 @@
-Fear of Macros

Fear of Macros

-
Copyright (c) 2012 by Greg Hendershott. All rights reserved.
Last updated 2012-11-12 19:48:10
Feedback and corrections are welcome here.

Contents:

    1 Preface

    2 Our plan of attack

    3 Transform!

      3.1 What is a syntax transformer?

      3.2 What’s the input?

      3.3 Actually transforming the input

      3.4 Compile time vs. run time

      3.5 begin-for-syntax

    4 Pattern matching: syntax-case and syntax-rules

      4.1 Pattern variable vs. template—fight!

        4.1.1 with-syntax

        4.1.2 format-id

        4.1.3 Another example

      4.2 Making our own struct

      4.3 Using dot notation for nested hash lookups

    5 Syntax parameters

    6 What’s the point of splicing-let?

    7 Robust macros: syntax-parse

      7.1 Error-handling strategies for functions

      7.2 Error-handling strategies for macros

      7.3 Using syntax/parse

    8 References and Acknowledgments

    9 Epilogue

1 Preface

I learned Racket after 25 years of mostly using C and C++.

Some psychic whiplash resulted.

"All the parentheses" was actually not a big deal. Instead, the first +Fear of Macros

Fear of Macros

+
Copyright (c) 2012 by Greg Hendershott. All rights reserved.
Last updated 2012-11-12 23:13:02
Feedback and corrections are welcome here.

Contents:

    1 Preface

    2 Our plan of attack

    3 Transform!

      3.1 What is a syntax transformer?

      3.2 What’s the input?

      3.3 Actually transforming the input

      3.4 Compile time vs. run time

      3.5 begin-for-syntax

    4 Pattern matching: syntax-case and syntax-rules

      4.1 Pattern variable vs. template—fight!

        4.1.1 with-syntax

        4.1.2 with-syntax*

        4.1.3 format-id

        4.1.4 Another example

      4.2 Making our own struct

      4.3 Using dot notation for nested hash lookups

    5 Syntax parameters

    6 What’s the point of splicing-let?

    7 Robust macros: syntax-parse

      7.1 Error-handling strategies for functions

      7.2 Error-handling strategies for macros

      7.3 Using syntax/parse

    8 References and Acknowledgments

    9 Epilogue

1 Preface

I learned Racket after 25 years of mostly using C and C++.

Some psychic whiplash resulted.

"All the parentheses" was actually not a big deal. Instead, the first mind warp was functional programming. Before long I wrapped my brain around it, and went on to become comfortable and effective with many other aspects and features of Racket.

But two final frontiers remained: Macros and continuations.

I found that simple macros were easy and understandable, plus there @@ -210,7 +210,7 @@ a template, and its value will go in the template.

We might have one more& Let’s try to use our new version:

> (hyphen-define/wrong1.2 foo bar () #t)
> (foo-bar)

foo-bar: undefined;

 cannot reference an identifier before its definition

  in module: 'program

Hmm. foo-bar is still not defined. Back to the Macro Stepper. It says now we’re expanding to:

(define (|#<syntax:11:24foo>-#<syntax:11:28 bar>|) #t)

Oh right: #'a and #'b are syntax objects. Therefore

(string->symbol (format "~a-~a" #'a #'b))

is the printed form of both syntax objects, joined by a hyphen:

|#<syntax:11:24foo>-#<syntax:11:28 bar>|

Instead we want the datum in the syntax objects, such as the symbols foo and bar. Which we get using -syntax->datum:

> (define-syntax (hyphen-define/ok1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx
                                   (string->symbol (format "~a-~a"
                                                           (syntax->datum #'a)
                                                           (syntax->datum #'b))))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))
> (hyphen-define/ok1 foo bar () #t)
> (foo-bar)

#t

And now it works!

4.1.1 with-syntax

Now for two shortcuts.

Instead of an additional, nested syntax-case, we could use +syntax->datum:

> (define-syntax (hyphen-define/ok1 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (syntax-case (datum->syntax stx
                                   (string->symbol (format "~a-~a"
                                                           (syntax->datum #'a)
                                                           (syntax->datum #'b))))
                    ()
         [name #'(define (name args ...)
                   body0 body ...)])]))
> (hyphen-define/ok1 foo bar () #t)
> (foo-bar)

#t

And now it works!

Next, some shortcuts.

4.1.1 with-syntax

Instead of an additional, nested syntax-case, we could use with-syntax. (Another name for with-syntax could be "define pattern variable".) This rearranges the syntax-case to look more like a let @@ -218,18 +218,21 @@ statement—first the name, then the value. Also it’s more if we need to define more than one pattern variable.

> (define-syntax (hyphen-define/ok2 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (datum->syntax stx
                                          (string->symbol (format "~a-~a"
                                                                  (syntax->datum #'a)
                                                                  (syntax->datum #'b))))])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok2 foo bar () #t)
> (foo-bar)

#t

Whether you use an additional syntax-case or use with-syntax, either way you are simply defining an additional pattern variable. Don’t let the terminology and structure make it seem -mysterious.

4.1.2 format-id

Also, there is a utility function in racket/syntax called -format-id that lets us format identifier names more -succinctly. As we’ve learned, we need to require the module -using for-syntax, since we need it at compile time:

> (require (for-syntax racket/syntax))
> (define-syntax (hyphen-define/ok3 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (format-id stx "~a-~a" #'a #'b)])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok3 bar baz () #t)
> (bar-baz)

#t

Using format-id is convenient as it handles the tedium of +mysterious.

4.1.2 with-syntax*

We may recall that let doesn’t let us use a definition in a +subsequent clause:

> (let ([a 0]
        [b a])
    (values a b))

a: undefined;

 cannot reference an identifier before its definition

  in module: 'program

We could nest lets:

> (let ([a 0])
    (let ([b a])
      (values a b)))

0

0

Or we could use let*:

> (let* ([a 0]
         [b a])
    (values a b))

0

0

Similarly there is a with-syntax* variation of +with-syntax:

> (require (for-syntax racket/syntax))
> (define-syntax (foo stx)
    (syntax-case stx ()
      [(_ a)
        (with-syntax* ([b #'a]
                       [c #'b])
          #'c)]))

One gotcha is that with-syntax* isn’t provided by +racket/base. We must (require (for-syntax racket/syntax)). Otherwise we may get a rather bewildering error +message:

...: ellipses not allowed as an expression in: ....
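
As a rough sketch of why the sequential binding matters, here is a small macro whose second clause refers to the pattern variable bound by the first. The name define-doubled is hypothetical and not part of the text above. With plain with-syntax, tmp in the second clause would be an ordinary identifier rather than a pattern variable, so the expansion would refer to an unbound tmp.

```racket
#lang racket/base
(require (for-syntax racket/base racket/syntax))

;; Hypothetical macro, for illustration only.
(define-syntax (define-doubled stx)
  (syntax-case stx ()
    [(_ id expr)
     ;; The second clause's template uses `tmp`, the pattern variable
     ;; bound by the first clause. This needs with-syntax*.
     (with-syntax* ([tmp #'expr]
                    [twice #'(+ tmp tmp)])
       #'(define id twice))]))

(define-doubled x 21)
x
```

Here x is 42. Note that this sketch copies expr into the expansion twice, so an expression with side effects would run twice; it is meant only to show the binding order.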

4.1.3 format-id

There is a utility function in racket/syntax called +format-id that lets us format identifier names more +succinctly than what we did above:

> (require (for-syntax racket/syntax))
> (define-syntax (hyphen-define/ok3 stx)
    (syntax-case stx ()
      [(_ a b (args ...) body0 body ...)
       (with-syntax ([name (format-id stx "~a-~a" #'a #'b)])
         #'(define (name args ...)
             body0 body ...))]))
> (hyphen-define/ok3 bar baz () #t)
> (bar-baz)

#t

Using format-id is convenient as it handles the tedium of converting from syntax to symbol datum to string ... and all the way -back.

4.1.3 Another example

Finally, here’s a variation that accepts an arbitrary number of name +back.

4.1.4 Another example

Finally, here’s a variation that accepts an arbitrary number of name parts to be joined with hyphens:

> (require (for-syntax racket/string racket/syntax))
> (define-syntax (hyphen-define* stx)
    (syntax-case stx ()
      [(_ (names ...) (args ...) body0 body ...)
       (let* ([names/sym (map syntax-e (syntax->list #'(names ...)))]
              [names/str (map symbol->string names/sym)]
              [name/str (string-join names/str "-")]
              [name/sym (string->symbol name/str)])
         (with-syntax ([name (datum->syntax stx name/sym)])
           #`(define (name args ...)
               body0 body ...)))]))
> (hyphen-define* (foo bar baz) (v) (* 2 v))
> (foo-bar-baz 50)

100

To review:

  • You can’t use a pattern variable outside of a template. But you can use syntax or #' on a pattern variable to make an ad hoc, "fun size" template.

  • If you want to munge pattern variables for use in the template, with-syntax is your friend, because it lets you create new pattern variables.

  • Usually you’ll need to use syntax->datum to get the -interesting value inside.

  • format-id is convenient for formatting identifier +interesting value inside.

  • format-id is convenient for formatting identifier names.
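
The review points can be seen together in one small sketch. The macro describe-id is hypothetical, and it assumes its argument is an identifier; anything else would fail in symbol->string at compile time.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical macro, for illustration only.
(define-syntax (describe-id stx)
  (syntax-case stx ()
    [(_ a)
     ;; #'a is a "fun size" template; syntax->datum pulls out the symbol.
     ;; with-syntax then binds the new pattern variable `str`.
     (with-syntax ([str (datum->syntax
                         stx
                         (symbol->string (syntax->datum #'a)))])
       #'(list 'a str))]))

(describe-id foo)
```

The use at the end expands to (list 'foo "foo").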

4.2 Making our own struct

Let’s apply what we just learned to a more-realistic example. We’ll pretend that Racket doesn’t already have a struct capability. Fortunately, we can write a macro to provide our own @@ -240,9 +243,9 @@ represent structures as a ? appended.

  • For each field, an accessor procedure to get its value. These will be named struct-field (the name of the struct, a hyphen, and the -field name).

  • > (require (for-syntax racket/syntax))
    > (define-syntax (our-struct stx)
        (syntax-case stx ()
          [(_ id (fields ...))
           (with-syntax ([pred-id (format-id stx "~a?" #'id)])
             #`(begin
                 ; Define a constructor.
                 (define (id fields ...)
                   (apply vector (cons 'id  (list fields ...))))
                 ; Define a predicate.
                 (define (pred-id v)
                   (and (vector? v)
                        (eq? (vector-ref v 0) 'id)))
                 ; Define an accessor for each field.
                 #,@(for/list ([x (syntax->list #'(fields ...))]
                               [n (in-naturals 1)])
                      (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                    [ix n])
                        #`(define (acc-id v)
                            (unless (pred-id v)
                              (error 'acc-id "~a is not a ~a struct" v 'id))
                            (vector-ref v ix))))))]))
    ; Test it out
    > (require rackunit)
    > (our-struct foo (a b))
    > (define s (foo 1 2))
    > (check-true (foo? s))
    > (check-false (foo? 1))
    > (check-equal? (foo-a s) 1)
    > (check-equal? (foo-b s) 2)
    > (check-exn exn:fail?
                 (lambda () (foo-a "furble")))
    ; The tests passed.
    ; Next, what if someone tries to declare:
    > (our-struct "blah" ("blah" "blah"))

    format-id: contract violation

      expected: (or/c string? symbol? identifier? keyword? char?

    number?)

      given: #<syntax:78:0 "blah">

    The error message is not very helpful. It’s coming from -format-id, which is a private implementation detail of our macro.

    You may know that a syntax-case clause can take an -optional "guard" or "fender" expression. Instead of

    [pattern template]

    It can be:

    [pattern guard template]

    Let’s add a guard expression to our clause:

    > (require (for-syntax racket/syntax))
    > (define-syntax (our-struct stx)
        (syntax-case stx ()
          [(_ id (fields ...))
           ; Guard or "fender" expression:
           (for-each (lambda (x)
                       (unless (identifier? x)
                         (raise-syntax-error #f "not an identifier" stx x)))
                     (cons #'id (syntax->list #'(fields ...))))
           (with-syntax ([pred-id (format-id stx "~a?" #'id)])
             #`(begin
                 ; Define a constructor.
                 (define (id fields ...)
                   (apply vector (cons 'id  (list fields ...))))
                 ; Define a predicate.
                 (define (pred-id v)
                   (and (vector? v)
                        (eq? (vector-ref v 0) 'id)))
                 ; Define an accessor for each field.
                 #,@(for/list ([x (syntax->list #'(fields ...))]
                               [n (in-naturals 1)])
                      (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                    [ix n])
                        #`(define (acc-id v)
                            (unless (pred-id v)
                              (error 'acc-id "~a is not a ~a struct" v 'id))
                            (vector-ref v ix))))))]))
    ; Now the same misuse gives a better error message:
    > (our-struct "blah" ("blah" "blah"))

    eval:81:0: our-struct: not an identifier

      at: "blah"

      in: (our-struct "blah" ("blah" "blah"))

    Later, we’ll see how syntax-parse makes it even easier to +field name).

    > (require (for-syntax racket/syntax))
    > (define-syntax (our-struct stx)
        (syntax-case stx ()
          [(_ id (fields ...))
           (with-syntax ([pred-id (format-id stx "~a?" #'id)])
             #`(begin
                 ; Define a constructor.
                 (define (id fields ...)
                   (apply vector (cons 'id  (list fields ...))))
                 ; Define a predicate.
                 (define (pred-id v)
                   (and (vector? v)
                        (eq? (vector-ref v 0) 'id)))
                 ; Define an accessor for each field.
                 #,@(for/list ([x (syntax->list #'(fields ...))]
                               [n (in-naturals 1)])
                      (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                    [ix n])
                        #`(define (acc-id v)
                            (unless (pred-id v)
                              (error 'acc-id "~a is not a ~a struct" v 'id))
                            (vector-ref v ix))))))]))
    ; Test it out
    > (require rackunit)
    > (our-struct foo (a b))
    > (define s (foo 1 2))
    > (check-true (foo? s))
    > (check-false (foo? 1))
    > (check-equal? (foo-a s) 1)
    > (check-equal? (foo-b s) 2)
    > (check-exn exn:fail?
                 (lambda () (foo-a "furble")))
    ; The tests passed.
    ; Next, what if someone tries to declare:
    > (our-struct "blah" ("blah" "blah"))

    format-id: contract violation

      expected: (or/c string? symbol? identifier? keyword? char?

    number?)

      given: #<syntax:83:0 "blah">

    The error message is not very helpful. It’s coming from +format-id, which is a private implementation detail of our macro.

    You may know that a syntax-case clause can take an +optional "guard" or "fender" expression. Instead of

    [pattern template]

    It can be:

    [pattern guard template]

    Let’s add a guard expression to our clause:

    > (require (for-syntax racket/syntax))
    > (define-syntax (our-struct stx)
        (syntax-case stx ()
          [(_ id (fields ...))
           ; Guard or "fender" expression:
           (for-each (lambda (x)
                       (unless (identifier? x)
                         (raise-syntax-error #f "not an identifier" stx x)))
                     (cons #'id (syntax->list #'(fields ...))))
           (with-syntax ([pred-id (format-id stx "~a?" #'id)])
             #`(begin
                 ; Define a constructor.
                 (define (id fields ...)
                   (apply vector (cons 'id  (list fields ...))))
                 ; Define a predicate.
                 (define (pred-id v)
                   (and (vector? v)
                        (eq? (vector-ref v 0) 'id)))
                 ; Define an accessor for each field.
                 #,@(for/list ([x (syntax->list #'(fields ...))]
                               [n (in-naturals 1)])
                      (with-syntax ([acc-id (format-id stx "~a-~a" #'id x)]
                                    [ix n])
                        #`(define (acc-id v)
                            (unless (pred-id v)
                              (error 'acc-id "~a is not a ~a struct" v 'id))
                            (vector-ref v ix))))))]))
    ; Now the same misuse gives a better error message:
    > (our-struct "blah" ("blah" "blah"))

    eval:86:0: our-struct: not an identifier

      at: "blah"

      in: (our-struct "blah" ("blah" "blah"))

    Later, we’ll see how syntax-parse makes it even easier to check usage and provide helpful messages about mistakes.
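
For comparison, a guard can also be a plain boolean test: when it produces #f, the clause is skipped and the next clause is tried. The for-each guard above works differently, since it either raises an error or returns a non-#f value, so its clause always proceeds. A minimal sketch of the boolean style, using a hypothetical macro define-zero that is not part of the text:

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical macro, for illustration only.
(define-syntax (define-zero stx)
  (syntax-case stx ()
    [(_ id)
     (identifier? #'id)   ; guard: this clause matches only identifiers
     #'(define id 0)]
    [(_ other)            ; reached when the guard above returns #f
     (raise-syntax-error #f "not an identifier" stx #'other)]))

(define-zero z)
z
```

With this shape, (define-zero "blah") falls through to the second clause and reports the error there.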

    4.3 Using dot notation for nested hash lookups

    The previous two examples used a macro to define functions whose names were made by joining identifiers provided to the macro. This example does the opposite: The identifier given to the macro is split into @@ -251,7 +254,7 @@ represented in Racket by a jsexpr?. JSON often has dictionaries that contain other dictionaries. In a jsexpr? these are represented by nested hasheq tables:

    ; Nested `hasheq's typical of a jsexpr:
    > (define js (hasheq 'a (hasheq 'b (hasheq 'c "value"))))

    In JavaScript you can use dot notation:

    foo = js.a.b.c;

    In Racket it’s not so convenient:

    (hash-ref (hash-ref (hash-ref js 'a) 'b) 'c)

    We can write a helper function to make this a bit cleaner:

    ; This helper function:
    > (define/contract (hash-refs h ks [def #f])
        ((hash? (listof any/c)) (any/c) . ->* . any)
        (with-handlers ([exn:fail? (const (cond [(procedure? def) (def)]
                                                [else def]))])
          (for/fold ([h h])
            ([k (in-list ks)])
            (hash-ref h k))))
    ; Lets us say:
    > (hash-refs js '(a b c))

    "value"

    That’s better. Can we go even further and use a dot notation somewhat like JavaScript?

    ; This macro:
    > (require (for-syntax racket/syntax))
    > (define-syntax (hash.refs stx)
        (syntax-case stx ()
          ; If the optional `default' is missing, assume it's #f.
          [(_ chain)
           #'(hash.refs chain #f)]
          [(_ chain default)
           (let ([xs (map (lambda (x)
                            (datum->syntax stx (string->symbol x)))
                          (regexp-split #rx"\\."
                                        (symbol->string (syntax->datum #'chain))))])
             (with-syntax ([h (car xs)]
                           [ks (cdr xs)])
               #'(hash-refs h 'ks default)))]))
    ; Gives us "sugar" to say this:
    > (hash.refs js.a.b.c)

    "value"

    ; Try finding a key that doesn't exist:
    > (hash.refs js.blah)

    #f

    ; Try finding a key that doesn't exist, specifying the default:
    > (hash.refs js.blah 'did-not-exist)

    'did-not-exist

    It works!
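
To see the splitting step in isolation, the same regexp-split call can be run as an ordinary function at run time. The helper split-chain is hypothetical and only mirrors what the macro does at compile time:

```racket
#lang racket/base

;; Hypothetical run-time helper, for illustration only.
;; Splits a dotted symbol into a list of symbols.
(define (split-chain sym)
  (map string->symbol
       (regexp-split #rx"\\." (symbol->string sym))))

(split-chain 'js.a.b.c)   ; '(js a b c)
(split-chain 'js)         ; '(js)
```

In the macro, the first element becomes h and the rest become the quoted key list ks, so (hash.refs js.a.b.c) expands to (hash-refs js '(a b c) #f). A trailing dot, as in js., produces an empty string as the second piece, which string->symbol turns into the empty-named symbol.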

    We’ve started to appreciate that our macros should give helpful -messages when used in error. Let’s try to do that here.

    > (require (for-syntax racket/syntax))
    > (define-syntax (hash.refs stx)
        (syntax-case stx ()
          ; Check for no args at all
          [(_)
           (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                               stx #'chain)]
          [(_ chain)
           #'(hash.refs chain #f)]
          [(_ chain default)
           ; Check that chain is a symbol, not e.g. a number or string
           (unless (symbol? (syntax-e #'chain))
             (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                                 stx #'chain))
           (let ([xs (map (lambda (x)
                            (datum->syntax stx (string->symbol x)))
                          (regexp-split #rx"\\."
                                        (symbol->string (syntax->datum #'chain))))])
             ; Check that we have at least hash.key
             (unless (and (>= (length xs) 2)
                          (not (eq? (syntax-e (cadr xs)) '||)))
               (raise-syntax-error #f "Expected hash.key" stx #'chain))
             (with-syntax ([h (car xs)]
                           [ks (cdr xs)])
               #'(hash-refs h 'ks default)))]))
    ; See if we catch each of the misuses
    > (hash.refs)

    eval:91:0: hash.refs: Expected (hash.key0[.key1 ...]

    [default])

      at: chain

      in: (hash.refs)

    > (hash.refs 0)

    eval:93:0: hash.refs: Expected (hash.key0[.key1 ...]

    [default])

      at: 0

      in: (hash.refs 0 #f)

    > (hash.refs js)

    eval:94:0: hash.refs: Expected hash.key

      at: js

      in: (hash.refs js #f)

    > (hash.refs js.)

    eval:95:0: hash.refs: Expected hash.key

      at: js.

      in: (hash.refs js. #f)

    Not too bad. Of course, the version with error-checking is quite a bit +messages when used in error. Let’s try to do that here.

    > (require (for-syntax racket/syntax))
    > (define-syntax (hash.refs stx)
        (syntax-case stx ()
          ; Check for no args at all
          [(_)
           (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                               stx #'chain)]
          [(_ chain)
           #'(hash.refs chain #f)]
          [(_ chain default)
           ; Check that chain is a symbol, not e.g. a number or string
           (unless (symbol? (syntax-e #'chain))
             (raise-syntax-error #f "Expected (hash.key0[.key1 ...] [default])"
                                 stx #'chain))
           (let ([xs (map (lambda (x)
                            (datum->syntax stx (string->symbol x)))
                          (regexp-split #rx"\\."
                                        (symbol->string (syntax->datum #'chain))))])
             ; Check that we have at least hash.key
             (unless (and (>= (length xs) 2)
                          (not (eq? (syntax-e (cadr xs)) '||)))
               (raise-syntax-error #f "Expected hash.key" stx #'chain))
             (with-syntax ([h (car xs)]
                           [ks (cdr xs)])
               #'(hash-refs h 'ks default)))]))
    ; See if we catch each of the misuses
    > (hash.refs)

    eval:96:0: hash.refs: Expected (hash.key0[.key1 ...]

    [default])

      at: chain

      in: (hash.refs)

    > (hash.refs 0)

    eval:98:0: hash.refs: Expected (hash.key0[.key1 ...]

    [default])

      at: 0

      in: (hash.refs 0 #f)

    > (hash.refs js)

    eval:99:0: hash.refs: Expected hash.key

      at: js

      in: (hash.refs js #f)

    > (hash.refs js.)

    eval:100:0: hash.refs: Expected hash.key

      at: js.

      in: (hash.refs js. #f)

    Not too bad. Of course, the version with error-checking is quite a bit longer. Error-checking code generally tends to obscure the logic, and does here. Fortunately, we’ll soon see how syntax-parse can help mitigate that, in much the same way as contracts in normal @@ -295,7 +298,7 @@ user thinks they’re calling misuse, but is get message from string-append. In this simple example they could probably guess what’s happening, but in most cases they won’t.

    2. Write some error handling code.

    > (define (misuse s)
        (unless (string? s)
          (error 'misuse "expected a string, but got ~a" s))
        (string-append s " snazzy suffix"))
    ; User of the function:
    > (misuse 0)

    misuse: expected a string, but got 0

    ; I goofed, and understand why! It's a shame the writer of the
    ; function had to work so hard to tell me.

    Unfortunately the error code tends to overwhelm and/or obscure our function definition. Also, the error message is good but not -great. Improving it would require even more error code.

    3. Use a contract.

    > (define/contract (misuse s)
        (string? . -> . string?)
        (string-append s " snazzy suffix"))
    ; User of the function:
    > (misuse 0)

    misuse: contract violation

      expected: string?, given: 0

      in: the 1st argument of

          (-> string? string?)

      contract from: (function misuse)

      blaming: program

      at: eval:125.0

    ; I goofed, and understand why! I hear the writer of the function is
    ; happier.

    This is the best of both worlds.

    The contract is simple and concise. Even better, it’s +great. Improving it would require even more error code.

    3. Use a contract.

    > (define/contract (misuse s)
        (string? . -> . string?)
        (string-append s " snazzy suffix"))
    ; User of the function:
    > (misuse 0)

    misuse: contract violation

      expected: string?, given: 0

      in: the 1st argument of

          (-> string? string?)

      contract from: (function misuse)

      blaming: program

      at: eval:130.0

    ; I goofed, and understand why! I hear the writer of the function is
    ; happier.

    This is the best of both worlds.

    The contract is simple and concise. Even better, it’s declarative. We say what we want, without needing to spell out what to do.

    On the other hand the user of our function gets a very detailed error message. Plus, the message is in a standard, familiar format.

    4. Use Typed Racket.

    > (: misuse (String -> String))
    > (define (misuse s)
        (string-append s " snazzy suffix"))
    > (misuse 0)

    eval:3:0: Type Checker: Expected String, but got Zero

      in: (quote 0)

    With respect to error handling, Typed Racket has the same benefits as