docs: guide rough draft

This commit is contained in:
Stephen Chang 2016-09-23 17:00:59 -04:00
parent e68480f0e6
commit f10190c26a
2 changed files with 433 additions and 110 deletions

View File

@ -1,7 +1,429 @@
#lang scribble/manual
@(require (for-label racket/base
(except-in turnstile/turnstile ⊢)
))
(except-in turnstile/turnstile ⊢))
"doc-utils.rkt")
@title{Guide}
@title{Guide}
This guide explains how to implement a series of languages with common type
system features.
@section[#:tag "judgements"]{A New Type Judgement}
Turnstile's syntax borrows from conventional type judgement syntax, specifically
a new kind of judgement that interleaves program rewriting (i.e., macro
expansion) and type checking. These type judgements rely on two core
@cite{bidirectional} relations:
@itemlist[#:style 'ordered
@item{@verbatim|{Γ ⊢ e ≫ e- ⇒ τ}|
reads: "In context Γ, @racket[e] expands to @racket[e-] and has type
τ."}
@item{@verbatim|{Γ ⊢ e ≫ e- ⇐ τ}|
reads: "In context Γ, @racket[e] expands to @racket[e-] and must
have type τ."
The key difference is that τ is an output in the first relation, and
an input in the second relation.
These input and output positions conveniently correspond to
@tech{syntax patterns} and
@tech{syntax templates}, respectively, as explained below.}]
For example, here are some rules that check and rewrite simply-typed
lambda-calculus terms to the untyped lambda-calculus.
@verbatim|{
[x ≫ x- : τ] ∈ Γ
[VAR] -----------------
Γ ⊢ x ≫ x- ⇒ τ
Γ,[x ≫ x- : τ1] ⊢ e ≫ e- ⇒ τ2
[LAM] -------------------------------
Γ ⊢ λx:τ1.e ≫ λx-.e- ⇒ τ1 → τ2
Γ ⊢ e1 ≫ e1- ⇒ τ1 → τ2
Γ ⊢ e2 ≫ e2- ⇐ τ1
[APP] -------------------------
Γ ⊢ e1 e2 ≫ e1- e2- ⇒ τ2
}|
@section{@racket[define-typed-syntax]}
@subsection{Input and Output}
Implementing the above judgements with Turnstile might look like
(we extended the forms to define multi-arity functions):
@label-code["Initial function and application definitions"
@#reader scribble/comment-reader
(racketblock0
;; [LAM]
(define-typed-syntax (λ ([x:id : τ_in:type] ...) e) ≫
[[x ≫ x- : τ_in.norm] ... ⊢ e ≫ e- ⇒ τ_out]
-------
[⊢ (λ- (x- ...) e-) ⇒ (→ τ_in.norm ... τ_out)])
;; [APP]
(define-typed-syntax (#%app e_fn e_arg ...) ≫
[⊢ e_fn ≫ e_fn- ⇒ (~→ τ_in ... τ_out)]
[⊢ e_arg ≫ e_arg- ⇐ τ_in] ...
--------
[⊢ (#%app- e_fn- e_arg- ...) ⇒ τ_out])
)]
The above code assumes some existing bindings:
@itemlist[
@item{A @racket[→] type constructor;}
@item{a @racket[~→] @tech{pattern expander} associated with the type
constructor;}
@item{a @racket[type] @tech{syntax class} recognizing valid types;}
@item{and core Racket forms with a @litchar{-} suffix.}
]
The new language implementation must define or import the first three items,
but we defer explaining how until later, while the last item is provided by
Turnstile.
A @racket[define-typed-syntax] form resembles a conventional Racket
macro-definition form. The above definitions begin with an input pattern, where
the right most identifier is the name of the macro, followed by a @racket[≫]
literal, a series of premises, and finally a conclusion.
Compared to type judgements, Turnstile @racket[define-typed-syntax] definitions
differ in a few obvious ways:
@itemlist[
@item{Each premise and conclusion is bracketed.}
@item{A conclusion is "split" into its inputs (at the top) and outputs (at the
bottom) to resemble other Racket macro-definition forms, enabling the
definition to be more easily read as code, top-to-bottom.
For example, the @racket[λ] definition conclusion input is the initial input
pattern @racket[(λ ([x:id : τ_in:type] ...) e)] and the conclusion outputs are
the output templates @racket[(λ- (x- ...) e-)] and
@racket[(→ τ_in.norm ... τ_out)]. The initial @racket[≫] separates the input
pattern from the premises while the @racket[⇒] in the conclusion associates
the type with the output expression.}
@item{The rule implementations do not thread through an explicit @racket[Γ].
Rather, Turnstile reuses Racket's lexical scope to as the type environment and
thus only new type environment bindings should be specified (to the left of
the @racket[⊢]), just like a @racket[let].}
@item{Since type environments use lexical scope, an explicit implementation of
the @tt{[VAR]} rule is unneeeded.}]
The @racket[define-typed-syntax] is roughly an extension of
@racket[define-syntax-parser]:
@itemlist[
@item{A programmer may specify @racket[syntax-parse]
options, e.g., @racket[#:datum-literals];}
@item{a pattern position may use any @racket[syntax-parse] combinators, e.g.
@racket[~and], @racket[~seq], or custom-defined @tech{pattern expanders};}
@item{and the premises may be interleaved with @racket[syntax-parse]
@tech{pattern directives}, e.g., @racket[#:with] or @racket[#:when].}]
@subsection{Premises}
Like the type judgements, @racket[define-typed-syntax] supports two core
@cite{bidirectional}-style type checking clauses.
A @racket[[⊢ e ≫ e- ⇒ τ]] premise expands expression @racket[e], binds its
expanded form to @racket[e-] and its type to @racket[τ]. In other words,
@racket[e] is an input syntax template, and @racket[e-] and @racket[τ] are
output patterns.
Dually, one may write @racket[[⊢ e ≫ e- ⇐ τ]] to check that @racket[e] has type
@racket[τ]. Here, both @racket[e] and @racket[τ] are inputs (templates) and only
@racket[e-] is an output (pattern).
For example, the first @racket[#%app] premise above,
@racket[[⊢ e_fn ≫ e_fn- ⇒ (~→ τ_in ... τ_out)]], expands function @racket[e_fn],
binds it to pattern variable @racket[e_fn-], its input types to
@racket[(τ_in ...)], and its output type to @racket[τ_out]. Macro expansion
stops with a type error if @racket[e_fn] does not have a function type.
The second @racket[#%app] premise then uses the @racket[⇐] to check that the
given inputs have types that match the expected types. Again, a type error is
reported if the types do not match.
The @racket[λ] definition above also utilizes a @racket[⇒] premise, except it
must add bindings to the type environment, which are written to the left of the
@racket[⊢]. Observe how ellipses may be used in exactly the same manner as
other Racket macros. (The @racket[norm] is an attribute of the @racket[type]
syntax class and is bound to the expanded representation of the type. This is
used to avoid double-expansions of the types.)
@subsection{@racket[syntax-parse] directives}
A @racket[define-typed-syntax] definition may also use @racket[syntax-parse]
options and @tech{pattern directives}. Here is an @racket[#%app] that reports a
more precise error for an arity mismatch:
@label-code["Function application with a better error message"
@#reader scribble/comment-reader
(racketblock0
;; [APP]
(define-typed-syntax (#%app e_fn e_arg ...) ≫
[⊢ e_fn ≫ e_fn- ⇒ (~→ τ_in ... τ_out)]
#:fail-unless (stx-length=? #'[τ_in ...] #'[e_arg ...])
(format "arity mismatch, expected ~a args, given ~a"
(stx-length #'[τ_in ...]) #'[e_arg ...])
[⊢ e_arg ≫ e_arg- ⇐ τ_in] ...
--------
[⊢ (#%app- e_fn- e_arg- ...) ⇒ τ_out]))]
@subsection{Defining types}
We next extend our language with some type definitions:
@label-code["STLC, with type definitions"
@#reader scribble/comment-reader
(racketblock0
(define-syntax-category type)
(define-type-constructor → #:arity > 0)
;; [APP] and [LAM] as before
(define-typed-syntax (λ ([x:id : τ_in:type] ...) e) ≫
[[x ≫ x- : τ_in.norm] ... ⊢ e ≫ e- ⇒ τ_out]
-------
[⊢ (λ- (x- ...) e-) ⇒ (→ τ_in.norm ... τ_out)])
(define-typed-syntax (#%app e_fn e_arg ...) ≫
[⊢ e_fn ≫ e_fn- ⇒ (~→ τ_in ... τ_out)]
#:fail-unless (stx-length=? #'[τ_in ...] #'[e_arg ...])
(format "arity mismatch, expected ~a args, given ~a"
(stx-length #'[τ_in ...]) #'[e_arg ...])
[⊢ e_arg ≫ e_arg- ⇐ τ_in] ...
--------
[⊢ (#%app- e_fn- e_arg- ...) ⇒ τ_out]))]
The @racket[define-syntax-category] declaration defines various forms, like
@racket[define-base-type] and @racket[define-type-constructor], that make it
easy to define types. It also defines the aforementioned @racket[type]
@tech{syntax class}.
The @racket[define-type-constructor] declaration defines the @racket[→]
function type, which must have at least one argument, along with the
aforementioned @racket[~→] @tech{pattern expander} for that type, which makes it
easier to match on the type in @tech{syntax patterns}.\
@subsection{Defining @racket[⇐] rules}
The astute reader has probably noticed that the type judgements from the
@secref{judgements} section. Specifically, @\racket[⇐] clauses appear in the
premises but never in the conclusion.
To complete the judgements, one can imagine an implicit @racket[⇐] rule that
dispatches to the appropriate @racket[⇒] rule:
@verbatim|{
Γ ⊢ e ≫ e- ⇒ τ2
τ1 = τ2
[FLIP] -----------------
Γ ⊢ e ≫ e- ⇐ τ1
}|
Turnstile similarly defines an implicit @racket[⇐] rule so programmers need not
specify a @racket[⇐] variant of every rule. A programmer may still specify an
explicit @racket[⇐] rule if desired, however. This is especially useful for
implementing (local) type inference or type annotations. Here is an
extended lambda that additionally specifies a @racket[⇐] clause.
@label-code["lambda with inference, and ann"
@#reader scribble/comment-reader
(racketblock0
;; [LAM]
(define-typed-syntax λ #:datum-literals (:)
[(_ ([x:id : τ_in:type] ...) e) ≫
[[x ≫ x- : τ_in.norm] ... ⊢ e ≫ e- ⇒ τ_out]
-------
[⊢ (λ- (x- ...) e-) ⇒ (→ τ_in.norm ... τ_out)]]
[(_ (x:id ...) e) ⇐ (~→ τ_in ... τ_out) ≫
[[x ≫ x- : τ_in] ... ⊢ e ≫ e- ⇐ τ_out]
---------
[⊢ (λ- (x- ...) e-)]])
(define-typed-syntax ann #:datum-literals (:)
[(_ e : τ:type) ≫
[⊢ e ≫ e- ⇐ τ.norm]
--------
[⊢ e- ⇒ τ.norm]]))]
This new lambda definition uses an alternate, multi-clause
@racket[define-typed-syntax] syntax, analogous to @racket[syntax-parse] (whereas
the simpler form above is analogous to @racket[define-simple-macro]).
The first clause is the same as before. The second clause has an additional
input type pattern and the clause matches only if the both patterns match,
indicating that the type of the expression can be inferred. Observe that the
second lambda clause expression has no type annotations. Since lambda body's
type is already known, the premise in the second clause also uses the
@racket[⇐] arrow. Finally, the conclusion specifies only the expanded
expression. The input type is automatically attached to the output expression.
We also define an annotation form @racket[ann], which invokes the @racket[⇐]
clause of a type rule.
@subsection{Defining primops}
Our typed language is coming along nicely, but we cannot write any well-typed
progams since there are no base types. Let's fix that:
@label-code["defining a base type, literal, and primop"
@#reader scribble/comment-reader
(racketblock0
(define-base-type Int)
(define-primop + : (→ Int Int Int))
(define-typed-syntax #%datum
[(_ . n:integer) ≫
--------
[⊢ (#%datum- . n) ⇒ Int]]
[(_ . x) ≫
--------
[_ #:error (type-error #:src #'x #:msg "Unsupported literal: ~v" #'x)]]))]
The code above defines a base type @racket[Int], and attaches type information
to @racket[+] and integer literals.
@racket[define-primop] creates an identifier macro that attaches the specified
type to the specified identifier. As with any Racket macro, the base @racket[+]
here is whatever the implementing language is. By default is is Racket.
@subsection{A Complete language}
We may now write well-typed programs! For completeness, here is the complete
language implementation:
@; how to typeset #lang turnstile?
@label-code["A complete STLC implementation, created with #lang turnstile"
@#reader scribble/comment-reader
(racketblock0
(define-syntax-category type)
(define-type-constructor → #:arity > 0)
(define-base-type Int)
(define-primop + : (→ Int Int Int))
;; [APP]
(define-typed-syntax (#%app e_fn e_arg ...) ≫
[⊢ e_fn ≫ e_fn- ⇒ (~→ τ_in ... τ_out)]
#:fail-unless (stx-length=? #'[τ_in ...] #'[e_arg ...])
(format "arity mismatch, expected ~a args, given ~a"
(stx-length #'[τ_in ...]) #'[e_arg ...])
[⊢ e_arg ≫ e_arg- ⇐ τ_in] ...
--------
[⊢ (#%app- e_fn- e_arg- ...) ⇒ τ_out])
;; [LAM]
(define-typed-syntax λ #:datum-literals (:)
[(_ ([x:id : τ_in:type] ...) e) ≫
[[x ≫ x- : τ_in.norm] ... ⊢ e ≫ e- ⇒ τ_out]
-------
[⊢ (λ- (x- ...) e-) ⇒ (→ τ_in.norm ... τ_out)]]
[(_ (x:id ...) e) ⇐ (~→ τ_in ... τ_out) ≫
[[x ≫ x- : τ_in] ... ⊢ e ≫ e- ⇐ τ_out]
---------
[⊢ (λ- (x- ...) e-)]])
;; [ANN]
(define-typed-syntax ann #:datum-literals (:)
[(_ e : τ:type) ≫
[⊢ e ≫ e- ⇐ τ.norm]
--------
[⊢ e- ⇒ τ.norm]])
;; [DATUM]
(define-typed-syntax #%datum
[(_ . n:integer) ≫
--------
[⊢ (#%datum- . n) ⇒ Int]]
[(_ . x) ≫
--------
[_ #:error (type-error #:src #'x #:msg "Unsupported literal: ~v" #'x)]]))]
@subsection{Extending a language}
Imagine our language from the previous section is named @tt{STLC}. Since it
consists of just a series of macros, like any other Racket #lang, its forms may
be imported and exported and may be easily reused in other languages. Let's see
how we can reuse the above implementation to implement a subtyping language.
@label-code["A language with subtyping that reuses STLC"
@(racketblock0 #:escape esc
(extends STLC #:except #%datum +)
(define-base-types Top Num Nat)
(define-primop + : (→ Num Num Num))
(define-typed-syntax #%datum
[(_ . n:nat) ≫
--------
[⊢ (#%datum- . n) ⇒ Nat]]
[(_ . n:integer) ≫
--------
[⊢ (#%datum- . n) ⇒ Int]]
[(_ . n:number) ≫
--------
[⊢ (#%datum- . n) ⇒ Num]]
[(_ . x) ≫
--------
[≻ (STLC:#%datum . x)]])
(begin-for-syntax
(define (sub? t1 t2)
(code:comment "need this because recursive calls made with unexpanded types")
(define τ1 ((current-type-eval) t1))
(define τ2 ((current-type-eval) t2))
(or ((current-type=?) τ1 τ2)
(Top? τ2)
(syntax-parse (list τ1 τ2)
[(_ ~Num) ((current-sub?) τ1 #'Int)]
[(_ ~Int) ((current-sub?) τ1 #'Nat)]
[((~→ τi1 ... τo1) (~→ τi2 ... τo2))
(and (subs? #'(τi2 ...) #'(τi1 ...))
((current-sub?) #'τo1 #'τo2))]
[_ #f])))
(define current-sub? (make-parameter sub?))
(current-typecheck-relation sub?)
(define (subs? τs1 τs2)
(and (stx-length=? τs1 τs2)
(stx-andmap (current-sub?) τs1 τs2)))
(define (join t1 t2)
(cond
[((current-sub?) t1 t2) t2]
[((current-sub?) t2 t1) t1]
[else #'Top]))
(define current-join (make-parameter join)))
@code:comment{[IF]}
(define-typed-syntax if
[(_ e_tst e1 e2) ≫
[⊢ e_tst ≫ e_tst- ⇒ _] ; a non-false value is truthy.
[⊢ e1 ≫ e1- ⇒ τ1]
[⊢ e2 ≫ e2- ⇒ τ2]
--------
[⊢ (if- e_tst- e1- e2-) ⇒ @#,((current-join) #'τ1 #'τ2)]]))]
This language uses subtyping instead of type equality as its "typecheck
relation". Specifically, the language defines a @racket[sub?] subtyping relation
and sets it as the @racket[current-typecheck-relation]. Thus it is able to reuse
the @racket[λ] and @racket[#%app] rules from the previous sections without
modification. Accordingly, the @racket[extends] require form imports the
previous rules and re-exports them.
It does not reuse @racket[#%datum] and @racket[+], however, since the the
subtyping language updates these forms. Specifically, the new language defines a
hierarchy of numeric base types, gives @racket[+] a more general type, and
redefines @racket[#%datum] to assign more precise types to numeric literals.
Observe that @racket[#%datum] dispatches to @tt{STLC}'s datum in the "else"
clause, using the @racket[≻] conclusion form, which dispatches to another
(typed) macro. In this manner, the new typed language may still define and invoke
macros like any other Racket program.
Since the new language uses subtyping, it also defines a @racket[join]
function, which is needed by conditional forms like @racket[if]. The
@racket[if] definition actually uses the @racket[current-join] parameter, to
make it reusable by other languages. Observe that the output type in the
@racket[if] definition uses @racket[unquote]. In general, all @tech{syntax
template} positions in Turnstile as @racket[quasiquote]s.

View File

@ -10,113 +10,14 @@
@section{Introduction}
Turnstile aims to help Racket programmers create typed languages. It does so by
introducing extensions of Racket's macro-definition forms that facilitate
implementation of type rules alongside macro code. As a result, the macros that
implement the language directly type check the program when they are expanded,
obviating the need to create and call out to a separate type checker. Thus, the
complete language implementation remains a series of macro definitions and may
be imported and exported in the standard way that Racket programmers are
accustomed to.
@subsection{Extending Type Judgements}
Turnstile's syntax borrows from conventional type judgements, extended to
interleave program rewriting (i.e., macro expansion) and type checking. These
type judgements rely on two core @cite{bidirectional} relations:
@itemlist[#:style 'ordered
@item{@verbatim|{Γ ⊢ e ≫ e- ⇒ τ}|
reads: "In context Γ, @racket[e] expands to @racket[e-] and has type
τ."}
@item{@verbatim|{Γ ⊢ e ≫ e- ⇐ τ}|
reads: "In context Γ, @racket[e] expands to @racket[e-] and must
have type τ."
The key difference is that τ is an output position in the first
relation, and an input position in the second relation.
These conveniently correspond to syntax pattern-match and
template positions, respectively, as explained below.}]
For example, here are rules checking and rewriting simply-typed
lambda-calculus terms to the untyped lambda-calculus.
@verbatim|{
[x ≫ x- ⇒ τ] ∈ Γ
[VAR] ---------------
Γ ⊢ x ≫ x- ⇒ τ
Γ,[x ≫ x- ⇒ τ1] ⊢ e ≫ e- ⇒ τ2
[LAM] -------------------------------
Γ ⊢ λx:τ1.e ≫ λx-.e- ⇒ τ1 → τ2
Γ ⊢ e1 ≫ e1- ⇒ τ1 → τ2
Γ ⊢ e2 ≫ e2- ⇐ τ1
[APP] -------------------------
Γ ⊢ e1 e2 ≫ e1- e2- ⇒ τ2
}|
@subsection{@racket[define-typed-syntax]}
If the above type judgements were extended to multi-arity functions, the
Turnstile code might be:
@#reader scribble/comment-reader
(racketblock
;; [LAM]
(define-typed-syntax λ #:datum-literals (:)
[(_ ([x:id : τ_in:type] ...) e) ≫
[[x ≫ x- : τ_in.norm] ... ⊢ e ≫ e- ⇒ τ_out]
-------
[⊢ (λ- (x- ...) e-) ⇒ (→ τ_in.norm ... τ_out)]])
;; [APP]
(define-typed-syntax #%app
[(_ e_fn e_arg ...) ≫
[⊢ e_fn ≫ e_fn- ⇒ (~→ τ_in ... τ_out)]
[⊢ e_arg ≫ e_arg- ⇐ τ_in] ...
--------
[⊢ (#%app- e_fn- e_arg- ...) ⇒ τ_out]])
)
The above code assumes some existing bindings:
@itemlist[
@item{A @racket[→] type constructor;}
@item{a @racket[~→] @tech{pattern expander} associated with the type
constructor;}
@item{a @racket[type] @tech{syntax class} recognizing valid types;}
@item{and core Racket forms with a @litchar{-} suffix.}
]
The new language implementation must define or import the first three items
while the last item is provided by Turnstile.
A @racket[define-typed-syntax] form resembles a conventional Racket
macro-definition form and consists of a series of clauses that begin with a
syntax pattern. Compared to their analogous type judgements, Turnstile
@racket[define-typed-syntax] definitions differ in a few obvious ways:
@itemlist[
@item{Each premise and conclusion is bracketed.}
@item{A conclusion is "split" into its inputs (at the top) and outputs (at the
bottom) to resemble other Racket macro-definition forms, enabling the
definition to be read top-to-bottom. For example, the @racket[λ] definition
splits its conclusion into its input pattern
@racket[(_ ([x:id : τ_in:type] ...) e)] and output syntax templates
@racket[(λ- (x- ...) e-)] and @racket[(→ τ_in.norm ... τ_out)]. The initial
@racket[≫] separates the input pattern from the premises while the @racket[⇒]
in the conclusion associates the type with the output expression.}
@item{The rule implementations do not thread through an explicit @racket[Γ].
Rather, Turnstile reuses Racket's lexical scope to handle the type environment
and thus the rules need only specify new bindings, to the right of the
@racket[⊢], to add to the type environment, just like a @racket[let].}]
The @racket[define-typed-syntax] is roughly an extension of
@racket[define-syntax-parser]:
@itemlist[
@item{A programmer may specify @racket[syntax-parse]
options, e.g., @racket[#:datum-literals];}
@item{a pattern position may use any @racket[syntax-parse] combinators, e.g.
@racket[~and], @racket[~seq], or custom-defined @tech{pattern expanders};}
@item{and the premises may be interleaved with @racket[syntax-parse]
@tech{pattern directives}, e.g., @racket[#:with] or @racket[#:when].}]
Turnstile aims to help Racket programmers create typed languages. It does so
with extensions of Racket's macro-definition forms that facilitate
implementation of type rules along with normal macro code. As a result, the
macros implementing a new language directly type check the program during
expansion, obviating the need to create and call out to a separate type checker.
Thus, a complete typed language implementation remains a series of macro
definitions and may be imported and exported in the standard way that Racket
programmers are accustomed to.
@include-section{guide.scrbl}
@include-section{reference.scrbl}