[icfp] new intro draft
This commit is contained in:
parent
acaaa91349
commit
64c83f3178
|
@ -1,4 +1,12 @@
|
|||
#lang scribble/sigplan
|
||||
#lang scribble/sigplan @onecolumn
|
||||
|
||||
@; TODO need a word for these 'observable properties'
|
||||
@; - regexp groups
|
||||
@; - format characters
|
||||
@; - procedure arity
|
||||
@; - tuple size
|
||||
@; - vector length
|
||||
@; - database schema
|
||||
|
||||
@; Tuples in Haskell http://hackage.haskell.org/package/tuple
|
||||
@; Regexp
|
||||
|
@ -31,109 +39,78 @@ That is, once some basic structure of the input is fixed, the general
|
|||
problem becomes a much simpler, specific problem.
|
||||
If ever a 16-argument function needs currying, we can write a new function for
|
||||
the task.
|
||||
@; This pearl explains how ...
|
||||
By creating the specific @racket[curry] or @racket[first] for each size of
|
||||
function or tuple value in a program, we
|
||||
|
||||
In general, we consider functions like @racket[curry] and @racket[first] for
|
||||
arbitrarily-sized
|
||||
Regular expression patterns are another common value whose structure is not
|
||||
expressed by conventional type systems.
|
||||
Consider the function @racket[regexp-match] from Typed Racket:
|
||||
|
||||
@; Introduce name? Talk about indexed families?
|
||||
@interaction[
|
||||
(regexp-match #rx"types" "types@ccs.neu.edu")
|
||||
|
||||
(regexp-match #rx"lambda" "types@ccs.neu.edu")
|
||||
|
||||
|
||||
@;These functions behave in a straightforward manner, but depend on
|
||||
@; characteristics of their input that most type systems do not express generically.
|
||||
|
||||
|
||||
|
||||
|
||||
Many workarounds = polydots, pi-types, printf-types, ...
|
||||
|
||||
|
||||
@; Suppose @racket[regexp-match] is a function for matching a
|
||||
@; regular expression pattern against a string value.
|
||||
@; If the match succeeds then the function returns a list containing the
|
||||
@; part of the string that was matched and all matched groups
|
||||
@; corresponding to parenthesized sub-patterns of the regular expression.
|
||||
@; Otherwise, it returns a value indicating the match failed.
|
||||
@; @; TODO Specify the type system?
|
||||
@; Suppose also that @racket[pattern] and @racket[group] are variables
|
||||
@; bound to string values at run-time.
|
||||
@; Given these premises, what is the type of this expression?
|
||||
@;
|
||||
@; @racketblock[
|
||||
@; (regexp-match pattern str)
|
||||
@; ]
|
||||
@;
|
||||
@; One simple answer is @racket[(Option (Listof String))].
|
||||
@; If all goes well the resulting value will be a list of strings (of unspecified
|
||||
@; length), but we know ahead of time that the match may fail.
|
||||
@;
|
||||
@; In a programming language with the full power of dependent types, we can encode
|
||||
@; the relationship between @racket[pattern] and @racket[str] and give a very
|
||||
@; specific type relating the number of groups in @racket[pattern] to the number
|
||||
@; of elements in the result list.
|
||||
@; Dependent types, however, typically require strategic use of type annotations
|
||||
@; and carefully-selected data representations.
|
||||
@; One relatively simple type for the expression above is
|
||||
@; @racket[(Option (List[N] String))] where @racket[N] is the number of groups
|
||||
@; in @racket[pattern], but this requires the type @racket[List] to carry length
|
||||
@; information and a type for regular expressions that counts groups.
|
||||
@; Building the infrastructure necessary to express that particular type of
|
||||
@; the call to @racket[regexp-match] may give more trouble than relief.
|
||||
@;
|
||||
@; This paper explores a solution between simple and dependent types that
|
||||
@; should apply in any typed language with sufficiently powerful
|
||||
@; @emph{syntax extensions}, or macros.
|
||||
@; The key idea is to refine types based on values apparent in the program text.
|
||||
@; When the parameters to @racket[regexp-match] are bound at run-time, we
|
||||
@; conservatively assign the type @racket[(Option (Listof String))].
|
||||
@; But if there is more information at hand, we use it.
|
||||
@; For instance, both these expressions have the type
|
||||
@; @racket[(Option (List String String))] in our system.@note{Where @racket[List]
|
||||
@; is Typed Racket's type for heterogenous, sized lists; akin to tuple types in ML.}
|
||||
@;
|
||||
@; @racketblock[
|
||||
@; (regexp-match "hello(o*)" str)
|
||||
@;
|
||||
@; (let ([pattern "hello(o*)"])
|
||||
@; (regexp-match pattern str))
|
||||
@; ]
|
||||
@;
|
||||
@; The pattern here has exactly one group, delimited by the parentheses.
|
||||
@; Thus if the match succeeds we get a list of two elements: the first being
|
||||
@; the entire matched substring and the second a string matching the regular
|
||||
@; expression @racket{o*}.
|
||||
@; All this is clear from the program text to the programmer; our contribution
|
||||
@; is parsing the relevant information and forwarding it to the type checker.
|
||||
@;
|
||||
@; @;Additionally replacing @racket[str] with a value gives an even more precise type by
|
||||
@; @; evaluating the expression while type checking.
|
||||
|
||||
|
||||
@section{Prior & Current Work}
|
||||
|
||||
A macro system provides a general framework for transforming and analyzing
|
||||
program syntax.
|
||||
Languages with strong macro systems are surprisingly expressive. @;@~cite[f-scp-1991].
|
||||
Case in point, Herman and Meunier demonstrated how macros can propogate
|
||||
information embedded in string values and syntax patterns to a
|
||||
static analyzer@~cite[hm-icfp-2004].
|
||||
Their illustrative examples were format strings, regular expression patterns,
|
||||
and database queries.
|
||||
We adapt these examples to a typed programming language
|
||||
and give additional examples inspired by the literature on dependent types.
|
||||
By inserting type annotations and boolean guards our macros indirectly
|
||||
facilitate type-checking.
|
||||
Quite often---but not always---the inferred types can remove the need for
|
||||
run-time assertions.
|
||||
|
||||
Contents:
|
||||
@itemlist[
|
||||
@item{A language-agnostic framework for value-dependent macros (Section 2).}
|
||||
@item{Diverse motivating examples, from a type-safe @racket[printf] to a basic typeclass system (Section 3).}
|
||||
@item{Description and evaluation of a Typed Racket implementation (Section 4).}
|
||||
@item{Correctness requirements for transformations (Section 5).}
|
||||
@item{Related work and conclusion (Section 6).}
|
||||
(regexp-match #rx"(.*)@(.*)" "types@ccs.neu.edu")
|
||||
]
|
||||
@; TODO note that nested groups cannot fail.
|
||||
|
||||
When called with a pattern @racket[p] and string @racket[s] to match with,
|
||||
@racket[(regexp-match p s)] returns a list of all substrings in @racket[s]
|
||||
matching groups in the pattern @racket[p].
|
||||
The pattern itself counts for one group and parentheses within @racket[p]
|
||||
delimit nested groups.
|
||||
Hence the third example returns a list of three elements.
|
||||
A match can also fail, in which case the value @racket[#f] is returned.
|
||||
|
||||
Although Typed Racket's @racket[regexp-match] has the type@note{Assuming that a
|
||||
successful match implies that all nested groups successfully match.}
|
||||
@racket[(Regexp String -> (U #f (Listof String)))], the number of strings
|
||||
in the result list is determined by the number of groups in the input regular
|
||||
expression.
|
||||
Like the arity of a function or the size of a tuple, the number of groups
|
||||
in a pattern is often clear to the programmer.
|
||||
We could imagine using
|
||||
an indexed family of @racket[regexp-match] functions for patterns of
|
||||
two, three, or more groups.
|
||||
Ideally though, the programming language should understand
|
||||
these unconventional or domain-specific flavors of polymorphism.
|
||||
|
||||
This pearl describes a technique for extending a simple type system with
|
||||
a value-indexed form of polymorphism.
|
||||
By analyzing the syntactic structure of values and partially evaluating
|
||||
constant expressions before typechecking, we specialize the types of functions
|
||||
like @racket[curry], @racket[first], and @racket[regexp-match] at their
|
||||
call-sites when possible.
|
||||
Whereas the general type of @racket[curry] is @racket[(⊥ -> ⊤)],
|
||||
our system infers when it is safe to use a subtype instead.
|
||||
For instance:
|
||||
|
||||
@racketblock[
|
||||
(curry (λ (x y) x))
|
||||
]
|
||||
|
||||
generates the type constraint
|
||||
|
||||
@racketblock[
|
||||
curry : ((A B -> A) -> (A -> B -> A))
|
||||
]
|
||||
|
||||
The technique does not require any changes to the underlying type system
|
||||
or annotations from the programmer.
|
||||
Instead, we leverage existing tools for writing syntax extensions.
|
||||
Our implementation happens to use Racket's macro system, but (at least)
|
||||
Clojure,
|
||||
Haskell,
|
||||
JavaScript,
|
||||
OCaml,
|
||||
Rust,
|
||||
and Scala
|
||||
are equally capable.
|
||||
|
||||
@section{Coming Attractions}
|
||||
|
||||
Section 2 describes our approach,
|
||||
Section 3 gives applications.
|
||||
Section 4 presents the code
|
||||
and Section 5 reports on practical experience.
|
||||
We conclude with related work and reflections.
|
||||
|
||||
|
|
Loading…
Reference in New Issue
Block a user