[icfp] textualist intro

This commit is contained in:
ben 2016-03-15 04:56:07 -04:00
parent 95944b1ed0
commit f29ca6c309
4 changed files with 127 additions and 70 deletions

View File

@ -2,9 +2,15 @@ trivial @ icfp 2016, hopefully
=== ===
TODO TODO
- tighten abstract? "United States v. Hayes, 555 U.S. 415 (2009)")
> (regexp-match #rx"(a*)b" "aab")
'("aab" "aa")
> (regexp-match #rx"(a*)b" "ccc")
#false
- -
- -
Like Shakespeare's Portia, we understand that the phrase ``pound of flesh''
says nothing about drawing blood and specialize accordingly.
--- ---

View File

@ -186,6 +186,20 @@
#:title @elem{java.lang.reflect.Proxy} #:title @elem{java.lang.reflect.Proxy}
#:url "http://download.oracle.com/javase/6/docs/api/java/lang/reflect/Proxy.html")) #:url "http://download.oracle.com/javase/6/docs/api/java/lang/reflect/Proxy.html"))
(define clojure-macros
(make-bib
#:date 2016
#:author "Clojure 1.8.0"
#:title "Macros"
#:url "http://clojure.org/reference/macros"))
(define rust-compiler-plugins
(make-bib
#:date 2016
#:author "Rust 1.7"
#:title "Compiler Plugins"
#:url "https://doc.rust-lang.org/book/compiler-plugins.html"))
(define gal-firefox (define gal-firefox
(make-bib (make-bib
#:date 2010 #:date 2010
@ -1168,3 +1182,24 @@
#:location (proceedings-location gpce) #:location (proceedings-location gpce)
#:date 2010)) #:date 2010))
(define kkt-pldi-2016
(make-bib
#:title "Occurrence Typing Modulo Theories"
#:author (authors "Andrew Kent" "David Kempe" "Sam Tobin-Hochstadt")
#:location (proceedings-location pldi)
#:date 2016))
(define fcdb-jfp-2012
(make-bib
#:title "Macros that Work Together: Compile-time bindings, partial expansion, and definition contexts"
#:author (authors "Matthew Flatt" "Ryan Culpepper" "David Darais" "Robert Bruce Findler")
#:location (proceedings-location jfp #:pages '(181 216))
#:date 2012))
(define ramho-hosc-2013
(make-bib
#:title "Scala-Virtualized: Linguistic Reuse for Deep Embeddings"
#:author (authors "Tiark Rompf" "Nada Amin" "Adriaan Moors" "Philipp Haller" "Martin Odersky")
#:location (journal-location hosc #:volume 25 #:number 1 #:pages '(165 207))
#:date 2012))

View File

@ -19,98 +19,114 @@ Well-typed programs @emph{do} go wrong.
All the time, in fact: All the time, in fact:
@codeblock{ @codeblock{
Prelude> [0,1,2] !! 3 > (vector-ref (build-vector 2 (λ (i) i)) 3)
*** Exception: Prelude.!!: index too large ; vector-ref: index is out of range
Prelude> 1 `div` 0 > (/ 1 0)
*** Exception: divide by zero ; /: division by zero
Prelude> import Text.Printf > (printf "~s")
Prelude Text.Printf> printf "%d" ; printf: format string requires 1 arguments, given 0
*** Exception: printf: argument list ended prematurely
} }
Of course, Milner's catchphrase was about preventing type errors. Of course, Milner's catchphrase was about preventing type errors.
The above are all @emph{value errors} that depend on properties not expressed The above are all @emph{value errors} that depend on properties not expressed
by Haskell's standard list, integer, and string datatypes. by Typed Racket's standard vector, integer, and string datatypes.
Even so, it is obvious to the programmer that the expressions will go wrong Even so, it is clear to the programmer that the expressions will go wrong.
and there have been many proposals for detecting these and other value
errors@~cite[a-icfp-1999 ddems-icse-2011 ew-haskell-2012 lb-sigplan-2014].
What stands between these proposals and their adoption is the complexity or
annotation burden they impose on language users.
Likewise, there are useful functions that many type systems cannot express. Likewise, there are useful functions that many type systems cannot express.
Simple examples include a @racket[first] function for tuples of arbitrary size Simple examples include a @racket[first] function for tuples of arbitrary size
and a @racket[curry] function for procedures that consume such tuples. and a @racket[curry] function for procedures that consume such tuples.
The standard work-around@~cite[fi-jfp-2000] is to write size-indexed families of functions to handle The standard work-around@~cite[fi-jfp-2000] is to maintain size-indexed
the common cases, for instance: families of functions to handle the common cases, for instance:
@codeblock{ @codeblock{
Prelude> let curry_3 f = \ x y z -> f (x,y,z) > (define (curry_3 f)
(λ (x) (λ (y) (λ (z) (f x y z)))))
} }
@; Prelude> let first_3 (x, y, z) = x @; Prelude> let first_3 (x, y, z) = x
This pearl describes a technique for statically detecting value errors and These problems are well known, and are often used to motive research on
statically generalizing value-indexed functions. dependently typed programming languages@~cite[a-icfp-1999].
We catch all the above-mentioned wrong programs and offer a single implementation Short of abandoning ship for a completely new type system, languages including
of @racket[curry] that obviates the need to copy/paste and manage size-indexed Haskell, OCaml, Java, and Typed Racket have seen proposals for detecting
versions. certain values errors or expressing the polymorphism in functions
Furthermore, we demonstrate applications to regular expression matching, such as @racket[curry] with few (if any) changes to the existing type system@~cite[ddems-icse-2011 ks-plpv-2006 kkt-pldi-2016 lb-sigplan-2014].
vectorized operations, and querying a database. What stands between these proposals and their adoption is the complexity or
annotation burden they impose on language users.
The key to our success--and also our weakness---is that we specialize This pearl describes a low-complexity, annotation-free (@Secref{sec:experience})
procedure call sites based on compile-time constant values. technique for detecting
Run-time input foils the technique, but nonetheless we have found the idea useful value errors and expressing polymorphism over values.
for many common programming tasks. The key is to run a @emph{textualist}@note{A textualist interprets laws by
Moreover, the approach may be implemented as a library and used as a drop-in reading exactly the words on the page, not by guessing the words' intended meaning.}
solution for existing code. elaboration over programs before type-checking and propagate value information
Simply importing the library overrides standard procedures with specialized ones. evident from the program syntax to the type checker.
No further annotations are necessary; if specialization fails we default to In terms of the first example in this section, our elaborator infers that the
the program's original behavior. @racket[vector-ref] at position @racket[3] is out of bounds because it knows that
Put another way, our technique interprets the @emph{letter} @racket[(build-vector n f)]
of programs before the type system conducts its coarser, type-of-values analysis. constructs a vector with @racket[n] elements.
Like Shakespeare's Portia, we understand that the phrase ``pound of flesh''
says nothing about drawing blood and specialize accordingly.
@; Like XXX laborers who did work-to-rule ...
Our implementation happens to be for Typed Racket, but Our implementation is a Typed Racket library that shadows functions
Typed Clojure, such as @racket[build-vector] with textualist elaborators following the
Haskell, guidelines stated in @Secref{sec:solution}.
OCaml, We make essential use of Racket's powerful macro system@~cite[fcdb-jfp-2012]
Rust, to reason locally, associate inferred data with bound identifiers, and
and Scala cooperate with the rules of lexical scope.
would have been equally suitable hosts. For the adventurous reader, @Secref{sec:implementation} describes the
The main requirement is that the language provides a means of altering the syntax crucial services provided by Racket's macros.
of a program before type checking. Nevertheless, Typed Clojure@~cite[clojure-macros],
Such tools are more formally known as @emph{macro} or @emph{syntax extension} Rust@~cite[rust-compiler-plugins], and Scala@~cite[ramho-hosc-2013]
systems. could implement our approach just as well.
At any rate, we sketch implementations for the five languages
listed above in the conclusion.
Until that time when we must part, this pearl first describes our general For a sense of the practical use-cases we envision, consider the function
approach in @Secref{sec:solution} and then illustrates the approach with @racket[regexp-match], which matches a regular expression pattern against
specific examples in @Secref{sec:usage}. a string and returns either a list of matched substrings or @racket[#false]
We briefly report on practical experiences with our library if the match failed.
in @Secref{sec:experience}.
Adventurous readers may enjoy learning about implementation details
in @Secref{sec:implementation}, but everyone else is invited to skip to the
end and try implementing a letter-of-values analysis in their language of choice.
@racketblock[
> (regexp-match #rx"(.*) v\\. (.*),"
"Chisom v. Roemer, 501 U.S. 380")
'("Chisom v. Roemer," "Chisom" "Roemer")
]
The parentheses in the regular expression delimit groups to match and
return.
In this example, there are two groups.
We have written an elaborator for @racket[regexp-match] that will statically
parse its first argument, count these groups, and refine the
result type for specific calls to @racket[regexp-match].
The elaborator @emph{also} handles the common case where the regular expression
argument is a compile-time constant and respects @exact{$\alpha$}-equivalence.
@codeblock{
(define rx-case #rx"(.*) v\\. (.*),")
; Other code here
(define rxm regexp-match)
(define (get-plaintiff (s : String)) : String
(cond
[(rxm rx-case s)
=> cadr]
[else "J. Doe"]))
}
@Secref{sec:usage} has more examples.
@Secref{sec:conclusion} concludes.
@; ============================================================================= @; =============================================================================
@parag{Lineage} @parag{Lineage}
Herman and Meunier demonstrated how Racket macros can propagate Herman and Meunier demonstrated how Racket macros can propagate
information embedded in string values and syntax patterns to a data embedded in string values and syntax patterns to a
static analyzer@~cite[hm-icfp-2004]. static analyzer@~cite[hm-icfp-2004].
Their illustrative examples were format strings, regular expressions, Their illustrative examples were format strings, regular expressions,
and database queries. and database queries.
Relative to their pioneering work, our contribution is adapting Herman & Meunier's Relative to their pioneering work, we adapt Herman & Meunier's
transformations to a typed programming language. transformations to a typed language by inserting type annotations and boolean
By inserting type annotations and boolean guards, our transformations indirectly guards into the programs.
cooperate with the type checker without significantly changing the program's Our treatment of @racket[define] and @racket[let] forms is also new.
syntax.
We also give a modern description of Racket's macro system and handle definitions
as well as in-line constants.
@parag{Eager Evaluation} @parag{Eager Evaluation}

View File

@ -22,9 +22,9 @@
This pearl presents an elaboration-based technique for refining the This pearl presents an elaboration-based technique for refining the
analysis of an existing type system on existing code analysis of an existing type system on existing code
@emph{from outside} the type system. @emph{from outside} the type system.
We have implemented the technique as a Typed Racket library; We have implemented the technique as a Typed Racket library.
from the programmers' perspective, simply importing the library makes the type From the programmers' perspective, simply importing the library makes the type
checker more capable---no annotations or new syntax required. system more perceptive---no annotations or new syntax required.
} }
@;@category["D.3.3" "Programming Languages" "Language Constructs and Features"] @;@category["D.3.3" "Programming Languages" "Language Constructs and Features"]