134 lines
6.0 KiB
Racket
134 lines
6.0 KiB
Racket
#lang scribble/sigplan @onecolumn
|
|
|
|
@require["common.rkt" (only-in scribble/base nested) "bib.rkt"]
|
|
|
|
@title[#:tag "sec:intro"]{Type Eleborators Need APIs}
|
|
|
|
Type systems such as ML's, Haskell's, or Scala's help developers convey
|
|
their thinking to their successors in a sound and concise manner. The
|
|
implementations of such type systems tend to employ elaboration passes to
|
|
check the types of a specific program. As an elaborator traverses the
|
|
(parsed) synatx, it simultaneously reconstructs types, checks their
|
|
consistency according to the underlying type theory, replaces many of the
|
|
surface-syntax constructs with constructs from the kernel language, and
|
|
inserts type information to create an annotated representation.
|
|
|
|
Some programming languages also support a meta-API, that is, a way to
|
|
programmatically direct the elaborator. For example, Rust and Scala come
|
|
with compiler plug-ins. Typed Racket and Types Clojure inherit the macro
|
|
mechanisms of their parents. This paper refers to all such mechanisms as
|
|
@defterm{elaborator API}s.
|
|
|
|
Equipping type-checking elaborators with APIs---or using the existing
|
|
APIs---promises new ways to expand the power of type checking at a
|
|
relatively low cost. Specifically, the developers of libraries can
|
|
@defterm{tailor} typing rules to the exported functions and linguistic
|
|
constructs.
|
|
|
|
Consider the example of tailoring the API of a library that implements a
|
|
string-based, domain-specific languages. Examples of such libraries abound:
|
|
formatting, regular-expression matching, database queries, and so on. The
|
|
creators of these DSLs tend to know a lot about how programs written in
|
|
these DSLs relate to the host program, but they (usually) cannot express
|
|
this knowledge in API types. Type tailoring allows such programmers to
|
|
refine the typing rules for the APIs in a relatively straightforward
|
|
way. Recall the API of the regular-expression library. In a typical typed
|
|
language, the matching function is exported with a type like this:
|
|
@;
|
|
@exact|{
|
|
\begin{mathpar}
|
|
\mbox{regexp-match} : \mbox{String} \rightarrow \mbox{Opt[Listof[String]]}
|
|
\end{mathpar}\hspace{-.3em}}|
|
|
@; the above is an ***incredibly disgusting hack to work around a scribble bug***
|
|
When the result is a list of strings, the length of the list depends on the
|
|
argument string---but the API for the regular-expression library cannot
|
|
express this relationship between the regular expression and the
|
|
surrounding host program. As a result, the author of the host program must
|
|
inject additional complexity, often in the form of (hidden) run-time checks.
|
|
|
|
If type tailoring is available, the creator of the library can refine the
|
|
type of @tt{regexp-match} with rules such as this one:
|
|
@;
|
|
@exact|{
|
|
\begin{mathpar}
|
|
\inferrule*[]{
|
|
\elabsto{e}{e'}{\mbox{String}}, \\\\
|
|
\mbox{\it Program} \vdash e' \mbox{ does not contain a grouping}
|
|
}{
|
|
\elabsto{\mbox{regexp-match}~e}{\mbox{regexp-match}~e'}{\mbox{Opt[List[String]]}}
|
|
}
|
|
\end{mathpar}\hspace{-.3em}}|
|
|
That is, when the extended elaborator can use (a fragment of) the program
|
|
to prove that the given string for a specific use of @tt{regexp-match}
|
|
does not contain grouping specifications---ways to extract a sub-string via
|
|
matching---the @tt{Some} result type is a list of a single
|
|
string. The original specifications remains the default rule, which the
|
|
type checker uses when it cannot use the specific ones.
|
|
|
|
A similar rule would say if the string contains exactly one such grouping
|
|
(and no alternative), a match produces a list of two strings. Critically,
|
|
this refined type for the result of applying @tt{regexp-match} enables
|
|
further type refinements, just like constant folding enables additional
|
|
compiler optimizations. In this specific case, the program context of the
|
|
application of @tt{regexp-match} may dereference the list with unsafe---and
|
|
thus faster---versions of @tt{first} and @tt{second} once it has confirmed
|
|
a match.
|
|
|
|
Vectors offer a similar opportunity for unsafe dereferencing via
|
|
type tailoring. Typically, such a library exports a dereferencing construct with
|
|
the following type rule:
|
|
@exact|{
|
|
\begin{mathpar}
|
|
\inferrule*[]{
|
|
\elabsto{e_1}{e_1'}{\tarray}
|
|
\\
|
|
\elabsto{e_2}{e_2'}{\tint}
|
|
}{
|
|
\elabsto{{e_1}[{e_2}]}{\checkedref~e_1'~e_2'}{\tint}
|
|
}
|
|
\end{mathpar}\hspace{-.3em}}|
|
|
@;
|
|
If the elaborator can prove, however, that the indexing expression @tt{e}
|
|
is equal to or evaluates to a specific integer @tt{i}, the rule can be
|
|
strengthened as follows:
|
|
@;
|
|
@exact|{
|
|
\begin{mathpar}
|
|
|
|
\inferrule*[]{
|
|
\elabsto{e_1}{\vectorvn}{\tarray},
|
|
\\
|
|
\elabsto{e_2}{e'_2}{\tint},
|
|
\\\\
|
|
\mbox{\it Program} \vdash e'_2 = i, \quad
|
|
i \in \ints, \quad
|
|
0 \le i < n
|
|
}{
|
|
\elabsto{{e_1}[{e_2}]}{\unsaferef~\vectorvn~i}{\tint}
|
|
}
|
|
\end{mathpar}\hspace{-.3em}}|
|
|
|
|
This paper introduces and evaluates the novel idea of type tailoring. It
|
|
uses Typed Racket and its API to the elaborator (section 2) because the
|
|
language already supports appropriate type and run-time systems and because
|
|
it is relatively straightforward to program the type
|
|
elaborator---@emph{without modifying it}. To make type tailoring concrete
|
|
and to demonstrate its usefulness, the paper presents two case studies
|
|
(sections 3 and 4). Each report on a case study consists of three parts: a
|
|
type soundness argument assuming a type soundness argument for the complete
|
|
language exists; the actual
|
|
@;
|
|
@margin-note*{BEN: this evaluation is missing an idea}
|
|
@;
|
|
implementation; and an evaluation that reports how often the revised type
|
|
elaborator can improve the code. The first type tailoring---chosen for the
|
|
wide familiarity of the domain---enriches the type-checking of
|
|
vector-referencing operations (section 3). The second example explains how
|
|
the implementor of a string-based embedded DSL---a regexp-matching DSL,
|
|
also widely familiar to programming researchers---may tailor the types of
|
|
the interpretation function (section 4). In addition, the paper sketches
|
|
several other applications of type tailoring in Typed Racket (section
|
|
5). The final two sections compare programmability of the type elaborator
|
|
to work on dependent types and sketch how such an API could be implemented
|
|
for other languages.
|