\begin{schemeregion} \label{chap:prior} The design of Typed Scheme builds on extensive prior work. This chapter surveys the most important strands. The first section discusses the extensive history of static checking for untyped languages, in particular LISP and Scheme. While Typed Scheme differs significantly from all of these prior efforts, their design choices and experience informed the choices made in creating Typed Scheme. % The second section reviews prior work on designing multi-language systems, especially those that use contracts at the boundaries. % The third section describes previous solutions to the problem of typechecking a language with macros, as it has been explored both in the context of static checkers and soft typing. % The final section covers two key innovative technologies that Typed Scheme builds on--runtime software contracts and module systems. \section{Static Checking for Scheme and LISP} The history of programming languages knows many attempts to add or to use type information in conjunction with untyped languages. In 1976, \citet{c:typed-lisp} proposed a type system to simplify the operation and use of Boyer-Moore-style theorem provers~\cite{boyer-moore97}. Since then, the designers of Common Lisp~\cite{cl2} as well as many other languages have included type declarations in such languages, often to help compilers, sometimes to assist programmers. From the late 1980s until recently, people have studied soft typing~\cite{cf:pldi91,awl:popl94,wc:toplas97,hr:fpca95,ff:toplas99,mff:popl06}, a form of type inference to assist programmers debug their programs statically. This work has mainly been in the context of Scheme but has also been applied to Python~\cite{some-mit-ms-thesis} and Erlang~\cite{mw:erlang}. %Recently, %the slogan of ``gradual typing'' has resurrected the LISP-style %annotation mechanisms and has had a first impact with its tentative %inclusion in Perl6~\cite{tang}. This section surveys this body of work, starting with previous static type systems for Scheme and LISP. \subsection{Static Type Systems} Many researchers have developed systems that brought together both static type systems and LISP or Scheme programming, usually to make it easier to reason about programs. In 1976, Cartwright presented TYPED LISP~\citeyearpar{c:typed-lisp}. However, in his paper (section 5) he describes having to abandon the policy of rejecting type-incorrect programs because the types of variables in conditionals had overly broad types, which is precisely the issue that Typed Scheme and occurrence typing address. TYPED LISP also did not consider the problem of interoperation with untyped code. Wand's Semantic Prototyping System (or SPS)~\cite{wand84} included a type system for Scheme programs, designed for implementing denotational models of languages. The type system did not include any features for handling typical Scheme idioms, nor for interoperating soundly with untyped code. Haynes~\cite{h:infer} designed a type system called Infer based on row types for Scheme. His system did not accommodate Scheme programming idioms in the way that Typed Scheme does, and aimed for complete type inference for all programs, a non-goal of the Typed Scheme project. However, his system does support restricted forms of variable-arity polymorphism~\cite{dh:var-ar}. This system is discussed in detail in section~\ref{sec:esop-related}. \citet{m:nisp} developed Nisp, an unsound type system for Common LISP. It allows the description and checking of many types similar to those in Typed Scheme, but does not provide checking for typical LISP idioms. It also does not provide sound interaction with untyped code. Leavens~\cite{leavens} designed an ML-like type system for Scheme, but made no attempt to accommodate Scheme idioms, or to integrate with untyped code. The Bigloo Scheme compiler~\cite{bigloo} provides a simple type system for Scheme, primarily for the purpose of optimization. This system does not have safe interoperation with untyped code, nor does it accommodate Scheme idioms. The Strongtalk system~\cite{strongtalk} includes a sophisticated typechecker for Smalltalk programs, designed by analysis of existing Smalltalk idioms. The system was therefore able to accomodate large bodies of existing Smalltalk code, including portions of the then-standard library. However, Strongtalk did not attempt to check the behavior of still-untyped portions of a program. \subsection{Soft Typing} The goal of the soft typing research agenda is to provide an optional type checker for programs in untyped languages. One key premise is that programmers shouldn't have to write down any type definitions or type declarations. Soft typing should work via type inference only. Another premise is that soft type systems should never prevent programmers from running any program. If the type checker encounters an ill-typed program, it should insert run-time checks that restore typability and ensure that the invariants of the type system are not violated. Naturally, a soft type system should minimize these insertions of run-time checks. Furthermore, since these insertions represent potential failures of type invariants, a good soft type system must allow programmers to inspect the sites of these run-time checks to determine whether they represent genuine errors or weaknesses of the type system. Such soft typing systems are complex and brittle, however. On one hand, these systems may infer extremely large types for seemingly simple expressions, greatly confusing the programmer, either the original or the maintenance programmer who has taken on old code. On the other hand, a small syntactic change to a program without semantic consequences can introduce vast changes into the types of both nearby and remote expressions. Experiments with undergraduates---representative of average programmers---suggest that only the very best understood the tools well enough to make sense of the inferred types and to exploit them for the assigned tasks. For the others, these tools turned into time sinks with little benefit. Roughly speaking, soft typing systems fall into one of two classes, depending on the kind of underlying inference system. The first soft type systems~\cite{cf:pldi91,wc:toplas97,hr:fpca95, h:dynamic-article, am:static-types-dynamic-language} used inference engines based on Hindley-Milner though with extensible record types. These systems are able to type many actual Scheme programs, including those using outlandish-looking recursive datatypes. Unfortunately, these systems severely suffer from the general Hindley-Milner error-recovery problem. That is, when the type system signals a type error, it is extremely difficult---often impossible---to decipher its meaning and to fix it. % \footnote{Fagan gives an example of the \scheme|taut| function, which % checks if arbitrary-arity boolean functions are tautologies. This example % works in Typed Scheme as well.} In response to this error-recovery problem, others built inference systems based on Shivers's control-flow analyses~\citeyearpar{shivers-thesis} and Aiken's and Heintze's set-based analyses~\cite{awl:popl94,nh:sba}. Roughly speaking, these soft typing systems introduce sets-of-values constraints for variables and values and propagate them via a generalized transitive-closure propagation~\cite{awl:popl94,ff:toplas99}. In this world, it is easy to communicate to a programmer how a value might flow into a particular operation and violate a type invariant, thus eliminating one of the major problems of Hindley-Milner based soft typing~\cite{ffkwf:mrspidey}. My experience and evaluation, as well as that of other users of soft typing systems, suggest that Typed Scheme works well compared to soft typing. First, programmers can easily convert entire modules with just a few type declarations and annotations to function headers. Second, assigning explicit types and rejecting programs actually pinpoints errors better than soft typing systems, where programmers must always keep in mind that the type inference system is conservative. Third, soft typing systems do not support type abstractions well. Starting from an explicit, static type system for an untyped language should help introduce these features and deploy them as needed. The prior soft typing research inspired this work on occurrence typing. These systems employed simplistic \scheme{if}-splitting rules that performed a case analysis for types based on the syntactic predicates in the test expression. This idea was derived from \citet{c:typed-lisp}'s \texttt{typecase} construct (also see below) and---due to its usefulness---inspired the generalization to occurrence typing. The major advantage of soft typing over an explicitly typed Scheme is that it does not require any assistance from the programmer. In the future, I hope to borrow techniques from soft typing for automating some of the conversion process from untyped modules to typed modules. \section{Interlanguage Interoperability} Smooth operation between typed and untyped code is essential to Typed Scheme. The design of this interoperation relies heavily on prior work on language interoperation. \citet{gff:oopsla05} developed ProfessorJ, which combines PLT Scheme and Java. Their interoperability strategy relies on custom \emph{mirrors}, generated specifically for each class, which dynamically check the invariants of methods. They also extend each language with the features of the other, using the Java-like class system of PLT Scheme~\cite{fff:classes}, and adding first-class functions to their implementation of Java. Their system also automatically generates contracts as part of the mirrors for each class. \citet{mf:multilang-toplas} consider interlanguage interoperability between a typed and untyped lambda calculus, and demonstrate the such interoperability corresponds to contracts, although their work does not generate contracts. They claim, but do not prove, that only untyped portions of the program can be blamed. However, their system has only two parties. \section{Implementing Types in Scheme} The implementation of typecheckers for Scheme poses several challenges. First, the typechecker must somehow cope with user-written macros. Second, the typechecker must accommodate pre-existing code in a sound manner, preferably without requiring changes to the untyped Scheme implementation. Third, several systems, like Typed Scheme, have implemented the type system via the underlying macro system. The SPS system mentioned above used macros to implement the typechecker, although without hygiene. Typed Scheme scales up the approach of SPS to a modern macro and module system, enabling the seamless integration between typed and untyped code. Flanagan's static analyzer for DrScheme~\cite{ffkwf:mrspidey} analyzed expanded programs and used syntax source information to display the analysis results in the program editor. The analyzer included a macro protocol that let programmers annotate their programs with hints for the analyzer. The Ziggurat project~\cite{fs:ziggurat-jfp} has investigated alternate approaches to static analysis and other program observations in the presence of macros. Analyses in Ziggurat are implemented as methods on expression nodes, and derived expression forms (that is, macros) may either override analysis methods with special behavior or defer to the default analysis of the macro's expansion. Ziggurat represents a new approach to defining macro protocols, and it is as yet unclear how the Ziggurat approach compares with the mechanisms described here. \section{Contracts and Modules} The design and implementation of Typed Scheme builds heavily on prior work in contracts and module systems. Runtime software contracts have a long history, but traditionally did not support two features necessary for the design of Typed Scheme. First, Scheme, and thus Typed Scheme, makes extensive use of higher-order language constructs, which must therefore be accommodated by the contract system. Second, Typed Scheme isolates typed from untyped code, and thus needs a mechanism for determining which portion of the program violated a contract. These two shortcomings are addressed by Findler's contract system~\cite{ff:ho-contracts}, as implemented in PLT Scheme~\cite{plt-tr2009-reference-v4.2.2}. This system provides guarantees that ensure that the appropriate portion of code is blamed for violating a contract, even when the violation is of a higher-order contract and takes place arbitrarily later in the execution of the program. Typed Scheme builds directly on this system, automatically generating contracts from types. Typed Scheme also relies on a module system that provides strong abstraction barriers, in order to prevent unauthorized access by the untyped code to unprotected portions of the typed program. For Typed Scheme, this module system is the PLT module system~\cite{f:modules,plt-tr2009-reference-v4.2.2}. This system provides strong abstraction barriers for both values and macros. It also allows Typed Scheme to be implemented entirely as a library, using the techniques described in chapter~\ref{chap:impl}. \end{schemeregion}