\begin{schemeregion} \label{sec:informal} In this chapter, I describe the general outlines of Typed Scheme via a series of examples. This includes simple typed programs, the integration between typed and untyped code, occurrence typing, and variable-arity polymorphism. In sections~\ref{sec:simple}-\ref{sec:refinement-intro} these are described informally, and section~\ref{sec:semiformal} provides more detail to give an overview of how the system works. \section{Simple Typed Scheme} \label{sec:simple} The obligatory {\it Hello, World} Typed Scheme program is: \begin{exmp} \begin{schemedisplay} langts (display "Hello, World!\n") \end{schemedisplay} \label{ex:hello} \end{exmp} \noindent This example demonstrates several salient features of Typed Scheme. First, a Typed Scheme program is a PLT Scheme module~\cite{f:modules}, which begins with the token \scheme|hlang|. Second, the \scheme|typed-scheme| token after \scheme|hlang| specifies that the module is written in the Typed Scheme language. The program is otherwise identical to the corresponding untyped Scheme program. When more complex programs are required, the type system requires slightly more help from the programmer: \begin{exmp} \begin{schemedisplay} langts (: collatz (Number -> Number)) (define (collatz n) (cond [(= 1 n) 1] [(even? n) (collatz (/ n 2))] [else (collatz (add1 (* 3 n)))])) (collatz 17) \end{schemedisplay} \label{ex:collatz} \end{exmp} \noindent In this example, we define the function \scheme|collatz|, and provide an annotation for its input and output types. These annotations are required for every top-level definition. Typed Scheme programs can also define and manipulate new data structures: \begin{exmp} \begin{schemedisplay} langts (define-struct: person ([first : String] [last : String])) (: greeting (person -> String)) (define (greeting n) (format "~a ~a" (person-first n) (person-last n))) (greeting (make-person "Bob" "Smith")) \end{schemedisplay} \label{ex:struct} \end{exmp} The \scheme|define-struct:| form defines a new structure, with selectors, predicate, and constructor. The name of the structure (here \scheme|person|) is used as the name of the type. The fields must be given types by the programmer, but the selectors (\scheme|person-first| and \scheme|person-last|), predicate (\scheme|person?|), and constructors (\scheme|make-person|) are given types automatically from the field types. \section{Polymorphism and Local Type Inference} \label{sec:poly} Typed Scheme supports first-class polymorphic functions.\footnote{Such functions are not always parametric, because occurrence typing can be used to examine the arguments.} For example, \scheme|list-ref| has the type \scheme|(All (alpha) ((Listof alpha) Integer -> alpha))|. It can be defined in Typed Scheme as follows: \begin{exmp} \begin{schemedisplay} (: list-ref (All (alpha) ((Listof alpha) Integer -> alpha))) (define (list-ref l i) (cond [(not (pair? l)) (error "empty list")] [(= 0 i) (car l)] [else (list-ref (cdr l) (- i 1))])) \end{schemedisplay} \label{ex:list-ref} \end{exmp} The example illustrates two important aspects of polymorphism in Typed Scheme. First, the abstraction over types is explicit in the polymorphic type of \scheme|list-ref| but implicit in the function definition. Second, typical uses of polymorphic functions, e.g., \scheme|car| and \scheme|list-ref|, do not require explicit type instantiation. Instead, the required type instantiations are reconstructed from the types of the arguments. \section{Integration with Untyped Scheme} \label{sec:integrate} Typed Scheme integrates smoothly and safely with existing untyped Scheme code. This means both that Typed Scheme modules can depend on untyped Scheme modules, and that untyped Scheme modules can depend on Typed Scheme modules. Further, the untyped code can never violate the invariants of the Typed Scheme type system. This allows free choice of which modules are refactored and ported to Typed Scheme. The simplest example of typed/untyped integration merely uses a value from Typed Scheme in untyped code. \begin{exmp} \begin{schemedisplay} langts ;; module m1 (: x Number) (define x 42) (provide x) \end{schemedisplay} \vspace{1mm} \begin{schemedisplay} langs (require m1) (add1 x) \end{schemedisplay} \label{ex:integrate} \end{exmp} \noindent Since the value is being used by untyped code, no new types must be specified. In contrast, when importing values \emph{from} untyped code, the type must be specified: \begin{exmp} \begin{schemedisplay} langs ;; module m2 (define x 42) (provide x) \end{schemedisplay} \vspace{1mm} \begin{schemedisplay} langts (require/typed m2 [x Number]) (add1 x) \end{schemedisplay} \label{ex:integrate-reverse} \end{exmp} The \scheme|require/typed| form allows the programmer to specify the type of an imported binding (here \scheme|x|). The type is converted to a runtime contract~\cite{m:design-by-contract-article}, which ensures that the value is appropriate to the type, and raises an error if not. In the case of an error, for example if the value of \scheme|x| was \scheme|"42"|, \emph{blame} is assigned to the series of modules involved. The Typed Scheme type system ensures (see chapter~\ref{chap:integrate}) that the typed portion of the program is never blamed for such runtime contract errors---instead, the blame falls on one of the untyped modules. The contract system also ensures appropriate blame for higher-order values~\cite{ff:ho-contracts}. Here, errors are possible in both directions. Several examples are given in figure~\ref{fig:ho-contract-ex}. In example~\ref{ex:ho-integrate}, the untyped module uses the \scheme|add5| procedure incorrectly, resulting in a runtime contract error blaming the untyped module. In example~\ref{ex:ho-integrate-reverse}, an incorrect use of the \scheme|add5| procedure would result in a static type error. In example~\ref{ex:ho-integrate-reverse-wrong}, the implementation of the untyped \scheme|add5| procedure is incorrect, and thus a runtime contract error is signaled, again blaming the untyped module. Finally, in example~\ref{ex:ho-ho}, the contract error happens during the \emph{second} application, but still correctly blames the untyped module. \begin{figure} \begin{exmp} \begin{schemedisplay} langts ;; module ho1 (: add5 (Number -> Number)) (define (add5 n) (+ 5 n)) (provide add5) \end{schemedisplay} \vspace{1mm} \begin{schemedisplay} langs (require ho1) (add5 7) (add5 "seven") ;; contract error \end{schemedisplay} \label{ex:ho-integrate} \end{exmp} %\vspace{2mm}\hrule\vspace{2mm} \begin{exmp} \begin{schemedisplay} langs ;; module ho2 (define (add5 n) (+ 5 n)) (provide add5) \end{schemedisplay} \vspace{1mm} \begin{schemedisplay} langts (require ho2 [add5 (Number -> Number)]) (add5 7) ;;(add5 "seven") - static type error \end{schemedisplay} \label{ex:ho-integrate-reverse} \end{exmp} %\vspace{2mm}\hrule\vspace{2mm} \begin{exmp} \begin{schemedisplay} langs ;; module ho3 (define (add5 n) (number->string (+ 5 n))) (provide add5) \end{schemedisplay} \vspace{1mm} \begin{schemedisplay} langts (require ho3 [add5 (Number -> Number)]) (add5 7) ;; contract error \end{schemedisplay} \label{ex:ho-integrate-reverse-wrong} \end{exmp} %\vspace{2mm}\hrule\vspace{2mm} \begin{exmp} \begin{schemedisplay} langts ;; module ho4 (: add-blaster (Number -> (Number -> Number))) (define ((add-blaster x) y) (+ x y)) $\vspace{1mm}$ langs (require ho4) ((add-blaster 1) "wrong") ;; contract error \end{schemedisplay} \label{ex:ho-ho} \end{exmp} \caption{Higher-order Contract Examples} \label{fig:ho-contract-ex} \end{figure} \section{Occurrence Typing} \label{sec:occur} Here is the simplest example of occurrence typing:\footnote{In the remainder of this chapter, module declarations will be omitted where unnecessary.} \begin{exmp} \begin{schemedisplay} ... (if (number? x) (add1 x) 0) ... \end{schemedisplay} \label{ex:simple} \end{exmp} \noindent Regardless of the value of \scheme{x}, this program fragment always produces a number. Thus, our type system should accept this fragment, regardless of the type assigned to \scheme{x}, even if it is a type such as \scheme|String| or \scheme|Any|, which are not legitimate argument types for \scheme|add1|. The key to typing example~\ref{ex:simple} is to assign the second occurrence of \scheme|x| a different, more precise, type than it has in the outer context. Fortunately, we know that for any value of type \scheme|Number|, \scheme|number?| returns \scheme|#t| (otherwise, it returns \scheme|#f|). Therefore, it is safe to use \scheme|Number| as the type of \scheme|x| in the \tbranch. \begin{exmp} % \strut\par \begin{schemedisplay} (define f (lambda ([x : (U String Number)]) (if (number? x) (add1 x) (string-length x)))) \end{schemedisplay} \label{ex:numstring} \end{exmp} \noindent The function \scheme|f| in example~\ref{ex:numstring} always produces a number. If \scheme|(number? x)| produces \scheme|#t|, \scheme|x| is an appropriate input to \scheme|add1|. If it produces \scheme|#f|, \scheme|x| must be a \scheme|String| by process of elimination, and it is therefore an acceptable input to \scheme|string-length|. Handling this program means that the type system must take into account not only the consequences when predicates hold, but also when they do not. \subsection{More Complex Tests} Of course, simple applications of predicates such as \scheme|(number? x)| are not the only kind of test that Scheme programmers write. For example, it is possible to use logical connectives to combine the results of predicates: \begin{exmp} % \strut\par \begin{schemedisplay} ... (if (or (number? x) (string? x)) (f x) ;; \scheme|f| from example~\ref{ex:numstring} 0) ... \end{schemedisplay} \label{ex:or} \end{exmp} \noindent For the fragment in example~\ref{ex:or} to typecheck, the type system must recognize that the expression \scheme|(or (number? x) (string? x))| ensures that \scheme|x| has type \scheme|(U String Number)| in the \tbranch, the domain of \scheme|f| from above. For \scheme|and|, there is no such neat connection. In example~\ref{ex:andone}, conclusions are safely drawn in the \tbranch, regardless of the types of \scheme|x| and \scheme|y|. \begin{exmp} % \strut\par \begin{schemedisplay} ... (if (and (number? x) (string? y)) (+ x (string-length y)) 0) ... \end{schemedisplay} \label{ex:andone} \end{exmp} \noindent However, similar assumptions cannot be made in the \ebranch. \begin{exmp} % \strut\par \begin{schemedisplay} ;; \scheme|x| is either a \scheme|Number| or a \scheme|String| ... (if (and (number? x) (string? y)) (+ x (string-length y)) (string-length x)) ... \end{schemedisplay} \label{ex:andtwo} \end{exmp} \label{sec:andintro} \noindent In example~\ref{ex:andtwo}, the programmer falsely assumed \scheme|x| to be a \scheme|String| when the test fails. But the test may produce \scheme|#f| because \scheme|x| is actually a \scheme|String|, in which case the program would succeed, or because \scheme|y| is not a \scheme|String|, but \scheme|x| \emph{is} a number, which would cause \scheme|(string-length x)| to fail. In general, when a conjunction is false, we do not know which conjunct is false. \subsection{Abstraction over Predicates} So far, we have seen how programmers can use predefined predicates. It is important, however, that programmers can also abstract over existing predicates and create new ones. \begin{exmp} % \strut\par \begin{schemedisplay} (define strnum? (lambda ([x : Any]) (or (string? x) (number? x)))) \end{schemedisplay} \label{ex:strnum} \end{exmp} \noindent Taking our previous example of a test for the type \scheme|(U String Number)|, we can create the function \scheme|strnum?|, which behaves as a predicate for that type. This means that the type system must be able to represent the fact that \scheme|strnum?| is a predicate for this type, so that it can exploit it for conditionals. \subsection{Variables, Predicates and Selectors} An important feature of Scheme that Typed Scheme must also handle is the ability to use arbitrary non-\scheme|#f| values as true and to use \scheme|#f| as a marker for missing results, analogous to ML's {\texttt{NONE}}.\footnote{Like other dynamically typed languages, Scheme treats all values that are not \scheme|#f| as true.} \begin{exmp} % \strut\par \begin{schemedisplay} ... (let ([x (member v l)]) (if x $\text{--- compute with \scheme|x| ---}$ (error 'fail))) ... \end{schemedisplay} \label{ex:member} \end{exmp} \noindent Example~\ref{ex:member} represents the essence of a common Scheme idiom. The \scheme|member| procedure yields either the desired result, if \scheme|v| is found in \scheme|l|, or \scheme|#f|, if \scheme|v| is not found. Therefore, in the \tbranch we know \scheme|x| is the desired result, otherwise the \ebranch is taken. All of the tests thus far have involved variables such as \scheme|x| and \scheme|y|. It is also useful to test arbitrary expressions. For example, we can test that the \scheme|car| of a pair is a \scheme|Number| with the expression \scheme|(number? (car p))|. Integrating this form of reasoning into the type system requires further modifications, as we will see. \begin{exmp} % \strut\par \begin{schemedisplay} ... (if (number? (car p)) ;; \scheme|p| : \scheme|(Pair Any Any)| (add1 (car p)) 7) ... \end{schemedisplay} \label{ex:numcar} \end{exmp} \noindent If \scheme|p| has the type \scheme|(Pair Any Any)|, then example~\ref{ex:numcar} should produce a number.\footnote{This relies on the recent change to PLT Scheme to make pairs immutable by default.} Of course, simply accommodating repeated applications of \scheme|car| is insufficient for real programs. Instead, the relevant portions of the type of \scheme|p| must be refined in the \tebranch of the if. In example~\ref{ex:gp}, we assume that \scheme|g| has argument type \scheme|(Pair Number Number)|: % Thus, the test expression must refine %the type of \scheme|p| to \scheme|(Pair Number Number)|, which is the %expected result of the conjunction of tests on the \scheme|car| and %\scheme|cdr|. \begin{exmp} \begin{schemedisplay} ... (if (and (number? (car p)) (number? (cdr p))) (g p) 'nope) ... \end{schemedisplay} \label{ex:gp} \end{exmp} \noindent Thus, the test expression must refine the type of \scheme|p| to \scheme|(Pair Number Number)|, which is the expected result of the conjunction of tests on the \scheme|car| and \scheme|cdr|. As example~{\ref{ex:carnum}} shows, programmers can abstract the use of predicates and selectors together. \begin{exmp} % \strut\par \begin{schemedisplay} (define carnum? (lambda ([x : (Pair Any Any)]) (number? (car x)))) \end{schemedisplay} \label{ex:carnum} \end{exmp} \noindent The \scheme|carnum?| predicate tests if the \scheme|car| of its argument is a \scheme|Number|. \subsection{Reasoning Logically} Of course, we \emph{do} learn something when conjunctions such as those in examples~\ref{ex:andone} and \ref{ex:andtwo} are false. When a conjunction is false, we know that one of the conjuncts is false, and thus when all but one are true, the remaining one must be false. In Scheme form, this reasoning principle is found quite often in nested conditionals. \begin{exmp} % \strut\par \begin{schemedisplay} ... (cond [(and (number? x) (string? y)) $\text{\ --- 1 ---}$] [(number? x) $\hspace{32.1mm}\text{\ --- 2 ---}$] [else $\hspace{47mm}\text{\ --- 3 ---}$]) ... \end{schemedisplay} \label{ex:logic} \end{exmp} \noindent This program represents a common idiom.\footnote{The introductory textbook \textit{How to Design Programs}~\cite{fffk:htdp} devotes an entire section to this idiom.} In clause 1, we obviously know that \scheme|x| is a \scheme|Number| and \scheme|y| is a \scheme|String|. In clause 2, \scheme|x| is again a \scheme|Number|. But we also know that \scheme|y| cannot be a \scheme|String|. To make sense of the type discipline employed in such programs, the Typed Scheme type system must be able to follow this reasoning. \subsection{Putting it all Together} Finally, we can combine all of these features into a single example that demonstrates all aspects of occurrence typing. \begin{exmp} % \strut\par \begin{schemedisplay} (lambda ([input : (U Number String)] [extra : (Pair Any Any)]) (cond [(and (number? input) (number? (car extra))) (+ input (car extra))] [(number? (car extra)) (+ (string->number input) (car extra))] [else 0])) \end{schemedisplay} \label{ex:full} \end{exmp} \section{Variable-arity Functions} % LocalWords: langts \label{sec:varar} PLT Scheme allows programmers to define functions that do not take a fixed number of arguments. Some functions can accept any one of a set of numbers of arguments, or a variable number. To accommodate these programming patterns, Typed Scheme supports a variety of forms of variable-arity function definitions. \subsection{Uniform Variable-Arity Functions} \label{ssec:uni-va} \emph{Uniform} variable-arity functions are those that expect their rest parameter to be a homogeneous list. Consider the following three examples of type signatures: \begin{schemedisplay} (: + (Integer$^*$ -> Integer)) (: string-append (String$^*$ -> String)) (: list (All (alpha) (alpha$^*$ -> (Listof alpha)))) \end{schemedisplay} The syntax \scheme{Type$^*$} indicates that 0 or more arguments of type \scheme|Type| are required. Here is a definition of variable-arity \scheme|+| in Typed Scheme: \begin{exmp} \begin{schemedisplay} (: + (Integer$^*$ -> Integer)) \end{schemedisplay} \schemeinput{esop09/examples/plus.ss} \label{ex:var-plus} \end{exmp} \subsection{Non-uniform Variable-Arity} \label{ssec:beyond-uva} While assuming that rest parameters are homogeneous lists of values makes typechecking simple, not all variable-arity functions allow this assumption. Typechecking heterogeneous rest parameters requires analyzing other relationships between types. For example, the length of the list assigned to the rest parameter may be connected to the types of other parameters or the returned value. For example, Scheme's \scheme|map| function maps an $n$-ary function over $n$ lists, unlike its counterparts in ML or Haskell. When \scheme{map} receives a function \scheme{f} and $n$ lists, it expects \scheme{f} to accept $n$ arguments. Also, the type of the $k$th function parameter must match the element type of the $k$th list. The following example is taken from the PLT Scheme code base: \begin{exmp} \begin{schemedisplay} ;; implements a wrapper that prints \scheme{f}'s arguments (: verbose (All (beta alpha ...) ((alpha ... -> beta) -> (alpha ... -> beta)))) (define (verbose f) (if quiet? f (lambda a (printf "xform-cpp: ~a\n" a) (apply f a)))) \end{schemedisplay} \end{exmp} The intent of the programmer is clear---the result of applying \scheme|verbose| to a function \scheme|f| should have the same type as \scheme|f| for {\em any} function type. Typed Scheme represents this by allowing \scheme|verbose| to be instantiated with a \emph{sequence} of type arguments, one for each of the arguments to \scheme|f|. Below are the types for some additional example functions: \begin{exmp} \begin{schemedisplay} ;; \scheme{map} is the standard Scheme map (: map (All (gamma alpha beta ...) ((alpha beta ... -> gamma) (Listof alpha) (Listof beta) ... -> (Listof gamma)))) \end{schemedisplay} \label{ex:map-dots} \end{exmp} \begin{exmp} \begin{schemedisplay} ;; \scheme{map-with-funcs} takes any number of functions, ;; and then an appropriate set of arguments, and then produces ;; the results of applying all the functions to the arguments (: map-with-funcs (All (beta alpha ...) ((alpha ... -> beta)$^*$ -> (alpha ... -> (Listof beta))))) \end{schemedisplay} \label{ex:map-func-dots} \end{exmp} When a variable-arity polymorphic type is instantiated, the dotted sequence is replaced with the provided sequence of type arguments. For \scheme|map-with-funcs|, this works as follows: %% \scheme|Number Integer Boolean String| \begin{schemedisplay} (inst map-with-funcs Number Integer Boolean String) \end{schemedisplay} results in a value with the type: \begin{schemedisplay} ((Integer Boolean String -> Number)$^*$ -> (Integer Boolean String -> (Listof Number))) \end{schemedisplay} Typed Scheme also provides local inference of the appropriate type arguments for dotted polymorphic functions, so explicit type instantiation is rarely needed~\cite{dots-tr}. A more substantial definition of a variable-arity function is \scheme|fold-left|. \begin{exmp} \schemeinput{esop09/examples/fold-left.ss} \label{ex:fold-left} \end{exmp} Its type shows that it accepts at least three arguments: a function \scheme{f}; an initial element \scheme{c}; and at least one list \scheme{as}. Optionally, \scheme{fold-left} consumes additional lists, received as rest argument \scheme|bss|, which is a sequence of lists. For this combination to work out, \scheme{f} must consume as many arguments as there are lists plus one; in addition, the types of these lists must match the types of \scheme{f}'s parameters because each list item becomes an argument. \section{Refinement Types} \label{sec:refinement-intro} Refinement types, introduced originally by \citet{fp:refinement}, are types which describe subsets of conventional types. For example, the type of even integers is a refinement of the type of integers. In Typed Scheme, we can describe a set of values with a simple Scheme predicate. The fundamental idea is that a boolean-valued function, such as \scheme|even?|, can be treated as defining a type, which is a subtype of the input type of \scheme|even?|. This type has no constructors, but it is trivial to determine if a value is a member by using the predicate \scheme|even?|. For example, consider the \scheme|just-even| function, which produces solely even numbers, and the \scheme|halve| function, which consumes only even numbers. \begin{exmp} \begin{schemedisplay} (: just-even (Number -> (Refinement even?))) (define (just-even n) (if (even? n) n (error 'not-even))) (: halve ((Refinement even?) -> Number)) (define (halve n) (/ n 2)) \end{schemedisplay} \label{ex:refinement} \end{exmp} This technique harnesses occurrence typing to work with arbitrary predicates, and not just those that correspond to Scheme data types. \end{schemeregion}