\newcommand\Strust{\textsc{trust}}

\begin{schemeregion}
\section{Typing Modules}
\label{sect:type-multi}

Type-checking a typed module is more complicated than type-checking an
isolated definition or expression. Module bodies may refer to
variables that are neither primitive nor locally-defined, but imported
from other modules. Furthermore, module exports must be protected from
misuse in other modules, both typed and untyped.

As with a single definition or expression, type-checking a module
involves fully expanding the contents of the module and then analyzing
the result. Typed Scheme uses the module transformer hook
to type-check the contents of the module.

The variable protocol handles variables whose definitions or bindings
occur within the body of the module, but typing imported variables
requires additional communication between typed modules. The revised
protocol affects the way a typed module's exports are compiled.

There are three kinds of module interactions that typed modules can
participate in:
\begin{enumerate}
\item A typed module requires an untyped module.
\item A typed module requires another typed module.
\item An untyped module requires a typed module.
\end{enumerate}
The first case simply requires a method of importing untrusted code in
such a way that it cannot break the type system's invariants, which
demands appropriate input from the programmer. The other two cases
determine the behavior of a typed module's exports. Those two cases
essentially demand different behaviors from a typed module depending
on its use context.

This section explains how Typed Scheme interacts with the module
system. We begin with the simplest case, a typed module importing
untyped code. This case can be explained in terms of just the import
statement. Then we consider the case of a typed module importing
another typed module, and we develop the basic typed-module framework.
Finally, we show how to extend the behavior of exports to support the
case of importing a typed module into an untyped context.
\subsection{Untyped to Typed}

Typed modules cannot use untyped modules without additional protection.%
\footnote{However, typed modules can safely import untyped
\emph{macro} libraries (such as \scheme{match}) if the macros do not
expand into untyped, non-primitive variables.}
%
Instead, typed modules use a special \scheme{require/typed} form to
import names at specific types. The \scheme{require/typed} form wraps
the untyped imports with contracts~\cite{ff:ho-contracts} that enforce
the supplied types via runtime checks. It also adds the name to the type
environment with the specified type.

For example, the following use of \scheme{require/typed} imports the
\scheme{find-files} procedure from a standard library
module:\footnote{The \scheme|Path| of this library is a filesystem
path, not the paths of chapter~\ref{chap:occur-extend}.}
\begin{schemedisplay}
(require/typed scheme/file
  [find-files ((Path -> Boolean) Path -> (Listof Path))])
\end{schemedisplay}
It is equivalent to the following code fragment:
\begin{schemedisplay}
(require (rename-in scheme/file [find-files unsafe-find-files]))$^{\mbox{\scriptsize \Strust}}$
(define: find-files : ((Path -> Boolean) Path -> (Listof Path))
  (contract (type->contract
             ((Path -> Boolean) Path -> (Listof Path)))
            unsafe-find-files
            'find-files
            '<typed-scheme>)$^{\mbox{\scriptsize \Strust}}$)
\end{schemedisplay}
The $\Strust$ annotation indicates a syntax property that directs the
type-checker to accept the labeled expression as-is.
%
The \scheme{contract} expression wraps the unsafe version of the
\scheme{find-files} procedure with a contract derived from the given
type. The last two arguments indicate the parties involved in the
contract; if something goes wrong, one of the parties is blamed.

The \scheme{find-files} contract checks the procedure's arguments and
result. If the untyped version of \scheme{find-files} returns
something other than a list of paths, the contract catches it and
blames \scheme{'find-files} before the faulty value can interfere with
the typed code that expects to process the result.
%
The first argument contract is itself a higher-order contract, so the
contract system wraps the function passed to \scheme{find-files} with
a contract corresponding to the \scheme{(Path -> Boolean)} type. This
contract prevents the untyped \scheme{find-files} from calling the
function with faulty arguments; if it does so, the contract system
raises an error and blames \scheme{'find-files} for the violation.
%
The second argument contract is a first-order contract. It can only be
violated if typed code supplies an argument of the wrong type, which
cannot happen if the type system is sound.

%% Some types have no contract rep, like polymorphic types
%% prob. also unions of function types.
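
The blame behavior described above can be observed directly with the
contract library. The following is only a sketch under several
assumptions: it is written for a current Racket installation rather
than the PLT Scheme release discussed here, the contract is written by
hand instead of being produced by \scheme{type->contract}, and the
names \scheme{bad-result}, \scheme{bad-call}, \scheme{protect},
\scheme{blamed-party}, and \scheme{'typed-client} are hypothetical
stand-ins that do not occur in Typed Scheme.
\begin{schemedisplay}
#lang racket

;; Hypothetical stand-ins for a faulty untyped find-files.
;; bad-result returns a symbol instead of a list of paths.
(define (bad-result pred start) 'oops)
;; bad-call applies the predicate to a string instead of a path.
(define (bad-call pred start) (pred "not-a-path") (list start))

;; Hand-written counterpart (an assumption) of
;; (type->contract ((Path -> Boolean) Path -> (Listof Path)))
(define find-files/c
  (-> (-> path? boolean?) path? (listof path?)))

;; protect : procedure -> procedure
;; Wraps f; the untyped provider is the positive party.
(define (protect f)
  (contract find-files/c f 'find-files 'typed-client))

;; blamed-party : (-> any) -> any
;; Runs thunk; returns the blamed party, or #f without a violation.
(define (blamed-party thunk)
  (with-handlers ([exn:fail:contract:blame?
                   (lambda (e)
                     (blame-positive
                      (exn:fail:contract:blame-object e)))])
    (thunk)
    #f))

(define here (string->path "."))
;; first-order violation of the result contract
(blamed-party
 (lambda () ((protect bad-result) (lambda (p) #t) here)))
;; higher-order violation: the callback receives a non-path
(blamed-party
 (lambda () ((protect bad-call) (lambda (p) #t) here)))
\end{schemedisplay}
In both calls the typed side behaves correctly, so we expect the
reported party to be \scheme{'find-files}: once for the faulty result
and once for misusing the wrapped callback.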
\subsection{Typed to Typed}

Typed Scheme installs a \scheme{HPmodule-begin} macro that first
performs the normal module expansion (using \scheme{local-expand}),
analyzes the result, and produces a module body that follows a new
\emph{module variable protocol}, which provides the type-checker with
the types of module variables:
\begin{schemedisplay}
(define-syntax (module-begin stx)
  (syntax-case stx ()
    [(module-begin form ...)
     (type-check-module-body
      (local-expand #'(#%plain-module-begin form ...)
                    'module-begin
                    null))]))
\end{schemedisplay}
Unlike the type-checking procedure for top-level forms,
\scheme{type-check-module-body} not only type-checks the module body;
it also transforms the code to produce the module body.

When one typed module requires another typed module, type-checking the
first module requires knowing the types associated with all of the
definitions of the second module. The type-checker needs the types for
all of the definitions, even the unexported ones, because an imported
macro can expand into references to the unexported variables of the
module it was defined in.
%
This requires a new protocol, the module variable protocol.

Let us consider the protocol mechanisms introduced in
section~\ref{sect:protocols}.
%
An imported identifier does not carry any syntax properties, so syntax
properties alone are insufficient.
%
Static binding provides a partial solution: instead of directly
providing a variable, a typed module could instead provide a macro
that expands into a use of the actual variable. The macro would place
a type annotation on the reference as a syntax property.
%
The problem with the static binding approach is that it annotates only
the references that cross the public import/export boundary.
% FIXME: make sure to explain this point in section 3.
Variable references introduced by imported macros, however, do not go
through the static binding mechanism; they refer directly to the
module variables.
%
Since Typed Scheme aims to support macros, static binding is not
a viable approach.

That leaves compile-time side effects. We extend the type environment
table to include all known typed-module definitions instead of just
primitives and local definitions. A typed module relies on the global
type environment to contain types for all variables that appear within
its body, and it guarantees that its client modules have access to its
own type associations.
\begin{quotation}\noindent
\textbf{The Module Variable Protocol:}
During the compilation of a typed module, the global type environment
contains bindings for all definitions in all typed modules
transitively required by the module being compiled.
\end{quotation}

Since a module's contributions to the global type environment need to
be present during the compilation of every module that depends on it,
we use the persistent effect pattern described in
section~\ref{sect:syntax:persistent}. In addition to verifying the
correctness of the module's contents, the
\scheme{type-check-module-body} procedure also appends compile-time
type declarations to the end of the module.
%
We illustrate the effect of the module transformer on the following
modules:
\begin{schemedisplay}
langts ;; one
(provide one)
(: one Number)
(define one 1)

langts ;; plus
(require one)
(provide plus1)
(: plus1 (Number -> Number))
(define (plus1 n)
  (+ n one))
\end{schemedisplay}
The first module passes the type-checker, which also adds a type
declaration for \scheme{one} to the end of the compiled module:
\begin{schemedisplay}
(compiled-module one
  (require typed-scheme)
  (provide one)
  (define one 1)
  (begin-for-syntax
    (declare-type! #'one (typeKW Number))))
\end{schemedisplay}
The reference to \scheme{declare-type!} was inserted by a macro from
the \scheme{typed-scheme} module. Even though \scheme{one} does not
import the \scheme{env} module directly, the procedure is available
indirectly through \scheme{typed-scheme}. Since \scheme{typed-scheme}
imports \scheme{env} via \scheme{for-syntax}, it is correct to
use \scheme{declare-type!} within the compile-time part of
\scheme{one}.

When the compiler encounters the \scheme{plus} module, the module
system invokes the compile-time part of \scheme{typed-scheme},
initializing the global type environment with the primitive bindings
only. Then, when the compiler encounters the import of \scheme{one} in
the module body, it invokes the compile-time part of the \scheme{one}
module, which loads its type declaration for \scheme{one} into
the type environment.

The \scheme{plus} module includes just one new definition, and the
module transformer adds the corresponding declaration to the module:
\begin{schemedisplay}
(compiled-module plus
  (require typed-scheme)
  (require one)
  (provide plus1)
  (define plus1 (lambda (n) (+ n one)))
  (begin-for-syntax
    (declare-type! #'plus1 (typeKW (Number -> Number)))))
\end{schemedisplay}

The two modules are able to communicate using \scheme{typed-scheme}'s
type environment because the compile-time parts of the \scheme{one}
module and the \scheme{plus} module share a single invocation of
\scheme{typed-scheme} and thus a single invocation of the \scheme{env}
module.

Figures~\ref{fig:typed-scheme-module} and~\ref{fig:type-check-module}
show the implementation of typed modules and the module variable
protocol.
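
The persistent effect at the heart of the protocol can be tried out in
a single file. The following is only a sketch under several
assumptions: it targets a current Racket installation, it uses
submodules in place of separately compiled modules, it represents types
as plain symbols, and it keys the table by symbol rather than by
identifier binding. The \scheme{lookup-type} procedure and the
\scheme{type-of} macro are hypothetical helpers introduced only to
observe the table; they are not part of Typed Scheme.
\begin{schemedisplay}
#lang racket

;; env: a table from names to types, used at compile time.
(module env racket/base
  (provide declare-type! lookup-type)
  (define table (make-hasheq))
  ;; declare-type! : identifier any -> void
  (define (declare-type! id type)
    (hash-set! table (syntax-e id) type))
  ;; lookup-type : identifier -> any   (#f when unknown)
  (define (lookup-type id)
    (hash-ref table (syntax-e id) #f)))

;; one: stands in for the compiled form of a typed module.
;; The begin-for-syntax form runs again on every visit, which
;; reloads the declaration into the client's type environment.
(module one racket/base
  (require (for-syntax racket/base (submod ".." env)))
  (provide one)
  (define one 1)
  (begin-for-syntax
    (declare-type! #'one 'Number)))

;; The client shares one compile-time instance of env with one.
(require 'one (for-syntax 'env))

;; type-of : expands into the quoted type recorded for id
(define-syntax (type-of stx)
  (syntax-case stx ()
    [(_ id) #`(quote #,(lookup-type #'id))]))

(type-of one)
\end{schemedisplay}
The client never calls \scheme{declare-type!} itself; if the sketch
behaves as intended, the use of \scheme{type-of} sees the
\scheme{Number} entry only because requiring \scheme{one} replays its
compile-time declaration.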
\begin{figure}[p!]
\begin{schemedisplay}
langs ;; typed-scheme
(require (for-syntax type-check))
(provide (rename-out [module-begin HPmodule-begin])
         (rename-out [top-interaction HPtop-interaction])
         (except-out (all-from-out scheme)
                     HPmodule-begin HPtop-interaction)
         define:
         lambda:)
(define-syntax (module-begin stx)
  (syntax-case stx ()
    [(module-begin form ...)
     (type-check-module-body
      (local-expand #'(#%plain-module-begin form ...)
                    'module-begin
                    null))]))
(define-syntax top-interaction ELIDED)
(define-syntax define: ELIDED)
(define-syntax lambda: ELIDED)
\end{schemedisplay}
\caption{The \variablefont{typed-scheme} module}
\label{fig:typed-scheme-module}
\end{figure}
\begin{figure}[p!]
\begin{schemedisplay}
langs ;; context
(provide typed-context?)
;; typed-context? : (box-of boolean)
;; True when the module being \emph{compiled} is a typed module.
(define typed-context? (box #f))

langs ;; typed-scheme
ELIDED
(require (for-syntax context))
(define-syntax (module-begin stx)
  (syntax-case stx ()
    [(module-begin form ...)
     (begin
       (set-box! typed-context? #t)
       (type-check-module-body
        (local-expand #'(#%plain-module-begin form ...)
                      'module-begin
                      null)))]))
ELIDED
\end{schemedisplay}
\caption{Modified \variablefont{typed-scheme} module}
\label{fig:new-ts-mod}
\end{figure}
\begin{figure}[p!]
\begin{schemedisplay}
langs ;; type-check
(require env)
(provide (all-defined-out))
;; type-check-top-level : syntax $\rightarrow$ void
(define (type-check-top-level form) ELIDED)
;; type-check-module-body : syntax $\rightarrow$ syntax
(define (type-check-module-body form)
  (syntax-case form ()
    [(module-begin top-level-form ...)
     (let ([def-types
            (get-definition-types (syntax->list #'(top-level-form ...)))])
       (for ([def def-types])
         (declare-type! (binding-id def) (binding-type def)))
       (for-each type-check-module-level-form
                 (syntax->list #'(top-level-form ...)))
       ;; Generate declarations to reload types into the
       ;; global type environment
       (with-syntax ([(type-declaration ...)
                      (map binding->type-declaration def-types)])
         #'(module-begin top-level-form ... type-declaration ...)))]))
;; type-check-module-level-form : syntax $\rightarrow$ void
(define (type-check-module-level-form form) ELIDED)
;; type-check-expression : syntax environment $\rightarrow$ type
(define (type-check-expression expr env) ELIDED)
;; get-definition-types : (list-of syntax) $\rightarrow$ (list-of binding)
(define (get-definition-types forms)
  (if (null? forms)
      null
      (syntax-case (car forms) (define)
        [(define name rhs)
         (cons (make-binding #'name (get-id-type #'name))
               (get-definition-types (cdr forms)))]
        [_ (get-definition-types (cdr forms))])))
;; get-id-type : identifier $\rightarrow$ type
(define (get-id-type id) ELIDED)
;; binding$\rightarrow$type-declaration : binding $\rightarrow$ syntax
(define (binding->type-declaration b)
  (with-syntax ([id (binding-id b)]
                [type-expr (type->type-expression (binding-type b))])
    #'(begin-for-syntax (declare-type! #'id type-expr))))
;; type$\rightarrow$type-expression : type $\rightarrow$ syntax
(define (type->type-expression type) ELIDED)
\end{schemedisplay}
\caption{Type Checker}
\label{fig:type-check-module}
\end{figure}
\subsection{Typed to Untyped}

When a typed module is imported into another typed module, it must
provide its definitions and load the type declarations into the global
type environment. The type-checker ensures that the exported values
are used safely, so there is no need for run-time checking or
wrapping.

In contrast, when a typed module is imported into an untyped module,
it should protect its exports so that the untyped context cannot
destroy the type invariants. As in the ``untyped to typed'' case, we use
contracts to enforce the type constraints of the definitions. For any
defined variable, it is a simple matter to generate a definition that
wraps the variable in the protection of the appropriate contract.
%
For example, the \scheme{plus} module above has a \scheme{plus1}
procedure with type \scheme{(Number -> Number)}. Given that information,
we can generate \scheme{defensive-plus1}:

\begin{schemedisplay}
(define/contract defensive-plus1
  (type->contract (Number -> Number))
  plus1)
\end{schemedisplay}
\noindent
The \scheme{define/contract} form is like a definition that uses
\scheme{contract} explicitly, except that it automatically computes
the blame parties.

A typed module, then, needs to provide one set of definitions to typed
contexts and another set of definitions to untyped contexts.
%
Of course, no module can actually change the contents of its
\scheme{provide} clauses once it is compiled. Instead, it can provide
a set of \emph{indirection} macros that choose whether to expand into
the trusting or defensive versions of exported names, assuming the macros
can determine whether the importing context is typed or untyped. PLT
Scheme provides \emph{rename transformers} as a convenient way of
writing such identifier-to-identifier translations.

Continuing the \scheme{plus} module example, the module transformer
rewrites
\begin{schemedisplay}
(provide plus1)
\end{schemedisplay}
into the following indirection definition and renamed-provide clause:
\begin{schemedisplay}
(define-syntax export-plus1
  (if ELIDED ;; Will it be used in a typed context?
      (make-rename-transformer #'plus1)
      (make-rename-transformer #'defensive-plus1)))
(provide (rename-out [export-plus1 plus1]))
\end{schemedisplay}
The indirection definitions depend on some way of determining whether
the context they are imported into is typed or untyped. The context
that matters is the main module currently being compiled. If the require
chain includes intervening modules, they have already been compiled,
and references within the compiled modules are already resolved to the
right version of the exports. Thus, the problem boils down to
determining whether the main module currently being compiled is a typed
module.

The property that distinguishes a typed module is that it specifies
\scheme{typed-scheme} as its language module, and thus its module body
is under the control of the typed module transformer. Given that, it
is critical to understand the exact order of events in the compilation
process:
\begin{enumerate}
\item
The compiler invokes the initial language module's compile-time
part.\footnote{Although this invocation occurs prior to any
  compilation of a typed module, it cannot be used to determine
  whether compilation is occurring in a typed context, since the
  Typed Scheme module can be required from untyped as well as typed
  modules. }
\item
Then, it executes the initial language module's module transformer on
the body of the module being compiled.
\item
As the compiler encounters \scheme{require}s in the module's body, it
invokes the compile-time parts of the relevant modules.
\end{enumerate}
In particular, the execution of the module transformer precedes the
execution of any of the indirection definitions in compiled typed
modules. The Typed Scheme module transformer can therefore set a flag
indicating that the module being compiled is a typed module, and the
indirection definitions can simply check the value of the flag.
%
Figure~\ref{fig:new-ts-mod} presents the modified \scheme{typed-scheme} module.

The \scheme{type-check} module also adds \scheme{(require context)} so
that the indirection definitions it inserts can refer to
\scheme{typed-context?}.

The following program illustrates how the flag works. We add an untyped
\scheme{main} module to the \scheme{one} and \scheme{plus} modules
from our earlier examples.
\begin{schemedisplay}
langts ;; one
(provide one)
(: one Number)
(define one 1)

langts ;; plus
(require one)
(provide plus1)
(: plus1 (Number -> Number))
(define (plus1 x)
  (+ x one))

langs ;; main
(require plus)
(display (plus1 41)) (newline)
\end{schemedisplay}
The compiler processes the typed \scheme{one} module first, creating
the context-dependent indirection definition for the exported variable
\scheme{one}.
%
When the compiler encounters the typed \scheme{plus} module,
it first invokes the compile-time part of \scheme{typed-scheme}. That,
in turn, causes the invocation of the \scheme{context} module,
which creates a new \scheme{typed-context?} box initialized to
false. Executing the Typed Scheme \scheme{HPmodule-begin} macro sets
the value in the \scheme{typed-context?} box to true. Subsequently,
when the compiler encounters the \scheme{(require one)} form in the
module body, it invokes \scheme{one}'s compile-time part. Since the
\scheme{typed-context?} box contains true, the indirections are
set to the typed variants, and the compiler resolves uses of the
imported names to the unwrapped definitions.

The compilation of the \scheme{main} module proceeds differently. When
the compiler encounters the \scheme{(require plus)} form, it invokes
\scheme{plus}'s compile-time part, which invokes
\scheme{typed-scheme}'s compile-time part and invokes
\scheme{context}. This creates a fresh \scheme{typed-context?} box
initialized to false, just as before. The box's value is never changed
to true, however, because Typed Scheme's \scheme{HPmodule-begin} macro
is not used in the expansion of the \scheme{main} module. Consequently,
when \scheme{plus}'s indirection definitions are executed, they point
to the contract-wrapped variants, and the occurrence of \scheme{plus1}
in the \scheme{main} module refers to a version wrapped in code that
verifies the type of its argument.
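
The untyped half of this walkthrough can be approximated in a single
file. The following is only a sketch under several assumptions: it
targets a current Racket installation, it uses submodules in place of
separately compiled modules, it replaces the rename transformers with
an ordinary macro that consults the flag at each use, and it writes
the contract by hand instead of calling \scheme{type->contract}. The
blame labels \scheme{'plus} and \scheme{'untyped-client} are
hypothetical. Because nothing in the sketch plays the role of the typed
module transformer, the flag stays false and only the defensive path is
exercised.
\begin{schemedisplay}
#lang racket

;; context: the compile-time flag, initially false.
(module context racket/base
  (provide typed-context?)
  (define typed-context? (box #f)))

;; plus: stands in for the output of the typed module transformer.
(module plus racket
  (require (for-syntax (submod ".." context)))
  (provide (rename-out [export-plus1 plus1]))
  (define (plus1 n) (+ n 1))
  ;; Hand-written counterpart (an assumption) of
  ;; (type->contract (Number -> Number))
  (define defensive-plus1
    (contract (-> number? number?) plus1 'plus 'untyped-client))
  ;; Macro-based indirection: pick the target at expansion time.
  (define-syntax (export-plus1 stx)
    (define target
      (if (unbox typed-context?) #'plus1 #'defensive-plus1))
    (syntax-case stx ()
      [(_ arg ...) #`(#,target arg ...)]
      [_ target])))

;; An untyped client: the flag is never set, so plus1 here
;; should resolve to the contract-wrapped variant.
(require 'plus)

(plus1 41)
\end{schemedisplay}
With the defensive variant in place, a call such as
\scheme{(plus1 "x")} should be rejected by the contract, with blame
assigned to the client, before the string can reach \scheme{+}.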
\end{schemeregion}