update and expand IMPLEMENTATION.md

Incorporate text and explanation from Andy Keep at https://groups.google.com/d/msg/chez-scheme/dz6nn-8KDQE/FUaPu695BAAJ

original commit: 5b8a00fc3ef9b892de9af1ae05352fa204e72270
2020-07-24 14:30:43 -06:00
commit 56049bcd47
parent f78dc5724e
1 changed files with 183 additions and 54 deletions
--- a/IMPLEMENTATION.md
+++ b/IMPLEMENTATION.md
@@ -1,7 +1,11 @@
 # Getting Started

-Most of the Chez Scheme implementation is in the "s" directory. The
-C-implemented kernel is in the "c" directory.
+The majority of the Chez Scheme compiler and libraries are implemented
+in Scheme and can be found in the "s" (for Scheme) subdirectory. The
+run-time kernel (including the garbage collector, support for
+interacting with the operating system, and some of the more
+complicated math library support) is implemented in C and can be
+found in the "c" directory.

 Some key files in "s":

@@ -21,6 +25,77 @@ Some key files in "s":
  provides platform-specific constants that feed into "cmacros.ss" and
   selects the backend used by "cpnanopass.ss"

+Chez Scheme is a bootstrapped compiler, meaning you need a Chez Scheme
+compiler to build a Chez Scheme compiler. The compiler and makefiles
+support cross-compilation, so you can work from an already supported
+host to cross-compile the boot files and produce the header files for
+a new platform. In particular, the `pb` (portable bytecode) machine
+type can run on any supported hardware and operating system, so having
+`pb` boot files is one way to get started in a new environment.
+
+# Build System
+
+Chez Scheme assigns a `machine-type` name to each platform it runs on.
+The `machine-type` name carries three pieces of information:
+
+ * *whether the system is threaded*: A `t` indicates that it is, and an
+    absence indicates that it's not threaded;
+
+ * *the hardware platform*: `i3` for x86, `a6` for x86_64, `arm32` for
+   AArch32, `arm64` for AArch64, and `ppc32` for 32-bit PowerPC; and
+
+ * *the operating system*: `le` for Linux, `nt` for Windows, `osx` for
+   Mac OS, etc.
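
The naming scheme above can be decoded mechanically. The following Python sketch is only an illustration and is not part of the Chez Scheme build: `parse_machine_type` and its two tables are hypothetical names, and the tables cover only the codes listed above (so a name such as `pb` is rejected).

```python
# Hypothetical lookup tables, limited to the codes named in the text above.
HARDWARE = {
    "i3": "x86",
    "a6": "x86_64",
    "arm32": "AArch32",
    "arm64": "AArch64",
    "ppc32": "32-bit PowerPC",
}
OPERATING_SYSTEMS = {"le": "Linux", "nt": "Windows", "osx": "Mac OS"}


def parse_machine_type(name):
    """Split a machine-type name into threading, hardware, and OS parts."""
    # A leading "t" marks a threaded build; none of the hardware codes
    # in the table starts with "t", so the prefix is unambiguous here.
    threaded = name.startswith("t")
    rest = name[1:] if threaded else name
    for hw_code, hardware in HARDWARE.items():
        if rest.startswith(hw_code):
            os_code = rest[len(hw_code):]
            if os_code in OPERATING_SYSTEMS:
                return {
                    "threaded": threaded,
                    "hardware": hardware,
                    "os": OPERATING_SYSTEMS[os_code],
                }
    raise ValueError(f"unrecognized machine type: {name!r}")
```

For example, `parse_machine_type("ta6le")` reports a threaded build for x86_64 on Linux, while `parse_machine_type("i3nt")` reports a non-threaded build for x86 on Windows.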
+
+When you run "configure", it looks for boot and header files in the
+directory "boot/*machine-type*". (If it doesn't find them, then
+configuration cannot continue.)
+
+The supported machine types are listed in "cmacros.ss" and reflected
+by a "boot/*machine-type*" directory for boot and header files, a
+"s/*machine-type*.def" file to describe the platform, a
+"s/Mf-*machine-type*" makefile to select relevant files in "s", a
+"c/Mf-*machine-type*" makefile for configuration in "c", and a
+"mats/Mf-*machine-type*" makefile to configure testing.
+
+The "workarea" script in the root of the Chez Scheme project is used
+to generate a subdirectory with the appropriate contents to build for
+that particular machine. This is the script that "configure" runs when
+configuring for doing the build, but you can also run the "workarea"
+script on your own, supplying the machine type you'd like to build.
+
+If you have a working Chez Scheme build and you want to cross-compile
+to generate *machine-type* boot and header files, the easiest approach
+is `make` *machine-type*`.boot`. The output is written to the
+"boot/*machine-type*" directory.
+
+# Porting to a New Platform
+
+Porting to a new system requires both getting the C run time compiled
+on the new platform and updating the Scheme compiler to generate
+machine code for the platform. There are several places where the C
+kernel and code generated by the compiler need to work in harmony in
+order to get the system to run. For instance, the C kernel needs to
+know the type tags, sizes, and field offsets into Scheme objects, so
+that the garbage collector in the C kernel can do its job. This is
+handled by having the Scheme compiler generate a couple of C headers,
+"scheme.h" and "equates.h", that contain the information about the
+Scheme compiler that the C kernel needs to do its job.
+
+Most of the work of porting to a new platform is producing a new
+"*machine-type*.def" file, which (except in simple ports to a new
+operating system) will require a new "*isa*.ss" compiler backend.
+You'll also have to set up all the "Mf-*machine-type*" makefiles and
+update "configure", "cmacros.ss", and "version.h" (plus maybe other
+files). Once you have all of the pieces working together, you
+cross-compile boot files, then copy them over to the new machine
+to start compiling there.
+
+You can port to a new operating system by imitating the files of a
+similar supported operating system, but building a new backend for a
+new processor requires much more understanding of the compiler and
+runtime system.
+
 # Scheme Objects

 A Scheme object is represented at run time by a pointer. The low bits
@@ -30,7 +105,11 @@ additional tag word to further refine the pointer-tag type.

 See also:

-> *Don't Stop the BiBOP: Flexible and Efficient Storage Management for Dynamically Typed Languages.* by R. Kent Dybvig, David Eby, and Carl Bruggeman, Indiana University TR #400, 1994.
+> *Don't Stop the BiBOP: Flexible and Efficient Storage Management for
+> Dynamically Typed Languages*
+> by R. Kent Dybvig, David Eby, and Carl Bruggeman,
+> Indiana University TR #400, 1994.
+> [PDF](http://www.cs.indiana.edu/ftp/techreports/TR400.pdf)

 For example, if "cmacros.ss" says

@@ -57,8 +136,8 @@ of a Scheme record, that first word will be a record-type descriptor
 as a record. The base record type, `#!base-rtd`, has itself as its
 record type. Since the type bits are all ones, on a 64-bit machine,
 every object tagged with an additional type word will end in "F" in
-hexadecimal, and adding 1 to the pointer produces the address
-containing the record content (which starts with the rrecord type, so
+hexadecimal, and adding 1 to the pointer produces the address
+containing the record content (which starts with the record type, so
 add 9 instead to get to the first field in the record).
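
The pointer arithmetic in this paragraph can be sketched as follows. This Python illustration assumes a 64-bit machine with 8-byte words and a three-bit, all-ones tag for typed objects; the constant names, the function names, and the sample pointer value are all hypothetical and are not taken from the Chez Scheme sources.

```python
# Assumed layout: three low tag bits, all ones for a typed object,
# and 8-byte words on a 64-bit machine.
TAG_MASK = 0b111
TYPED_OBJECT_TAG = 0b111
WORD_SIZE = 8


def is_typed_object(ptr):
    """Check whether the low tag bits mark a typed object."""
    return ptr & TAG_MASK == TYPED_OBJECT_TAG


def record_type_address(ptr):
    """Adding 1 reaches the first word, which holds the record type."""
    return ptr + 1


def first_field_address(ptr):
    """Adding 9 skips the record-type word and reaches the first field."""
    return ptr + 1 + WORD_SIZE


sample = 0x7F00004F  # made-up tagged pointer ending in "F"
```

With the sample value, `record_type_address(sample)` is `0x7F000050` and `first_field_address(sample)` is `0x7F000058`.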

 As another example, a vector is represented as `type-typed-object`
@@ -98,8 +177,15 @@ and continuation operations are handled as needed at the boundaries.

 See also:
 
-> *Representing Control in the Presence of First-Class Continuations* by Robert Hieb, R. Kent Dybvig, and Carl Bruggeman, Programming Language Design and Implementation, 1990.
-> *Compiler and Runtime Support for Continuation Marks* by Matthew Flatt and R. Kent Dybvig, Programming Language Design and Implementation, 2020.
+> *Representing Control in the Presence of First-Class Continuations*
+> by Robert Hieb, R. Kent Dybvig, and Carl Bruggeman,
+> Programming Language Design and Implementation, 1990.
+> [PDF](https://legacy.cs.indiana.edu/~dyb/pubs/stack.pdf)
+
+> *Compiler and Runtime Support for Continuation Marks*
+> by Matthew Flatt and R. Kent Dybvig,
+> Programming Language Design and Implementation, 2020.
+> [PDF](https://www.cs.utah.edu/plt/publications/pldi20-fd.pdf)

 To the degree that the runtime system needs global state, that state
 is in the thread context (so, it's thread-local), which we'll
@@ -215,10 +301,18 @@ Compilation
   involves many individual passes that convert through many different
   intermediate forms (see "np-language.ss").

+It's worth noting that Chez Scheme produces machine code directly,
+instead of relying on a system-provided assembler. Chez Scheme also
+implements its own linker to connect compiled code to runtime kernel
+facilities and shared symbols.
+ 
 See also:

-> *Nanopass compiler infrastructure* by Dipanwita Sarkar, Indiana University PhD dissertation, 2008
-> *A Nanopass Framework for Commercial Compiler Development* by Andrew W. Keep, Indiana University PhD dissertation, 2013
+> *Nanopass compiler infrastructure* by Dipanwita Sarkar,
+> Indiana University PhD dissertation, 2008.
+
+> *A Nanopass Framework for Commercial Compiler Development*
+> by Andrew W. Keep, Indiana University PhD dissertation, 2013.

 Note that the core macro expander always converts its input to the
 `Lsrc` intermediate form. That intermediate form can be converted back
@@ -259,19 +353,19 @@ Each `<reg>` has the form
    [<name> ... <preserved? / callee-saved?> <num> <type>]
 ```

- * The <name>s in one <reg> will all refer to the same register, and
-   the first <name> is used as the canonical name. By convention, each
-   <name> starts with `%`. The compiler gives specific meaning to a
+ * The `<name>`s in one `<reg>` will all refer to the same register, and
+   the first `<name>` is used as the canonical name. By convention, each
+   `<name>` starts with `%`. The compiler gives specific meaning to a
   few names listed below, and a backend can use any names otherwise.

 * The information on preserved (i.e., callee-saved) registers helps
   the compiler save registers as needed before some C interactions.

- * The <num> value is for the private use of the backend. Typically,
+ * The `<num>` value is for the private use of the backend. Typically,
   it corresponds to the register's representation within machine
   instructions.

- * The <type> is either 'uptr or 'fp, indicating whether the register
+ * The `<type>` is either `'uptr` or `'fp`, indicating whether the register
   holds a pointer/integer value (i.e., an unsigned integer that is
   the same size as a pointer) or a floating-point value. For
   `allocatable` registers, the different types of registers represent
@@ -297,37 +391,47 @@ category are automatically saved as needed for C interactions.
 The main recognized register names, roughly in order of usefulness as
 real machine registers:

- %tc - the first reserved register, must be mapped as reserved
- %sfp - the second reserved register, must be mapped as reserved
- %ap - allocation pointer (for fast bump allocation)
- %trap - counter for when to check signals, including GC signal
+ * `%tc` - the first reserved register, must be mapped as reserved

- %eap - end of bump-allocatable region
- %esp - end of current stack segment
+ * `%sfp` - the second reserved register, must be mapped as reserved

- %cp - used for a procedure about to be called
- %ac0 - used for argument count and call results
+ * `%ap` - allocation pointer (for fast bump allocation)

- %ac1 - various scratch and communication purposes
- %xp  - ditto
- %yp  - ditto
+ * `%trap` - counter for when to check signals, including GC signal
+
+
+ * `%eap` - end of bump-allocatable region
+
+ * `%esp` - end of current stack segment
+
+
+ * `%cp` - used for a procedure about to be called
+
+ * `%ac0` - used for argument count and call results
+
+
+ * `%ac1` - various scratch and communication purposes
+
+ * `%xp`  - ditto
+
+ * `%yp`  - ditto

 Each of the registers maps to a slot in the TC, so they are sometimes
 used to communicate between compiled code and the C-implemented
 kernel. For example, `S_call_help` expects the function to be called
-in AC1 with the argument count in AC0 (as usual).
+in AC1 with the argument count in AC0 (as usual). If a recognized name
+is not mapped to a register, it exists only as a TC slot.

 A few more names are recognized to direct the compiler in different
 ways:

- %ret - use a return register insteda of just SFP[0]
+ * `%ret` - use a return register instead of just SFP[0]

- %reify1, %reify2 - a kind of manual allocation of registers for
+ * `%reify1`, `%reify2` - a kind of manual allocation of registers for
                          certain hand-coded routines, which otherwise could
                           run out of registers to use

-Variables and Register Allocation
---------------------------------
+# Variables and Register Allocation

 A variable in Scheme code can be allocated either to a register or to
 a location in the stack frame, and the same goes for temporaries that
@@ -486,7 +590,7 @@ register plus an offset instead of two registers, because the offset
 is too big, because the offset does not have a required alignment, and
 so on.

-# Instruction Selection: Compiler <-> Backend
+# Instruction Selection: Compiler to Backend

 For each primitive that the compiler will reference via `inline`,
 there must be a `declare-primitive` in "np-language.ss". Each
@@ -509,7 +613,7 @@ binds `%logand`. The `(%inline name ,arg ...)` macro expands to
 `(inline ,null-info ,%name ,arg ...)` macro, so that's why you don't
 usually see the `%` written out.

-The backend implementation of a prrimitive is a function that takes as
+The backend implementation of a primitive is a function that takes as
 many arguments as the `inline` form, plus an additional initial
 argument for the destination in the case of a `value` primitive on the
 right-hand side of a `set!`. The result of the primitive function is a
@@ -575,9 +679,9 @@ see "Foreign Function ABI" below.

 To summarize, the interface between the compiler and backend is:

- primitive : L15c.Triv ... -> (listof L15d.Effect)
+ * `primitive : L15c.Triv ... -> (listof L15d.Effect)`

- instruction : (listof code) L16.Triv ... -> (listof code)
+ * `instruction : (listof code) L16.Triv ... -> (listof code)`

 A `code` is mostly bytes to be emitted, but it also contains
 relocation entries and human-readable forms that are printed when
@@ -617,13 +721,16 @@ registers or a register and an immediate, but the immediate value has
 to be representable with a funky encoding. The pattern forms above
 require that the destination is always a register/variable, and either
 of the arguments can be a literal that fits into the funky encoding or
-a register/variable. The `define-instruction` macro is itself
-implemented in "arm64.ss", so it can support specialized patterns like
-`funkymask`.
+a register/variable. The `define-instruction` macro is parameterized
+over patterns like `funkymask` via `coercible?` and `coerce-opnd`
+macros, so a backend like "arm64.ss" can support specialized patterns
+like `funkymask`.

 If a call to this `%logand` function is triggered by a form

+```scheme
  `(set! ,info (mref ,var1 ,%zero 8) ,var2 ,7)
+```

 then the code generated by `define-instruction` will notice that the
 first argument is not a register/variable, while 7 does encode as a
@@ -651,10 +758,10 @@ would have to generate an `add` into a second temporary variable.
 Otherwise, `asm-move` would not be able to deal with the generated
 `set!` to move `u` into the destination. The implementation of
 `define-instruction` uses a `mem->mem` helper function to simplify
-`mref`s. In the "arm32.ss" backend, there's an additional `fpmem`
-pattern and `fpmem->fpmem` helper, because the constraints on memory
-references for floating-point operations are different than than the
-constraints on memory references to load an integer/pointer.
+`mref`s. There's an additional `fpmem` pattern and `fpmem->fpmem`
+helper, because the constraints on memory references for
+floating-point operations can be different than the constraints
+on memory references to load an integer/pointer (e.g., on "arm32.ss").

 Note that `%logand` generates a use of the same `(asm-logand #f)`
 instruction for the register--register and the register--immediate
@@ -710,6 +817,25 @@ human-readable addition.
 All of that could be done with just plain functions, but the macros
 help with boilerplate and arrange some helpful compile-time checking.

+# Linking
+
+Besides actual machine code in the output of the assembly step,
+machine-specific linking directives can appear. In the case of
+"arm32.ss", the linking options are `arm32-abs` (load an absolute
+address), `arm32-call` (call an absolute address while setting the link
+register), and `arm32-jump` (jump to an absolute address). These are
+turned into relocation entries associated with compiled code by steps
+in "compile.ss". Relocation entries are used when loading and GCing, with
+update routines implemented in "fasl.c".
+
+Typically, a linking directive is written just after some code that is
+generated to install a dummy value, and then the update routine in
+"fasl.c" writes the non-dummy value when it becomes available later.
+Each linking directive must be handled in "compile.ss", and it must
+know the position and size of the code (relative to the directive) to
+be updated. Overall, there's a close conspiracy among the backend, the
+handling in "compile.ss", and the update routine in "fasl.c".
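
The dummy-value-then-update pattern described above can be sketched in a few lines of Python. This is a generic illustration of relocation patching under stated assumptions, not the actual logic of "compile.ss" or "fasl.c": the function names, the 8-byte little-endian placeholder, and the `(offset, symbol)` entry shape are all hypothetical.

```python
import struct


def emit_with_placeholder(code, relocations, symbol):
    """Append an 8-byte dummy value and record where it was written."""
    # The relocation entry remembers the position to patch and what
    # the final value should refer to.
    relocations.append((len(code), symbol))
    code.extend(b"\x00" * 8)


def apply_relocations(code, relocations, addresses):
    """Overwrite each dummy value once the real address is known."""
    for offset, symbol in relocations:
        # "<Q" packs an unsigned 64-bit little-endian integer in place.
        struct.pack_into("<Q", code, offset, addresses[symbol])
```

A caller would build up a `bytearray` of code, call `emit_with_placeholder` wherever an address is not yet known, and later call `apply_relocations` with a table mapping each symbol to its address. The key design point is the same as in the text: the emitter and the updater must agree on the position and size of the bytes to be rewritten.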
+
 # Foreign Function ABI

 Support for foreign procedures and callables in Chez Scheme boils down
@@ -736,15 +862,18 @@ duplicated in the result (matching the C view) and an argument
 neither the C nor Scheme view, but either view can be reconstructed.

 The compiler creates wrappers to take care of further conversion
-to/from these primitive shapes.
+to/from these primitive shapes. You can safely ignore the
+foreign-callable support, at first, when porting to a new platform,
+but foreign-callable support is needed for generated code to access
+runtime kernel functionality.

 The `asm-foreign-call` function returns 5 values:

- * allocate : -> L13.Effect
+ * `allocate : -> L13.Effect`

   Any needed setup, such as allocating C stack space for arguments.

- * c-args : (listof (uvar/reg -> L13.Effect))
+ * `c-args : (listof (uvar/reg -> L13.Effect))`

   Generate code to convert each argument. The generated code will be
   in reverse order, with the first argument last, because that tends
@@ -762,14 +891,14 @@ The `asm-foreign-call` function returns 5 values:
     - integer or pointer: a 'uptr-typed variable that has the integer
     - "&": a 'uptr-typed variable that has a pointer to the argument

- * c-call : uvar/reg boolean -> L13.Effect
+ * `c-call : uvar/reg boolean -> L13.Effect`

   Generate code to call the C function whose address is in the given
   register. The boolean is #t if the call can assume that the C
   function is not a varargs function on platforms where varargs
   support is the default.

- * c-result : uvar/reg -> L13.Effect
+ * `c-result : uvar/reg -> L13.Effect`

   Similar to the conversions in `c-args`, but for the result, so the
   given argument is a destination variable. This function will not be
@@ -777,19 +906,19 @@ The `asm-foreign-call` function returns 5 values:
   floating-point value, the provided destination variable has type
   'fp.

- * allocate : -> L13.Effect
+ * `allocate : -> L13.Effect`

   Any needed teardown, such as deallocating C stack space.

 The `asm-foreign-callable` function returns 4 values:

- * c-init : -> L13.Effect
+ * `c-init : -> L13.Effect`

   Anything that needs to be done just before transitioning into
   Scheme, such as saving preserved registers that can be used within
   the callable stub.

- * c-args : (listof (uvar/reg -> L13.Effect))
+ * `c-args : (listof (uvar/reg -> L13.Effect))`

   Similar to the `asm-foreign-call` result case, but each function
   should fill a destination variable from platform-specific argument
@@ -807,7 +936,7 @@ The `asm-foreign-callable` function returns 4 values:
     - integer or pointer: a 'uptr-typed variable to receive the value
     - "&": a 'uptr-typed variable to receive the pointer

- * c-result : (uvar/reg -> L13.Effect) or (-> L13.Effect)
+ * `c-result : (uvar/reg -> L13.Effect) or (-> L13.Effect)`

   Similar to the `asm-foreign-call` argument cases, but for a
   floating-point result, the given result register holds a pointer to a
@@ -815,7 +944,7 @@ The `asm-foreign-callable` function returns 4 values:
   `c-result` takes no argument (because the destination pointer was
   already produced or there's no result).

- * c-return : (-> L13.Effect)
+ * `c-return : (-> L13.Effect)`

   Generate the code for a C return, including any teardown needed to
   balance `c-init`.