From 318d3e0a9fae52b4f4244a109bd948a37896cb51 Mon Sep 17 00:00:00 2001 From: Matthew Flatt Date: Mon, 23 Nov 2020 08:11:55 -0700 Subject: [PATCH] Chez Scheme: implementation notes on adding functions Also, explain primitives and entries, which helps make sense of definitions like (define set-car! (lambda (p v) (#2%set-car! p v))) --- racket/src/ChezScheme/IMPLEMENTATION.md | 149 +++++++++++++++++++++++- 1 file changed, 144 insertions(+), 5 deletions(-) diff --git a/racket/src/ChezScheme/IMPLEMENTATION.md b/racket/src/ChezScheme/IMPLEMENTATION.md index 9c6043c8f2..fe4110bae4 100644 --- a/racket/src/ChezScheme/IMPLEMENTATION.md +++ b/racket/src/ChezScheme/IMPLEMENTATION.md @@ -64,10 +64,15 @@ that particular machine. This is the script that "configure" runs when configuring for doing the build, but you can also run the "workarea" script on your own, supplying the machine type you'd like to build. -If you have a working Chez Scheme build and you want to cross-compile -to generate *machine-type* boot and header files, the easiest approach -is `make` *machine-type*`.boot`. The output is written to the -"boot/*machine-type*" directory. +Bootstrap from scratch by running the Racket program +"rktboot/main.rkt", which should work even with a relatively old +version of Racket. Output is written directly to a "boot" +subdirectory. + +If you have a working Chez Scheme build for the current sources and +you want to cross-compile to generate *machine-type* boot and header +files, the fastest approach is `make` *machine-type*`.boot`. The +output is written to the "boot/*machine-type*" directory. # Porting to a New Platform @@ -98,6 +103,44 @@ similar supported oerating system, but building a new backend for a new processor requires much more understanding of the compiler and runtime system. +# Adding Functionality + +If new functionality can be implemented in terms of existing +functionality, then you don't have to understand too much about the +compiler internals. Just write Scheme code, and put it in a reasonable +existing "s/*...*.ss" file. + +The main catch is that all bindings have to be declared in +"primdata.ss". The declarations are organized by exporting library and +functions versus non-functions. Unless you're also changing a +standard, your addition will go in one of the sets that is declared +with `[libraries]`, meaning that the binding is in the `chezscheme` +library. + +When a helper function needs to be a different source file than the +place where it's used, so it can't just be locally defined, prefix the +helper function name with `$` and register it in the `[flags system +proc]` group in "primdata.ss". + +There's usually not much of a bootstrapping problem with new bindings, +since you can add declarations in "primdata.ss" and implement them any +time afterward. If you get into a bad state, however, you can always +bootstrap from scratch using "rktboot/main.rkt". In the rare case that +your new functionality is needed to compile Chez Scheme itself, you'll +have to implement a copy of the functionality (or enough of it) in +"rktboot". + +Take care to implement and new functionality as safe, which means +checking arguments fully. Keep in mind that your implementation tself +will be compiled as unsafe. While testing and debugging your +additions, however, you'll probably want to use `make o=0` in the +"*machine-type*/s" workarea space, which compiles in safe mode. + +Tests go in "mats/*...*.ms". In "*machine-type*/mats", you can use +`make 7.mo` to build and run `7.ms`. Remove `7.mo` to re-run without +changing `7.ms`. Makefile variables like `o` control the way tests +are run; for example, use `make o=3 7.mo` to test in unsafe mode. + # Scheme Objects A Scheme object is represented at run time by a pointer. The low bits @@ -283,6 +326,91 @@ So, when you disassemble code generated by the Chez Scheme compiler, you may see garbage instructions mingled with the well-formed instructions, but the garbage will always be jumped over. +# Primitives, Library Entries, and C Entries + +Chez Scheme functions are mostly implemented in Chez Scheme, but some +primitives are hand-coded within the compiler, and some primitives are +implemented or supported by C code in the kernel. + +For example, the definition of `set-car!` is in "prims.ss" is + +```scheme +(define set-car! + (lambda (p v) + (#2%set-car! p v))) +``` + +This turns out not to be a circular definition, because the compiler +recogizes an immediate application of the `set-car!` primitive and +inlines its implementation. The `#2%` prefix instructs the compiler to +inline the safe implementation of `set-car!`, which checks whether its +first argument is a pair. Look for `define-inline 2 set-car!` in +"cpnanopass.ss" for that part of the compiler. The content of +"prims.ss" is compiled in unsafe mode, so that's why safe mode needs +to be selected explicitly when needed. + +What if the argument to `set-car!` is not a pair? The implementation +of inline `set-car!` in "cpnanopass.ss" includes + +```scheme +(build-libcall #t src sexpr set-car! e-pair e-new) +``` + +which calls a `set-car!` *library function*. That's defined in +"library.ss" by + +```scheme +(define-library-entry (set-car! x y) (pair-oops 'set-car! x)) +``` + +That is, the `set-car!` library function always reports an error, +because that's the only reason the library function is called. Some +other library functions implement the slow path for functions where +the compiler inlines only a fast path. + +Every library function has to be declared in "cmacros.ss" in the +`declare-library-entries` form. That form declares a vector of +*library entries*, which the linker uses to replace an address stub +(as inserted into machine code via `build-libcall`) with the run-time +address of the library function. The vector is filled in by loading +"library.ss". Since some library functions can refer to others, the +order is important; the linker encouters the forms of "library.ss" one +at a time, and a library entry must be registered before it is +referenced. + +Some functions or other pointers, such as the thread-context mutex, +are created by the kernel in C. Those pointers are stored in an array +of *C entries* that is used by the linker. The kernel registers C +entries with `S_install_c_entry` in "prim.c". Machine code that refer +to a C entry is generated in the compiler with `(make-info-literal #f +'entry (lookup-c-entry ....) ....)`. All C entries are also declared +in "cmacros.ss" with `declare-c-entries`. + +Adding a new library entry or C entry shifts indices that are +generated by the Scheme compiler. If you change the set of entries, +it's usually easiest to re-bootstrap from scratch using +"rktboot/main.rkt". To avoid confusion, be sure to change the version +number first (see "Changing the Version Number" below). + +Some primitives are implemented directly in the compiler but should +not be inlined. Those functions are implemented with a `$hand-coded` +form. For example, `list` is implemented in "prims.ss" as + +```scheme +(define list ($hand-coded 'list-procedure)) +``` + +Look for `list-procedure` in "cpnanopass.ss" to see the +implementation. + +Finally, some primitives in "prims.ss" are implemented in the kernel +and simply accessed with `foreign-procedure`. Other parts of the +implementation also use `foreign-procedure` instead of having a +defintion in "prims.ss". + +If you're looking for math primitives, see "mathprims.ss" instead of +"prims.ss". + # Compilation Pipeline Compilation @@ -842,7 +970,7 @@ handling in "compile.ss", and the update routine in "fasl.c". Support for foreign procedures and callables in Chez Scheme boils down to foriegn calls and callable stubs for the backend. A backend's -`asm-foreign-call` and `asm-forieng-callbable` function receives an +`asm-foreign-call` and `asm-foreign-callbable` function receives an `info-foreign` record, which describes the argument and result types in relatively primitive forms: @@ -950,3 +1078,14 @@ The `asm-foreign-callable` function returns 4 values: Generate the code for a C return, including any teardown needed to balance `c-init`. + +# Changing the Version Number + +To change the version number: + + * Edit the `version` definition in "cmacro.ss" + + * Edit the `Version` macro in "makefiles/Mf-install" + +After changing the version number, re-bootstrap from scratch using +"rktboot/main.rkt".