Commit Graph

264 Commits

Author SHA1 Message Date
Matthew Flatt
f78dc5724e add pb (portable bytecode) backend
This commit does four things:

 * Adds "pb.ss" and "pb.c", which implement a portable bytecode
   backend and interpreter that is intended for bootstrapping. A
   single set of pb bootfiles can support bootstrapping on all
   platforms --- as long as the C compiler supports a 64-bit integer
   type. The pb machine supports foreign calls for only a small set of
   recognized prototypes, and it does not support foreign callables.
   Use `./configure --pb` to build the pb variant.

 * Changes the kernel's casts between `ptr` and `void*` types. In a pb
   build, the `ptr` type can be a 64-bit integer type while `void*` is
   a 32-bit pointer type, so casts must go through an intermediate
   integer type (see the sketch after this list).

 * Adjusts the compiler to accommodate run-time-determined endianness.
   Making the compiler agnostic to word size is not practical, but
   only a few pieces depend on the target machine's endianness, and
   those can generally be deferred to a run-time choice of byte-based
   operations. The one exception is that ftype bit fields are not
   allowed unless accompanied by an explicit endianness declaration.

 * Starts reducing duplication among platform-specific makefiles. For
   example, `Mf-ta6osx` chains to `Mf-a6osx` to avoid repeating most
   of it. A lot more can be done here.
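
A minimal sketch of the cast pattern from the second bullet, assuming
hypothetical helper names rather than the kernel's actual macros:

    #include <stdint.h>

    typedef uint64_t ptr;   /* pb: `ptr` is a 64-bit integer, not a pointer */

    /* Go through an intermediate integer type (uintptr_t) so a 64-bit `ptr`
       and a 32-bit `void*` are never cast into each other directly. */
    static inline void *ptr_to_voidp(ptr p) { return (void *)(uintptr_t)p; }
    static inline ptr voidp_to_ptr(void *p) { return (ptr)(uintptr_t)p; }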

original commit: 97533fa9d8b8400b0dc1a890768c7d30c91257e0
2020-07-24 13:13:46 -06:00
Matthew Flatt
b8c1ce63c6 add option to omit RTD descriptions in fasl output
original commit: 294ca9da084d76aa7b649059856066a1f86fe21b
2020-07-14 20:22:59 -06:00
Matthew Flatt
ec05bac0cf add "externals" fasl support, allow non-strings in sfd
"Externals" supports fasling with some values lifted out an provided
separately.

Lifting the restriction that source file descriptor paths must be
strings means that paths can be represented in a different way, and
they can be fasled through a means other than the built-in
encodings.

original commit: b6b0ae67b08f2e9bc8b7fafe5ebad0375b6ce9db
2020-07-14 20:22:59 -06:00
Matthew Flatt
fd3b903c1c sync with https://github.com/cisco/ChezScheme on fasl compression
Merge changes in the way that fasl streams are compressed. The new
approach makes compression explicit in the fasl representation, which
means that tricks like using zcat on a fasl file will no longer work
(at least not efficiently).

original commit: 167ac7294a2dc400821e4336f0cfc4de621efe97
2020-07-12 19:07:05 -06:00
Matthew Flatt
b2f74f014e add AArch64 (aka Arm64) support as tarm64le
original commit: 9964f27f64cc743fd1dbff7418fce940a4291b01
2020-07-09 06:32:41 -06:00
Matthew Flatt
bdd1eaa874 add tarm32le
Besides adding support for `__collect-safe` and other repairs,
introduce a write-write fence with the write barrier, which is
intended to avoid one thread using an object created in another thread
before the object's initializing writes are visible.
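
A rough illustration of the write-write ordering idea in portable C11; this
is not the code the commit adds, just the shape of the technique:

    #include <stdatomic.h>
    #include <stddef.h>

    typedef struct { int field; } obj;
    static obj the_obj;
    static obj *_Atomic shared = NULL;

    /* Finish the initializing writes, fence, then publish, so a consumer
       that sees the pointer also sees the initialized contents. */
    void publish(void) {
      the_obj.field = 42;                         /* initializing write */
      atomic_thread_fence(memory_order_release);  /* write-write fence  */
      atomic_store_explicit(&shared, &the_obj, memory_order_relaxed);
    }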

original commit: 543bd16739c08e5a8f88c470b52db0f23a27d260
2020-06-29 05:55:47 -06:00
Matthew Flatt
9bdc112b4d ppc32: fix icache flush
original commit: d9bf4ebbc5fe32a1d3d35ba096a54e7b78d1f33c
2020-06-22 17:35:47 -06:00
Matthew Flatt
257a29216e update for ppc32
Besides updating for unboxed floating point, the ppc32 build uses a
return register, and the continuation-attachments implementation was
not right for that mode.

original commit: dd2d01fb26ace819c73f258b9b53739f9dda1d34
2020-06-20 07:36:02 -06:00
Matthew Flatt
d1f20019ae unbox more flonum operations
Flonum operations like `fltruncate` and `flsin` are implemented by
calling functions from the C library. Unboxing these involves a
generalization of the `foreign-call` intermediate form to handle unboxing
and to work in a non-tail position (especially by telling the register
allocator that caller-saved registers will be trashed). An internal
'atomic convention on a foreign call indicates that no callback into
Scheme is possible, so some setup/teardown (including stashing
callee-saved registers) can be skipped.

original commit: fd89919634d0d5272e046b47bb81bcc66e22a741
2020-06-13 14:25:52 -06:00
Matthew Flatt
4b322677fa flush instruction cache on vfasl load
original commit: 57a7c47dcf1f602d208d14f51f456edb3e2689ae
2020-06-12 14:41:00 -06:00
Matthew Flatt
23e3597778 fix vfasl for library/C entry 0
original commit: ab36ca79585b69db135b9edeadbc26e9a071f813
2020-06-11 17:24:17 -06:00
Matthew Flatt
6395bd92ff fix foreign-callable handling of bytevector arguments
This is a follow-up to 276f8da076, where `(%tc-ref cp)` was supposed
to be preserved by moving it into %cp, but intrinsics for bytevector
arguments can kill %cp. Use a temporary to expose things properly to
the register allocator.

original commit: 3a29db06a452e46e69ebcde524b3b9acb435dec3
2020-06-06 19:44:40 -06:00
Matthew Flatt
bbbd5a76ac fix vfasl relocation for arm32
original commit: e15c51c2c29aea545fbb4790f36b15002b7a25a5
2020-06-06 14:29:32 -06:00
Matthew Flatt
0adffe2c19 fix pseudo-random state C view for arm32
original commit: 348c1798d88eea3504961effe7953103044e3ee4
2020-06-06 12:16:11 -06:00
Matthew Flatt
a106c50798 gc repairs
 * Fix calculation of segment index for 32-bit platforms.

 * Fix allocation of mark-bit and list-bit arrays in certain unusual
   cases.

 * Fix dirty sweep of records on marked pages that have non-pointer
   fields.

 * Fix allocation of even-sized immobile vectors; a pad word needs to
   be cleared.

 * Fix and extend the heap checker (which was used to find several of
   the other problems).

original commit: 8b5e65f5eafac5aea7394901e1dd2f2fc3ccf2bd
2020-05-15 14:40:55 -06:00
Matthew Flatt
96616baa47 unbreak non-threaded build
original commit: c077acf7dd65bcb397e846c786ac546888b5798a
2020-05-15 07:19:51 -06:00
Paulo Matos
74ee485b21 Ensure that the literal 1 is wide enough for a shift (#23)
Fixes runtime error found by ubsan.
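
The class of bug being fixed, in a standalone illustration rather than the
actual site in the sources: shifting the plain `int` literal 1 by 31 or more
bits is undefined behavior, and widening the literal first is the cure.

    #include <stdint.h>

    uint64_t bad_bit(unsigned n)  { return 1 << n; }           /* UB for n >= 31 */
    uint64_t good_bit(unsigned n) { return (uint64_t)1 << n; } /* OK for n < 64  */
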
original commit: 65e05772a1ee14d73c368f311e837b00af771a23
2020-05-07 17:34:45 +02:00
Matthew Flatt
c7f4261611 fix ephemerons when dirty and reachable during counting
Part of the repair makes it ok to re-sweep an ephemeron, which is more
consistent with everything else.

original commit: 2c11bb39129b1492108390a704eb08deaa5d6bcc
2020-04-28 09:02:44 -06:00
Matthew Flatt
a9e37d0548 sync simpler handling of tc U, V, W, X, Y
They apparently don't need to be preserved across a GC.

original commit: 830d176bdaf0c19c44e5f4037da0de621d3d9957
2020-04-26 20:13:54 -06:00
Matthew Flatt
120082f3f9 add list-assuming-immutable?
Build in a Racket-style `list?` using GC cooperation to make recording
the result cheaper.

original commit: 32189af3e4dfc3596fba3163fd1a8295b830448b
2020-04-25 15:33:56 -06:00
Matthew Flatt
7ba7a815b0 tweak copy-vs-mark dispatching
The C compiler doesn't generate a tail call in a place where I
expected one, and maybe it's better to branch at the call site anyway.

original commit: 70fa8e7f7bd891c548c877cabdd15073aa2aa01b
2020-04-24 10:20:50 -06:00
Matthew Flatt
752ee94563 avoid fragmentation at the chunk level
original commit: 5b52a846af7f5d9c030e6dc71f46d83b3f1b8e4c
2020-04-23 17:25:03 -06:00
Matthew Flatt
d755dbc00f cs: fix phantom bytes effect on maximum-memory-bytes
original commit: 78f2c1e3ee1329f44742a23c28a76538eef8cbdd
2020-04-22 16:30:47 -06:00
Matthew Flatt
f53f20b5b9 GC marking (non-copying) mode
Change the GC so that it can mark and sweep objects in-place, instead
of always copying. This change is helpful for reducing peak memory
use while performing a collection on a large, old heap.

Some non-copying support was already in place for locked objects,
but the new implementation is faster and more general. As an
alternative to locking, the storage manager now provides "immobile"
allocation (currently only for bytevectors, vectors, and boxes),
which allocates an object that won't move but that can be GCed if
it's not referenced. A locked object is an object that has been
made immobile and that is on a global list --- mostly the old,
non-scalable implementation of locked objects brought back, since
immobile objects cover the cases that need to scale.
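
As an aside, a generic mark-bitmap sketch (sizes and granularity invented
here, not Chez's actual layout) shows the kind of bookkeeping that in-place
marking relies on:

    #include <stddef.h>
    #include <stdint.h>

    #define SEGMENT_BYTES  (1 << 14)   /* hypothetical segment size     */
    #define GRANULE_BYTES  16          /* hypothetical allocation grain */

    typedef struct {
      uintptr_t base;
      uint8_t marks[SEGMENT_BYTES / GRANULE_BYTES / 8];   /* 1 bit/granule */
    } segment;

    static void set_mark(segment *s, uintptr_t addr) {
      size_t g = (addr - s->base) / GRANULE_BYTES;
      s->marks[g >> 3] |= (uint8_t)(1u << (g & 7));
    }

    static int marked(const segment *s, uintptr_t addr) {
      size_t g = (addr - s->base) / GRANULE_BYTES;
      return (s->marks[g >> 3] >> (g & 7)) & 1;
    }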

original commit: aecb7b736cb1d52764c292fa6364a674958dfde3
2020-04-22 07:10:02 -06:00
Matthew Flatt
f4de537e1c gc: generate sweep_dirty_object
The `sweep_dirty_intersecting` function still had hand-implemented
sweep cases.

original commit: c51b46b3cc71ed0dbc523071dce3cc496965e0b6
2020-04-18 10:40:15 -06:00
Matthew Flatt
c4ffe39efb fix leak related to object counts
When collecting to the maximum generation with object counts enabled,
a structure type would effectively become permanently reachable.

Also, add `bytes-finalized` to report how many bytes were associated
with guardian-based finalization by the most recent collection.

original commit: 852f5e2de95a26d3500321c4d4d732407945a57a
2020-04-16 16:16:13 -06:00
Matthew Flatt
63baf24ad5 repairs for locking
Fix clearing of locked-object information and copying adjacent pairs.

original commit: 53d092c50c1c24017c52b6e002e6073b81747e09
2020-04-04 16:05:20 -06:00
Matthew Flatt
5458323280 fix segment initialization for new fields
original commit: 90f358a2a33f90d9b64b6750988f679a6fcfcc7d
2020-04-04 12:43:04 -06:00
Matthew Flatt
afebbdd6a9 convert GC to "mkgc.ss" implementation
Replace repetitive C code in "gc.c" and "vfasl.c" with an
implementation using a little "Parenthe-C" language, which is a
somewhat declarative description of object tracing. From that
description, we generate different kinds of tracing functions, such as
the copy function or the sweep function.

The little language is still basically C, just with parentheses and
parameterization that is much better than trying to use the C
preprocessor. (The "mkgc.ss" file includes the compiler from
Parenthe-C to C.)

Besides replacing existing code, we also generate a new traversal to
implement `compute-object-sizes`. Finally, the GC can now perform a
fused `collect` and `compute-object-sizes` in a single traversal.

Also improve the way that locked objects are detected during GC. This
can make a significant difference (on the order of 10-20% for a full
collection) when locked objects are long-lived.

original commit: de1f5c41d729ac75822a1f1e633ec6d042c883dc
2020-04-04 10:21:16 -06:00
Matthew Flatt
8656bbae7e fix ephemeron allocation
Only half(!) of the needed space was actually allocated. The extra
space is only used after a GC, however, and a GC makes the extra room,
which is why things haven't fallen over completely, but that's more
subtle than intended.

original commit: 3d72bc14b9247d6764809cb651403dbb4063a905
2020-04-04 10:01:04 -06:00
Matthew Flatt
f828cb1eaa fix ephemeron-key tracking in a segment with locked objects
original commit: 9d1252b176e972f92030599dae0ce159c9d36c5b
2020-04-01 07:53:32 -06:00
Matthew Flatt
de465e4f92 fix vfasl problems
Fix problems with record meta-types and symbol interning interleaved
with vfasl loading.

original commit: 2d98d94b3c4d634ba882f10eaebc627a5d9a1ccd
2020-03-28 08:34:48 -06:00
Matthew Flatt
c920f3953d collect in main thread when active
For a collect rendezvous, call the collect-notify handler in
the main thread if it is active. A collect-notify handler can
then make sure the main thread is active and try again, if
that's useful to an application.

original commit: 0bc286e81827f029dd02a3627a192edd053b3b91
2020-03-23 15:32:00 -06:00
Matthew Flatt
5f57648104 add call-in-continuation
This operation effectively allows sending an expression back to a
continuation, instead of just a value. It's the same as Marc Feeley's
`continuation-slice` operation, but adjusted slightly to support
continuation attachments.

original commit: d0e36e72d20a6eaa5d9d8b795da5e77abde75289
2020-03-12 04:48:39 -06:00
Matthew Flatt
d2961790b0 add fasl terminator
While "\44\26\2\f6" currently works as a terminator for non-compressed
fasl streams, the working byte sequence varies as the fasl format
changes. Add "\177" as a simpler and unchanging terminator.
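
For illustration only, a writer/reader pair for the terminator byte (0x7F,
i.e. octal \177); real fasl handling involves much more than this:

    #include <stdio.h>

    static void write_fasl_terminator(FILE *out) { fputc(0x7f, out); }

    static int at_fasl_terminator(FILE *in) {
      int c = fgetc(in);
      if (c != EOF) ungetc(c, in);   /* peek without consuming */
      return c == 0x7f;
    }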

original commit: 332019360491be6cedd2063c9a8056183d764bbb
2020-03-05 17:05:22 -07:00
dybvig
0a5700cef6 support for internal fasl compression to allow seeking past compile-time info at run time and run-time info at compile time
- the collector now releases bignum temporaries rather than relocating
  them so we don't keep around huge bignum temporaries forever.
     gc.c
- removed the presumably useless vector-handling code from load()
  which used to be required to handle fasl groups.
     scheme.c
- object files are no longer compressed as a whole, and the parameter
  compile-compressed is no longer defined.  instead, the individual
  fasl objects within an object file are compressed whenever the
  new parameter fasl-compressed is set to its default value, #t.
  this allows the fasl reader to seek past portions of an object
  file that are not of interest, i.e., visit-only code and data
  when "revisiting" an object file and revisit-only code and data
  when "visiting" an object file.  the compressed portions are
  compressed using the format and level specified by the compress-format
  and compress-level parameters.  the C-coded fasl reader and
  boot-file loader no longer handle compressed files; these are
  handled, less efficiently, by the Scheme entry point (fasl-read).
  a warning exception is raised the first time a program attempts
  to create or read a compressed fasl file.  (a rough sketch of the
  seek-past idea appears after this list.)
    7.ss, s/Mf-base, back.ss, bytevector.ss, cmacros.ss, compile.ss,
    fasl-helpers.ss, fasl.ss, primdata.ss, strip.ss, syntax.ss,
    externs.h, fasl.c, gc.c, scheme.c, thread.c,
    mats/6.ms, mats/7.ms, mats/bytevector.ms, mats/misc.ms, patch*,
    root-experr*,
    intro.stex, use.stex, io.stex, system.stex,
    release_notes.stex
- added begin wrappers around many of the Scheme source files that
  contained multiple expressions to cut down the number of top-level
  fasl objects and increase compressibility.  also removed the
  string filenames for debugging at the start of each file that had
  one---these are best inserted universally by a modified compile-file
  during a debugging session when desired.  also removed unnecessary
  top-level placeholder definitions for the assignments that follow.
    4.ss, 5_1.ss, 5_2.ss, 5_3.ss, 5_7.ss, 6.ss, 7.ss, bytevector.ss,
    cafe.ss, cback.ss, compile.ss, cp0.ss, cpcommonize.ss, cpletrec.ss,
    cpnanopass.ss, cprep.ss, cpvalid.ss, date.ss, engine.ss, enum.ss,
    env.ss, event.ss, exceptions.ss, expeditor.ss, fasl.ss, foreign.ss,
    format.ss, front.ss, ftype.ss, inspect.ss, interpret.ss, io.ss,
    library.ss, mathprims.ss, newhash.ss, pdhtml.ss, pretty.ss,
    prims.ss, primvars.ss, print.ss, read.ss, record.ss, reloc.ss,
    strnum.ss, syntax.ss, trace.ss
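
a rough C sketch of the seek-past idea mentioned above, using an invented
length-prefixed block layout rather than the real fasl record format:

    #include <stdint.h>
    #include <stdio.h>

    /* skip one compressed block without decompressing it: read its length
       prefix and seek past the payload */
    static int skip_block(FILE *f) {
      uint64_t payload_len;
      if (fread(&payload_len, sizeof payload_len, 1, f) != 1) return 0;
      return fseek(f, (long)payload_len, SEEK_CUR) == 0;
    }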

original commit: b7f161bf2939dfedce8accbfa82b92dbe011d32a
2020-03-04 16:53:35 -05:00
Bob Burger
12081203af handle CTRL-C in ta6nt without expression editor
original commit: 7ca7ad78a278278df55140617a3112f5271f42d8
2020-03-04 16:23:47 -05:00
Bob Burger
54112e9bf1 simplification
original commit: 8e4b5f7893b6bb1ee557b4a30ff341bf6268816d
2020-03-04 16:23:47 -05:00
Neal Alexander
e7bb4def71 added unicode support to windows console i/o
original commit: e7e638e871ac4b46a84149dda93aae8741683e0a
2020-03-04 16:23:47 -05:00
Bob Burger
68c114c930 fixed typo
original commit: 29c9bfebf730a2691e4302ad82c0be7c22e0d2d2
2020-02-25 11:14:51 -05:00
Bodie Solomon
c7b4ce90a0 Exclude unresolved symbol scheme_signals_registered on Win32 builds
Commit 72d90e4 ("library-manager, numeric, and bytevector-compress
improvements") introduced a regression causing Win32 nmake builds to
fail due to an undefined symbol in c\schsig.c.

This symbol (scheme_signals_registered) is defined in a preprocessor-
conditional code block excluding WIN32 between lines 544 and 737.

This commit bypasses the build regression by enclosing the unresolved
expression in a preprocessor conditional excluding WIN32.

See GitHub Issue #497 for more details:
https://github.com/cisco/ChezScheme/issues/497
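
The shape of the workaround, sketched here; the symbol's actual type and the
real code in c\schsig.c may differ:

    #ifndef WIN32
    extern int scheme_signals_registered;   /* only defined in non-WIN32 builds */
    #endif

    static void check_signals(void) {
    #ifndef WIN32
      if (scheme_signals_registered) {
        /* ... signal-related work, skipped entirely on WIN32 ... */
      }
    #endif
    }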

original commit: 10bccf39badee76f80d87326d2fc7c4d808fa08e
2020-02-25 10:34:21 -05:00
Matthew Flatt
5b7f4e2fd8 unbreak Windows build
original commit: 6c062f550486dfb9b25dfc62f6d1a829bbce1d1b
2020-02-22 19:41:02 -07:00
Matthew Flatt
995e53ca71 Merge github.com:cisco/ChezScheme
original commit: 8cf52012e2a7b5928cb2602bb17e0128ae0f2776
2020-02-22 15:18:47 -07:00
dybvig
d0b405ac8b library-manager, numeric, and bytevector-compress improvements
- added invoke-library
    syntax.ss, primdata.ss,
    8.ms, root-experr*,
    libraries.stex, release_notes.stex
- updated the date
    release_notes.stex
- libraries contained within a whole program or library are now
  marked pending before their invoke code is run so that invoke
  cycles are reported as such rather than as attempts to invoke
  while still loading.
    compile.ss, syntax.ss, primdata.ss,
    7.ms, root-experr*
- the library manager now protects against unbound references
  from separately compiled libraries or programs to identifiers
  ostensibly but not actually exported by (invisible) libraries
  that exist only locally within a whole program.  this is done by
  marking the invisibility of the library in the library-info and
  propagating it to libdesc records; the latter is checked upon
  library import, visit, and invoke as well as by verify-loadability.
  the import and visit code of each invisible no longer complains
  about invisibility since it shouldn't be reachable.
    syntax.ss, compile.ss, expand-lang.ss,
    7.ms, 8.ms, root-experr*, patch*
- documented that compile-whole-xxx's linearization of the
  library initialization code based on static dependencies might
  not work for dynamic dependencies.
    system.stex
- optimized bignum right shifts so the code (1) doesn't look at
  shifted-off bigits if the bignum is positive, since it doesn't
  need to know in that case if any bits are set; (2) doesn't look
  at shifted-off bigits if the bignum is negative if it determines
  that at least one bit is set in the bits shifted off the low-order
  partially retained bigit; (3) quits looking, if it must look, for
  one bits as soon as it finds one; (4) looks from both ends under
  the assumption that set bits, if any, are most likely to be found
  toward the high or low end of the bignum rather than just in the
  middle; and (5) doesn't copy the retained bigits and then shift;
  rather shifts as it copies.  This leads to dramatic improvements
  when the shift count is large and often significant improvements
  otherwise.  (a sketch of the both-ends check appears after this list.)
    number.c,
    5_3.ms,
    release_notes.stex
- threaded tc argument through to all calls to S_bignum and
  S_trunc_rem so they don't have to call get_thread_context()
  when it might already have been called.
    alloc.c, number.c, fasl.c, print.c, prim5.c, externs.h
- added an expand-primitive handler to partially inline integer?.
    cpnanopass.ss
- added some special cases for basic arithmetic operations (+, -, *,
  /, quotient, remainder, and the div/div0/mod/mod0 operations) to
  avoid doing unnecessary work for large bignums when the result
  will be zero (e.g., multiplying by 0), the same as one of the
  inputs (e.g., adding 0 or multiplying by 1), or the additive
  inverse of one of the inputs (e.g., subtracting from 0, dividing
  by -1).  This can have a major beneficial effect when operating
  on large bignums in the cases handled.  also converted some uses
  of / into integer/ where going through the former would just add
  overhead without the possibility of optimization.
    5_3.ss,
    number.c, externs.h, prim5.c,
    5_3.ms, root-experr, patch*,
    release_notes.stex
- added a queue to hold pending signals for which handlers have
  been registered via register-signal-handler so up to 63 (configurable
  in the source code) unhandled signals are buffered before the
  handler has to start dropping them.
    cmacros.ss, library.ss, prims.ss, primdata.ss,
    schsig.c, externs.h, prim5.c, thread.c, gc.c,
    unix.ms,
    system.stex, release_notes.stex
- bytevector-compress now selects the level of compression based
  on the compress-level parameter.  Prior to this it always used a
  default setting for compression.  the compress-level parameter
  can now take on the new minimum in addition to low, medium, high,
  and maximum.  minimum is presently treated the same as low
  except in the case of lz4 bytevector compression, where it
  results in the use of LZ4_compress_default rather than the
  slower but more effective LZ4_compress_HC.
    cmacros.ss, back.ss,
    compress_io.c, new_io.c, externs.h,
    bytevector.ms, mats/Mf-base, root-experr*
    io.stex, objects.stex, release_notes.stex
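
a rough sketch of the bignum right-shift check described above, assuming a
simplified array of 32-bit bigits with the least-significant bigit first,
not Chez's actual representation:

    #include <stdint.h>

    /* return nonzero if any bit shifted off by `shift` is set: check the
       partially retained bigit first, then scan the fully discarded bigits
       from both ends toward the middle */
    static int any_shifted_off_bit(const uint32_t *bigits, int n, int shift) {
      int whole = shift / 32, partial = shift % 32;
      int limit = whole < n ? whole : n;
      if (whole < n && partial != 0 &&
          (bigits[whole] & ((1u << partial) - 1)))
        return 1;
      for (int lo = 0, hi = limit - 1; lo <= hi; lo++, hi--)
        if (bigits[lo] | bigits[hi])
          return 1;
      return 0;
    }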

original commit: 72d90e4c67849908da900d0b6249a1dedb5f8c7f
2020-02-21 13:48:47 -08:00
Matthew Flatt
745482e3e4 vfasl: repairs for fcallables
A 0 relocation is used by fcallable code as a recognizable cookie, and
its relocations must be preserved.

original commit: 38fb3fdf75cf6540d6bd2568f015af6272d22995
2020-02-20 13:24:47 -07:00
Matthew Flatt
5d45d6dca2 adjust event-detour path again to apply more often
Instead of constraining the use of event-detour so much, make it merely
unlikely that the detour will have to allocate when used in a loop
that otherwise doesn't allocate. We'll only have to allocate if the
available stack space turns out to be too small --- and if we do
allocate, it's not the end of the world.

original commit: f1dbed82df415c18c8304bedcee2ecf4912badc7
2020-02-09 09:43:26 -07:00
Matthew Flatt
baf3bba9de constrain smaller trap-check code to avoid allocation
Having the trap check allocate is questionable, since it can be
triggered during a loop that otherwise performs no allocation. Also,
on platforms where at most 1 argument is passed in a register, sending
two arguments to the event handler could potentially need
stack space that isn't there. So, constrain the smaller trap-check
code to cases where no stack space is needed and where no allocation
happens unless the wrong number of arguments is provided.

original commit: 260a7ef5bc0bf851d9848587b0a78bdb4aab59f8
2020-02-07 15:27:07 -07:00
Matthew Flatt
d4981dd8c3 less code for trap checks
When a procedure starts with a trap check, move the check to the very
beginning, even before checking the argument count. That way, event
detection can turn into a compact jump to an event handler, instead of
inserting a general call to `$event` in the procedure body.

original commit: 06b12d505698a2378734689370bb9e0f8eda06b9
2020-02-07 10:56:15 -07:00
Matthew Flatt
27e21e6e7d code inspector: improvements to reloc reporting
Fix 'reloc to avoid a crash on static-generation code, and add
'reloc+offset to report an offset for each entry.

original commit: 4d4195044377f9c619cfb46056e365044069d5bc
2020-01-29 16:22:52 -07:00
Matthew Flatt
26ff90e8e6 more compact return points for function calls
In the general form of a function call, the return point embeds 4
words of information: offset to the start of the enclosing function,
frame size, live-variable mask, and multiple-value return address. In
the common case, however, the multiple-value return address is either
the same as the return address or it is a `values-error` library
function, and the frame size and live-variable mask fit into a word
with bits to spare. This patch implements a more compact return point
for that common case, which shrinks the 4 words to 2 and also avoids a
relocation (= 1 more word).

Multiple-value returns are more complex with this change (i.e.,
require more code), since they must check whether the return point is
compact or not. But multiple-value returns are far less common than
function calls, so saving function-call space is a clear win.

Overall, this change tends to reduce code size by about 10% on x86_64.
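
A toy packing sketch of the compact case, with field widths invented for the
example; the actual encoding is whatever the compiler emits:

    #include <stdint.h>

    #define COMPACT_TAG       0x1u
    #define FRAME_SIZE_SHIFT  1
    #define FRAME_SIZE_BITS   15
    #define LIVE_MASK_SHIFT   (FRAME_SIZE_SHIFT + FRAME_SIZE_BITS)

    /* one word in place of four: a tag bit, the frame size, and the
       live-variable mask, with the multiple-value return address implied
       rather than stored */
    static inline uint64_t pack_compact(uint64_t frame_size, uint64_t live_mask) {
      return COMPACT_TAG
             | (frame_size << FRAME_SIZE_SHIFT)
             | (live_mask << LIVE_MASK_SHIFT);
    }

    static inline int is_compact(uint64_t rp_word) {
      return (rp_word & COMPACT_TAG) != 0;
    }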

original commit: 1f53b5eabef966db01086cb32e544bbf8deacfca
2020-01-24 19:19:32 -07:00