The collector tries to use roughly the same amount of parallelism as
the number of active threads, but make sure that it doesn't fall back
to an ownership-mangling non-parallel collection if all but one place
happens to be stalled.
Instead of one big lock, have one big lock and one small lock for just
manipulating allocation state. This separation better reflects the
locking needs of parallel collection, where the main collecting thread
holds the one big lock, and helper threads need to take a different
lock while changing allocation state.
This commit also includes changes to cooperate better with LLVM's
thread sanitizer. It doesn't work for parallel collection, but it
seems to work otherwise and helped expose a missing lock and a
questionable use of a global variable.
Move per-thread allocation information to a separate object, so the
needs of the collector are better separated from the thread
representation. This refactoring sets up a change in the collector to
detangle sweeper threads from Scheme threads.
Change the internal parallelism strategy for the GC to record an owner
for each allocated segment of memory, and have the owner be solely
responsible for copying or marking objects of the segment. When
sweeping, a collecting thread handles references to objects that it
owns or that have been copied or marked already, and it asks another
collecting thread to resweep an object that refers to objects owned by
that that thread. At worst, an object ends up being swept by all
collecting threads, one at a time, but that's unlikely for a given
object.
The approach seems likely to scale better than a lock-based approach,
even the one that used a lightweight, CAS-based lock and retries on
lock failure.
All allocation is now thread-local, which recovers a small bit of
performance that was lost when adding thread-local allocation
alongside global allocation.
Parallelism uses thread contexts created by Chez Scheme threads (which
correspond to Racket places and future-running threads), but it
creates its own OS-level threads to perform collection. The number of
collection-helper threads is limited to the number of active Chez
Scheme threads. Only the main "sweep" pass runs in parallel --- that
is, after roots have been traversed, and before weak references and
finalization are handled --- but that's the bulk of collection work.
Also, memory-accounting collections always run as single-threaded.
Improved support for thread-location allocation (and using more
fine-grained locks in fasl reading) may provide a small direct
benefit, but the change is mainly intended as setup for more
parallelism in the collector.
The `flsingle` function takes a flonum and discards precision that
wouldn't fit in a single-flonum. If single-precision arithmetic is
somehow useful, then combine flonum operations with `flsingle` to
discard precision on the result; even on Racket BC, that's likely to
perform better than using generic arithmetic on single flonums.
This is analogous to PLT_SETUP_OPTIONS, but for the `raco pkg update`
step. This is useful for adding an option like `--pull try` to ignore
failures to update local clones.
The top-level makefile now builds Racket CS as `racket` by default.
Use `racket bc` to build Racket BC as `racket`. Use `make both` to
build both CS and BC (the latter with the `bc` suffix) overlayed in a
single build. By using `make both` insted of `make cs` plus `make bc`,
you can avoid redundant package downloads and documentation rendering.
To build Racket BC as `racket`, use `racket bc RACKETBC_SUFFIX=`, but
you must consistently use `RACKETBC_SUFFIX=` with `make` every time.
The `--enable-cs` and `--enable-csdefault` flags now enabled just the
CS build, as long as pb boot files are present or
`--enable-racket=...` is supplied.
Currently, `make` is the same as `make bc`, but the idea is that
`make` can become the same as `make cs` when some other pieces are in
place.
The source of the top-level makefile is now ".makefile", and it's
turned into "Makefile" using "racket/src/makemake.rkt". The
transformation makes non-GNU `make` variants and `nmake` act like GNU
make to propagate variables, which makes abstraction through targets
plus variables (admitedly an abuse of `make`) more reliable and
consistent.
Why abuse `make` this way? Because `make` variants and `nmake` are
similar enough that to constitute a portable scripting language, and
one that conveniently provides a large number of entry points.
Also, schemified "thread", "io", "regexp", "schemify", and "expander"
layers are checked in. Overall, building Racket CS no longer requires
first building Racket BC.
The "cs-bootstrap" package is now in the `racket/ChezScheme` repo,
because it needs to track the Chez Scheme implementation instead of
the Racket implementation.
When potentially adding the `--disable-lib` flag, don't drop existing
extra configure flags. Specifically, the `--enable-crossany` flag
from distro-build could get lost, which breaks a soutrce distribution
with built packages.
Implementing a thorough `clean` target for a repo-based is tedious and
error-prone. Instead, make the `clean` target suggest `git clean -d -x
-f .`, which is more effective in most situations (but seems too
dangerous to run automatically).
Use `--no-user` for the `raco setup` that is supposed to finish a
bundle. Otherwise, things installed in user scope for the same Racket
version (i.e., the one being bundled) can interfere with the bundling
process.
configure scripts look for and read a local configuration file given by the
environment variable CONFIG_SITE. This can set variables such as prefix.
Racket’s build system was assuming that prefix would be set to NONE unless a
--prefix command line argument was given. But it could be set by a
CONFIG_SITE configuration file instead.
Hence, for in-place build add an explicit --disable-useprefix option, to
cause any prefix setting to be ignored, and use that in the top-level
Makefile.
Regenerate the configure scripts to get the updated code.
Closes#2659 by both recognizing `lib64` as a default path and by
having `--enable-origtree` override inference and specified when
running `configure` through the root makefile.