For a collect rendezvous, call the collect-notify handler in
the main thread if it is active. A collect-notify handler can
then make sure the main thread is active and try again, if
that's useful to an application.
original commit: 0bc286e81827f029dd02a3627a192edd053b3b91
In the general form of a function call, the return point embeds 4
words of information: offset to the start of the enclosing function,
frame size, live-variable mask, and multiple-value return address. In
the common case, however, the multiple-value return address is either
the same as the return address or it is a `values-error` library
function, and the frame size and live-variable mask fit into a word
with bits to spare. This patch implements a more compact return point
for that common case, which shrinks the 4 words to 2 and also avoids a
relocation (= 1 more word).
Multiple-value returns are more complex with this change (i.e.,
require more code), since they must check whether the return point is
compact or not. But multiple-value returns are far less common than
function calls, so saving function-call space is a clear win.
Overall, this change tends to reduce code size by about 10% on x86_64.
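The packing idea can be sketched in C as follows. The bit layout here (a flag in bit 0, a 16-bit frame size, the live-variable mask in the remaining bits) and all `rp_*` names are assumptions made for illustration; the layout the patch actually uses is not described above and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only:
   bit 0       flags a compact return point
   bits 1..16  frame size
   bits 17..63 live-variable mask */
#define RP_COMPACT_FLAG 0x1ull
#define RP_SIZE_SHIFT   1
#define RP_SIZE_BITS    16
#define RP_MASK_SHIFT   (RP_SIZE_SHIFT + RP_SIZE_BITS)
#define RP_MASK_BITS    (64 - RP_MASK_SHIFT)

/* Returns 1 when both fields fit into a single word with the flag bit. */
static int rp_fits_compact(uint64_t frame_size, uint64_t live_mask) {
    return frame_size < (1ull << RP_SIZE_BITS)
        && live_mask < (1ull << RP_MASK_BITS);
}

/* Packs the two fields into one word; the caller must check the fit first. */
static uint64_t rp_pack(uint64_t frame_size, uint64_t live_mask) {
    assert(rp_fits_compact(frame_size, live_mask));
    return (live_mask << RP_MASK_SHIFT)
         | (frame_size << RP_SIZE_SHIFT)
         | RP_COMPACT_FLAG;
}

/* The check a multiple-value return would have to make before decoding. */
static int rp_is_compact(uint64_t word) {
    return (word & RP_COMPACT_FLAG) != 0;
}

static uint64_t rp_frame_size(uint64_t word) {
    return (word >> RP_SIZE_SHIFT) & ((1ull << RP_SIZE_BITS) - 1);
}

static uint64_t rp_live_mask(uint64_t word) {
    return word >> RP_MASK_SHIFT;
}
```

A return point whose fields do not fit (`rp_fits_compact` returns 0) would keep the general 4-word form, which is why the multiple-value return path needs the `rp_is_compact`-style test.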
original commit: 1f53b5eabef966db01086cb32e544bbf8deacfca
The `unlock-object` operation was O(N) with N currently locked objects
--- so, O(N^2) to lock N objects and then unlock them --- because
locked objects were stored in and searched in a global list. Also, GC
was O(N) at any generation with N locked objects across generations,
since every locked object was scanned.
Fix these problems so that locking and unlocking are practically O(1)
and GC is not proportional to locked objects. More precisely, locking
and unlocking are now O(C) for locking an individual object C times to
be balanced by C unlocks. (Since multiple locks on a single object
are rare, this performance seems good enough.)
The implementation replaces the global list with segment-specific
lists. Backpointers are managed using the general generational
support, so that unmodified, old-generation locked objects do not
need to be swept during a new-generation collection.
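A minimal sketch of the segment-specific bookkeeping, in C. The `segment` and `lock_entry` types and the `seg_*` functions are hypothetical stand-ins rather than the kernel's own data structures, and the sketch ignores the generational backpointer handling entirely. It only shows that an unlock searches the list of the object's own segment instead of one global list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One record per lock; an object locked C times has C records. */
typedef struct lock_entry {
    void *obj;
    struct lock_entry *next;
} lock_entry;

/* Hypothetical segment descriptor holding only its own locked objects. */
typedef struct segment {
    lock_entry *locked;
} segment;

/* Records one lock on obj in its segment; returns 0 if out of memory. */
static int seg_lock(segment *seg, void *obj) {
    lock_entry *e = malloc(sizeof *e);
    if (e == NULL) return 0;
    e->obj = obj;
    e->next = seg->locked;
    seg->locked = e;
    return 1;
}

/* Removes one lock on obj, scanning only this segment's list.
   Returns 1 if a lock was removed, 0 if obj was not locked here. */
static int seg_unlock(segment *seg, void *obj) {
    for (lock_entry **p = &seg->locked; *p != NULL; p = &(*p)->next) {
        if ((*p)->obj == obj) {
            lock_entry *dead = *p;
            *p = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}

/* Counts the lock records held by one segment. */
static size_t seg_locked_count(const segment *seg) {
    size_t n = 0;
    for (const lock_entry *e = seg->locked; e != NULL; e = e->next) n++;
    return n;
}
```

In this sketch the cost of an unlock depends on how many locks are recorded for one segment, not on the total number of locked objects in the heap.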
original commit: a57d256ca73a3d507792c471facb7e35afbe88b3
Also adds `get-initial-thread`, since thread values are useful with
`compute-size[-increments]`.
Changes the compiler to inline `weak-pair?` and `ephemeron-pair?`,
since that provides better performance for `compute-size-increments`.
original commit: 57d0cc13f8e932972cba3837b4f54e9c86786091
- when thread_get_room exhausts the local allocation area, it now
goes through a common path with S_get_more_room to allocate a new
local allocation area when appropriate. this can greatly reduce
the use of global allocation (and the number of tc mutex acquires
in threaded builds) when a lot of small objects are allocated by
C code with no intervening Scheme-side allocation or dirty writes.
alloc.c, types.h, externs.h
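The effect described above can be illustrated with a toy bump allocator in C. Everything here is an assumption for the sake of the example: `thread_ctx`, `get_room`, `get_more_room`, the area size, and the static pool are stand-ins, not the definitions in alloc.c, and the counter merely models where a threaded build would acquire the tc mutex.

```c
#include <assert.h>
#include <stddef.h>

#define AREA_BYTES 1024   /* size of one local allocation area (assumed) */
#define POOL_AREAS 8      /* number of areas in the toy shared pool */

/* Per-thread allocation state: a bump pointer into the local area. */
typedef struct {
    char *ap;             /* next free byte in the local area */
    size_t bytes_left;    /* bytes remaining in the local area */
} thread_ctx;

static char pool[POOL_AREAS][AREA_BYTES];
static int areas_handed_out;
static int global_acquires;   /* trips through the shared path */

/* Shared path: hands the thread a fresh local area.
   A threaded build would hold a mutex around this step. */
static int get_more_room(thread_ctx *tc) {
    if (areas_handed_out == POOL_AREAS) return 0;
    global_acquires++;
    tc->ap = pool[areas_handed_out++];
    tc->bytes_left = AREA_BYTES;
    return 1;
}

/* Fast path: bump-allocate locally; refill the local area only on
   exhaustion instead of allocating each object from the shared pool.
   The unused tail of an exhausted area is simply abandoned here. */
static void *get_room(thread_ctx *tc, size_t n) {
    if (n > AREA_BYTES) return NULL;
    if (tc->bytes_left < n && !get_more_room(tc)) return NULL;
    void *p = tc->ap;
    tc->ap += n;
    tc->bytes_left -= n;
    return p;
}
```

With 16-byte objects and 1024-byte areas, the shared path in this sketch is taken once per 64 allocations rather than once per allocation.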
original commit: 93dfa7674a95837e5a22bc622fecc50b0224f60d
to simplify ($fxu< (most-positive-fixnum) e) => (fx< e 0) so we
don't have any incentive in special casing length checks where
the maximum length happens to be (most-positive-fixnum).
5_4.ss, 5_6.ss, bytevector.ss, cmacros.ss, cp0.ss, cpnanopass.ss,
mkheader.ss, primdata.ss, prims.ss,
fasl.c, gc.c, types.h
root-experr*, patch*
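The equivalence behind the `($fxu< (most-positive-fixnum) e)` => `(fx< e 0)` rewrite can be checked with a small C model. The model assumes 61-bit fixnums stored in a 64-bit word with three low tag bits of zero, and treats `$fxu<` as an unsigned comparison of the tagged words; the helper names are hypothetical and the real representation may differ by platform.

```c
#include <assert.h>
#include <stdint.h>

#define FIXNUM_SHIFT 3   /* assumed number of low tag bits */
#define MOST_POSITIVE_FIXNUM ((int64_t)((1ull << 60) - 1))
#define MOST_NEGATIVE_FIXNUM (-MOST_POSITIVE_FIXNUM - 1)

/* Tagged machine word for fixnum n (shift done on unsigned to avoid UB). */
static uint64_t fix_tag(int64_t n) {
    return (uint64_t)n << FIXNUM_SHIFT;
}

/* Models ($fxu< a b): compare the tagged words as unsigned integers. */
static int fxu_lt(int64_t a, int64_t b) {
    return fix_tag(a) < fix_tag(b);
}

/* Models (fx< e 0). */
static int fx_lt_zero(int64_t e) {
    return e < 0;
}

/* True when both forms agree for the fixnum e. */
static int simplification_holds(int64_t e) {
    return fxu_lt(MOST_POSITIVE_FIXNUM, e) == fx_lt_zero(e);
}
```

Under this model every negative fixnum has its top bit set once tagged, so it compares above the tagged `(most-positive-fixnum)`, while no nonnegative fixnum does.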
original commit: 9eb63deda025fd4560b54746b21a881c01af46d6
because these fields can be accessed from multiple threads concurrently.
Updated $yield and $thread-check in mats/thread.ms to be more tolerant of timing variability.
original commit: 0a6a1e14e7ecb9e39fa7a10a8584ed2fec24cbf4