Rewrite the handler of record? and $sealed-record? to make it easier to
understand.
Also, delay the reduction of lambdas in a sequence of arguments. This helps
to reduce, for example,
(map (lambda (x) (box? b)) (unbox b))
=>
(map (lambda (x) #t) (unbox b))
original commit: 20e478b9280c779e260f5557c2eee74946313a44
Having the trap check allocate is questionable, since it can be
triggered during a loop that otherwise performs no allocation. Also,
on platforms where at most 1 argument is passed in a register,
sending two arguments to the event handler could potentially need
stack space that isn't there. So, constrain the smaller trap-check
code to cases where no stack space is needed and where no allocation
happens unless the wrong number of arguments are provided.
original commit: 260a7ef5bc0bf851d9848587b0a78bdb4aab59f8
When a procedure starts with a trap check, move the check to the very
beginning, even before checking the argument count. That way, event
detection can turn into a compact jump to an event handler, instead of
inserting a general call to `$event` in the procedure body.
original commit: 06b12d505698a2378734689370bb9e0f8eda06b9
This is a `$` function because it is defined only for record types
that have pointer-sized fields (i.e., the normal case).
original commit: 47213a7c8450aa52bd18e8f605c02b6c1081eadf
Fix 'reloc to avoid a crash on static-generation code, and add
'reloc+offset to report an offset for each entry.
original commit: 4d4195044377f9c619cfb46056e365044069d5bc
In the general form of a function call, the return point embeds 4
words of information: offset to the start of the enclosing function,
frame size, live-variable mask, and multiple-value return address. In
the common case, however, the multiple-value return address is either
the same as the return address or it is a `values-error` library
function, and the frame size and live-variable mask fit into a word
with bits to spare. This patch implements a more compact return point
for that common case, which shrinks the 4 words to 2 and also avoids a
relocation (= 1 more word).
Multiple-value returns are more complex with this change (i.e.,
require more code), since they must check whether the return point is
compact or not. But multiple-value returns are far less common than
function calls, so saving function-call space is a clear win.
Overall, this change tends to reduce code size by about 10% on x86_64.
original commit: 1f53b5eabef966db01086cb32e544bbf8deacfca
Allow a library-defined function to be inlined when the inlined
expressions refer to other library-defined functions. Since the
library function's body may already have inlined calls, don't allow
further inlining of calls within the inlined code.
This commit also adds `$app/no-inline`, which can be used to prevent
inlining of a function. For consumers other than Racket on Chez
Scheme, probably it would make sense to provide a nicer-looking
syntactic form that expands to use the internal `$app/no-inline`
function.
original commit: 628d57e1bd2e658aad4da97a3e85bda72c38f6ab
On x86_64, a POPCNT instruction is usually available, and it can speed
up `fxpopcount` operations by a factor of 2-3.
Since POPCNT isn't always available, code using `fxpopcount` is
compiled to a call to a generic implementation. The linker substitutes
a POPCNT instruction when it determines at runtime that POPCNT is
available.
Some measurements on a 2018 MacBook Pro (2.7 GHz Core i7) using the
program below:
popcnt = this implementation, POPCNT discovered
nocnt = this implementation, POPCNT considered unavailable
optcnt = compile to use POPCNT directly (no linker work)
cpcnt = compile to inlined generic (no linker work, no POPCNT)
Since the generic implementation is always a 64-bit popcount, it's not
as good as an inlined version for `fxpopcount32`, but otherwise the
link-edit approach to POPCNT works well:
           fxpopcount   fxpopcount32
  popcnt:    0.098s
  nocnt:     0.284s
  optcnt:    0.109s [slower means noise?]
  cpcnt:     0.279s       0.188s
  (optimize-level 3)
  (time
   (let loop ([v #f] [i 100000000])
     (if (fx= i 0)
         v
         (loop (fxpopcount i) (fx- i 1)))))
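For readers unfamiliar with what a "generic" 64-bit popcount looks like when no POPCNT instruction is used, here is a minimal Python sketch of one common bit-twiddling approach. It is only an illustration: the name `popcount64` and the choice of algorithm are assumptions, not the implementation in this commit.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # keep intermediate values within 64 bits


def popcount64(x: int) -> int:
    """Count the set bits in the low 64 bits of x without a POPCNT instruction."""
    x &= MASK64
    # Sum adjacent bits into 2-bit fields.
    x = x - ((x >> 1) & 0x5555555555555555)
    # Sum adjacent 2-bit fields into 4-bit fields.
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    # Sum adjacent 4-bit fields into bytes.
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    # Multiplying by 0x0101... accumulates all byte sums into the top byte.
    return ((x * 0x0101010101010101) & MASK64) >> 56
```

Because a routine like this always processes 64 bits, it does the same amount of work for a 32-bit input, which is consistent with the note above that the generic path is not as good as an inlined version for `fxpopcount32`.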
original commit: 5f090e509f8fe5edc777ed9f0463b20c2e571336
Instead of using `%` to compute the index into an oblist, use a power
of 2 for the oblist length and bit masking to compute an index. (Maybe
the old hashing function was bad; the current hashing function should
produce good hash-code variation at the level of bits.) Also, make the
oblist array a little sparser to reduce bucket chaining.
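The equivalence this change relies on can be sketched in Python. The function names are hypothetical and the real oblist code is not written this way; the point is only that masking matches `%` when the table length is a power of 2.

```python
def bucket_index_mod(hash_code: int, length: int) -> int:
    """Old style: works for any table length, but needs a division."""
    return hash_code % length


def bucket_index_mask(hash_code: int, length: int) -> int:
    """New style: valid only when length is a power of 2."""
    if length <= 0 or length & (length - 1) != 0:
        raise ValueError("length must be a power of 2")
    # length - 1 is a run of low 1 bits, so the mask keeps exactly
    # the bits that hash_code % length would keep.
    return hash_code & (length - 1)
```

Masking uses only the low bits of the hash code, which is why the quality of the hashing function at the bit level matters here.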
original commit: fb87fcb8e47902b80654789d059a25bd4a7a8def
After a bignum computation using temporary thread registers W, U, or V
is complete, clear the register. (The X and Y registers hold only
small bignums, so clearing them doesn't matter in the same way.)
original commit: a9e11fcf9e86aee5d149764476e1fabfeee12f84
Try `fxquotient` with a `fx*` check to implement `/` on fixnums.
The check is cheap enough that `/` is much faster when it succeeds,
and it slows down the more general `/` only a little.
original commit: e91430be9b71f4913965db688a15f6d7206b38f3
It's not available with musl, either. Musl intentionally
doesn't provide a preprocessor test, and we're avoiding
(for now) `configure`-time tests in the style of autoconf.
original commit: a9bfb72027fc83ed6bb690d033bc6fed0629dba7
Don't run cptypes when cp0 is disabled, for example with
(run-cp0 (lambda (cp0 x) x))
This is easier to understand because run-cp0 is a single point to control
all the cp reductions. The reductions in cptypes can be independently disabled
using enable-type-recovery.
original commit: b23645e669fbf02806a261a2d87160fdbe06db93
Use the high bit of a byte to continue instead of the low bit.
That way, ASCII strings look like themselves in uncompressed fasl
form.
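A minimal Python sketch of a variable-length integer encoding that uses the high bit as the continuation flag shows why ASCII survives unchanged. The function names and the most-significant-group-first order are assumptions for illustration, not a description of the actual fasl layout.

```python
def encode_uint(n: int) -> bytes:
    """Encode n in 7-bit groups; the high bit means another byte follows."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append(n & 0x7F)
        n >>= 7
    groups.reverse()  # most significant group first (an assumption)
    # Set the high bit on every byte except the last one.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_uint(data: bytes) -> int:
    """Decode one integer from the start of data."""
    n = 0
    for b in data:
        n = (n << 7) | (b & 0x7F)
        if not b & 0x80:  # high bit clear: this was the last byte
            return n
    raise ValueError("truncated input")
```

Any value below 128 encodes as a single byte equal to itself, so `encode_uint(ord("A"))` is `b"A"`. With the low bit as the continuation flag, the payload would have to be shifted left by one, and the same character would no longer be readable in a dump.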
original commit: 89a8d24cc051123a7b2b6818c5c4aef144d48797
Uninterned symbols are slightly more expensive to allocate than 0- or
1-argument calls to `gensym`, but they're much cheaper to hash (and
print). They're also more consistently distinct when unfasled, and the
fasled form is deterministic.
original commit: 3167083008031b1f880e76a6f573563c7d9c888c
The result of `mktime` is -1 for an error. The result is also -1 if
the time is 1 second before the epoch. That's not useful, so ignore
it.
original commit: aa8ca31cef223128fd8ed1abdc76beb31a0e077a
With this flag the primitive is not tested in primvars.ms but other
parts of the compiler can use the signature/flags.
Also, add a signature to every system boolean primitive.
primvars.ms, primdata.ss
original commit: ee023c673bda6557bc223de7f8b0e732600619bc