When a chaperone-wrapped function leads to a slow-path tail
call, the continuation-mark depth can be made too deep when
resolving the slow tail call.
Closes#1265
Reduce
(eq? v v) ==> #t
(if t v v) ==> (begin t v)
(if v v #f) ==> v
when v is a local or a top level variable.
Previously, the last two reductions were used only
with local variables.
Also, move the (if x #t #f) ==> (not x) reduction
after branch optimization.
When a key is removed at a level that other only has a collision
table, the HAMT representation was not adjusted properly by
eliminating the layer. As aresult, table comparison via
`equal?` could fail. The problem could show up with hash tables
used to represent scope sets, where an internal "subset?" test
could fail and produce an incorrect binding resolution.
The transformation from
(begin (let <bindings> (begin <e1> ...)) <e2> ...)
to
(let <bindings> (begin <e1> ... <e2> ...))
makes things look simpler and might help the optimizer a little. But
it also tends to make the run-time stack deeper, and that slows some
programs a small but measurable amount.
A better solution would be to keep the transformation but add another
pass that moves expressions out of a `let`.
Since this operation only moves the code and doesn't make the final
bytecode bigger, it's not necessary to decrease the fuel and then it
is available for further inlining.
The calculation of used variables in a possibly unused function did
not work right when the function is referenced by a more deeply
nested function that itself is unused. The extra uses triggered by
more nested uses need to be registered as tentative in the more nested
frame, not in the outer frame.
Closes#1247
Correct the second-biggest design flaw in the bytecode optimizer:
instead of using a de Bruijn-like representation of variable
references in the optimizer pass, use variable objects.
This change is intended to address limitations on programs like the
one in
http://bugs.racket-lang.org/query/?cmd=view&pr=15244
where the optimizer could not perform a straightforward-seeming
transformation due to the constraints of its representation.
Besides handling the bug-report example better, there are other minor
optimization improvements as a side effect of refactoring the code. To
simplify the optimizer's implementation (e.g., eliminate code that I
didn't want to convert) and also preserve success for optimizer tests,
the optimizer ended up getting a little better at flattening and
eliminating `let` forms and `begin`--`let` combinations.
Overall, the optimizer tests in "optimize.rktl" pass, which helps
ensure that no optimizations were lost. I had to modify just a few
tests:
* The test at line 2139 didn't actually check against reordering as
intended, but was instead checking that the bug-report limitation
was intact (and now it's not).
* The tests around 3095 got extra `p` references, because the
optimizer is now able to eliminate an unused `let` around the
second case, but it still doesn't discover the unusedness of `p` in
the first case soon enough to eliminate the `let`. The extra
references prevent eliminating the `let` in both case, since that's
not the point of the tests.
Thanks to Gustavo for taking a close look at the changes.
LocalWords: pkgs rkt
Found with `-fsanitize=undefined`. The only changes that are potentially
bug repairs involve some abuses of pointers that can end up misaligned
(which is not an x86 issue, but might be on other platforms). Most of
the changes involve casting a signed integer to unsigned, which
effectively requests the usual two's complement behavior.
Some undefined behavior still present:
* floating-point operations that can divide by zero or coercions
from `double` to `float` that can fail;
* offset calculations such as `&SCHEME_CDR((Scheme_Object *)0x0)`,
which are supposed to be written with `offsetof`, but using
a NULL address composes better with macros.
* unaligned operations in the JIT for x86 (which are ok, because
they're platform-specific).
Hints for using `-fsanitize=undefined`:
* Add `-fsanitize=undefined` to both CPPFLAGS and LDFLAGS
* Add `-fno-sanitize=alignment -fno-sanitize=null` to CPPFLAGS to
disable those checks.
* Add `-DSTACK_SAFETY_MARGIN=200000` to CPPFLAGS to avoid stack
overflow due to large frames.
* Use `--enable-noopt` so that the JIT compiles.
The `alarm-evt` tests are inherently racy, since they depend on
the scheduler polling quickly enough. The old time values were
close enough that a test failure is particularly likely on
Windows, where the clock resolution is around 16ms. To reduce
failures, make the time differents much bigger.
Closes issue #1232
A reference to a local may be reduced in a branch to a constant, while it's unchanged in the
other because the optimizer has different type information for each branch. Try to use the
type information of the other branch to see if both branches are actually equivalent.
For example, (if (null? x) x x) is first reduced to (if (null? x) null x) using the type
information of the #t branch. But both branches are equivalent so they can be
reduced to (begin (null? x) x) and then to just x.
The functions expr_implies_predicate was very similar to
expr_produces_local_type, and slighty more general.
Merging them, is possible to use the type information
is expressions where the optimizer used only the
local types that were visible at the definition.
For example, this is useful in this expression to
transform bitwise-xor to it's unsafe version.
(lambda (x)
(when (fixnum? x)
(bitwise-xor x #xff)))
The recently added fast path for property-only chaperones did not
propagate the original object in the case that the property-only
chaperone wraps a `chaperone-procedure*` chaprerone.
Merge to v6.4
Fix `procedure-specialize` for a procedure that refers to a
syntax-object literal. A syntax-object literal becomes part of the
procedure closure, but in a special way that nomrally allows syntax
objects to be loaded on demand. For now, specialization counts as
a demand of the syntax object.
Merge to v6.4
When a procedure created by `unsafe-{chaperone,impersonate}-procedure`
is given the wrong number of arguments, the original procedure's name
should be used in the error message.
During inlining, the type information gathered in
code that was inside the lambda is copied to the outer
context. But the coordinates of the type information
were shifted in the wrong direction, so the type was
assigned to the wrong variable.
This bug is difficult to trigger, so the test is convoluted.
Merge to v6.4
Add 'module-body-inside-context, 'module-body-outside-context, and
'module-body-context-simple? properties to the expansion of a
`module` form. These properties expose scopes that are used by
`module->namespace` and taht appear in marshaled bytecode.
Made the hash-set chaperones essentially forward the hash chaperone
operations, but now explain them all in terms of set-based operations
in the docs.
Also adjusted value-blame and has-blame? to support late-neg projections