The optimizer tries to reduce the `if` assuming that the result will be used.
In case it later detects that the result will be ignored, it can try to
apply some additional reductions to the branches and to the whole expression.
This function exposes the fast subset operation that is built in for
immutable hash tables (and used by the set-of-scopes implementation).
Also, make the space optimization implicit for `eq?`-based hash tables
that contain only #t values (instead of explicit and only available
internally). It turns out to be easy and efficient to make the
representation automatic, because the HAMT implementation can support
a mixture of nodes with some containing explicit values and some
containing implicit #t values.
When the properties argument for `make-struct-type` is non-empty,
then we cant; guarantee that `make-struct-type` succeeds, but
if it does, then we can still know that the result is a structure
type and (as long as `prop:chaperone-unsafe-undefined` is not
involved) the properties don't affect the constructor, predicate,
selector, or mutators.
This fixes an immediate problem, but the macro expander should have
complained about an unbound `maybe` at phase 2. (A new implementation
of the macro expander detected the unbound `maybe`.)
When an array value is provided, make sure that it's an array
with at least the expected length (or longer) and same element
layout. That's weaker than checking that the array elements have
the right type, because an `eq?` check at the ctype layer seems
too strong, and the ctype API doesn't provide enough information
for a more flexible equality.
A phase shift was mising on `begin-for-syntax`es introduced by
`syntax-local-lift-module-end-declaration`, which is in turn
used to implement` module+`, so `module+` didn't work under
two or more `begin-for-syntaxes`.
Closes#1312
Syntax objects generally make sense as properties in other syntax
objects, but they require special care when marshaling to bytecode
(as syntax objects do in general). To make that special handling
possible and reliable, constrain the shape of allowed values.
The name `path-extension` created a conflict for an existing
registered package, so it should not have been added to
`racket/path`.
Also, `path-get-extension` was intended to work on a path
that is syntactically a directory, so fix and test that.
Change the one expansion mode as far as I can tell) that disables
lifts so that lifts are now allowed, which means that
`(syntax-transforming?)` implies `(syntax-transforming--with-lifts?)`.
The old documentation incorrectly characterized when lifts
were allowed. Ryan noticed the documentation problem, and that
observation led to this simplication.
A `#:name` identifier picks the name that is bound to static
information about a structure type. An `#:extra-name` identifier
specifies an additional name to be bound to the information.
This pair of options is analogous to `#:constructor-name`
and `#:extra-constructor-name`.
Based on Jen Axel's suggestion and implementation.
Closes#1309
Provide a cleaned-up set up path-extension functions. In contrast
to `path-{add,replace}-suffix` and `filename-extension`, a dot
at the beginning of a path element is not treated as an extension
separator. Also, `path-extension` returns an extension including
its separator, which is more consistent with other extension
functions.
The new `path-has-extension?` function replaces many uses of
regexp matching in the base collections.
Closes#1307
Reduce (unbox (box x)) => x
Extend the reductions for cXr to the unsafe versions, for
example reduce (unsafe-car (cons x y)) => x
Check and save types in unsafe operations
Pass a string to the handler to describe the problem.
Also, fix minor issues (GC registration, contracts and `history`
in docs) and make `pregexp`, etc., report compilation errors as
`pregexp`, etc.
Allow `system-type` on non-Windows platforms to run `uname` to get
machine information, even in a sandbox or other contexts with a
limiting secutiry guard.
Check that it works to apply a continuation that shares with
an enclosing continuation, where a runstack overflow happens
between the continuations.
Closes PR 15281
While expanding a module, the root of module-relative references is a
fresh notion of "this module".
After expansion, "this module" is shifted to "an expanded module",
which is a global constant (for top-level modules). When an expanded
module is re-expanded, "an expanded module" is shifted to a fresh
"this module" during re-expansion, and so on.
One problem with this approach is that the shift from "this module" to
"an expanded module" isn't applied to syntax properties --- but
there's some extra trickery to make it work out by mutating "this
module" to make it look like "an expanded module".
Submodule expansion introduces an intermediate "parent of this module"
that wasn't currently covered by the extra trickery, so fix that.
Repair a mismatch between `syntax-local-lift-expression` and the
way that `compile` tries to avoid creating bindings while
compiling a top-level `define` form.
Closes#1284 and #1282
A syntax property is added as preserved or not. For backward
compatibility, the default for a 'paren-shape key is preserved, and
any other key's default is non-preserved.
Cross-module inlining that pulls a variable reference across a
module boundary imposes a more struct requirement that run-time
"constant" detection is consistent with the optimizer's view of
"constant" within a module. So, make sure they're the same.
Sometimes the optimizer removes all the references to a variable but it
doesn't detect that the variable is unused, so it keeps the definition.
Later, the sfs detects the unused variable so it marks it, but it doesn't
remove the let form.
Formerly, cross-module inlining would not work for a function like
(define (f x)
(if .... .... (slow x)))
unless `slow` was also inlined into `f`. This commit changes
cross-module inlining so that it allows a call to `f` to be replaced
with an expression that references other module-level bindings (that
are not primitives), such as `slow`.
Adjusting the inlining rules can always make some program worse. In
this case, a hueristic about whether to export an optimized or
unoptimized variant of a fnuciton for inlining tends to collide with
the adjusted inlining rule, so this commit tweaks that heuristic, too.
Enable the optimizer to figure to figure out that a loop
argument is always a real number, for example, in much the
same way that it can detect fixnums and flonums for unboxing.
Unboxing information was only needed at the resolve level,
but `real?` information is useful only to the optimizer, so
the generalization enables the optimizer to reach
approximations of type information earlier (e.g., among
a subset of a function's arguments).
Simplify `(wcm <k1> <v1> (wcm <k1> <v2> <e>))` to
`(begin <v1> (wcm <k1> <v2> <e>))` for a simple enough <k1>.
A variable simple enough, so this is useful for improving
errortrace output.
Compute an `equal?` hash code for `read`able values that
is a constant, at least for a given version of Racket. Only
(interned) symbols failed to have that property before.
With the old representation of local variables, optimize_info_lookup
had to search the stack for the frame with the information about the
variable. This was complicated so it has many flags to be used in
different situations and extract different kind of information.
With the new representation this process is easier, so it's possible
to split the function into a few smaller functions with an easier
control flow.
In particular, this is useful to avoid marking a variable as used
inside a lambda when the reference in immediately reduced to a
constant using the type information.
The iterator saves the return points in a list. For small immutable hashes,
encode the values in the list in the bits of a fixnum to avoid allocations.
Expose tagged allocation and a function that interprets a description
of tagged shapes. As a furst cut, the description can only specify
constant offsets for pointers within the object, but future extensions
are possible.
When a chaperone-wrapped function leads to a slow-path tail
call, the continuation-mark depth can be made too deep when
resolving the slow tail call.
Closes#1265
Reduce
(eq? v v) ==> #t
(if t v v) ==> (begin t v)
(if v v #f) ==> v
when v is a local or a top level variable.
Previously, the last two reductions were used only
with local variables.
Also, move the (if x #t #f) ==> (not x) reduction
after branch optimization.
When a key is removed at a level that other only has a collision
table, the HAMT representation was not adjusted properly by
eliminating the layer. As aresult, table comparison via
`equal?` could fail. The problem could show up with hash tables
used to represent scope sets, where an internal "subset?" test
could fail and produce an incorrect binding resolution.
The transformation from
(begin (let <bindings> (begin <e1> ...)) <e2> ...)
to
(let <bindings> (begin <e1> ... <e2> ...))
makes things look simpler and might help the optimizer a little. But
it also tends to make the run-time stack deeper, and that slows some
programs a small but measurable amount.
A better solution would be to keep the transformation but add another
pass that moves expressions out of a `let`.
Since this operation only moves the code and doesn't make the final
bytecode bigger, it's not necessary to decrease the fuel and then it
is available for further inlining.
The calculation of used variables in a possibly unused function did
not work right when the function is referenced by a more deeply
nested function that itself is unused. The extra uses triggered by
more nested uses need to be registered as tentative in the more nested
frame, not in the outer frame.
Closes#1247
Correct the second-biggest design flaw in the bytecode optimizer:
instead of using a de Bruijn-like representation of variable
references in the optimizer pass, use variable objects.
This change is intended to address limitations on programs like the
one in
http://bugs.racket-lang.org/query/?cmd=view&pr=15244
where the optimizer could not perform a straightforward-seeming
transformation due to the constraints of its representation.
Besides handling the bug-report example better, there are other minor
optimization improvements as a side effect of refactoring the code. To
simplify the optimizer's implementation (e.g., eliminate code that I
didn't want to convert) and also preserve success for optimizer tests,
the optimizer ended up getting a little better at flattening and
eliminating `let` forms and `begin`--`let` combinations.
Overall, the optimizer tests in "optimize.rktl" pass, which helps
ensure that no optimizations were lost. I had to modify just a few
tests:
* The test at line 2139 didn't actually check against reordering as
intended, but was instead checking that the bug-report limitation
was intact (and now it's not).
* The tests around 3095 got extra `p` references, because the
optimizer is now able to eliminate an unused `let` around the
second case, but it still doesn't discover the unusedness of `p` in
the first case soon enough to eliminate the `let`. The extra
references prevent eliminating the `let` in both case, since that's
not the point of the tests.
Thanks to Gustavo for taking a close look at the changes.
LocalWords: pkgs rkt
Found with `-fsanitize=undefined`. The only changes that are potentially
bug repairs involve some abuses of pointers that can end up misaligned
(which is not an x86 issue, but might be on other platforms). Most of
the changes involve casting a signed integer to unsigned, which
effectively requests the usual two's complement behavior.
Some undefined behavior still present:
* floating-point operations that can divide by zero or coercions
from `double` to `float` that can fail;
* offset calculations such as `&SCHEME_CDR((Scheme_Object *)0x0)`,
which are supposed to be written with `offsetof`, but using
a NULL address composes better with macros.
* unaligned operations in the JIT for x86 (which are ok, because
they're platform-specific).
Hints for using `-fsanitize=undefined`:
* Add `-fsanitize=undefined` to both CPPFLAGS and LDFLAGS
* Add `-fno-sanitize=alignment -fno-sanitize=null` to CPPFLAGS to
disable those checks.
* Add `-DSTACK_SAFETY_MARGIN=200000` to CPPFLAGS to avoid stack
overflow due to large frames.
* Use `--enable-noopt` so that the JIT compiles.
The `alarm-evt` tests are inherently racy, since they depend on
the scheduler polling quickly enough. The old time values were
close enough that a test failure is particularly likely on
Windows, where the clock resolution is around 16ms. To reduce
failures, make the time differents much bigger.
Closes issue #1232
A reference to a local may be reduced in a branch to a constant, while it's unchanged in the
other because the optimizer has different type information for each branch. Try to use the
type information of the other branch to see if both branches are actually equivalent.
For example, (if (null? x) x x) is first reduced to (if (null? x) null x) using the type
information of the #t branch. But both branches are equivalent so they can be
reduced to (begin (null? x) x) and then to just x.
The functions expr_implies_predicate was very similar to
expr_produces_local_type, and slighty more general.
Merging them, is possible to use the type information
is expressions where the optimizer used only the
local types that were visible at the definition.
For example, this is useful in this expression to
transform bitwise-xor to it's unsafe version.
(lambda (x)
(when (fixnum? x)
(bitwise-xor x #xff)))
The recently added fast path for property-only chaperones did not
propagate the original object in the case that the property-only
chaperone wraps a `chaperone-procedure*` chaprerone.
Merge to v6.4
Fix `procedure-specialize` for a procedure that refers to a
syntax-object literal. A syntax-object literal becomes part of the
procedure closure, but in a special way that nomrally allows syntax
objects to be loaded on demand. For now, specialization counts as
a demand of the syntax object.
Merge to v6.4
When a procedure created by `unsafe-{chaperone,impersonate}-procedure`
is given the wrong number of arguments, the original procedure's name
should be used in the error message.
During inlining, the type information gathered in
code that was inside the lambda is copied to the outer
context. But the coordinates of the type information
were shifted in the wrong direction, so the type was
assigned to the wrong variable.
This bug is difficult to trigger, so the test is convoluted.
Merge to v6.4
Add 'module-body-inside-context, 'module-body-outside-context, and
'module-body-context-simple? properties to the expansion of a
`module` form. These properties expose scopes that are used by
`module->namespace` and taht appear in marshaled bytecode.
Made the hash-set chaperones essentially forward the hash chaperone
operations, but now explain them all in terms of set-based operations
in the docs.
Also adjusted value-blame and has-blame? to support late-neg projections
When a module defines <name-1> and doesn't export it, but when
the module imports <name-2> and re-exports that refers to another
module's definition of <name-1>, then <name-1> wasn't properly
registered as an unexported binding.
Most of the implementation change is just a clean-up of an
unnecessary traversal from before the addition of a `count`
field in each hash table.
Also, add `#:skip-filtered-directory?` to `find-files`.
Less significantly, adjust `pathlist-closure` to be consistent in the
way that it includes a separator at the end of a directory path.
Although `procedure-specialize` should be useful in places where
inlining does not apply, allowing inlining and related optimizations
through it, anyway.
The strategy of converting a bignum to a flonum by converting on word
boundaries can lose one bit of precision. (If the use of a word
boundary causes a single bit to get rounded away, but the first bit of
the next word is non-zero, then the rounding might have been down when
it should have been up.)
Avoid the problem by aligning relative to the high bit, instead.
Fix even basic readind when extflonums are not supported, but
also fix reading extflonums with large exponents (related to
the other recent changes to number parsing).
Allow a more dynamic (than `impersonator-prop:application-mark`)
determination of continuation marks and associated values to wrap the
call of an impersonated procedure.
When an internal-definition context is used with `local-expand`, the
any binding added to the context affect expansion, but the binding do
not appear in the expansion. As a result, Check Syntax was unable to
draw an arrow from the `s` use to its binding in
(class object%
(define-struct s ())
s)
The general solution is to add the internal-definition context's
bindings to the expansion as a 'disappeared-bindings property. The new
`internal-definitionc-context-track` function does that using a new
`internal-definition-context-binding-identifier` primitive.
Repairs 3eb2c20ad0, which used a scope-set comparison for
a table that maps scopes to propagation actions (add, remove,
or flip).
Closes#1113
Merge to v6.3
In `syntax-local-lift-require`, avoid scope adjustments intended
to deal with `require` forms that are compiled in one namespace
and evaluated in another.
When an import is shadowed by another import or by a definition, don't
include it in the set of bindings in the resut of
`syntax-local-module-required-identifiers` or in the set that can be
exported by `all-from-out`.
Merge to v6.3
After some expansions, a expression with the syntax property 'inferred-name of
'x is converted to one with ('x . 'x), so it's not useful to get the name of a
procedure. So we simplify the syntax property 'inferred-name to handle
these cases.
Removing all original module context doesn't work, because it
doesn't distinguish between fragments of syntax that had the
"inside-edge" scope without the "outside-edge" scope.
Record the presence of the outside-edge scope by using the
root scope, and convert the root scope to the current namespace's
outside-edge scope on evaluation.
The bug could cause
#lang racket/base
(define x 'outer)
(define-syntax-rule (def-and-use-m given-x)
(begin
(define-syntax-rule (m)
(let ()
(define given-x 'inner)
x))
(m)))
(def-and-use-m x)
to produce 'inner when it should produce 'outer.
Thanks to Brian Mastenbrook for pointing the problem and
providing examples.
The GC supported allocation for an array of objects where
the first one provides a tag, but at this point it was
used only in some corners. Change those corner and simplify
the GC by removing support for arrays of tagged objects.
The main corner to clean up is in the handling of a macro-expansion
observer and inferred names. Move those into the compile-time
environment. It's possible that name inference has been
broken by the changes, but in addition to passing the tests,
the generated bytecode for the base collections is exactly the
same as before the change.
First bug:
When the optimize converts
(let-values ([(X ...) (values M ...)])
....)
to
(let ([X M] ...)
....)
it incorrectly attached a virtual timestamp to each "[X M]" binding
that corresponds to the timestamp after the whole `(values M ...)`.
The solution is to approximate tracking the timestamp for invidual
expressions.
Second bug:
The compiler could reorder a continuation-capturing expression past
an allocation.
The solution is to track allocations with a new virtual clock.
Make `eval-syntax`, `compile-syntax`, and `expand-syntax` more
consistent (with intent and each other) by not installing a fallback
automatically. In particular, a fallback is not installed for a
`module` form, so that different ways of expanding a `module` form
produce consistent results (e.g., for ambiguous bindings).
Test added in 8ee717520f was broken, because it used
`(current-milliseconds)` instead of `(current-ienxact-milliseconds)`
to construct an argument to`alarm-evt`'
Refine the changes in 16c198805b so that `(define id ... id ... )` at
the top level compiles more consistently when `id` is an identifier
whose lexical context does not include `#%top`.
When `compile` is used on a top-level definition, do not
create a binding in the current namespace, but arrange for
a suitable binding to be in place for the target namespace.
Closes#1036
This repair adjusts the bug fix of commit 769ad3e98. That older commit
ensured that `sync/enable-break` doesn't both break and accept a
channel message or semaphore wait. But it effectively disables those
actions if the break is continued.
Instead of (partially!) ending the `sync` get out of semaphore
and channel queues so that no event can be selected during
the break, and then get back in line if the break is continued.
When a path is made relative for marshaling to bytecode, record
a list of byte strings in stead of a platform-specific relative
path.
For syntax-object source locations, convert any non-relative path to a
string that shows just the last couple of path elements preceded by
".../". This conversion avoids embedding absolute paths in bytecode,
but at the cost of some information. A more complete and consistent
solution would invove using a module-path index instead of a path, but
that would be a big change at several layers.
The `prop:expansion-contexts` property can control the expansion
of a rename transformer in much the same that conditionals on
`(syntax-local-context)` can control the expansion of other
transformers.
The `from` string argument is converted to a regexp and cached. When `from` is
a mutable string this can cause wrong results in the following calls
to string-replace. So the string is first converted to an immutable string to
be used as the key for the cache.
Progress toward making the bytecode compiler deterministic, so that a
fresh `make base` always produces exactly the same bytecode from the
same sources. Most changes involve avoiding hash-table order
dependencies and adjusting scope identity. The namespace used to load
a reader extension is also better defined. Plus many other little
changes.
The identity of a scope that is unmarshaled from a bytecode file now
incorporates the hash of the file, and the relative order of scopes is
preserved in a bytecode file. This combination allows compilation to
start with modules that loaded and compiled in different orders
(including delayed loading of bytecode fragments within one file).
Formerly, a reader extension triggered by `#lang` or `#reader` was
loaded in whatever namespace happens to be current. That's
unpredictable and can pollute a module build at the level of bytecode.
To help make builds deterministic, reader extensions are now loaded in
a root namespace of the current namespace.
Deterministic compilation in general relies on deterministic macros.
The two most common ways for a macro to be non-deterministic are by
using `gensym` (use `generate-temporaries`, instead) and by using an
unsorted hash-table traversal (don't do that).
At this point, bytecode generation is unlikely to be completely
deterministic, since I uncovered non-determinism mostly by iterating
attempts over the base collections. For now, the intent is not to
provide guarantees outside of the compilation of the base collections
--- but "more deterministic" is likely to be useful in the short run,
and we can improve further in the long run.