The test cases relied on arity and changing log to have an arity
of both 1 and 2 activated new (incorrect) tests. I fixed the
incorrect tests as they applied to the log function.
Make `log` in `racket/base` optionally accept a second argument.
The second argument is the log `base`. The docs also recommend
`fllogb` when precision is important.
* Error message when base is 1
* Added docs.
* Add tests.
Optimization to convert `(hash-ref <ht> <key> (lambda () <constant>))`
to `(hash-ref <ht> <key> <constant>)` didn't check that the `lambda`
for had zero argument.
Closes#1648
Fix problem with once-use tracking and delayed variable-use marking
that is performed for local function bodies. A delayed variable-use
registration might happen after a once-used variable is replaced by
its use.
This scenario is difficult to provoke, because the optimizer has to
first decide not to move a once-use function, and in a latter pass
decide to move it after all. There's not enough information to
retract the tentative use plus its transitive implications.
The solution is to avoid the generic once-use layer for `lambda` forms
whose uses are delayed (and that likely has a good effect on inlining
anyway). The other half of the solution is to avoid transitive use
marking on a once-used variable whose expression has been moved (and
there are no transitive things to skip, because that expression isn't
a `lambda` form).
When the second argument to `bytes-set!` is a reference to a
module-level variable that is definitely defined but not a known
constant, then an incorrect reordering was used that would cause
the third argument value to get overwritten before the call.
Closes#1601
The primitive `read` uses a shortcut --- a private "ungetc"
implementation --- that did not count position correctly for
non-ASCII characters.
Closes#1599
The documentation says that it should work on any output port,
although there's special treatment of ports that originate
from `pretty-print` itself.
Closes#1579.
After some reductions, the new rator advance less the effect
clocks than the original rator. For example in
(equal? x 7) ==> (eq? x 7)
(my-struct? x) ==> #t or #f
The lambdas can be marked as single valued and/or mark preserving.
With this information is possible to remove unnecessary wrapping
like the `values` in
(let ([f (lambda () '(1))])
(display f f)
(values (f)))
or in reductions like
(car (list (f))) ==> (values (f)) ==> (f)
Moreover, this is useful to test that the optimizer has marked
correctly the function f as single valued and mark preserving.
If a module has any sort of complex bindings, such as a definition of
a macor-introduced identifiers, then `module->namespace` and variants
(like `variable-reference->namespace`) need to recreate suitable
bindings. Make sure that the module-path index for recreated bindings
is the run-time one, not the compile-time one.
Closes#1584
To avoid moving expressions that may have a side effect, the optimizer must
recognize that in this position this will cause an error and advance
the virtual clock.
Currently the only primitive that is flagged as SCHEME_PRIM_IS_OMITABLE and
may have multiple return values is `values`.
Thanks to Robby for finding the original version of the test.
When a hash table or other special value appears immediately on the
right-hand side of `define-values`, it needs to be protected by an
explicit quote when writing to bytecode.
Closes#1580
Continuing the saga that includes 8190a7730d and d1ba9fbb6e, it turns
out that a 0-binding clause as the last one isn't so special after
all. A little later in the optimizer, now that we're sometimes moving
an error to the body, we can't assume that the body can be discard
if an error was detected.
Set up bindings and shift phases as needed to make
`variable-reference->namespace` work in a run-time position when the
enclosing module is instantiated at a phase other than 0.
Thanks to Rohin Shah for the bug report.
Support an external implementation of `read-syntax` by exposing
functionality that is currently internal to `read-syntax`: a srcloc
argument to a "special"-producing port function and wrapping special
results to reliably distinguish them from characters.
When multiple-binding `let-values` form is split into a single-binding
form on the grounds that the right-hand side will definitely error,
the optimizer's effect clocks were advance incorrectly.
Closes#1552
For example, an `unsafe-unbox` call should not be moved past the
call to an unknown function that might change a box's content.
Thanks to Sergey Pinaev for the report.
The objective of lookup_constant_proc and the first part of
optimize_for_inline was to find out if the value of an expression was a
procedure and get it to analyze its properties or try to inline it. Both
were called together in a few places, because each one had some special
cases that were missing in the other.
So, move the lookup and special cases from optimize_for_inline to
lookup_constant_proc, and keep only the code relevant to inlinig in
optimize_for_inline.
Both function have a similar purpose and implementation, so merge them to consider
all the special cases for both uses.
In particular, detect that:
(if x (error 'e) (void)) is single-valued
(with-continuation-mark <chaperone-key> <val> <omittable>) is not tail sensitive.
Also, as ensure_single_value was checking also that the expression was has not a
continuation mark in tail position, it added in some cases an unnecessary
wrapper. Now ensure_single_value checks only that the expression produces
a single vale and a new function ensure_single_value_noncm checks both
properties like the old function.
Adjust list and stream handling as sequences so that during the body
(for ([i (in-list l)])
....)
then `i` and its cons cell in `l` are not implicitly retained while
the body is evaluated. A `for .... in-stream` similarly avoids
retaining the stream whose head is being used in the loop body.
The `map`, `for-each`, `andmap`, and `ormap` functions are similarly
updated.
The `make-do-sequence` protocol allows an optional extra result so
that new sequence types could have the same properties. It's not clear
that using `make-do-sequence` is any more useful than creating the new
sequence as a stream, but it was easier to expose the new
functionality than to hide it.
Making this work required a repair to the optimizer, which would
incorrectly move an `if` expression in a way that could affect
space complexity, as well as a few repairs to the run-time system
(especially in the vicinity of the built-in `map`, which we should
just get rid of eventually, anyway).
Compile a `for[*]/list` form to behave more like `map` by `cons`ing
onto a recursive call, instead of accumulating a list to reverse.
This style of compilation requires a different strategy than before.
A form like
(for*/fold ([v 0]) ([i (in-range M)]
[j (in-range N)])
j)
compiles as nested loops, like
(let i-loop ([v 0] [i 0])
(if (unsafe-fx< i M)
(i-loop (let j-loop ([v v] [j 0])
(if (unsafe-fx< j N)
(j-loop (SEL v j) (unsafe-fx+ j 1))
v))
(unsafe-fx+ i 1))
v))
instead of mutually recursive loops, like
(let i-loop ([v 0] [i 0])
(if (unsafe-fx< i M)
(let j-loop ([v v] [j 0])
(if (unsafe-fx< j N)
(j-loop (SEL v j) (unsafe-fx+ j 1))
(i-loop v (unsafe-fx+ i 1))))
v))
The former runs slightly faster. It's difficult to say why, for
certain, but the reason may be that the JIT can generate more direct
jumps for self-recursion than mutual recursion. (In the case of mutual
recursion, the JIT has to generate one function or the other to get a
known address to jump to.)
Nested loops con't work for `for/list`, though, since each `cons`
needs to be wrapped around the whole continuation of the computation.
So, the `for` compiler adapts, depending on the initial form. (With a
base, CPS-like approach to support `for/list`, it's easy to use the
nested mode when it works by just not fully CPSing.)
Forms that use `#:break` or `#:final` use the mutual-recursion
approach, because `#:break` and #:final` are easier and faster that
way. Internallt, that simplies the imoplementation. Externally, a
`for` loop with `#:break` or `#:final` can be slightly faster than
before.
The optimizer can detect that some expressions will escape through
an error, and it can discard surrounding code in that case. It should
not change the tailness of a `with-continuation-mark` form by
liftng it out of a nested position, however. Doing so can eliminate
stack frames that should be visible via errotrace, for example.
This change fixes the optimizer to wrap an extra `(begin ... (void))`
around an expression if it's lifted out of a nested context and
might have a `with-continuation-mark` form in tail position.
In
(with-syntax ([x ....])
#'(x y))
and property on the source syntax object `(x y)` was lost in
constructing a new syntax object to substitute for `x`, while
properties on preserved literal syntax objects, such as `y`
were intact. Change `syntax` to preserve properties for
reconstructed parts of the template.
This change exposes a problem with 'transparent taint modes,
where the internal "is original?" property was preserved while
losing scopes that wuld cancel originalness. So, that's fixed
here, too.
extend optimize_ignore to go inside expressions with
begin, begin0 and let.
Also, try to reuse begin's in the first argument of
make_discarding_sequence.
If a mutable hash table changes while it's being printed,
various parts of the printing function could see a mismatch
between the current size and an old array size. To avoid this
problem, extract the size whenever extracting the array.
An identifier that gets a module context via `module->namespace` plus
`namespace-syntax-introduce` should not count as having the module as
its source as reported by `syntax-source-module`.
The correct behavior happened for the wrong reason prior to commit
cb6af9664c.
Closes#1515
For a template expression that involevs ellipses, a wrapper is added
to catch failures an report as an "incompatible ellipsis match count"
error. The wrapper was only added when there are multiple pattern
variables with ellipses, but it turns out that it's possible to fail
with incompatible counts using a single pattern variable.
Besides handlign that case, the revised check avoids an unnecessary
wrapper in cases where multiple pattern variables have ellipses but
they are used independently in a template.
Closes#1511
Without this repair,
#lang racket/base
(require 2htdp/abstraction)
(for/list ((dropping-which-one (in-naturals)))
1)
fails to compile with a "optimizer clock tracking has gone wrong"
error. A variant of this test (that doesn't depend on `2htdp`)
is now in the "optimize.rktl"; a simpler and more direct test
should be possible, but I wasn't able to construct one.
When a thread that is blocked on a set of semaphores and channels
is suspended and resumed after one of the events becomes ready,
and if the event has a wrapper function, then the wrapper was
not applied and the event selection was not reported correctly.
Thanks to Philip McGrath for reporting the problem.
This kind of reductions were applied only when x or y was a constant.
Classify the relevant predicates in 4 categories. In particular,
if <expr> satisfy pred? we can use this classification to apply
the correct reduction:
(equal? <expr> y) ==> [no reduction, unless y has a different type]
(equal? <expr> y) ==> (eqv? <expr> y)
(equal? <expr> y) ==> (eq? <expr> y)
(equal? <expr> y) ==> (begin <expr> (pred? y))
Also, add a new primitive interned-char? that is hidden, but it's
useful to track in the optimizer the the chars? with a value < 256
that are interned because they are treated specially, and if they
are equal? then they are eq?.
... + prefix-in + relative-path module. All of those ingredients
(or some similar alternatives) are necessary to trigger a slow
way of saving module context for interaction evaluation where
a module-path index shift was getting lost.
Previously the relevant predicates where disjoint, and until this commit
the only predicate that recognizes #f was `not`. So it's necessary to fix
two reductions to allow other predicates that recognize #f, like `boolean?`.
Add a hidden `true-object?` primitive that recognizes only #t, that is also
useful to calculate unions and complements with `boolean?` and `not`.
Also, extend a special case for expressions like
(or (symbol? x) (something))
where the optimizer is confused by the temporal variable that saves the
result of `(symbol? x)`, and the final expression is equivalent to
(let ([temp (symbol? x)])
(if temp #t (something)))
This extension detects that the temporal variable is a `boolean?` and
reduces the expression to
(if (symbol? x) #t (something))
* Wrong contract for syntax-local-value in the documentation.
* Clarified signature in documentation for expand-import, expand-export and pre-expand-export
* Corrected typo in documentation for "for".
* Fixed error message for function which seems to have been renamed in the docs
* Fixed typo in a comment in the tests
* Fixed a typo in the documentation for set-subtract.
* Use double ellipses for the free-id-table-set*, free-id-table-set*!, bound-id-table-set* and bound-id-table-set*! operations
The optimizer assumed a fixnum result if either argument to
`bitwise-and` implies a fixnum result. That's not correct if the
fixnum agument is negative.
Thanks to Peter Samarin for a bug report.
Merge to v6.7
When calling a procedure that is attached as a
`prop:rename-transformer` property value, make sure that
any available expansion context is accessible as reflected by
`(syntax-transforming?)`.
Syntax parameters as rename transformers particularly rely on that
information for local expansion.
Thanks to Jay for the "stxparam.rktl" test.
Closes#1479
In a pattern like
a*b
a naive attempt to match will take quadratic time on an input that
contains all "a"s an no "b". To improve that case, the regexp compiler
detects that a match will require a "b" and checks the input for a "b"
to enable linear-time failure.
That optimization mishandled `(?!...)` and `(?<!...)` patterns,
treating the must-not-match subpatterns as things that must match.
So,
(regexp-match "a*(?!b)" "aaaxy")
returned false, because the input doesn't contain "b".
Thie commit repairs the optimization.
Closes#1468
Fix a regression relative to v6.4 caused by a refactoring of the
compiler between v6.4 and v6.5. The refactoring lost information about
letrecs that are converted internally to let* when a mutable variable
is involved, and it ends up allocating a closure before the box of a
mutable variable that is referenced by the closure. Something like
`with-continuation-mark` is needed around the closure's `lambda` to
prevent other optimizations from hiding the bug.
Closes#1462
The `if` case of the compiler's space-safety pass abused its "last
non-tail call relative to the closest enclosing binding" state as
"last non-tail call relative to the enclosing run time", which could
cause it to not clear a stack position as needed to maintain space
safety.
This makes two changes to the forms in racket/splicing to adjust how
syntax properties are propagated through expansion:
1. Uses of make-syntax-introducer are passed #t as the first argument,
which causes the introduced scope to be consider a use-site rather
than macro-introduction scope. This prevents syntax objects from
being unnecessarily marked as unoriginal, in the syntax-original?
sense.
2. Uses of syntax/loc have been adjusted to copy syntax properties
from original syntax objects, which were previously discared. Forms
that were spliced into the surrounding context, such as begin,
define-values, and define-syntaxes, recreated the top-level syntax
objects, which did not preserve syntax properties from the
originals.
This is not a perfect solution, mostly because it potentially elides
properties that may be associated with captured literals (that is,
properties attached directly to begin, define-values, or define-syntaxes
identifiers themselves). However, it seems to accommodate most of the
common use-cases: propagation of syntax-original?-ness and forms like
`struct`, which attach properties like 'sub-range-binders.
fixes#1410
A `struct-copy` form can generates a call for a constructor that
includes a sequence of `unsafe-struct-ref` arguments. Each
`unsafe-struct-ref` must still check for a chaperone. Make the JIT
recognize that pattern an turn it into a single test instead of
one test per `unsafe-struct-ref`.
Make the optimizer recognize and track `make-struct-property-type`
values, and use that information to recognize `make-struct-type`
calls that will defnitely succeed because a property that hs no
guard is given a value in the list of properties.
Combined with the change to require-keyword expansion, this
change allows the optimizer to inline `f` in
(define (g y)
(f #:x y))
(define (f #:x x)
(list x))
because the `make-struct-type` that appears between `g` and `f`
is determined to have no side-effect that would prevent `f` from
having its expected value.
When a module defines and exports an identifier at two phases,
and when another module imports both of them at the same phase,
an error was not reported as it should have been.
Some expressions are omittable only when the arguments have certain types.
In this case the application is marked with APPN_FLAG_OMITTABLE instead of relaying on the flags of the primitive.
The optimizer can't use this flag to move the expression inside a lamba or across a potential continuation capture, unlike other omittable expressions. They can be moved
only in more restricted conditions.
For example, in this program
#lang racket/base
(define n 10000)
(define m 10000)
(time
(define xs (build-list n (lambda (x) 0)))
(length xs)
(define ws (list->vector xs)) ; <-- omittable
(for ([i (in-range m)])
(vector-ref ws 0))) ; <-- ws is used once
If the optimizer moves the expression in the definition of ws inside the recursive
lambda that is created by the for, then the code is equivalent to:
#lang racket/base
(define n 10000)
(define m 10000)
(time
(define xs (build-list n (lambda (x) 0)))
(length xs)
(for ([i (in-range m)])
(vector-ref (list->vector xs) 0))) ; <-- moved here
And the new code is O(n*m) instead of O(n+m). This example is a minimized version
of the function kde from the plot package, where n=m and the bug changed the run
time from linear to quadratic.
The application of some procedures are omittables when the arguments have
certain properties. Check the arity of the procedure before marking the application as omittable.
The only case that appears to be relevant is the expression (-).
The relevant predicates are almost disjoint. The superposition
is solved with predicate_implies and predicate_implies_not.
This is also valid considering the equivalence classes modulo
eqv? and equal?. So if the optimizer knows that two expressions
X and Y have different relevant types, then it can reduce
(equal? X Y) ==> (begin X Y #f).
Allow a `struct` form to be recognized when it provides
a number as the 8th argument to `make-struct-type`. In
particular, that change allows the construction of
optional-keyword functions to be recognized as a
purely functional operation.
Also, allow the optimizer to use information about imports
when deciding whether a module-level form is functional.
It's ok to use that information, because the validator has
it, too.
This combination of changes allows something like
(define (f #:optional [x #f])
(later))
(define (later) ....)
to compile to a reference to `later` wihout a check.
Due to an obvious problem in the setup, the letrec-check pass wasn't
running an intended dead-code pruning pass. Correcting the problem
cuases one test in "optimize.rktl" to change, because the letrec-check
pass can see more in one case than thanother.
(Problem discovered by accidentally fixing the setup in a Racket
branch based on "linklets".)
Along with the `PLT_COMPILED_FILE_CHECK` environment variable, allows
the timestamp check to be disabled when deciding whether to use a
compiled bytecode file.
In accomodating this change, `raco make` and `raco setup` in all modes
check whether the SHA1 hash of a module source matches the one
recorded in its ".dep" file, even if the timestamp on the bytecode
file is newer. (If the compile-file check mode is 'exists, the
timestamp is completely ignored.)
After refactoring the test for the inferred types of some procedures that
use vector?/bytes?/string?/list? it was easier to spot the missing information.
Note that in the documentation, some arguments like the position in
(vector-ref <vector> <position>)
are documented as exact-nonnegative-integer? but due to the implementation
details they are actually in a subset of fixnum?s.
When `read` parses a literal hash table, it inserts an placeholder
just in case it's needed for cycles. The `unsafe-immutable-hash-...`
operations in some cases did not detect and remove the placeholder.
Closes#1376
Merge to v6.6
The namespace returned by `variable-reference->namespace` (or
`namespace-anchor->namespace`) may be used via `eval` to define new
bindings, so enable top-level binding support for the namespace.
for example, make the optimizer convert something like
(struct a (x))
(lambda (v) (if (a? v) (a-x v) #f))
to
(struct a (x))
(lambda (v) (if (a? v) (unsafe-struct-ref v 0) #f))
The optimizer change in e887fa56d1 recognized struct declarations that
involved only whitelisted properties to guarantee that constructor
properties are preserved --- while `prop:chaperone-unsafe-undefined`
can affect the constructor, and other properties might imply that one.
But the optimizer's transformer aren't actually invalidated by
`prop:chaperone-unsafe-undefined`; the JIT's assumptions are affected,
but that's handled in a different way. So, remove the whitelist and
allow any property list.
The optimizer tries to reduce the `if` assuming that the result will be used.
In case it later detects that the result will be ignored, it can try to
apply some additional reductions to the branches and to the whole expression.
This function exposes the fast subset operation that is built in for
immutable hash tables (and used by the set-of-scopes implementation).
Also, make the space optimization implicit for `eq?`-based hash tables
that contain only #t values (instead of explicit and only available
internally). It turns out to be easy and efficient to make the
representation automatic, because the HAMT implementation can support
a mixture of nodes with some containing explicit values and some
containing implicit #t values.
When the properties argument for `make-struct-type` is non-empty,
then we cant; guarantee that `make-struct-type` succeeds, but
if it does, then we can still know that the result is a structure
type and (as long as `prop:chaperone-unsafe-undefined` is not
involved) the properties don't affect the constructor, predicate,
selector, or mutators.
This fixes an immediate problem, but the macro expander should have
complained about an unbound `maybe` at phase 2. (A new implementation
of the macro expander detected the unbound `maybe`.)
When an array value is provided, make sure that it's an array
with at least the expected length (or longer) and same element
layout. That's weaker than checking that the array elements have
the right type, because an `eq?` check at the ctype layer seems
too strong, and the ctype API doesn't provide enough information
for a more flexible equality.
A phase shift was mising on `begin-for-syntax`es introduced by
`syntax-local-lift-module-end-declaration`, which is in turn
used to implement` module+`, so `module+` didn't work under
two or more `begin-for-syntaxes`.
Closes#1312
Syntax objects generally make sense as properties in other syntax
objects, but they require special care when marshaling to bytecode
(as syntax objects do in general). To make that special handling
possible and reliable, constrain the shape of allowed values.
The name `path-extension` created a conflict for an existing
registered package, so it should not have been added to
`racket/path`.
Also, `path-get-extension` was intended to work on a path
that is syntactically a directory, so fix and test that.
Change the one expansion mode as far as I can tell) that disables
lifts so that lifts are now allowed, which means that
`(syntax-transforming?)` implies `(syntax-transforming--with-lifts?)`.
The old documentation incorrectly characterized when lifts
were allowed. Ryan noticed the documentation problem, and that
observation led to this simplication.
A `#:name` identifier picks the name that is bound to static
information about a structure type. An `#:extra-name` identifier
specifies an additional name to be bound to the information.
This pair of options is analogous to `#:constructor-name`
and `#:extra-constructor-name`.
Based on Jen Axel's suggestion and implementation.
Closes#1309
Provide a cleaned-up set up path-extension functions. In contrast
to `path-{add,replace}-suffix` and `filename-extension`, a dot
at the beginning of a path element is not treated as an extension
separator. Also, `path-extension` returns an extension including
its separator, which is more consistent with other extension
functions.
The new `path-has-extension?` function replaces many uses of
regexp matching in the base collections.
Closes#1307
Reduce (unbox (box x)) => x
Extend the reductions for cXr to the unsafe versions, for
example reduce (unsafe-car (cons x y)) => x
Check and save types in unsafe operations
Pass a string to the handler to describe the problem.
Also, fix minor issues (GC registration, contracts and `history`
in docs) and make `pregexp`, etc., report compilation errors as
`pregexp`, etc.
Allow `system-type` on non-Windows platforms to run `uname` to get
machine information, even in a sandbox or other contexts with a
limiting secutiry guard.
Check that it works to apply a continuation that shares with
an enclosing continuation, where a runstack overflow happens
between the continuations.
Closes PR 15281
While expanding a module, the root of module-relative references is a
fresh notion of "this module".
After expansion, "this module" is shifted to "an expanded module",
which is a global constant (for top-level modules). When an expanded
module is re-expanded, "an expanded module" is shifted to a fresh
"this module" during re-expansion, and so on.
One problem with this approach is that the shift from "this module" to
"an expanded module" isn't applied to syntax properties --- but
there's some extra trickery to make it work out by mutating "this
module" to make it look like "an expanded module".
Submodule expansion introduces an intermediate "parent of this module"
that wasn't currently covered by the extra trickery, so fix that.
Repair a mismatch between `syntax-local-lift-expression` and the
way that `compile` tries to avoid creating bindings while
compiling a top-level `define` form.
Closes#1284 and #1282
A syntax property is added as preserved or not. For backward
compatibility, the default for a 'paren-shape key is preserved, and
any other key's default is non-preserved.
Cross-module inlining that pulls a variable reference across a
module boundary imposes a more struct requirement that run-time
"constant" detection is consistent with the optimizer's view of
"constant" within a module. So, make sure they're the same.
Sometimes the optimizer removes all the references to a variable but it
doesn't detect that the variable is unused, so it keeps the definition.
Later, the sfs detects the unused variable so it marks it, but it doesn't
remove the let form.
Formerly, cross-module inlining would not work for a function like
(define (f x)
(if .... .... (slow x)))
unless `slow` was also inlined into `f`. This commit changes
cross-module inlining so that it allows a call to `f` to be replaced
with an expression that references other module-level bindings (that
are not primitives), such as `slow`.
Adjusting the inlining rules can always make some program worse. In
this case, a hueristic about whether to export an optimized or
unoptimized variant of a fnuciton for inlining tends to collide with
the adjusted inlining rule, so this commit tweaks that heuristic, too.
Enable the optimizer to figure to figure out that a loop
argument is always a real number, for example, in much the
same way that it can detect fixnums and flonums for unboxing.
Unboxing information was only needed at the resolve level,
but `real?` information is useful only to the optimizer, so
the generalization enables the optimizer to reach
approximations of type information earlier (e.g., among
a subset of a function's arguments).
Simplify `(wcm <k1> <v1> (wcm <k1> <v2> <e>))` to
`(begin <v1> (wcm <k1> <v2> <e>))` for a simple enough <k1>.
A variable simple enough, so this is useful for improving
errortrace output.
Compute an `equal?` hash code for `read`able values that
is a constant, at least for a given version of Racket. Only
(interned) symbols failed to have that property before.
With the old representation of local variables, optimize_info_lookup
had to search the stack for the frame with the information about the
variable. This was complicated so it has many flags to be used in
different situations and extract different kind of information.
With the new representation this process is easier, so it's possible
to split the function into a few smaller functions with an easier
control flow.
In particular, this is useful to avoid marking a variable as used
inside a lambda when the reference in immediately reduced to a
constant using the type information.
The iterator saves the return points in a list. For small immutable hashes,
encode the values in the list in the bits of a fixnum to avoid allocations.
Expose tagged allocation and a function that interprets a description
of tagged shapes. As a furst cut, the description can only specify
constant offsets for pointers within the object, but future extensions
are possible.
When a chaperone-wrapped function leads to a slow-path tail
call, the continuation-mark depth can be made too deep when
resolving the slow tail call.
Closes#1265
Reduce
(eq? v v) ==> #t
(if t v v) ==> (begin t v)
(if v v #f) ==> v
when v is a local or a top level variable.
Previously, the last two reductions were used only
with local variables.
Also, move the (if x #t #f) ==> (not x) reduction
after branch optimization.
When a key is removed at a level that other only has a collision
table, the HAMT representation was not adjusted properly by
eliminating the layer. As aresult, table comparison via
`equal?` could fail. The problem could show up with hash tables
used to represent scope sets, where an internal "subset?" test
could fail and produce an incorrect binding resolution.
The transformation from
(begin (let <bindings> (begin <e1> ...)) <e2> ...)
to
(let <bindings> (begin <e1> ... <e2> ...))
makes things look simpler and might help the optimizer a little. But
it also tends to make the run-time stack deeper, and that slows some
programs a small but measurable amount.
A better solution would be to keep the transformation but add another
pass that moves expressions out of a `let`.
Since this operation only moves the code and doesn't make the final
bytecode bigger, it's not necessary to decrease the fuel and then it
is available for further inlining.
The calculation of used variables in a possibly unused function did
not work right when the function is referenced by a more deeply
nested function that itself is unused. The extra uses triggered by
more nested uses need to be registered as tentative in the more nested
frame, not in the outer frame.
Closes#1247
Correct the second-biggest design flaw in the bytecode optimizer:
instead of using a de Bruijn-like representation of variable
references in the optimizer pass, use variable objects.
This change is intended to address limitations on programs like the
one in
http://bugs.racket-lang.org/query/?cmd=view&pr=15244
where the optimizer could not perform a straightforward-seeming
transformation due to the constraints of its representation.
Besides handling the bug-report example better, there are other minor
optimization improvements as a side effect of refactoring the code. To
simplify the optimizer's implementation (e.g., eliminate code that I
didn't want to convert) and also preserve success for optimizer tests,
the optimizer ended up getting a little better at flattening and
eliminating `let` forms and `begin`--`let` combinations.
Overall, the optimizer tests in "optimize.rktl" pass, which helps
ensure that no optimizations were lost. I had to modify just a few
tests:
* The test at line 2139 didn't actually check against reordering as
intended, but was instead checking that the bug-report limitation
was intact (and now it's not).
* The tests around 3095 got extra `p` references, because the
optimizer is now able to eliminate an unused `let` around the
second case, but it still doesn't discover the unusedness of `p` in
the first case soon enough to eliminate the `let`. The extra
references prevent eliminating the `let` in both case, since that's
not the point of the tests.
Thanks to Gustavo for taking a close look at the changes.
LocalWords: pkgs rkt
Found with `-fsanitize=undefined`. The only changes that are potentially
bug repairs involve some abuses of pointers that can end up misaligned
(which is not an x86 issue, but might be on other platforms). Most of
the changes involve casting a signed integer to unsigned, which
effectively requests the usual two's complement behavior.
Some undefined behavior still present:
* floating-point operations that can divide by zero or coercions
from `double` to `float` that can fail;
* offset calculations such as `&SCHEME_CDR((Scheme_Object *)0x0)`,
which are supposed to be written with `offsetof`, but using
a NULL address composes better with macros.
* unaligned operations in the JIT for x86 (which are ok, because
they're platform-specific).
Hints for using `-fsanitize=undefined`:
* Add `-fsanitize=undefined` to both CPPFLAGS and LDFLAGS
* Add `-fno-sanitize=alignment -fno-sanitize=null` to CPPFLAGS to
disable those checks.
* Add `-DSTACK_SAFETY_MARGIN=200000` to CPPFLAGS to avoid stack
overflow due to large frames.
* Use `--enable-noopt` so that the JIT compiles.
The `alarm-evt` tests are inherently racy, since they depend on
the scheduler polling quickly enough. The old time values were
close enough that a test failure is particularly likely on
Windows, where the clock resolution is around 16ms. To reduce
failures, make the time differents much bigger.
Closes issue #1232
A reference to a local may be reduced in a branch to a constant, while it's unchanged in the
other because the optimizer has different type information for each branch. Try to use the
type information of the other branch to see if both branches are actually equivalent.
For example, (if (null? x) x x) is first reduced to (if (null? x) null x) using the type
information of the #t branch. But both branches are equivalent so they can be
reduced to (begin (null? x) x) and then to just x.
The functions expr_implies_predicate was very similar to
expr_produces_local_type, and slighty more general.
Merging them, is possible to use the type information
is expressions where the optimizer used only the
local types that were visible at the definition.
For example, this is useful in this expression to
transform bitwise-xor to it's unsafe version.
(lambda (x)
(when (fixnum? x)
(bitwise-xor x #xff)))