With the old representation of local variables, optimize_info_lookup
had to search the stack for the frame with the information about the
variable. This was complicated, so it had many flags to be used in
different situations and to extract different kinds of information.
With the new representation this process is easier, so it's possible
to split the function into a few smaller functions with simpler
control flow.
In particular, this is useful to avoid marking a variable as used
inside a lambda when the reference is immediately reduced to a
constant using the type information.
The iterator saves the return points in a list. For small immutable hashes,
encode the values in the list in the bits of a fixnum to avoid allocations.
Expose tagged allocation and a function that interprets a description
of tagged shapes. As a first cut, the description can only specify
constant offsets for pointers within the object, but future extensions
are possible.
When a chaperone-wrapped function leads to a slow-path tail
call, the continuation-mark depth can be made too deep when
resolving the slow tail call.
Closes #1265
Mostly just fill in some corners, but also fix a bug with lifted
functions that accepted a boxed argument and had fewer than three
arguments total.
The `tests/racket/test` test suite now passes with
`PLT_RECOMPILE_COMPILE` set --- except for the "optimize.rktl" test
suite, where more work is needed to ensure that optimizations
don't get lost.
Reduce
(eq? v v) ==> #t
(if t v v) ==> (begin t v)
(if v v #f) ==> v
when v is a local or a top-level variable.
Previously, the last two reductions were used only
with local variables.
Also, move the (if x #t #f) ==> (not x) reduction
after branch optimization.
When a key is removed at a level that otherwise only has a collision
table, the HAMT representation was not adjusted properly by
eliminating the layer. As a result, table comparison via
`equal?` could fail. The problem could show up with hash tables
used to represent scope sets, where an internal "subset?" test
could fail and produce an incorrect binding resolution.
The transformation from
(begin (let <bindings> (begin <e1> ...)) <e2> ...)
to
(let <bindings> (begin <e1> ... <e2> ...))
makes things look simpler and might help the optimizer a little. But
it also tends to make the run-time stack deeper, and that slows some
programs a small but measurable amount.
A better solution would be to keep the transformation but add another
pass that moves expressions out of a `let`.
Since this operation only moves the code and doesn't make the final
bytecode bigger, it's not necessary to decrease the fuel, which then
remains available for further inlining.
The calculation of used variables in a possibly unused function did
not work right when the function was referenced by a more deeply
nested function that itself is unused. The extra uses triggered by
more nested uses need to be registered as tentative in the more nested
frame, not in the outer frame.
Closes #1247
On Unix and OS X, the check to avoid replacing an existing
file or directory is made by Racket, rather than the OS,
so don't claim a system error if the operation fails for
that reason.
Also, update the docs to clarify that the check is not
atomic with the move.
Closes issue #1158
Correct the second-biggest design flaw in the bytecode optimizer:
instead of using a de Bruijn-like representation of variable
references in the optimizer pass, use variable objects.
This change is intended to address limitations on programs like the
one in
http://bugs.racket-lang.org/query/?cmd=view&pr=15244
where the optimizer could not perform a straightforward-seeming
transformation due to the constraints of its representation.
Besides handling the bug-report example better, there are other minor
optimization improvements as a side effect of refactoring the code. To
simplify the optimizer's implementation (e.g., eliminate code that I
didn't want to convert) and also preserve success for optimizer tests,
the optimizer ended up getting a little better at flattening and
eliminating `let` forms and `begin`--`let` combinations.
Overall, the optimizer tests in "optimize.rktl" pass, which helps
ensure that no optimizations were lost. I had to modify just a few
tests:
* The test at line 2139 didn't actually check against reordering as
intended, but was instead checking that the bug-report limitation
was intact (and now it's not).
* The tests around 3095 got extra `p` references, because the
optimizer is now able to eliminate an unused `let` around the
second case, but it still doesn't discover the unusedness of `p` in
the first case soon enough to eliminate the `let`. The extra
references prevent eliminating the `let` in both cases, since that's
not the point of the tests.
Thanks to Gustavo for taking a close look at the changes.
Found with `-fsanitize=undefined`. The only changes that are potentially
bug repairs involve some abuses of pointers that can end up misaligned
(which is not an x86 issue, but might be on other platforms). Most of
the changes involve casting a signed integer to unsigned, which
effectively requests the usual two's complement behavior.
Some undefined behavior is still present:
* floating-point operations that can divide by zero or coercions
from `double` to `float` that can fail;
* offset calculations such as `&SCHEME_CDR((Scheme_Object *)0x0)`,
which are supposed to be written with `offsetof`, but using
a NULL address composes better with macros.
* unaligned operations in the JIT for x86 (which are ok, because
they're platform-specific).
Hints for using `-fsanitize=undefined`:
* Add `-fsanitize=undefined` to both CPPFLAGS and LDFLAGS
* Add `-fno-sanitize=alignment -fno-sanitize=null` to CPPFLAGS to
disable those checks.
* Add `-DSTACK_SAFETY_MARGIN=200000` to CPPFLAGS to avoid stack
overflow due to large frames.
* Use `--enable-noopt` so that the JIT compiles.
In some cases, for example while using no_types, the optimizer can try to
add the type information of a local variable again. This creates unnecessary
internal storage to save the repeated information.
A reference to a local may be reduced in a branch to a constant, while it's unchanged in the
other because the optimizer has different type information for each branch. Try to use the
type information of the other branch to see if both branches are actually equivalent.
For example, (if (null? x) x x) is first reduced to (if (null? x) null x) using the type
information of the #t branch. But both branches are equivalent so they can be
reduced to (begin (null? x) x) and then to just x.
The function expr_implies_predicate was very similar to
expr_produces_local_type, and slightly more general.
Merging them makes it possible to use the type information
in expressions where the optimizer previously used only the
local types that were visible at the definition.
For example, this is useful in the following expression to
transform bitwise-xor into its unsafe version.
(lambda (x)
  (when (fixnum? x)
    (bitwise-xor x #xff)))
Support "Unix-style" (as opposed to "in-place") installation for
OS X, which is mostly a matter of putting ".app" files in the
right place and correcting relative references.
Intended to fix #1180
This code uses call-with-values and case-lambda to check the number of
values that the original function inside the contract returns.
The case-lambda creates new closures because it has references
to local variables.
In these cases, it's possible to avoid the closure creation by saving the
results in temporary variables that are used later, outside the case-lambda.
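A minimal sketch of the pattern (the wrapper name here is made up; the
real code lives in the contract implementation):
#lang racket/base
;; Dispatch on the number of values that the wrapped function returns.
(define (count-checking-wrapper f)
  (lambda args
    (call-with-values
     (lambda () (apply f args))
     (case-lambda
       [(v) v]                    ; one result
       [vs (apply values vs)])))) ; any other number of results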
drop the tail call fanciness
"simple enough", for now, means that it is a struct selector, predicate,
constructor, or mutator. Perhaps we will learn some other way to
identify simple procedures for which this is safe.
This commit speeds up this program:
#lang racket/base
(require racket/contract/base)
(struct s (x))
(define f (contract (-> any/c integer?) s-x 'pos 'neg))
(define an-s (s 1))
(time
 (for ([x (in-range 10000000)])
   (f an-s)))
by about 1.9x
Skip calling the domain projection in that case and, if all of the
arguments are any/c, then also skip putting the contract continuation mark.
This appears to give about a 20% speed up on this program:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c integer?)
   (λ (x) 1)
   'pos 'neg))
(time
 (for ([x (in-range 4000000)])
   (f 1)))
The recently added fast path for property-only chaperones did not
propagate the original object in the case that the property-only
chaperone wraps a `chaperone-procedure*` chaperone.
Merge to v6.4
Fix `procedure-specialize` for a procedure that refers to a
syntax-object literal. A syntax-object literal becomes part of the
procedure closure, but in a special way that normally allows syntax
objects to be loaded on demand. For now, specialization counts as
a demand of the syntax object.
Merge to v6.4
After the JIT buffer becomes too full, some paths
don't bail out fast enough, so guard against
broken info in some relatively new uses of the info.
Merge to v6.4
When a procedure created by `unsafe-{chaperone,impersonate}-procedure`
is given the wrong number of arguments, the original procedure's name
should be used in the error message.
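For example (a sketch; `original` is a name made up here):
#lang racket/base
(require racket/unsafe/ops)
(define (original x) x)
(define wrapped
  (unsafe-chaperone-procedure original (lambda (x) (original x))))
;; (wrapped 1 2) ; the arity error should report `original`, not the wrapper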
This commit, combined with the use of unsafe-chaperone-procedure,
achieves almost the same speedups as c24ddb4a7, but now correctly.
More concretely, this program:
#lang racket/base
(module server racket/base
  (require racket/contract/base)
  (provide
   (contract-out
    [f (-> integer? integer?)]))
  (define (f x) x))
(require 'server)
(time
 (let ([f f]) ;; <-- defeats the plus-one-arity optimization
   (for ([x (in-range 1000000)])
     (f 1) (f 2) (f 3) (f 4) (f 5))))
runs only about 40% slower than the version without the "(let ([f f])"
and this program
#lang racket/base
(module m racket/base
  (provide f)
  (define (f x) x))
(module n typed/racket/base
  (require/typed
   (submod ".." m)
   [f (-> Integer Integer)])
  (time
   (for ([x (in-range 1000000)])
     (f 1) (f 2) (f 3) (f 4))))
(require 'n)
runs about 2.8x faster than it did before that same set of changes.
for contracts where the arity of the given function is exactly
the arity that the contract expects (i.e. no optional arguments
are turned into mandatory or dropped)
During inlining, the type information gathered in
code that was inside the lambda is copied to the outer
context. But the coordinates of the type information
were shifted in the wrong direction, so the type was
assigned to the wrong variable.
This bug is difficult to trigger, so the test is convoluted.
Merge to v6.4
Add 'module-body-inside-context, 'module-body-outside-context, and
'module-body-context-simple? properties to the expansion of a
`module` form. These properties expose scopes that are used by
`module->namespace` and that appear in marshaled bytecode.
The expression in a `define-runtime-path` form is used in
both a run-time context and a compile-time context. The
latter is used for `raco exe`. In a cross-build context,
you might need to load OpenSSL support for Linux (say)
at build time while generating executables that refer to
Windows (say) OpenSSL support. In that case, `#:runtime?-id`
lets you choose between `(cross-system-type)` and
`(system-type)`.
Merge to v6.4
In particular, instead of going directly back to the chaperone, handle
the case where the function doesn't accept keyword arguments with a
less expensive fallback.
The less expensive fallback uses a case-lambda wrapper (wrapped inside
a make-keyword-procedure) to close over the neg-party and avoid the
chaperone creation. With this commit, the program below gets about 3x
faster, and is only about 20% slower than the version that replaces
the "(let ([f f]) ...)" with its body
#lang racket/base
(module m racket/base
  (require racket/contract/base)
  (provide (contract-out [f (-> integer? integer?)]))
  (define (f x) x))
(require 'm)
(collect-garbage)
(time (for ([x (in-range 5000000)]) (let ([f f]) (f 1))))
Thanks, @samth!
OS X's libssl is deprecated, and it doesn't work with SSL connections
that need SNI. We'll distribute our own libssl builds for OS X via a
package, but we need a native implementation that works well enough to
get that package.
The 'secure protocol symbol is just a shorthand for
`(ssl-secure-client-context)`, but it helps highlight
that the default 'auto isn't secure, and having a plain
symbol smooths the connection to native Win32 and OS X
implementations of SSL.
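For example, a sketch using the `openssl` library:
#lang racket/base
(require openssl)
;; 'secure is shorthand for (ssl-secure-client-context):
(define-values (in out)
  (ssl-connect "racket-lang.org" 443 'secure))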
The getdtablesize() result appears not to be constant on OS X.
Creating some combination of CFStream objects, threads, and CFRunLoop
objects can cause the value to be increased by a factor of 10. Avoid
the need for getdtablesize() by switching from select() to poll().
this yields only a modest speed up, and only when iterating
over long sequences, e.g., in this program:
#lang racket/base
(require racket/sequence racket/contract/base)
(define s (make-string 10000 #\a))
(define str (contract (any/c . -> . (sequence/c char?))
                      (λ (x) (in-string s))
                      'pos 'neg))
(time
 (for ([x (in-range 100)])
   (for ([x (str 0)])
     (void))))
The repair involves making `raco exe` detect a sub-submodule
whose name is `declare-preserve-for-embedding` as an indication
that a submodule should be carried along with its enclosing module.
Normally, `define-runtime-module-path-index` would do that, but
the submodule for `place` is created with `syntax-local-lift-module`,
and the point of `syntax-local-lift-module` is to work in a
nested expression context where definitions cannot be lifted
to the enclosing module.
Made the hash-set chaperones essentially forward the hash chaperone
operations, but now explain them all in terms of set-based operations
in the docs.
Also adjusted value-blame and has-blame? to support late-neg projections
The issue is what happens when the actual function has other arities.
For example, if the function were (λ (x [y 1]) y) then it is not okay
to simply check if procedure-arity-includes? of 1 is true (what the
code used to do) because then when the function is applied to 2
arguments, the call won't fail like it should. It is possible to check
and reject functions that don't have exactly the right arity, but if
the contract were (-> string? any), then the function would have been
allowed and only when the extra argument is supplied would the error
occur. So, this commit makes it so that (-> any/c any) is like
(-> string? any), but with the optimization that if the procedure
accepts only one argument, then no wrapper is created.
This is a backwards incompatible change because it used to be the
case that (flat-contract? (-> any)) returned #t and it now returns #f.
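A small illustration of why the old arity check was too weak:
#lang racket/base
(define g (λ (x [y 1]) y))
(procedure-arity-includes? g 1) ; => #t, so the old check accepted g
(procedure-arity-includes? g 2) ; => #t, so (g 1 2) does not fail,
                                ; even though the contract promises one argument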
When a module defines <name-1> and doesn't export it, but imports
<name-2> and re-exports it, where <name-2> refers to another
module's definition of <name-1>, then <name-1> wasn't properly
registered as an unexported binding.
Most of the implementation change is just a clean-up of an
unnecessary traversal from before the addition of a `count`
field in each hash table.
Also, add `#:skip-filtered-directory?` to `find-files`.
Less significantly, adjust `pathlist-closure` to be consistent in the
way that it includes a separator at the end of a directory path.
When a tree marshaled to bytecode form has many shared pieces,
the unmarshaling process can lose track of the sharing in one
place and traverse too much of the structure as pieces are
loaded.
I was using match-let and got a syntax error that pointed to this file. After changing the match-let definition to use syntax/loc the error pointed to the exact spot causing the problem. Yay!
I changed quite a few vanilla syntax-quotes to the *syntax/loc forms... perhaps some did not need it? I'm not sure.
added back nested syntax/loc
This program runs about 10x faster than it did before this commit, but
seems to still be about 100x slower than the version where you change
an-s to just be (s).
#lang racket/base
(require racket/contract/base racket/generic)
(define-generics id [m id x])
(struct s () #:methods gen:id [(define (m g x) x)])
(define an-s
  (contract (id/c [m (-> any/c integer? integer?)])
            (s)
            'pos 'neg))
(time
 (for ([x (in-range 100000)])
   (m an-s 2)))
rest of the contract system, creating and using a slightly
more legitimate blame record and calling into the late-neg
projections instead of using `contract`
turns out to be a predicate.
In that case, just call it instead of creating all of the extra junk
that would normally be created by coercing the predicate to a contract
and invoking it
Use a job object to ensure that subprocesses that are meant
to be killed by the current custodian are reliably terminated
if Racket exits for any reason.
Although `procedure-specialize` should be useful mainly in places where
inlining does not apply, allow inlining and related optimizations
through it, anyway.
- use chaperone-hash-set for set/c when the contract allows only hash-sets
- add a #:lazy flag to allow explicit choice of when to use laziness
(but have a backwards-compatible default that, roughly, eschews laziness
only when the resulting contract would be flat)
Specifically, remove reliance on procedure-closure-contents-eq? to
tell when a pending check is stronger, in favor of using
contract-stronger?
Also, tighten up the specification of contract-stronger? to require
that any contract is stronger than itself
With this commit, this program gets about 10% slower:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c integer?)
   (λ (x) (if (zero? x)
              0
              (f (- x 1))))
   'pos 'neg))
(time (f 2000000))
because the checking is doing work more explicitly now, but because the
checking is more general, it identifies the redundant checking in this
program:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c integer?)
   (contract
    (-> any/c integer?)
    (λ (x) (if (zero? x)
               0
               (f (- x 1))))
    'pos 'neg)
   'pos 'neg))
(time (f 200000))
which makes it run about 13x faster than it did before
I'm not sure if this is a win overall, since the checking can be more
significant in the case of "near misses". For example, this
program, where neither the new nor the old checking detects the
redundancy, is about 40% slower after this commit than it was before:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c (<=/c 0))
   (contract
    (-> any/c (>=/c 0))
    (λ (x) (if (zero? x)
               0
               (f (- x 1))))
    'pos 'neg)
   'pos 'neg))
(time (f 50000))
(The redundancy isn't detected here because the contract system only
looks at the first pending contract check.)
Overall, despite the fact that it slows down some programs and speeds
up others, my main thought is that it is worth doing because it
eliminates a (painful) reliance on procedure-closure-contents-eq? that
inhibits other approaches to optimizing these contracts we might try.
After adding `procedure-specialize`, making
`procedure-closure-contents-eq?` work as before involves
a little extra tracking. I'd prefer to weaken or
even get rid of `procedure-closure-contents-eq?`, but
this adjustment keeps some contract tests passing.
The `procedure-specialize` function is the identity function, but it
provides a hint to the JIT to compile the body of a closure
specifically for the values in the closure (as opposed to compiling
the body generically for all closure instances).
This hint is useful to the contract system, where a predicate
is coerced to a projection with
(lambda (p?)
  (procedure-specialize
   (lambda (v)
     (if (p? v)
         v
         ....))))
Specializing the projection to a given `p?` allows primitive
predicates to be JIT-inlined in the projection's body.
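A runnable sketch of that coercion (the error message here is made up):
#lang racket/base
(define (predicate->projection p?)
  (procedure-specialize
   (lambda (v)
     (if (p? v)
         v
         (error 'projection "value does not satisfy ~a: ~e"
                (object-name p?) v)))))
((predicate->projection exact-integer?) 42) ; => 42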
Make the JIT-generated function-call dispatch recognize a call to
an impersonator that wraps a procedure only to hold properties.
This change also repairs handling for an arity-reducing wrapper
on a primitive, where the fast path incorrectly treated the
primitive as a JIT-generated function.
in particular, when there is a recursive contract, then we check only
some part of the first-order checks and see if that was enough to
distinguish the branches. If it was, we don't continue; otherwise, we do
the first-order check and the projection itself
can duplicate work (potentially lots of work
in a non-constant factor sort of a way when
recursive-contract is involved)
this seems also to be a potential problem for other
uses of or/c too
It used to have a (provide (except-out (all-from-out <private-file>) ...)),
and various private functions leaked to the outside over the years.
None of the ones removed in this commit were documented, so hopefully
they weren't being used. But this is definitely not backwards compatible,
so this commit is mostly about testing the waters
A value that starts "1", "y", or "Y" enables incremental mode
permanently (any value was allowed formerly), while a value that
starts "0", "n", or "N" causes incremental-mode requests to be
ignored.
When a major GC triggers finalization, another major
GC is scheduled immediately on the grounds that the
finalizer may release other values. That was important
at one time, but the finalization and weak-reference
implementation has improved to the point where the
extra full GC no longer seems necessary or useful.
Originally, generation 1/2 was intended to delay major
collections when the heap is especially large. It doesn't
seem to be effective in that case, and it can slow down
minor GCs, so continue to use it only in incremental
mode (where it helps significantly with fragmentation).
At the completion of an incremental major GC, if incremental
mode wasn't requested recently, schedule an immediate major
GC to reduce the heap back to its normal footprint.
A collection's "info.rkt" might have `(define compile-omit-paths
'all)` but also other setup actions, so don't completely ignore
a collection directory just because there's nothing to compile.
Also, make the nursery 1/4 as big in incremental mode, and
correspondingly tune the amount of work to perform per
collection by 1/4 its old value. Using generation 1/2
reduces fragmentation that would otherwise be increased
by a smaller nursery.
The main intent of this change is to raise the buffer size
for ARM, but that would leave only 32-bit x86 at size 100, so
just make the size large for all machines.
- uniformly remove the extra layers of calls to unknown functions for
chaperone-of? checks that make sure that chaperone contracts are
well-behaved (put those checks only in contracts that are created
outside racket/contract)
- clean up and simplify how missing projection functions are created
(val-first vs late-neg vs the regular ones)
- add some logging to more accurately tell when late-neg projections
aren't being used
- port the contract combinator that ->m uses to use late-neg
- port the </c combinator to use late-neg
Incremental GC skips the old-generation compaction phase.
For most applications, that's ok, but it's possible for
fragmentation to get out of hand. Detect that situation
and fall back to non-incremental mode for one major GC.
Memory accounting is enabled on demand; if demand goes
away --- as approximated by no live custodians having
a limit or previously been queried for memory use ---
then stop accounting until demand resumes.
Fix the case that an old-generation finalizer ends up
on a modified page after all old-generation marking
is complete.
Also, make sure ephemerons are checked after previous
marking that may have left the stack empty.
The allocation strategy for immobile objects avoids some fragmentation
in non-incremental mode, but it interferes with finishing up an
incremental major collection, so trade some fragmentation for
an earlier finish (which is far more likely to use less memory
instead of more, despite extra fragmentation).
Really, just improve when major GCs are forced to trigger
further finalizations. This improvement makes `(collect-garbage)`
followed by `(collect-garbage 'incremental)` move more
reliably into incremental mode.
When `(collect-garbage 'minor)` is combined with incremental
mode, do less incremental work. At the same time, don't
skip an incremental GC just because a major GC is ready.
Although calling `(collect-garbage 'incremental)` in a program with
a periodic task is the best way to request incremental collection, it's
handy for some experiments to have an environment variable that turns
it on permanently.
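For example, a program's periodic task might look like this sketch:
#lang racket/base
;; Re-request incremental mode on each iteration, since a request
;; sticks only until the next major GC.
(define (periodic-task)
  (collect-garbage 'incremental)
  (sleep 1)
  (periodic-task))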
This change also makes incremental-mode minor collections log as "mIn"
instead of "min", and it changes the first field of the logged
`gc-info` structure to be a mode symbol instead of a boolean.
Incremental GC now works well enough to be useful for some programs
(e.g., games). Memory accounting is still not incremental, so DrRacket
(and running programs in DrRacket) does not really support incremental
collection, although pause times can be much shorter in incremental
mode than by default.
This step is semi-incremental, in that it can happen during
incremental collection, but all finalization is performed at once.
This will work well enough if the number of finalizers, weak boxes,
etc., is small enough relative to the heap size.
Casting a `uintptr_t` to `double` seems not to round to
nearest, so keep all bits while moving to `double` and
use arithmetic to combine them (since the rounding mode
is used correctly for arithmetic).
The strategy of converting a bignum to a flonum by converting on word
boundaries can lose one bit of precision. (If the use of a word
boundary causes a single bit to get rounded away, but the first bit of
the next word is non-zero, then the rounding might have been down when
it should have been up.)
Avoid the problem by aligning relative to the high bit, instead.
Fix even basic reading when extflonums are not supported, but
also fix reading extflonums with large exponents (related to
the other recent changes to number parsing).
Allow a more dynamic (than `impersonator-prop:application-mark`)
determination of continuation marks and associated values to wrap the
call of an impersonated procedure.
When an internal-definition context is used with `local-expand`,
any bindings added to the context affect expansion, but the bindings do
not appear in the expansion. As a result, Check Syntax was unable to
draw an arrow from the `s` use to its binding in
(class object%
(define-struct s ())
s)
The general solution is to add the internal-definition context's
bindings to the expansion as a 'disappeared-bindings property. The new
`internal-definition-context-track` function does that using a new
`internal-definition-context-binding-identifier` primitive.
Mishandling of the `require`-binding table could cause
`racket/private/pre-base` to export `andmap` as syntax, for example,
instead of as a variable. The syntax-versus-variable distinction
doesn't usually matter, but it affects the order of exports in
bytecode form.
Even though `dynamic-require` might lead to loading source, the
path into the compiler for that source will force compile-time code
as needed.
One benefit of this change is that `racket -l pict3d` takes about half
as long, because `racket/gui` includes a `dynamic-require` to load a
platform-specific back-end, while `pict3d` can pull in a lot of
compile-time code to cooperate with Typed Racket.
Getting NULL from CTFontCollectionCreateMatchingFontDescriptors()
might indicate a font installation problem; I'm not sure. In any case,
checking for NULL avoids a crash on at least one installation.
This bug is already fixed in the Cairo source repo, so we
can discard the patch on the next Cairo upgrade.
It's not clear which platforms are affected. On OS X, at least,
writing to a global constant can cause a crash.
Thanks to Spencer for making a small example that triggers the bug
(added to the "draw-test" package).
The CTFontCreatePathForGlyph() function can return NULL when
the glyph exists but has an empty path. Instead of treating that
as failure, which causes Cairo to generate a bitmap version of
the glyph, check that the glyph is mapped for bounding boxes,
and treat a NULL path as an empty path in that case.
In #956, @gus-massa warned that `syntax-local-infer-name` was changed
in a breaking way, but the implications were not clear. At a minimum,
identifiers need to be treated like symbols, so that `mzlib/contract`
name inference works right. I'm erring more generally on the side
of keeping the old behavior for anything other than pair-based
trees.
Closes #1117.
The `--enable-extflonums` option doesn't really do anything, since
extflonum support is enabled automatically when the compiler's
configuration allows it. To make this slightly less confusing, report
an error when extflonums cannot be supported, despite
`--enable-extflonums`. The error is reported via compiling, instead of
via `configure`, but hopefully that's enough to be helpful.
Continuing with 2f25a1e2bd...
On further reflection, a GC is possible because a
thread swap is possible, and that's asking for trouble.
Disallow thread swaps (and, incidentally, GCs) while
comparing scope propagations from the cache.
Merge to v6.3
Repairs a problem with d719c06e00.
A GC can happen while checking whether a cache entry matches,
in which case the cache is cleared, so don't check the cache
slot again after comparing.
Merge to v6.3
Repairs 3eb2c20ad0, which used a scope-set comparison for
a table that maps scopes to propagation actions (add, remove,
or flip).
Closes #1113
Merge to v6.3
So now (-> any/c integer?) will avoid the chaperone wrapper when the
function is a struct predicate while simultaneously supporting the
"extra argument neg party" protocol
In `syntax-local-lift-require`, avoid scope adjustments intended
to deal with `require` forms that are compiled in one namespace
and evaluated in another.
This makes two changes to `(or ...)` pattern compilation.
* Avoid reordering the individual elements of an `or` pattern.
Since this reordering has broken programs with `and` patterns,
avoid it here as well.
* Avoid re-ordering sets of patterns that _contain_ an `or`. This
is not semantically important for match itself, but Typed Racket
relies on the previous behavior.
Closes racket/typed-racket#150.
Merge to 6.3
This change ensures that the `reorder?` flag is passed to recursive
calls to `compile` correctly. Related to racket/frtime#1, which is
probably now fixed.
Merge to 6.3.
This changes how multi-in is implemented so that the location for each
expanded element in the final require spec is tied to the last relevant
module path element. This allows DrRacket to intelligently show arrows
linking each imported binding with a relevant piece of the multi-in
import spec.
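For example:
#lang racket/base
(require racket/require                   ; provides `multi-in`
         (multi-in racket (list string))) ; racket/list and racket/string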
When `or` has many subexpressions, the expansion generates a
sequence of deeply nested `let`s, where original and macro-introduced
forms are interleaved in a way that defeats a minimal
child-is-same-as-parent sharing of scope sets. Add a small
cache that's good enough to capture extra sharing and
dramatically lower memory use for an `or` that has 1000
subexpressions.
The `setup/winstrip` step was run too late. As an extra measure,
make `setup/winstrip` more precise about the files it
will discard.
Merge to v6.3
When an import is shadowed by another import or by a definition, don't
include it in the set of bindings in the result of
`syntax-local-module-required-identifiers` or in the set that can be
exported by `all-from-out`.
Merge to v6.3
To make the API consistent for MSVC versus MinGW builds, make
a function formerly required for embedding on 32-bit Windows
always available and required for all Windows variants.
Building creates compiler-specific files in "lib/msvc"
or "lib/gcc". For consistency, strip those directories
when creating a distribution.
The newly added ".def" file provides information that
would otherwise be lost by removing the MSVC ".lib"
file from the distribution.
Removing the compiler-specific ".obj" files removes files that
used to be included for linking extensions. My guess
is that the files are now completely unused.
Recent versions of MinGW-W64 use emutls for `__thread` variables,
and that's much slower than Windows-native TLS. Go back to the
inline-assembly implementation of thread-local access.
Make the old-generation marking process incremental
on request, where `(collect-garbage 'incremental)`
makes a request.
Only the marking phase of an old-generation collection
is incremental, so far. In exchange for slower
minor collections and a larger heap, you get a major
collection pause time that is roughly halved. So, this
is a step forward, but not good enough for most purposes
that need incremental collection.
An incremental-mode request sticks until the next
major GC. The idea is that any program that could
benefit from incremental collection will have
some sort of periodic task where it can naturally
request incremental mode. (In particular, that
request belongs in the program, not in some external
flag to the runtime system.) Otherwise, the
system should revert to non-incremental mode, given
that incremental mode is slower overall and can
use much more memory --- usually within a factor of
two, but the factor can be much worse due to
fragmentation.
For a minor GC and pages that contain backpointers,
leave mark bits as they are; instead make a pass to
shift mark bits for new objects to "dead" bits, and
use dead bits for fixup.
This change is intended as a small step toward incremental
collection.
The rule for using generation 1/2 is based on the current
memory use versus the maximum size of generation 0. Recent
changes to the GC have caused that size to vary during
a collection, which means that the choice to use generation
1/2 or not can change within a collection.
Partial use of generation 1/2 doesn't inherently cause problems, but
it can cause a generation-1 object to point to a generation-1/2 object
even though the former was allocated after the latter. That's a
problem only if getting generations out of order relative to allocation
order can create problems. As it happens, reset_finalizer_tree()
checks the generation of the finalization record and not the finalized
pointer, because the record is always allocated after the pointer.
Merge to v6.3
Certain datatypes in the runtime system are not supposed
to be hashed, where bits normally reserved for hash codes
are used for other purposes. A bad bytecode file can cause
some of those to be hashed, anyway. Normally, the damage is
isolated to that content of the damaged bytecode, but
certain variable-reference bytecode forms are both shared
and non-hashable. Set a bit that ensures hashing will not
change flags in the shared object.
This problem was exposed by fuzz testing.
History of the parsing of "file:" URLs for Windows:
* In response to PR 8060 (April 2006): special handling added to
support ill-formed URLs that were (are?) commonly used for
filesystem paths.
* Follow-up to PR 8060 (April 2008): added `path->url` and
`url->path`.
* In response to #1086 (October 2015, the commit): changed
Windows-specific handling to be more constrained and added support
for the proper encoding of UNC paths --- where "proper" means
"according to a blog post from late 2006", which appears to be as
close as we get to documentation of the URL encoding for Windows
paths.
When a compiler is run in standards mode, predefined macros that
do not start with "_" are dropped, so use the "_" versions
consistently. Whether or not Racket itself would compile in
standards mode, the Racket headers should be able to work that
way --- at least on Unix platforms.
In Mac OS X 10.11, something about the use of exceptions triggers
a libunwind stack traversal, and that traversal runs into trouble
with Racket's stack mangling for threads. Inserting generated code
in the stack frame sequence causes libunwind to give up and avoids
a crash (e.g., with `-j -l drracket` on startup).
After some expansions, an expression with the syntax property 'inferred-name of
'x is converted to one with ('x . 'x), so it's not useful for getting the name of a
procedure. So, we simplify the syntax property 'inferred-name to handle
these cases.
When a place message is deserialized by simply adopting the page
containing the message, the adoption can trigger a garbage
collection, but there's still a pointer to a chain of objects
"in flight" in the thread, and a GC can discard the pairs that
form the chain.
Removing all original module context doesn't work, because it
doesn't distinguish between fragments of syntax that had the
"inside-edge" scope without the "outside-edge" scope.
Record the presence of the outside-edge scope by using the
root scope, and convert the root scope to the current namespace's
outside-edge scope on evaluation.
The bug could cause
#lang racket/base
(define x 'outer)
(define-syntax-rule (def-and-use-m given-x)
  (begin
    (define-syntax-rule (m)
      (let ()
        (define given-x 'inner)
        x))
    (m)))
(def-and-use-m x)
to produce 'inner when it should produce 'outer.
Thanks to Brian Mastenbrook for pointing out the problem and
providing examples.
Interrupting bytecode unmarshal for syntax objects could leave
half-constructed values in a table that is intended to resolve graph
structure. Clear out work towards a graph construction when
interrupted.
The most common symptom of half-constructed syntax objects was a crash
after a Ctl-C during startup.
Rename fields in a page record and split some of them with `union` to
better document the intent of each field.
This change is intended to have no effect on the GC's behavior. One
tricky case is the line dropped around line 3542 of "newgc.c". That
line reset `scan_boundary` (formerly `previous_size`), which on the
surface is inconsistent with leaving objects before the boundary
without `marked` bits set. However, that line is reachable only
when generation-1 objects are being marked (objects newly moved
there would not be unmarked), in which case `scan_boundary` should
already be reset.
Using `--enable-racket=auto` causes a Racket for the current platform
to be built in a "local" subdirectory of the build directory as
support for cross-compilation.
The original idea was to count phantom bytes as "administrative
overhead", but issues discussed in #962 identified problems
with that idea. Finish shifting the accounting to treat
phantom bytes as payload allocation.
The misplacement of `SCHEME_PRIM_SOMETIMES_INLINED` caused the
optimizer to produce different results when the JIT is statically
disabled, for example.
This bug is an old one, in a sense, because traversing fields
in a closure could have moved the prefix with earlier versions
of the collector. It shows up now because we're changing fields
one indirection closer.
Compact fewer blocks by moving them only when other
blocks have room.
Also, fix block protection tracking in the case of a page
count that isn't divisible by 8.
In the common case of a minor GC without a generation 1/2
or a major GC without compaction, a single pass suffices
to both mark and update references.
This change reduces overall GC time by 10%-25% on typical
programs.
The GC supported allocation for an array of objects where
the first one provides a tag, but at this point it was
used only in some corners. Change those corners and simplify
the GC by removing support for arrays of tagged objects.
The main corner to clean up is in the handling of a macro-expansion
observer and inferred names. Move those into the compile-time
environment. It's possible that name inference has been
broken by the changes, but in addition to passing the tests,
the generated bytecode for the base collections is exactly the
same as before the change.
Although a block cache is set up to group most page-protection changes
into a single OS call, allocating new old-generation pages was not
covered. Adjust the block cache to group those.
This change has a small effect on performance, but it seems
better to have a few system calls in place of thousands.
First bug:
When the optimizer converts
(let-values ([(X ...) (values M ...)])
  ....)
to
(let ([X M] ...)
  ....)
it incorrectly attached a virtual timestamp to each "[X M]" binding
that corresponds to the timestamp after the whole `(values M ...)`.
The solution is to approximate tracking the timestamp for individual
expressions.
Second bug:
The compiler could reorder a continuation-capturing expression past
an allocation.
The solution is to track allocations with a new virtual clock.
Make `eval-syntax`, `compile-syntax`, and `expand-syntax` more
consistent (with intent and each other) by not installing a fallback
automatically. In particular, a fallback is not installed for a
`module` form, so that different ways of expanding a `module` form
produce consistent results (e.g., for ambiguous bindings).
p ..k matches exactly k repetitions of p, but its documented
behavior is to match "k or more" repetitions. This fix implements
the documented behavior.
Fixes bug number 15122.
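For example:
#lang racket/base
(require racket/match)
;; `..k` matches "k or more" repetitions, per the documentation:
(match '(1 2 3)
  [(list x ..2) x]) ; => '(1 2 3), since three elements is "2 or more"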
Also, strengthen the checking that `#:permissive?` (off by default)
performs for `untar` and `untgz` to disallow a link whose target is an
absolute path or has an up-directory element.
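For example (with a hypothetical archive path):
#lang racket/base
(require file/untar)
;; With the default #:permissive? #f, a link whose target is an
;; absolute path or has an up-directory element is rejected.
(untar "archive.tar" #:permissive? #f)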
Retry communication up to five times when `exn:fail:network`
is raised.
Not all `exn:fail:network` exceptions are transient. For example,
attempting to connect to a bad server name will produce an error more
slowly than before, since the bad connection will be tried five times.
Still, retrying on `exn:fail:network` seems like a good heuristic.
This fixes the DrRacket check syntax arrows in uses of `attribute` like this:
```racket
#lang racket/base
(require syntax/parse)
(syntax-parse #'a
  [a
   (attribute a)])
```
The symbol is used as the "who" field in the error message.
Also fix lazy-require of runtime-report.rkt in residual.rkt; don't
load until syntax-parse actually needs to produce an error report.
(Previously was loaded to create handler whenever syntax-parse code ran.)
When `raco pkg {install,update}` has a list of packages to check,
first run through the list in a "prefetch" mode to create futures to
fetch the information. Issuing requests to servers in parallel
can greatly speed up `raco pkg update --all`.
Package content is still downloaded sequentially.
Using the GitHub API for GitHub sources can run afoul of API
limits. Since we now support the Git protocol generally, use
that for GitHub sources, too.
Set the `PLT_USE_GITHUB_API` environment variable to use the
GitHub API, instead.
Putting <PlatformToolset> in the new places makes the projects
work when more than one version of Visual Studio is installed.
Maybe the old place was always the wrong place, or maybe
VS 2010 wanted it in the old place. Either way, sprinkling
the version in more places seems unlikely to hurt.
The SQLITE_READONLY_ROLLBACK error is supposed to mean that a crash
occurred and a hot journal exists that needs to be replayed (but
can't, because the current connection is read-only).
In practice, it seems that the error can happen even if there has been
no crash. In that case, retrying in the same way as other transient
errors allows the process to continue.
A possible reason for the spurious error: In the implementation of
SQLite, comments in hasHotJournal() mention the possibility of false
positives (ticket #3883) and how the false positive will be handled in
the playback mechanism after obtaining an exclusive lock. The check
for a read-only connection after hasHotJournal() is called, however,
happens before that lock is acquired. So, it seems like the race
condition could trigger a false SQLITE_READONLY_ROLLBACK error.
User-scope package installation matching the version of
Racket being built could affect the collections visible
during `raco setup` for `make base`. In particular, the
presence of `setup/scribble` could cause all built docs
to be discarded.
Also, add the `--no-user-path` flag to `racket` (which
has long been documented as an alias for `-U`).
The table as a tree is traversed to prune empty branches,
but the traversal is needed only toward branches that
have changed. Skipping the traversal can save several
milliseconds on each collection.
Name handling formerly interned symbols along the
way to allocating a plain string, which takes effort
and causes changes to the symbol table, which forces
a minor GC to traverse the whole symbol table. Skip
unnecessary symbol-interning steps.
Refine the changes in 16c198805b so that `(define id ... id ... )` at
the top level compiles more consistently when `id` is an identifier
whose lexical context does not include `#%top`.
When `compile` is used on a top-level definition, do not
create a binding in the current namespace, but arrange for
a suitable binding to be in place for the target namespace.
Closes #1036
This repair adjusts the bug fix of commit 769ad3e98. That older commit
ensured that `sync/enable-break` doesn't both break and accept a
channel message or semaphore wait. But it effectively disables those
actions if the break is continued.
Instead of (partially!) ending the `sync`, get out of semaphore
and channel queues so that no event can be selected during
the break, and then get back in line if the break is continued.
When a path is made relative for marshaling to bytecode, record
a list of byte strings instead of a platform-specific relative
path.
For syntax-object source locations, convert any non-relative path to a
string that shows just the last couple of path elements preceded by
".../". This conversion avoids embedding absolute paths in bytecode,
but at the cost of some information. A more complete and consistent
solution would involve using a module-path index instead of a path, but
that would be a big change at several layers.
Make room in the bytecode format for source locations and 'paren-shape
property values for syntax objects. Saving source locations increases
bytecode size by about 10% on average.
Also, convert the internal representation of syntax properties to
use immutable hash tables, instead of lists.
The `prop:expansion-contexts` property can control the expansion
of a rename transformer in much the same way that conditionals on
`(syntax-local-context)` can control the expansion of other
transformers.
These avoid one layer of currying and are more efficient, getting
about a 1.3x speed up on this program:
#lang racket/base
(module server racket/base
  (require racket/contract/base)
  (provide
   (contract-out
    [f (-> integer? boolean? char? void?)]))
  (define (f i b c) (void)))
(require (submod "." server))
(time
 (for ([x (in-range 10000000)])
   (f 1 #t #\x)))
All places use the same accounting bit for objects
that are in the shared space. Each place also flips
the bit value it wants on each accounting, so if two
places are accounting at the same time with opposite
bit values and can reach the same objects, they can
interfere. It's even possible for them to race
through cycles and cause each other to loop forever.
Add a lock to ensure that there's only one bit value
in play for the shared space at any given time. A
place must stall if other places are busy with memory
accounting and an opposite bit value.
While a place message is received by a thread but not yet
deserialized, if the message contains references to objects in the
shared space, and if a "master" GC happens (which crosses all places),
make sure that the references in the still-serialized message are
traversed.
This is the most common contract created for -> (at 318 occurrences
out of the 6515 arrow contracts created when DrRacket starts up)
but this specialization doesn't seem to actually improve the performance
much. Leave it in for now, in case the story changes at some point
in the future
Adjust installation tools to support cross-installation (i.e.,
installation for a platform other than the current one) as triggered
by "system.rktd" in "lib" having different information than the
running Racket executable.
Also, change floating-point handling to be like the MSVC build by
default, where the process is left in double-precision mode and
the mode is changed for exfl operations.
Includes repairs for integer-size mismatches in uses of Windows
threads.
The error message for the guard used an incorrect contract.
Also removed an unused line that allows a box value in the
property. I don't think it was possible to trigger this line
anyway because of the dynamic check.
Support PNG-encoded icons in ".ico" files and executables.
For executables, instead of supporting only new icons that match the
sizes and encodings of existing icons in an executable, support
arbitrary replacement icons in an executable.
The improved functionality relies on a new library (currently
private) for general updates to a Windows executable's
resources.
In a case like
(let-values ([(X ...) (with-continuation-mark M_k M_v
                        (values M ...))])
  ....)
where the bytecode compiler cannot convert to a sequence of `let`
bindings, make the JIT implement `values` as delivering argument
results directly to the corresponding variable locations.
When `call/cc` and `call/ec` were moved out of `#%kernel`, the
`let/cc` and `let/ec` macros changed to refer to
`call-with-current-continuation` and `call-with-escape-continuation`.
Move `call/cc` and `call/ec` again and restore the macros, so that
matching on the expansion of the macros (e.g., in the web server's
language that looks for `call/cc`) works as before.
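For example:
#lang racket
;; `let/cc` again expands to a use of `call/cc`:
(let/cc k (+ 1 (k 42))) ; => 42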
A `module-suffixes` entry in a collection's "info.rkt" adds a
file suffix that is meant to be recognized globally (i.e., in
all collections) by all Racket tools.
The new fields are reported by `compiler/module-suffix` library, which
is (so far) used by `raco setup`.
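For example, a collection's "info.rkt" might register a (made-up) suffix:
#lang info
;; A global module suffix, as a byte string without the dot:
(define module-suffixes '(#"rkt-variant"))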
Note that if package A includes files with a suffix that is registered
by package B, then A should have a dependency on B, but `raco setup`
cannot currently detect that such a dependency is needed. That
dependency is likely to happen, anyway, since package A is likely
using libraries from package B.
Comments in the implementation suggest that a timestamp
specialization was intended for directories, which makes
sense given that directory-modification dates are
not preserved when unpacking an archive. The change
also affected all bytecode timestamps, however, which
can create inconsistencies across package creations.
The `from` string argument is converted to a regexp and cached. When `from` is
a mutable string, this can cause wrong results in subsequent calls
to string-replace. So, the string is first converted to an immutable string to
be used as the key for the cache.
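For example:
#lang racket/base
(require racket/string)
(define from (string-copy "a"))    ; a mutable string
(string-replace "banana" from "o") ; => "bonono"; caches a regexp for "a"
(string-set! from 0 #\n)
(string-replace "banana" from "o") ; => "baoaoa", not a stale result for "a"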
When `place` expands, the body of the `place` form is placed into a
`(module* place-body-<n> #f ....)` submodule.
The `place` form previously placed its body in a lifted function,
where the function's exported name was based on
`(current-inexact-milliseconds)`. The generated submodules have
deterministic names, so that compilation is deterministic, and
submodule names don't collide (unlike exported function names) when
multiple `place`-using modules are imported into some other module.
Also, using a submodule avoids the problem that the clock doesn't
change fast enough on Windows.
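For example, in a sketch like this, the `place` body now ends up in a
deterministically named `(module* place-body-<n> #f ....)` submodule:
#lang racket/base
(require racket/place)
(provide start)
(define (start)
  (place ch
    (place-channel-put ch 'hello)))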
Those aliases were moved out of `#%kernel` as part of the
deterministic-bytecode changes, but putting them in
`racket/private/pre-base` meant that they weren't included in
`mzscheme` or Pretty Big. The new location is with `let/cc`, which
makes more sense, and makes them picked up by `mzscheme` and Pretty
Big.
`net/url` provides functions for both converting strings
and paths to and from URLs.
`net/url` also includes functions for creating (pure and impure)
network ports. This functionality `require`s the HTTP client stack,
which is unnecessary when URLs simply need parsing for their
"bits".
New library: `net/url-strings` handles `url->string` and `string->url`
(and also the related `path->url` and `url->path` functions). This is
required by net/url for compatibility.
`net/url-exception.rkt` is factored out for use by both libraries.
- See also racket/net changes for T&D
url-string.rkt changes requested by mflatt
url-strings.rkt is now called url-string.rkt
identifiers from url-string.rkt are reprovided by url.rkt
using (all-from-out "url-string.rkt") instead of explicit
exports
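A usage sketch after the split:
#lang racket/base
;; Parsing-only clients can avoid pulling in the HTTP client stack:
(require net/url-string)
(url->string (string->url "http://racket-lang.org/"))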
Changes:
- Allow unit contracts to import and export the same signature.
- Add "invoke" contracts that will wrap the result of invoking a unit contract,
no wrapping occurs when a body contract is not specified
- Improve error messages
- Support for init-depend clauses in unit contracts.
- Fix documentation to reflect the above
- Overhaul of unit related tests
Handling init-depend clauses in unit contracts is a rather large and somewhat
non-backwards-compatible change to unit contracts. Unit contracts must now
specify at least as many initialization dependencies as the unit value being
contracted, but may include more. These new dependencies are now actually
specified in the unit wrapper so that they will be checked by compound-unit
expressions.
This commit also adds more information to the first-order-check
error messages. If a unit imports tagged signatures, previously the errror
message did not specify which tag was missing from the unit contract. Now
the tag is printed along with the signature name.
Documentation has been edited to reflect the changes to unit/c contracts
made by this commit.
Additionally this commit overhauls all tests for units and unit contracts.
Test cases now actually check that expected error messages are triggered when
checking contract, syntax, and runtime errors. Test forms now expand into uses
of rackunit's check-exn form so only test failures are reported and all tests in
a file are run on every run of the test file.
Progress toward making the bytecode compiler deterministic, so that a
fresh `make base` always produces exactly the same bytecode from the
same sources. Most changes involve avoiding hash-table order
dependencies and adjusting scope identity. The namespace used to load
a reader extension is also better defined. Plus many other little
changes.
The identity of a scope that is unmarshaled from a bytecode file now
incorporates the hash of the file, and the relative order of scopes is
preserved in a bytecode file. This combination allows compilation to
start with modules that were loaded and compiled in different orders
(including delayed loading of bytecode fragments within one file).
Formerly, a reader extension triggered by `#lang` or `#reader` was
loaded in whatever namespace happens to be current. That's
unpredictable and can pollute a module build at the level of bytecode.
To help make builds deterministic, reader extensions are now loaded in
a root namespace of the current namespace.
Deterministic compilation in general relies on deterministic macros.
The two most common ways for a macro to be non-deterministic are by
using `gensym` (use `generate-temporaries`, instead) and by using an
unsorted hash-table traversal (don't do that).
At this point, bytecode generation is unlikely to be completely
deterministic, since I uncovered non-determinism mostly by iterating
attempts over the base collections. For now, the intent is not to
provide guarantees outside of the compilation of the base collections
--- but "more deterministic" is likely to be useful in the short run,
and we can improve further in the long run.
Specialize a
(call-with-immediate-continuation-mark _key (lambda (_arg) _body) _def-val)
call to an internal
(with-immediate-continuation-mark [_arg (#%immediate _key _def_val)] _body)
form, which avoids a closure allocation and more.
This optimization is useful for contracts, which use
`call-with-immediate-continuation-mark` to avoid redundant
contract checks.
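For example, the kind of check the contract system performs:
#lang racket/base
;; Peek at the immediate continuation mark; the (lambda (_arg) _body)
;; receiver no longer requires a closure allocation.
(define (already-checking? key)
  (call-with-immediate-continuation-mark
   key
   (lambda (v) v) ; the mark's value, or the default
   #f))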
On OS X, it seems that access() can sometimes fail with EPERM
when checking for execute permission on a file without it.
I've previously seen this result when running as the superuser,
but that's apparently not the only possibility; a long path
may also be relevant.
Re-linking in a new namespace doesn't need the namespace of
compilation.
A "namespac.rktl" test exposed this problem, where the "transfer a
definition of a macro-introduced variable" test could fail if a GC
occurred between compilation in one namespace and evaluation in
another.
When a module is currently installed as bytecode, but without
corresponding source and without a "info.rkt" specification that
bytecode should be preserved without source, then `raco pkg` should
not count that module bytecode as a conflict (since `raco setup`
will remove it).
In case a collection "a" is composed from two places, and in
case the first place has a bytecode file for "x.rkt" while
only the second place has the source of "x.rkt" (probably it
was recently moved), then `raco setup` should delete the
sourceless bytecode so that any dependency on "x.rkt" will
reference the right version.
Nested splicing forms would lead to an "ambiguous binding" error
when the nested forms bind the same name, such as in
(splicing-let ([a 1])
  (splicing-let ([a 2])
    (define x a)))
The problem is that splicing is implemented by adding a scope to
everything in the form's body, but removing it back off the
identifiers of a definition (so the `x` above ends up with no new
scopes). Meanwhile, a splicing form expands to a set of definitions,
where the locally bound identifier keeps the extra scope (unlike
definitions from the body). A local identifier for a nested splicing
form would then keep the inner scope but lose the outer scope, while
a local identifier from the outer splicing form would keep the outer
scope but not have the inner one --- leading to ambiguity.
The solution in this commit is to annotate a local identifier for a
splicing form with a property that says "intended to be local", so the
nested definition will keep the scope for the outer splicing form as
well as the inner one. It's not clear that this is the right approach,
but it's the best idea I have for now.
Although `eval-syntax` is not supposed to add the current namespace's
"outer edge" scope, it must add the "inner edge" scope to be consistent
with adding the inner edge to every intermediate expansion (as in
other definition contexts).
In addition, `eval`, `eval-syntax`, `expand`, and `expand-syntax`
did not cooperate properly with `local-expand` on the inner edge.
Some failure paths were missing an update before calling failure
code, and the new failure paths need to unconditionally update the
runstack pointer (because the common stub doesn't know whether the
calling context needs an update).