It looks like my bound from last time was too conservative,
in that I looked for the lowest number that didn't seem
to fail in 10.6. The range of failing values is apparently
not continuous.
I've tightened the bound to match the lowest
number that produces a useful result on my 10.7 machine,
assuming that it works for a continuous range there.
(The new bound is higher than the number previously used as
a lower bound.)
Merge to 5.2.1
Setting the environment variable causes the bytecode compiler to run
the bytecode validator (which is normally applied to input from a
bytecode file) immediately on all of the compiler's own results.
Certain `lambda'-lifting operations can cause information
about the flonumness of a variable to get lost, leading
to a mismatch between the closure's flags and flags on
a variable reference. (The bytecode validator could detect the
bug when loading the broken bytecode. The broken information,
meanwhile, was only used by the JIT.)
For numbers around -67768122973228093, localtime() doesn't return in
10.6.8, while it returns NULL for 10.7.2. Work around the bug by
setting a lower bound that seems to be high enough to avoid the
problem (and that's lower than the lowest value that succeeds, so no
results are lost, at least for now).
Merge to 5.2.1
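
A minimal sketch of the kind of guard described above, not the actual
runtime code. The helper name `safe_localtime` and the cutoff
`SAFE_LOWER_BOUND` are hypothetical (the message does not state the real
bound), and a 64-bit `time_t` is assumed:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cutoff, for illustration only; the real bound is whatever
   value was found experimentally to avoid the problem. */
#define SAFE_LOWER_BOUND (-67768100000000000LL)

/* Returns 1 and fills *out on success; returns 0 if the input is below
   the cutoff or if localtime() itself reports failure. */
int safe_localtime(long long secs, struct tm *out)
{
  time_t t;
  struct tm *r;

  assert(out != NULL);

  if (secs < SAFE_LOWER_BOUND)
    return 0;             /* never hand suspect values to localtime() */

  t = (time_t)secs;       /* assumes time_t is at least 64 bits wide */
  r = localtime(&t);
  if (r == NULL)
    return 0;             /* the NULL-result failure mode */

  *out = *r;
  return 1;
}
```

Rejecting the value before the call matters because one of the two
failure modes never returns, so checking the result afterwards is not
enough.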
If a function with a rest arg is called with argv not at the
start of the runstack, then space is allocated for the rest-arg list
on the runstack without clearing the allocated slot. The value in
the slot could be a pointer that wasn't traversed by the most recent
GC, so it could crash a GC during allocation of the rest-arg list.
Also, tweak setup code for a function of no arguments, and improve
comments in the code.
Merge to 5.2.1
The generated code was checking arity after potentially copying
arguments to the start of the runstack (i.e., if the arguments
were not already there). If too few arguments are provided, then
the copy might access past the end of the given array.
The redundant arity check removed in commit f7c506471b
had previously masked this problem. (Or the check wasn't redundant
in that sense, but it's better this way.)
Merge to 5.2.1
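
The ordering issue can be sketched in plain C, assuming a fixed-arity
function whose arguments are moved to the start of a stack area. The
helper name `copy_args_checked` and its signature are hypothetical; the
real fix is in JIT-generated code:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: move `required` arguments to the start of `stack`,
   but only after confirming that the caller supplied exactly that many.
   Returns 1 on success, 0 on an arity mismatch. */
int copy_args_checked(void **stack, void **argv, int argc, int required)
{
  assert(stack != NULL && argv != NULL && required >= 0);

  /* Check arity first: copying `required` slots out of a shorter argv
     would read past the end of the given array. */
  if (argc != required)
    return 0;   /* arity error; nothing was read or copied */

  if (argv != stack)
    memmove(stack, argv, (size_t)required * sizeof(void *));
  return 1;
}
```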
The problem is related to marks that should cancel eagerly when
a form passes through many layers of macro expansion, such as in
the sieve stress test for `syntax-rules'.
There's no particular reason that any one format will have all
the information that other formats need, but it conveniently works
for now that HTML info can subsume LaTeX info.
Certain unsafe operations were allowed to propagate across a
`lambda' boundary (where space safety is known not to be an issue),
which could lead to duplicate uses of a "once used" variable if
the relevant `lambda' is inlined.
Furthermore, `lambda' boundary crossing wasn't detected in the case
that the operation to propagate was propagated through an intermediate
variable without a `lambda' crossing.
Merge to 5.2.1
I'm fairly certain that the change in commit 25e9bd2a190acf861 isn't
right, but I'm having trouble generating tests to demonstrate the
original bug or this correction.
First use of the function was determining a single arity for
the enclosing module, and that arity could trigger warnings
in addition to failures to inline. For example, using `map'
on 3 arguments would trigger incorrect warnings for later
uses of `map' on 2 arguments.
Rename `read-intern-literal' to `datum-intern-literal'.
Interning is needed only in `read-syntax' or `datum->syntax' to
set up the invariants that the bytecode compiler needs for cross-module
optimization. When `read'ing numbers from a data file, meanwhile,
interning slows things down a lot and doesn't seem worthwhile.
Pass a pointer to the thread-local table on entry to JIT-generated
code, instead of having the JIT-generated code call a C function
to get the table. This doesn't seem to improve performance on my
machine, but it generates less code and is probably faster in
some cases.
When a future is blocked on JIT generation, a lightweight closure
is captured, and then the future moves on, the runtime thread would
correctly shift the on-demand JIT argument to the captured copy
of the runstack. However, it would also add 2 to that pointer
to use as the argv array, and the captured runstack is not allocated
to allow interior pointers, so a GC during on-demand JIT could
crash. The solution is to keep an offset alongside the argv pointer
during JITting.
The over-eager transformation could be space-unsafe, and it
could duplicate an unsafe operation whose result is used only
once in a function that ends up being inlined multiple times.
More generally, support a
  (define _id (begin 'compiler-hint:cross-module-inline _proc-expr))
hint, which is how the compiler determines that `map', etc., are
candidates for inlining.
Inline only trivial functions, such as `(empty? x)' -> `(null? x)',
to avoid generating too much code.
Bytecode includes a new `inline-variant' form, which records a
version of a function that is suitable for cross-module inlining.
Mostly, the variant lets the run-time system retain a copy
of the bytecode while JITting (and dropping the bytecode of)
the main variant, but it may be different from the main variant
in other ways that make it better for inlining (such as less loop
unrolling).
An unreceived message can have a reference to a master-allocated
value, in which case that value must be marked. This marking
is implemented by embedding a linked list within the message
memory.
This improvement applies to both poll() and select() modes, and it
can reduce scheduling overhead when blocking on many I/O sources
at once.
This mode is not enabled for Windows, however, since Racket doesn't
exactly use select() on Windows.
On Mac OS X, poll() doesn't work right in versions earlier than 10.5.5,
select() is always faster, and a large number of sockets will be
better handled via kqueue(). On Linux, poll() is definitely better.
Otherwise, we stick with select() to be conservative.
Applies in the case of simple ports without line counting, etc.
Also, `read-line' keeps track of whether all bytes are ASCII
(which is easy) to shortcut general UTF-8 decoding.
On x86_64, if the scratch-space address fits into 32
bits and the final place for shared code doesn't
fit into a 32-bit address, then the size of the generated
code could change, leading to a JIT buffer overflow.
Merge to 5.2
Fix memory accounting to detect when messages pile up in a
place channel and when shared values (such as the result of
`make-shared-bytes') pile up. Also fix problems where a GC
or free-page purge needs to be triggered.
The implementation causes a minor API change, which is that
a place channel sent multiple times as a message generates
values that are `equal?' but no longer `eq?'.
Closes PR 12273
[Do not merge to 5.2]
Reordering `unsafe-vector-ref' past an `unsafe-vector-set!' was
particularly bad. Meanwhile, some non-mutating operations like
`unsafe-mcar' were treated too conservatively.
Merge to 5.2
The bug was that a procedure could be incorrectly marked as
a "leaf" procedure, which could in turn cause the compiler
to keep inlining a very small procedure that calls itself.
Closes PR 12270
Merge to 5.2
As variables are dropped for lifted functions, the bitmap
for flonum closure variables was not shifted down by the
number of dropped variables.
Closes PR 12259
The libjpeg, libeay, and ssleay libraries for Win64 linked to
msvcr90.dll, because of the way that they were compiled with
MSVC 2008, but msvcr90.dll is not included with Win7, and
redistributing it is problematic. The new variants of the libraries
link instead to msvcrt.dll --- which you're not supposed to do
according to MS, but that's the way libraries like Gtk are
built, and it seems to be the right approach. See also
http://kobyk.wordpress.com/2007/07/20/dynamically-linking-with-msvcrtdll-using-visual-c-2005/
I built libjpeg-8, while the other two are courtesy of
http://www.indyproject.org.
Closes PR 12246
Show process time of start of GC and otherwise adjust to make
the output more compact, and attach a prefab struct to the
logged message to report all available data in Racket form
(including real start and end times, which are not shown in
the output).
The `date*' structure type is an extension of `date' with
`nanosecond' and `time-zone-name' fields.
The `seconds->date' function now accepts a real and returns a
`date*'. The fractional part of its argument goes into the
`nanosecond' field.
Macros and other tools that need syntax privilege used
`(current-code-inspector)' at the module top-level to try to
capture the right code inspector at load time. It's more
consistent to instead use the enclosing module's declaration-time
inspector, and `var-ref->mod-decl-insp' provides that. The
new function works only on references to anonymous variables,
which limits access to the inspector.
The real function name is longer, of course.
The GC problem was related to generational GC and the way constant
values are associated with JIT-generated code. See `retaining_data'.
The stack-overflow problems affect the JIT, module expansion,
and module invocation.
Ports must be forced closed when a place is killed,
and the existing code takes care of that.
The Windows fix is especially needed for the new places port
handling, but it turns out that the console handling was broken for
places anyway.
Finalization for a place channel used a recursive, non-atomic
function, which meant that a thread switch could happen during
place-channel finalization, leaving the new thread with the
master GC and generally confused. (The random-message test
found the bug right away on my machine.)
We already have a non-recursive, non-atomic function to traverse
place messages, so collapse all modes into that one implementation.
Along the way, problems with empty structs (found by random tester)
and checking of file descriptors (test added) were also fixed.
Use `continuation-mark-set-first', instead.
Also, re-enable bytecode for Racket code that is built into
the binary, which had been left disabled accidentally.
The recent addition of a shared table of names for shared code
caused bad performance on some machines (such as Robby's)
due to the lock on the table. The lock doesn't seem to be necessary
for platforms where places are supported, though.
GRacket registers with a global table to indicate that
no transform is needed. (This change was intended to address
a 64-bit problem on Lion. It didn't help, but this seems
better than ignoring an error.)
The main change is to use C99 flexible array declarations
in structs, instead of declaring single-element arrays.
There are still a few -Wtautological-compare warnings
in 3m due to macro expansion.
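
For illustration, a sketch of the flexible-array style next to the older
single-element idiom. `Int_Vector` and `make_int_vector` are hypothetical
names, not declarations from the Racket source:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record, for illustration only. */
typedef struct {
  int count;
  long items[];   /* C99 flexible array member; older idiom: long items[1]; */
} Int_Vector;

/* Allocates a vector with room for `count` zeroed elements,
   or returns NULL if allocation fails. */
Int_Vector *make_int_vector(int count)
{
  Int_Vector *v;

  assert(count >= 0);
  /* sizeof(Int_Vector) does not count any elements of `items`,
     so space for the elements is added explicitly. */
  v = malloc(sizeof(Int_Vector) + (size_t)count * sizeof(long));
  if (v == NULL)
    return NULL;
  v->count = count;
  for (int i = 0; i < count; i++)
    v->items[i] = 0;
  return v;
}
```

With the single-element declaration, the same allocation has to subtract
one element from the size, and indexing past `items[0]` tends to draw
bounds warnings from newer compilers.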
Lazy initialization of statics shared across places doesn't work.
Also, each static must be registered with the GC exactly once;
I'm not sure why registering on every callback didn't cause more
problems.
The `current-memory-use' function's result now includes the memory
use of places created from the calling place, and custodian memory
limits apply to memory use by places (owned by the custodian).
This change is relevant to PR 12004 in that DrRacket will no longer
crash on the example if a memory limit is in effect, but plain
Racket starts with no such limit and will exhaust all memory.
For 64-bit builds, MSVC has become smart enough to inline functions
in a way that interferes with the implementation of continuations,
so that (planet "williams/simulation/examples/model-2b") crashes,
for example. Explicitly disabling inlining avoids the problem by
making the C stack layout match the implementation's expectation.
The module cache was added in 97ce26b1 (April 16, 2011),
but it was accidentally disabled in e9721058 (May 5, 2011).
This time, I figured out a way to test whether the cache is
working (other than to benchmark examples, which is how I
discovered that it wasn't working).
For example,
  (define-for-syntax (f x) (g x))
  (define-for-syntax (g y) y)
is now allowed. The unbound-variable check for phase 1
and up is delayed until after the module body is partially expanded.
The JIT and bytecode compiler disagreed on the definition of
"constant". Now there are two levels: "constant" means constant across
all instantiations, and "fixed" means constant for a given instantiation.
The JIT uses this distinction to generate direct-primitive calls
or not. (Without the distinction, a direct jump to `reverse' could
be wrong, because `racket/base' might get instantiated with the
JIT disabled or not.)
Also, fixed a bug in the JIT's `vector-set!' code in the case that
the target vector is a top-/module-level reference that is ready,
fixed, or constant.
More specifically, for a string of length N and a match that
only looks at the first M characters, the complexity of
`regexp-match' is now O(M) instead of O(N). This allows
`regexp-split' to be O(N) for a string instead of O(N^2).
Also, fixed a bug in non-greedy matching that could affect
both long strings and input ports.
Commit 311d55b5cf fixed a shallow bug that masked a deeper
bug in the interaction of local bindings and module-level
bindings. This one fixes the deeper problem, which is that
the recursive resolution that ignores module bindings should
start from the beginning of the wraps, not the wrap after
a module renaming.
Closes PR 12116
Shared locking is now allowed only on input ports, and exclusive
locking is allowed only on output ports, which allows an implementation
via fcntl(...,F_SETLK,...).
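
A sketch of why the restriction fits fcntl(): a read (shared) lock needs
a descriptor open for reading, and a write (exclusive) lock needs one
open for writing. The helper name `try_lock_fd` is hypothetical and this
is not the port implementation itself:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to lock the whole file without blocking.
   `exclusive` requests a write lock (fd must be open for writing);
   otherwise a read lock is requested (fd must be open for reading).
   Returns 1 on success, 0 if the lock could not be taken. */
int try_lock_fd(int fd, int exclusive)
{
  struct flock fl = {0};

  assert(fd >= 0);
  fl.l_type = exclusive ? F_WRLCK : F_RDLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;           /* 0 means "through the end of the file" */
  return fcntl(fd, F_SETLK, &fl) != -1;
}
```

Requesting the wrong lock type for the descriptor's access mode fails
with EBADF, which is the mismatch that the input/output split avoids.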
Although a future thread used an atomic compare-and-swap to
set the "is a list" or "not a list" flag on pairs via the
JIT-implemented `list?', the hashing function in the runtime
thread did not; as a result, it might be possible to lose
a hash code due to cache inconsistency (although I'm not
sure it's actually possible, and I couldn't trigger a problem
with a test). Most of the changes are related to using
an atomic compare-and-swap when setting a hash code, as
well as clean-ups to related code. Processor-count tests
avoid using atomic compare-and-swap on uniprocessors, which
might not support the relevant machine instructions.
As significantly, the compare-and-swap operation for the
JIT-implemented `list?' did not actually set flags on
a pair that has a hash code. This could lead to `list?'
tests that were not constant time (but only if the relevant
pair's `eq?' hash code had been used previously).
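
The two updates can be sketched with C11 atomics, assuming a
hypothetical layout in which one word per pair holds two flag bits below
a hash code. The names `Pair_Header`, `set_list_flag`, and
`set_hash_code` are illustrative, and the runtime's real primitives and
layout may differ:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical layout: low two bits are list flags, the rest is a
   hash code (0 means "no hash code yet"). */
#define FLAG_MASK 0x3u
#define IS_LIST   0x1u
#define NOT_LIST  0x2u

typedef struct {
  _Atomic unsigned int bits;
} Pair_Header;

/* Set the list flags without disturbing a hash code that is already
   present or that another thread installs concurrently. */
void set_list_flag(Pair_Header *p, unsigned int flag)
{
  unsigned int old_val, new_val;

  assert((flag & ~FLAG_MASK) == 0);
  old_val = atomic_load(&p->bits);
  do {
    new_val = (old_val & ~FLAG_MASK) | flag;
    /* on failure, old_val is refreshed with the current contents */
  } while (!atomic_compare_exchange_weak(&p->bits, &old_val, new_val));
}

/* Install a hash code only if none is present; keep existing flags.
   Returns the hash code that ends up stored. */
unsigned int set_hash_code(Pair_Header *p, unsigned int code)
{
  unsigned int old_val, new_val;

  assert(code != 0 && code < (1u << 30));
  old_val = atomic_load(&p->bits);
  do {
    if (old_val >> 2)
      return old_val >> 2;   /* a hash code was already installed */
    new_val = (code << 2) | (old_val & FLAG_MASK);
  } while (!atomic_compare_exchange_weak(&p->bits, &old_val, new_val));
  return code;
}
```

Because both writers retry against the current word, neither a flag
update nor a hash-code update can overwrite the other's bits.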
The specific error reported by CopyFileW doesn't seem
to be documented. It's unclear whether Racket's old test
for ERROR_ALREADY_EXISTS was the wrong choice (as opposed
to ERROR_FILE_EXISTS) or whether some Windows versions
use it; we test for both for now.
Also, improve error reporting when an errno or
GetLastError() value is available.
Closes PR 12074
Merge to 5.1.2
This reverts commit 2afff3d210.
This commit caused real->double-flonum to have a different behavior
when jitted as opposed to interpreted, and caused real->single-flonum
to break in some cases.
Merge to 5.1.2.
A recent (weeks-old) JIT change set one of a function's code
pointers to NULL to indicate that JIT-compilation of the
function is in progress, but that breaks futures. Set the
code pointer to a different not-yet-ready function, instead.
Merge to 5.1.2
Closes PR 12037
There were two:
* new: after finding a hash code, the key wasn't
always checked to be `eq?' to the desired key
* old: the hash code wasn't downshifted by 2, so
changes in the low two bits (like when a pair
is determined to start a list) could break
lookup
Merge to 5.1.2