The optimizer can no longer reduce
(with-continuation-mark _id _v-expr _expr)
to just
_expr
when _v-expr and _expr are simple enough, because _id
might be bound to a continuation mark key with an
impersonator that checks the result of _v-expr.
The loss of an optimization can have a significant affect on
errortrace of microbenchmarks, such as
(for ([i (in-range 10000000)])
i)
A arity-reduced procedure-valued `prop:procedure' was not handled
correctly, for example.
A good candidate for random testing? I had the right kind of test in
place, but only for an arity of 1. It turns out that testing any other
arity would have exposed the problem, so now there are tests with
arity 0. If I had randomly generated procedures instead of manually
constructing `f0' through `f1:+' in "procs.rktl", then maybe I would
have more naturally generalized the arity testing, too. Then again,
I did already have relevant inputs, and it was the testing of inputs
that was too specific.
Closes PR 12870
For all currently supported platforms, the result was already
portable, despite the documentation's hedging.
Also fixed up the documentation in other ways, such as the fact
that `seconds->date' returns a `date*'.
Add extra intitial-message lines, use "..." on a field name
to indicate that it could reasonably be hidden by default,
and refine some existing messages.
Includes the addition of 'overflow and 'start-overflow-work
events, whcih are effectively specializations of 'sync and
'start-work to expose overflow handling.
Also, fix a bug related to a potential GC during mark-stack
restore from a lightweight continuation.
By itself, this change won't help anything, because tail-call
allocation is triggered for something that can't be called directly.
This change sets up part of an improvement for future-local recovery
from stack overflow, though.
(I had trouble constructing a test that would trigger the new code.
Fortunately, the existing tests trigger it.)
When a "synchronized" operation is handled for a future thread
during a `touch' on the future, then still report the handling
as "synchronized" insteda of "blocked".
Also, fix FFI procedures to preserve names: change `ptr-ref' with
`_fpointer' on an `ffi-obj' value to return the `ffi-obj'
value, so that the name in the `ffi-obj' value can be used
by `_cprocedure'.
Closes PR 12645
The new predicates are `progress-evt?' `thread-cell-values?',
`prefab-key?', `semaphore-peek-evt?', and `channel-put-evt?'.
These were used internally, and now they appear in contract
error messages.
When supplying an accessor to redirect, either the corresponding field
must be accessible through the current inspector, or a mutator for
the same field must be redirected, too.
Stevie realized that we need this constraint; otherwise, impersonators
can implement mutator-like behavior even when the mutator is otherwise
secret.
Add `raise-argument-error', `raise-result-error', `raise-arguments-error',
and `raise-range-error'.
The old convention was designed for reporting on a single (sometimes very
long line). The new convention is
<name>: <short message>
<field>: <detail>
...
If <detail> is long or itself spans multiple lines, then it may
also use the form
<field>:
<detail>
where each line of <detail> is indented by 3 spaces.
Backtrace information is shown as a multi-line "context" field.
For example, if definitions have two unmarked `x's that originate
from different modules, make them correspond to different bindings.
This improvement will be used by `scribble/srcdoc', which will
rely on module context to connect `for-doc' requires to documentation
code that appears in the same module --- which is needed, for example,
if a macro expands to documentation code and the macro is used in
a different module.
These primitives atomically update a box to a new value, as long
as the current value is the same as a provided value. They also
are future-safe.
When futures are enabled, they use low-level hardware instructions
to perform the change atomically.
The expander no longer needs to generate certain phase-N
identifiers, since we now have a comparison operation that
uses different phases for each of two identifiers.
The preserved path is exposed by a new `module-path-index-submodule'
function, and `module-path-index-join' now accepts an optional
submodule path.
Also, fixed a problem with `collapse-module-path-index' when
a module path indx is built on a resolved module path that
is a submodule path.
In addition to the main repair, `collapse-module-path[-index]' is
correctly documented to allow '(quote <sym>) rel-to paths.
Finally, `collapse-module-path-index' changed to use a symbolic
resolved module path that appears as the base of a module path
index, rather than falling back to the given rel-to path. It's
possble that the old beavior was intentional, but it wasn't tested,
and it seems more likely to have been a bug.
Closes PR 12724
For example, `(module-declared? '(submod (planet dyoo/bf) reader) #t)'
shouldn't fail if there's no "main.rkt" to hold a `reader' submodule;
it should return #f.
Merge to 5.3, but updating cstartup.inc will require a manual merge.
AVL trees to tend to be shorter, which means a faster
search and insertion. The potential benefit of a red--black
tree's fewer rotations doesn't matter, I think, for a functional
variant, where you have to reconstruct a spine to the root,
anyway. The difference is small for typical tables, though it
can be as much as 50% for a large table with keys inserted in
order. And since the AVL code is also much simpler, why not?
The usual techniques: shortcut around generic scheme_apply() for
chaperone-triggered slow path, shortcut around scheme_apply() for
application of a native-code interposition function, and shortcut
`chaperone-of?' test by trying `eq?' first.
The immediate symptom was that `(provide (all-defined-out))'
didn't work in a `module+'-based submodule, but there were
also non-submodule ways to expose the problem.
Although th eoriginal idea was to distinguish "text" paths
from derived filesystem paths, practically everythign that accepts
a module path also accepts a path. Building the generalization into
`module-path?' makes it easier to support `submod' wrappers on paths,
and it seems to fix things rather than break them.
The bug happens with n-ary uses of arithmetic operations that
have constant arguments but couldn't be constant-folded ---
maybe due to a divide-by-zero.
It's possible for a deep recursion to be all in C instead of
JIT-generated code, in which case the caching code for
`current-continuation-mark' cannot kick in to make the operation
effectively constant time. Bail out (to keep things constant time) if
that happens.
The old implementation could cause deadlock by blocking on a semaphore
while waiting for the original place to run a callback, but a master
GC might be needed (and the blocked place wouldn't get the signal).
Beside fixing that problem, a potential memory leak is fixed in
calling an ffi funcition and having a Racket->C unmarshaling fail.
Also, the GC marking routine for a `place' value didn't reference the
place's underlying `place_obj' value.
The bug is triggered by unsafe flonum operations, a
conversion that tries to make the arguments more unboxable,
and a `lambda' form within an argument to the unsafe
operation.
Closes PR 12587
I now think the problem is likely to be realted to values
that do not fit into a signed 32-bit integer. Check for
the OS version and reject such integers.
The ActiveX part of MysterX is gone. The `ffi/com' re-imeplemtnation
provides only core COM support.
The "mysssink" DLL is still needed, and its source is still
in the tree, but it is downloaded in the same way as other
pre-built DLLs. The DLL no longer needs to be registered with
regsvr32.
The prohbition against `handle-evt' on `handle-evt' is as
document and as originally intended. I'm not sure why it
was allowed.
Existing programs that use `handle-evt' incorrectly
can break. I found and fixed one incorrect use and one
questionable use in the Racket tree (which is a small
minority of the uses of `handle-evt' in the tree).
Extend `define-cstruct' to support #:property specs, which causes
the constructor and C->Racket coercsions to wrap the pointer in
a structure instance with the specified properties. Of course,
the wrapper structure has a `prop:cpointer' property so that the
wrapper can be used transparently as a C pointer.
Add missing tests and documentation for the id`->list', `list->'id,
id`->list*', and `list*->'id bindings created by `define-cstruct'.
This addition triggered several other changes:
* -k for a Mac OS X embedding is now relative to the __PLTSCHEME
segment (which means that executables won't break if you strip
them, for example)
* the command-line no longer has a limited size for Mac OS X
launchers and embedding executables
* Mac OS X GUI and Windows launchers record the creation-time
collection path, unless they are created as "relative" launchers
In particular, allow a pair of a relative-to directory and a base
directory. Paths that syntactically extend the base directory are
recorded as relative to the relative-to directory (which must
syntactically extend the base directory).
The compilation manager now sets the parameter to a pair with
the base directory as the main collection directory, if the source
file's path extends that directory's path.
This generalization solves problems created by cross-module inlining,
where the source location of a procedure in bytecode can now be in a
different file than the enclosing module's file.
Also add a test that checks whether the build directory shows up
in any ".zo", ".dep", or documentation ".html" files.
Closes PR 12549
It looks like my bound for last time was too conservative,
in that I looked for the lowest number that didn't seem
to fail in 10.6. The range of failing values is apparently
not continuous.
I've tightened the bound to match the lowest
number that produces a useful result on my 10.7 machine,
assuming that it works for a continuous range there.
(The new bound is higher than the number previously used as
a lower bound.)
Merge to 5.2.1
Setting the environment variable causes the bytecode compiler to run
the bytecode validator (which is normally applied to input from a
bytecode file) immediately on all of the compiler's own results.
Certain `lambda'-lifting operations can cause information
about the flonumness of a variable to get lost, leading
to a mismatch between the closure's flags and flags on
a variable reference. (The bytecode validator could detect the
bug when loading the broken bytecode. The broken information,
meanwhile, was only used by the JIT.)
For numbers around -67768122973228093, localtime() doesn't return in
10.6.8, while it returns NULL for 10.7.2. Work around the bug by
setting a lower bound that seems to be high enough to avoid the
problem (and that's lower than the lowest value that succeeds, so no
results are lost, at least for now).
Merge to 5.2.1
If a function with a rest arg is called with argv not at the
start of the runstack, then space is allocated for the rest-arg list
on the runstack without clearing the allocated slot. The value in
the slot could be a pointer that wasn't traversed by the most recent
GC, so it could crash a GC during allocation of the rest-arg list.
Also, tweak setup code for a function of no arguments, and improve
comments in the code.
Merge to 5.2.1
The generated code was checking arity after potentially copying
arguments to the start of the runstack (i.e., if the arguments
were not already there). If too few arguments are provided, then
the copy might access past the end of the given array.
The redundant arity check removed in commit f7c506471b
had previously masked this problem. (Or the check wasn't redundant
in that sense, but it's better this way.)
Merge to 5.2.1
The problem is related to marks that should cancel eagerly when
a form passes through many layers of macro expansion, such as in
the sieve stress test for `syntax-rules'.
There's no particular reason that any one format will have all
the information that other formats need, but it conveniently works
for now that HTML info can subsume Latex info.
Certain unsafe operations were allowed to propagate across a
`lambda' boundary (where space safety is known not to be an issue),
which could lead to duplicate uses of a "once used" variable if
the relevant `lambda' is inlined.
Furthermore, `lambda' boundary crossing wasn't detected in the case
that the operation to propagate was propagated through an intermediate
variable without a `lambda' crossing.
Merge to 5.2.1
I'm fairly certain that the change in commit 25e9bd2a190acf861 isn't
right, but I'm having trouble generating tests to demonstrate the
original bug or this correction.
First use of the function was determining a single arity for
the enclosing module, and that arity could trigger warnings
in addition to failures to inline. For example, using `map'
on 3 arguments would trigger incorrect warnings for later
uses of `map' on 2 arguments.
Rename `read-intern-literal' to `datum-intern-literal'.
Interning is needed only in `read-syntax' or `datum->syntax' to
set up the invariants that the bytecode compiler needs for cross-module
optimization. When `read'ing numbers from a data file, meanwhile,
interning slows things down a lot and doesn't seem worthwhile.
Pass a pointer to the thread-local table on entry to JIT-generated
code, instead of having the JIT-generated code call a C function
to get the table. This doesn't seem to improve performance on my
machine, but it generates less code and is probably faster in
some cases.
When a future is blocked on JIT generation, a lightweight closure
is captured, and then the future moves on, the runtime thread would
correctly shift the on-demand JIT argument to the captured copy
of the runstack. However, it would also add 2 to that pointer
to use as the argv array, and the captured runstack is not allocated
to allow interior pointers, so a GC during on-demand JIT could
crash. The solution is to keep an offset alongside the argv pointer
during JITting.
The over-eager transformation could be space-unsafe, and it
could duplicate an unsafe operation whose result is used only
once in a function that eds up being inlined multiple times.
More generally, support a
(define _id (begin 'compiler-hint:cross-module-inline _proc-expr))
hint, which is how the compiler determines that `map', etc., are
candidates for inlining.
Inline only trivial functions, such as `(empty? x)' -> `(null? x)',
to avoid generating too much code.
Bytecode includes a new `inline-variant' form, which records a
version of a function that is suitable for cross-module inlining.
Mostly, the variant let the run-time system to retain a copy
of the bytecode while JITting (and dropping the bytecode of)
the main variant, but it may be different from the main variant
in other ways that make it better for inlining (such a less loop
unrolling).
An unreceived message can have a reference to a master-allocated
value, in which case that value must be marked. This marking
is implemented by embedding a linked link within the message
memory.
This improvement applies to both poll() and select() modes, and it
can reduce scheduling overhead when blocking on many I/O sources
at once.
This mode is not enabled for Windows, however, since Racket doesn't
exactly use select() on Windows.
On Mac OS X, poll() doesn't work right in versions earlier than 10.5.5,
select() is always faster, and large number of sockets will be
better handled via kqueue(). On Linux, poll() is defintely better.
Otherwise, we stick with select() to be conservative.
Applies in the case of simple ports without line counting, etc.
Also, `read-line' keeps track of whether all bytes are ASCII
(which is easy) to shortcut general UTF-8 decoding.
On x86_64, if the scratch-space address fits into 32
bits and the final place for shared code doesn't
fit into a 32-bit address, then the size of the generated
code could change, leading to a JIT buffer overflow.
Merge to 5.2
Fix memory accounting to detect when messages pile up in a
place channel and when shared values (such as the result of
`make-shared-bytes') pile up. Also fix problems where a GC
or free-page purge needs to be triggered.
The implementation causes a minor API change, which is that
a place channel sent multiple times as a message generates
values that are `equal?' but no longer eq?'.
Closes PR 12273
[Do not merge to 5.2]
Reordering `unsafe-vector-ref' past an `unsafe-vector-set!' was
particularly bad. Meanwhile, some non-mutating operations like
`unsafe-mcar' were treated too conservatively.
Merge to 5.2
The bug was that a procedure could be incorrectly marked as
a "leaf" procedure, which could in turn cause the compiler
to keep inlining a very small procedure that calls itself.
Closes PR 12270
Merge to 5.2
As variables are dropped for lifted functions, the bitmap
for flonum closure variables was not shifted down by the
number of dropped variables.
Closes PR 12259
The libjpeg, libeay, and ssleay libraries for Win64 linked to
msvcr90.dll, because of the way that they were compiled with
MSVC 2008, but msvcr90.dll is not included with Win7, and
redistributing it is problematic. The new variants of the libraries
link instead of msvcrt.dll --- which you're not supposed to do
according to MS, but that's the way libraries like Gtk are
built, and it seems to be the right approach. See also
http://kobyk.wordpress.com/2007/07/20/dynamically-linking-with-msvcrtdll-using-visual-c-2005/
I built libjpeg-8, while the other two are courtesey of
http://www.indyproject.org.
Closes PR 12246
Show process time of start of GC and otherwise adjust to make
the output more compact, and attach a prefab struct to the
logged message to report all available data in Racket form
(including real start and end times, which are not shown in
the output).
The `date*' structure type is an extension of `date' with
`nanosecond' and `time-zone-name' fields.
The `seconds->date' function now accepts a real and returns a
`date*'. The fractional part of its argument goes into the
`nanosecond' field.
Macros and other tools that need syntax privilege used
`(current-code-inspector)' at the module top-level to try to
capture the right code inspector at load time. It's more
consistent to instead use the enclosing module's declaration-time
inspector, and `var-ref->mod-decl-insp' provides that. The
new function works only on references to anonymous variables,
which limits access to the inspector.
The real function name is longer, of course.
The GC problem was related to generational GC and the way constant
values are associated to JIT-generated code. See `retaining_data'.
The stack-overflow problems affects the JIT, module expansion,
and module invocation.
Ports must be forced closed in the case of kill a place,
and the existing code takes care of that.
The Windows fix is especially needed for the new places port
handling, but it turns out that the console handlign was broken for
places anyway.
Finalization for a place channel used a recursive, non-atomic
function, which meant that a thread switch could happen during
place-channel finalization, leaving the new thread with the
master GC and generally confused. (The random-message test
found the bug right away on my machine.)
We already have a non-recursive, non-atomic function to traverse
place messages, so collapse all modes into that one implementation.
Along the way, problems with empty structs (found by random tester)
and checking of file descriptors (test added) also fixed.
Use `continuation-mark-set-first', instead.
Also, re-enable bytecode for Racket code that is built into
the binary, which had been left disabled accidentally.
The recent addition of a shared table of names for shared code
caused bad performance on some machines (such as Robby's)
due to the lock on the table. The lock dosn't seem to be necessary
for platforms where places are supported, though.
GRacket registers witht a global table to indicate that
no transform is needed. (This change was intended to address
a 64-bit problem on Lion. It didn't help, but this seems
better than ignoring an error.)
The main change is to use C99 flexible array declarations
in structs, instead of declaring single-element arrays.
There are still a few -Wtautological-compare warnings
in 3m due to marco expansion.
Lazy initialization of statics shared across places doesn't work.
Also, each static must be registered with the GC exactly once;
I'm not sure why regstering on every callback didn't cause more
problems.
The `current-memory-use' function's result now includes the memory
use of places created from the calling place, and custodian memory
limits apply to memory use by places (owned by the custodian).
This change is relevant to PR 12004 in that DrRacket will no longer
crash on the example if a memory limit is in effect, but plain
Racket starts with no such limit and will exhaust all memory.
For 64-bit builds, MSVC has become smart enough to inline functions
in a way that interferes with the implementation of continuations,
so that (planet "williams/simulation/examples/model-2b") crashes,
for example. Explicitly disabling inlining avoids the problem by
making the C stack layout match the implementation's expectation.
The module cache was added in 97ce26b1 (April 16, 2011),
but it was accidentally disabled in e9721058 (May 5, 2011).
This time, I figured out a way to test whether the cache is
working (other than to benchmark examples, which is how I
discovered that it wasn't working).
For example,
(define-for-syntax (f x) (g x))
(define-for-syntax (g y) y)
is now allowed. The unbound-variable check for phase 1
and up is delayed until after the module body is partially expanded.
The JIT and bytecode compiler disagreed on the definition of
"constant". Now there are two levels: "constant" means constant across
all instantiations, and "fixed" means constant for a given instantation.
The JIT uses this distinction to generate direct-primitive calls
or not. (Without the distinction, a direct jump to `reverse' could
be wrong, because `racket/base' might get instantiated with the
JIT disabled or not.)
Also, fixed a bug in the JIT's `vector-set!' code in the case that
the target vector is a top-/module-level reference that is ready,
fixed, or constant.
More specifically, for a string of length N and a match that
only looks at the first M characters, the complexity of
`regexp-match' is now O(M) instead of O(N). This allows
`regexp-split' to be O(N) for a string instead of O(N^2).
Also, fixed a bug in non-greedy matching that could affect
both long strings and input ports.
Commit 311d55b5cf fixed a shallow bug that masked a deeper
bug in the interaction of local bindings and module-level
bindings. This one fixes the deeper problem, which is that
the recursive resolution that ignores module bindings should
start from the beginning of the wraps, not the wrap after
a module renaming.
Closes PR 12116
Shared locking now allowed only on input port, and exclusive
locking is allowed only on output ports, which allows an implementation
via fcntl(...,F_SETLK,...).