drop the tail call fanciness
"simple enough", for now, means that it is a struct selector, predicate,
constructor, or mutator. Perhaps we will find some other way to learn
about more such simple procedures where this is safe.
This commit speeds up this program:
#lang racket/base
(require racket/contract/base)
(struct s (x))
(define f (contract (-> any/c integer?) s-x 'pos 'neg))
(define an-s (s 1))
(time
 (for ([x (in-range 10000000)])
   (f an-s)))
by about 1.9x
Skip calling the domain projection in that case and, if all of the
arguments are any/c, also skip setting the contract continuation mark.
This appears to give about a 20% speed up on this program:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c integer?)
   (λ (x) 1)
   'pos 'neg))
(time
(for ([x (in-range 4000000)])
(f 1)))
The recently added fast path for property-only chaperones did not
propagate the original object in the case that the property-only
chaperone wraps a `chaperone-procedure*` chaperone.
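A minimal sketch (not from the commit; the property and procedure names
are made up) of the combination involved, where the inner wrapper should
see the outermost, property-only chaperone as its "original" argument:
#lang racket/base
(define-values (prop:mark mark? mark-ref)
  (make-impersonator-property 'mark))
(define (f x) x)
;; the wrapper of a `chaperone-procedure*` chaperone receives the
;; outermost chaperone as an extra first argument
(define inner
  (chaperone-procedure* f
                        (λ (outermost x)
                          x)))
;; a property-only chaperone: no interposition procedure, just a property
(define outer
  (chaperone-procedure inner #f prop:mark 'yes))
(outer 5) ; after the fix, `outermost` is `outer`, not `inner`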
Merge to v6.4
Fix `procedure-specialize` for a procedure that refers to a
syntax-object literal. A syntax-object literal becomes part of the
procedure closure, but in a special way that normally allows syntax
objects to be loaded on demand. For now, specialization counts as
a demand of the syntax object.
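A rough sketch of the shape that triggers the issue (assuming
`procedure-specialize` is reachable from `racket/base`): a closure whose
body contains a syntax-object literal.
#lang racket/base
(define (make-stx-thunk)
  (procedure-specialize
   (λ () #'(a syntax-object literal)))) ; syntax literal in the closure body
((make-stx-thunk))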
Merge to v6.4
After the JIT buffer becomes too full, some paths
don't bail out fast enough, so guard against
broken info in some relatively new uses of the info.
Merge to v6.4
When a procedure created by `unsafe-{chaperone,impersonate}-procedure`
is given the wrong number of arguments, the original procedure's name
should be used in the error message.
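A small sketch of the behavior being fixed, assuming
`unsafe-chaperone-procedure` is provided by `racket/unsafe/ops`:
#lang racket/base
(require racket/unsafe/ops)
(define (f x) x)
(define wrapped
  (unsafe-chaperone-procedure f (λ (x) (f x))))
;; applying `wrapped` to two arguments raises an arity error that
;; should name `f`, the original procedure, not the wrapper
(wrapped 1 2)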
This commit, combined with the use of unsafe-chaperone-procedure,
achieves almost the same speedups as c24ddb4a7, but now correctly.
More concretely, this program:
#lang racket/base
(module server racket/base
  (require racket/contract/base)
  (provide
   (contract-out
    [f (-> integer? integer?)]))
  (define (f x) x))
(require 'server)
(time
 (let ([f f]) ;; <-- defeats the plus-one-arity optimization
   (for ([x (in-range 1000000)])
     (f 1) (f 2) (f 3) (f 4) (f 5))))
runs only about 40% slower than the version without the "(let ([f f])"
and this program
#lang racket/base
(module m racket/base
  (provide f)
  (define (f x) x))
(module n typed/racket/base
  (require/typed
   (submod ".." m)
   [f (-> Integer Integer)])
  (time
   (for ([x (in-range 1000000)])
     (f 1) (f 2) (f 3) (f 4))))
(require 'n)
runs about 2.8x faster than it did before that same set of changes.
for contracts where the arity of the given function is exactly
the arity that the contract expects (i.e., no optional arguments
are turned into mandatory arguments or dropped)
During inlining, the type information gathered in
code that was inside the lambda is copied to the outer
context. But the coordinates of the type information
were shifted in the wrong direction, so the type was
assigned to the wrong variable.
This bug is difficult to trigger, so the test is convoluted.
Merge to v6.4
Add 'module-body-inside-context, 'module-body-outside-context, and
'module-body-context-simple? properties to the expansion of a
`module` form. These properties expose scopes that are used by
`module->namespace` and that appear in marshaled bytecode.
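A sketch of how the properties can be inspected (the expanded module
body here is arbitrary):
#lang racket/base
(define-namespace-anchor anchor)
(parameterize ([current-namespace (namespace-anchor->namespace anchor)])
  (define expanded (expand '(module m racket/base 42)))
  (list (syntax-property expanded 'module-body-context-simple?)
        (syntax-property expanded 'module-body-inside-context)
        (syntax-property expanded 'module-body-outside-context)))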
The expression in a `define-runtime-path` form is used in
both a run-time context and a compile-time context. The
latter is used for `raco exe`. In a cross-build context,
you might need to load OpenSSL support for Linux (say)
at build time while generating executables that refer to
Windows (say) OpenSSL support. In that case, `#:runtime?-id`
lets you choose between `(cross-system-type)` and
`(system-type)`.
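A rough sketch, assuming `cross-system-type` comes from
`setup/cross-system` and using made-up directory names:
#lang racket/base
(require racket/runtime-path
         setup/cross-system)
;; `runtime?` is #t when the expression is evaluated at run time and
;; #f when it is evaluated on behalf of `raco exe`
(define-runtime-path openssl-dir
  #:runtime?-id runtime?
  (if (eq? 'windows (if runtime? (system-type) (cross-system-type)))
      "openssl-win32"
      "openssl-unix"))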
Merge to v6.4
In particular, instead of going directly back to the chaperone, handle
the case where the function doesn't accept keyword arguments with a
less expensive fallback.
The less expensive fallback uses a case-lambda wrapper (wrapped inside
a make-keyword-procedure) to close over the neg-party and avoid the
chaperone creation. With this commit, the program below gets about 3x
faster, and is only about 20% slower than the version that replaces
the "(let ([f f]) ...)" with its body
#lang racket/base
(module m racket/base
  (require racket/contract/base)
  (provide (contract-out [f (-> integer? integer?)]))
  (define (f x) x))
(require 'm)
(collect-garbage)
(time (for ([x (in-range 5000000)]) (let ([f f]) (f 1))))
Thanks, @samth!
OS X's libssl is deprecated, and it doesn't work with SSL connections
that need SNI. We'll distribute our own libssl builds for OS X via a
package, but we need a native implementation that works well enough to
get that package.
The 'secure protocol symbol is just a shorthand for
`(ssl-secure-client-context)`, but it helps highlight
that the default 'auto isn't secure, and having a plain
symbol smooths the connection to native Win32 and OS X
implementations of SSL.
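For example (a sketch; the host name is arbitrary), these two calls
should set up equivalent client contexts:
#lang racket/base
(require openssl)
;; 'secure is shorthand for (ssl-secure-client-context): certificate and
;; hostname verification are enabled, unlike with the default 'auto
(define-values (in out)
  (ssl-connect "racket-lang.org" 443 'secure))
(define-values (in2 out2)
  (ssl-connect "racket-lang.org" 443 (ssl-secure-client-context)))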
The getdtablesize() result appears not to be constant on OS X.
Creating some combination of CFStream objects, threads, and CFRunLoop
objects can cause the value to be increased by a factor of 10. Avoid
the need for getdtablesize() by switching from select() to poll().
this yields only a modest speed-up, and only when iterating
over long sequences, e.g., as in this program:
#lang racket/base
(require racket/sequence racket/contract/base)
(define s (make-string 10000 #\a))
(define str (contract (any/c . -> . (sequence/c char?))
                      (λ (x) (in-string s))
                      'pos 'neg))
(time
 (for ([x (in-range 100)])
   (for ([x (str 0)])
     (void))))
The repair involves making `raco exe` detect a sub-submodule
whose name is `declare-preserve-for-embedding` as an indication
that a submodule should be carried along with its enclosing module.
Normally, `define-runtime-module-path-index` would do that, but
the submodule for `place` is created with `syntax-local-lift-module`,
and the point of `syntax-local-lift-module` is to work in a
nested expression context where definitions cannot be lifted
to the enclosing module.
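A sketch of the shape that `raco exe` now recognizes (the module names
are made up): the enclosing submodule is carried into the executable
because it contains a sub-submodule named
`declare-preserve-for-embedding`.
#lang racket/base
(module carried racket/base
  ;; the presence of this sub-submodule marks `carried` for embedding
  (module declare-preserve-for-embedding racket/base))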
Made the hash-set chaperones essentially forward the hash chaperone
operations, but now explain them all in terms of set-based operations
in the docs.
Also adjusted value-blame and has-blame? to support late-neg projections
The issue is what happens when the actual function has other arities.
For example, if the function were (λ (x [y 1]) y) then it is not okay
to simply check if procedure-arity-includes? of 1 is true (what the
code used to do) because then when the function is applied to 2
arguments, the call won't fail like it should. It is possible to check
and reject functions that don't have exactly the right arity, but if
the contract were (-> string? any), then the function would have been
allowed and only when the extra argument is supplied would the error
occur. So, this commit makes it so that (-> any/c any) is like
(-> string? any), but with the optimization that if the procedure
accepts only one argument, then no wrapper is created.
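For example, a sketch of the problem case described above:
#lang racket/base
(define g (λ (x [y 1]) y))
(procedure-arity-includes? g 1) ; #t, and yet (g 1 2) also succeeds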
This is a backwards incompatible change because it used to be the
case that (flat-contract? (-> any)) returned #t and it now returns #f.
When a module defines <name-1> and doesn't export it, but imports and
re-exports a <name-2> that refers to another module's definition of
<name-1>, then <name-1> wasn't properly registered as an unexported
binding.
Most of the implementation change is just a clean-up of an
unnecessary traversal from before the addition of a `count`
field in each hash table.
Also, add `#:skip-filtered-directory?` to `find-files`.
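For example (a sketch; the filter is arbitrary), the new flag makes
`find-files` skip the entire content of any directory that the predicate
rejects:
#lang racket/base
(require racket/file)
(find-files (λ (p) (not (regexp-match? #rx"compiled" (path->string p))))
            #:skip-filtered-directory? #t)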
Less significantly, adjust `pathlist-closure` to be consistent in the
way that it includes a separator at the end of a directory path.
When a tree marshaled to bytecode form has many shared pieces,
the unmarshaling process can lose track of the sharing in one
place and traverse too much of the structure as pieces are
loaded.
I was using match-let and got a syntax error that pointed to this file. After changing the match-let definition to use syntax/loc the error pointed to the exact spot causing the problem. Yay!
I changed quite a few vanilla syntax-quotes to the *syntax/loc form... perhaps some did not need to be? I'm not sure.
added back nested syntax/loc
This program runs about 10x faster than it did before this commit, but
seems to still be about 100x slower than the version where you change
an-s to just be (s).
#lang racket/base
(require racket/contract/base racket/generic)
(define-generics id [m id x])
(struct s () #:methods gen:id [(define (m g x) x)])
(define an-s
  (contract (id/c [m (-> any/c integer? integer?)])
            (s)
            'pos 'neg))
(time
 (for ([x (in-range 100000)])
   (m an-s 2)))
rest of the contract system, creating and using a slightly
more legitimate blame record and calling into the late-neg
projections instead of using `contract`
turns out to be a predicate.
In that case, just call it instead of creating all of the extra junk
that would normally be created by coercing the predicate to a contract
and invoking it.
Use a job object to ensure that subprocesses that are meant
to be killed by the current custodian are reliably terminated
if Racket exits for any reason.
Although `procedure-specialize` should be useful in places where
inlining does not apply, allow inlining and related optimizations
through it, anyway.
- use chaperone-hash-set for set/c when the contract allows only hash-sets
- add a #:lazy flag to allow explicit choice of when to use laziness
(but have a backwards-compatible default that, roughly, eschews laziness
only when the resulting contract would be flat)
Specifically, remove reliance on procedure-closure-contents-eq? to
tell when a pending check is stronger in favor of using
contract-stronger?
Also, tighten up the specification of contract-stronger? to require
that any contract is stronger than itself
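A small sketch of the reflexivity requirement (the particular contracts
are arbitrary):
#lang racket/base
(require racket/contract)
(contract-stronger? (-> any/c integer?) (-> any/c integer?)) ; expected #t
(contract-stronger? (between/c 0 5) (between/c 0 10))        ; #t: accepts fewer values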
With this commit, this program gets about 10% slower:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c integer?)
   (λ (x) (if (zero? x)
              0
              (f (- x 1))))
   'pos 'neg))
(time (f 2000000))
because the checking is doing its work more explicitly now, but because
the checking is more general, it identifies the redundant checking in
this program:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c integer?)
   (contract
    (-> any/c integer?)
    (λ (x) (if (zero? x)
               0
               (f (- x 1))))
    'pos 'neg)
   'pos 'neg))
(time (f 200000))
which makes it run about 13x faster than it did before
I'm not sure if this is a win overall, since the checking can be more
significant in the case of "near misses". For example, this
program, where neither the new nor the old checking detects the
redundancy, is about 40% slower after this commit than it was before:
#lang racket/base
(require racket/contract/base)
(define f
  (contract
   (-> any/c (<=/c 0))
   (contract
    (-> any/c (>=/c 0))
    (λ (x) (if (zero? x)
               0
               (f (- x 1))))
    'pos 'neg)
   'pos 'neg))
(time (f 50000))
(The redundancy isn't detected here because the contract system only
looks at the first pending contract check.)
Overall, despite the fact that it slows down some programs and speeds
up others, my main thought is that it is worth doing because it
eliminates a (painful) reliance on procedure-closure-contents-eq? that
inhibits other approaches to optimizing these contracts we might try.
After adding `procedure-specialize`, making
`procedure-closure-contents-eq?` work as before involves
a little extra tracking. I'd prefer to weaken or
even get rid of `procedure-closure-contents-eq?`, but
this adjustment keeps some contract tests passing.
The `procedure-specialize` function is the identity function, but it
provides a hint to the JIT to compile the body of a closure
specifically for the values in the closure (as opposed to compiling
the body generically for all closure instances).
This hint is useful to the contract system, where a predicate
is coerced to a projection with
(lambda (p?)
  (procedure-specialize
   (lambda (v)
     (if (p? v)
         v
         ....))))
Specializing the projection to a given `p?` allows primitive
predicates to be JIT-inlined in the projection's body.
Make the JIT-generated function-call dispatch recognize a call to
an impersonator that wraps a procedure only to hold properties.
This change also repairs handling for an arity-reducing wrapper
on a primitive, where the fast path incorrectly treated the
primitive as a JIT-generated function.
in particular, when there is a recursive contract, then we check only
some part of the first-order checks and see if that was enough to
distinguish the branches. if it was, we don't continue; otherwise we do
the first-order check and the projection itself, which
can duplicate work (potentially lots of work,
in a non-constant-factor sort of way, when
recursive-contract is involved).
this seems also to be a potential problem for other
uses of or/c too
It used to have a (provide (except-out (all-from-out <private-file>) ...))
and various private functions leaked to the outside over the years.
None of the ones removed in this commit were documented, so hopefully
they weren't being used. But this is definitely not backwards compatible,
so this commit is mostly about testing the waters.
A value that starts with "1", "y", or "Y" enables incremental mode
permanently (any value was allowed formerly), while a value that
starts with "0", "n", or "N" causes incremental-mode requests to be
ignored.
When a major GC triggers finalization, another major
GC is scheduled immediately on the grounds that the
finalizer may release other values. That was important
at one time, but the finalization and weak-reference
implementation has improved to the point where the
extra full GC no longer seems necessary or useful.
Originally, generation 1/2 was intended to delay major
collections when the heap is especially large. It doesn't
seem to be effective in that case, and it can slow down
minor GCs, so continue to use it only in incremental
mode (where it helps significantly with fragmentation).
At the completion of an incremental major GC, if incremental
mode wasn't requested recently, schedule an immediate major
GC to reduce the heap back to its normal footprint.
A collection's "info.rkt" might have `(define compile-omit-paths
'all)` but also other setup actions, so don't completely ignore
a collection directory just because there's nothing to compile.
Also, make the nursery 1/4 as big in incremental mode, and
correspondingly tune the amount of work to perform per
collection to 1/4 of its old value. Using generation 1/2
reduces fragmentation that would otherwise be increased
by a smaller nursery.
The main intent of this change is to raise the buffer size
for ARM, but that would leave only 32-bit x86 at size 100, so
just make it the larger size for all machines.
- uniformly remove the extra layers of calls to unknown functions for
chaperone-of? checks that make sure that chaperone contracts are
well-behaved (put those checks only in contracts that are created
outside racket/contract)
- clean up and simplify how missing projection functions are created
(val-first vs late-neg vs the regular ones)
- add some logging to more accurately tell when late-neg projections
aren't being used
- port the contract combinator that ->m uses to use late-neg
- port the </c combinator to use late-neg
Incremental GC skips the old-generation compaction phase.
For most applications, that's ok, but it's possible for
fragmentation to get out of hand. Detect that situation
and fall back to non-incremental mode for one major GC.
Memory accounting is enabled on demand; if demand goes
away --- as approximated by no live custodian having a limit or
having previously been queried for memory use ---
then stop accounting until demand resumes.
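A sketch of the two ways that demand is created (the sizes here are
arbitrary):
#lang racket/base
(define c (make-custodian))
(custodian-limit-memory c (* 32 1024 1024)) ; a limit creates demand
(current-memory-use c)                      ; so does a per-custodian query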
Fix the case that an old-generation finalizer ends up
on a modified page after all old-generation marking
is complete.
Also, make sure ephemerons are checked after previous
marking that may have left the stack empty.
The allocation strategy for immobile objects avoids some fragmentation
in non-incremental mode, but it interferes with finishing up an
incremental major collection, so trade some fragmentation for
an earlier finish (which is far more likely to use less memory
instead of more, despite extra fragmentation).