For the curious, this was an attempt to change the way context
matching works. Currently, when matching a pattern, if 'hole' is
encountered, the match succeeds and the result just includes the term
at that point. This means that when matching (in-hole p1 p2), p1
generally returns multiple results and then those results are thinned
out by matching p2 against the thing actually at the hole.
Instead, one could pass along the function that does the matching and
then, when matching a hole pattern, it could decide right at that
point whether or not the match works.
This seems like it would be a win overall, but it interferes with
caching. Specifically, most reduction systems have lots of rules
that all begin
  (--> (in-hole E ...) ...)
and, in the strategy first described above, that matching can be cached.
But in the second, it cannot. Overall, this turns out to be a slight
loss in the current version of Redex. Maybe if other things change, however,
this tradeoff will change.
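
As a rough illustration of the first strategy, here is a minimal sketch.
The language L and its nonterminals e and E are hypothetical, chosen only
for this example and not taken from any Redex model. Matching the context
with an unconstrained hole yields one result per position in the term;
constraining what sits in the hole thins those results out.

```racket
#lang racket/base
(require redex/reduction-semantics)

;; Hypothetical language, for illustration only.
(define-language L
  (e (+ e e) number)
  (E hole (+ E e) (+ e E)))

(define t (term (+ (+ 1 2) 3)))

;; Every way of splitting t into a context E and some e in the hole.
(define all-decompositions (redex-match L (in-hole E e) t))

;; Only the decompositions whose hole contents is a number.
(define number-decompositions (redex-match L (in-hole E number) t))

(printf "~a decompositions, ~a with a number in the hole\n"
        (length all-decompositions)
        (length number-decompositions))
```

The match of E on its own does not depend on the pattern used for the
hole contents, which is why the first strategy can reuse it across rules.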
Revert "IN PROGRESS: more context speedup attempt"
This reverts commit 0134b8753d.
Revert "IN PROGRESS: a possible speed up attempt; match the thing in the hole before returning the context matches instead of afterwards"
This reverts commit 11059e2b5c.
This speeds up the lambdajs model considerably because the computation
to determine duplicates is expensive and no duplicates are really
ever dropped (and, in general, I think that duplicates will only
be dropped when the grammar is ambiguous; so maybe a better thing
is to just rewrite the grammar when that happens)
when one hole has been found
This improves the lambdajs model example's running time, presumably
because the hole is generally found near the "beginning" of the
term
Instead of using a hash-table, use the equal-hash-code directly;
this lets me evict entries only when they clobber each other,
and generally keep good cache utilization.
Also, cut the cache size by a factor of 5 while still having a
slight performance improvement on the r6rs test suite benchmark.
On that same benchmark, there are 1714812 misses in the cache, but
an entry in the cache is clobbered only 3485 times.
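
The idea of indexing by equal-hash-code and evicting only on a collision
can be sketched as a direct-mapped cache. This is an assumption-laden
illustration, not the code in Redex: the name memoize/direct-mapped, the
default size, and the boxed-key representation are all hypothetical.

```racket
#lang racket/base

;; Hypothetical helper: wrap f with a fixed-size, direct-mapped cache.
;; Each argument maps to exactly one slot, chosen by its equal-hash-code.
(define (memoize/direct-mapped f [size 350])
  (define keys (make-vector size #f))   ; #f (empty) or a boxed key
  (define vals (make-vector size #f))
  (lambda (arg)
    ;; modulo with a positive divisor is never negative,
    ;; even when the hash code is.
    (define idx (modulo (equal-hash-code arg) size))
    (define slot (vector-ref keys idx))
    (cond
      [(and slot (equal? (unbox slot) arg))
       (vector-ref vals idx)]            ; hit
      [else                              ; miss: fill or clobber the slot
       (let ([result (f arg)])
         (vector-set! keys idx (box arg))
         (vector-set! vals idx result)
         result)])))

;; Count how often the underlying function really runs.
(define calls 0)
(define cached-length
  (memoize/direct-mapped
   (lambda (lst) (set! calls (add1 calls)) (length lst))))

(define first-result (cached-length '(a b c)))
(define second-result (cached-length '(a b c)))  ; served from the cache
```

An entry is replaced only when a different key lands in the same slot, so
there is no separate eviction pass.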
have any holes, hide-holes, or names and, in that case, just
combining booleans instead of building mtch structs.
This does seem to work on a simple benchmark. The code below
gets about 6x faster. But on the r6rs test suite, there is
no substantial change (possibly because the caching obviates
this optimization?)
#lang racket/base
(require redex/reduction-semantics)
;; turn off the match cache so the matcher itself is measured
(caching-enabled? #f)
(define-language L (e (+ e e) number))
;; a right-nested sum, 100 levels deep
(define t
  (let loop ([n 100])
    (cond
      [(zero? n) 1]
      [else `(+ 11 ,(loop (- n 1)))])))
(define f (redex-match L e))
(time (for ([x (in-range 1000)]) (f t)))
When I enable this, I don't see any speedup on the R6RS test suite
benchmark (I see a minor slowdown). Here are the numbers I get on my
laptop:
nt cache: 35537 msec
neither: 844933 msec
Jay's idea: 875306 msec
And with both on, I see a similar, minor slowdown (as compared to the
version with the nt cache).
The main difference seems to be that I'm getting about 6 "hits" per
test case on the nt-match structs (that is, I avoid work by finding an
nt-match struct) and I'm getting about 8,800 hits in the cache per
test case.
redex patterns a bunch:
- repeats are turned into wrappers in sequences,
- names are all explicit,
- non-terminals are wrapped with `nt',
- cross patterns always have the hyphens in them,
- ellipses names are normalized (so there are no "hidden"
  name equalities); this also means that repeat patterns
  can have both a regular name and a mismatch name
Also, added a match-a-pattern helper macro that checks to make sure
that functions that process patterns don't miss any cases
This library is used by Redex, which wants a `syntax'-like template
language, but for datum values instead of syntax objects. Using
`datum-case' and `datum' generates much less code. Redex uses
only a small part of the general functionality, so adding
`syntax/datum' could be overkill. It's implemented by generalizing
the `syntax-case' and `syntax' pattern matching and template
constructing code, though; it's not a lot of extra code, and it's
easiest to generalize completely. We may find other uses for
datum templates, too.
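
A small sketch of what `datum-case' and `datum' look like in use. The
function rotate is a hypothetical example, not something Redex defines.

```racket
#lang racket/base
(require syntax/datum)

;; Move the first element of a list to the end, using a
;; syntax-case-style pattern and template on a plain datum.
(define (rotate v)
  (datum-case v ()
    [(a b ...) (datum (b ... a))]
    [anything v]))   ; not a non-empty list: return it unchanged

(define rotated (rotate '(1 2 3)))
(define untouched (rotate 5))
```

The pattern variables and ellipses work as they do in `syntax-case', but
the input and the result are ordinary S-expressions rather than syntax
objects.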
when doing typesetting stuff in Redex, as the former seems to have
some kind of context dependency that makes it insert ||s around
some upper-case symbols sometimes
a) avoids creating big intermediate lists of the same things over and over
(this closes PR 12380)
b) generates less code (by generating calls to local functions)
c) normalizes its output (sorts by the printed representation)
(by expanding into a call to a 30 or so line procedure, instead of putting
that code directly into the result of the macro).
This produces about a 6x speedup on this reduction-relation:
  (reduction-relation L (--> 0 1) (--> 1 2) ... (--> 99 100))
where L is
  (define-language L)
The time it takes to run "racket r6rs.rkt" in the shell from the
directory collects/redex/examples/r6rs speeds up by about 10% (15%
with errortrace enabled), in the case where all .zo files are built,
except the ones in the r6rs directory. (Also worth noting that "racket
-l redex" takes more than 50% of that time.) And the change has no
noticeable effect on the time it takes to run r6rs-test.rkt.
This case doesn't appear necessary, since LWs are constructed in an
expansion step that occurs after all of the meta-function names
(including the current one) are bound.
"canonical" way to write symbols, instead of the way they are displayed.
This makes a difference for symbols that have spaces in them or symbols
that, when displayed, look like numbers or other non-symbol things.
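
The difference between the written and displayed forms of a symbol can be
seen with `format': ~s uses write and ~a uses display. The helper names
written and displayed below are just for this illustration.

```racket
#lang racket/base

;; ~s goes through `write', ~a goes through `display'.
(define (written v) (format "~s" v))
(define (displayed v) (format "~a" v))

;; A symbol with a space, a symbol that looks like a number,
;; and an ordinary symbol.
(for ([sym (list (string->symbol "a b") (string->symbol "42") 'x)])
  (printf "written: ~a   displayed: ~a\n" (written sym) (displayed sym)))
```

Only the written form can be read back as the same symbol; the displayed
form of the second symbol would read as the number 42.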
that apply-reduction-relation* (and thus test-->>) uses
also make apply-reduction-relation* call remove-duplicates
on the result of apply-reduction-relation
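
For reference, a minimal sketch of the two functions mentioned above on a
relation where the same term is reachable along two paths. The relation R
and the empty language L are hypothetical and exist only for this example.

```racket
#lang racket/base
(require redex/reduction-semantics)

(define-language L)   ; empty language; the rules use only literals

;; Two paths from 0 to 3: one through 1 and one through 2.
(define R
  (reduction-relation
   L
   (--> 0 1)
   (--> 0 2)
   (--> 1 3)
   (--> 2 3)))

;; Terms reachable from 0 in exactly one step.
(define one-step (apply-reduction-relation R 0))

;; Irreducible terms reachable from 0 in any number of steps.
(define normal-forms (apply-reduction-relation* R 0))

(printf "one step: ~s\nnormal forms: ~s\n" one-step normal-forms)
```

Although 3 is reached along both paths, it should show up once in the
result of apply-reduction-relation*.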