cells when searching in the tree
Two tricks: represent lists of nodes as improper lists, so singleton
lists don't allocate a cons, and pass around two accumulators that
correspond to the hd & tl of a path instead of cons'ing them up
into a list and then immediately taking it apart again.
Measurement: starting up DrRacket with
collects/drracket/private/unit.rkt, waiting for the colorer to
finish, and then inserting an open quote right before the first open
quote in the file (and waiting again for the colorer to finish)
creates 249000 cons cells before this change and 116000 after it.
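The two tricks described above can be sketched roughly as follows. This is only an illustration in Python, not the token-tree code: `Node`, `cons`, `search_list`, and `search_two_acc` are made-up names, Python tuples stand in for cons cells, and a counter stands in for the allocation measurement.

```python
class Node:
    """Binary search tree node (hypothetical stand-in for a tree node)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

cons_count = [0]  # number of pairs allocated so far

def cons(hd, tl):
    """Allocate one pair and count it."""
    cons_count[0] += 1
    return (hd, tl)

def search_list(root, key):
    """Baseline: cons every visited node onto a proper list (None-terminated)."""
    path = None
    node = root
    while node is not None:
        path = cons(node, path)
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return path

def search_two_acc(root, key):
    """Both tricks: return the path as two accumulators (head, tail).

    head is the last node visited and is never consed onto anything.
    tail holds the ancestors as an improper list: a single ancestor is
    stored as the bare node, so it costs no pair at all.
    """
    head, tail = None, None
    node = root
    while node is not None:
        if head is not None:
            tail = head if tail is None else cons(head, tail)
        head = node
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return head, tail
```

For a path of n nodes the baseline allocates n pairs, while the two-accumulator version with an improper tail allocates n - 2 (for n >= 2). This relies on nodes being distinguishable from pairs, which holds here because nodes are `Node` instances and pairs are tuples.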
After a little more work, I'm pretty much convinced that this was
the wrong approach and that the splaying implementation should just
change to not allocate the paths into lists at all, thus removing
the other 116k cons cells. (I plan to get to this another day;
it should not be difficult now that I roughly understand how these
things work.)
I also looked into top-down splaying and found these notes to
be illuminating:
http://digital.cs.usu.edu/~allan/DS/Notes/Ch22.pdf
They essentially convinced me that we cannot use top-down splaying
here, since the "reassembling" stage requires moving some arbitrary,
unexplored subtree from a right-child to a left-child, and thus the
left-subtree-length cannot be updated properly.
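For contrast, a single bottom-up rotation can keep a left-subtree-length field correct using only fields of the two nodes involved. A minimal Python sketch, assuming each node stores its own length plus the total length of its left subtree (`LenNode`, `left_len`, and the function names are hypothetical, not the actual representation):

```python
class LenNode:
    """Tree node that caches the total length of its left subtree."""
    def __init__(self, length, left=None, right=None):
        self.length = length
        self.left = left
        self.right = right
        self.left_len = total_length(left)

def total_length(node):
    """Total length of a subtree, found by walking down its right spine."""
    total = 0
    while node is not None:
        total += node.left_len + node.length
        node = node.right
    return total

def rotate_right(y):
    """Lift y's left child x above y; only y's cached length changes."""
    x = y.left
    y.left = x.right
    x.right = y
    # y's new left subtree is x's old right subtree: drop x and x's left part.
    y.left_len -= x.left_len + x.length
    return x

def rotate_left(x):
    """Lift x's right child y above x; only y's cached length changes."""
    y = x.right
    x.right = y.left
    y.left = x
    # y's new left subtree is x plus x's left part plus y's old left subtree.
    y.left_len += x.left_len + x.length
    return y
```

Neither rotation has to look inside a subtree it has not already visited, which is the property the top-down reassembly step appears to lack.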
This distinction is important after the introduction of chaperones and
impersonators, since accessing a key and accessing its corresponding value
may have different effects, and hash-keys should only trigger the former.
the errors that would be signalled by the body. Also, remove
url-regexp from the exports (it was only recently added).
I believe this eliminates two of Eli's concerns:
- the contract is no longer so painful to read
- the performance is more reasonable.
Specifically, for the performance, here are the times I see to call
string->url on "http://www.racket-lang.org":
no contract: any/c
  cpu time: 564 real time: 566 gc time: 3
weak contract: (-> (or/c string? bytes?) url?)
  cpu time: 590 real time: 590 gc time: 3
strong, regexp-based contract:
  (-> (or/c (not/c #rx"^([^:/?#]*):") #rx"^[a-zA-Z][a-zA-Z0-9+.-]*:") url?)
  cpu time: 632 real time: 633 gc time: 5
This appears to be about a 7% slowdown for the regexp-based contract
over the weaker contract (cpu time 632 vs. 590), and about 12% over
no contract at all (632 vs. 564).
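The relative slowdowns can be worked out from the cpu times listed above. A small Python sketch, used here only for the arithmetic (the dictionary keys and the helper name are made up, and a single timing run per contract is taken at face value):

```python
# cpu times as reported above, one figure per contract variant
cpu_time = {"none": 564, "weak": 590, "regexp": 632}

def slowdown_pct(slow, fast):
    """Percentage by which `slow` exceeds `fast`."""
    return 100.0 * (slow - fast) / fast

regexp_vs_weak = slowdown_pct(cpu_time["regexp"], cpu_time["weak"])
regexp_vs_none = slowdown_pct(cpu_time["regexp"], cpu_time["none"])
weak_vs_none = slowdown_pct(cpu_time["weak"], cpu_time["none"])

print(f"regexp vs weak: {regexp_vs_weak:.1f}%")  # 7.1%
print(f"regexp vs none: {regexp_vs_none:.1f}%")  # 12.1%
print(f"weak vs none:   {weak_vs_none:.1f}%")    # 4.6%
```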
related to PR 12652