The lexer mishandled cases like "#x1E+2", because "E" is
not an exponent marker for hexadecimal --- and it has been that
way forever, but the repair isn't all that difficult.
lexers it defers to
also, remove the checks in color.rkt in the framework (they are not
all covered by the added contract, but they mostly are and when they
aren't, most of those times are using the heavily tested racket-lexer)
Also, some cleanups of the code and
make the syntax-color/ library docs
point to color:text<%> instead of
color:text% (as the interesting information
is attached to the interface, not the class)
A few invariants involving subtree-width and black-height balance
could break if singleton nodes were reused; bugs were due to stale
data in the nodes.
The token tree implements the implicit interface in the original
splay-based token tree module in syntax-color/token-tree. However, it
does not appear to perform significantly differently in the case of
indentation yet. It needs some additional profiling and analysis to
see if we can take advantage of the richer structure in the rb tree.
cells when searching in the tree
Two tricks: represent lists of nodes as improper lists so singleton
lists don't allocate a cons and pass around two accumulators that
correspond to the hd & tl of a path, instead of cons'ing that up
into a list and them immediately taking it apart again
measurement: when starting up drracket with
collects/drracket/private/unit.rkt and then waiting for the colorer to
finish, and then inserting an open quote right before the first open
quote in the file (and waiting again for the colorer to finish)
creates 249000 cons cells before this change and 116000 after this
change
After a little more work, I'm pretty much convinced that this was
the wrong approach and that the splaying implementation should just
change to not allocate the paths into lists at all, thus removing
the other 116k cons cells. (I plan to get to this another day;
it should not be difficult now that I roughly understand how these
things work.)
I also looked into top-down splaying and found these notes to
be illuminating:
http://digital.cs.usu.edu/~allan/DS/Notes/Ch22.pdf
They essentially convinced me that we cannot use top-down splaying
here, since the "reassembling" stage requires moving some arbitrary,
unexplored subtree from a right-child to a left-child, and thus the
left-subtree-length cannot be updated properly.
and change the tests so they all run with port line
counting enabled (or else the unicode test fails)
adjust module-lexer.rkt tests so they can run in either
port-counting mode or not (but currently run them all in
port-counting mode because scheme-lexer doesn't work without it)
also make a first stab at what needs to change in the module
lexer to make it work in non port line-counting mode
'read-language' uses as a single token in the case that read-language
fails. This helps it to deal with things like s-exp and at-exp
properly
closes PR 12260