It now uses "words" and ordinary pattern matching rather than regular expressions. The resulting code is quite a bit simpler and runs much faster. I've also added some unit tests for it.