There was a bug where names scoped in via pragmas were never scoped out again, which was corrupting the local names stack. I then realised that pragmas are really specifications, so the parser now handles them as specifications.
The rest of this patch is just rewiring to allow the special name munging that pragmas need (they already carry a munged version of their name) and to stop the scoped-in pragmas appearing in the AST.
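The scoping problem can be illustrated with a small Python sketch. This is not the compiler's code: the `LocalNames` class and the example names are hypothetical, and it only shows why every scope-in needs a matching scope-out.

```python
from contextlib import contextmanager


class LocalNames:
    """Hypothetical local-names stack; not the compiler's actual structure."""

    def __init__(self):
        self.stack = []

    @contextmanager
    def scoped(self, name, munged):
        # Scope the name in, and guarantee it is scoped out again,
        # even if processing the body raises an error.
        self.stack.append((name, munged))
        try:
            yield
        finally:
            self.stack.pop()

    def lookup(self, name):
        # Innermost declaration wins, so search from the top of the stack.
        for original, munged in reversed(self.stack):
            if original == name:
                return munged
        return None


names = LocalNames()
with names.scoped("foo", "foo_u1"):
    inside = names.lookup("foo")
after = names.lookup("foo")
```

If the pop were missing, `foo` would stay visible after its scope ended, which is the kind of leftover entry the bug described above left on the stack.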
The second part of the patch is essential, given the first. Otherwise names in different pragmas in the same file can overlap -- this already happened in oak!
I must admit, this was mainly done to allow munged names back in again as valid identifiers.
OEP 144 suggests replacing dot with underscore; this change just allows underscore alongside dot. It won't break any existing code, and seems like something we want anyway, so I think it's a valid thing to do.
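A rough sketch of the relaxed identifier rule, in Python. The exact character classes are an assumption (a letter followed by letters, digits and dots), and the pattern names are made up for illustration; the real lexer may differ.

```python
import re

# Assumed old rule: a letter, then letters, digits or dots.
IDENT_OLD = re.compile(r"[A-Za-z][A-Za-z0-9.]*\Z")
# Relaxed rule: underscore is accepted alongside dot.
IDENT_NEW = re.compile(r"[A-Za-z][A-Za-z0-9._]*\Z")


def is_identifier(text, pattern=IDENT_NEW):
    """Return True if the whole of text is a valid identifier."""
    return pattern.match(text) is not None
```

Every name accepted by the old pattern is still accepted by the new one, which is why existing code is unaffected, while munged names such as `foo_u43` become valid.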
One change, based on Adam's suggestion, was to rename the pragma to TOCKEXTERNAL.
Another, also based on Adam's suggestion, was to generate both the munged name and the original name, which allows (along with a previous patch) different files to declare the same PROC, and will remove the need for the occam_ prefix in the backend.
I also stopped using specific states in the lexer, in favour of just using the normal lexing function (which has had its type generalised slightly).
This will allow (along with a few patches in a minute) different occam files to declare the same PROC, and have it resolved correctly based on the order of their declaration, just as if it were all in one file.
The solution is a bit hacky, but this was an important problem. If your PRAGMA failed to parse, that was worthy of a warning. But if that failure then caused the whole parse to fail, all you got was the parser error (could not find name), and you never saw the warnings about the pragmas not being recognised. So now the pragma warnings are packed into the error (using a basic encoding), then pulled out and issued if the parser dies.
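The idea of carrying warnings inside the error can be sketched as follows. The marker string and the JSON payload are assumptions picked for the example; the patch only says that a basic encoding is used.

```python
import json

# Hypothetical separator; assumed never to occur in a real error message.
MARKER = "\x00WARNINGS:"


def encode_error(message, warnings):
    """Pack pending warnings into the error text."""
    return message + MARKER + json.dumps(warnings)


def decode_error(encoded):
    """Split an error back into (message, warnings)."""
    if MARKER not in encoded:
        return encoded, []
    message, _, payload = encoded.partition(MARKER)
    return message, json.loads(payload)
```

When the parser dies, the caller would decode the error, issue each recovered warning, and then report the plain message.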
The separately compiled occam PROCs now use #PRAGMA OCCAMEXTERNAL, which also drops the trailing "= number". These PROCs then need to be processed differently when the sizes are added (C externals take one size per dimension, occam externals take the normal array of sizes).
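A hedged sketch of the two size conventions. The function, the `kind` values and the generated parameter names are all hypothetical; only the distinction (one size per dimension versus one sizes array) comes from the description above.

```python
def add_size_params(params, kind):
    """Expand (name, dimensions) pairs with size parameters.

    kind is "c" for C externals or "occam" for occam externals.
    """
    if kind not in ("c", "occam"):
        raise ValueError("unknown external kind: " + kind)
    out = []
    for name, dims in params:
        out.append(name)
        if dims == 0:
            continue  # Not an array, so no sizes are needed.
        if kind == "c":
            # C externals: one separate size per dimension.
            out.extend(f"{name}_size{i}" for i in range(dims))
        else:
            # occam externals: a single array holding all the sizes.
            out.append(f"{name}_sizes")
    return out
```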
We also now record which processes were originally at the top-level, and keep their original names (i.e. minus the _u43 suffixes) plus an "occam_" prefix to avoid collisions.
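As an illustration only, recovering the exported name from a munged one might look like this. The `_u<digits>` suffix pattern is inferred from the `_u43` example, and `external_name` is a made-up helper.

```python
import re


def external_name(munged):
    """Strip a trailing _u<digits> suffix and add the occam_ prefix."""
    original = re.sub(r"_u\d+\Z", "", munged)
    return "occam_" + original
```

Only the final suffix is stripped, so a name that itself contains `_u1` earlier on keeps that part.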
I changed a little bit of the code, but mainly the tests. Several of the remaining failures are actually real failures, so I need to dig through the rest carefully. A lot are failing because the C++ backend is broken.
With my previous change to PRAGMAs, an unknown pragma failed fatally in the lexer and so always stopped compilation, which is not good. I've moved closer to Adam's suggestion of re-lexing and re-parsing the pragma from the parser, so we now gracefully ignore unknown pragmas again. The lexer is a bit messy, though.
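The "warn and carry on" behaviour can be sketched like this. The handler table, the warning text and the `handle_pragma` function are hypothetical; the real code re-lexes and re-parses the pragma from the parser.

```python
def handle_pragma(line, handlers, warnings):
    """Dispatch a #PRAGMA line, warning on (not failing for) unknown ones."""
    body = line.strip()
    prefix = "#PRAGMA"
    if not body.startswith(prefix):
        raise ValueError("not a pragma line")
    parts = body[len(prefix):].split(None, 1)
    if not parts:
        warnings.append("Empty pragma ignored")
        return None
    name = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    handler = handlers.get(name)
    if handler is None:
        # Unknown pragma: record a warning and let compilation continue.
        warnings.append("Unknown pragma ignored: " + name)
        return None
    return handler(rest)
```

A caller would register one handler per known pragma name and report the collected warnings once parsing finishes.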