One change, based on Adam's suggestion, was to rename the pragma to TOCKEXTERNAL.
Another, also based on Adam's suggestion, was to generate both the munged name and the original name, which allows (along with a previous patch) different files to declare the same PROC, and will remove the need for the occam_ prefix in the backend.
I also stopped using specific states in the lexer, in favour of just using the normal lexing function (which has had its type generalised slightly).
This will allow (along with a few patches in a minute) different occam files to declare the same PROC, and have it resolved correctly based on the order of their declaration, just like if it was all in one file.
The solution is a bit hacky, but this was an important problem. If your PRAGMA failed to parse, that was worthy of a warning. But if that then caused the parse to fail, all you would get is the parser error (could not find name), and you would never see the warnings about the pragmas not being recognised. So now the pragmas are shoved into the error (using a basic encoding) and pulled out and issued if the parser dies.
The separately compiled occam PROCs now use #PRAGMA OCCAMEXTERNAL, which also discards the "= number" thing at the end. These PROCs then need to be processed differently when adding on the sizes (C externals have one size per dimension, occam externals have the normal array of sizes).
We also now record which processes were originally at the top-level, and keep their original names (i.e. minus the _u43 suffixes) plus an "occam_" prefix to avoid collisions.
This shows that my current scheme is a bit of a hack; if an occam programmer starts a normal PROC with B or C, it will be treated funny during separate compilation. In future I probably need some new form of #PRAGMA to support separate compilation, different from that used to interface with C.
I changed a little bit of the code, but mainly the tests. Several of the remaining failures are actually real failures, so I need to dig through the rest carefully. A lot are failing because the C++ backend is broken.
Due to awkward module dependencies, some functions had to be moved around to accommodate this change. Two from Types have gone to EvalLiterals, and two to CompState. Everything still compiles just as before though.
The problem was that the free name could involved in an array dimension (and hence a type) of something in the PROC. When the name was then replaced in the type, CompState was not updated to have the new type, and instead kept the old type (potentially) all the way through to the backend, where it might be used for checking the bounds of an array index (against the old name taken from CompState, not the replaced name).
With my previous change to PRAGMAs, unknown pragmas would fatally fail in the lexer, so that an unknown pragma would always stop compilation, which is not good. I've changed it more towards Adam's suggestion of re-lexing and re-parsing the pragma from the parser, so we now gracefully ignore unknown pragmas again. The lexer is a bit messy, though.
This is quite a big patch, as it reworks a large pass. The three backend passes dealing with sizes stuff have now been merged into one (because the traversal order is important).
Instead of generating sizes arrays by blindly appending "_sizes", we now create nonces and store them in the csArraySizes map in CompState, which is a bit less hacky.
Added to that, we also generate constant-size arrays (e.g. for [8]) -- which are needed in case we pass the array to a PROC that has a flexible dimension -- at the top of the whole program, and use that array for every variable with that size (so if foo and bar have the same size, we use the same sizes array from the top of the program).