This may seem like an odd change, but it simplifies the logic a lot. I kept having problems with passes not operating on externals (e.g. functions-to-procs, adding array sizes, constant folding in array dimensions) and adding a special case every time to also process the externals was getting silly.
Putting the externals in the AST therefore made sense, but I didn't want to just add dummy bodies as this would cause them to throw up errors (e.g. in the type-checking for functions). So I turned the bodies into a Maybe type, and that has worked out well.
I also stopped storing the formals in csExternals (since they are now in csNames, and the tree), which streamlined that nicely, and stopped me having to keep them up to date.
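A minimal sketch of the Maybe-bodied idea described above, for illustration only. The types `Formal` and `Proc`, their fields, and the functions `mapFormals` and `checkBody` are hypothetical simplifications, not the definitions in the real AST.

```haskell
-- Hypothetical, simplified AST; not the real Tock definitions.
data Formal = Formal
  { formalName :: String
  , formalDims :: [Maybe Int]   -- Nothing = dimension size not yet known
  } deriving (Show, Eq)

data Proc = Proc
  { procName    :: String
  , procFormals :: [Formal]
  , procBody    :: Maybe [String]  -- Nothing marks an external (no body)
  } deriving (Show, Eq)

-- A pass over formals applies to externals and ordinary PROCs alike,
-- so no special case is needed for externals.
mapFormals :: (Formal -> Formal) -> Proc -> Proc
mapFormals f p = p { procFormals = map f (procFormals p) }

-- A body check simply skips externals instead of tripping over a dummy body.
checkBody :: Proc -> Either String ()
checkBody p = case procBody p of
  Nothing -> Right ()
  Just [] -> Left (procName p ++ ": empty body")
  Just _  -> Right ()
```

The point of the design is that passes interested only in the signature never look at `procBody`, while passes that check bodies have an explicit `Nothing` case to skip.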
There was a bug where things scoped in via pragmas were never scoped out again, which was screwing up the local names stack. I then realised that pragmas are really specifications, so the parser now places them among the specifications.
The rest of this patch is just some rewiring to allow the special name munging involved in pragmas (they have already got a munged version of their name) and to stop the scoped in pragmas appearing in the AST.
The second part of the patch is essential, given the first. Otherwise names in different pragmas in the same file can overlap -- this already happened in oak!
One change, based on Adam's suggestion, was to rename the pragma to TOCKEXTERNAL.
Another, also based on Adam's suggestion, was to generate both the munged name and the original name, which allows (along with a previous patch) different files to declare the same PROC, and will remove the need for the occam_ prefix in the backend.
I also stopped using specific states in the lexer, in favour of just using the normal lexing function (which has had its type generalised slightly).
This will allow (along with a few patches in a minute) different occam files to declare the same PROC, and have it resolved correctly based on the order of their declaration, just as if it were all in one file.
The solution is a bit hacky, but this was an important problem. If your PRAGMA failed to parse, that was worthy of a warning. But if that then caused the parse to fail, all you would get is the parser error (could not find name), and you would never see the warnings about the pragmas not being recognised. So now the pragmas are shoved into the error (using a basic encoding) and pulled out and issued if the parser dies.
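A rough sketch of how warnings could be carried inside an error string and recovered afterwards. The message above does not describe the actual encoding, so the marker string and the names `encodeError` and `decodeError` are assumptions made for illustration. The sketch also assumes each warning fits on a single line.

```haskell
import Data.List (intercalate, isPrefixOf)

-- Hypothetical marker; the real encoding is not specified in the message.
marker :: String
marker = "@@PRAGMA-WARNING@@ "

-- Prefix the parser error with one marked line per pending warning.
encodeError :: [String] -> String -> String
encodeError warns msg = unlines (map (marker ++) warns) ++ msg

-- Split an encoded error back into (warnings, original error message).
decodeError :: String -> ([String], String)
decodeError s = (map (drop (length marker)) ws, intercalate "\n" rest)
  where
    (ws, rest) = span (marker `isPrefixOf`) (lines s)
```

When the parser dies, the caller would run `decodeError` on the failure text, issue each recovered warning, and then report the remaining message as the real error.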
The separately compiled occam PROCs now use #PRAGMA OCCAMEXTERNAL, which also discards the "= number" thing at the end. These PROCs then need to be processed differently when adding on the sizes (C externals have one size per dimension, occam externals have the normal array of sizes).
We also now record which processes were originally at the top-level, and keep their original names (i.e. minus the _u43 suffixes) plus an "occam_" prefix to avoid collisions.
With my previous change to PRAGMAs, an unknown pragma failed fatally in the lexer and so always stopped compilation, which is not good. I've moved it closer to Adam's suggestion of re-lexing and re-parsing the pragma from the parser, so we now gracefully ignore unknown pragmas again. The lexer is a bit messy, though.
At the moment, the information is only needed in the parser, which must define recursive names before parsing the body of the function. But in future, we should keep the information when the function becomes a proc, and then the C/C++ backends may need to use it (for example, when calculating stack space usage).
For now, I have fixed the occam parser so that it allows one or more direction specifiers after channel names. So c?? is valid, and should end up being equivalent to c?, but this may need altering later.
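A small hand-rolled sketch of "one or more direction specifiers", written without a parser library so it stands alone. The `Direction` type and `directedName` function are hypothetical. The sketch assumes that repeated specifiers must all be the same character and collapse to a single direction. The real parser may treat mixed specifiers such as c?! differently.

```haskell
import Data.Char (isAlphaNum)

data Direction = DirInput | DirOutput
  deriving (Show, Eq)

-- Parse a channel name followed by one or more direction specifiers.
-- Assumption: repeated specifiers must agree, so "c??" behaves like "c?".
directedName :: String -> Maybe (String, Direction)
directedName s =
  case span isAlphaNum s of
    (name@(_:_), d:ds)
      | Just dir <- toDir d, all (== d) ds -> Just (name, dir)
    _ -> Nothing
  where
    toDir '?' = Just DirInput
    toDir '!' = Just DirOutput
    toDir _   = Nothing
```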
This lets you write things like "[cs! FOR 5]", which is horrible; I
would prefer "[cs FOR 5]!", since then that doesn't imply that you can
do things like "cs![0] ! 0".
However, Tock now compiles and passes cgtest87 -- the first occam-pi
cgtest we've handled. :)
This is mostly straightforward: modify the parser to allow direction
decorators in the right places, and extend the type checker to match.
There's some slight awkwardness in that some of the Types functions
have to perform the same checks as the type checker (e.g. directing a
non-channel), so I've tidied up their error messages a bit.
At the backend, I've just added a little pass to strip out all the
DirectedVariables, since the other backend passes don't handle them
gracefully. From the occam/C point of view this is fine, but I'm not
sure if it's going to cause problems for C++.
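As an illustration of the stripping pass, here is a sketch over a cut-down variable type. Only the name DirectedVariable comes from the text above. The other constructors and the function `stripDirections` are hypothetical stand-ins, and the real pass works over the whole AST rather than a single variable.

```haskell
data Direction = DirInput | DirOutput
  deriving (Show, Eq)

-- Hypothetical, simplified variable type.
data Variable
  = Variable String
  | SubscriptedVariable Variable Int
  | DirectedVariable Direction Variable
  deriving (Show, Eq)

-- Remove every direction wrapper, leaving the underlying variable intact.
stripDirections :: Variable -> Variable
stripDirections (DirectedVariable _ v)    = stripDirections v
stripDirections (SubscriptedVariable v i) = SubscriptedVariable (stripDirections v) i
stripDirections v@(Variable _)            = v
```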
Previously it was a tuple, which meant it couldn't have sensible
custom instances. Token and TokenType now have Show instances, so we
get more useful output when parsing fails.
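To show why a data type helps here, a sketch with made-up constructors follows. Only the type names Token and TokenType come from the text. The constructors, the fields, and the exact output format are assumptions. A tuple can only use the generic tuple Show instance, whereas a dedicated type can print something readable in a parse error.

```haskell
-- Hypothetical constructors; the real token types are richer.
data TokenType = TokReserved | TokIdentifier | TokInteger
  deriving (Eq)

data Token = Token TokenType String
  deriving (Eq)

instance Show TokenType where
  show TokReserved   = "reserved word"
  show TokIdentifier = "identifier"
  show TokInteger    = "integer literal"

-- Custom instance: readable text instead of a raw tuple dump.
instance Show Token where
  show (Token t s) = show t ++ " \"" ++ s ++ "\""
```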