This may seem like an odd change, but it simplifies the logic a lot. I kept having problems with passes not operating on externals (e.g. functions-to-procs, adding array sizes, constant folding in array dimensions) and adding a special case every time to also process the externals was getting silly.
Putting the externals in the AST therefore made sense, but I didn't want to just add dummy bodies as this would cause them to throw up errors (e.g. in the type-checking for functions). So I turned the bodies into a Maybe type, and that has worked out well.
I also stopped storing the formals in csExternals (since they are now in csNames, and the tree), which streamlined that nicely, and stopped me having to keep them up to date.
I changed a little bit of the code, but mainly the tests. Several of the remaining failures are actually real failures, so I need to dig through the rest carefully. A lot are failing because the C++ backend is broken.
The problem was that the free name could involved in an array dimension (and hence a type) of something in the PROC. When the name was then replaced in the type, CompState was not updated to have the new type, and instead kept the old type (potentially) all the way through to the backend, where it might be used for checking the bounds of an array index (against the old name taken from CompState, not the replaced name).
When we pull up names (which happens before abbreviation checking), we create a lot of nonces with greater scope than the original variables. When we go to abbreviation check this, we therefore set off lots of false alarms about originally safe code. So to fix this, the easiest method seems to be turning off checking on all the names introduced by the compiler.
This expansion was causing a big blow-up in the code, as things like:
VAL [2][1]INT as IS [[0,1]]
were getting transformed into:
VAL [2]INT n0 IS [0,1]:
VAL [2]INT n1 IS [0,1]:
VAL [2]INT n2 IS [n0[0], n1[1]]:
VAL [2][1]INT as IS [n2]:
Or something similar -- the inner arrays were pulled up into multiple definitions that were then subscripted, because the first pull-up did this:
VAL [2]INT n2 IS [[0,1][0], [0,1][1]]:
and then the inner arrays got pulled up again, separately. The change hasn't immediately broken anything, but I haven't fully tested it yet