142 lines
8.2 KiB
Plaintext
142 lines
8.2 KiB
Plaintext
On verification
|
|
===============
|
|
|
|
* Some guarantees can be enforced statically using the type system.
|
|
They are in a way proven to be enforeced in the expanded code,
|
|
but in order to write verified compilers, it would be necessary to
|
|
prove that the type indeed enforces the desired property, for all
|
|
possible macro inputs.
|
|
|
|
Advantage: these proofs need to be written only once for the graph macro,
|
|
and can be relied on by all compilers written using these static guarantees.
|
|
* Some guarantees can be enforced statically "by construction",
|
|
and the design patterns used to enforce them are guaranteed to be
|
|
correctly applied because the code is macro-generated.
|
|
In order to write verified compilers, it would be necessary to
|
|
prove that the macro-generated code always enforces the desired property.
|
|
|
|
Advantage: these proofs need to be written only once for the graph macro,
|
|
and can be relied on by all compilers written using these static guarantees.
|
|
* Some guarantees can be enforced at run-time.
|
|
This allows catching bugs early, and ensures no incorrect output can occur
|
|
when these guarantees are broken. It does not however ensure that the compiler
|
|
will never break these guarantees. It is therefore still possible to ship a
|
|
broken compiler which fails on valid inputs.
|
|
|
|
Drawback: it is necessary for every compiler written using such guards to prove
|
|
that the guards are never raised.
|
|
|
|
|
|
Features
|
|
========
|
|
|
|
* GC:
|
|
* Group nodes by SCC
|
|
* Two kinds of "indices" pointers:
|
|
* index within the current strongly-connected component
|
|
* direct pointer to a descendant SCC + index within it
|
|
* To limit the waste when there is an SCC not containing nodes of a given type,
|
|
do not use a fixed tuple of vectors of nodes. Instead, use a "list with possible omissions".
|
|
Problem: the size of the fully-expanded type in TR is O(2ⁿ) for these.
|
|
Either:
|
|
* use casts
|
|
* pay the compile-time price of a huge type (not feasible with large N)
|
|
* pay the run-time price of bloated components (feasible but costly with large N)
|
|
* use a case→ function, which returns the empty vector as a fallback. With pure functions, this could be doable.
|
|
* Problem: generating all the possible functions means O(2^n) templates for closures that may be used at
|
|
run-time
|
|
* Otherwise, generate one closure over 1 array, one closure over 2 arrays, one over 3 arrays, etc.
|
|
and also close over some markers indicating to which "real" arrays these correspond.
|
|
This means O(2n) space usage worst-case.
|
|
Problem again: the construction function for such a case→ while minimizing the number of values closed on
|
|
has a large O(2^n) type too.
|
|
* typed clojure has typed heterogeneous maps https://github.com/clojure/core.typed/wiki/Types#heterogeneous-maps
|
|
* In theory it could be possible to give hints to the GC,
|
|
but Racket's GC does not offer that possibility.
|
|
|
|
GC Implementation:
|
|
* Must be optional
|
|
* Initially, all nodes within a single component, all indices are of the "slef + index" kind.
|
|
* Use tarjan's algorithm
|
|
* group nodes by SCC
|
|
Update references to nodes in the same SCC, using just an index
|
|
Update references to nodes in "lower" SCCs, using a pointer to the SCC + an index.
|
|
* Due to the higher memory cost (+ 1 reference for references to other SCCs),
|
|
and the poor set of options for implementation (cast, exponential size of type, or large memory usage)
|
|
we will not implement the GC.
|
|
|
|
We argue that in most cases it would not be useful, as compilers will normally work on the whole graph, and
|
|
not on a subgraph. In the rare case where only part of a graph is needed, applying the graph-level identity
|
|
transformation to the nodes of interest would effectively copy them and the nodes reachable from these new roots,
|
|
thereby allowing the original graph to be freed.
|
|
|
|
* Invariants
|
|
* Scope of the invariants:
|
|
* Input contracts, types and structural properties
|
|
* Output contracts, types and structural properties
|
|
* Transformation invariants (relate the input and output graphs)
|
|
* Time of verification:
|
|
* Run-time
|
|
Should be easy enough to implement: Add a define-graph-invariant form, recognize these options, and check
|
|
them at run-time. Implemented as some sorts of contracts.
|
|
* Compile-time
|
|
* Node types
|
|
* Statically-enforced structural invariants at the level of node types
|
|
(e.g. disallow cycles in the types, to ensure there are no cycles in the instance)
|
|
* Macros can be used to generate code which is known to be correct
|
|
e.g. paths
|
|
* Concern: static "by construction" guarantees may interfere with each other, if they fill in some nodes,
|
|
e.g. a "no cycles starting from this node" constraint would not work as expected if a "backwards link"
|
|
is filled in afterwards. We probably need to hardcode a set of constraints which know about eachother
|
|
and about the potential interactions.
|
|
* Conserve well-scopedness within a transition: pass in nodes flagged with a ∀ type, and check
|
|
that the output contains that flag.
|
|
* PHOAS
|
|
* Specification
|
|
* Invariants specified in the graph type
|
|
* Transformation invariants specified on the graph creation code
|
|
* Checks (run-time or the various compile-time mechanisms) are specified in the graph creation code
|
|
The graph creation code must enforce all invariants within the type
|
|
That way, it is guaranteed that any instance of the graph type satisfies its invariants, either by
|
|
construction, or as a guard at the end of the construction.
|
|
|
|
* Automatic generation of mappings
|
|
* when there is no mapping taking an old node as an input, a mapping is automatically generated.
|
|
The mapping simply translates the old node to the new node type, and recursively transforms its fields,
|
|
traversing lists etc.
|
|
* When there is a mapping with more than one argument, then no mapping is auto-generated for that input node type,
|
|
and instead calls to the mapping must be explicit (i.e. can't return the old node type).
|
|
* This means that we have a mechanism before the actual core graph macro, which checks and decides which mappings
|
|
to auto-generate.
|
|
* We also have a mechanism for auto-calling transformations on input node types
|
|
* Possibility to escape this, in order to actually insert a reference to the old graph?
|
|
* Some notation in the type, to indicate that it's "protected" in some way?
|
|
* Some wrapper indicating that it should merely be unwrapped, not processed? (I think that's better).
|
|
|
|
* Structural node equality
|
|
* Would be nice to coalesce nodes which are equal? (i.e. hash consing)
|
|
so that any two nodes which are equal? within the same graph have the same index.
|
|
I suppose this would be rather costly: O(n log n) comparisons, with each comparison potentially
|
|
costly, in principle. If we cached results, we could achieve a better run-time in practice, and
|
|
perhaps a better theoretical complexity if we handled cycles ourselves.
|
|
* The general algorithm when there are no unordered sets is deterministic finite automaton minimization
|
|
* The general algorithm when there are unordered sets is nondeterministic finite automaton minimization
|
|
* We could cache all calls to equal? on two nodes for a limited dynamic scope.
|
|
* If we have this, then we can, for comparisons within the same graph, quickly return #t or #f based on eq?
|
|
* Alpha-equivalence comparisons
|
|
(i.e. ignore the actual values of some fields, except for constraining the shape of the graph)
|
|
Possibility: create values which are unique within the graph, but are ignored when comparing
|
|
nodes from different graphs.
|
|
I'm not sure whether plainly erasing the information won't be enough, for a weaker form of alpha-equivalence?
|
|
* Alpha-equivalence can't easily be implemented atop equal?, dropping it for now.
|
|
It's not extremely useful anyway, as thanks to the full-blown graph representation, we are likely to
|
|
discard names early on (keeping them just for printing / debuggin purposes).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
* Coloring: the existing graph library for Racket has some coloring algorithms:
|
|
http://docs.racket-lang.org/graph/index.html#%28def._%28%28lib._graph%2Fmain..rkt%29._coloring%2Fgreedy%29%29
|
|
Maybe we can build a wrapper for those? |