notes
This commit is contained in:
parent
60f8915c59
commit
7bc0b80356
40
features-and-implementation
Normal file
40
features-and-implementation
Normal file
|
@ -0,0 +1,40 @@
|
||||||
|
* gc
|
||||||
|
* automatic generation of mappings Old → New
|
||||||
|
|
||||||
|
|
||||||
|
* safe and automatic generation of 1↔1, 1↔many and many↔1 backward references
|
||||||
|
|
||||||
|
** global directives which apply to all nodes types: insert a pointer to the root, insert a backward pointer for every link (i.e. .back brings back to the previous node)
|
||||||
|
|
||||||
|
** the backward references specify the link that should be reversed, as well as a path from there. That way, methods can have a backward pointer to the "call" instructions referencing them, but the backward pointer should not
|
||||||
|
directly point to the "call" instruction, but instead to the instruction's containing method, accessed with ".method". Maybe some syntactic sugar can allow this to be specified as #:backward-reference method.instructions.*[call]
|
||||||
|
|
||||||
|
* try to think about PHOAS
|
||||||
|
|
||||||
|
** Strong HOAS article: https://www.schoolofhaskell.com/user/edwardk/phoas
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
* mapping-placeholder = Placeholder type, one for each mapping
|
||||||
|
* mapping-incomplete-result = (~> mapping) types within the mapping's result type are replaced by (U mapping-placeholder mapping-incomplete-result), and node types are replaced by (U xxx old-node)
|
||||||
|
* mapping-with-promises-result = node types within the mapping's result type are replaced by node-promise
|
||||||
|
|
||||||
|
* node-promise = (Promise node-with-promises)
|
||||||
|
* node-with-promises = node types within the node's fields are replaced by node-promise
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
To discuss:
|
||||||
|
* maybe add a "self" variable, so that a mapping can give to others a placeholder for the node it is currently generating? That wouldn't play well with mappings which generate multiple nodes and/or part of a node.
|
||||||
|
** self is a opaque "SELF" token, not an actual placeholder
|
||||||
|
|
||||||
|
Minor TODOs:
|
||||||
|
* If we can guarantee that any two nodes within the same graph are (not (equal? …)), then we can implement a fast negative equality test, in addition to the existing fast equality test (which compares the "raw" field of nodes).
|
142
features-and-implementation2
Normal file
142
features-and-implementation2
Normal file
|
@ -0,0 +1,142 @@
|
||||||
|
On verification
|
||||||
|
===============
|
||||||
|
|
||||||
|
* Some guarantees can be enforced statically using the type system.
|
||||||
|
They are in a way proven to be enforeced in the expanded code,
|
||||||
|
but in order to write verified compilers, it would be necessary to
|
||||||
|
prove that the type indeed enforces the desired property, for all
|
||||||
|
possible macro inputs.
|
||||||
|
|
||||||
|
Advantage: these proofs need to be written only once for the graph macro,
|
||||||
|
and can be relied on by all compilers written using these static guarantees.
|
||||||
|
* Some guarantees can be enforced statically "by construction",
|
||||||
|
and the design patterns used to enforce them are guaranteed to be
|
||||||
|
correctly applied because the code is macro-generated.
|
||||||
|
In order to write verified compilers, it would be necessary to
|
||||||
|
prove that the macro-generated code always enforces the desired property.
|
||||||
|
|
||||||
|
Advantage: these proofs need to be written only once for the graph macro,
|
||||||
|
and can be relied on by all compilers written using these static guarantees.
|
||||||
|
* Some guarantees can be enforced at run-time.
|
||||||
|
This allows catching bugs early, and ensures no incorrect output can occur
|
||||||
|
when these guarantees are broken. It does not however ensure that the compiler
|
||||||
|
will never break these guarantees. It is therefore still possible to ship a
|
||||||
|
broken compiler which fails on valid inputs.
|
||||||
|
|
||||||
|
Drawback: it is necessary for every compiler written using such guards to prove
|
||||||
|
that the guards are never raised.
|
||||||
|
|
||||||
|
|
||||||
|
Features
|
||||||
|
========
|
||||||
|
|
||||||
|
* GC:
|
||||||
|
* Group nodes by SCC
|
||||||
|
* Two kinds of "indices" pointers:
|
||||||
|
* index within the current strongly-connected component
|
||||||
|
* direct pointer to a descendant SCC + index within it
|
||||||
|
* To limit the waste when there is an SCC not containing nodes of a given type,
|
||||||
|
do not use a fixed tuple of vectors of nodes. Instead, use a "list with possible omissions".
|
||||||
|
Problem: the size of the fully-expanded type in TR is O(2ⁿ) for these.
|
||||||
|
Either:
|
||||||
|
* use casts
|
||||||
|
* pay the compile-time price of a huge type (not feasible with large N)
|
||||||
|
* pay the run-time price of bloated components (feasible but costly with large N)
|
||||||
|
* use a case→ function, which returns the empty vector as a fallback. With pure functions, this could be doable.
|
||||||
|
* Problem: generating all the possible functions means O(2^n) templates for closures that may be used at
|
||||||
|
run-time
|
||||||
|
* Otherwise, generate one closure over 1 array, one closure over 2 arrays, one over 3 arrays, etc.
|
||||||
|
and also close over some markers indicating to which "real" arrays these correspond.
|
||||||
|
This means O(2n) space usage worst-case.
|
||||||
|
Problem again: the construction function for such a case→ while minimizing the number of values closed on
|
||||||
|
has a large O(2^n) type too.
|
||||||
|
* typed clojure has typed heterogeneous maps https://github.com/clojure/core.typed/wiki/Types#heterogeneous-maps
|
||||||
|
* In theory it could be possible to give hints to the GC,
|
||||||
|
but Racket's GC does not offer that possibility.
|
||||||
|
|
||||||
|
GC Implementation:
|
||||||
|
* Must be optional
|
||||||
|
* Initially, all nodes within a single component, all indices are of the "slef + index" kind.
|
||||||
|
* Use tarjan's algorithm
|
||||||
|
* group nodes by SCC
|
||||||
|
Update references to nodes in the same SCC, using just an index
|
||||||
|
Update references to nodes in "lower" SCCs, using a pointer to the SCC + an index.
|
||||||
|
* Due to the higher memory cost (+ 1 reference for references to other SCCs),
|
||||||
|
and the poor set of options for implementation (cast, exponential size of type, or large memory usage)
|
||||||
|
we will not implement the GC.
|
||||||
|
|
||||||
|
We argue that in most cases it would not be useful, as compilers will normally work on the whole graph, and
|
||||||
|
not on a subgraph. In the rare case where only part of a graph is needed, applying the graph-level identity
|
||||||
|
transformation to the nodes of interest would effectively copy them and the nodes reachable from these new roots,
|
||||||
|
thereby allowing the original graph to be freed.
|
||||||
|
|
||||||
|
* Invariants
|
||||||
|
* Scope of the invariants:
|
||||||
|
* Input contracts, types and structural properties
|
||||||
|
* Output contracts, types and structural properties
|
||||||
|
* Transformation invariants (relate the input and output graphs)
|
||||||
|
* Time of verification:
|
||||||
|
* Run-time
|
||||||
|
Should be easy enough to implement: Add a define-graph-invariant form, recognize these options, and check
|
||||||
|
them at run-time. Implemented as some sorts of contracts.
|
||||||
|
* Compile-time
|
||||||
|
* Node types
|
||||||
|
* Statically-enforced structural invariants at the level of node types
|
||||||
|
(e.g. disallow cycles in the types, to ensure there are no cycles in the instance)
|
||||||
|
* Macros can be used to generate code which is known to be correct
|
||||||
|
e.g. paths
|
||||||
|
* Concern: static "by construction" guarantees may interfere with each other, if they fill in some nodes,
|
||||||
|
e.g. a "no cycles starting from this node" constraint would not work as expected if a "backwards link"
|
||||||
|
is filled in afterwards. We probably need to hardcode a set of constraints which know about eachother
|
||||||
|
and about the potential interactions.
|
||||||
|
* Conserve well-scopedness within a transition: pass in nodes flagged with a ∀ type, and check
|
||||||
|
that the output contains that flag.
|
||||||
|
* PHOAS
|
||||||
|
* Specification
|
||||||
|
* Invariants specified in the graph type
|
||||||
|
* Transformation invariants specified on the graph creation code
|
||||||
|
* Checks (run-time or the various compile-time mechanisms) are specified in the graph creation code
|
||||||
|
The graph creation code must enforce all invariants within the type
|
||||||
|
That way, it is guaranteed that any instance of the graph type satisfies its invariants, either by
|
||||||
|
construction, or as a guard at the end of the construction.
|
||||||
|
|
||||||
|
* Automatic generation of mappings
|
||||||
|
* when there is no mapping taking an old node as an input, a mapping is automatically generated.
|
||||||
|
The mapping simply translates the old node to the new node type, and recursively transforms its fields,
|
||||||
|
traversing lists etc.
|
||||||
|
* When there is a mapping with more than one argument, then no mapping is auto-generated for that input node type,
|
||||||
|
and instead calls to the mapping must be explicit (i.e. can't return the old node type).
|
||||||
|
* This means that we have a mechanism before the actual core graph macro, which checks and decides which mappings
|
||||||
|
to auto-generate.
|
||||||
|
* We also have a mechanism for auto-calling transformations on input node types
|
||||||
|
* Possibility to escape this, in order to actually insert a reference to the old graph?
|
||||||
|
* Some notation in the type, to indicate that it's "protected" in some way?
|
||||||
|
* Some wrapper indicating that it should merely be unwrapped, not processed? (I think that's better).
|
||||||
|
|
||||||
|
* Structural node equality
|
||||||
|
* Would be nice to coalesce nodes which are equal? (i.e. hash consing)
|
||||||
|
so that any two nodes which are equal? within the same graph have the same index.
|
||||||
|
I suppose this would be rather costly: O(n log n) comparisons, with each comparison potentially
|
||||||
|
costly, in principle. If we cached results, we could achieve a better run-time in practice, and
|
||||||
|
perhaps a better theoretical complexity if we handled cycles ourselves.
|
||||||
|
* The general algorithm when there are no unordered sets is deterministic finite automaton minimization
|
||||||
|
* The general algorithm when there are unordered sets is nondeterministic finite automaton minimization
|
||||||
|
* We could cache all calls to equal? on two nodes for a limited dynamic scope.
|
||||||
|
* If we have this, then we can, for comparisons within the same graph, quickly return #t or #f based on eq?
|
||||||
|
* Alpha-equivalence comparisons
|
||||||
|
(i.e. ignore the actual values of some fields, except for constraining the shape of the graph)
|
||||||
|
Possibility: create values which are unique within the graph, but are ignored when comparing
|
||||||
|
nodes from different graphs.
|
||||||
|
I'm not sure whether plainly erasing the information won't be enough, for a weaker form of alpha-equivalence?
|
||||||
|
* Alpha-equivalence can't easily be implemented atop equal?, dropping it for now.
|
||||||
|
It's not extremely useful anyway, as thanks to the full-blown graph representation, we are likely to
|
||||||
|
discard names early on (keeping them just for printing / debuggin purposes).
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
* Coloring: the existing graph library for Racket has some coloring algorithms:
|
||||||
|
http://docs.racket-lang.org/graph/index.html#%28def._%28%28lib._graph%2Fmain..rkt%29._coloring%2Fgreedy%29%29
|
||||||
|
Maybe we can build a wrapper for those?
|
99
features-and-implementation3
Normal file
99
features-and-implementation3
Normal file
|
@ -0,0 +1,99 @@
|
||||||
|
* Invariants
|
||||||
|
* Scope of the invariants:
|
||||||
|
* Input contracts, types and structural properties
|
||||||
|
* Output contracts, types and structural properties
|
||||||
|
* Transformation invariants (relate the input and output graphs)
|
||||||
|
* Time of verification:
|
||||||
|
* Run-time
|
||||||
|
Should be easy enough to implement: Add a define-graph-invariant form, recognize these options, and check
|
||||||
|
them at run-time. Implemented as some sorts of contracts.
|
||||||
|
* Compile-time
|
||||||
|
* Node types
|
||||||
|
* Statically-enforced structural invariants at the level of node types
|
||||||
|
(e.g. disallow cycles in the types, to ensure there are no cycles in the instance)
|
||||||
|
* Macros can be used to generate code which is known to be correct
|
||||||
|
e.g. paths
|
||||||
|
* Concern: static "by construction" guarantees may interfere with each other, if they fill in some nodes,
|
||||||
|
e.g. a "no cycles starting from this node" constraint would not work as expected if a "backwards link"
|
||||||
|
is filled in afterwards. We probably need to hardcode a set of constraints which know about eachother
|
||||||
|
and about the potential interactions.
|
||||||
|
* Conserve well-scopedness within a transition: pass in nodes flagged with a ∀ type, and check
|
||||||
|
that the output contains that flag. Potentially out-of-scope fields in the input do not have the flag.
|
||||||
|
* PHOAS
|
||||||
|
* Specification
|
||||||
|
* Invariants specified in the graph type
|
||||||
|
* Transformation invariants specified on the graph creation code
|
||||||
|
* Checks (run-time or the various compile-time mechanisms) are specified in the graph creation code
|
||||||
|
The graph creation code must enforce all invariants within the type
|
||||||
|
That way, it is guaranteed that any instance of the graph type satisfies its invariants, either by
|
||||||
|
construction, or as a guard at the end of the construction.
|
||||||
|
|
||||||
|
; The body should produce a function of type (→ (Listof Nodeᵢ) … Boolean)
|
||||||
|
; The body should also return an indication about which invariants it ensures. Maybe use {~seq #:ensures (invariant-name inariant-arg …)} …
|
||||||
|
however this does not allow the body to perform some minimal amount of rewriting on the "ensures" options.
|
||||||
|
(define-syntax/parse (define-graph-contract (name g-descriptor arg …) . body)
|
||||||
|
#'(define-syntax name
|
||||||
|
(graph-contract
|
||||||
|
(λ (g-descriptor arg …) . body))))
|
||||||
|
|
||||||
|
;; TODO: find a way to translate this to a type, with subtyping for weaker invariants.
|
||||||
|
;; The type only serves to testify that the invariant was statically enforced or dynamically checked at construction, it does not actually encode the property using the type system.
|
||||||
|
(struct invariants-wrapper ()) ;; Use a private struct to prevent forging of the invariants aggregated in a case→ (since it is never executed, any non-terminating λ could otherwise be supplied).
|
||||||
|
(define-for-syntax invariant-introducer
|
||||||
|
(make-syntax-introducer))
|
||||||
|
;; the body should return the syntax for a type, such that less precise invariants are supertypes of that type.
|
||||||
|
(define-syntax/parse (define-graph-invariant (name g-descriptor arg …) . body)
|
||||||
|
#'(define-syntax name
|
||||||
|
(graph-invariant
|
||||||
|
(λ (g-descriptor arg …) . body))))
|
||||||
|
|
||||||
|
The type of a graph node with the invariants inv₁ … invₙ on the graph must include an extra dummy field with type
|
||||||
|
(invariants-wrapper
|
||||||
|
(case (→ inv₁ inv-arg … #t)
|
||||||
|
…
|
||||||
|
(→ invₙ inv-arg … #t)))
|
||||||
|
|
||||||
|
The invariant arguments can be symbols, to indicate node types or field names.
|
||||||
|
|
||||||
|
Concern: invariants on a graph may not have the same semantics if the graph has more or fewer nodes?
|
||||||
|
|
||||||
|
* Automatic generation of mappings
|
||||||
|
* when there is no mapping taking an old node as an input, a mapping is automatically generated.
|
||||||
|
The mapping simply translates the old node to the new node type, and recursively transforms its fields,
|
||||||
|
traversing lists etc.
|
||||||
|
* When there is a mapping with more than one argument, then no mapping is auto-generated for that input node type,
|
||||||
|
and instead calls to the mapping must be explicit (i.e. can't return the old node type).
|
||||||
|
* This means that we have a mechanism before the actual core graph macro, which checks and decides which mappings
|
||||||
|
to auto-generate.
|
||||||
|
* We also have a mechanism for auto-calling transformations on input node types
|
||||||
|
* Possibility to escape this, in order to actually insert a reference to the old graph?
|
||||||
|
* Some notation in the type, to indicate that the output should be an "old" node here
|
||||||
|
* In the rare case where we have an (U (Old nd) nd), in the mapping use some (old v) wrapper
|
||||||
|
indicating that the returned it should merely be unwrapped, not processed.
|
||||||
|
|
||||||
|
(define-syntax define-graph-type
|
||||||
|
(syntax-parser
|
||||||
|
[(_ _name [_nodeᵢ [_fieldᵢⱼ :colon _τᵢⱼ] …] …)
|
||||||
|
;; save the graph type metadata
|
||||||
|
]))
|
||||||
|
|
||||||
|
(define-syntax define-graph-transformer
|
||||||
|
(syntax-parser
|
||||||
|
[(_ _name graph-type)
|
||||||
|
;; TODO
|
||||||
|
]))
|
||||||
|
|
||||||
|
* Structural node equality
|
||||||
|
* Would be nice to coalesce nodes which are equal? (i.e. hash consing)
|
||||||
|
so that any two nodes which are equal? within the same graph have the same index.
|
||||||
|
I suppose this would be rather costly: O(n log n) comparisons, with each comparison potentially
|
||||||
|
costly, in principle. If we cached results, we could achieve a better run-time in practice, and
|
||||||
|
perhaps a better theoretical complexity if we handled cycles ourselves.
|
||||||
|
* The general algorithm when there are no unordered sets is deterministic finite automaton minimization
|
||||||
|
* The general algorithm when there are unordered sets is nondeterministic finite automaton minimization
|
||||||
|
* We could cache all calls to equal? on two nodes for a limited dynamic scope.
|
||||||
|
* If we have this, then we can, for comparisons within the same graph, quickly return #t or #f based on eq?
|
||||||
|
|
||||||
|
* Coloring: the existing graph library for Racket has some coloring algorithms:
|
||||||
|
http://docs.racket-lang.org/graph/index.html#%28def._%28%28lib._graph%2Fmain..rkt%29._coloring%2Fgreedy%29%29
|
||||||
|
Maybe we can build a wrapper for those?
|
Loading…
Reference in New Issue
Block a user