notes
This commit is contained in:
parent
60f8915c59
commit
7bc0b80356
40
features-and-implementation
Normal file
40
features-and-implementation
Normal file
|
@ -0,0 +1,40 @@
|
|||
* gc
|
||||
* automatic generation of mappings Old → New
|
||||
|
||||
|
||||
* safe and automatic generation of 1↔1, 1↔many and many↔1 backward references
|
||||
|
||||
** global directives which apply to all nodes types: insert a pointer to the root, insert a backward pointer for every link (i.e. .back brings back to the previous node)
|
||||
|
||||
** the backward references specify the link that should be reversed, as well as a path from there. That way, methods can have a backward pointer to the "call" instructions referencing them, but the backward pointer should not
|
||||
directly point to the "call" instruction, but instead to the instruction's containing method, accessed with ".method". Maybe some syntactic sugar can allow this to be specified as #:backward-reference method.instructions.*[call]
|
||||
|
||||
* try to think about PHOAS
|
||||
|
||||
** Strong HOAS article: https://www.schoolofhaskell.com/user/edwardk/phoas
|
||||
|
||||
|
||||
|
||||
* mapping-placeholder = Placeholder type, one for each mapping
|
||||
* mapping-incomplete-result = (~> mapping) types within the mapping's result type are replaced by (U mapping-placeholder mapping-incomplete-result), and node types are replaced by (U xxx old-node)
|
||||
* mapping-with-promises-result = node types within the mapping's result type are replaced by node-promise
|
||||
|
||||
* node-promise = (Promise node-with-promises)
|
||||
* node-with-promises = node types within the node's fields are replaced by node-promise
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
To discuss:
|
||||
* maybe add a "self" variable, so that a mapping can give to others a placeholder for the node it is currently generating? That wouldn't play well with mappings which generate multiple nodes and/or part of a node.
|
||||
** self is a opaque "SELF" token, not an actual placeholder
|
||||
|
||||
Minor TODOs:
|
||||
* If we can guarantee that any two nodes within the same graph are (not (equal? …)), then we can implement a fast negative equality test, in addition to the existing fast equality test (which compares the "raw" field of nodes).
|
142
features-and-implementation2
Normal file
142
features-and-implementation2
Normal file
|
@ -0,0 +1,142 @@
|
|||
On verification
|
||||
===============
|
||||
|
||||
* Some guarantees can be enforced statically using the type system.
|
||||
They are in a way proven to be enforeced in the expanded code,
|
||||
but in order to write verified compilers, it would be necessary to
|
||||
prove that the type indeed enforces the desired property, for all
|
||||
possible macro inputs.
|
||||
|
||||
Advantage: these proofs need to be written only once for the graph macro,
|
||||
and can be relied on by all compilers written using these static guarantees.
|
||||
* Some guarantees can be enforced statically "by construction",
|
||||
and the design patterns used to enforce them are guaranteed to be
|
||||
correctly applied because the code is macro-generated.
|
||||
In order to write verified compilers, it would be necessary to
|
||||
prove that the macro-generated code always enforces the desired property.
|
||||
|
||||
Advantage: these proofs need to be written only once for the graph macro,
|
||||
and can be relied on by all compilers written using these static guarantees.
|
||||
* Some guarantees can be enforced at run-time.
|
||||
This allows catching bugs early, and ensures no incorrect output can occur
|
||||
when these guarantees are broken. It does not however ensure that the compiler
|
||||
will never break these guarantees. It is therefore still possible to ship a
|
||||
broken compiler which fails on valid inputs.
|
||||
|
||||
Drawback: it is necessary for every compiler written using such guards to prove
|
||||
that the guards are never raised.
|
||||
|
||||
|
||||
Features
|
||||
========
|
||||
|
||||
* GC:
|
||||
* Group nodes by SCC
|
||||
* Two kinds of "indices" pointers:
|
||||
* index within the current strongly-connected component
|
||||
* direct pointer to a descendant SCC + index within it
|
||||
* To limit the waste when there is an SCC not containing nodes of a given type,
|
||||
do not use a fixed tuple of vectors of nodes. Instead, use a "list with possible omissions".
|
||||
Problem: the size of the fully-expanded type in TR is O(2ⁿ) for these.
|
||||
Either:
|
||||
* use casts
|
||||
* pay the compile-time price of a huge type (not feasible with large N)
|
||||
* pay the run-time price of bloated components (feasible but costly with large N)
|
||||
* use a case→ function, which returns the empty vector as a fallback. With pure functions, this could be doable.
|
||||
* Problem: generating all the possible functions means O(2^n) templates for closures that may be used at
|
||||
run-time
|
||||
* Otherwise, generate one closure over 1 array, one closure over 2 arrays, one over 3 arrays, etc.
|
||||
and also close over some markers indicating to which "real" arrays these correspond.
|
||||
This means O(2n) space usage worst-case.
|
||||
Problem again: the construction function for such a case→ while minimizing the number of values closed on
|
||||
has a large O(2^n) type too.
|
||||
* typed clojure has typed heterogeneous maps https://github.com/clojure/core.typed/wiki/Types#heterogeneous-maps
|
||||
* In theory it could be possible to give hints to the GC,
|
||||
but Racket's GC does not offer that possibility.
|
||||
|
||||
GC Implementation:
|
||||
* Must be optional
|
||||
* Initially, all nodes within a single component, all indices are of the "slef + index" kind.
|
||||
* Use tarjan's algorithm
|
||||
* group nodes by SCC
|
||||
Update references to nodes in the same SCC, using just an index
|
||||
Update references to nodes in "lower" SCCs, using a pointer to the SCC + an index.
|
||||
* Due to the higher memory cost (+ 1 reference for references to other SCCs),
|
||||
and the poor set of options for implementation (cast, exponential size of type, or large memory usage)
|
||||
we will not implement the GC.
|
||||
|
||||
We argue that in most cases it would not be useful, as compilers will normally work on the whole graph, and
|
||||
not on a subgraph. In the rare case where only part of a graph is needed, applying the graph-level identity
|
||||
transformation to the nodes of interest would effectively copy them and the nodes reachable from these new roots,
|
||||
thereby allowing the original graph to be freed.
|
||||
|
||||
* Invariants
|
||||
* Scope of the invariants:
|
||||
* Input contracts, types and structural properties
|
||||
* Output contracts, types and structural properties
|
||||
* Transformation invariants (relate the input and output graphs)
|
||||
* Time of verification:
|
||||
* Run-time
|
||||
Should be easy enough to implement: Add a define-graph-invariant form, recognize these options, and check
|
||||
them at run-time. Implemented as some sorts of contracts.
|
||||
* Compile-time
|
||||
* Node types
|
||||
* Statically-enforced structural invariants at the level of node types
|
||||
(e.g. disallow cycles in the types, to ensure there are no cycles in the instance)
|
||||
* Macros can be used to generate code which is known to be correct
|
||||
e.g. paths
|
||||
* Concern: static "by construction" guarantees may interfere with each other, if they fill in some nodes,
|
||||
e.g. a "no cycles starting from this node" constraint would not work as expected if a "backwards link"
|
||||
is filled in afterwards. We probably need to hardcode a set of constraints which know about eachother
|
||||
and about the potential interactions.
|
||||
* Conserve well-scopedness within a transition: pass in nodes flagged with a ∀ type, and check
|
||||
that the output contains that flag.
|
||||
* PHOAS
|
||||
* Specification
|
||||
* Invariants specified in the graph type
|
||||
* Transformation invariants specified on the graph creation code
|
||||
* Checks (run-time or the various compile-time mechanisms) are specified in the graph creation code
|
||||
The graph creation code must enforce all invariants within the type
|
||||
That way, it is guaranteed that any instance of the graph type satisfies its invariants, either by
|
||||
construction, or as a guard at the end of the construction.
|
||||
|
||||
* Automatic generation of mappings
|
||||
* when there is no mapping taking an old node as an input, a mapping is automatically generated.
|
||||
The mapping simply translates the old node to the new node type, and recursively transforms its fields,
|
||||
traversing lists etc.
|
||||
* When there is a mapping with more than one argument, then no mapping is auto-generated for that input node type,
|
||||
and instead calls to the mapping must be explicit (i.e. can't return the old node type).
|
||||
* This means that we have a mechanism before the actual core graph macro, which checks and decides which mappings
|
||||
to auto-generate.
|
||||
* We also have a mechanism for auto-calling transformations on input node types
|
||||
* Possibility to escape this, in order to actually insert a reference to the old graph?
|
||||
* Some notation in the type, to indicate that it's "protected" in some way?
|
||||
* Some wrapper indicating that it should merely be unwrapped, not processed? (I think that's better).
|
||||
|
||||
* Structural node equality
|
||||
* Would be nice to coalesce nodes which are equal? (i.e. hash consing)
|
||||
so that any two nodes which are equal? within the same graph have the same index.
|
||||
I suppose this would be rather costly: O(n log n) comparisons, with each comparison potentially
|
||||
costly, in principle. If we cached results, we could achieve a better run-time in practice, and
|
||||
perhaps a better theoretical complexity if we handled cycles ourselves.
|
||||
* The general algorithm when there are no unordered sets is deterministic finite automaton minimization
|
||||
* The general algorithm when there are unordered sets is nondeterministic finite automaton minimization
|
||||
* We could cache all calls to equal? on two nodes for a limited dynamic scope.
|
||||
* If we have this, then we can, for comparisons within the same graph, quickly return #t or #f based on eq?
|
||||
* Alpha-equivalence comparisons
|
||||
(i.e. ignore the actual values of some fields, except for constraining the shape of the graph)
|
||||
Possibility: create values which are unique within the graph, but are ignored when comparing
|
||||
nodes from different graphs.
|
||||
I'm not sure whether plainly erasing the information won't be enough, for a weaker form of alpha-equivalence?
|
||||
* Alpha-equivalence can't easily be implemented atop equal?, dropping it for now.
|
||||
It's not extremely useful anyway, as thanks to the full-blown graph representation, we are likely to
|
||||
discard names early on (keeping them just for printing / debuggin purposes).
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
* Coloring: the existing graph library for Racket has some coloring algorithms:
|
||||
http://docs.racket-lang.org/graph/index.html#%28def._%28%28lib._graph%2Fmain..rkt%29._coloring%2Fgreedy%29%29
|
||||
Maybe we can build a wrapper for those?
|
99
features-and-implementation3
Normal file
99
features-and-implementation3
Normal file
|
@ -0,0 +1,99 @@
|
|||
* Invariants
|
||||
* Scope of the invariants:
|
||||
* Input contracts, types and structural properties
|
||||
* Output contracts, types and structural properties
|
||||
* Transformation invariants (relate the input and output graphs)
|
||||
* Time of verification:
|
||||
* Run-time
|
||||
Should be easy enough to implement: Add a define-graph-invariant form, recognize these options, and check
|
||||
them at run-time. Implemented as some sorts of contracts.
|
||||
* Compile-time
|
||||
* Node types
|
||||
* Statically-enforced structural invariants at the level of node types
|
||||
(e.g. disallow cycles in the types, to ensure there are no cycles in the instance)
|
||||
* Macros can be used to generate code which is known to be correct
|
||||
e.g. paths
|
||||
* Concern: static "by construction" guarantees may interfere with each other, if they fill in some nodes,
|
||||
e.g. a "no cycles starting from this node" constraint would not work as expected if a "backwards link"
|
||||
is filled in afterwards. We probably need to hardcode a set of constraints which know about eachother
|
||||
and about the potential interactions.
|
||||
* Conserve well-scopedness within a transition: pass in nodes flagged with a ∀ type, and check
|
||||
that the output contains that flag. Potentially out-of-scope fields in the input do not have the flag.
|
||||
* PHOAS
|
||||
* Specification
|
||||
* Invariants specified in the graph type
|
||||
* Transformation invariants specified on the graph creation code
|
||||
* Checks (run-time or the various compile-time mechanisms) are specified in the graph creation code
|
||||
The graph creation code must enforce all invariants within the type
|
||||
That way, it is guaranteed that any instance of the graph type satisfies its invariants, either by
|
||||
construction, or as a guard at the end of the construction.
|
||||
|
||||
; The body should produce a function of type (→ (Listof Nodeᵢ) … Boolean)
|
||||
; The body should also return an indication about which invariants it ensures. Maybe use {~seq #:ensures (invariant-name inariant-arg …)} …
|
||||
however this does not allow the body to perform some minimal amount of rewriting on the "ensures" options.
|
||||
(define-syntax/parse (define-graph-contract (name g-descriptor arg …) . body)
|
||||
#'(define-syntax name
|
||||
(graph-contract
|
||||
(λ (g-descriptor arg …) . body))))
|
||||
|
||||
;; TODO: find a way to translate this to a type, with subtyping for weaker invariants.
|
||||
;; The type only serves to testify that the invariant was statically enforced or dynamically checked at construction, it does not actually encode the property using the type system.
|
||||
(struct invariants-wrapper ()) ;; Use a private struct to prevent forging of the invariants aggregated in a case→ (since it is never executed, any non-terminating λ could otherwise be supplied).
|
||||
(define-for-syntax invariant-introducer
|
||||
(make-syntax-introducer))
|
||||
;; the body should return the syntax for a type, such that less precise invariants are supertypes of that type.
|
||||
(define-syntax/parse (define-graph-invariant (name g-descriptor arg …) . body)
|
||||
#'(define-syntax name
|
||||
(graph-invariant
|
||||
(λ (g-descriptor arg …) . body))))
|
||||
|
||||
The type of a graph node with the invariants inv₁ … invₙ on the graph must include an extra dummy field with type
|
||||
(invariants-wrapper
|
||||
(case (→ inv₁ inv-arg … #t)
|
||||
…
|
||||
(→ invₙ inv-arg … #t)))
|
||||
|
||||
The invariant arguments can be symbols, to indicate node types or field names.
|
||||
|
||||
Concern: invariants on a graph may not have the same semantics if the graph has more or fewer nodes?
|
||||
|
||||
* Automatic generation of mappings
|
||||
* when there is no mapping taking an old node as an input, a mapping is automatically generated.
|
||||
The mapping simply translates the old node to the new node type, and recursively transforms its fields,
|
||||
traversing lists etc.
|
||||
* When there is a mapping with more than one argument, then no mapping is auto-generated for that input node type,
|
||||
and instead calls to the mapping must be explicit (i.e. can't return the old node type).
|
||||
* This means that we have a mechanism before the actual core graph macro, which checks and decides which mappings
|
||||
to auto-generate.
|
||||
* We also have a mechanism for auto-calling transformations on input node types
|
||||
* Possibility to escape this, in order to actually insert a reference to the old graph?
|
||||
* Some notation in the type, to indicate that the output should be an "old" node here
|
||||
* In the rare case where we have an (U (Old nd) nd), in the mapping use some (old v) wrapper
|
||||
indicating that the returned it should merely be unwrapped, not processed.
|
||||
|
||||
(define-syntax define-graph-type
|
||||
(syntax-parser
|
||||
[(_ _name [_nodeᵢ [_fieldᵢⱼ :colon _τᵢⱼ] …] …)
|
||||
;; save the graph type metadata
|
||||
]))
|
||||
|
||||
(define-syntax define-graph-transformer
|
||||
(syntax-parser
|
||||
[(_ _name graph-type)
|
||||
;; TODO
|
||||
]))
|
||||
|
||||
* Structural node equality
|
||||
* Would be nice to coalesce nodes which are equal? (i.e. hash consing)
|
||||
so that any two nodes which are equal? within the same graph have the same index.
|
||||
I suppose this would be rather costly: O(n log n) comparisons, with each comparison potentially
|
||||
costly, in principle. If we cached results, we could achieve a better run-time in practice, and
|
||||
perhaps a better theoretical complexity if we handled cycles ourselves.
|
||||
* The general algorithm when there are no unordered sets is deterministic finite automaton minimization
|
||||
* The general algorithm when there are unordered sets is nondeterministic finite automaton minimization
|
||||
* We could cache all calls to equal? on two nodes for a limited dynamic scope.
|
||||
* If we have this, then we can, for comparisons within the same graph, quickly return #t or #f based on eq?
|
||||
|
||||
* Coloring: the existing graph library for Racket has some coloring algorithms:
|
||||
http://docs.racket-lang.org/graph/index.html#%28def._%28%28lib._graph%2Fmain..rkt%29._coloring%2Fgreedy%29%29
|
||||
Maybe we can build a wrapper for those?
|
Loading…
Reference in New Issue
Block a user