scribble-enhanced/graph-lib/graph/graph2.lp2.rkt_


#lang debug scribble/lp2
@(require "../lib/doc.rkt")
@doc-lib-setup
@title[#:style manual-doc-style]{Graph library}
@(table-of-contents)
@; TODO: allow a mapping to return a new placeholder, in order to act as a
@; redirect. All references to the old placeholder will act as if they were to
@; the new placeholder.
@section{Introduction}
This module provides a @tc[graph] macro which helps construct immutable
graphs (using lambdas to defer potentially cyclic references).
@subsection{Example usage}
We will start with a running example, which will help us both show the macro's
syntax, and show some of the key advantages offered by this graph library.
@subsection{The graph's type}
Each node type in the graph is a variant's constructor, tagged with the node
name. For example, a graph representing a city and its inhabitants could use
these variants:
@chunk[<example-variants>
[City [streets : (Listof Street)] [people : (Listof Person)] <m-city>]
[Street [houses : (Listof House)] <m-street>]
[House [owner : Person] [location : Street] <m-house>]
[Person [name : String] <m-person>]]
Notice the cycle in the type: a street contains houses, which are located on the
same street.
@subsubsection{A seed from which to unravel the graph: the root parameters}
In order to build a graph with that type, we start from the root parameters.
Here, we will take a representation of the city as a list of
@tc[(person-name . street)] pairs, and will convert it to a more convenient
graph representation. Our single root parameter will thus be the whole list:
@chunk[<example-root>
'(["Amy" . "Ada street"]
["Jack" . "J street"]
["Anabella" . "Ada street"])]
We then provide a mapping from the root parameter to the root node, in our case
@tc[City]. When processing the root parameter, one can call mappings that will
create other nodes.
@subsubsection{Mapping the root parameters to the root node}
Here is the root mapping for our example. It maps over the list @tc[c] of names
and street names, and calls the @tc[m-street] and @tc[m-person] mappings for
each element.
@; Would be nicer with (map (∘ (curry street c) my-car) c)), but that doesn't
@; typecheck (yet).
@chunk[<m-city>
[(m-city [c : (Listof (Pairof String String))]) : City
(City (remove-duplicates (map (curry m-street c) (cdrs c)))
(remove-duplicates (map m-person (cars c))))]]
@subsubsection{More mappings}
Next, we write the @tc[m-street] mapping, which takes a street name and the
whole city @tc[c] in list form, and creates a @tc[Street] node.
@chunk[<m-street>
[(m-street [c : (Listof (Pairof String String))] [s : String]) : Street
(Street (map (curry (curry m-house s) c)
(cars (filter (λ ([x : (Pairof String String)])
(equal? (cdr x) s))
c))))]]
The @tc[m-house] mapping calls back the @tc[m-street] mapping, to store for each
house a reference to the containing street. Normally, this would cause infinite
recursion in an eager language, like @tc[typed/racket]. However, the mappings
aren't called directly, and instead the @tc[m-street] function here returns a
placeholder. This allows us to not worry about mutually recursive mappings: a
mapping can be called any number of times with the same data, but it will
actually only be run once.
The @tc[make-graph-constructor] macro will post-process the result of each
mapping, and replace the placeholders with promises for the result of the
mapping. The promises are not available during graph construction, so there is
no risk of forcing one before it is available.
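The role played by the promises can be illustrated independently of the macro.
The following sketch, in plain untyped Racket rather than @tc[typed/racket],
builds by hand a street whose house points back at it. The @tc[street] and
@tc[house] structs are hypothetical stand-ins for the generated node types, not
what @tc[make-graph-constructor] actually produces.

```racket
#lang racket
;; Hypothetical stand-ins for the generated node types.
(struct street (name houses))
(struct house (owner location)) ; location holds a promise for a street

;; `delay` wraps the reference to `s` in a thunk, so `s` is only looked up
;; when the promise is forced, after `letrec` has finished binding it.
(define ada
  (letrec ([s (street "Ada street"
                      (list (house "Amy" (delay s))))])
    s))

(define amys-house (first (street-houses ada)))

;; Forcing the promise yields the very same street object.
(define same-street?
  (eq? (force (house-location amys-house)) ada))
```

Because the back-reference is only resolved when it is forced, the cyclic
structure can be built without mutation.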
Finally, we write the @tc[m-house] mapping.
@chunk[<m-house>
[(m-house [s : String]
[c : (Listof (Pairof String String))]
[p : String])
: House
(House (m-person p) (m-street c s))]]
@chunk[<m-person>
[(m-person [p : String]) : Person
(Person p)]]
@identity{
Notice how the @tc[m-person] mapping above directly calls the @tc[Person]
constructor. The @tc[m-city] mapping could call that constructor directly too,
instead of going through @tc[m-person]: since @tc[Person] does not contain
references to @tc[House], @tc[Street] or @tc[City], we do not need to delay the
creation of these nodes by calling yet another mapping.
@; TODO: above: Should we merge two identical instances of Person? They won't
@; necessarily be eq? if they contain cycles deeper in their structure, anyway.
@; And we are already merging all equal? placeholders, so there shouldn't be
@; any blowup in the number of nodes.
@; It would probably be better for graph-map etc. to have all the nodes in the
@; database, though.
The number and names of mappings do not necessarily reflect the graph's type.
Here, we have no mapping named @tc[m-person], because that node is always
created directly. Conversely, we could have two mappings, @tc[m-big-street] and
@tc[m-small-street], with different behaviours, instead of passing an extra
boolean argument to @tc[m-street].
@; TODO: make the two street mappings
}
@subsubsection{Making a constructor for the graph}
@identity{
@chunk[<make-constructor-example>
(make-graph-constructor (<example-root>)
(<example-variants>))]
@subsubsection{Creating a graph instance}
@chunk[<use-example>
(define g <make-constructor-example>)]
}
@subsection{More details on the semantics}
Let's take a second look at the root mapping:
@chunk[<m-city-2>
[(m-city [c : (Listof (Pairof String String))]) : City
(City (remove-duplicates (map (curry m-street c) (cdrs c)))
(remove-duplicates (map Person (cars c))))]]
The first case shows that we can use @tc[m-street] as any other function,
passing it to @tc[curry], and calling @tc[remove-duplicates] on the results.
Note that each placeholder returned by @tc[m-street] will contain all
information passed to it, here a street name and @tc[c]. Two placeholders for
@tc[m-street] will therefore be @tc[equal?] if and only if all the arguments
passed to @tc[m-street] are @tc[equal?]. The placeholders also include a symbol
specifying which mapping was called, so two placeholders for two different
mappings will not be @tc[equal?], even if identical parameters were supplied.
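The equality rule for placeholders can be checked with a small sketch in plain
Racket. The two constructors below are hypothetical: they only mimic the shape
described above (a tag naming the mapping, followed by its arguments), and
@tc[m-other] is an invented second mapping used for comparison.

```racket
#lang racket
;; Hypothetical placeholder constructors: a tag naming the mapping,
;; followed by the arguments the mapping was called with.
(define (m-street/make-placeholder c s) (list 'm-street c s))
(define (m-other/make-placeholder c s) (list 'm-other c s))

(define c '(("Amy" . "Ada street")
            ("Jack" . "J street")
            ("Anabella" . "Ada street")))

;; One placeholder per pair, built from the street name found in the cdr.
(define placeholders
  (map (curry m-street/make-placeholder c) (map cdr c)))

;; The two "Ada street" placeholders are equal?, so one of them is dropped.
(define unique (remove-duplicates placeholders))
(length unique) ; 2

;; Same arguments, but a different mapping tag: not equal?.
(define same-across-mappings?
  (equal? (m-street/make-placeholder c "J street")
          (m-other/make-placeholder c "J street")))
same-across-mappings? ; #f
```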
@identity{
The second case shows that we can also directly call the constructor for the
@tc[Person] node type. If that type contains references to other nodes, the
constructor here will actually accept either a placeholder, or an actual
instance, which itself may contain placeholders.
The node type allowing placeholders is derived from the ideal type given above.
Here, the type for @tc[Person] is @tc[[String]], so there are no substitutions
to make. On the contrary, the type for @tc[City], originally expressed as
@tc[[(Listof Street) (Listof Person)]], will be rewritten into
@tc[[(Listof (U Street Street-Placeholder))
(Listof (U Person Person-Placeholder))]].
}
The @tc[rewrite-type] module we use to derive types with placeholders from the
original ones only handles a handful of the types offered by @tc[typed/racket].
In particular, it does not handle recursive types described with @tc[Rec] yet.
@section{Implementation}
In this section, we will describe how the @tc[make-graph-constructor] macro is
implemented.
@subsection{The macro's syntax}
We use a simple syntax for @tc[make-graph-constructor], and make it more
flexible through wrapper macros.
@chunk[<signature>
(make-graph-constructor
(root-expr:expr ...)
([node <field-signature> … <mapping-declaration>] …))]
Where @tc[<field-signature>] is:
@chunk[<field-signature>
[field-name:id (~literal :) field-type:expr]]
And @tc[<mapping-declaration>] is:
@chunk[<mapping-declaration>
((mapping:id [param:id (~literal :) param-type:expr] …)
(~literal :) result-type:id
. mapping-body)]
@subsection{The different types of a node and mapping}
A single node name can refer to several types:
@itemlist[
@item{The @emph{ideal} type, expressed by the user, for example
@racket[[City (Listof Street) (Listof Person)]]. It is never used as-is in
practice.}
@item{The @emph{placeholder} type, type and constructor, which just store the
arguments for the mapping along with a tag indicating the node name}
@item{The @emph{incomplete} type, in which references to other node types are
allowed to be either actual (@racket[incomplete]) instances, or placeholders.
For example, @racket[[City (Listof (U Street Street/placeholder-type))
(Listof (U Person Person/placeholder-type))]].}
@item{The @emph{with-indices} type, in which references to other node types
must be replaced by an index into the results list for the target node's
@racket[with-promises] type. For example,
@racket[[City (Listof (Pairof 'Street/with-indices-tag Index))
(Listof (Pairof 'Person/with-indices-tag Index))]].}
@item{The @emph{with-promises} type, in which references to other node types
must be replaced by a @racket[Promise] for the target node's
@racket[with-promises] type. For example,
@racket[[City (Listof (Promise Street/with-promises-type))
(Listof (Promise Person/with-promises-type))]].}
@item{The @emph{mapping function}, which takes some parameters and
returns a node (this is the code directly provided by the user)}]
We derive identifiers for these based on the @tc[node] or @tc[mapping] name:
@;;;;
@chunk[<define-ids2>
(define-temp-ids "~a/make-placeholder" (mapping …) #:first-base root)
(define-temp-ids "~a/placeholder-type" (mapping …))
(define-temp-ids "~a/make-incomplete" (node …))
(define-temp-ids "~a/incomplete-type" (node …))
(define-temp-ids "~a/make-with-indices" (node …))
(define-temp-ids "~a/with-indices-type" (node …))
(define-temp-ids "~a/make-with-promises" (node …))
(define-temp-ids "~a/with-promises-type" (node …))
(define-temp-ids "~a/function" (mapping …))]
@chunk[<define-ids2>
(define/with-syntax (root/make-placeholder . _)
#'(mapping/make-placeholder …))]
@subsection{Overview}
The macro relies heavily on two sidekick modules: @tc[rewrite-type], and
@tc[fold-queue]. The former will allow us to derive from the ideal type of a
node the incomplete type and the with-promises type. It will also allow us to
search in instances of incomplete nodes, in order to extract the placeholders,
and replace these parts with promises. The latter, @tc[fold-queue], will be used
to process all the pending placeholders, with the possibility to enqueue new
ones as these placeholders are discovered inside incomplete nodes.
When the graph constructor is called with the arguments for the root parameters,
it is equivalent to making and then resolving an initial placeholder. We will use a
function from the @tc[fold-queue] library to process the queues of pending
placeholders, starting with a queue containing only that root placeholder.
We will have one queue for each placeholder type.@note{If we had only one queue,
we would have only one collection of results, and would need a @racket[cast]
when extracting nodes from the collection of results.} The
queues' element types will therefore be these placeholder types.
@chunk[<fold-queue-type-element>
mapping/placeholder-type]
The return type for each queue will be the corresponding with-promises type. The
fold-queues function will therefore return a vector of with-promises nodes.
@chunk[<fold-queue-type-result>
<with-promises-type>]
@; Problem: how do we ensure we return the right type for the root?
@; How do we avoid casts when doing look-ups?
@; We need several queues, handled in parallel, with distinct element types.
@; * Several result aggregators, one for each type, so we don't have to cast
@; * Several queues, so that we can make sure the root node is of the expected
@; type.
@; TODO: clarity.
@; The @tc[fold-queues] function allows us to associate each element with a tag,
@; so that, inside the processing function and outside, we can refer to an
@; element using this tag, which can be more lightweight than keeping a copy of
@; the element.
@;
@; We will tag our elements with an @tc[Index], which prevents memory leakage:
@; if we kept references to the original data added to the queue, a graph's
@; representation would hold references to its input, which is not the case when
@; using simple integers to refer to other nodes, instead of using the input for
@; these nodes. Also, it makes lookups in the database much faster, as we will
@; be able to use an array instead of a hash table.
@subsection{The queues of placeholders}
The @tc[fold-queues] macro takes a root element, in our case the root placeholder,
which it will insert into the first queue. The next clauses are the queue
handlers, which look like function definitions of the form
@tc[(queue-name [element : element-type] Δ-queues enqueue)]. The @tc[enqueue]
argument is a function used to enqueue elements and get a tag in return, which
can later be used to retrieve the processed element.
Since the @tc[enqueue] function is pure, it takes a parameter of the same type
as @tc[Δ-queues] representing the already-enqueued elements, and returns a
modified copy, in addition to the tag. The queue's processing body should return
the latest @tc[Δ-queues] in order to have these elements added to the queue.
@chunk[<fold-queue>
(fold-queues <root-placeholder>
[(mapping/placeholder-tag [e : <fold-queue-type-element>]
Δ-queues
enqueue)
: <fold-queue-type-result>
<fold-queue-body>]
...)]
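The idea of a pure @tc[enqueue] that returns a tag together with an updated
copy of the pending work can be sketched with a much-simplified, single-queue
analogue in plain Racket. This is not the @tc[fold-queues] API: the
@tc[state] struct, @tc[run-queue], and the toy @tc[edges] graph are all
assumptions made up for the illustration.

```racket
#lang racket
;; `state` is a hypothetical stand-in for Δ-queues.
;; tags: immutable hash from element to index; pending: FIFO list.
(struct state (tags pending))

;; Pure enqueue: returns a tag and an updated copy of the state.
(define (enqueue elt st)
  (define known (hash-ref (state-tags st) elt #f))
  (if known
      (values known st) ; already enqueued: reuse its tag
      (let ([tag (hash-count (state-tags st))])
        (values tag
                (state (hash-set (state-tags st) elt tag)
                       (append (state-pending st) (list elt)))))))

;; Processes every element exactly once. `process` receives an element and
;; the current state, and returns a result plus the latest state.
(define (run-queue root process)
  (define-values (root-tag st0) (enqueue root (state (hash) '())))
  (let loop ([st st0] [results '()])
    (match (state-pending st)
      ['() (list->vector (reverse results))]
      [(cons elt rest)
       (define-values (result st*)
         (process elt (state (state-tags st) rest)))
       (loop st* (cons result results))])))

;; Toy cyclic graph: a -> b, c ; b -> a ; c has no successors.
(define edges (hash 'a '(b c) 'b '(a) 'c '()))

;; Replaces each successor with its tag, threading the state through.
(define (process elt st)
  (define-values (refs st*)
    (for/fold ([refs '()] [st st])
              ([succ (in-list (hash-ref edges elt))])
      (define-values (tag new-st) (enqueue succ st))
      (values (cons tag refs) new-st)))
  (values (cons elt (reverse refs)) st*))

(define results (run-queue 'a process))
results ; '#((a 1 2) (b 0) (c))
```

Since tags are handed out in enqueue order and the queue is processed in the
same order, a tag can be used directly as an index into the vector of results.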
@subsection{Making placeholders for mappings}
We start by creating the root placeholder, which we provide to @tc[fold-queues].
@chunk[<root-placeholder>
(root/make-placeholder root-expr ...)]
To make the placeholder, we will need a @tc[make-placeholder] function for each
@tc[mapping]. We define the type of each placeholder (a list of arguments,
tagged with the @tc[mapping]'s name), and a constructor:
@; TODO: just use (variant [mapping param-type ...] ...)
@chunk[<define-mapping-placeholder>
(define-type mapping/placeholder-type (List 'mapping/placeholder-tag
param-type ...))
(: mapping/make-placeholder (→ param-type ... mapping/placeholder-type))
(define (mapping/make-placeholder [param : param-type] ...)
(list 'mapping/placeholder-tag param ...))]
The code above needs some identifiers derived from @tc[mapping] names:
@chunk[<define-ids>
(define-temp-ids "~a/make-placeholder" (mapping ...))
(define-temp-ids "~a/placeholder-type" (mapping ...))
(define-temp-ids "~a/placeholder-tag" (mapping ...))
(define/with-syntax (root/make-placeholder . _)
#'(mapping/make-placeholder ...))]
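To make the template above more concrete, here is what it could look like when
written out by hand for the @tc[m-street] mapping of the running example. This
is only a sketch: the real identifiers are generated by @tc[define-temp-ids],
so the names used below are illustrative assumptions.

```racket
#lang typed/racket
;; Hand-written instance of the placeholder template for m-street.
;; The identifier names are assumptions; the macro generates its own.
(define-type m-street/placeholder-type
  (List 'm-street/placeholder-tag
        (Listof (Pairof String String))
        String))

(: m-street/make-placeholder
   (-> (Listof (Pairof String String)) String m-street/placeholder-type))
(define (m-street/make-placeholder c s)
  ;; The placeholder only records the arguments; the mapping body is not run.
  (list 'm-street/placeholder-tag c s))

(define p
  (m-street/make-placeholder '(("Amy" . "Ada street")) "Ada street"))
(car p) ; 'm-street/placeholder-tag
```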
@subsection{Making with-promises nodes}
We derive the @tc[with-promises] type from each @emph{ideal} node type using
the @tc[tmpl-replace-in-type] template metafunction from the rewrite-type
library. We replace all occurrences of a @tc[node] name with a @tc[Promise] for
that node's @tc[with-promises] type.
@; TODO: use a type-expander here, instead of a template metafunction.
@CHUNK[<define-with-promises-nodes>
(define-type field/with-promises-type
(tmpl-replace-in-type field-type
[node (Promise node/with-promises-type)]
…))
(define-type node/with-promises-type (List 'with-promises
'node
field/with-promises-type …))
(: node/make-with-promises (→ field/with-promises-type …
node/with-promises-type))
(define (node/make-with-promises field-name …)
(list 'with-promises 'node field-name …))]
The code above needs some identifiers derived from @tc[node] and
@tc[field-name]s:
@chunk[<define-ids>
(define-temp-ids "~a/make-with-promises" (node ...))
(define-temp-ids "~a/with-promises-type" (node ...))
(define/with-syntax ((field/with-promises-type …) …)
(stx-map generate-temporaries #'((field-name …) …)))]
@subsection{Making incomplete nodes}
We derive the @tc[incomplete] type from each @emph{ideal} node type using
the @tc[tmpl-replace-in-type] template metafunction from the rewrite-type
library. We replace all occurrences of a @tc[node] name with a union of the
node's @tc[incomplete] type, and all compatible @tc[placeholder] types.
TODO: for now we allow all possible mappings, but we should only allow those
whose return type is the desired node type.
@; TODO: use a type-expander here, instead of a template metafunction.
@CHUNK[<define-incomplete-nodes>
(define-type field/incomplete-type <field/incomplete-type>)
(define-type node/incomplete-type
(Pairof 'node/incomplete-tag (List field/incomplete-type …)))
(: node/make-incomplete (→ field/incomplete-type … node/incomplete-type))
(define (node/make-incomplete field-name …)
(list 'node/incomplete-tag field-name …))]
Since the incomplete type for fields will appear in two different places, above
and in the incomplete-to-with-promises conversion routine below, we write it in
a separate chunk:
@chunk[<field/incomplete-type>
(tmpl-replace-in-type field-type
[node (U node/incomplete-type
node/compatible-placeholder-types …)]
…)]
@identity{
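We must however compute for each node the set of compatible placeholder types.
We do that by collecting, for each node, the placeholder types of the mappings
whose result type is that node: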
@chunk[<define-compatible-placeholder-types>
(define/with-syntax ((node/compatible-placeholder-types ...) ...)
(for/list ([x (in-syntax #'(node ...))])
(multiassoc-syntax
x
#'([result-type . mapping/placeholder-type]
…))))]
The multiassoc-syntax function used above filters the associative syntax list
and returns the @tc[stx-cdr] of the matching elements, therefore returning a
list of @tc[mapping/placeholder-type]s for which the @tc[result-type] is the
given @tc[node] name.
@chunk[<multiassoc-syntax>
(define (multiassoc-syntax query alist)
(map stx-cdr
(filter (λ (xy) (free-identifier=? query (stx-car xy)))
(syntax->list alist))))
(define (cdr-assoc-syntax query alist)
(stx-cdr (findf (λ (xy) (free-identifier=? query (stx-car xy)))
(syntax->list alist))))
(define-template-metafunction (tmpl-cdr-assoc-syntax stx)
(syntax-parse stx
[(_ query [k . v] …)
(cdr-assoc-syntax #'query #'([k . v] …))]))]
The code above also needs some identifiers derived from @tc[node] and
@tc[field-name]s:
@chunk[<define-ids>
(define-temp-ids "~a/make-incomplete" (node …))
(define-temp-ids "~a/incomplete-type" (node …))
(define-temp-ids "~a/incomplete-tag" (node …))
(define-temp-ids "~a/incomplete-fields" (node …))
(define/with-syntax ((field/incomplete-type …) …)
(stx-map-nested #'((field-name …) …)))]
}
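The filtering performed by @tc[multiassoc-syntax] can be tried outside the
macro, at phase 0. The sketch below reuses the definition from the
@tc[<multiassoc-syntax>] chunk; the association list, including the
@tc[m-big-street] and @tc[m-small-street] names, is a hypothetical input chosen
for the illustration.

```racket
#lang racket
(require syntax/stx)

;; Same definition as in the <multiassoc-syntax> chunk.
(define (multiassoc-syntax query alist)
  (map stx-cdr
       (filter (λ (xy) (free-identifier=? query (stx-car xy)))
               (syntax->list alist))))

;; Hypothetical association list from result types to mapping names.
(define alist #'([City . m-city]
                 [Street . m-big-street]
                 [House . m-house]
                 [Street . m-small-street]))

;; Keeps the cdr of every entry whose key is the identifier Street.
(define street-mappings
  (map syntax->datum (multiassoc-syntax #'Street alist)))
street-mappings ; '(m-big-street m-small-street)
```

Entries are returned in the order in which they appear in the list, and a key
with no entry simply yields an empty list.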
@subsection{Converting incomplete nodes to with-promises ones}
@chunk[<convert-incomplete-to-with-promises>
[node/incomplete-type
node/with-promises-type
(λ (x) (and (pair? x) (eq? (car x) 'node/incomplete-tag)))
(λ ([x : node/incomplete-type] [acc : Void])
<convert-incomplete-successor>)]]
@chunk[<convert-placeholder-to-with-promises>
[mapping/placeholder-type
(tmpl-replace-in-type result-type [node node/with-promises-type] …)
(λ (x) (and (pair? x)
(eq? (car x) 'mapping/placeholder-tag)))
(λ ([x : mapping/placeholder-type] [acc : Void])
<convert-placeholder-successor>)]]
@; TODO: this would be much simpler if we forced having only one mapping per
@; node, and extended that with a macro.
@chunk[<define-compatible-mappings>
(define/with-syntax ((node/compatible-mappings ...) ...)
(for/list ([x (in-syntax #'(node ...))])
(multiassoc-syntax
x
#'([result-type . mapping]
…))))]
@chunk[<convert-incomplete-successor>
(error (~a "Not implemented yet " x))]
@chunk[<convert-placeholder-successor>
(% index new-Δ-queues = (enqueue 'mapping/placeholder-tag x Δ-queues)
(list 'mapping/placeholder-tag index)
(error (~a "Not implemented yet " x)))]
@subsection{Processing the placeholders}
@; TODO: also allow returning a placeholder (which means we should then
@; process that placeholder in turn). The placeholder should return the
@; same node type, but can use a different mapping?
@; Or maybe we can do this from the outside, using a wrapper macro?
@CHUNK[<fold-queue-body>
(let ([mapping-result (apply mapping/function (cdr e))])
(tmpl-fold-instance <the-incomplete-type>
Void
<convert-incomplete-to-with-promises> …
<convert-placeholder-to-with-promises> …))
'todo!]
@chunk[<the-incomplete-type>
(tmpl-cdr-assoc-syntax result-type
[node . (List <field/incomplete-type> …)]
…)]
@section{The mapping functions}
We define the mapping functions as they are described by the user, with an
important change: Instead of returning an @emph{ideal} node type, we expect them
to return an incomplete node type.
@chunk[<define-mapping-function>
(define-type mapping/incomplete-result-type
(tmpl-replace-in-type result-type
[node (List 'node/incomplete-tag
<field/incomplete-type> …)]
…))
(: mapping/function (→ param-type … mapping/incomplete-result-type))
(define mapping/function
(let ([mapping mapping/make-placeholder]
[node node/make-incomplete]
…)
(λ (param …)
. mapping-body)))]
@chunk[<define-ids>
(define-temp-ids "~a/function" (mapping ...))
(define-temp-ids "~a/incomplete-result-type" (mapping ...))]
@section{Temporary fillers}
@chunk[<with-promises-type>
Any]
@section{Putting it all together}
@chunk[<make-graph-constructor>
(define-syntax/parse <signature>
<define-ids>
(let ()
<define-ids2>
<define-compatible-placeholder-types>
((λ (x) (pretty-write (syntax->datum x)) x)
(template
(let ()
(begin <define-mapping-placeholder>) …
(begin <define-with-promises-nodes>) …
(begin <define-incomplete-nodes>) …
(begin <define-mapping-function>) …
<fold-queue>)))))]
@section{Conclusion}
@chunk[<module-main>
(module main typed/racket
(require (for-syntax syntax/parse
racket/syntax
syntax/stx
syntax/parse/experimental/template
racket/sequence
racket/pretty; DEBUG
alexis/util/threading; DEBUG
"rewrite-type.lp2.rkt"
"../lib/low-untyped.rkt")
alexis/util/threading; DEBUG
"fold-queues.lp2.rkt"
"rewrite-type.lp2.rkt"
"../lib/low.rkt")
(begin-for-syntax
<multiassoc-syntax>)
(provide make-graph-constructor)
<make-graph-constructor>)]
@chunk[<module-test>
(module* test typed/racket
(require (submod "..")
"fold-queues.lp2.rkt"; DEBUG
"rewrite-type.lp2.rkt"; DEBUG
"../lib/low.rkt"; DEBUG
typed/rackunit)
<use-example>
g)]
@chunk[<*>
(begin
<module-main>
(require 'main)
(provide (all-from-out 'main))
<module-test>)]