SSAX Package ============ A SSAX functional XML parsing framework consists of a DOM/SXML parser, a SAX parser, and a supporting library of lexing and parsing procedures. The procedures in the package can be used separately to tokenize or parse various pieces of XML documents. The framework supports XML Namespaces, character, internal and external parsed entities, attribute value normalization, processing instructions and CDATA sections. The package includes a semi-validating SXML parser: a DOM-mode parser that is an instantiation of a SAX parser (called SSAX). SSAX is a full-featured, algorithmically optimal, pure-functional parser, which can act as a stream processor. SSAX is an efficient SAX parser that is easy to use. SSAX minimizes the amount of application-specific state that has to be shared among user-supplied event handlers. SSAX makes the maintenance of an application-specific element stack unnecessary, which eliminates several classes of common bugs. SSAX is written in a pure-functional subset of Scheme. Therefore, the event handlers are referentially transparent, which makes them easier for a programmer to write and to reason about. The more expressive, reliable and easier to use application interface for the event-driven XML parsing is the outcome of implementing the parsing engine as an enhanced tree fold combinator, which fully captures the control pattern of the depth-first tree traversal. ------------------------------------------------- Quick start ; procedure: ssax:xml->sxml PORT NAMESPACE-PREFIX-ASSIG ; ; This is an instance of a SSAX parser that returns an SXML ; representation of the XML document to be read from PORT. ; NAMESPACE-PREFIX-ASSIG is a list of (USER-PREFIX . URI-STRING) ; that assigns USER-PREFIXes to certain namespaces identified by ; particular URI-STRINGs. It may be an empty list. ; The procedure returns an SXML tree. The port points out to the ; first character after the root element. (define (ssax:xml->sxml port namespace-prefix-assig) ...) ; procedure: pre-post-order TREE BINDINGS ; ; Traversal of an SXML tree or a grove: ; a or a ; ; A and a are mutually-recursive datatypes that ; underlie the SXML tree: ; ::= (name . ) | "text string" ; An (ordered) set of nodes is just a list of the constituent nodes: ; ::= ( ...) ; Nodelists, and Nodes other than text strings are both lists. A ; however is either an empty list, or a list whose head is ; not a symbol (an atom in general). A symbol at the head of a node is ; either an XML name (in which case it's a tag of an XML element), or ; an administrative name such as '@'. ; See SXPath.scm and SSAX.scm for more information on SXML. ; ; ; Pre-Post-order traversal of a tree and creation of a new tree: ; pre-post-order:: x -> ; where ; ::= ( ...) ; ::= ( *preorder* . ) | ; ( *macro* . ) | ; ( . ) | ; ( . ) ; ::= XMLname | *text* | *default* ; :: x [] -> ; ; The pre-post-order function visits the nodes and nodelists ; pre-post-order (depth-first). For each of the form (name ; ...) it looks up an association with the given 'name' among ; its . If failed, pre-post-order tries to locate a ; *default* binding. It's an error if the latter attempt fails as ; well. Having found a binding, the pre-post-order function first ; checks to see if the binding is of the form ; ( *preorder* . ) ; If it is, the handler is 'applied' to the current node. Otherwise, ; the pre-post-order function first calls itself recursively for each ; child of the current node, with prepended to the ; in effect. The result of these calls is passed to the ; (along with the head of the current ). To be more ; precise, the handler is _applied_ to the head of the current node ; and its processed children. The result of the handler, which should ; also be a , replaces the current . If the current ; is a text string or other atom, a special binding with a symbol ; *text* is looked up. ; ; A binding can also be of a form ; ( *macro* . ) ; This is equivalent to *preorder* described above. However, the result ; is re-processed again, with the current stylesheet. ; (define (pre-post-order tree bindings) ...) ------------------------------------------------- Additional tools included into the package 1. "access-remote.ss" Uniform access to local and remote resources Resolution for relative URIs in accordance with RFC 2396 2. "id.ss" Creation and manipulation of the ID-index for a faster access to SXML elements by their unique ID Provides the DTD parser for extracting ID attribute declarations 3. "xlink-parser.ss" Parser for XML documents that contain XLink elements 4. "multi-parser.ss" SSAX multi parser: combines several specialized parsers into one Provides creation of parent pointers to SXML document constructed