Reverse kathy's changes to doc.txt
svn: r5732
parent
58bc45a979
commit
f5deb3f60d
@@ -1,6 +1,7 @@
 _parser-tools_

-This documentation provides directions on using the lexer, Yacc-style parser generator and combinator parser library. It assumes familiarity with lex and yacc style lexer and parser generators and with combinator parsers.
+This documentation assumes familiarity with lex and yacc style lexer
+and parser generators.

 _lex.ss_
 A _regular expression_ is one of the following:
@@ -425,131 +426,4 @@ the original grammar have nested blocks the tool will fail.
 Annotated examples are in the examples subdirectory of the parser-tools
 collection directory.

-_combinator-unit.ss_
-This library provides a unit implementing four higher-order functions
-that can be used to build a combinator parser, and the export and
-import signatures related to it. The functions contained in this unit
-automatically build error reporting mechanisms in the event that no parse
-is found. Unlike other combinator parsers, this system assumes that the
-input is already lexed into tokens using _lex.ss_. This library relies on
-_(lib "lazy.ss" "lazy")_.
-
-The unit _combinator-parser-tools_ exports the signature
-_combinator-parser^_ and imports the signatures _error-format-parameters^_, _language-format-parameters^_, and _language-dictionary^_.
-
-The signature combinator-parser^ references functions to build combinators,
-a function to build a runnable parser using a combinator, a structure for
-recording errors and macro definitions to specify combinators with:
-
->(terminal predicate result name spell-check case-check type-check) ->
-(list token) -> parser-result
-The returned function accepts one terminal from a token stream, and
-produces an opaque value that interacts with other combinators.
-
-predicate: token -> boolean - check that the token is the expected one
-result: token -> beta - create the ast node for this terminal
-name: string - human-language name for this terminal
-spell-check, case-check, type-check: (U bool (token -> bool))
-optional arguments, default to #f, perform spell checking, case
-checking, and kind checking on incorrect tokens
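As a rough illustration of the terminal contract described above, the following self-contained sketch models a combinator as a function from a token list to either a (value . remaining-tokens) pair or #f. It is a simplified stand-in and not the library's implementation: the real parser-result is opaque, the optional spell-check, case-check and type-check arguments are left out, and plain symbols stand in for tokens produced by _lex.ss_. The `#lang racket` line and the name if-kw are assumptions made for the example.

```racket
#lang racket
;; Simplified model (an assumption, not the library's code):
;; a parser maps a token list to (cons value remaining-tokens), or #f.

;; terminal: accept exactly one token satisfying predicate.
;; name is kept only to mirror the documented signature; a real
;; implementation would use it in error messages.
(define (terminal predicate result name)
  (lambda (tokens)
    (if (and (pair? tokens) (predicate (car tokens)))
        (cons (result (car tokens)) (cdr tokens))
        #f)))

;; Hypothetical terminal: plain symbols stand in for lexer tokens.
(define if-kw
  (terminal (lambda (tok) (eq? tok 'IF))
            (lambda (tok) 'if-node)
            "if keyword"))

(if-kw '(IF x))    ; succeeds: value if-node, remaining tokens (x)
(if-kw '(ELSE x))  ; fails: #f
```

Returning the remaining tokens alongside the value is what lets a surrounding combinator continue parsing from where the terminal stopped.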
-
->(seq sequence result name) -> (list token) -> parser-result
-The returned function accepts a term made up of a sequence of smaller
-terms, and produces an opaque value that interacts with other
-combinators.
-
-sequence: (listof ((list token) -> parser-result)) - the subterms
-result: (list alpha) -> beta - create the ast node for this sequence.
-Input list matches length of sequence list
-name: human-language name for this term
-
->(choice options name) -> (list token) -> parser-result
-The returned function selects between different terms, and produces an
-opaque value that interacts with other combinators.
-
-options: (listof ((list token) -> parser-result)) - the possible terms
-name: human-language name for this term
-
->(repeat term) -> (list token) -> parser-result
-The returned function accepts 0 or more instances of term, and produces
-an opaque value that interacts with other combinators.
-
-term: (list token) -> parser-result
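The way seq, choice and repeat compose can be sketched with the same kind of simplified model: a parser is a function from a token list to (cons value remaining-tokens), or #f on failure. This is an illustration under stated assumptions rather than the library's code. In particular, this choice simply takes the first option that succeeds, no error information is recorded, and the terminals num and plus and the terms sum and term are hypothetical.

```racket
#lang racket
;; Simplified model (an assumption, not the library's code):
;; parser : (listof token) -> (cons value remaining-tokens) or #f

(define (terminal predicate result name)
  (lambda (tokens)
    (and (pair? tokens)
         (predicate (car tokens))
         (cons (result (car tokens)) (cdr tokens)))))

;; seq: run each subterm in order, then hand the collected values
;; (one per subterm) to result.
(define (seq sequence result name)
  (lambda (tokens)
    (let loop ((terms sequence) (toks tokens) (vals '()))
      (if (null? terms)
          (cons (result (reverse vals)) toks)
          (let ((r ((car terms) toks)))
            (and r (loop (cdr terms) (cdr r) (cons (car r) vals))))))))

;; choice: first option that succeeds (assumed strategy).
(define (choice options name)
  (lambda (tokens)
    (let loop ((opts options))
      (and (pair? opts)
           (or ((car opts) tokens)
               (loop (cdr opts)))))))

;; repeat: zero or more instances of term; never fails. The eq? guard
;; stops the loop if term succeeds without consuming any token.
(define (repeat term)
  (lambda (tokens)
    (let loop ((toks tokens) (vals '()))
      (let ((r (term toks)))
        (if (and r (not (eq? (cdr r) toks)))
            (loop (cdr r) (cons (car r) vals))
            (cons (reverse vals) toks))))))

;; Hypothetical grammar: numbers and sums over symbol/number tokens.
(define num  (terminal number? (lambda (t) t) "number"))
(define plus (terminal (lambda (t) (eq? t '+)) (lambda (t) '+) "plus"))
(define sum  (seq (list num plus num)
                  (lambda (vals) (+ (car vals) (caddr vals)))
                  "sum"))
(define term (choice (list sum num) "term"))

(sum '(1 + 2))           ; value 3, no tokens left
(term '(7 x))            ; sum fails, num succeeds: value 7, remaining (x)
((repeat num) '(1 2 x))  ; value (1 2), remaining (x)
```

Because every combinator returns a function of the same shape, the results of seq, choice and repeat can be passed back in as subterms of one another.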
-
->(parser term) -> (list token) location -> ast-or-error
-Returns a function that parses a list of tokens, producing either the
-result of calling all appropriate result functions or an err
-
-term: (list token) -> parser-result
-location: string | editor
-Either the string representing the file name or the editor being read,
-typically retrieved from file-path
-ast-or-error: AST | err
-AST is the result of calling the given result function
-
-The err structure is:
->(make-err string source-list)
-
->(err-msg err) -> string
-The error message
->(err-src err) -> (list location line-k col-k pos-k span-k)
-This list is suitable for calling raise-read-error,
-*-k are positive integers
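One way a caller might consume the ast-or-error result is sketched below. The err structure here is a local stand-in with the documented fields, not the one exported by the unit, and ast-or-raise and the sample values are hypothetical. The sketch assumes a current Racket, where raise-read-error is provided by syntax/readerr and takes a message followed by source, line, column, position and span, which is the order of the err-src list.

```racket
#lang racket
(require syntax/readerr)  ; provides raise-read-error in current Racket

;; Stand-in for the documented err structure: (make-err string source-list).
(define-struct err (msg src))

;; Hypothetical helper: return the AST unchanged, or turn an err into
;; a read error by spreading err-src over raise-read-error's source,
;; line, column, position and span arguments.
(define (ast-or-raise result)
  (if (err? result)
      (apply raise-read-error (err-msg result) (err-src result))
      result))

;; Made-up sample values for illustration.
(define sample-err
  (make-err "expected an expression" (list "example.ss" 2 3 14 4)))

(ast-or-raise 'some-ast)  ; returns some-ast
;; (ast-or-raise sample-err) raises an exn:fail:read exception
```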
-
-The language forms provided are:
->(define-simple-terminals NAME (simple-spec ...))
-Expands to a define-empty-tokens and one terminal definition per
-simple-spec
-
-NAME is an identifier specifying a group of tokens
-
-simple-spec = NAME | (NAME string) | (NAME proc) | (NAME string proc)
-NAME is an identifier specifying a token/terminal with no value
-proc: token -> ast - A procedure from tokens to AST nodes. id is used
-by default. The token will be a symbol.
-string is the human-language name for the terminal, NAME is used by
-default
-
->(define-terminals NAME (terminal-spec ...))
-Like define-simple-terminals, except uses define-tokens
-
-terminal-spec = (NAME proc) | (NAME string proc)
-proc: token -> ast - a procedure from tokens to AST node.
-The token will be the token defined as NAME and will be a value token.
-
->(sequence (NAME ...) proc string)
-Generates a call to seq with the specified names in a list,
-proc => result and string => name.
-The name can be omitted when nested in another sequence or choose
-
->(sequence (NAME_ID ...) proc string)
-where NAME_ID is either NAME or (^ NAME)
-The ^ form identifies a parser production that can be used to identify
-this production in an error message. Otherwise the same as above
-
->(choose (NAME ...) string)
-Generates a call to choice using the given terms as the list of options,
-string => name.
-The name can be omitted when nested in another sequence or choose
-
->(eta NAME)
-Eta expands name with a wrapping that properly mimics a parser term
-
-The _error-format-parameters^_ signature requires five names:
-src?: boolean - will the lexer include source information
-input-type: string - used to identify the source of input
-show-options: boolean - presently ignored
-max-depth: int - The depth of errors reported
-max-choice-depth: int - The max number of options listed in an error
-
-The _language-format-parameters^_ requires two names
-class-type: string - general term for language keywords
-input->output-name: token -> string - translates tokens into strings
-
-The _language-dictionary^_ requires three names
-misspelled: string string -> boolean -
-check the spelling of the second arg against the first
-misscap: string string -> boolean -
-check the capitalization of the second arg against the first
-missclass: string string -> boolean -
-check if the second arg names a correct token kind
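A possible shape for two of the _language-dictionary^_ procedures is sketched below as plain functions, outside any unit. The text above does not say what a #t result means, so this sketch assumes each predicate returns #t when the second string is a near miss for the first. The edit-distance helper and the one-edit threshold are choices made for the example, and missclass is omitted because it depends on a language's own token kinds.

```racket
#lang racket
;; Assumed semantics (not confirmed by the documentation): each
;; predicate returns #t when the second string is a near miss for
;; the first.

;; Plain recursive edit distance; exponential, fine for short keywords.
(define (edit-distance a b)
  (let loop ((xs (string->list a)) (ys (string->list b)))
    (cond ((null? xs) (length ys))
          ((null? ys) (length xs))
          ((char=? (car xs) (car ys)) (loop (cdr xs) (cdr ys)))
          (else (+ 1 (min (loop (cdr xs) ys)
                          (loop xs (cdr ys))
                          (loop (cdr xs) (cdr ys))))))))

;; misspelled: exactly one insertion, deletion or substitution away,
;; ignoring case so that pure case errors are left to misscap.
(define (misspelled expected given)
  (= 1 (edit-distance (string-downcase expected)
                      (string-downcase given))))

;; misscap: same letters, different capitalization.
(define (misscap expected given)
  (and (not (string=? expected given))
       (string-ci=? expected given)))

(misspelled "class" "clas")   ; #t
(misspelled "class" "Class")  ; #f
(misscap "class" "Class")     ; #t
```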