Moved the combinator parser out of parser-tools to alleviate the problem the
dependence on lazy had for the mzscheme build svn: r5728
This commit is contained in:
parent
67e01a1f47
commit
fb40b02b42
132
collects/combinator-parser/doc.txt
Normal file
132
collects/combinator-parser/doc.txt
Normal file
|
@ -0,0 +1,132 @@
|
|||
_combinator-parser_
|
||||
|
||||
This documentation provides directions on using the combinator parser library. It assumes familiarity with lexing and with combinator parsers.
|
||||
|
||||
_combinator-unit.ss_
|
||||
This library provides a unit implementing four higher-order functions
|
||||
that can be used to build a combinator parser, and the export and
|
||||
import signatures related to it. The functions contained in this unit
|
||||
automatically build error reporting mechanisms in the event that no parse
|
||||
is found. Unlike other combinator parsers, this system assumes that the
|
||||
input is already lexed into tokens using _lex.ss_. This library relies on
|
||||
_(lib "lazy.ss" "lazy")_.
|
||||
|
||||
The unit _combinator-parser-tools_ exports the signature
|
||||
_combinator-parser^_ and imports the signatures _error-format-parameters^_, _language-format-parameters^_, and _language-dictionary^_.
|
||||
|
||||
The signature combinator-parser^ references functions to build combinators,
|
||||
a function to build a runable parser using a combinator, a structure for
|
||||
recording errors and macro definitions to specify combinators with:
|
||||
|
||||
>(terminal predicate result name spell-check case-check type-check) ->
|
||||
(list token) -> parser-result
|
||||
The returned function accepts one terminal from a token stream, and
|
||||
returns produces an opaque value that interacts with other combinators.
|
||||
|
||||
predicate: token -> boolean - check that the token is the expected one
|
||||
result: token -> beta - create the ast node for this terminal
|
||||
name: string - human-language name for this terminal
|
||||
spell-check, case-check, type-check: (U bool (token -> bool))
|
||||
optional arguments, default to #f, perform spell checking, case
|
||||
checking, and kind checking on incorrect tokens
|
||||
|
||||
>(seq sequence result name) -> (list token) -> parser-result
|
||||
The returned function accepts a term made up of a sequence of smaller
|
||||
terms, and produces an opaque value that interacts with other
|
||||
combinators.
|
||||
|
||||
sequence: (listof ((list token) -> parser-result)) - the subterms
|
||||
result: (list alpha) -> beta - create the ast node for this sequence.
|
||||
Input list matches length of sequence list
|
||||
name: human-language name for this term
|
||||
|
||||
>(choice options name) -> (list token) -> parser-result
|
||||
The returned function selects between different terms, and produces an
|
||||
opaque value that interacts with other combinators
|
||||
|
||||
options: (listof ((list token) -> parser-result) - the possible terms
|
||||
name: human-language name for this term
|
||||
|
||||
>(repeat term) -> (list token) -> parser-result
|
||||
The returned function accepts 0 or more instances of term, and produces
|
||||
an opaque value that interacts with other combinators
|
||||
|
||||
term: (list token) -> parser-result
|
||||
|
||||
>(parser term) -> (list token) location -> ast-or-error
|
||||
Returns a function that parses a list of tokens, producing either the
|
||||
result of calling all appropriate result functions or an err
|
||||
|
||||
term: (list token) -> parser-result
|
||||
location: string | editor
|
||||
Either the string representing the file name or the editor being read,
|
||||
typically retrieved from file-path
|
||||
ast-or-error: AST | err
|
||||
AST is the result of calling the given result function
|
||||
|
||||
The err structure is:
|
||||
>(make-err string source-list)
|
||||
|
||||
>(err-msg err) -> string
|
||||
The error message
|
||||
>(err-src err) -> (list location line-k col-k pos-k span-k)
|
||||
This list is suitable for calling raise-read-error,
|
||||
*-k are positive integers
|
||||
|
||||
The language forms provided are:
|
||||
>(define-simple-terminals NAME (simple-spec ...))
|
||||
Expands to a define-empty-tokens and one terminal definition per
|
||||
simple-spec
|
||||
|
||||
NAME is an identifier specifying a group of tokens
|
||||
|
||||
simple-spec = NAME | (NAME string) | (NAME proc) | (NAME string proc)
|
||||
NAME is an identifier specifying a token/terminal with no value
|
||||
proc: token -> ast - A procedure from tokens to AST nodes. id is used
|
||||
by default. The token will be a symbol.
|
||||
string is the human-language name for the terminal, NAME is used by
|
||||
default
|
||||
|
||||
>(define-terminals NAME (terminal-spec ...))
|
||||
Like define-simple-terminals, except uses define-tokens
|
||||
|
||||
terminal-spec = (NAME proc) | (NAME string proc)
|
||||
proc: token -> ast - a procedure from tokens to AST node.
|
||||
The token will be the token defined as NAME and will be a value token.
|
||||
|
||||
>(sequence (NAME ...) proc string)
|
||||
Generates a call to seq with the specified names in a list,
|
||||
proc => result and string => name.
|
||||
The name can be omitted when nested in another sequence or choose
|
||||
|
||||
>(sequence (NAME_ID ...) proc string)
|
||||
where NAME_ID is either NAME or (^ NAME)
|
||||
The ^ form identifies a parser production that can be used to identify
|
||||
this production in an error message. Otherwise the same as above
|
||||
|
||||
>(choose (NAME ...) string)
|
||||
Generates a call to choice using the given terms as the list of options,
|
||||
string => name.
|
||||
The name can be omitted when nested in another sequence or choose
|
||||
|
||||
>(eta NAME)
|
||||
Eta expands name with a wrapping that properly mimcs a parser term
|
||||
|
||||
The _error-format-parameters^_ signature requires five names:
|
||||
src?: boolean- will the lexer include source information
|
||||
input-type: string- used to identify the source of input
|
||||
show-options: boolean- presently ignored
|
||||
max-depth: int- The depth of errors reported
|
||||
max-choice-depth: int- The max number of options listed in an error
|
||||
|
||||
The _language-format-parameters^_ requires two names
|
||||
class-type: string - general term for language keywords
|
||||
input->output-name: token -> string - translates tokens into strings
|
||||
|
||||
The _language-dictionary^_ requires three names
|
||||
misspelled: string string -> boolean -
|
||||
check the spelling of the second arg against the first
|
||||
misscap: string string -> boolean -
|
||||
check the capitalization of the second arg against the first
|
||||
missclass: string string -> boolean -
|
||||
check if the second arg names a correct token kind
|
5
collects/combinator-parser/info.ss
Normal file
5
collects/combinator-parser/info.ss
Normal file
|
@ -0,0 +1,5 @@
|
|||
|
||||
(module info (lib "infotab.ss" "setup")
|
||||
(define doc.txt "doc.txt")
|
||||
(define name "Combinator parser"))
|
||||
|
2
collects/combinator-parser/private-combinator/info.ss
Normal file
2
collects/combinator-parser/private-combinator/info.ss
Normal file
|
@ -0,0 +1,2 @@
|
|||
(module info (lib "infotab.ss" "setup")
|
||||
(define name "Combinator-parser private-combinator"))
|
|
@ -1,2 +0,0 @@
|
|||
(module info (lib "infotab.ss" "setup")
|
||||
(define name "Parser-tools private-combinator"))
|
Loading…
Reference in New Issue
Block a user