diff --git a/collects/parser-tools/doc.txt b/collects/parser-tools/doc.txt
index c971a47be1..cb484a8ef3 100644
--- a/collects/parser-tools/doc.txt
+++ b/collects/parser-tools/doc.txt
@@ -1,6 +1,7 @@
 _parser-tools_
 
-This documentation provides directions on using the lexer, Yacc-style parser generator and combinator parser library. It assumes familiarity with lex and yacc style lexer and parser generators and with combinator parsers.
+This documentation assumes familiarity with lex and yacc style lexer
+and parser generators.
 
 _lex.ss_
 A _regular expression_ is one of the following:
@@ -425,131 +426,4 @@
 the original grammar have nested blocks the tool will fail.
 
 Annotated examples are in the examples subdirectory of the parser-tools
 collection directory.
-_combinator-unit.ss_
-This library provides a unit implementing four higher-order functions
-that can be used to build a combinator parser, and the export and
-import signatures related to it. The functions contained in this unit
-automatically build error reporting mechanisms in the event that no parse
-is found. Unlike other combinator parsers, this system assumes that the
-input is already lexed into tokens using _lex.ss_. This library relies on
-_(lib "lazy.ss" "lazy")_.
-The unit _combinator-parser-tools_ exports the signature
-_combinator-parser^_ and imports the signatures _error-format-parameters^_, _language-format-parameters^_, and _language-dictionary^_.
-
-The signature combinator-parser^ references functions to build combinators,
-a function to build a runable parser using a combinator, a structure for
-recording errors and macro definitions to specify combinators with:
-
- >(terminal predicate result name spell-check case-check type-check) ->
-   (list token) -> parser-result
-   The returned function accepts one terminal from a token stream, and
-   returns produces an opaque value that interacts with other combinators.
-
-   predicate: token -> boolean - check that the token is the expected one
-   result: token -> beta - create the ast node for this terminal
-   name: string - human-language name for this terminal
-   spell-check, case-check, type-check: (U bool (token -> bool))
-     optional arguments, default to #f, perform spell checking, case
-     checking, and kind checking on incorrect tokens
-
- >(seq sequence result name) -> (list token) -> parser-result
-   The returned function accepts a term made up of a sequence of smaller
-   terms, and produces an opaque value that interacts with other
-   combinators.
-
-   sequence: (listof ((list token) -> parser-result)) - the subterms
-   result: (list alpha) -> beta - create the ast node for this sequence.
-     Input list matches length of sequence list
-   name: human-language name for this term
-
- >(choice options name) -> (list token) -> parser-result
-   The returned function selects between different terms, and produces an
-   opaque value that interacts with other combinators
-
-   options: (listof ((list token) -> parser-result) - the possible terms
-   name: human-language name for this term
-
- >(repeat term) -> (list token) -> parser-result
-   The returned function accepts 0 or more instances of term, and produces
-   an opaque value that interacts with other combinators
-
-   term: (list token) -> parser-result
-
- >(parser term) -> (list token) location -> ast-or-error
-   Returns a function that parses a list of tokens, producing either the
-   result of calling all appropriate result functions or an err
-
-   term: (list token) -> parser-result
-   location: string | editor
-     Either the string representing the file name or the editor being read,
-     typically retrieved from file-path
-   ast-or-error: AST | err
-     AST is the result of calling the given result function
-
-   The err structure is:
-   >(make-err string source-list)
-
-   >(err-msg err) -> string
-     The error message
-   >(err-src err) -> (list location line-k col-k pos-k span-k)
-     This list is suitable for calling raise-read-error,
-     *-k are positive integers
-
-   The language forms provided are:
-   >(define-simple-terminals NAME (simple-spec ...))
-     Expands to a define-empty-tokens and one terminal definition per
-     simple-spec
-
-     NAME is an identifier specifying a group of tokens
-
-     simple-spec = NAME | (NAME string) | (NAME proc) | (NAME string proc)
-     NAME is an identifier specifying a token/terminal with no value
-     proc: token -> ast - A procedure from tokens to AST nodes. id is used
-       by default. The token will be a symbol.
-     string is the human-language name for the terminal, NAME is used by
-       default
-
-   >(define-terminals NAME (terminal-spec ...))
-     Like define-simple-terminals, except uses define-tokens
-
-     terminal-spec = (NAME proc) | (NAME string proc)
-     proc: token -> ast - a procedure from tokens to AST node.
-       The token will be the token defined as NAME and will be a value token.
-
-   >(sequence (NAME ...) proc string)
-     Generates a call to seq with the specified names in a list,
-     proc => result and string => name.
-     The name can be omitted when nested in another sequence or choose
-
-   >(sequence (NAME_ID ...) proc string)
-     where NAME_ID is either NAME or (^ NAME)
-     The ^ form identifies a parser production that can be used to identify
-     this production in an error message. Otherwise the same as above
-
-   >(choose (NAME ...) string)
-     Generates a call to choice using the given terms as the list of options,
-     string => name.
-     The name can be omitted when nested in another sequence or choose
-
-   >(eta NAME)
-     Eta expands name with a wrapping that properly mimcs a parser term
-
-The _error-format-parameters^_ signature requires five names:
-  src?: boolean- will the lexer include source information
-  input-type: string- used to identify the source of input
-  show-options: boolean- presently ignored
-  max-depth: int- The depth of errors reported
-  max-choice-depth: int- The max number of options listed in an error
-
-The _language-format-parameters^_ requires two names
-  class-type: string - general term for language keywords
-  input->output-name: token -> string - translates tokens into strings
-
-The _language-dictionary^_ requires three names
-  misspelled: string string -> boolean -
-    check the spelling of the second arg against the first
-  misscap: string string -> boolean -
-    check the capitalization of the second arg against the first
-  missclass: string string -> boolean -
-    check if the second arg names a correct token kind
\ No newline at end of file