diff --git a/collects/parser-tools/combinator-unit.ss b/collects/combinator-parser/combinator-unit.ss
similarity index 100%
rename from collects/parser-tools/combinator-unit.ss
rename to collects/combinator-parser/combinator-unit.ss
diff --git a/collects/combinator-parser/doc.txt b/collects/combinator-parser/doc.txt
new file mode 100644
index 0000000000..6b2fc65466
--- /dev/null
+++ b/collects/combinator-parser/doc.txt
@@ -0,0 +1,132 @@
+_combinator-parser_
+
+This documentation provides directions on using the combinator parser library. It assumes familiarity with lexing and with combinator parsers.
+
+_combinator-unit.ss_
+This library provides a unit implementing four higher-order functions
+that can be used to build a combinator parser, and the export and
+import signatures related to it. The functions contained in this unit
+automatically build error reporting mechanisms in the event that no parse
+is found. Unlike other combinator parsers, this system assumes that the
+input is already lexed into tokens using _lex.ss_. This library relies on
+_(lib "lazy.ss" "lazy")_.
+
+The unit _combinator-parser-tools_ exports the signature
+_combinator-parser^_ and imports the signatures _error-format-parameters^_, _language-format-parameters^_, and _language-dictionary^_.
+
+The signature combinator-parser^ references functions to build combinators,
+a function to build a runnable parser using a combinator, a structure for
+recording errors, and macro definitions to specify combinators with:
+
+ >(terminal predicate result name spell-check case-check type-check) ->
+ (list token) -> parser-result
+ The returned function accepts one terminal from a token stream, and
+ produces an opaque value that interacts with other combinators.
+
+ predicate: token -> boolean - check that the token is the expected one
+ result: token -> beta - create the ast node for this terminal
+ name: string - human-language name for this terminal
+ spell-check, case-check, type-check: (U bool (token -> bool))
+ optional arguments, default to #f, perform spell checking, case
+ checking, and kind checking on incorrect tokens
+
+ >(seq sequence result name) -> (list token) -> parser-result
+ The returned function accepts a term made up of a sequence of smaller
+ terms, and produces an opaque value that interacts with other
+ combinators.
+
+ sequence: (listof ((list token) -> parser-result)) - the subterms
+ result: (list alpha) -> beta - create the ast node for this sequence.
+ Input list matches length of sequence list
+ name: human-language name for this term
+
+ >(choice options name) -> (list token) -> parser-result
+ The returned function selects between different terms, and produces an
+ opaque value that interacts with other combinators
+
+ options: (listof ((list token) -> parser-result)) - the possible terms
+ name: human-language name for this term
+
+ >(repeat term) -> (list token) -> parser-result
+ The returned function accepts 0 or more instances of term, and produces
+ an opaque value that interacts with other combinators
+
+ term: (list token) -> parser-result
+
+ >(parser term) -> (list token) location -> ast-or-error
+ Returns a function that parses a list of tokens, producing either the
+ result of calling all appropriate result functions or an err
+
+ term: (list token) -> parser-result
+ location: string | editor
+ Either the string representing the file name or the editor being read,
+ typically retrieved from file-path
+ ast-or-error: AST | err
+ AST is the result of calling the given result function
+
+ The err structure is:
+ >(make-err string source-list)
+
+ >(err-msg err) -> string
+ The error message
+ >(err-src err) -> (list location line-k col-k pos-k span-k)
+ This list is suitable for calling raise-read-error,
+ *-k are positive integers
+
+ The language forms provided are:
+ >(define-simple-terminals NAME (simple-spec ...))
+ Expands to a define-empty-tokens and one terminal definition per
+ simple-spec
+
+ NAME is an identifier specifying a group of tokens
+
+ simple-spec = NAME | (NAME string) | (NAME proc) | (NAME string proc)
+ NAME is an identifier specifying a token/terminal with no value
+ proc: token -> ast - A procedure from tokens to AST nodes. id is used
+ by default. The token will be a symbol.
+ string is the human-language name for the terminal, NAME is used by
+ default
+
+ >(define-terminals NAME (terminal-spec ...))
+ Like define-simple-terminals, except uses define-tokens
+
+ terminal-spec = (NAME proc) | (NAME string proc)
+ proc: token -> ast - a procedure from tokens to AST node.
+ The token will be the token defined as NAME and will be a value token.
+
+ >(sequence (NAME ...) proc string)
+ Generates a call to seq with the specified names in a list,
+ proc => result and string => name.
+ The name can be omitted when nested in another sequence or choose
+
+ >(sequence (NAME_ID ...) proc string)
+ where NAME_ID is either NAME or (^ NAME)
+ The ^ form identifies a parser production that can be used to identify
+ this production in an error message. Otherwise the same as above
+
+ >(choose (NAME ...) string)
+ Generates a call to choice using the given terms as the list of options,
+ string => name.
+ The name can be omitted when nested in another sequence or choose
+
+ >(eta NAME)
+ Eta expands name with a wrapping that properly mimics a parser term
+
+The _error-format-parameters^_ signature requires five names:
+ src?: boolean- will the lexer include source information
+ input-type: string- used to identify the source of input
+ show-options: boolean- presently ignored
+ max-depth: int- The depth of errors reported
+ max-choice-depth: int- The max number of options listed in an error
+
+The _language-format-parameters^_ requires two names
+ class-type: string - general term for language keywords
+ input->output-name: token -> string - translates tokens into strings
+
+The _language-dictionary^_ requires three names
+ misspelled: string string -> boolean -
+ check the spelling of the second arg against the first
+ misscap: string string -> boolean -
+ check the capitalization of the second arg against the first
+ missclass: string string -> boolean -
+ check if the second arg names a correct token kind
\ No newline at end of file
diff --git a/collects/parser-tools/examples/combinator-example.ss b/collects/combinator-parser/examples/combinator-example.ss
similarity index 100%
rename from collects/parser-tools/examples/combinator-example.ss
rename to collects/combinator-parser/examples/combinator-example.ss
diff --git a/collects/combinator-parser/info.ss b/collects/combinator-parser/info.ss
new file mode 100644
index 0000000000..2df34e404e
--- /dev/null
+++ b/collects/combinator-parser/info.ss
@@ -0,0 +1,5 @@
+
+(module info (lib "infotab.ss" "setup")
+ (define doc.txt "doc.txt")
+ (define name "Combinator parser"))
+
diff --git a/collects/parser-tools/private-combinator/combinator-parser.scm b/collects/combinator-parser/private-combinator/combinator-parser.scm
similarity index 100%
rename from collects/parser-tools/private-combinator/combinator-parser.scm
rename to collects/combinator-parser/private-combinator/combinator-parser.scm
diff --git a/collects/parser-tools/private-combinator/combinator.scm b/collects/combinator-parser/private-combinator/combinator.scm
similarity index 100%
rename from collects/parser-tools/private-combinator/combinator.scm
rename to collects/combinator-parser/private-combinator/combinator.scm
diff --git a/collects/parser-tools/private-combinator/errors.scm b/collects/combinator-parser/private-combinator/errors.scm
similarity index 100%
rename from collects/parser-tools/private-combinator/errors.scm
rename to collects/combinator-parser/private-combinator/errors.scm
diff --git a/collects/combinator-parser/private-combinator/info.ss b/collects/combinator-parser/private-combinator/info.ss
new file mode 100644
index 0000000000..771331161e
--- /dev/null
+++ b/collects/combinator-parser/private-combinator/info.ss
@@ -0,0 +1,2 @@
+(module info (lib "infotab.ss" "setup")
+ (define name "Combinator-parser private-combinator"))
diff --git a/collects/parser-tools/private-combinator/parser-sigs.ss b/collects/combinator-parser/private-combinator/parser-sigs.ss
similarity index 100%
rename from collects/parser-tools/private-combinator/parser-sigs.ss
rename to collects/combinator-parser/private-combinator/parser-sigs.ss
diff --git a/collects/parser-tools/private-combinator/structs.scm b/collects/combinator-parser/private-combinator/structs.scm
similarity index 100%
rename from collects/parser-tools/private-combinator/structs.scm
rename to collects/combinator-parser/private-combinator/structs.scm
diff --git a/collects/parser-tools/private-combinator/info.ss b/collects/parser-tools/private-combinator/info.ss
deleted file mode 100644
index ddcb21c109..0000000000
--- a/collects/parser-tools/private-combinator/info.ss
+++ /dev/null
@@ -1,2 +0,0 @@
-(module info (lib "infotab.ss" "setup")
- (define name "Parser-tools private-combinator"))