Reverse kathy's changes to doc.txt

svn: r5732
2007-03-05 04:55:35 +00:00
commit f5deb3f60d
parent 58bc45a979
1 changed file with 2 additions and 128 deletions
--- a/collects/parser-tools/doc.txt
+++ b/collects/parser-tools/doc.txt
@@ -1,6 +1,7 @@
 _parser-tools_ 
-This documentation provides directions on using the lexer, Yacc-style parser generator and combinator parser library. It assumes familiarity with lex and yacc style lexer and parser generators and with combinator parsers.
+This documentation assumes familiarity with lex and yacc style lexer
+and parser generators.
 _lex.ss_
 A _regular expression_ is one of the following:
@@ -425,131 +426,4 @@ the original grammar have nested blocks the tool will fail.
 Annotated examples are in the examples subdirectory of the parser-tools
 collection directory.
 _combinator-unit.ss_
 This library provides a unit implementing four higher-order functions 
 that can be used to build a combinator parser, and the export and
 import signatures related to it. The functions contained in this unit
 automatically build error reporting mechanisms in the event that no parse
 is found. Unlike other combinator parsers, this system assumes that the 
 input is already lexed into tokens using _lex.ss_. This library relies on 
 _(lib "lazy.ss" "lazy")_. 
 The unit _combinator-parser-tools_ exports the signature 
 _combinator-parser^_ and imports the signatures _error-format-parameters^_, _language-format-parameters^_, and _language-dictionary^_.
 The signature combinator-parser^ references functions to build combinators,
 a function to build a runable parser using a combinator, a structure for 
 recording errors and macro definitions to specify combinators with:
  >(terminal predicate result name spell-check case-check type-check) ->
   (list token) -> parser-result 
  The returned function accepts one terminal from a token stream, and 
  returns produces an opaque value that interacts with other combinators.
    predicate: token -> boolean - check that the token is the expected one
    result: token -> beta       - create the ast node for this terminal
    name: string - human-language name for this terminal
    spell-check, case-check, type-check: (U bool (token -> bool)) 
     optional arguments, default to #f, perform spell checking, case 
     checking, and kind checking on incorrect tokens
  >(seq sequence result name) -> (list token) -> parser-result
  The returned function accepts a term made up of a sequence of smaller
  terms, and produces an opaque value that interacts with other
  combinators.
    sequence: (listof ((list token) -> parser-result)) - the subterms
    result: (list alpha) -> beta - create the ast node for this sequence. 
      Input list matches length of sequence list
    name: human-language name for this term
  >(choice options name) -> (list token) -> parser-result
 The returned function selects between different terms, and produces an
 opaque value that interacts with other combinators 
    options: (listof ((list token) -> parser-result) - the possible terms
    name: human-language name for this term
  >(repeat term) -> (list token) -> parser-result
  The returned function accepts 0 or more instances of term, and produces
  an opaque value that interacts with other combinators
    term: (list token) -> parser-result
  >(parser term) -> (list token) location -> ast-or-error
    Returns a function that parses a list of tokens, producing either the 
    result of calling all appropriate result functions or an err
    term: (list token) -> parser-result 
    location: string | editor 
     Either the string representing the file name or the editor being read,
     typically retrieved from file-path
    ast-or-error: AST | err 
     AST is the result of calling the given result function
  The err structure is:
  >(make-err string source-list)
  >(err-msg err) -> string 
     The error message
  >(err-src err) -> (list location line-k col-k pos-k span-k)
       This list is suitable for calling raise-read-error,
       *-k are positive integers
  The language forms provided are:
  >(define-simple-terminals NAME (simple-spec ...))
    Expands to a define-empty-tokens and one terminal definition per
    simple-spec
    NAME is an identifier specifying a group of tokens
    simple-spec = NAME | (NAME string) | (NAME proc) | (NAME string proc)
    NAME is an identifier specifying a token/terminal with no value
    proc: token -> ast - A procedure from tokens to AST nodes. id is used
    by default. The token will be a symbol.
    string is the human-language name for the terminal, NAME is used by
    default
  >(define-terminals NAME (terminal-spec ...))
   Like define-simple-terminals, except uses define-tokens 
   terminal-spec = (NAME proc) | (NAME string proc)
    proc: token -> ast - a procedure from tokens to AST node. 
    The token will be the token defined as NAME and will be a value token.
  >(sequence (NAME ...) proc string)
  Generates a call to seq with the specified names in a list, 
  proc => result and string => name.
  The name can be omitted when nested in another sequence or choose
  >(sequence (NAME_ID ...) proc string)  
  where NAME_ID is either NAME or (^ NAME)
  The ^ form identifies a parser production that can be used to identify
  this production in an error message. Otherwise the same as above
  >(choose (NAME ...) string)
  Generates a call to choice using the given terms as the list of options, 
  string => name.
  The name can be omitted when nested in another sequence or choose
  >(eta NAME)
  Eta expands name with a wrapping that properly mimcs a parser term
 The _error-format-parameters^_ signature requires five names:
  src?: boolean- will the lexer include source information
  input-type: string- used to identify the source of input
  show-options: boolean- presently ignored
  max-depth: int- The depth of errors reported
  max-choice-depth: int- The max number of options listed in an error 
 The _language-format-parameters^_ requires two names
  class-type: string - general term for language keywords
  input->output-name: token -> string - translates tokens into strings
 The _language-dictionary^_ requires three names
  misspelled: string string -> boolean - 
     check the spelling of the second arg against the first
  misscap: string string -> boolean - 
     check the capitalization of the second arg against the first
  missclass: string string -> boolean - 
     check if the second arg names a correct token kind