Adding a set of higher-order combinator-builders with good error message generation
Combinator implementation is the responsiblity of kathyg, not sowens svn: r5718
This commit is contained in:
parent
eadc853016
commit
c064fe4238
10
collects/parser-tools/combinator-unit.ss
Normal file
10
collects/parser-tools/combinator-unit.ss
Normal file
|
@ -0,0 +1,10 @@
|
|||
(module combinator-unit mzscheme
|
||||
|
||||
(require "private-combinator/combinator-parser.scm"
|
||||
"private-combinator/parser-sigs.ss")
|
||||
|
||||
(provide combinator-parser-tools@
|
||||
combinator-parser^
|
||||
error-format-parameters^ language-format-parameters^ language-dictionary^)
|
||||
|
||||
)
|
|
@ -1,7 +1,6 @@
|
|||
_parser-tools_
|
||||
|
||||
This documentation assumes familiarity with lex and yacc style lexer
|
||||
and parser generators.
|
||||
This documentation provides directions on using the lexer, Yacc-style parser generator and combinator parser library. It assumes familiarity with lex and yacc style lexer and parser generators and with combinator parsers.
|
||||
|
||||
_lex.ss_
|
||||
A _regular expression_ is one of the following:
|
||||
|
@ -426,4 +425,131 @@ the original grammar have nested blocks the tool will fail.
|
|||
Annotated examples are in the examples subdirectory of the parser-tools
|
||||
collection directory.
|
||||
|
||||
_combinator-unit.ss_
|
||||
This library provides a unit implementing four higher-order functions
|
||||
that can be used to build a combinator parser, and the export and
|
||||
import signatures related to it. The functions contained in this unit
|
||||
automatically build error reporting mechanisms in the event that no parse
|
||||
is found. Unlike other combinator parsers, this system assumes that the
|
||||
input is already lexed into tokens using _lex.ss_. This library relies on
|
||||
_(lib "lazy.ss" "lazy")_.
|
||||
|
||||
The unit _combinator-parser-tools_ exports the signature
|
||||
_combinator-parser^_ and imports the signatures _error-format-parameters^_, _language-format-parameters^_, and _language-dictionary^_.
|
||||
|
||||
The signature combinator-parser^ references functions to build combinators,
|
||||
a function to build a runable parser using a combinator, a structure for
|
||||
recording errors and macro definitions to specify combinators with:
|
||||
|
||||
>(terminal predicate result name spell-check case-check type-check) ->
|
||||
(list token) -> parser-result
|
||||
The returned function accepts one terminal from a token stream, and
|
||||
returns produces an opaque value that interacts with other combinators.
|
||||
|
||||
predicate: token -> boolean - check that the token is the expected one
|
||||
result: token -> beta - create the ast node for this terminal
|
||||
name: string - human-language name for this terminal
|
||||
spell-check, case-check, type-check: (U bool (token -> bool))
|
||||
optional arguments, default to #f, perform spell checking, case
|
||||
checking, and kind checking on incorrect tokens
|
||||
|
||||
>(seq sequence result name) -> (list token) -> parser-result
|
||||
The returned function accepts a term made up of a sequence of smaller
|
||||
terms, and produces an opaque value that interacts with other
|
||||
combinators.
|
||||
|
||||
sequence: (listof ((list token) -> parser-result)) - the subterms
|
||||
result: (list alpha) -> beta - create the ast node for this sequence.
|
||||
Input list matches length of sequence list
|
||||
name: human-language name for this term
|
||||
|
||||
>(choice options name) -> (list token) -> parser-result
|
||||
The returned function selects between different terms, and produces an
|
||||
opaque value that interacts with other combinators
|
||||
|
||||
options: (listof ((list token) -> parser-result) - the possible terms
|
||||
name: human-language name for this term
|
||||
|
||||
>(repeat term) -> (list token) -> parser-result
|
||||
The returned function accepts 0 or more instances of term, and produces
|
||||
an opaque value that interacts with other combinators
|
||||
|
||||
term: (list token) -> parser-result
|
||||
|
||||
>(parser term) -> (list token) location -> ast-or-error
|
||||
Returns a function that parses a list of tokens, producing either the
|
||||
result of calling all appropriate result functions or an err
|
||||
|
||||
term: (list token) -> parser-result
|
||||
location: string | editor
|
||||
Either the string representing the file name or the editor being read,
|
||||
typically retrieved from file-path
|
||||
ast-or-error: AST | err
|
||||
AST is the result of calling the given result function
|
||||
|
||||
The err structure is:
|
||||
>(make-err string source-list)
|
||||
|
||||
>(err-msg err) -> string
|
||||
The error message
|
||||
>(err-src err) -> (list location line-k col-k pos-k span-k)
|
||||
This list is suitable for calling raise-read-error,
|
||||
*-k are positive integers
|
||||
|
||||
The language forms provided are:
|
||||
>(define-simple-terminals NAME (simple-spec ...))
|
||||
Expands to a define-empty-tokens and one terminal definition per
|
||||
simple-spec
|
||||
|
||||
NAME is an identifier specifying a group of tokens
|
||||
|
||||
simple-spec = NAME | (NAME string) | (NAME proc) | (NAME string proc)
|
||||
NAME is an identifier specifying a token/terminal with no value
|
||||
proc: token -> ast - A procedure from tokens to AST nodes. id is used
|
||||
by default. The token will be a symbol.
|
||||
string is the human-language name for the terminal, NAME is used by
|
||||
default
|
||||
|
||||
>(define-terminals NAME (terminal-spec ...))
|
||||
Like define-simple-terminals, except uses define-tokens
|
||||
|
||||
terminal-spec = (NAME proc) | (NAME string proc)
|
||||
proc: token -> ast - a procedure from tokens to AST node.
|
||||
The token will be the token defined as NAME and will be a value token.
|
||||
|
||||
>(sequence (NAME ...) proc string)
|
||||
Generates a call to seq with the specified names in a list,
|
||||
proc => result and string => name.
|
||||
The name can be omitted when nested in another sequence or choose
|
||||
|
||||
>(sequence (NAME_ID ...) proc string)
|
||||
where NAME_ID is either NAME or (^ NAME)
|
||||
The ^ form identifies a parser production that can be used to identify
|
||||
this production in an error message. Otherwise the same as above
|
||||
|
||||
>(choose (NAME ...) string)
|
||||
Generates a call to choice using the given terms as the list of options,
|
||||
string => name.
|
||||
The name can be omitted when nested in another sequence or choose
|
||||
|
||||
>(eta NAME)
|
||||
Eta expands name with a wrapping that properly mimcs a parser term
|
||||
|
||||
The _error-format-parameters^_ signature requires five names:
|
||||
src?: boolean- will the lexer include source information
|
||||
input-type: string- used to identify the source of input
|
||||
show-options: boolean- presently ignored
|
||||
max-depth: int- The depth of errors reported
|
||||
max-choice-depth: int- The max number of options listed in an error
|
||||
|
||||
The _language-format-parameters^_ requires two names
|
||||
class-type: string - general term for language keywords
|
||||
input->output-name: token -> string - translates tokens into strings
|
||||
|
||||
The _language-dictionary^_ requires three names
|
||||
misspelled: string string -> boolean -
|
||||
check the spelling of the second arg against the first
|
||||
misscap: string string -> boolean -
|
||||
check the capitalization of the second arg against the first
|
||||
missclass: string string -> boolean -
|
||||
check if the second arg names a correct token kind
|
58
collects/parser-tools/examples/combinator-example.ss
Normal file
58
collects/parser-tools/examples/combinator-example.ss
Normal file
|
@ -0,0 +1,58 @@
|
|||
(module combinator-example mzscheme
|
||||
|
||||
(require (lib "unit.ss")
|
||||
(lib "lex.ss" "parser-tools")
|
||||
(lib "combinator-unit.ss" "parser-tools"))
|
||||
|
||||
(define-unit support
|
||||
(import)
|
||||
(export error-format-parameters^
|
||||
language-format-parameters^
|
||||
language-dictionary^)
|
||||
|
||||
(define src? #t)
|
||||
(define input-type "file")
|
||||
(define show-options #f)
|
||||
(define max-depth 1)
|
||||
(define max-choice-depth 2)
|
||||
|
||||
(define class-type "keyword")
|
||||
(define (input->output-name t) (token-name t))
|
||||
|
||||
(define (misspelled s1 s2)
|
||||
(and (equal? s1 "lam")
|
||||
(equal? s2 "lambda")))
|
||||
(define (misscap s1 s2)
|
||||
(and (equal? s1 "lam")
|
||||
(equal? s2 "Lam")))
|
||||
(define (missclass s1 s2) #f)
|
||||
)
|
||||
|
||||
(define-signature parser^ (parse-prog))
|
||||
|
||||
(define-unit lambda-calc
|
||||
(import combinator-parser^)
|
||||
(export parser^)
|
||||
|
||||
(define-simple-terminals keywords
|
||||
(lam (O_paren "(") (C_paren ")")))
|
||||
|
||||
(define-terminals ids
|
||||
((id "variable" string->symbol) (number (lambda (x) (read (open-input-string x))))))
|
||||
|
||||
(define app
|
||||
(sequence (O_paren (repeat (eta expr)) C_paren)
|
||||
(lambda (id) id)
|
||||
"application"))
|
||||
|
||||
(define func
|
||||
(sequence (O_paren lam O_paren (repeat id) (eta expr))
|
||||
(lambda (id) id)
|
||||
"function"))
|
||||
|
||||
(define expr (choose (id number app func) "expression"))
|
||||
|
||||
(define parse-prog (parser expr))
|
||||
)
|
||||
|
||||
)
|
|
@ -0,0 +1,92 @@
|
|||
(module combinator-parser (lib "lazy.ss" "lazy")
|
||||
|
||||
(require (lib "unit.ss")
|
||||
(lib "lex.ss" "parser-tools"))
|
||||
|
||||
(require "structs.scm" "parser-sigs.ss" "combinator.scm" "errors.scm")
|
||||
|
||||
(provide combinator-parser-tools@)
|
||||
|
||||
(define-unit main-parser@
|
||||
(import error^ out^ error-format-parameters^ language-format-parameters^)
|
||||
(export parser^)
|
||||
|
||||
(define (sort-used reses)
|
||||
(sort reses (lambda (a b) (> (res-used a) (res-used b)))))
|
||||
|
||||
(define (parser start)
|
||||
(lambda (input file)
|
||||
(let* ([result (start input)]
|
||||
[out
|
||||
(cond
|
||||
[(and (res? result) (res-a result) (null? (res-rest result)))
|
||||
(car (res-a (!!! result)))]
|
||||
[(and (res? result) (res-a result) (res-possible-error result))
|
||||
(fail-type->message (!!! (res-possible-error result)))]
|
||||
[(and (res? result) (res-a result))
|
||||
(make-err
|
||||
(format "Found extraneous input after ~a, starting with ~a, at the end of ~a."
|
||||
(!!! (res-msg result))
|
||||
(input->output-name (!!! (car (res-rest result)))) input-type)
|
||||
(and src?
|
||||
(make-src-lst (position-token-start-pos (!!! (car (res-rest result)))))))]
|
||||
[(res? result) (fail-type->message (res-msg (!!! result)))]
|
||||
[(or (choice-res? result) (pair? result))
|
||||
(let* ([options (if (choice-res? result) (choice-res-matches result) result)]
|
||||
[finished-options (filter (lambda (o)
|
||||
(cond [(res? o) (null? (res-rest o))]
|
||||
[(repeat-res? o)
|
||||
(eq? (repeat-res-stop o) 'out-of-input)]))
|
||||
options)]
|
||||
[possible-errors (filter res-possible-error
|
||||
(map (lambda (a) (if (repeat-res? a) (repeat-res-a a) a))
|
||||
options))])
|
||||
(cond
|
||||
[(not (null? finished-options)) (car (res-a (!!! (car finished-options))))]
|
||||
[(not (null? possible-errors))
|
||||
(!!! (fail-type->message
|
||||
(res-possible-error (!!! (car (sort-used possible-errors))))))]
|
||||
[else
|
||||
(let ([used-sort (sort-used options)])
|
||||
(make-err
|
||||
(format "Found additional content after ~a, begining with ~a."
|
||||
(!!! (res-msg (car used-sort)))
|
||||
(input->output-name (!!! (car (res-rest (car used-sort))))))
|
||||
(and src?
|
||||
(make-src-lst (position-token-start-pos
|
||||
(!!! (car (res-rest (car used-sort)))))))))]))]
|
||||
[(and (repeat-res? result) (eq? 'out-of-input (repeat-res-stop (!!! result))))
|
||||
(res-a (repeat-res-a result))]
|
||||
[(and (repeat-res? result) (fail-type? (repeat-res-stop (!!! result))))
|
||||
(!!! (fail-type->message (!!! (repeat-res-stop (!!! result)))))]
|
||||
[else (error 'parser (format "Internal error: recieved unexpected input ~a"
|
||||
(!!! result)))])])
|
||||
(cond
|
||||
[(err? out)
|
||||
(make-err (!!! (err-msg out))
|
||||
(cons file (!!list (err-src out))))]
|
||||
[else out]))))
|
||||
)
|
||||
|
||||
(define-unit rank-max@
|
||||
(import)
|
||||
(export ranking-parameters^)
|
||||
(define (rank-choice choices) (apply max choices)))
|
||||
|
||||
(define-unit out-struct@
|
||||
(import)
|
||||
(export out^)
|
||||
(define-struct err (msg src)))
|
||||
|
||||
(define-compound-unit/infer combinator-parser@
|
||||
(import error-format-parameters^ language-format-parameters^ language-dictionary^)
|
||||
(export combinator-parser-forms^ parser^ out^)
|
||||
(link out-struct@ main-parser@ rank-max@ error-formatting@ combinators@))
|
||||
|
||||
(define-unit/new-import-export combinator-parser-tools@
|
||||
(import error-format-parameters^ language-format-parameters^ language-dictionary^)
|
||||
(export combinator-parser^)
|
||||
((combinator-parser-forms^ parser^ out^) combinator-parser@ error-format-parameters^ language-format-parameters^
|
||||
language-dictionary^))
|
||||
|
||||
)
|
427
collects/parser-tools/private-combinator/combinator.scm
Normal file
427
collects/parser-tools/private-combinator/combinator.scm
Normal file
|
@ -0,0 +1,427 @@
|
|||
(module combinator (lib "lazy.ss" "lazy")
|
||||
|
||||
(require (lib "unit.ss")
|
||||
(only (lib "etc.ss") opt-lambda))
|
||||
|
||||
(require "structs.scm"
|
||||
"parser-sigs.ss"
|
||||
(lib "lex.ss" "parser-tools"))
|
||||
|
||||
(provide (all-defined))
|
||||
|
||||
(define-unit combinators@
|
||||
(import error-format-parameters^ ranking-parameters^ language-dictionary^)
|
||||
(export combinator-parser-forms^)
|
||||
|
||||
(define return-name "dummy")
|
||||
|
||||
;terminal: ('a -> bool 'a -> 'b string) -> ( (list 'a) -> res )
|
||||
(define terminal
|
||||
(opt-lambda (pred build name [spell? #f] [case? #f] [class? #f])
|
||||
(let* ([fail-str (string-append "failed " name)]
|
||||
[t-name
|
||||
(lambda (t) (if src? (token-name (position-token-token t)) (token-name t)))]
|
||||
[t-val
|
||||
(lambda (t) (if src? (token-value (position-token-token t)) (token-value t)))]
|
||||
[spell? (if spell? spell?
|
||||
(lambda (token)
|
||||
(when (position-token? token) (set! token (position-token-token token)))
|
||||
(and (token-value token)
|
||||
(misspelled name (token-value token)))))]
|
||||
[case? (if case? case?
|
||||
(lambda (token)
|
||||
(when (position-token? token) (set! token (position-token-token token)))
|
||||
(and (token-value token)
|
||||
(misscap name (token-value token)))))]
|
||||
[class? (if class? class?
|
||||
(lambda (token)
|
||||
(when (position-token? token) (set! token (position-token-token token)))
|
||||
(missclass name (token-name token))))]
|
||||
[make-fail
|
||||
(lambda (c n k i u)
|
||||
(make-terminal-fail c (if src?
|
||||
(make-src-lst (position-token-start-pos i)
|
||||
(position-token-end-pos i))
|
||||
null)
|
||||
n 0 u k (if src? (position-token-token i) i)))]
|
||||
[value (lambda (t) (or (t-val t) name))]
|
||||
[builder
|
||||
(if src?
|
||||
(lambda (token) (build (position-token-token token)
|
||||
(position-token-start-pos token)
|
||||
(position-token-end-pos token)))
|
||||
build)])
|
||||
|
||||
(opt-lambda (input [last-src (list 0 0 0 0)] [alts 1])
|
||||
#;(!!! (printf "terminal ~a~n" name))
|
||||
(cond
|
||||
[(eq? input return-name) name]
|
||||
[(null? input)
|
||||
(make-terminal-fail null last-src .4 0 0 'end #f)]
|
||||
[(pred (if src? (position-token-token (car input)) (car input)))
|
||||
(make-res (list (builder (car input))) (cdr input)
|
||||
name (value (car input)) 1 #f (car input))]
|
||||
[else
|
||||
#;(printf "Incorrect input for ~a : ~a miscase? ~a misspell? ~a ~n" name
|
||||
(cond
|
||||
[(and (position-token? (car input))
|
||||
(token-value (position-token-token (car input))))
|
||||
(token-value (position-token-token (car input)))]
|
||||
[(position-token? (car input))
|
||||
(token-name (position-token-token (car input)))]
|
||||
[else (car input)])
|
||||
(case? (car input))
|
||||
(spell? (car input)))
|
||||
(fail-res (cdr input)
|
||||
(let-values ([(chance kind may-use)
|
||||
(cond
|
||||
[(case? (car input)) (values 9/10 'misscase 1)]
|
||||
[(spell? (car input))
|
||||
(values 4/5 'misspell 1)]
|
||||
[(class? (car input)) (values 2/5 'missclass 1)]
|
||||
[else (values 1/5 'wrong 0)])])
|
||||
(make-fail chance name kind (car input) may-use)))])))))
|
||||
|
||||
;seq: ( (list ((list 'a) -> res)) ((list 'b) -> 'c) string -> ((list 'a) -> result)
|
||||
(define seq
|
||||
(opt-lambda (sub-list build name [id-position 0])
|
||||
(let* ([sequence-length (length sub-list)]
|
||||
[memo-table (make-hash-table 'weak)]
|
||||
[prev (lambda (x)
|
||||
(cond [(eq? x return-name) "default previous"]
|
||||
[else (fail-res null null)]))]
|
||||
[builder
|
||||
(lambda (r)
|
||||
(cond
|
||||
[(res? r)
|
||||
(make-res (list (build (res-a r))) (res-rest r)
|
||||
name (res-id r) (res-used r)
|
||||
(res-possible-error r)
|
||||
(res-first-tok r))]
|
||||
[(repeat-res? r)
|
||||
(make-res (list (build (res-a (repeat-res-a r))))
|
||||
(res-rest (repeat-res-a r))
|
||||
name (res-id (repeat-res-a r))
|
||||
(res-used (repeat-res-a r))
|
||||
(repeat-res-stop r)
|
||||
(res-first-tok (repeat-res-a r)))]))]
|
||||
[my-error (sequence-error-gen name sequence-length)]
|
||||
[my-walker (seq-walker id-position name my-error)])
|
||||
(opt-lambda (input [alts 1] [last-src (list 0 0 0 0)])
|
||||
#;(printf "seq ~a~n" name)
|
||||
(cond
|
||||
[(eq? input return-name) name]
|
||||
[(hash-table-get memo-table input #f) (hash-table-get memo-table input)]
|
||||
[(null? sub-list)
|
||||
(builder (make-res null input name #f 0 #f #f))]
|
||||
[else
|
||||
(let* ([pre-build-ans (my-walker sub-list input prev #f #f null 0 alts last-src)]
|
||||
[ans
|
||||
(cond
|
||||
[(and (res? pre-build-ans) (res-a pre-build-ans)) (builder pre-build-ans)]
|
||||
[(pair? pre-build-ans) (map builder pre-build-ans)]
|
||||
[else pre-build-ans])])
|
||||
(hash-table-put! memo-table input ans)
|
||||
#;(printf "sequence ~a returning ~n" name)
|
||||
ans)])))))
|
||||
|
||||
;seq-walker: int string error-gen -> [(list parser) (list alpha) parser result (U bool string) (list string) int int -> result
|
||||
(define (seq-walker id-position seq-name build-error)
|
||||
(letrec ([next-res
|
||||
(lambda (a id used tok rst)
|
||||
(cond
|
||||
[(res? rst)
|
||||
(make-res (append a (res-a rst)) (res-rest rst)
|
||||
seq-name (or id (res-id rst))
|
||||
(+ used (res-used rst)) (res-possible-error rst) tok)]
|
||||
[(repeat-res? rst)
|
||||
(make-res (append a (res-a (repeat-res-a rst)))
|
||||
(res-rest (repeat-res-a rst)) seq-name
|
||||
(or id (res-id (repeat-res-a rst)))
|
||||
(+ used (res-used (repeat-res-a rst)))
|
||||
(repeat-res-stop rst) tok)]))]
|
||||
[walker
|
||||
(lambda (subs input previous? look-back curr-id seen used alts last-src)
|
||||
(let* ([next-preds (cdr subs)]
|
||||
[curr-pred (car subs)]
|
||||
[id-spot? (= id-position (add1 (length seen)))]
|
||||
[next-call
|
||||
(lambda (old-result curr curr-name new-id tok alts)
|
||||
(let* ([old-answer (res-a old-result)]
|
||||
[rest (res-rest old-result)]
|
||||
[old-used (res-used old-result)]
|
||||
[rsts (walker next-preds rest curr-pred curr
|
||||
(or new-id curr-id) (cons curr-name seen)
|
||||
(+ old-used used) alts
|
||||
(make-src-lst (res-first-tok old-result)))])
|
||||
(cond
|
||||
[(and (res? rsts) (res-a rsts))
|
||||
(next-res old-answer new-id old-used tok rsts)]
|
||||
[(res? rsts) (fail-res rest (res-msg rsts))]
|
||||
[(pair? rsts)
|
||||
(map (lambda (rst) (next-res old-answer new-id old-used tok rst))
|
||||
rsts)])))])
|
||||
(cond
|
||||
[(null? next-preds)
|
||||
(build-error (curr-pred input last-src)
|
||||
(previous? input) (previous? return-name)
|
||||
look-back used curr-id seen alts last-src)]
|
||||
[else
|
||||
#;(printf "seq-walker called: else case~n")
|
||||
(let ([fst (curr-pred input last-src)])
|
||||
(cond
|
||||
[(res? fst)
|
||||
(cond
|
||||
[(res-a fst) (next-call fst fst (res-msg fst) (and id-spot? (res-id fst))
|
||||
(res-first-tok fst) alts)]
|
||||
[else
|
||||
(build-error fst (previous? input) (previous? return-name)
|
||||
look-back used curr-id seen alts last-src)])]
|
||||
[(repeat-res? fst) (next-call (repeat-res-a fst) fst
|
||||
(res-msg (repeat-res-a fst)) #f
|
||||
(res-first-tok (repeat-res-a fst)) alts)]
|
||||
[(or (choice-res? fst) (pair? fst))
|
||||
#;(printf "choice-res or pair: ~a ~a ~n"
|
||||
(if (choice-res? fst) (map res-rest (choice-res-matches fst)) fst)
|
||||
(if (choice-res? fst) (map res-a (choice-res-matches fst)) fst))
|
||||
(let*-values
|
||||
([(lst name curr)
|
||||
(if (choice-res? fst)
|
||||
(values (choice-res-matches fst)
|
||||
(lambda (_) (choice-res-name fst))
|
||||
(lambda (_) fst))
|
||||
(values fst res-msg (lambda (x) x)))]
|
||||
[(new-alts) (+ alts (length lst))]
|
||||
[(rsts)
|
||||
(map (lambda (res)
|
||||
(cond
|
||||
[(res? res)
|
||||
(next-call res (curr res) (name res) (and id-spot? (res-id res))
|
||||
(res-first-tok res) new-alts)]
|
||||
[(repeat-res? res)
|
||||
(next-call (repeat-res-a res) res
|
||||
(res-msg (repeat-res-a res)) #f
|
||||
(res-first-tok (repeat-res-a res)) new-alts)])) lst)]
|
||||
[(correct-rsts) (correct-list rsts)])
|
||||
#;(printf "correct-rsts ~a~n" (map res-a correct-rsts))
|
||||
#;(printf "rsts: ~a~n" (map res-a rsts))
|
||||
(cond
|
||||
[(null? correct-rsts)
|
||||
(let ([fails (map (lambda (rst)
|
||||
(res-msg
|
||||
(build-error rst (previous? input) (previous? return-name)
|
||||
look-back used curr-id seen alts last-src)))
|
||||
rsts)])
|
||||
(fail-res input
|
||||
(make-options-fail
|
||||
(rank-choice (map fail-type-chance fails)) #f seq-name
|
||||
(rank-choice (map fail-type-used fails))
|
||||
(rank-choice (map fail-type-may-use fails)) fails)))]
|
||||
[else correct-rsts]))]))])))])
|
||||
walker))
|
||||
|
||||
;get-fail-info: fail-type -> (values symbol 'a 'b)
|
||||
(define (get-fail-info fail)
|
||||
(cond
|
||||
[(terminal-fail? fail)
|
||||
(values (terminal-fail-kind fail)
|
||||
(fail-type-name fail)
|
||||
(terminal-fail-found fail))]
|
||||
[(sequence-fail? fail)
|
||||
(values 'sub-seq (sequence-fail-expected fail) fail)]
|
||||
[(choice-fail? fail) (values 'choice null fail)]
|
||||
[(options-fail? fail) (values 'options null fail)]))
|
||||
|
||||
;update-src: symbol src-list src-list token -> src-list
|
||||
(define (update-src error-kind src prev-src tok)
|
||||
(and src?
|
||||
(case error-kind
|
||||
[(choice options) prev-src]
|
||||
[(sub-seq misscase misspell end) src]
|
||||
[(missclass wrong) (update-src-start src (position-token-start-pos tok))])))
|
||||
|
||||
;build-sequence-error: result boolean result string int [U #f string] [listof string] int int -> result
|
||||
(define (sequence-error-gen name len)
|
||||
(lambda (old-res prev prev-name look-back used id seen alts last-src)
|
||||
(cond
|
||||
[(and (pair? old-res) (null? (cdr old-res))) (car old-res)]
|
||||
[(repeat-res? old-res)
|
||||
(cond
|
||||
[(fail-type? (repeat-res-stop old-res))
|
||||
(let ([res (repeat-res-a old-res)])
|
||||
(make-res (res-a res) (res-rest res) (res-msg res) (res-id res) (res-used res)
|
||||
(repeat-res-stop old-res) (res-first-tok res)))]
|
||||
[else (repeat-res-a old-res)])]
|
||||
[(or (and (res? old-res) (res-a old-res))
|
||||
(choice-res? old-res)
|
||||
(pair? old-res)) old-res]
|
||||
[else
|
||||
;There actually was an error
|
||||
(fail-res (res-rest old-res)
|
||||
(let*-values ([(fail)
|
||||
(cond
|
||||
[(and (repeat-res? look-back)
|
||||
(fail-type? (repeat-res-stop look-back))
|
||||
(> (fail-type-chance (repeat-res-stop look-back))
|
||||
(fail-type-chance (res-msg old-res))))
|
||||
(repeat-res-stop look-back)]
|
||||
[else (res-msg old-res)])]
|
||||
[(kind expected found) (get-fail-info fail)]
|
||||
[(new-src) (update-src kind
|
||||
(fail-type-src fail)
|
||||
last-src
|
||||
(res-first-tok old-res))]
|
||||
[(seen-len) (length seen)]
|
||||
[(updated-len) (+ (- used seen-len) len)])
|
||||
#;(printf "sequence ~a failed.~n seen ~a~n" name (reverse seen))
|
||||
#;(when (repeat-res? look-back)
|
||||
(printf "look-back ~a : ~a vs ~a : ~a > ~a~n"
|
||||
(repeat-res-stop look-back)
|
||||
(and (fail-type? (repeat-res-stop look-back)) (fail-type-name (repeat-res-stop look-back)))
|
||||
(fail-type-name (res-msg old-res))
|
||||
(and (fail-type? (repeat-res-stop look-back)) (fail-type-chance (repeat-res-stop look-back)))
|
||||
(fail-type-chance (res-msg old-res))))
|
||||
#;(printf "old-probability ~a new probability ~a~n"
|
||||
(cond
|
||||
[(= 0 used) (fail-type-chance fail)]
|
||||
[else (+ (* (fail-type-chance fail) (/ 1 updated-len))
|
||||
(/ used updated-len))])
|
||||
(compute-chance len seen-len used alts (fail-type-chance fail)))
|
||||
(make-sequence-fail
|
||||
(cond
|
||||
[(= 0 used) (fail-type-chance fail)]
|
||||
[else (compute-chance len seen-len used alts (fail-type-chance fail))])
|
||||
(fail-type-src fail)
|
||||
name used
|
||||
(+ used (fail-type-may-use fail))
|
||||
id kind (reverse seen) expected found (and (res? prev) (res-a prev) (res-msg prev))
|
||||
prev-name)))])))
|
||||
|
||||
(define (compute-chance expected-length seen-length used-toks num-alts sub-chance)
|
||||
(let* ([revised-expectation (+ (- used-toks seen-length) expected-length)]
|
||||
[probability-with-sub (* (/ (add1 used-toks) revised-expectation) (/ 1 num-alts))]
|
||||
[probability-without-sub (* (/ used-toks revised-expectation) (/ 1 num-alts))]
|
||||
[expected-sub probability-with-sub]
|
||||
[expected-no-sub probability-without-sub]
|
||||
[probability (/ (* expected-sub sub-chance) (+ (* expected-sub sub-chance)
|
||||
(* expected-no-sub (- 1 sub-chance))))])
|
||||
#;(printf "compute-chance: args ~a ~a ~a ~a ~a~n"
|
||||
expected-length seen-length used-toks num-alts sub-chance)
|
||||
#;(printf "compute-chance: intermediate values: ~a ~a ~a ~a ~a~n"
|
||||
revised-expectation probability-with-sub probability-without-sub expected-sub expected-no-sub)
|
||||
probability))
|
||||
|
||||
;repeat: (list 'a) -> result -> (list 'a) -> result
|
||||
(define (repeat sub)
|
||||
(letrec ([repeat-name (string-append "any number of " (sub return-name))]
|
||||
[memo-table (make-hash-table 'weak)]
|
||||
[process-rest
|
||||
(lambda (curr-ans rest-ans)
|
||||
(cond
|
||||
[(repeat-res? rest-ans)
|
||||
(let ([a (res-a curr-ans)]
|
||||
[rest (repeat-res-a rest-ans)])
|
||||
(make-repeat-res
|
||||
(cond
|
||||
[(res? rest)
|
||||
(make-res (append a (res-a rest)) (res-rest rest) repeat-name ""
|
||||
(+ (res-used curr-ans) (res-used rest)) (res-possible-error rest)
|
||||
(res-first-tok curr-ans))]
|
||||
[(and (pair? rest) (null? (cdr rest)))
|
||||
(make-res (append a (res-a (car rest))) (res-rest (car rest)) repeat-name ""
|
||||
(+ (res-used curr-ans) (res-used (car rest)))
|
||||
(res-possible-error (car rest))
|
||||
(res-first-tok curr-ans))]
|
||||
[(pair? rest)
|
||||
(correct-list
|
||||
(map (lambda (rs)
|
||||
(make-res (append a (res-a rs)) (res-rest rs) repeat-name ""
|
||||
(+ (res-used curr-ans) (res-used rs))
|
||||
(res-possible-error rs)
|
||||
(res-first-tok curr-ans)))
|
||||
rest))])
|
||||
(repeat-res-stop rest-ans)))]
|
||||
[(pair? rest-ans)
|
||||
(map (lambda (r) (process-rest curr-ans r)) rest-ans)]))])
|
||||
(opt-lambda (input [last-src (list 0 0 0 0)] [alts 1])
|
||||
(cond
|
||||
[(eq? input return-name) repeat-name]
|
||||
[(hash-table-get memo-table input #f) (hash-table-get memo-table input)]
|
||||
[else
|
||||
(let ([ans
|
||||
(let loop ([curr-input input])
|
||||
(cond
|
||||
[(null? curr-input)
|
||||
(make-repeat-res (make-res null null repeat-name "" 0 #f #f) 'out-of-input)]
|
||||
[else
|
||||
(let ([this-res (sub curr-input last-src)])
|
||||
#;(printf "Repeat of ~a called it's repeated entity: ~a~n" repeat-name this-res)
|
||||
(cond
|
||||
[(and (res? this-res) (res-a this-res))
|
||||
(process-rest this-res (loop (res-rest this-res)))]
|
||||
[(res? this-res)
|
||||
(make-repeat-res (make-res null curr-input repeat-name "" 0 #f #f) (res-msg this-res))]
|
||||
[(or (choice-res? this-res) (pair? this-res))
|
||||
(map (lambda (match) (process-rest match (loop (res-rest match))))
|
||||
(if (choice-res? this-res) (choice-res-matches this-res) this-res))]))]))])
|
||||
(hash-table-put! memo-table input ans)
|
||||
#;(printf "repeat of ~a ended with ans ~a~n" repeat-name ans)
|
||||
ans)]))))
|
||||
|
||||
;choice: [list [[list 'a ] -> result]] name -> result
|
||||
(define (choice opt-list name)
|
||||
(let ([memo-table (make-hash-table 'weak)]
|
||||
[num-choices (length opt-list)]
|
||||
[choice-names (map (lambda (o) (o return-name)) opt-list)])
|
||||
(opt-lambda (input [last-src (list 0 0 0 0)] [alts 1])
|
||||
#;(printf "choice ~a~n" name)
|
||||
(let ([sub-opts (sub1 (+ alts num-choices))])
|
||||
(cond
|
||||
[(hash-table-get memo-table input #f) (hash-table-get memo-table input)]
|
||||
[(eq? input return-name) name]
|
||||
[else
|
||||
(let* ([options (map (lambda (term) (term input last-src sub-opts)) opt-list)]
|
||||
[fails (map res-msg options)]
|
||||
[corrects (correct-list options)]
|
||||
[ans
|
||||
(cond
|
||||
[(null? corrects)
|
||||
(fail-res input
|
||||
(make-choice-fail (rank-choice (map fail-type-chance fails)) #f name
|
||||
(rank-choice (map fail-type-used fails))
|
||||
(rank-choice (map fail-type-may-use fails))
|
||||
num-choices choice-names fails))]
|
||||
[(null? (cdr corrects)) (car corrects)]
|
||||
[else (make-choice-res name corrects)])])
|
||||
#;(printf "choice ~a is returning ~a options were ~a ~n" name ans choice-names)
|
||||
(hash-table-put! memo-table input ans) ans)])))))
|
||||
|
||||
;correct-list: (list result) -> (list result)
|
||||
(define (correct-list subs)
|
||||
(cond
|
||||
[(pair? subs)
|
||||
(cond
|
||||
[(and (res? (car subs)) (res-a (car subs)))
|
||||
(cons (car subs) (correct-list (cdr subs)))]
|
||||
[(choice-res? (car subs))
|
||||
(append (choice-res-matches (car subs)) (correct-list (cdr subs)))]
|
||||
[(repeat-res? (car subs))
|
||||
(correct-list (cons (repeat-res-a (car subs)) (cdr subs)))]
|
||||
[(pair? (car subs))
|
||||
(append (car subs) (correct-list (cdr subs)))]
|
||||
[else (correct-list (cdr subs))])]
|
||||
[(null? subs) null]))
|
||||
|
||||
(define (update-src-start src new-start)
|
||||
(list (position-line new-start)
|
||||
(position-col new-start)
|
||||
(position-offset new-start)
|
||||
(+ (- (third src) (position-offset new-start))
|
||||
(fourth src))))
|
||||
|
||||
(define (update-src-end src new-end)
|
||||
(list (first src) (second src) (third src)
|
||||
(- (position-offset new-end) (third src))))
|
||||
|
||||
)
|
||||
)
|
261
collects/parser-tools/private-combinator/errors.scm
Normal file
261
collects/parser-tools/private-combinator/errors.scm
Normal file
|
@ -0,0 +1,261 @@
|
|||
(module errors mzscheme
|
||||
|
||||
(require "structs.scm" "parser-sigs.ss")
|
||||
|
||||
(require (lib "force.ss" "lazy")
|
||||
(lib "etc.ss")
|
||||
(lib "unit.ss")
|
||||
(lib "list.ss"))
|
||||
|
||||
(provide (all-defined))
|
||||
|
||||
(define-unit error-formatting@
|
||||
(import error-format-parameters^ language-format-parameters^ out^)
|
||||
(export (rename error^ (public-fail-type->message fail-type->message)))
|
||||
|
||||
;public-fail-type->message : fail-type -> err
|
||||
(define (public-fail-type->message fail)
|
||||
(fail-type->message fail null))
|
||||
|
||||
;fail-type->message: fail-type (listof err) -> err
|
||||
(define (fail-type->message fail-type message-to-date)
|
||||
(let* ([fail-type (!!!-fail fail-type)]
|
||||
[input->output-name (!!! input->output-name)]
|
||||
[name (fail-type-name fail-type)]
|
||||
[a (a/an name)]
|
||||
[msg (lambda (m) (make-err m (fail-type-src fail-type)))])
|
||||
#;(printf "fail-type->message ~a~n" fail-type)
|
||||
(cond
|
||||
[(terminal-fail? fail-type)
|
||||
(combine-message
|
||||
(msg
|
||||
(case (terminal-fail-kind fail-type)
|
||||
[(end) (format "Expected to find ~a ~a, but ~a ended prematurely."
|
||||
a name input-type)]
|
||||
[(wrong) (format "Expected to find ~a ~a, but instead found ~a, which is illegal here."
|
||||
a name (input->output-name (terminal-fail-found fail-type)))]
|
||||
[(misscase) (format "Expected to find ~a ~a, found ~a which may be miscapitalized."
|
||||
a name (input->output-name (terminal-fail-found fail-type)))]
|
||||
[(misspell) (format "Expected to find ~a ~a, found ~a which seems to be misspelled."
|
||||
a name (input->output-name (terminal-fail-found fail-type)))]
|
||||
[(missclass) (format "Found ~a instead of ~a ~a, a ~a cannot be used as ~a ~a."
|
||||
(input->output-name (terminal-fail-found fail-type)) a name class-type a name)]))
|
||||
message-to-date)]
|
||||
[(sequence-fail? fail-type)
|
||||
(let* ([id-name
|
||||
(if (sequence-fail-id fail-type)
|
||||
(string-append name " " (sequence-fail-id fail-type)) name)]
|
||||
[expected (sequence-fail-expected fail-type)]
|
||||
[a2 (a/an expected)]
|
||||
[show-sequence (sequence-fail-correct fail-type)])
|
||||
(case (sequence-fail-kind fail-type)
|
||||
[(end)
|
||||
(combine-message
|
||||
(msg (format "Expected ~a to contain ~a ~a to complete the ~a. ~nFound ~a before ~a ended."
|
||||
input-type a2 expected id-name (format-seen show-sequence) input-type)) message-to-date)]
|
||||
[(wrong)
|
||||
(combine-message
|
||||
(msg
|
||||
(cond
|
||||
[(sequence-fail-repeat? fail-type)
|
||||
(format "Found a repitition of ~a; the required number are present. Expected ~a ~a next."
|
||||
(sequence-fail-last-seen fail-type) a2 expected)]
|
||||
[(null? show-sequence)
|
||||
(format "Expected ~a ~a to begin this ~a, instead found ~a."
|
||||
a2 expected id-name (input->output-name (sequence-fail-found fail-type)))]
|
||||
[else
|
||||
(format "Expected ~a ~a to continue this ~a. Instead, found ~a after ~a."
|
||||
a2 expected id-name (input->output-name (sequence-fail-found fail-type))
|
||||
(format-seen show-sequence))]))
|
||||
message-to-date)]
|
||||
[(misscase)
|
||||
(combine-message
|
||||
(msg (format "Expected to find ~a ~a to continue this ~a, found ~a which may be miscapitalized."
|
||||
a2 expected id-name (input->output-name (sequence-fail-found fail-type))))
|
||||
message-to-date)]
|
||||
[(misspell)
|
||||
(combine-message
|
||||
(msg (format "Expected to find ~a ~a to continue this ~a, found ~a which seems to be misspelled."
|
||||
a2 expected id-name (input->output-name (sequence-fail-found fail-type))))
|
||||
message-to-date)]
|
||||
[(missclass)
|
||||
(combine-message
|
||||
(msg (format "Found ~a instead of ~a ~a, a ~a cannot be used as a(n) ~a."
|
||||
(input->output-name (sequence-fail-found fail-type)) a2 expected class-type expected))
|
||||
message-to-date)]
|
||||
[(sub-seq choice)
|
||||
(fail-type->message (sequence-fail-found fail-type)
|
||||
(add-to-message (msg (format "An error occured in ~a.~n" id-name)) name message-to-date))]
|
||||
[(options)
|
||||
(let ([sorted-opts (sort (options-fail-opts (sequence-fail-found fail-type))
|
||||
(lambda (a b) (>= (fail-type-chance a) (fail-type-chance b))))])
|
||||
(fail-type->message (car sorted-opts)
|
||||
(add-to-message
|
||||
(msg (format "There is an error in this ~a after ~a, it is likely you intended a(n) ~a here.~n"
|
||||
id-name (car (reverse show-sequence)) (fail-type-name (car sorted-opts))))
|
||||
name message-to-date)))]))]
|
||||
[(options-fail? fail-type)
|
||||
#;(printf "selecting for options on ~a~n" name)
|
||||
(let* ([winners (select-errors (options-fail-opts fail-type))]
|
||||
[top-names (map fail-type-name winners)]
|
||||
[non-dup-tops (remove-dups top-names name)]
|
||||
[top-name (car top-names)])
|
||||
(fail-type->message
|
||||
(car winners)
|
||||
(add-to-message
|
||||
(msg
|
||||
(cond
|
||||
[(and (> (length winners) 1) (> (length non-dup-tops) 1))
|
||||
(format "There is an error in this ~a. It is likely you intended one of ~a here.~n"
|
||||
name (nice-list non-dup-tops))]
|
||||
[else
|
||||
(format "There is an error in this ~a~a.~n"
|
||||
name
|
||||
(if (equal? top-name name) ""
|
||||
(format ", it is likely you intended ~a ~a here" (a/an top-name) top-name)))]))
|
||||
name message-to-date)))]
|
||||
[(choice-fail? fail-type)
|
||||
#;(printf "selecting for ~a~n" name)
|
||||
(let* ([winners (select-errors (choice-fail-messages fail-type))]
|
||||
[top-names (map fail-type-name winners)]
|
||||
[top-name (car top-names)]
|
||||
[no-dup-names (remove-dups (choice-fail-names fail-type) name)])
|
||||
(fail-type->message
|
||||
(car winners)
|
||||
(add-to-message
|
||||
(msg (cond
|
||||
[(and (<= (choice-fail-options fail-type) max-choice-depth)
|
||||
(> (length no-dup-names) 1)
|
||||
(> (length winners) 1)
|
||||
(equal? top-names no-dup-names))
|
||||
(format "An error occured in this ~a, one of ~a is expected here."
|
||||
name (nice-list no-dup-names))]
|
||||
[(and (<= (choice-fail-options fail-type) max-choice-depth)
|
||||
(> (length no-dup-names) 1)
|
||||
(> (length winners) 1))
|
||||
(format "An error occured in this ~a, one of ~a is expected here. Input is close to one of ~a.~n"
|
||||
name (nice-list no-dup-names) (nice-list top-names))]
|
||||
[(and (<= (choice-fail-options fail-type) max-choice-depth)
|
||||
(> (length no-dup-names) 1))
|
||||
(format "An error occured in this ~a, one of ~a is expected here. Current input is close to ~a.~a~n"
|
||||
name (nice-list no-dup-names) top-name
|
||||
(if show-options " To see all options click here." ""))] ;Add support for formatting and passing up all options
|
||||
[else
|
||||
(format "An error occured in this ~a~a.~a~n"
|
||||
name
|
||||
(if (equal? name top-name) "" (format ", it is likely that you intended ~a ~a here"
|
||||
(a/an top-name) top-name))
|
||||
(if show-options " To see all options click here." ""))]))
|
||||
name message-to-date)))])))
|
||||
|
||||
(define (chance-used a) (* (fail-type-chance a) (fail-type-used a)))
|
||||
(define (chance-may-use a) (* (fail-type-chance a) (fail-type-may-use a)))
|
||||
(define (chance a) (fail-type-chance a))
|
||||
(define (composite a)
|
||||
(/ (+ (chance-used a) (chance-may-use a) (chance a)) 3))
|
||||
|
||||
(define (narrow-opts rank options)
|
||||
(get-ties (sort options (lambda (a b) (> (rank a) (rank b)))) rank))
|
||||
|
||||
(define (select-errors opts-list)
|
||||
(let* ([composite-winners
|
||||
(narrow-opts composite (!!list opts-list))]
|
||||
|
||||
[chance-used-winners
|
||||
(narrow-opts chance-used composite-winners)]
|
||||
|
||||
[chance-may-winners
|
||||
(narrow-opts chance-may-use chance-used-winners)]
|
||||
|
||||
[winners (narrow-opts chance chance-may-winners)])
|
||||
#;(printf "all options: ~a~n" (!!list opts-list))
|
||||
#;(printf "~a ~a ~a ~a ~n"
|
||||
(map fail-type-name (map !!! (!!list opts-list)))
|
||||
(map !!! (map fail-type-chance (!!list opts-list)))
|
||||
(map !!! (map fail-type-used (!!list opts-list)))
|
||||
(map !!! (map fail-type-may-use (!!list opts-list))))
|
||||
#;(printf "composite round: ~a ~a ~n"
|
||||
(map fail-type-name (map !!! composite-winners))
|
||||
(map composite (map !!! composite-winners)))
|
||||
#;(printf "final sorting: ~a~n" (map fail-type-name (map !!! winners)))
|
||||
winners))
|
||||
|
||||
(define (first-n n lst)
|
||||
(let loop ([count 0] [l lst])
|
||||
(cond
|
||||
[(= count n) null]
|
||||
[else (cons (car lst) (loop (add1 count) (cdr lst)))])))
|
||||
|
||||
(define (get-ties lst evaluate)
|
||||
(letrec ([getter
|
||||
(lambda (sub)
|
||||
(cond
|
||||
[(null? sub) null]
|
||||
[(>= (- (evaluate (car lst)) (evaluate (car sub))) .0001) null]
|
||||
[else (cons (car sub) (getter (cdr sub)))]))])
|
||||
(cons (car lst) (getter (cdr lst)))))
|
||||
|
||||
(define (a/an next-string)
|
||||
(if (string? next-string)
|
||||
(if (member (substring next-string 0 1) `("a" "e" "i" "o" "u"))
|
||||
"an" "a")
|
||||
"a"))
|
||||
|
||||
(define (format-seen l)
|
||||
(if (null? l)
|
||||
""
|
||||
(string-append "'"
|
||||
(car l)
|
||||
(apply string-append
|
||||
(map (lambda (i) (string-append " " i)) (cdr l)))
|
||||
"'")))
|
||||
|
||||
(define (nice-list l)
|
||||
(letrec ([formatter
|
||||
(lambda (l)
|
||||
(cond
|
||||
[(null? (cdr l)) (string-append "or " (car l))]
|
||||
[else (string-append (car l) ", " (formatter (cdr l)))]))])
|
||||
(cond
|
||||
[(null? l) (error 'internal-error "nice-list in combinator-parser/errors.scm recieved null list")]
|
||||
[(null? (cdr l)) (car l)]
|
||||
[(null? (cddr l)) (string-append (car l) " or " (cadr l))]
|
||||
[else (formatter l)])))
|
||||
|
||||
(define (downcase string)
|
||||
(string-append (string-downcase (substring string 0 1))
|
||||
(substring string 1 (string-length string))))
|
||||
|
||||
(define (remove-dups l n)
|
||||
(cond
|
||||
[(null? l) null]
|
||||
[(equal? (car l) n)
|
||||
(remove-dups (cdr l) n)]
|
||||
[(member (car l) (cdr l))
|
||||
(remove-dups (cdr l) n)]
|
||||
[else (cons (car l) (remove-dups (cdr l) n))]))
|
||||
|
||||
(define-struct ms (who say))
|
||||
|
||||
;add-to-message: err string (list err) -> (list err)
|
||||
(define (add-to-message msg name rest)
|
||||
(let ([next (make-ms name msg)])
|
||||
(cond
|
||||
[(null? rest) (list next)]
|
||||
[(equal? (ms-who (car rest)) name) (cons next (cdr rest))]
|
||||
[(< (length rest) max-depth) (cons next rest)]
|
||||
[else (cons next (first-n (sub1 max-depth) rest))])))
|
||||
|
||||
;combine-message: err (list ms) -> err
|
||||
(define (combine-message end-msg messages)
|
||||
(cond
|
||||
[(null? messages) end-msg]
|
||||
[else
|
||||
(combine-message
|
||||
(make-err (string-append (err-msg (ms-say (car messages)))
|
||||
(err-msg end-msg))
|
||||
(err-src end-msg))
|
||||
(cdr messages))]))
|
||||
|
||||
)
|
||||
)
|
2
collects/parser-tools/private-combinator/info.ss
Normal file
2
collects/parser-tools/private-combinator/info.ss
Normal file
|
@ -0,0 +1,2 @@
|
|||
(module info (lib "infotab.ss" "setup")
|
||||
(define name "Parser-tools private-combinator"))
|
193
collects/parser-tools/private-combinator/parser-sigs.ss
Normal file
193
collects/parser-tools/private-combinator/parser-sigs.ss
Normal file
|
@ -0,0 +1,193 @@
|
|||
(module parser-sigs mzscheme
|
||||
|
||||
(require (lib "unit.ss"))
|
||||
|
||||
(require (only (lib "etc.ss") opt-lambda)) ; Required for expansion
|
||||
(require (lib "lex.ss" "parser-tools")
|
||||
(lib "string.ss") (lib "list.ss"))
|
||||
|
||||
(provide (all-defined))
|
||||
|
||||
(define-signature-form (terminals stx)
|
||||
(syntax-case stx ()
|
||||
[(_ group (elt ...))
|
||||
(and (identifier? #'group)
|
||||
(andmap identifier? (syntax->list #'(elt ...))))
|
||||
(syntax->list #`(elt ...
|
||||
#,@(map (lambda (e)
|
||||
(datum->syntax-object e
|
||||
(string->symbol
|
||||
(format "token-~a" (syntax-e e)))))
|
||||
(syntax->list #'(elt ...)))))]))
|
||||
|
||||
(define-for-syntax (insert-name stx name)
|
||||
(let loop ([term stx]
|
||||
[pos 0]
|
||||
[id-pos 0]
|
||||
[terms null])
|
||||
(syntax-case term (sequence choose ^)
|
||||
[((sequence a b) . rest)
|
||||
(loop (syntax rest) (add1 pos) id-pos
|
||||
(cons (quasisyntax (sequence a b #,name)) terms))]
|
||||
[((choose a) . rest)
|
||||
(loop (syntax rest) (add1 pos) id-pos
|
||||
(cons (quasisyntax (choose a #,name)) terms))]
|
||||
[((^ a) . rest)
|
||||
(loop (syntax (a . rest))
|
||||
pos (add1 pos) terms)]
|
||||
[(a . rest)
|
||||
(loop (syntax rest) (add1 pos) id-pos (cons (syntax a) terms))]
|
||||
[() (list (reverse terms) id-pos)])))
|
||||
|
||||
(define-signature language-dictionary^ (misspelled misscap missclass))
|
||||
|
||||
(define-signature combinator-parser-forms^
|
||||
(terminal choice seq repeat
|
||||
(define-syntaxes (define-simple-terminals)
|
||||
(values
|
||||
(lambda (stx)
|
||||
(syntax-case stx ()
|
||||
((_ group elts)
|
||||
(let ([name-string-thunks
|
||||
(let loop ([elt-list (syntax elts)])
|
||||
(syntax-case elt-list (lambda)
|
||||
[() null]
|
||||
[(id . rest)
|
||||
(identifier? (syntax id))
|
||||
(cons (list (syntax id)
|
||||
(syntax (symbol->string (quote id)))
|
||||
`(lambda (x . args) x))
|
||||
(loop (syntax rest)))]
|
||||
[((id name) . rest)
|
||||
(and (identifier? (syntax id)) (string? (syntax-e (syntax name))))
|
||||
(cons (list (syntax id)
|
||||
(syntax name)
|
||||
`(lambda (x . args) x))
|
||||
(loop (syntax rest)))]
|
||||
[((id thunk) . rest)
|
||||
(and (identifier? (syntax id)) (identifier? (syntax thunk)))
|
||||
(cons (list (syntax id)
|
||||
(syntax (symbol->string (quote id)))
|
||||
(syntax thunk))
|
||||
(loop (syntax rest)))]
|
||||
[((id (lambda x body ...)) . rest)
|
||||
(identifier? (syntax id))
|
||||
(cons (list (syntax id)
|
||||
(syntax (symbol->string (quote id)))
|
||||
(syntax (lambda x body ...)))
|
||||
(loop (syntax rest)))]
|
||||
[((id name thunk) . rest)
|
||||
(and (identifier? (syntax id)) (string? (syntax-e (syntax name))))
|
||||
(cons (list (syntax id)
|
||||
(syntax name)
|
||||
(syntax thunk))
|
||||
(loop (syntax rest)))]))])
|
||||
(with-syntax ([(id ...) (map car name-string-thunks)]
|
||||
[(name ...) (map cadr name-string-thunks)]
|
||||
[(thunk ...) (map caddr name-string-thunks)])
|
||||
(syntax
|
||||
(begin
|
||||
(define-empty-tokens group (id ...))
|
||||
(define id
|
||||
(terminal
|
||||
(lambda (token) (eq? (token-name token) (quote id)))
|
||||
thunk
|
||||
name)) ...)))))))))
|
||||
|
||||
(define-syntaxes (define-terminals)
|
||||
(values
|
||||
(lambda (stx)
|
||||
(syntax-case stx ()
|
||||
[(_ group elts)
|
||||
(identifier? (syntax group))
|
||||
(let ([name-string-thunks
|
||||
(let loop ([elt-list (syntax elts)])
|
||||
(syntax-case elt-list (lambda)
|
||||
[() null]
|
||||
[((id (lambda (arg1 ...) body ...)) . rest)
|
||||
(identifier? (syntax id))
|
||||
(cons (list (syntax id)
|
||||
(syntax (symbol->string (quote id)))
|
||||
(syntax (lambda (arg1 ...) body ...)))
|
||||
(loop (syntax rest)))]
|
||||
[((id thunk) . rest)
|
||||
(and (identifier? (syntax id)) (identifier? (syntax thunk)))
|
||||
(cons (list (syntax id)
|
||||
(syntax (symbol->string (quote id)))
|
||||
(syntax thunk))
|
||||
(loop (syntax rest)))]
|
||||
[((id name thunk) . rest)
|
||||
(cons (list (syntax id)
|
||||
(syntax name)
|
||||
(syntax thunk))
|
||||
(loop (syntax rest)))]))])
|
||||
(with-syntax ([(id ...) (map car name-string-thunks)]
|
||||
[(name ...) (map cadr name-string-thunks)]
|
||||
[(thunk ...) (map caddr name-string-thunks)])
|
||||
(syntax
|
||||
(begin
|
||||
(define-tokens group (id ...))
|
||||
(define id
|
||||
(terminal
|
||||
(lambda (token) (eq? (token-name token) (quote id)))
|
||||
(lambda (x . args)
|
||||
(if (null? args)
|
||||
(thunk (token-value x))
|
||||
(thunk (token-value x) (car args) (cadr args))))
|
||||
name
|
||||
(lambda (token) #f)
|
||||
(lambda (token) #f))) ...))))]))))
|
||||
|
||||
(define-syntaxes (sequence)
|
||||
(values
|
||||
(lambda (stx)
|
||||
(syntax-case stx (^)
|
||||
[(_ (term ...) proc)
|
||||
(syntax
|
||||
(seq (list term ...) proc (symbol->string (gensym 'seq))))]
|
||||
[(_ terms proc name)
|
||||
(let ([new-terms (insert-name (syntax terms) (syntax name))])
|
||||
(with-syntax (((term ...) (car new-terms))
|
||||
(id-pos (cadr new-terms)))
|
||||
(syntax (seq (list term ...) proc name id-pos))))]))))
|
||||
|
||||
(define-syntaxes(choose)
|
||||
(values
|
||||
(lambda (stx)
|
||||
(syntax-case stx ()
|
||||
[(_ (term ...))
|
||||
(syntax
|
||||
(choice (list term ...) (symbol->string (gensym 'choice))))]
|
||||
[(_ terms name)
|
||||
(with-syntax (((term ...) [car (insert-name (syntax terms) (syntax name))]))
|
||||
(syntax
|
||||
(choice (list term ...) name)))]))))
|
||||
|
||||
(define-syntaxes (^)
|
||||
(values
|
||||
(syntax-rules ()
|
||||
[(_ f) f])))
|
||||
|
||||
(define-syntaxes (eta)
|
||||
(values (syntax-rules ()
|
||||
[(_ f)
|
||||
(opt-lambda (x [c 1]) (f x c))])))
|
||||
))
|
||||
|
||||
(define-signature parser^ (parser))
|
||||
(define-signature out^ ((struct err (msg src))))
|
||||
|
||||
(define-signature language-format-parameters^ (class-type input->output-name))
|
||||
|
||||
(define-signature error-format-parameters^
|
||||
(src? input-type show-options max-depth max-choice-depth))
|
||||
|
||||
(define-signature ranking-parameters^ (rank-choice))
|
||||
|
||||
(define-signature error^ (fail-type->message))
|
||||
|
||||
(define-signature combinator-parser^ extends combinator-parser-forms^
|
||||
(parser (struct err (msg src))))
|
||||
|
||||
)
|
||||
|
76
collects/parser-tools/private-combinator/structs.scm
Normal file
76
collects/parser-tools/private-combinator/structs.scm
Normal file
|
@ -0,0 +1,76 @@
|
|||
(module structs mzscheme
|
||||
|
||||
(provide (all-defined-except make-fail-type))
|
||||
|
||||
(require (lib "force.ss" "lazy")
|
||||
(lib "lex.ss" "parser-tools"))
|
||||
|
||||
;fail-src: (list line col pos span loc)
|
||||
|
||||
;make-src-lst: position position -> src-list
|
||||
(define (make-src-lst start end)
|
||||
(list (position-line start)
|
||||
(position-col start)
|
||||
(position-offset start)
|
||||
(- (position-offset end)
|
||||
(position-offset start))))
|
||||
|
||||
;(make-fail-type float fail-src string int int)
|
||||
(define-struct fail-type (chance src name used may-use) (make-inspector))
|
||||
;(make-terminal-fail float fail-src string symbol 'a)
|
||||
(define-struct (terminal-fail fail-type) (kind found))
|
||||
;(make-sequence-fail float fail-src string symbol (list string) string 'a boolean string)
|
||||
(define-struct (sequence-fail fail-type) (id kind correct expected found repeat? last-seen))
|
||||
;(make-choice-fail float fail-src string int (list string) (list fail-type))
|
||||
(define-struct (choice-fail fail-type) (options names messages) (make-inspector))
|
||||
;(make-options-fail float #f #f (list fail-type))
|
||||
(define-struct (options-fail fail-type) (opts))
|
||||
|
||||
(define (!!!-fail fail)
|
||||
(let*-values ([(chance src name used may-use)
|
||||
(if (fail-type? fail)
|
||||
(values (!!! (fail-type-chance fail))
|
||||
(!!! (fail-type-src fail))
|
||||
(!!! (fail-type-name fail))
|
||||
(!!! (fail-type-used fail))
|
||||
(!!! (fail-type-may-use fail)))
|
||||
(values #f #f #f #f #f))])
|
||||
(cond
|
||||
[(terminal-fail? fail)
|
||||
(make-terminal-fail chance src name used may-use
|
||||
(!!! (terminal-fail-kind fail))
|
||||
(!!! (terminal-fail-found fail)))]
|
||||
[(sequence-fail? fail)
|
||||
(make-sequence-fail chance src name used may-use
|
||||
(!!! (sequence-fail-id fail))
|
||||
(!!! (sequence-fail-kind fail))
|
||||
(!!! (sequence-fail-correct fail))
|
||||
(!!! (sequence-fail-expected fail))
|
||||
(!!!-fail (sequence-fail-found fail))
|
||||
(!!! (sequence-fail-repeat? fail))
|
||||
(!!! (sequence-fail-last-seen fail)))]
|
||||
[(choice-fail? fail)
|
||||
(make-choice-fail chance src name used may-use
|
||||
(!!! (choice-fail-options fail))
|
||||
(!!! (choice-fail-names fail))
|
||||
(map !!!-fail (!!! (choice-fail-messages fail))))]
|
||||
[(options-fail? fail)
|
||||
(make-options-fail chance src name used may-use
|
||||
(map !!!-fail (!!! (options-fail-opts fail))))]
|
||||
[else (!!! fail)])))
|
||||
|
||||
|
||||
|
||||
|
||||
;result = res | choice-res | repeat-res | (listof (U res choice-res))
|
||||
|
||||
;(make-res (U #f (listof 'b)) (listof 'a) (U string fail-type) (U string 'a) int) [U #f fail-type] token
|
||||
(define-struct res (a rest msg id used possible-error first-tok) (make-inspector))
|
||||
;make-choice-res string (listof res)
|
||||
(define-struct choice-res (name matches))
|
||||
;(make-repeat-res answer (U symbol fail-type))
|
||||
(define-struct repeat-res (a stop) (make-inspector))
|
||||
|
||||
(define (fail-res rst msg) (make-res #f rst msg "" 0 #f #f))
|
||||
|
||||
)
|
Loading…
Reference in New Issue
Block a user