#lang scribble/doc
@(require scribble/manual "utils.rkt")

@title[#:tag "decode"]{Decoding Text}

@defmodule[scribble/decode]{The @racketmodname[scribble/decode]
library helps you write document content in a natural way---more like
plain text, except for @litchar["@"] escapes. Roughly, it processes a
stream of strings to produce instances of the
@racketmodname[scribble/struct] datatypes (see @secref["struct"]).}

At the @tech{flow} level, decoding recognizes a blank line as a
@tech{paragraph} separator. Blocks and paragraphs without blank lines
in between are collected into a @tech{compound paragraph}.

@elemtag['(decode "rules")]{At} the @tech{content} level, decoding
makes just a few special text conversions:

@itemize[

 @item{@litchar{---}: converted to @racket['mdash]}

 @item{@litchar{--}: converted to @racket['ndash]}

 @item{@litchar{``}: converted to @racket['ldquo], which is fancy opening quotes: ``}

 @item{@litchar{''}: converted to @racket['rdquo], which is fancy closing quotes: ''}

 @item{@litchar{'}: converted to @racket['rsquo], which is a fancy apostrophe: '}

 @item{@litchar{`}: converted to @racket['lsquo], which is a fancy quote: `}

]

Some functions @deftech{decode} a sequence of @racket[_pre-flow] or
@racket[_pre-content] arguments using @racket[decode-flow] or
@racket[decode-content], respectively. For example, the @racket[bold]
function accepts any number of @racket[_pre-content] arguments, so
that in

@verbatim[#:indent 2]|{@bold{``apple''}}|

the @litchar{``apple''} argument is decoded to use fancy quotes, and
then it is bolded.


@defproc[(pre-content? [v any/c]) boolean?]{

Returns @racket[#t] if @racket[v] is a @deftech{pre-content} value: a
string or other non-list @tech{content}, a list of @tech{pre-content} values, or a @racket[splice]
containing a list of @tech{pre-content} values; otherwise returns
@racket[#f].

Pre-content is decoded into @tech{content} by functions like
@racket[decode-content] and @racket[decode-paragraph].}


@defproc[(pre-flow? [v any/c]) boolean?]{

Returns @racket[#t] if @racket[v] is a @deftech{pre-flow} value: a
string or other non-list @tech{content}, a @racket[block],
@|void-const|, a list of @tech{pre-flow} values, or a @racket[splice] containing a list of
@tech{pre-flow} values; otherwise returns @racket[#f].

Pre-flow is decoded into a @tech{flow} (i.e., a list of @tech{blocks})
by functions like @racket[decode-flow].}


@defproc[(pre-part? [v any/c]) boolean?]{

Returns @racket[#t] if @racket[v] is a @deftech{pre-part} value: a
string or other non-list @tech{content}, a @tech{block}, a
@racket[part], a @racket[title-decl], a @racket[part-start], a
@racket[part-index-decl], a @racket[part-collect-decl], a
@racket[part-tag-decl], @|void-const|, a list of @tech{pre-part} values, or a @racket[splice] containing
a list of @tech{pre-part} values; otherwise returns @racket[#f].

A pre-part sequence is decoded into a @racket[part] by functions like
@racket[decode] and @racket[decode-part].}
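
As a small illustration of the three predicates above, the following
sketch checks a few values. The sample values are arbitrary, and the
results in the comments follow from the descriptions above rather than
from any additional guarantee.

@verbatim[#:indent 2]|{
#lang racket/base
(require scribble/decode)

;; A string, or a (possibly nested) list of strings, is pre-content.
(pre-content? "apple")                      ; #t
(pre-content? (list "apple" (list "pie")))  ; #t

;; A splice is pre-content when its run is a list of pre-content.
(pre-content? (splice (list "a" "b")))      ; #t

;; Void is accepted as pre-flow and pre-part, but not as pre-content.
(pre-content? (void))                       ; #f
(pre-flow? (void))                          ; #t
(pre-part? (void))                          ; #t
}|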


@defproc[(decode [lst (listof pre-part?)]) part?]{

Decodes a document, producing a part. In @racket[lst], lists and instances of
@racket[splice] are inlined into the list, and @|void-const|s are dropped. An instance of
@racket[title-decl] supplies the title for the part, plus tag, style
and version information. Instances of @racket[part-index-decl] (that
precede any sub-part) add index entries that point to the
section. Instances of @racket[part-collect-decl] add elements to the
part that are used only during the @techlink{collect pass}. Instances
of @racket[part-tag-decl] add hyperlink tags to the section
title. Instances of @racket[part-start] at level 0 trigger sub-part
parsing. Instances of @racket[part] are used as-is as
subsections, and instances of @racket[paragraph] and other
flow-element datatypes are used as-is in the enclosing flow.

As a part is decoded, an entry for the part's title is added to the
document index, unless the style for the part includes the
@tech{style property} @racket['hidden] or @racket['no-index].

Portions of @racket[lst] that are within a part are decoded using
@racket[decode-flow].

@history[#:changed "1.25" @elem{Added @racket['no-index] support.}]}
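
A minimal sketch of calling @racket[decode] directly, assuming the
@racket[plain] style from @racketmodname[scribble/core] and an
arbitrary title and body text chosen for illustration. The two adjacent
newline strings play the role of a blank line in the source, so the body
should be decoded as two paragraphs.

@verbatim[#:indent 2]|{
#lang racket/base
(require scribble/core scribble/decode)

;; The title-decl fields are: tag-prefix, tags, version, style, content.
(define doc
  (decode (list (title-decl #f '() #f plain '("Fruit"))
                "An ``apple'' a day."
                "\n" "\n"
                "Pears are fine, too.")))

(part? doc)               ; #t, since decode produces a part
(part-title-content doc)  ; the content supplied by the title-decl
(part-blocks doc)         ; the decoded flow of the part
}|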


@defproc[(decode-part [lst (listof pre-part?)]
                      [tags (listof string?)]
                      [title (or/c #f list?)]
                      [depth exact-nonnegative-integer?])
         part?]{

Like @racket[decode], but given a list of tag strings for the part, a
title (if @racket[#f], then a @racket[title-decl] instance is used if
found), and a depth for @racket[part-start]s to trigger sub-part
parsing.

}

@defproc[(decode-flow [lst (listof pre-flow?)]) (listof block?)]{

Decodes a flow. In @racket[lst], lists and instances of
@racket[splice] are inlined into the list. A sequence of two or more
newlines separated only by whitespace is parsed as a
compound-paragraph separator.

Portions of @racket[lst] that are within a compound paragraph are decoded using
@racket[decode-compound-paragraph].}


@defproc[(decode-compound-paragraph [lst (listof pre-flow?)]) block?]{

Decodes a compound paragraph. In @racket[lst], lists and instances of
@racket[splice] are inlined into the list. Instances of
@racket[paragraph] and other @tech{block} datatypes are used as-is in
the result. If the compound paragraph contains a single block, the
block is returned without a @racket[compound-paragraph] wrapper.

Portions of @racket[lst] that are separated by @tech{block}s are
decoded using @racket[decode-content].}


@defproc[(decode-paragraph [lst (listof pre-content?)]) paragraph?]{

Decodes a paragraph using @racket[decode-content] to decode
@racket[lst] as the paragraph's content.}


@defproc[(decode-content [lst (listof pre-content?)]) list?]{

Decodes @tech{content}. Elements at the start of the list that are
whitespace (according to @racket[whitespace?]) are dropped.
@margin-note*{Dropping whitespace in nested lists and splices was a poor
implementation choice that is left in place for compatibility. To protect
against it, you can exploit the similarly unfortunate fact that an empty
list does not count as whitespace.}
Lists and splices in @racket[lst] are
flattened into the list, similarly dropping leading whitespace.
Plain strings are @elemref['(decode "rules")]{decoded};
non-string, non-list @tech{content} is included in
the result as-is.}


@defproc[(decode-elements [lst (listof pre-content?)]) list?]{

An alias for @racket[decode-content].}


@defproc[(decode-string [s string?]) (listof content?)]{

@elemref['(decode "rules")]{Decodes} a single string to produce
@tech{content}.}
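
To illustrate the conversion rules, the sketch below decodes a short
string. The input is an arbitrary example. No exact result is shown,
because the precise split into string fragments is not specified here,
but per the rules listed earlier the result should include the symbols
@racket['ldquo], @racket['rdquo], and @racket['mdash].

@verbatim[#:indent 2]|{
#lang racket/base
(require scribble/decode)

;; Quote pairs and the triple hyphen are replaced by symbols;
;; the surrounding text stays as strings.
(define decoded (decode-string "``apple''---a fruit"))

;; Keep only the symbols to see which conversions were applied.
(filter symbol? decoded)
}|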


@defproc[(whitespace? [v any/c]) boolean?]{

Returns @racket[#t] if @racket[v] is a string that contains only whitespace, @racket[#f]
otherwise.}
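
For instance, the following sketch applies @racket[whitespace?] to a
few arbitrary sample values; the results in the comments follow from
the description above.

@verbatim[#:indent 2]|{
#lang racket/base
(require scribble/decode)

(whitespace? " \t\n")   ; #t: only whitespace characters
(whitespace? " a ")     ; #f: contains a non-whitespace character
(whitespace? 'space)    ; #f: not a string
}|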


@defstruct[title-decl ([tag-prefix (or/c #f string?)]
                       [tags (listof string?)]
                       [version (or/c string? #f)]
                       [style style?]
                       [content content?])]{

See @racket[decode] and @racket[decode-part]. The @racket[tag-prefix]
and @racketidfont{style} fields are propagated to the resulting
@racket[part]. If the @racketidfont{version} field is not @racket[#f],
it is propagated as a @racket[document-version] style property on the
@racket[part].}


@defstruct[part-start ([depth integer?]
                       [tag-prefix (or/c #f string?)]
                       [tags (listof string?)]
                       [style style?]
                       [title content?])]{

Like @racket[title-decl], but for a sub-part. See @racket[decode] and
@racket[decode-part].}


@defstruct[part-index-decl ([plain-seq (listof string?)]
                            [entry-seq list?])]{

See @racket[decode]. The two fields are as for @racket[index-element].}


@defstruct[part-collect-decl ([element (or/c element? part-relative-element?)])]{

See @racket[decode].}


@defstruct[part-tag-decl ([tag tag?])]{

See @racket[decode].}


@defstruct[splice ([run list?])]{

See @racket[decode], @racket[decode-part], and @racket[decode-flow].}


@defproc[(spliceof [ctc flat-contract?]) flat-contract?]{

Produces a contract for a @racket[splice] instance whose
@racketidfont{run} elements satisfy @racket[ctc].}


@defproc[(clean-up-index-string [str string?]) string?]{

Trims leading and trailing whitespace, and converts non-empty
sequences of whitespace to a single space character.}
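
For example, under the behavior described above (the input string is an
arbitrary sample):

@verbatim[#:indent 2]|{
#lang racket/base
(require scribble/decode)

;; Leading and trailing whitespace is trimmed, and each interior
;; run of whitespace collapses to one space.
(clean-up-index-string "  decoding \n  text  ")   ; "decoding text"
}|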