racket/pkgs/racket-doc/scribblings/reference/bytes.scrbl
Matthew Flatt 710320e3dc "Mac OS X" -> "Mac OS"
Although "macOS" is the correct name for Apple's current desktop OS,
we've decided to go with "Mac OS" to cover all of Apple's Unix-like
desktop OS versions. The label "Mac OS" is more readable, clear in
context (i.e., unlikely to be confused with the Mac OSes that
preceded Mac OS X), and as likely to match Apple's future OS names
as anything.
2016-12-23 12:18:36 -07:00


#lang scribble/doc
@(require "mz.rkt")
@title[#:tag "bytestrings"]{Byte Strings}
@guideintro["bytestrings"]{byte strings}
A @deftech{byte string} is a fixed-length array of bytes. A
@pidefterm{byte} is an exact integer between @racket[0] and
@racket[255] inclusive.
@index['("byte strings" "immutable")]{A} byte string can be
@defterm{mutable} or @defterm{immutable}. When an immutable byte
string is provided to a procedure like @racket[bytes-set!], the
@exnraise[exn:fail:contract]. Byte-string constants generated by the
default reader (see @secref["parse-string"]) are immutable,
and they are @tech{interned} in @racket[read-syntax] mode.
Two byte strings are @racket[equal?] when they have the same length
and contain the same sequence of bytes.
A byte string can be used as a single-valued sequence (see
@secref["sequences"]). The bytes of the string serve as elements
of the sequence. See also @racket[in-bytes].
@see-read-print["string"]{byte strings}
See also: @racket[immutable?].
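As a brief sketch of the distinction between mutable and immutable
byte strings (the identifier @racket[buf] is illustrative only), a
literal byte string is immutable, while a constructor such as
@racket[bytes] produces a mutable one that @racket[bytes-set!] accepts.
Note that @racket[equal?] compares content only, not mutability:
@examples[
(immutable? #"Apple")
(define buf (bytes 65 112 112 108 101))
(immutable? buf)
(bytes-set! buf 0 97)
buf
(equal? buf #"apple")
]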
@; ----------------------------------------
@section{Byte String Constructors, Selectors, and Mutators}
@defproc[(bytes? [v any/c]) boolean?]{ Returns @racket[#t] if @racket[v]
is a byte string, @racket[#f] otherwise.
@mz-examples[(bytes? #"Apple") (bytes? "Apple")]}
@defproc[(make-bytes [k exact-nonnegative-integer?] [b byte? 0])
bytes?]{ Returns a new mutable byte string of length @racket[k] where each
position in the byte string is initialized with the byte @racket[b].
@mz-examples[(make-bytes 5 65)]}
@defproc[(bytes [b byte?] ...) bytes?]{ Returns a new mutable byte
string whose length is the number of provided @racket[b]s, and whose
positions are initialized with the given @racket[b]s.
@mz-examples[(bytes 65 112 112 108 101)]}
@defproc[(bytes->immutable-bytes [bstr bytes?])
(and/c bytes? immutable?)]{
Returns an immutable byte string with the same content
as @racket[bstr], returning @racket[bstr] itself if @racket[bstr] is
immutable.
@examples[
(bytes->immutable-bytes (bytes 65 65 65))
(define b (bytes->immutable-bytes (make-bytes 5 65)))
(bytes->immutable-bytes b)
(eq? (bytes->immutable-bytes b) b)
]}
@defproc[(byte? [v any/c]) boolean?]{ Returns @racket[#t] if @racket[v] is
a byte (i.e., an exact integer between @racket[0] and @racket[255]
inclusive), @racket[#f] otherwise.
@mz-examples[(byte? 65) (byte? 0) (byte? 256) (byte? -1)]}
@defproc[(bytes-length [bstr bytes?]) exact-nonnegative-integer?]{
Returns the length of @racket[bstr].
@mz-examples[(bytes-length #"Apple")]}
@defproc[(bytes-ref [bstr bytes?] [k exact-nonnegative-integer?])
byte?]{ Returns the byte at position @racket[k] in @racket[bstr].
The first position in the byte string corresponds to @racket[0], so the
position @racket[k] must be less than the length of the byte string,
otherwise the @exnraise[exn:fail:contract].
@mz-examples[(bytes-ref #"Apple" 0)]}
@defproc[(bytes-set! [bstr (and/c bytes? (not/c immutable?))] [k
exact-nonnegative-integer?] [b byte?]) void?]{ Changes the
byte at position @racket[k] in @racket[bstr] to @racket[b]. The first
position in the byte string corresponds to @racket[0], so the position
@racket[k] must be less than the length of the byte string, otherwise the
@exnraise[exn:fail:contract].
@mz-examples[(define s (bytes 65 112 112 108 101))
(bytes-set! s 4 121)
s]}
@defproc[(subbytes [bstr bytes?] [start exact-nonnegative-integer?]
[end exact-nonnegative-integer? (bytes-length bstr)]) bytes?]{ Returns
a new mutable byte string that is @racket[(- end start)] bytes long,
and that contains the same bytes as @racket[bstr] from @racket[start]
inclusive to @racket[end] exclusive. The @racket[start] and
@racket[end] arguments must be less than or equal to the length of
@racket[bstr], and @racket[end] must be greater than or equal to
@racket[start], otherwise the @exnraise[exn:fail:contract].
@mz-examples[(subbytes #"Apple" 1 3)
(subbytes #"Apple" 1)]}
@defproc[(bytes-copy [bstr bytes?]) bytes?]{ Returns
@racket[(subbytes bstr 0)].}
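One plausible use of @racket[bytes-copy] is to obtain a mutable copy of
an immutable literal. In this sketch, the identifiers @racket[lit] and
@racket[copy] are illustrative only; mutating the copy leaves the
original unchanged:
@examples[
(define lit #"Apple")
(define copy (bytes-copy lit))
(immutable? copy)
(bytes-set! copy 0 97)
copy
lit
]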
@defproc[(bytes-copy! [dest (and/c bytes? (not/c immutable?))]
[dest-start exact-nonnegative-integer?]
[src bytes?]
[src-start exact-nonnegative-integer? 0]
[src-end exact-nonnegative-integer? (bytes-length src)])
void?]{
Changes the bytes of @racket[dest] starting at position
@racket[dest-start] to match the bytes in @racket[src] from
@racket[src-start] (inclusive) to @racket[src-end] (exclusive). The
byte strings @racket[dest] and @racket[src] can be the same byte
string, and in that case the destination region can overlap with the
source region; the destination bytes after the copy match the source
bytes from before the copy. If any of @racket[dest-start],
@racket[src-start], or @racket[src-end] are out of range (taking into
account the sizes of the byte strings and the source and destination
regions), the @exnraise[exn:fail:contract].
@mz-examples[(define s (bytes 65 112 112 108 101))
(bytes-copy! s 4 #"y")
(bytes-copy! s 0 s 3 4)
s]}
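The overlapping case of @racket[bytes-copy!] can be sketched as follows,
using small byte values so that the shift is easy to see (the
identifier @racket[buf] is illustrative only). The copied bytes match
the source region as it was before the copy started:
@examples[
(define buf (bytes 1 2 3 4 5))
(bytes-copy! buf 1 buf 0 4)
(bytes->list buf)
]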
@defproc[(bytes-fill! [dest (and/c bytes? (not/c immutable?))] [b
byte?]) void?]{ Changes @racket[dest] so that every position in the
byte string is filled with @racket[b].
@mz-examples[(define s (bytes 65 112 112 108 101))
(bytes-fill! s 113)
s]}
@defproc[(bytes-append [bstr bytes?] ...) bytes?]{
@index['("byte strings" "concatenate")]{Returns} a new mutable byte string
that is as long as the sum of the given @racket[bstr]s' lengths, and
that contains the concatenated bytes of the given @racket[bstr]s. If
no @racket[bstr]s are provided, the result is a zero-length byte
string.
@mz-examples[(bytes-append #"Apple" #"Banana")]}
@defproc[(bytes->list [bstr bytes?]) (listof byte?)]{ Returns a new
list of bytes corresponding to the content of @racket[bstr]. That is,
the length of the list is @racket[(bytes-length bstr)], and the
sequence of bytes in @racket[bstr] is the same sequence in the
result list.
@mz-examples[(bytes->list #"Apple")]}
@defproc[(list->bytes [lst (listof byte?)]) bytes?]{ Returns a new
mutable byte string whose content is the list of bytes in @racket[lst].
That is, the length of the byte string is @racket[(length lst)], and
the sequence of bytes in @racket[lst] is the same sequence in
the result byte string.
@mz-examples[(list->bytes (list 65 112 112 108 101))]}
@defproc[(make-shared-bytes [k exact-nonnegative-integer?] [b byte? 0])
bytes?]{ Returns a new mutable byte string of length @racket[k] where each
position in the byte string is initialized with the byte @racket[b].
For communication among @tech{places}, the new byte string is allocated in the
@tech{shared memory space}.
@mz-examples[(make-shared-bytes 5 65)]}
@defproc[(shared-bytes [b byte?] ...) bytes?]{ Returns a new mutable byte
string whose length is the number of provided @racket[b]s, and whose
positions are initialized with the given @racket[b]s.
For communication among @tech{places}, the new byte string is allocated in the
@tech{shared memory space}.
@mz-examples[(shared-bytes 65 112 112 108 101)]}
@; ----------------------------------------
@section{Byte String Comparisons}
@defproc[(bytes=? [bstr1 bytes?] [bstr2 bytes?] ...+) boolean?]{ Returns
@racket[#t] if all of the arguments are @racket[equal?], @racket[#f] otherwise.}
@mz-examples[(bytes=? #"Apple" #"apple")
(bytes=? #"a" #"as" #"a")]
@(define (bytes-sort direction)
@elem{Like @racket[bytes<?], but checks whether the arguments are @|direction|.})
@defproc[(bytes<? [bstr1 bytes?] [bstr2 bytes?] ...+) boolean?]{
Returns @racket[#t] if the arguments are lexicographically sorted
increasing, where individual bytes are ordered by @racket[<],
@racket[#f] otherwise.
@mz-examples[(bytes<? #"Apple" #"apple")
(bytes<? #"apple" #"Apple")
(bytes<? #"a" #"b" #"c")]}
@defproc[(bytes>? [bstr1 bytes?] [bstr2 bytes?] ...+) boolean?]{
@bytes-sort["decreasing"]
@mz-examples[(bytes>? #"Apple" #"apple")
(bytes>? #"apple" #"Apple")
(bytes>? #"c" #"b" #"a")]}
@; ----------------------------------------
@section{Bytes to/from Characters, Decoding and Encoding}
@defproc[(bytes->string/utf-8 [bstr bytes?]
[err-char (or/c #f char?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (bytes-length bstr)])
string?]{
Produces a string by decoding the @racket[start] to @racket[end]
substring of @racket[bstr] as a UTF-8 encoding of Unicode code
points. If @racket[err-char] is not @racket[#f], then it is used for
bytes that fall in the range @racket[#o200] to @racket[#o377] but are
not part of a valid encoding sequence. (This rule is consistent with
reading characters from a port; see @secref["encodings"] for more
details.) If @racket[err-char] is @racket[#f], and if the
@racket[start] to @racket[end] substring of @racket[bstr] is not a
valid UTF-8 encoding overall, then the @exnraise[exn:fail:contract].
@examples[
(bytes->string/utf-8 (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3))
]}
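As a small illustration of the optional arguments of
@racket[bytes->string/utf-8], the first expression below supplies
@racket[#\?] as @racket[err-char] so that the invalid byte
@racket[255] is replaced instead of raising an exception, and the second
decodes only the bytes from position @racket[1] up to position @racket[3]:
@examples[
(bytes->string/utf-8 (bytes 65 255 66) #\?)
(bytes->string/utf-8 #"Apple" #f 1 3)
]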
@defproc[(bytes->string/locale [bstr bytes?]
[err-char (or/c #f char?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (bytes-length bstr)])
string?]{
Produces a string by decoding the @racket[start] to @racket[end] substring
of @racket[bstr] using the current locale's encoding (see also
@secref["encodings"]). If @racket[err-char] is not
@racket[#f], it is used for each byte in @racket[bstr] that is not part
of a valid encoding; if @racket[err-char] is @racket[#f], and if the
@racket[start] to @racket[end] substring of @racket[bstr] is not a valid
encoding overall, then the @exnraise[exn:fail:contract].}
@defproc[(bytes->string/latin-1 [bstr bytes?]
[err-char (or/c #f char?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (bytes-length bstr)])
string?]{
Produces a string by decoding the @racket[start] to @racket[end] substring
of @racket[bstr] as a Latin-1 encoding of Unicode code points; i.e.,
each byte is translated directly to a character using
@racket[integer->char], so the decoding always succeeds.
The @racket[err-char]
argument is ignored, but present for consistency with the other
operations.
@examples[
(bytes->string/latin-1 (bytes #xfe #xd3 #xd1 #xa5))
]}
@defproc[(string->bytes/utf-8 [str string?]
[err-byte (or/c #f byte?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (string-length str)])
bytes?]{
Produces a byte string by encoding the @racket[start] to @racket[end]
substring of @racket[str] via UTF-8 (always succeeding). The
@racket[err-byte] argument is ignored, but included for consistency with
the other operations.
@examples[
(define b
(bytes->string/utf-8
(bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3)))
(string->bytes/utf-8 b)
(bytes->string/utf-8 (string->bytes/utf-8 b))
]}
@defproc[(string->bytes/locale [str string?]
[err-byte (or/c #f byte?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (string-length str)])
bytes?]{
Produces a byte string by encoding the @racket[start] to @racket[end] substring
of @racket[str] using the current locale's encoding (see also
@secref["encodings"]). If @racket[err-byte] is not @racket[#f], it is used
for each character in @racket[str] that cannot be encoded for the
current locale; if @racket[err-byte] is @racket[#f], and if the
@racket[start] to @racket[end] substring of @racket[str] cannot be encoded,
then the @exnraise[exn:fail:contract].}
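The locale-based conversions can be sketched as a round trip. The
result depends on the current locale, so this example is only a rough
illustration: it assumes an ASCII-compatible locale encoding and
supplies @racket[#\?] as a fallback in both directions so that no
exception is raised. The identifiers @racket[s] and @racket[encoded]
are illustrative only:
@examples[
(define s "Apple")
(define encoded (string->bytes/locale s (char->integer #\?)))
encoded
(bytes->string/locale encoded #\?)
]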
@defproc[(string->bytes/latin-1 [str string?]
[err-byte (or/c #f byte?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (string-length str)])
bytes?]{
Produces a byte string by encoding the @racket[start] to @racket[end] substring
of @racket[str] using Latin-1; i.e., each character is translated
directly to a byte using @racket[char->integer]. If @racket[err-byte] is
not @racket[#f], it is used for each character in @racket[str] whose
value is greater than @racket[255].
If @racket[err-byte] is @racket[#f], and if the
@racket[start] to @racket[end] substring of @racket[str] has a character
with a value greater than @racket[255], then the
@exnraise[exn:fail:contract].
@examples[
(define b
(bytes->string/latin-1 (bytes #xfe #xd3 #xd1 #xa5)))
(string->bytes/latin-1 b)
(bytes->string/latin-1 (string->bytes/latin-1 b))
]}
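The effect of @racket[err-byte] in @racket[string->bytes/latin-1] can
be illustrated with a string that contains a character outside the
Latin-1 range. Here the character @racket[#\u03BB] has a value greater
than @racket[255], so it is replaced by the byte for @racket[#\?]; the
second expression encodes only a substring:
@examples[
(string->bytes/latin-1 "a\u03BBb" (char->integer #\?))
(string->bytes/latin-1 "Apple" #f 1 3)
]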
@defproc[(string-utf-8-length [str string?]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (string-length str)])
exact-nonnegative-integer?]{
Returns the length in bytes of the UTF-8 encoding of @racket[str]'s
substring from @racket[start] to @racket[end], but without actually
generating the encoded bytes.
@examples[
(string-utf-8-length
(bytes->string/utf-8 (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3)))
(string-utf-8-length "hello")
]}
@defproc[(bytes-utf-8-length [bstr bytes?]
[err-char (or/c #f char?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (bytes-length bstr)])
(or/c exact-nonnegative-integer? #f)]{
Returns the length in characters of the UTF-8 decoding of
@racket[bstr]'s substring from @racket[start] to @racket[end], but without
actually generating the decoded characters. If @racket[err-char] is
@racket[#f] and the substring is not a valid UTF-8 encoding overall, the
result is @racket[#f]. Otherwise, @racket[err-char] is used to resolve
decoding errors as in @racket[bytes->string/utf-8].
@examples[
(bytes-utf-8-length (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3))
(bytes-utf-8-length (make-bytes 5 65))
]}
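To illustrate how @racket[err-char] affects
@racket[bytes-utf-8-length], the following sketch measures a byte
string whose second byte is not valid UTF-8. Without
@racket[err-char] the result is @racket[#f]; with it, the invalid byte
counts as one decoded character:
@examples[
(bytes-utf-8-length (bytes 65 255))
(bytes-utf-8-length (bytes 65 255) #\?)
]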
@defproc[(bytes-utf-8-ref [bstr bytes?]
[skip exact-nonnegative-integer? 0]
[err-char (or/c #f char?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (bytes-length bstr)])
(or/c char? #f)]{
Returns the @racket[skip]th character in the UTF-8 decoding of
@racket[bstr]'s substring from @racket[start] to @racket[end], but without
actually generating the other decoded characters. If the substring is
not a UTF-8 encoding up to the @racket[skip]th character (when
@racket[err-char] is @racket[#f]), or if the substring decoding produces
no more than @racket[skip] characters, the result is @racket[#f]. If
@racket[err-char] is not @racket[#f], it is used to resolve decoding
errors as in @racket[bytes->string/utf-8].
@examples[
(bytes-utf-8-ref (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3) 0)
(bytes-utf-8-ref (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3) 1)
(bytes-utf-8-ref (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3) 2)
(bytes-utf-8-ref (bytes 65 66 67 68) 0)
(bytes-utf-8-ref (bytes 65 66 67 68) 1)
(bytes-utf-8-ref (bytes 65 66 67 68) 2)
]}
@defproc[(bytes-utf-8-index [bstr bytes?]
[skip exact-nonnegative-integer? 0]
[err-char (or/c #f char?) #f]
[start exact-nonnegative-integer? 0]
[end exact-nonnegative-integer? (bytes-length bstr)])
(or/c exact-nonnegative-integer? #f)]{
Returns the offset in bytes into @racket[bstr] at which the @racket[skip]th
character's encoding starts in the UTF-8 decoding of @racket[bstr]'s
substring from @racket[start] to @racket[end] (but without actually
generating the other decoded characters). The result is relative to
the start of @racket[bstr], not to @racket[start]. If the substring is not
a UTF-8 encoding up to the @racket[skip]th character (when
@racket[err-char] is @racket[#f]), or if the substring decoding produces
no more than @racket[skip] characters, the result is @racket[#f]. If
@racket[err-char] is not @racket[#f], it is used to resolve decoding
errors as in @racket[bytes->string/utf-8].
@examples[
(bytes-utf-8-index (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3) 0)
(bytes-utf-8-index (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3) 1)
(bytes-utf-8-index (bytes #xc3 #xa7 #xc3 #xb0 #xc3 #xb6 #xc2 #xa3) 2)
(bytes-utf-8-index (bytes 65 66 67 68) 0)
(bytes-utf-8-index (bytes 65 66 67 68) 1)
(bytes-utf-8-index (bytes 65 66 67 68) 2)
]}
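One possible use of @racket[bytes-utf-8-index] is to drop a number of
leading characters from UTF-8 data without decoding the rest. In this
sketch, the identifiers @racket[data] and @racket[offset] are
illustrative only; the first two characters each occupy two bytes, so
the third character starts at byte offset @racket[4]:
@examples[
(define data (bytes #xc3 #xa7 #xc3 #xb0 65 66))
(define offset (bytes-utf-8-index data 2))
offset
(subbytes data offset)
]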
@; ----------------------------------------
@section{Bytes to Bytes Encoding Conversion}
@defproc[(bytes-open-converter [from-name string?] [to-name string?])
(or/c bytes-converter? #f)]{
Produces a @deftech{byte converter} to go from the encoding named by
@racket[from-name] to the encoding named by @racket[to-name]. If the
requested conversion pair is not available, @racket[#f] is returned
instead of a converter.
Certain encoding combinations are always available:
@itemize[
@item{@racket[(bytes-open-converter "UTF-8" "UTF-8")] --- the
identity conversion, except that encoding errors in the input lead
to a decoding failure.}
@item{@racket[(bytes-open-converter "UTF-8-permissive" "UTF-8")] ---
@index['("UTF-8-permissive")]{the} identity conversion, except that
any input byte that is not part of a valid encoding sequence is
effectively replaced by the UTF-8 encoding sequence for
@racketvalfont{#\uFFFD}. (This handling of invalid sequences is
consistent with the interpretation of port byte streams into
characters; see @secref["ports"].)}
@item{@racket[(bytes-open-converter "" "UTF-8")] --- converts from
the current locale's default encoding (see @secref["encodings"])
to UTF-8.}
@item{@racket[(bytes-open-converter "UTF-8" "")] --- converts from
UTF-8 to the current locale's default encoding (see
@secref["encodings"]).}
@item{@racket[(bytes-open-converter "platform-UTF-8" "platform-UTF-16")]
--- converts UTF-8 to UTF-16 on @|AllUnix|, where each UTF-16
code unit is a sequence of two bytes ordered by the current
platform's endianness. On Windows, the input can include
encodings that are not valid UTF-8, but which naturally extend the
UTF-8 encoding to support unpaired surrogate code units, and the
output is a sequence of UTF-16 code units (as little-endian byte
pairs), potentially including unpaired surrogates.}
@item{@racket[(bytes-open-converter "platform-UTF-8-permissive" "platform-UTF-16")]
--- like @racket[(bytes-open-converter "platform-UTF-8" "platform-UTF-16")],
but an input byte that is not part of a valid UTF-8 encoding
sequence (or valid for the unpaired-surrogate extension on
Windows) is effectively replaced with @racket[(char->integer #\?)].}
@item{@racket[(bytes-open-converter "platform-UTF-16" "platform-UTF-8")]
--- converts UTF-16 (bytes ordered by the current platform's
endianness) to UTF-8 on @|AllUnix|. On Windows, the input can
include UTF-16 code units that are unpaired surrogates, and the
corresponding output includes an encoding of each surrogate in a
natural extension of UTF-8. On @|AllUnix|, surrogates are
assumed to be paired: a pair of bytes with the bits @code{#xD800}
starts a surrogate pair, and the @code{#x03FF} bits are used from
the pair and following pair (independent of the value of the
@code{#xDC00} bits). On all platforms, performance may be poor
when decoding from an odd offset within an input byte string.}
]
A newly opened byte converter is registered with the current custodian
(see @secref["custodians"]), so that the converter is closed when
the custodian is shut down. A converter is not registered with a
custodian (and does not need to be closed) if it is one of the
guaranteed combinations not involving @racket[""] on Unix, or if it
is any of the guaranteed combinations (including @racket[""]) on
Windows and Mac OS.
@margin-note{In the Racket software distributions for Windows, a suitable
@filepath{iconv.dll} is included with @filepath{libmzsch@italic{VERS}.dll}.}
The set of available encodings and combinations varies by platform,
depending on the @exec{iconv} library that is installed; the
@racket[from-name] and @racket[to-name] arguments are passed on to
@tt{iconv_open}. On Windows, @filepath{iconv.dll} or
@filepath{libiconv.dll} must be in the same directory as
@filepath{libmzsch@italic{VERS}.dll} (where @italic{VERS} is a version
number), in the user's path, in the system directory, or in the
current executable's directory at run time, and the DLL must either
supply @tt{_errno} or link to @filepath{msvcrt.dll} for @tt{_errno};
otherwise, only the guaranteed combinations are available.
Use @racket[bytes-convert] with the result to convert byte strings.}
@defproc[(bytes-close-converter [converter bytes-converter?]) void?]{
Closes the given converter, so that it can no longer be used with
@racket[bytes-convert] or @racket[bytes-convert-end].}
@defproc[(bytes-convert [converter bytes-converter?]
[src-bstr bytes?]
[src-start-pos exact-nonnegative-integer? 0]
[src-end-pos exact-nonnegative-integer? (bytes-length src-bstr)]
[dest-bstr (or/c bytes? #f) #f]
[dest-start-pos exact-nonnegative-integer? 0]
[dest-end-pos (or/c exact-nonnegative-integer? #f)
(and dest-bstr
(bytes-length dest-bstr))])
(values (or/c bytes? exact-nonnegative-integer?)
exact-nonnegative-integer?
(or/c 'complete 'continues 'aborts 'error))]{
Converts the bytes from @racket[src-start-pos] to @racket[src-end-pos]
in @racket[src-bstr].
If @racket[dest-bstr] is not @racket[#f], the converted bytes are
written into @racket[dest-bstr] from @racket[dest-start-pos] to
@racket[dest-end-pos]. If @racket[dest-bstr] is @racket[#f], then a
newly allocated byte string holds the conversion results, and if
@racket[dest-end-pos] is not @racket[#f], the size of the result byte
string is no more than @racket[(- dest-end-pos dest-start-pos)].
The result of @racket[bytes-convert] is three values:
@itemize[
@item{@racket[_result-bstr] or @racket[_dest-wrote-amt] --- a byte
string if @racket[dest-bstr] is @racket[#f] or not provided, or the
number of bytes written into @racket[dest-bstr] otherwise.}
@item{@racket[_src-read-amt] --- the number of bytes successfully converted
from @racket[src-bstr].}
@item{@indexed-racket['complete], @indexed-racket['continues],
@indexed-racket['aborts], or @indexed-racket['error] --- indicates
how conversion terminated:
@itemize[
@item{@racket['complete]: The entire input was processed, and
@racket[_src-read-amt] will be equal to @racket[(- src-end-pos
src-start-pos)].}
@item{@racket['continues]: Conversion stopped due to the limit on
the result size or the space in @racket[dest-bstr]; in this case,
fewer than @racket[(- dest-end-pos dest-start-pos)] bytes may be
returned if more space is needed to process the next complete
encoding sequence in @racket[src-bstr].}
@item{@racket['aborts]: The input stopped part-way through an
encoding sequence, and more input bytes are necessary to continue.
For example, if the last byte of input is @racket[#o303] for a
@racket["UTF-8-permissive"] decoding, the result is
@racket['aborts], because another byte is needed to determine how to
use the @racket[#o303] byte.}
@item{@racket['error]: The bytes starting at position @racket[(+
src-start-pos _src-read-amt)] in @racket[src-bstr] do not form
a legal encoding sequence. This result is never produced for some
encodings, where all byte sequences are valid encodings. For
example, since @racket["UTF-8-permissive"] handles an invalid UTF-8
sequence by substituting a replacement character, every byte
sequence is effectively valid.}
]}
]
Applying a converter accumulates state in the converter (even when the
third result of @racket[bytes-convert] is @racket['complete]). This
state can affect both further processing of input and further
generation of output, but only for conversions that involve ``shift
sequences'' to change modes within a stream. To terminate an input
sequence and reset the converter, use @racket[bytes-convert-end].
@examples[
(define convert (bytes-open-converter "UTF-8" "UTF-16"))
(bytes-convert convert (bytes 65 66 67 68))
(bytes 195 167 195 176 195 182 194 163)
(bytes-convert convert (bytes 195 167 195 176 195 182 194 163))
(bytes-close-converter convert)
]}
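The @racket['continues] and @racket['aborts] results of
@racket[bytes-convert] can be sketched with one of the guaranteed
converters. This is only an illustration: the identifiers
@racket[conv] and @racket[dest] are hypothetical. The first conversion
writes into a destination that is too small for the input, and the
second supplies input that ends part-way through an encoding sequence:
@examples[
(define conv (bytes-open-converter "UTF-8-permissive" "UTF-8"))
(define dest (make-bytes 3))
(bytes-convert conv #"Apple" 0 5 dest)
dest
(bytes-convert conv (bytes 65 #o303))
(bytes-close-converter conv)
]
In the first case, a caller would typically continue from the position
given by the second result; in the second case, the unconsumed byte
should be supplied again together with more input.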
@defproc[(bytes-convert-end [converter bytes-converter?]
[dest-bstr (or/c bytes? #f) #f]
[dest-start-pos exact-nonnegative-integer? 0]
[dest-end-pos (or/c exact-nonnegative-integer? #f)
(and dest-bstr
(bytes-length dest-bstr))])
(values (or/c bytes? exact-nonnegative-integer?)
(or/c 'complete 'continues))]{
Like @racket[bytes-convert], but instead of converting bytes, this
procedure generates an ending sequence for the conversion (sometimes
called a ``shift sequence''), if any. Few encodings use shift
sequences, so this function will succeed with no output for most
encodings. In any case, successful output of a (possibly empty) shift
sequence resets the converter to its initial state.
The result of @racket[bytes-convert-end] is two values:
@itemize[
@item{@racket[_result-bstr] or @racket[_dest-wrote-amt] --- a byte string if
@racket[dest-bstr] is @racket[#f] or not provided, or the number of
bytes written into @racket[dest-bstr] otherwise.}
@item{@indexed-racket['complete] or @indexed-racket['continues] ---
indicates whether conversion completed. If @racket['complete], then
an entire ending sequence was produced. If @racket['continues], then
the conversion could not complete due to the limit on the result
size or the space in @racket[dest-bstr], and the first result is
either an empty byte string or @racket[0].}
]
}
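A minimal sketch of terminating a conversion with
@racket[bytes-convert-end] follows, assuming one of the guaranteed
UTF-8 converters (the identifier @racket[conv] is illustrative only).
Since UTF-8 does not use shift sequences, the ending sequence is
expected to be empty:
@examples[
(define conv (bytes-open-converter "UTF-8" "UTF-8"))
(bytes-convert conv #"Apple")
(bytes-convert-end conv)
(bytes-close-converter conv)
]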
@defproc[(bytes-converter? [v any/c]) boolean?]{
Returns @racket[#t] if @racket[v] is a @tech{byte converter} produced
by @racket[bytes-open-converter], @racket[#f] otherwise.
@examples[
(bytes-converter? (bytes-open-converter "UTF-8" "UTF-16"))
(bytes-converter? (bytes-open-converter "whacky" "not likely"))
(define b (bytes-open-converter "UTF-8" "UTF-16"))
(bytes-close-converter b)
(bytes-converter? b)
]}
@defproc[(locale-string-encoding) any]{
Returns a string for the current locale's encoding (i.e., the encoding
normally identified by @racket[""]). See also
@racket[system-language+country].}
@section{Additional Byte String Functions}
@note-lib[racket/bytes]
@(define string-eval (make-base-eval))
@examples[#:hidden #:eval string-eval (require racket/bytes racket/list)]
@defproc[(bytes-append* [str bytes?] ... [strs (listof bytes?)]) bytes?]{
@; Note: this is exactly the same description as the one for append*
Like @racket[bytes-append], but the last argument is used as a list
of arguments for @racket[bytes-append], so @racket[(bytes-append*
str ... strs)] is the same as @racket[(apply bytes-append str
... strs)]. In other words, the relationship between
@racket[bytes-append] and @racket[bytes-append*] is similar to the
one between @racket[list] and @racket[list*].
@mz-examples[#:eval string-eval
(bytes-append* #"a" #"b" '(#"c" #"d"))
(bytes-append* (cdr (append* (map (lambda (x) (list #", " x))
'(#"Alpha" #"Beta" #"Gamma")))))
]}
@defproc[(bytes-join [strs (listof bytes?)] [sep bytes?]) bytes?]{
Appends the byte strings in @racket[strs], inserting @racket[sep] between
each pair of byte strings in @racket[strs].
@mz-examples[#:eval string-eval
(bytes-join '(#"one" #"two" #"three" #"four") #" potato ")
]}
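A few boundary cases of @racket[bytes-join] are sketched below: an
empty list, a single-element list (where no separator is inserted), and
an empty separator:
@mz-examples[#:eval string-eval
(bytes-join '() #", ")
(bytes-join '(#"one") #", ")
(bytes-join '(#"a" #"b" #"c") #"")
]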
@close-eval[string-eval]