#reader(lib "docreader.ss" "scribble")
@require[(lib "manual.ss" "scribble")]
@require[(lib "eval.ss" "scribble")]
@require["guide-utils.ss"]

@title[#:tag "bytes"]{Bytes and Byte Strings}

A @defterm{byte} is an exact integer between @scheme[0] and
@scheme[255], inclusive. The @scheme[byte?] predicate recognizes
numbers that represent bytes.

@examples[
(byte? 0)
(byte? 256)
]

A @defterm{byte string} is similar to a string---see
@secref["strings"]---but its content is a sequence of bytes instead of
characters. Byte strings can be used in applications that process pure
ASCII instead of Unicode text. The printed form of a byte string
supports such uses in particular, because a byte string prints like
the ASCII decoding of the byte string, but prefixed with a
@schemefont{#}. Unprintable ASCII characters or non-ASCII bytes in the
byte string are written with octal notation.

@refdetails["mz:parse-string"]{the syntax of byte strings}

@examples[
#"Apple"
(bytes-ref #"Apple" 0)
(make-bytes 3 65)
(define b (make-bytes 2 0))
b
(bytes-set! b 0 1)
(bytes-set! b 1 255)
b
]

The @scheme[display] form of a byte string writes its raw bytes to the
current output port (see @secref["output"]). Technically,
@scheme[display] of a normal (i.e., character) string prints the UTF-8
encoding of the string to the current output port, since output is
ultimately defined in terms of bytes; @scheme[display] of a byte
string, however, writes the raw bytes with no encoding. Along the same
lines, when this documentation shows output, it technically shows the
UTF-8-decoded form of the output.

@examples[
(display #"Apple")
(eval:alts (code:line (display #, @schemevalfont{"\316\273"})  (code:comment #, @t{same as @scheme["\316\273"]}))
           (display "\316\273"))
(code:line (display #"\316\273") (code:comment #, @t{UTF-8 encoding of @elem["\u03BB"]}))
]
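
One way to observe the encoding step, as a small sketch rather than a
definitive recipe, is to @scheme[display] into a byte-string port and
inspect the bytes that arrive. The example assumes the standard
@scheme[open-output-bytes] and @scheme[get-output-bytes] procedures;
the name @scheme[o] is chosen only for illustration.

@examples[
(let ([o (open-output-bytes)])
  (display "\u03BB" o)
  (get-output-bytes o))
(let ([o (open-output-bytes)])
  (display #"\316\273" o)
  (get-output-bytes o))
]

Both ports should end up holding the same two bytes, since the
character string is UTF-8 encoded on output while the byte string is
written unchanged.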

For explicitly converting between strings and byte strings, Scheme
supports three kinds of encodings directly: UTF-8, Latin-1, and the
current locale's encoding. General facilities for byte-to-byte
conversions (especially to and from UTF-8) fill the gap to support
arbitrary string encodings.

@examples[
(bytes->string/utf-8 #"\316\273")
(bytes->string/latin-1 #"\316\273")
(code:line
 (parameterize ([current-locale "C"])  (code:comment #, @elem{C locale supports ASCII,})
   (bytes->string/locale #"\316\273")) (code:comment #, @elem{only, so...}))
(let ([cvt (bytes-open-converter "cp1253" (code:comment #, @elem{Greek code page})
                                 "UTF-8")]
      [dest (make-bytes 2)])
  (bytes-convert cvt #"\353" 0 1 dest)
  (bytes-close-converter cvt)
  (bytes->string/utf-8 dest))
]
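
The conversions also run in the opposite direction. As a brief sketch,
assuming the standard @scheme[string->bytes/utf-8] and
@scheme[string->bytes/latin-1] procedures, the following encodes two
one-character strings as byte strings:

@examples[
(string->bytes/utf-8 "\u03BB")
(string->bytes/latin-1 "\353")
]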

@refdetails["mz:bytestrings"]{byte strings and byte-string procedures}