ffi/unsafe: add _bytes/nul-terminated

Add a `_bytes` variant type that will work more consistently with
Racket-on-Chez, where the representation of a byte string does not
include an implicit nul terminator.
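
A minimal usage sketch of the new type, not taken from the commit itself. It assumes that `strlen` resolves through the default (`#f`) library, as `strdup` does in this commit's test, and that the buffer `p` stands in for memory returned by foreign code:

```racket
#lang racket/base
(require ffi/unsafe)

;; Assumption: `strlen` is reachable through the default (#f) library.
(define strlen
  (get-ffi-obj 'strlen #f (_fun _bytes/nul-terminated -> _size)))

;; Argument direction: the byte string is copied with an explicit nul
;; terminator appended, so the C side always sees a terminated string.
(define n (strlen #"howdy..."))

;; Result direction: a nul-terminated C buffer is copied into a fresh
;; Racket byte string, so the buffer can be freed afterwards.
(define p (malloc 9 'raw))
(memcpy p #"hi, all!\0" 9)
(define copied (cast p _pointer _bytes/nul-terminated))
(free p)
```

Because `copied` is a fresh byte string, it stays valid after `(free p)`; a plain `_bytes` result may share memory with the C buffer in the current implementation.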
Matthew Flatt 2018-01-19 13:55:46 -07:00
parent d7421b5dc0
commit 62b8ca3ca7
6 changed files with 93 additions and 25 deletions

View File

@ -12,7 +12,7 @@
(define collection 'multi)
(define version "6.12.0.1")
(define version "6.12.0.2")
(define deps `("racket-lib"
["racket" #:version ,version]))

View File

@ -394,12 +394,13 @@ procedure would then not close over @racket[b].)}
bytes?]{
Returns a byte string made of the given pointer and the given length.
No copying is done. This can be used as an alternative to make
pointer values accessible in Racket when the size is known.
No copying is performed. Beware that future implementations of Racket
may not support this function (in case of a byte string representation
that combines a size and byte-string content without an indirection).
Beware that the representation of a Racket byte string normally
Beware also that the representation of a Racket byte string normally
requires a nul terminator at the end of the byte string (after
@racket[length] bytes), but some function work with a byte-string
@racket[length] bytes), but some functions work with a byte-string
representation that has no such terminator---notably
@racket[bytes-copy].

View File

@ -264,21 +264,16 @@ inputs.}
@subsection{Primitive String Types}
The five primitive string types correspond to cases where a C
representation matches Racket's representation without encodings.
The form @racket[_bytes] form can be used type for Racket byte
strings, which corresponds to C's @cpp{char*} type. In addition to
translating byte strings, @racket[#f] corresponds to the @cpp{NULL}
pointer.
See also @racket[_bytes/nul-terminated] and @racket[_bytes] for
converting between byte strings and C's @cpp{char*} type.
@deftogether[(
@defthing[_string/ucs-4 ctype?]
)]{
A type for Racket's native Unicode strings, which are in UCS-4 format.
These correspond to the C @cpp{mzchar*} type used by Racket. As usual, the types
treat @racket[#f] as @cpp{NULL} and vice versa.}
These correspond to the C @cpp{mzchar*} type used by Racket's C API.
As usual, the type treats @racket[#f] as @cpp{NULL} and vice versa.}
@deftogether[(
@ -291,8 +286,9 @@ Unicode strings in UTF-16 format. As usual, the types treat
@defthing[_path ctype?]{
Simple @cpp{char*} strings, corresponding to Racket's paths. As usual,
the types treat @racket[#f] as @cpp{NULL} and vice versa.
Simple @cpp{char*} strings, corresponding to Racket's @tech[#:doc
reference.scrbl]{path or string}. As usual, the type treats
@racket[#f] as @cpp{NULL} and vice versa.
Beware that changing the current directory via
@racket[current-directory] does not change the OS-level current
@ -331,7 +327,7 @@ Racket paths are converted using @racket[path->bytes].}
@subsection{Variable Auto-Converting String Type}
The @racket[_string/ucs-4] type is rarely useful when interacting with
foreign code, while using @racket[_bytes] is somewhat unnatural, since
foreign code, while using @racket[_bytes/nul-terminated] is somewhat unnatural, since
it forces Racket programmers to use byte strings. Using
@racket[_string/utf-8], etc., meanwhile, may prematurely commit to a
particular encoding of strings as bytes. The @racket[_string] type
@ -1083,13 +1079,55 @@ See @racket[_list] for more explanation about the examples.}
[_bytes
(_bytes o len-expr)]]{
A @tech{custom function type} that can be used by itself as a simple
type for a byte string as a C pointer. Coercion of a C pointer to
simply @racket[_bytes] (without a specified length) requires that the pointer
refers to a nul-terminated byte string. When the length-specifying form is used
for a function argument, a byte string is allocated with the given
length, including an extra byte for the nul terminator.}
The @racket[_bytes] form by itself corresponds to C's @cpp{char*}
type; a byte string is passed as @racket[_bytes] without any
copying. In the current Racket implementation, a Racket byte string is
normally nul terminated implicitly, but a future implementation of
Racket may not include an implicit nul terminator for byte strings.
See also @racket[_bytes/nul-terminated].
In the current Racket implementation, as a @racket[_bytes] result, a
non-NULL C @cpp{char*} is wrapped as a Racket byte string without
copying; future Racket implementations may require copying to
represent a C @cpp{char*} result as a Racket byte string. The C result
must have a nul terminator to determine the Racket byte string's
length.
A @racket[(_bytes o len-expr)] form is a @tech{custom function type}.
As an argument, a byte string is allocated with the given length; in
the current Racket implementation, that byte string includes an extra
byte for the nul terminator (but, again, a future Racket
implementation may not behave that way). As a result type, @racket[(_bytes
o len-expr)] wraps a non-NULL C @cpp{char*} pointer as a byte string of
the given length (but, again, a future Racket implementation may copy
the indicated number of bytes to a fresh byte string).
As usual, @racket[_bytes] treats @racket[#f] as @cpp{NULL} and vice
versa. As a result type, @racket[(_bytes o len-expr)] works only for
non-NULL results.}
@defform*[#:id _bytes/nul-terminated
#:literals (o)
[_bytes/nul-terminated
(_bytes/nul-terminated o len-expr)]]{
The @racket[_bytes/nul-terminated] type is like @racket[_bytes], but
an explicit nul-terminator byte is added to a byte-string argument,
which implies copying. As a result type, a @cpp{char*} is copied to a
fresh byte string (without an explicit nul terminator).
When @racket[(_bytes/nul-terminated o len-expr)] is used as an argument
type, a byte string of length @racket[len-expr] is allocated. Similarly, when
@racket[(_bytes/nul-terminated o len-expr)] is used as a result type, a @cpp{char*}
result is copied to a fresh byte string of length @racket[len-expr].
As usual, @racket[_bytes/nul-terminated] treats @racket[#f] as
@cpp{NULL} and vice versa. As a result type,
@racket[(_bytes/nul-terminated o len-expr)] works only for non-NULL
results.
@history[#:added "6.12.0.2"]}
@; ------------------------------------------------------------

View File

@ -532,6 +532,17 @@
(ptr-set! v _pointer (ptr-add #f 107))
(test 107 ptr-ref v _intptr))
;; Test _bytes and _bytes/nul-terminated
(let ([p (malloc 8)])
(memcpy p #"hi, all!" 8)
(test #"hi, all!" cast p _pointer _bytes)
(test #"hi, all!" cast p _pointer _bytes/nul-terminated))
(let* ([strdup (get-ffi-obj 'strdup #f (_fun _bytes/nul-terminated -> _pointer))]
[p (strdup #"howdy...")])
(test #"howdy..." cast p _pointer _bytes)
(test #"howdy..." cast p _pointer _bytes/nul-terminated)
(free p))
;; Test equality and hashing of c pointers:
(let ([seventeen1 (cast 17 _intptr _pointer)]
[seventeen2 (cast 17 _intptr _pointer)]

View File

@ -1069,6 +1069,24 @@
[(_ . xs) (_bytes . xs)]
[_ _bytes]))
;; _bytes/nul-terminated copies and includes a nul terminator in a
;; way that will be more consistent across Racket implementations
(define _bytes/nul-terminated
(make-ctype _bytes
(lambda (bstr) (and bstr (bytes-append bstr #"\0")))
(lambda (bstr) (and bstr (bytes-copy bstr)))))
(provide (rename-out [_bytes/nul-terminated* _bytes/nul-terminated]))
(define-fun-syntax _bytes/nul-terminated*
(syntax-id-rules (o)
[(_ o n) (type: _pointer
pre: (make-bytes n)
;; post is needed when this is used as a function output type
post: (x => (let ([s (make-bytes n)])
(memcpy s x n)
s)))]
[(_ . xs) (_bytes/nul-terminated . xs)]
[_ _bytes/nul-terminated]))
;; (_array <type> <len> ...+)
(provide _array
array? array-length array-ptr array-type

View File

@ -13,12 +13,12 @@
consistently.)
*/
#define MZSCHEME_VERSION "6.12.0.1"
#define MZSCHEME_VERSION "6.12.0.2"
#define MZSCHEME_VERSION_X 6
#define MZSCHEME_VERSION_Y 12
#define MZSCHEME_VERSION_Z 0
#define MZSCHEME_VERSION_W 1
#define MZSCHEME_VERSION_W 2
#define MZSCHEME_VERSION_MAJOR ((MZSCHEME_VERSION_X * 100) + MZSCHEME_VERSION_Y)
#define MZSCHEME_VERSION_MINOR ((MZSCHEME_VERSION_Z * 1000) + MZSCHEME_VERSION_W)