Add a tutorial to the FFI overview
parent
36d3745d4c
commit
e71abf5aba
|
@ -15,6 +15,8 @@ interface}. Furthermore, since most APIs consist mostly of functions,
|
|||
the foreign interface is sometimes called a @defterm{foreign function
|
||||
interface}, abbreviated @deftech{FFI}.
|
||||
|
||||
@;------------------------------------------------------------------------
|
||||
|
||||
@table-of-contents[]
|
||||
|
||||
@include-section["intro.scrbl"]
|
||||
|
@ -25,4 +27,11 @@ interface}, abbreviated @deftech{FFI}.
|
|||
@include-section["misc.scrbl"]
|
||||
@include-section["unexported.scrbl"]
|
||||
|
||||
@(bibliography
|
||||
(bib-entry #:key "Barzilay04"
|
||||
#:author "Eli Barzilay and Dmitry Orlovsky"
|
||||
#:title "Foreign Interface for PLT Scheme"
|
||||
#:location "Workshop on Scheme and Functional Programming"
|
||||
#:date "2004"))
|
||||
|
||||
@index-section[]
|
||||
|
|
|
@ -1,5 +1,18 @@
|
|||
#lang scribble/doc
|
||||
@(require "utils.rkt")
|
||||
@(require "utils.rkt"
|
||||
scribble/racket
|
||||
(for-syntax racket/base)
|
||||
(for-label ffi/unsafe/define))
|
||||
|
||||
@(define-syntax _MEVENT (make-element-id-transformer
|
||||
(lambda (stx) #'@schemeidfont{_MEVENT})))
|
||||
@(define-syntax _MEVENT-pointer (make-element-id-transformer
|
||||
(lambda (stx) #'@schemeidfont{_MEVENT-pointer})))
|
||||
@(define-syntax _WINDOW-pointer (make-element-id-transformer
|
||||
(lambda (stx) #'@schemeidfont{_WINDOW-pointer})))
|
||||
@(define-syntax _mmask_t (make-element-id-transformer
|
||||
(lambda (stx) #'@schemeidfont{_mmask_t})))
|
||||
|
||||
|
||||
@title[#:tag "intro"]{Overview}
|
||||
|
||||
|
@ -20,5 +33,482 @@ responsibility to provide a safe interface. If your library provides
|
|||
an unsafe interface, then it should have @racketidfont{unsafe} in its
|
||||
name, too.
|
||||
|
||||
For examples of common FFI usage patterns, see the defined interfaces
|
||||
in the @filepath{ffi} collection.
|
||||
For more information on the motivation and design of the Racket FFI,
|
||||
see @cite["Barzilay04"].
|
||||
|
||||
@; --------------------------------------------------
|
||||
|
||||
@section{Libraries, C Types, and Objects}
|
||||
|
||||
To use the FFI, you must have in mind
|
||||
|
||||
@itemlist[
|
||||
|
||||
@item{a particular library from which you want to access a function
|
||||
or value, }
|
||||
|
||||
@item{a particular symbol exported by the file, and}
|
||||
|
||||
@item{the C-level type (typically a function type) of the exported
|
||||
symbol.}
|
||||
|
||||
]
|
||||
|
||||
The library corresponds to a file with a suffix such as
|
||||
@filepath{.dll}, @filepath{.so}, or @filepath{.dylib} (depending on
|
||||
the platform), or it might be a library within a @filepath{.framework}
|
||||
directory on Mac OS X.
|
||||
|
||||
Knowing the library's name and/or path is often the trickiest part of
|
||||
using the FFI. Sometimes, when using a library name without a path
|
||||
prefix or file suffix, the library file can be located automatically,
|
||||
especially on Unix. See @racket[ffi-lib] for advice.
|
||||
|
||||
The @racket[ffi-lib] function gets a handle to a library. To extract
|
||||
exports of the library, it's simplest to use
|
||||
@racket[define-ffi-definer] from the @racketmodname[ffi/unsafe/define]
|
||||
library:
|
||||
|
||||
@racketmod[
|
||||
racket/base
|
||||
(require ffi/unsafe
|
||||
ffi/unsafe/define)
|
||||
|
||||
(define-ffi-definer define-curses "libcurses")
|
||||
]
|
||||
|
||||
This @racket[define-ffi-definer] declaration introduces a
|
||||
@racket[define-curses] form for binding a Racket name to a value
|
||||
extracted from @filepath{libcurses}---which might be located
|
||||
at @filepath{/usr/lib/libcurses.so}, depending on
|
||||
the platform.
|
||||
|
||||
To use @racket[define-curses], we need the names and C types of
|
||||
functions from @filepath{libcurses}. We'll start by using the
|
||||
following functions:
|
||||
|
||||
@verbatim[#:indent 2]{
|
||||
WINDOW* initscr(void);
|
||||
int waddstr(WINDOW *win, char *str);
|
||||
int wrefresh(WINDOW *win);
|
||||
int endwin(void);
|
||||
}
|
||||
|
||||
We make these functions callable from Racket as follows:
|
||||
|
||||
@margin-note{By convention, an underscore prefix
|
||||
indicates a representation of a C type (such as @racket[_int]) or a
|
||||
constructor of such representations (such as @racket[_cpointer]).}
|
||||
|
||||
@racketblock[
|
||||
(define _WINDOW-pointer (_cpointer 'WINDOW))
|
||||
|
||||
(define-curses initscr (_fun -> _WINDOW-pointer))
|
||||
(define-curses waddstr (_fun _WINDOW-pointer _string -> _int))
|
||||
(define-curses wrefresh (_fun _WINDOW-pointer -> _int))
|
||||
(define-curses endwin (_fun -> _int))
|
||||
]
|
||||
|
||||
The definition of @racket[_WINDOW-pointer] creates a Racket value that
|
||||
reflects a C type via @racket[_cpointer], which creates a type
|
||||
representation for a pointer type---usually one that is opaque. The
|
||||
@racket['WINDOW] argument could have been any value, but by
|
||||
convention, we use a symbol matching the C base type.
|
||||
|
||||
Each @racket[define-curses] form uses the given identifier as both the
|
||||
name of the library export and the Racket identifier to
|
||||
bind.@margin-note*{An optional @racket[#:c-id] clause for
|
||||
@racket[define-curses] can specify a name for the library export that
|
||||
is different from the Racket identifier to bind.} The @racket[(_fun
|
||||
... -> ...)] part of each definition describes the C type of the
|
||||
exported function, since the library file does not encode that
|
||||
information for its exports. The types listed to the left of @racket[->] are the
|
||||
argument types, while the type to the right of @racket[->] is the
|
||||
result type. The pre-defined @racket[_int] type naturally corresponds
|
||||
to the @tt{int} C type, while @racket[_string] corresponds to the
|
||||
@tt{char*} type when it is intended as a string to read.
|
||||
|
||||
At this point, @racket[initscr], @racket[waddstr], @racket[wrefresh],
|
||||
and @racket[endwin] are normal Racket bindings to Racket functions
|
||||
(that happen to call C functions), and so they can be exported from
|
||||
the defining module or called directly:
|
||||
|
||||
@racketblock[
|
||||
(define win (initscr))
|
||||
(void (waddstr win "Hello"))
|
||||
(void (wrefresh win))
|
||||
(sleep 1)
|
||||
(void (endwin))
|
||||
]
|
||||
|
||||
@; --------------------------------------------------
|
||||
|
||||
@section{Function-Type Bells and Whistles}
|
||||
|
||||
Our initial use of functions like @racket[waddstr] is sloppy, because
|
||||
we ignore return codes. C functions often return error
|
||||
codes, and checking them is a pain. A better approach is to build the
|
||||
check into the @racket[waddstr] binding and raise an exception when
|
||||
the code is non-zero.
|
||||
|
||||
The @racket[_fun] function-type constructor includes many options to
|
||||
help convert C functions to nicer Racket functions. We can use some of
|
||||
those features to convert return codes into either @|void-const| or an
|
||||
exception:
|
||||
|
||||
@racketblock[
|
||||
(define (check v who)
|
||||
(unless (zero? v)
|
||||
(error who "failed: ~a" v)))
|
||||
|
||||
(define-curses initscr (_fun -> _WINDOW-pointer))
|
||||
(define-curses waddstr (_fun _WINDOW-pointer _string -> (r : _int)
|
||||
-> (check r 'waddstr)))
|
||||
(define-curses wrefresh (_fun _WINDOW-pointer -> (r : _int)
|
||||
-> (check r 'wrefresh)))
|
||||
(define-curses endwin (_fun -> (r : _int)
|
||||
-> (check r 'endwin)))
|
||||
]
|
||||
|
||||
Using @racket[(r : _int)] as a result type gives the local name
|
||||
@racket[r] to the C function's result. This name is then used in the
|
||||
result post-processing expression that is specified after a second
|
||||
@racket[->] in the @racket[_fun] form.
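
The behavior of the @racket[check] helper can be seen in isolation,
without loading @filepath{libcurses}. The following is a minimal sketch
that assumes only @racketmodname[racket/base]; the return codes
@racket[0] and @racket[-1] are made-up stand-ins for values that a C
function might return:

@racketmod[
racket/base

(code:comment @#,t{same helper as above: zero means success})
(define (check v who)
  (unless (zero? v)
    (error who "failed: ~a" v)))

(code:comment @#,t{a zero code produces no exception and no result})
(check 0 'waddstr)

(code:comment @#,t{a non-zero code raises; print the message instead of escaping})
(with-handlers ([exn:fail? (lambda (e) (displayln (exn-message e)))])
  (check -1 'waddstr))
]

The second call prints @tt{waddstr: failed: -1}, which is the message
that a caller of the checked @racket[waddstr] binding would see in the
exception.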
|
||||
|
||||
@; --------------------------------------------------
|
||||
|
||||
@section{By-Reference Arguments}
|
||||
|
||||
To get mouse events from @filepath{libcurses}, we must explicitly
|
||||
enable them through the @racket[mousemask] function:
|
||||
|
||||
@verbatim[#:indent 2]{
|
||||
typedef unsigned long mmask_t;
|
||||
#define BUTTON1_CLICKED 004L
|
||||
|
||||
mmask_t mousemask(mmask_t newmask, mmask_t *oldmask);
|
||||
}
|
||||
|
||||
Setting @racket[BUTTON1_CLICKED] in the mask enables button-click
|
||||
events. At the same time, @racket[mousemask] returns the current mask
|
||||
by installing it into the pointer provided as its second
|
||||
argument.
|
||||
|
||||
Since these kinds of call-by-reference interfaces are common in C,
|
||||
@racket[_fun] cooperates with a @racket[_ptr] form to automatically
|
||||
allocate space for a by-reference argument and extract the value put
|
||||
there by the C function. Give the extracted value a name to use in the
|
||||
post-processing expression. The post-processing expression can combine
|
||||
the by-reference result with the function's direct result (which, in
|
||||
this case, reports a subset of the given mask that is actually
|
||||
supported).
|
||||
|
||||
@racketblock[
|
||||
(define _mmask_t _ulong)
|
||||
(define-curses mousemask (_fun _mmask_t (o : (_ptr o _mmask_t))
|
||||
-> (r : _mmask_t)
|
||||
-> (values o r)))
|
||||
(define BUTTON1_CLICKED #o004)
|
||||
|
||||
(define-values (old supported) (mousemask BUTTON1_CLICKED))
|
||||
]
|
||||
|
||||
@; --------------------------------------------------
|
||||
|
||||
@section{C Structs}
|
||||
|
||||
Assuming that mouse events are supported, the @filepath{libcurses}
|
||||
library reports them via @racket[getmouse], which accepts a pointer to
|
||||
a @cpp{MEVENT} struct to fill with mouse-event information:
|
||||
|
||||
@verbatim[#:indent 2]{
|
||||
typedef struct {
|
||||
short id;
|
||||
int x, y, z;
|
||||
mmask_t bstate;
|
||||
} MEVENT;
|
||||
|
||||
int getmouse(MEVENT *event);
|
||||
}
|
||||
|
||||
To work with @cpp{MEVENT} values, we use @racket[define-cstruct]:
|
||||
|
||||
@racketblock[
|
||||
(define-cstruct _MEVENT ([id _short]
|
||||
[x _int]
|
||||
[y _int]
|
||||
[z _int]
|
||||
[bstate _mmask_t]))
|
||||
]
|
||||
|
||||
This definition binds many names in the same way that
|
||||
@racket[define-struct] binds many names: @racket[_MEVENT] is a C type
|
||||
representing the struct type, @racket[_MEVENT-pointer] is a C type
|
||||
representing a pointer to a @racket[_MEVENT], @racket[make-MEVENT]
|
||||
constructs a @racket[_MEVENT] value, @racket[MEVENT-x] extracts
|
||||
the @racket[x] field from an @racket[_MEVENT] value, and so on.
|
||||
|
||||
With this C struct declaration, we can define the function type for
|
||||
@racket[getmouse]. The simplest approach is to define
|
||||
@racket[getmouse] to accept an @racket[_MEVENT-pointer], and then explicitly
|
||||
allocate the @racket[_MEVENT] value before calling @racket[getmouse]:
|
||||
|
||||
@racketblock[
|
||||
(define-curses getmouse (_fun _MEVENT-pointer -> _int))
|
||||
|
||||
(define m (make-MEVENT 0 0 0 0 0))
|
||||
(when (zero? (getmouse m))
|
||||
(code:comment @#,t{use @racket[m]...})
|
||||
....)
|
||||
]
|
||||
|
||||
For a more Racket-like function, use @racket[(_ptr o _MEVENT)] and a
|
||||
post-processing expression:
|
||||
|
||||
@racketblock[
|
||||
(define-curses getmouse (_fun (m : (_ptr o _MEVENT))
|
||||
-> (r : _int)
|
||||
-> (and (zero? r) m)))
|
||||
|
||||
(waddstr win (format "click me fast..."))
|
||||
(wrefresh win)
|
||||
(sleep 1)
|
||||
|
||||
(define m (getmouse))
|
||||
(when m
|
||||
(waddstr win (format "at ~a,~a"
|
||||
(MEVENT-x m)
|
||||
(MEVENT-y m)))
|
||||
(wrefresh win)
|
||||
(sleep 1))
|
||||
|
||||
(endwin)
|
||||
]
|
||||
|
||||
The difference between @racket[_MEVENT-pointer] and @racket[_MEVENT]
|
||||
is crucial. Using @racket[(_ptr o _MEVENT-pointer)] would allocate
|
||||
only enough space for a pointer to an @cpp{MEVENT} struct, which is
|
||||
not enough space for an @cpp{MEVENT} struct.
|
||||
|
||||
@; --------------------------------------------------
|
||||
|
||||
@section{Pointers and Manual Allocation}
|
||||
|
||||
To get text from the user instead of a mouse click, @filepath{libcurses}
|
||||
provides @racket[wgetnstr]:
|
||||
|
||||
@verbatim[#:indent 2]{
|
||||
int wgetnstr(WINDOW *win, char *str, int n);
|
||||
}
|
||||
|
||||
While the @cpp{char*} argument to @racket[waddstr] is treated as a
|
||||
nul-terminated string, the @cpp{char*} argument to @racket[wgetnstr]
|
||||
is treated as a buffer whose size is indicated by the final @cpp{int}
|
||||
argument. The C type @racket[_string] does not work for such
|
||||
buffers.
|
||||
|
||||
One way to approach this function from Racket is to describe the
|
||||
arguments in their rawest form, using plain @racket[_pointer] for the
|
||||
second argument to @racket[wgetnstr]:
|
||||
|
||||
@racketblock[
|
||||
(define-curses wgetnstr (_fun _WINDOW-pointer _pointer _int
|
||||
-> _int))
|
||||
]
|
||||
|
||||
To call this raw version of @racket[wgetnstr], allocate memory, zero
|
||||
it, and pass the size minus one (to leave room for a nul
|
||||
terminator) to @racket[wgetnstr]:
|
||||
|
||||
@racketblock[
|
||||
(define SIZE 256)
|
||||
(define buffer (malloc 'raw SIZE))
|
||||
(memset buffer 0 SIZE)
|
||||
|
||||
(void (wgetnstr win buffer (sub1 SIZE)))
|
||||
]
|
||||
|
||||
When @racket[wgetnstr] returns, it has written bytes to
|
||||
@racket[buffer]. At that point, we can use @racket[cast] to convert the
|
||||
value from a raw pointer to a string:
|
||||
|
||||
@racketblock[
|
||||
(cast buffer _pointer _string)
|
||||
]
|
||||
|
||||
Conversion via the @racket[_string] type causes the data referenced by
|
||||
the original pointer to be copied (and UTF-8 decoded), so the memory
|
||||
referenced by @racket[buffer] is no longer needed. Memory allocated
|
||||
with @racket[(malloc 'raw ...)] must be released with @racket[free]:
|
||||
|
||||
@racketblock[
|
||||
(free buffer)
|
||||
]
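
The allocate, fill, convert, and free steps can be tried without
@filepath{libcurses}. The following is a minimal sketch in which
@racket[memcpy] stands in for @racket[wgetnstr] by writing bytes into
the buffer; the @racket[SIZE] value and the @racket[#"hello"] contents
are arbitrary choices for illustration:

@racketmod[
racket/base
(require ffi/unsafe)

(define SIZE 16)

(code:comment @#,t{raw memory is invisible to the GC and must be freed by hand})
(define buffer (malloc 'raw SIZE))
(memset buffer 0 SIZE)

(code:comment @#,t{stand-in for the C function: write five bytes, leaving the nul terminator})
(memcpy buffer #"hello" 5)

(code:comment @#,t{the conversion copies and decodes the bytes, so the buffer can be released})
(define s (cast buffer _pointer _string))
(free buffer)

s
]

Because the string is copied before @racket[free] is called,
@racket[s] remains valid after the buffer is released.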
|
||||
|
||||
@; --------------------------------------------------
|
||||
|
||||
@section{Pointers and GC-Managed Allocation}
|
||||
|
||||
Instead of allocating @racket[buffer] with @racket[(malloc 'raw ...)],
|
||||
we could have allocated it with @racket[(malloc 'atomic ...)]:
|
||||
|
||||
@racketblock[
|
||||
(define buffer (malloc 'atomic SIZE))
|
||||
]
|
||||
|
||||
Memory allocated with @racket['atomic] is managed by the garbage
|
||||
collector, so @racket[free] is neither necessary nor allowed when the
|
||||
memory referenced by @racket[buffer] is no longer needed. Instead,
|
||||
when @racket[buffer] becomes inaccessible, the allocated memory will
|
||||
be reclaimed automatically.
|
||||
|
||||
Allowing the garbage collector (GC) to manage memory is usually
|
||||
preferable. It's easy to forget to call @racket[free], and exceptions
|
||||
or thread termination can easily skip a @racket[free].
|
||||
|
||||
At the same time, using GC-managed memory adds a different burden on
|
||||
the programmer: data managed by the GC may be moved to a new address
|
||||
as the GC compacts allocated objects to avoid fragmentation. C
|
||||
functions, meanwhile, expect to receive pointers to objects that will
|
||||
stay put.
|
||||
|
||||
Fortunately, unless a C function calls back into the Racket run-time
|
||||
system (perhaps through a function that is provided as an argument),
|
||||
no garbage collection will happen between the time that a C function
|
||||
is called and the time that the function returns.
|
||||
|
||||
Let's look at a few possibilities related to allocation and pointers:
|
||||
|
||||
@itemlist[
|
||||
|
||||
@item{Ok:
|
||||
|
||||
@racketblock[
|
||||
(define p (malloc 'atomic SIZE))
|
||||
(wgetnstr win p (sub1 SIZE))
|
||||
]
|
||||
|
||||
Although the data allocated by @racket[malloc] can move
|
||||
around, @racket[p] will always point to it, and no garbage collection
|
||||
will happen between the time that the address is extracted from
|
||||
@racket[p] to pass to @racket[wgetnstr] and the time that
|
||||
@racket[wgetnstr] returns.}
|
||||
|
||||
@item{Bad:
|
||||
|
||||
@racketblock[
|
||||
(define p (malloc 'atomic SIZE))
|
||||
(define i (cast p _pointer _intptr))
|
||||
(wgetnstr win (cast i _intptr _pointer) (sub1 SIZE))
|
||||
]
|
||||
|
||||
The data referenced by @racket[p] can move after the
|
||||
address is converted to an integer, in which case @racket[i] cast
|
||||
back to a pointer will be the wrong address.
|
||||
|
||||
Obviously, casting a pointer to an integer is generally a bad idea,
|
||||
but the cast simulates another possibility, which is passing the
|
||||
pointer to a C function that retains the pointer in its own private
|
||||
store for later use. Such private storage is invisible to the Racket
|
||||
GC, so it has the same effect as casting the pointer to an integer.}
|
||||
|
||||
@item{Ok:
|
||||
|
||||
@racketblock[
|
||||
(define p (malloc 'atomic SIZE))
|
||||
(define p2 (ptr-add p 4))
|
||||
(wgetnstr win p2 (- SIZE 5))
|
||||
]
|
||||
|
||||
The pointer @racket[p2] retains the original reference and
|
||||
only adds the @racket[4] at the last minute before calling
|
||||
@racket[wgetnstr] (i.e., after the point that garbage collection is
|
||||
allowed).}
|
||||
|
||||
@item{Ok:
|
||||
|
||||
@racketblock[
|
||||
(define p (malloc 'atomic-interior SIZE))
|
||||
(define i (cast p _pointer _intptr))
|
||||
(wgetnstr win (cast i _intptr _pointer) (sub1 SIZE))
|
||||
]
|
||||
|
||||
This is ok assuming that @racket[p] itself stays accessible, so that
|
||||
the data it references isn't reclaimed. Allocating with
|
||||
@racket['atomic-interior] puts data at a particular address and
|
||||
keeps it there. A garbage collection will not change the address in
|
||||
@racket[p], and so @racket[i] (cast back to a pointer) will always
|
||||
refer to the data.}
|
||||
|
||||
]
|
||||
|
||||
Keep in mind that C struct constructors like @racket[make-MEVENT] are
|
||||
effectively the same as @racket[(malloc 'atomic ...)]; the result values
|
||||
can move in memory during a garbage collection. The same is true of
|
||||
byte strings allocated with @racket[make-bytes], which (as a
|
||||
convenience) can be used directly as a pointer value (unlike character
|
||||
strings, which are always copied for UTF-8 encoding or decoding).
|
||||
|
||||
For more information about memory management and garbage collection,
|
||||
see @secref[#:doc InsideRacket-doc "im:memoryalloc"] in
|
||||
@|InsideRacket|.
|
||||
|
||||
@; --------------------------------------------------
|
||||
|
||||
@section{Reliable Release of Resources}
|
||||
|
||||
Using GC-managed memory saves you from manual @racket[free]s for plain
|
||||
memory blocks, but C libraries often allocate resources and require a
|
||||
matching call to a function that releases the resources. For example,
|
||||
@filepath{libcurses} supports windows on the screen that
|
||||
are created with @racket[newwin] and released with @racket[delwin]:
|
||||
|
||||
@verbatim[#:indent 2]{
|
||||
WINDOW *newwin(int lines, int ncols, int y, int x);
|
||||
int delwin(WINDOW *win);
|
||||
}
|
||||
|
||||
In a sufficiently complex program, ensuring that every @racket[newwin]
|
||||
is paired with @racket[delwin] can be challenging, especially if the
|
||||
functions are wrapped by otherwise safe functions that are provided
|
||||
from a library. A library that is intended to be safe for use in a
|
||||
sandbox, say, must protect against resource leaks within the Racket
|
||||
process as a whole when a sandboxed program misbehaves or is
|
||||
terminated.
|
||||
|
||||
The @racketmodname[ffi/unsafe/alloc] library provides functions to
|
||||
connect resource-allocating functions and resource-releasing
|
||||
functions. The library then arranges for finalization to release a resource if
|
||||
it becomes inaccessible (according to the GC) before it is explicitly
|
||||
released. At the same time, the library handles tricky atomicity
|
||||
requirements to ensure that the finalization is properly registered
|
||||
and never run multiple times.
|
||||
|
||||
Using @racketmodname[ffi/unsafe/alloc], the @racket[newwin] and
|
||||
@racket[delwin] functions can be imported with @racket[allocator]
|
||||
and @racket[deallocator] wrappers, respectively:
|
||||
|
||||
@racketblock[
|
||||
(require ffi/unsafe/alloc)
|
||||
|
||||
(define-curses delwin (_fun _WINDOW-pointer -> _int)
|
||||
#:wrap (deallocator))
|
||||
|
||||
(define-curses newwin (_fun _int _int _int _int
|
||||
-> _WINDOW-pointer)
|
||||
#:wrap (allocator delwin))
|
||||
]
|
||||
|
||||
A @racket[deallocator] wrapper makes a function cancel any existing
|
||||
finalizer for the function's argument. An @racket[allocator] wrapper
|
||||
refers to the deallocator, so that the deallocator can be run if
|
||||
necessary by a finalizer.
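
The pairing can also be exercised with ordinary Racket procedures,
since @racket[allocator] and @racket[deallocator] wrap any procedure.
The following is a minimal sketch in which the hypothetical names
@racket[acquire], @racket[release!], and @racket[released] stand in for
@racket[newwin], @racket[delwin], and the C library's internal state:

@racketmod[
racket/base
(require ffi/unsafe/alloc)

(code:comment @#,t{records every resource that has been released})
(define released '())

(code:comment @#,t{stand-in for a C release function such as delwin})
(define release!
  ((deallocator)
   (lambda (r) (set! released (cons r released)))))

(code:comment @#,t{stand-in for a C allocation function such as newwin})
(define acquire
  ((allocator release!)
   (lambda () (box 'resource))))

(define r (acquire))
(code:comment @#,t{an explicit release cancels the finalizer registered by acquire})
(release! r)

released
]

After the explicit call to @racket[release!], the @racket[released]
list holds exactly one entry, and the finalizer for @racket[r] has been
cancelled. If @racket[r] had instead become inaccessible without an
explicit release, a finalizer would eventually call @racket[release!]
on it.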
|
||||
|
||||
@; ------------------------------------------------------------
|
||||
|
||||
@section{More Examples}
|
||||
|
||||
For more examples of common FFI patterns, see the defined interfaces
|
||||
in the @filepath{ffi/examples} collection. See also @cite["Barzilay04"].
|
||||
|
||||
|
||||
|
|
|
@ -12,7 +12,7 @@
|
|||
ffi/vector))
|
||||
|
||||
(provide cpp
|
||||
InsideRacket
|
||||
InsideRacket InsideRacket-doc
|
||||
guide.scrbl
|
||||
(all-from-out scribble/manual)
|
||||
(for-label (all-from-out racket/base
|
||||
|
@ -21,8 +21,10 @@
|
|||
ffi/unsafe/cvector
|
||||
ffi/vector)))
|
||||
|
||||
(define InsideRacket-doc '(lib "scribblings/inside/inside.scrbl"))
|
||||
|
||||
(define InsideRacket
|
||||
(other-manual '(lib "scribblings/inside/inside.scrbl")))
|
||||
(other-manual InsideRacket-doc))
|
||||
|
||||
(define guide.scrbl
|
||||
'(lib "scribblings/guide/guide.scrbl"))
|
||||
|
|