add missing doc files
svn: r18401
This commit is contained in:
parent
57ab0dee65
commit
2ddfa89a7a
182
collects/scribblings/guide/futures.scrbl
Normal file
182
collects/scribblings/guide/futures.scrbl
Normal file
|
@ -0,0 +1,182 @@
|
|||
#lang scribble/doc
|
||||
@(require scribble/manual
|
||||
"guide-utils.ss"
|
||||
(for-label scheme/flonum scheme/future))
|
||||
|
||||
@title[#:tag "effective-futures"]{Parallelism with Futures}
|
||||
|
||||
The @schememodname[scheme/future] library provides support for
|
||||
performance improvement through parallelism with the @scheme[future]
|
||||
and @scheme[touch] functions. The level of parallelism available from
|
||||
those constructs, however, is limited by several factors, and the
|
||||
current implementation is best suited to numerical tasks.
|
||||
|
||||
@margin-note{Other functions, such as @scheme[thread], support the
|
||||
creation of reliably concurrent tasks. However, thread never run truly
|
||||
in parallel, even if the hardware and operating system support
|
||||
parallelism.}
|
||||
|
||||
As a starting example, the @scheme[any-double?] function below takes a
|
||||
list of numbers and determines whether any number in the list has a
|
||||
double that is also in the list:
|
||||
|
||||
@schemeblock[
|
||||
(define (any-double? l)
|
||||
(for/or ([i (in-list l)])
|
||||
(for/or ([i2 (in-list l)])
|
||||
(= i2 (* 2 i)))))
|
||||
]
|
||||
|
||||
This function runs in quadratic time, so it can take a long time (on
|
||||
the order of a second) on large lists like @scheme[l1] and
|
||||
@scheme[l2]:
|
||||
|
||||
@schemeblock[
|
||||
(define l1 (for/list ([i (in-range 5000)])
|
||||
(+ (* 2 i) 1)))
|
||||
(define l2 (for/list ([i (in-range 5000)])
|
||||
(- (* 2 i) 1)))
|
||||
(or (any-double? l1)
|
||||
(any-double? l2))
|
||||
]
|
||||
|
||||
The best way to speed up @scheme[any-double?] is to use a different
|
||||
algorithm. However, on a machine that offers at least two processing
|
||||
units, the example above can run in about half the time using
|
||||
@scheme[future] and @scheme[touch]:
|
||||
|
||||
@schemeblock[
|
||||
(let ([f (future (lambda () (any-double? l2)))])
|
||||
(or (any-double? l1)
|
||||
(touch f)))
|
||||
]
|
||||
|
||||
The future @scheme[f] runs @scheme[(any-double? l2)] in parallel to
|
||||
@scheme[(any-double? l1)], and the result for @scheme[(any-double?
|
||||
l2)] becomes available about the same time that it is demanded by
|
||||
@scheme[(touch f)].
|
||||
|
||||
Futures run in parallel as long as they can do so safely, but the
|
||||
notion of ``safe'' for parallelism is inherently tied to the system
|
||||
implementation. The distinction between ``safe'' and ``unsafe''
|
||||
operations may be far from apparent at the level of a Scheme program.
|
||||
|
||||
Consider the following core of a Mandelbrot-set computation:
|
||||
|
||||
@schemeblock[
|
||||
(define (mandelbrot iterations x y n)
|
||||
(let ((ci (- (/ (* 2.0 y) n) 1.0))
|
||||
(cr (- (/ (* 2.0 x) n) 1.5)))
|
||||
(let loop ((i 0) (zr 0.0) (zi 0.0))
|
||||
(if (> i iterations)
|
||||
i
|
||||
(let ((zrq (* zr zr))
|
||||
(ziq (* zi zi)))
|
||||
(cond
|
||||
((> (+ zrq ziq) 4.0) i)
|
||||
(else (loop (add1 i)
|
||||
(+ (- zrq ziq) cr)
|
||||
(+ (* 2.0 zr zi) ci)))))))))
|
||||
]
|
||||
|
||||
The expressions @scheme[(mandelbrot 10000000 62 500 1000)] and
|
||||
@scheme[(mandelbrot 10000000 62 501 1000)] each take a while to
|
||||
produce an answer. Computing them both, of course, takes twice as
|
||||
long:
|
||||
|
||||
@schemeblock[
|
||||
(list (mandelbrot 10000000 62 500 1000)
|
||||
(mandelbrot 10000000 62 501 1000))
|
||||
]
|
||||
|
||||
Unfortunately, attempting to run the two computations in parallel with
|
||||
@scheme[future] does not improve performance:
|
||||
|
||||
@schemeblock[
|
||||
(let ([f (future (lambda () (mandelbrot 10000000 62 501 1000)))])
|
||||
(list (mandelbrot 10000000 62 500 1000)
|
||||
(touch f)))
|
||||
]
|
||||
|
||||
One problem is that the @scheme[*] and @scheme[/] operations in the
|
||||
first two lines of @scheme[mandelbrot] involve a mixture of exact and
|
||||
inexact real numbers. Such mixtures typically trigger a slow path in
|
||||
execution, and the general slow path is not safe for
|
||||
parallelism. Consequently, the future created in this example is
|
||||
almost immediately suspended, and it cannot resume until
|
||||
@scheme[touch] is called.
|
||||
|
||||
Changing the first two lines of @scheme[mandelbrot] addresses that
|
||||
first the problem:
|
||||
|
||||
@schemeblock[
|
||||
(define (mandelbrot iterations x y n)
|
||||
(let ((ci (- (/ (* 2.0 (->fl y)) (->fl n)) 1.0))
|
||||
(cr (- (/ (* 2.0 (->fl x)) (->fl n)) 1.5)))
|
||||
....))
|
||||
]
|
||||
|
||||
With that change, @scheme[mandelbrot] computations can run in
|
||||
parallel. Nevertheless, performance still does not improve. The
|
||||
problem is that most every arithmetic operation in this example
|
||||
produces an inexact number whose storage must be allocated. Especially
|
||||
frequent allocation triggers communication between parallel tasks that
|
||||
defeats any performance improvement.
|
||||
|
||||
By using @tech{flonum}-specific operations (see
|
||||
@secref["fixnums+flonums"]), we can re-write @scheme[mandelbot] to use
|
||||
much less allocation:
|
||||
|
||||
@schemeblock[
|
||||
(define (mandelbrot iterations x y n)
|
||||
(let ((ci (fl- (fl/ (* 2.0 (->fl y)) (->fl n)) 1.0))
|
||||
(cr (fl- (fl/ (* 2.0 (->fl x)) (->fl n)) 1.5)))
|
||||
(let loop ((i 0) (zr 0.0) (zi 0.0))
|
||||
(if (> i iterations)
|
||||
i
|
||||
(let ((zrq (fl* zr zr))
|
||||
(ziq (fl* zi zi)))
|
||||
(cond
|
||||
((fl> (fl+ zrq ziq) 4.0) i)
|
||||
(else (loop (add1 i)
|
||||
(fl+ (fl- zrq ziq) cr)
|
||||
(fl+ (fl* 2.0 (fl* zr zi)) ci)))))))))
|
||||
]
|
||||
|
||||
This conversion can speed @scheme[mandelbrot] by a factor of 8, even
|
||||
in sequential mode, but avoiding allocation also allows
|
||||
@scheme[mandelbrot] to run usefully faster in parallel.
|
||||
|
||||
As a general guideline, any operation that is inlined by the
|
||||
@tech{JIT} compiler runs safely in parallel, while other operations
|
||||
that are not inlined (including all operations if the JIT compiler is
|
||||
disabled) are considered unsafe. The @exec{mzc} decompiler tool
|
||||
annotates operations that can be inlined by the compiler (see
|
||||
@secref[#:doc '(lib "scribblings/mzc/mzc.scrbl") "decompile"]), so the
|
||||
decompiler can be used to help predict parallel performance.
|
||||
|
||||
To more directly report what is happening in a program that uses
|
||||
@scheme[future] and @scheme[touch], operations are logged when they
|
||||
suspend a computation or synchronize with the main computation. For
|
||||
example, running the original @scheme[mandelbrot] in a future produces
|
||||
the following output in the @scheme['debug] log level:
|
||||
|
||||
@margin-note{To see @scheme['debug] logging output on stderr, set the
|
||||
@envvar{PLTSTDERR} environment variable to @tt{debug} or start
|
||||
@exec{mzscheme} with @Flag{W} @tt{debug}.}
|
||||
|
||||
@verbatim[#:indent 2]|{
|
||||
future: 0 waiting for runtime at 1267392979341.989: *
|
||||
}|
|
||||
|
||||
The message indicates which internal future-running task became
|
||||
blocked on an unsafe operation, the time it blocked (in terms of
|
||||
@scheme[current-inexact-miliseconds]), and the operation that caused
|
||||
the computation it to block.
|
||||
|
||||
The first revision to @scheme[mandelbrot] avoids suspending at
|
||||
@scheme[*], but produces many log entries of the form
|
||||
|
||||
@verbatim[#:indent 2]|{
|
||||
future: 0 waiting for runtime at 1267392980465.066: [acquire_gc_page]
|
||||
}|
|
85
collects/scribblings/reference/futures.scrbl
Normal file
85
collects/scribblings/reference/futures.scrbl
Normal file
|
@ -0,0 +1,85 @@
|
|||
#lang scribble/doc
|
||||
@(require "mz.ss"
|
||||
(for-label scheme
|
||||
scheme/base
|
||||
scheme/contract
|
||||
scheme/future))
|
||||
|
||||
@(define future-eval (make-base-eval))
|
||||
@(interaction-eval #:eval future-eval (require scheme/future))
|
||||
|
||||
@title[#:tag "futures"]{Futures for Parallelism}
|
||||
|
||||
@note-lib[scheme/future]
|
||||
|
||||
@margin-note{Currently, parallel support for @scheme[future] is
|
||||
enabled by default for Windows, Linux x86/x86_64, and Mac OS X
|
||||
x86/x86_64. To enable support for other platforms, use
|
||||
@DFlag{enable-futures} with @exec{configure} when building PLT
|
||||
Scheme.}
|
||||
|
||||
The @scheme[future] and @scheme[touch] functions from
|
||||
@schememodname[scheme/future] provide access to parallelism as
|
||||
supported by the hardware and operation system.
|
||||
In contrast to @scheme[thread], which provides concurrency for
|
||||
arbitrary computations without parallelism, @scheme[future] provides
|
||||
parallelism for limited computations. A future executes its work in
|
||||
parallel (assuming that support for parallelism is available) until it
|
||||
detects an attempt to perform an operation that is too complex for the
|
||||
system to run safely in parallel. Similarly, work in a future is
|
||||
suspended if it depends in some way on the current continuation, such
|
||||
as raising an exception. A suspended computation for a future is
|
||||
resumed when @scheme[touch] is applied to the future descriptor.
|
||||
|
||||
``Safe'' parallel execution of a future means that all operations
|
||||
provided by the system must be able to enforce contracts and produce
|
||||
results as documented. ``Safe'' does not preclude concurrent access to
|
||||
mutable data that is visible in the program. For example, a
|
||||
computation in a future might use @scheme[set!] to modify a shared
|
||||
variable, in which case concurrent assignment to the variable can be
|
||||
visible in other futures and threads. Furthermore, guarantees about
|
||||
the visibility of effects and ordering are determined by the operating
|
||||
system and hardware---which rarely support, for example, the guarantee
|
||||
of sequential consistency that is provided for @scheme[thread]-based
|
||||
concurrency. At the same time, operations that seem obviously safe may
|
||||
have a complex enough implementation internally that they cannot run
|
||||
in parallel. See also @guidesecref["effective-futures"].
|
||||
|
||||
@deftogether[(
|
||||
@defproc[(future [thunk (-> any)]) future?]
|
||||
@defproc[(touch [f future?]) any]
|
||||
)]{
|
||||
|
||||
The @scheme[future] procedure returns a future-descriptor value that
|
||||
encapsulates @scheme[thunk]. The @scheme[touch] function forces the
|
||||
evaluation of the @scheme[thunk] inside the given future, returning
|
||||
the values produced by @scheme[thunk]. After @scheme[touch] forces
|
||||
the evaluation of a @scheme[thunk], the resulting values are retained
|
||||
by the future descriptor in place of @scheme[thunk], and additional
|
||||
@scheme[touch]es of the future descriptor return those values.
|
||||
|
||||
Between a call to @scheme[future] and @scheme[touch] for a given
|
||||
future, the given @scheme[thunk] may run speculatively in parallel to
|
||||
other computations, as described above.
|
||||
|
||||
@interaction[
|
||||
#:eval future-eval
|
||||
(let ([f (future (lambda () (+ 1 2)))])
|
||||
(list (+ 3 4) (touch f)))
|
||||
]}
|
||||
|
||||
|
||||
@defproc[(future? [v any/c]) boolean?]{
|
||||
Returns @scheme[#t] if @scheme[v] is a future-descriptor value,
|
||||
@scheme[#f] otherwise.
|
||||
}
|
||||
|
||||
@defproc[(processor-count) exact-positive-integer?]{
|
||||
Returns the number of parallel computations units (e.g., processors
|
||||
or cores) that are available on the current machine.
|
||||
}
|
||||
|
||||
|
||||
@; ----------------------------------------------------------------------
|
||||
|
||||
@close-eval[future-eval]
|
Loading…
Reference in New Issue
Block a user