add missing doc files
svn: r18401
parent 57ab0dee65
commit 2ddfa89a7a
182
collects/scribblings/guide/futures.scrbl
Normal file
@@ -0,0 +1,182 @@
#lang scribble/doc
@(require scribble/manual
          "guide-utils.ss"
          (for-label scheme/flonum scheme/future))

@title[#:tag "effective-futures"]{Parallelism with Futures}

The @schememodname[scheme/future] library provides support for
performance improvement through parallelism with the @scheme[future]
and @scheme[touch] functions. The level of parallelism available from
those constructs, however, is limited by several factors, and the
current implementation is best suited to numerical tasks.

@margin-note{Other functions, such as @scheme[thread], support the
creation of reliably concurrent tasks. However, threads never run truly
in parallel, even if the hardware and operating system support
parallelism.}

As a starting example, the @scheme[any-double?] function below takes a
list of numbers and determines whether any number in the list has a
double that is also in the list:

@schemeblock[
  (define (any-double? l)
    (for/or ([i (in-list l)])
      (for/or ([i2 (in-list l)])
        (= i2 (* 2 i)))))
]

This function runs in quadratic time, so it can take a long time (on
the order of a second) on large lists like @scheme[l1] and
@scheme[l2]:

@schemeblock[
  (define l1 (for/list ([i (in-range 5000)])
               (+ (* 2 i) 1)))
  (define l2 (for/list ([i (in-range 5000)])
               (- (* 2 i) 1)))
  (or (any-double? l1)
      (any-double? l2))
]

The best way to speed up @scheme[any-double?] is to use a different
algorithm. However, on a machine that offers at least two processing
units, the example above can run in about half the time using
@scheme[future] and @scheme[touch]:

@schemeblock[
  (let ([f (future (lambda () (any-double? l2)))])
    (or (any-double? l1)
        (touch f)))
]

The future @scheme[f] runs @scheme[(any-double? l2)] in parallel to
@scheme[(any-double? l1)], and the result for @scheme[(any-double?
l2)] becomes available about the same time that it is demanded by
@scheme[(touch f)].
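
As a minimal sketch of the different-algorithm route mentioned above,
the hypothetical @scheme[any-double?/hash] below (a name introduced
here only for illustration) records every element in a hash table and
then probes for each double, which takes roughly linear time. It
assumes that the list holds exact integers, because a table created
with @scheme[make-hash] compares keys with @scheme[equal?] rather than
@scheme[=]:

@schemeblock[
  ; Illustrative alternative, not part of the original example.
  (define (any-double?/hash l)
    (let ([seen (make-hash)])
      ; First pass: remember every element of the list.
      (for ([i (in-list l)])
        (hash-set! seen i #t))
      ; Second pass: succeed as soon as some double was seen.
      (for/or ([i (in-list l)])
        (hash-ref seen (* 2 i) #f))))
]

With an algorithm like this, the lists above are small enough that
parallelism is unlikely to matter; the quadratic version is kept in
this section because it makes the effect of @scheme[future] easy to
observe.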

Futures run in parallel as long as they can do so safely, but the
notion of ``safe'' for parallelism is inherently tied to the system
implementation. The distinction between ``safe'' and ``unsafe''
operations may be far from apparent at the level of a Scheme program.

Consider the following core of a Mandelbrot-set computation:

@schemeblock[
  (define (mandelbrot iterations x y n)
    (let ((ci (- (/ (* 2.0 y) n) 1.0))
          (cr (- (/ (* 2.0 x) n) 1.5)))
      (let loop ((i 0) (zr 0.0) (zi 0.0))
        (if (> i iterations)
            i
            (let ((zrq (* zr zr))
                  (ziq (* zi zi)))
              (cond
                ((> (+ zrq ziq) 4.0) i)
                (else (loop (add1 i)
                            (+ (- zrq ziq) cr)
                            (+ (* 2.0 zr zi) ci)))))))))
]

The expressions @scheme[(mandelbrot 10000000 62 500 1000)] and
@scheme[(mandelbrot 10000000 62 501 1000)] each take a while to
produce an answer. Computing them both, of course, takes twice as
long:

@schemeblock[
  (list (mandelbrot 10000000 62 500 1000)
        (mandelbrot 10000000 62 501 1000))
]

Unfortunately, attempting to run the two computations in parallel with
@scheme[future] does not improve performance:

@schemeblock[
  (let ([f (future (lambda () (mandelbrot 10000000 62 501 1000)))])
    (list (mandelbrot 10000000 62 500 1000)
          (touch f)))
]

One problem is that the @scheme[*] and @scheme[/] operations in the
first two lines of @scheme[mandelbrot] involve a mixture of exact and
inexact real numbers. Such mixtures typically trigger a slow path in
execution, and the general slow path is not safe for
parallelism. Consequently, the future created in this example is
almost immediately suspended, and it cannot resume until
@scheme[touch] is called.

Changing the first two lines of @scheme[mandelbrot] addresses that
first problem:

@schemeblock[
  (define (mandelbrot iterations x y n)
    (let ((ci (- (/ (* 2.0 (->fl y)) (->fl n)) 1.0))
          (cr (- (/ (* 2.0 (->fl x)) (->fl n)) 1.5)))
      ....))
]

With that change, @scheme[mandelbrot] computations can run in
parallel. Nevertheless, performance still does not improve. The
problem is that almost every arithmetic operation in this example
produces an inexact number whose storage must be allocated. Especially
frequent allocation triggers communication between parallel tasks that
defeats any performance improvement.

By using @tech{flonum}-specific operations (see
@secref["fixnums+flonums"]), we can rewrite @scheme[mandelbrot] to use
much less allocation:

@schemeblock[
  (define (mandelbrot iterations x y n)
    (let ((ci (fl- (fl/ (* 2.0 (->fl y)) (->fl n)) 1.0))
          (cr (fl- (fl/ (* 2.0 (->fl x)) (->fl n)) 1.5)))
      (let loop ((i 0) (zr 0.0) (zi 0.0))
        (if (> i iterations)
            i
            (let ((zrq (fl* zr zr))
                  (ziq (fl* zi zi)))
              (cond
                ((fl> (fl+ zrq ziq) 4.0) i)
                (else (loop (add1 i)
                            (fl+ (fl- zrq ziq) cr)
                            (fl+ (fl* 2.0 (fl* zr zi)) ci)))))))))
]

This conversion can speed up @scheme[mandelbrot] by a factor of 8, even
in sequential mode, but avoiding allocation also allows
@scheme[mandelbrot] to run usefully faster in parallel.
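
The same pattern extends to more than two computations. The following
sketch assumes the flonum-based @scheme[mandelbrot] above;
@scheme[mandelbrot-rows] is a hypothetical helper introduced only for
illustration, not part of @schememodname[scheme/future]. It creates
one future per @scheme[y] value and then touches the futures in order,
so the results come back in the order of @scheme[ys]:

@schemeblock[
  ; Hypothetical helper: one future per y value.
  (define (mandelbrot-rows iterations x ys n)
    ; Start all of the futures before touching any of them, so that
    ; they have a chance to run at the same time.
    (let ([fs (for/list ([y (in-list ys)])
                (future (lambda () (mandelbrot iterations x y n))))])
      ; Collect the results in the same order as ys.
      (for/list ([f (in-list fs)])
        (touch f))))

  (mandelbrot-rows 10000000 62 '(500 501) 1000)
]

Whether this runs faster than a sequential loop depends on the number
of available processing units and on every operation in the futures
staying on a path that is safe for parallelism.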

As a general guideline, any operation that is inlined by the
@tech{JIT} compiler runs safely in parallel, while other operations
that are not inlined (including all operations if the JIT compiler is
disabled) are considered unsafe. The @exec{mzc} decompiler tool
annotates operations that can be inlined by the compiler (see
@secref[#:doc '(lib "scribblings/mzc/mzc.scrbl") "decompile"]), so the
decompiler can be used to help predict parallel performance.

To more directly report what is happening in a program that uses
@scheme[future] and @scheme[touch], operations are logged when they
suspend a computation or synchronize with the main computation. For
example, running the original @scheme[mandelbrot] in a future produces
the following output in the @scheme['debug] log level:

@margin-note{To see @scheme['debug] logging output on stderr, set the
@envvar{PLTSTDERR} environment variable to @tt{debug} or start
@exec{mzscheme} with @Flag{W} @tt{debug}.}

@verbatim[#:indent 2]|{
future: 0 waiting for runtime at 1267392979341.989: *
}|

The message indicates which internal future-running task became
blocked on an unsafe operation, the time it blocked (in terms of
@scheme[current-inexact-milliseconds]), and the operation that caused
the computation to block.

The first revision to @scheme[mandelbrot] avoids suspending at
@scheme[*], but produces many log entries of the form:

@verbatim[#:indent 2]|{
future: 0 waiting for runtime at 1267392980465.066: [acquire_gc_page]
}|
85
collects/scribblings/reference/futures.scrbl
Normal file
@@ -0,0 +1,85 @@
#lang scribble/doc
@(require "mz.ss"
          (for-label scheme
                     scheme/base
                     scheme/contract
                     scheme/future))

@(define future-eval (make-base-eval))
@(interaction-eval #:eval future-eval (require scheme/future))

@title[#:tag "futures"]{Futures for Parallelism}

@note-lib[scheme/future]

@margin-note{Currently, parallel support for @scheme[future] is
enabled by default for Windows, Linux x86/x86_64, and Mac OS X
x86/x86_64. To enable support for other platforms, use
@DFlag{enable-futures} with @exec{configure} when building PLT
Scheme.}

The @scheme[future] and @scheme[touch] functions from
@schememodname[scheme/future] provide access to parallelism as
supported by the hardware and operating system.
In contrast to @scheme[thread], which provides concurrency for
arbitrary computations without parallelism, @scheme[future] provides
parallelism for limited computations. A future executes its work in
parallel (assuming that support for parallelism is available) until it
detects an attempt to perform an operation that is too complex for the
system to run safely in parallel. Similarly, work in a future is
suspended if it depends in some way on the current continuation, such
as raising an exception. A suspended computation for a future is
resumed when @scheme[touch] is applied to the future descriptor.

``Safe'' parallel execution of a future means that all operations
provided by the system must be able to enforce contracts and produce
results as documented. ``Safe'' does not preclude concurrent access to
mutable data that is visible in the program. For example, a
computation in a future might use @scheme[set!] to modify a shared
variable, in which case concurrent assignment to the variable can be
visible in other futures and threads. Furthermore, guarantees about
the visibility of effects and ordering are determined by the operating
system and hardware---which rarely support, for example, the guarantee
of sequential consistency that is provided for @scheme[thread]-based
concurrency. At the same time, operations that seem obviously safe may
have a complex enough implementation internally that they cannot run
in parallel. See also @guidesecref["effective-futures"].

@deftogether[(
  @defproc[(future [thunk (-> any)]) future?]
  @defproc[(touch [f future?]) any]
)]{

  The @scheme[future] procedure returns a future-descriptor value that
  encapsulates @scheme[thunk]. The @scheme[touch] function forces the
  evaluation of the @scheme[thunk] inside the given future, returning
  the values produced by @scheme[thunk]. After @scheme[touch] forces
  the evaluation of a @scheme[thunk], the resulting values are retained
  by the future descriptor in place of @scheme[thunk], and additional
  @scheme[touch]es of the future descriptor return those values.

  Between a call to @scheme[future] and @scheme[touch] for a given
  future, the given @scheme[thunk] may run speculatively in parallel to
  other computations, as described above.

  @interaction[
    #:eval future-eval
    (let ([f (future (lambda () (+ 1 2)))])
      (list (+ 3 4) (touch f)))
  ]}


@defproc[(future? [v any/c]) boolean?]{
  Returns @scheme[#t] if @scheme[v] is a future-descriptor value,
  @scheme[#f] otherwise.
}

@defproc[(processor-count) exact-positive-integer?]{
  Returns the number of parallel computation units (e.g., processors
  or cores) that are available on the current machine.
}
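
As a rough sketch of how @scheme[processor-count] might be used, the
hypothetical @scheme[sum-squares/parallel] below splits the range from
@scheme[0] up to @scheme[n] into one chunk per processing unit and
sums each chunk in its own future. The function name and the chunking
scheme are assumptions made for illustration, and whether the loop
bodies actually run in parallel depends on the operations staying
within what the system can run safely, which may not hold once the
sums grow beyond the fixnum range:

@schemeblock[
  ; Illustrative only: not provided by scheme/future.
  (define (sum-squares/parallel n)
    (let* ([k (processor-count)]
           ; Ceiling of n/k, so that k chunks cover 0 through n-1.
           [chunk (quotient (+ n k -1) k)]
           ; One future per chunk; a chunk that starts at or past n
           ; is an empty range and contributes 0.
           [fs (for/list ([j (in-range k)])
                 (future
                  (lambda ()
                    (for/fold ([acc 0])
                              ([i (in-range (* j chunk)
                                            (min n (* (add1 j) chunk)))])
                      (+ acc (* i i))))))])
      ; Touch every future and combine the partial sums.
      (apply + (map touch fs))))
]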


@; ----------------------------------------------------------------------

@close-eval[future-eval]