racket/collects/scribblings/guide/futures.scrbl
2010-06-04 16:40:00 -04:00

#lang scribble/doc
@(require scribble/manual
"guide-utils.ss"
(for-label racket/flonum racket/future))
@title[#:tag "effective-futures"]{Parallelism with Futures}
The @racketmodname[racket/future] library provides support for
performance improvement through parallelism with the @racket[future]
and @racket[touch] functions. The level of parallelism available from
those constructs, however, is limited by several factors, and the
current implementation is best suited to numerical tasks.
@margin-note{Other functions, such as @racket[thread], support the
creation of reliably concurrent tasks. However, threads never run truly
in parallel, even if the hardware and operating system support
parallelism.}
As a starting example, the @racket[any-double?] function below takes a
list of numbers and determines whether any number in the list has a
double that is also in the list:
@racketblock[
(define (any-double? l)
  (for/or ([i (in-list l)])
    (for/or ([i2 (in-list l)])
      (= i2 (* 2 i)))))
]
This function runs in quadratic time, so it can take a long time (on
the order of a second) on large lists like @racket[l1] and
@racket[l2]:
@racketblock[
(define l1 (for/list ([i (in-range 5000)])
             (+ (* 2 i) 1)))
(define l2 (for/list ([i (in-range 5000)])
             (- (* 2 i) 1)))
(or (any-double? l1)
    (any-double? l2))
]
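The nested @racket[for/or] loops are what make @racket[any-double?]
quadratic. As a point of comparison, the following sketch replaces the
inner loop with a hash-table lookup. The name
@racket[any-double?/hash] is hypothetical, and the sketch assumes a
list of exact integers, because hash lookup compares keys with
@racket[equal?] rather than @racket[=]:
@racketblock[
;; Hypothetical linear-time variant of any-double?.
;; Assumes exact integers: hash keys are compared with equal?, not =.
(define (any-double?/hash l)
  ;; Record every element as a key so membership tests are fast
  (define seen (for/hash ([i (in-list l)]) (values i #t)))
  ;; Look up each element's double instead of rescanning the list
  (for/or ([i (in-list l)])
    (hash-ref seen (* 2 i) #f)))
]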
The best way to speed up @racket[any-double?] is to use a different
algorithm. However, on a machine that offers at least two processing
units, the example above can run in about half the time using
@racket[future] and @racket[touch]:
@racketblock[
(let ([f (future (lambda () (any-double? l2)))])
  (or (any-double? l1)
      (touch f)))
]
The future @racket[f] runs @racket[(any-double? l2)] in parallel with
@racket[(any-double? l1)], and the result for @racket[(any-double?
l2)] becomes available at about the same time that it is demanded by
@racket[(touch f)].
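The same pattern extends to more than two tasks. The following is a
minimal sketch, not part of the library: the helper name
@racket[any-double-in-any?] is hypothetical. It starts one future per
list before touching any of them, so that later futures have a chance
to run while earlier results are still being computed:
@racketblock[
(require racket/future)

;; Same definition as earlier in this section
(define (any-double? l)
  (for/or ([i (in-list l)])
    (for/or ([i2 (in-list l)])
      (= i2 (* 2 i)))))

;; Hypothetical helper: checks several lists, one future per list
(define (any-double-in-any? lists)
  ;; Create all futures first so they can start running early
  (define fs
    (for/list ([l (in-list lists)])
      (future (lambda () (any-double? l)))))
  ;; Touch in order; for/or stops at the first true result
  (for/or ([f (in-list fs)])
    (touch f)))
]
How much this helps depends on the number of processing units, which
@racket[(processor-count)] from @racketmodname[racket/future] reports.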
Futures run in parallel as long as they can do so safely, but the
notion of ``safe'' for parallelism is inherently tied to the system
implementation. The distinction between ``safe'' and ``unsafe''
operations may be far from apparent at the level of a Racket program.
Consider the following core of a Mandelbrot-set computation:
@racketblock[
(define (mandelbrot iterations x y n)
  (let ((ci (- (/ (* 2.0 y) n) 1.0))
        (cr (- (/ (* 2.0 x) n) 1.5)))
    (let loop ((i 0) (zr 0.0) (zi 0.0))
      (if (> i iterations)
          i
          (let ((zrq (* zr zr))
                (ziq (* zi zi)))
            (cond
              ((> (+ zrq ziq) 4.0) i)
              (else (loop (add1 i)
                          (+ (- zrq ziq) cr)
                          (+ (* 2.0 zr zi) ci)))))))))
]
The expressions @racket[(mandelbrot 10000000 62 500 1000)] and
@racket[(mandelbrot 10000000 62 501 1000)] each take a while to
produce an answer. Computing them both, of course, takes twice as
long:
@racketblock[
(list (mandelbrot 10000000 62 500 1000)
      (mandelbrot 10000000 62 501 1000))
]
Unfortunately, attempting to run the two computations in parallel with
@racket[future] does not improve performance:
@racketblock[
(let ([f (future (lambda () (mandelbrot 10000000 62 501 1000)))])
  (list (mandelbrot 10000000 62 500 1000)
        (touch f)))
]
One problem is that the @racket[*] and @racket[/] operations in the
first two lines of @racket[mandelbrot] involve a mixture of exact and
inexact real numbers. Such mixtures typically trigger a slow path in
execution, and the general slow path is not safe for
parallelism. Consequently, the future created in this example is
almost immediately suspended, and it cannot resume until
@racket[touch] is called.
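To make the mixture concrete, the following sketch classifies the
operands involved. The binding @racket[y] merely stands in for the
exact integer passed as an argument in the calls above; it is an
illustration, not part of @racket[mandelbrot] itself:
@racketblock[
(require racket/flonum)

(define y 500)   ; an exact integer, like the arguments used above

(flonum? 2.0)    ; #t
(flonum? y)      ; #f, so (* 2.0 y) mixes inexact and exact operands
(->fl y)         ; 500.0, a flonum with the same value
(flonum? (* 2.0 (->fl y))) ; #t, both operands are now flonums
]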
Changing the first two lines of @racket[mandelbrot] addresses that
first problem:
@racketblock[
(define (mandelbrot iterations x y n)
  (let ((ci (- (/ (* 2.0 (->fl y)) (->fl n)) 1.0))
        (cr (- (/ (* 2.0 (->fl x)) (->fl n)) 1.5)))
    ....))
]
With that change, @racket[mandelbrot] computations can run in
parallel. Nevertheless, performance still does not improve. The
problem is that nearly every arithmetic operation in this example
produces an inexact number whose storage must be allocated. Such
frequent allocation triggers communication between parallel tasks that
defeats any performance improvement.
By using @tech{flonum}-specific operations (see
@secref["fixnums+flonums"]), we can rewrite @racket[mandelbrot] to use
much less allocation:
@racketblock[
(define (mandelbrot iterations x y n)
  (let ((ci (fl- (fl/ (fl* 2.0 (->fl y)) (->fl n)) 1.0))
        (cr (fl- (fl/ (fl* 2.0 (->fl x)) (->fl n)) 1.5)))
    (let loop ((i 0) (zr 0.0) (zi 0.0))
      (if (> i iterations)
          i
          (let ((zrq (fl* zr zr))
                (ziq (fl* zi zi)))
            (cond
              ((fl> (fl+ zrq ziq) 4.0) i)
              (else (loop (add1 i)
                          (fl+ (fl- zrq ziq) cr)
                          (fl+ (fl* 2.0 (fl* zr zi)) ci)))))))))
]
This conversion can speed up @racket[mandelbrot] by a factor of 8, even
in sequential mode, but avoiding allocation also allows
@racket[mandelbrot] to run usefully faster in parallel.
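To try the comparison directly, the following sketch repeats the
flonum-specific definition (using @racket[fl*] throughout) and wraps
the two computations in a helper. The name @racket[mandelbrot-pair] is
hypothetical, and the timings that @racket[time] reports will vary
with the machine and the Racket version:
@racketblock[
(require racket/flonum racket/future)

;; Flonum-specific mandelbrot core
(define (mandelbrot iterations x y n)
  (let ((ci (fl- (fl/ (fl* 2.0 (->fl y)) (->fl n)) 1.0))
        (cr (fl- (fl/ (fl* 2.0 (->fl x)) (->fl n)) 1.5)))
    (let loop ((i 0) (zr 0.0) (zi 0.0))
      (if (> i iterations)
          i
          (let ((zrq (fl* zr zr))
                (ziq (fl* zi zi)))
            (cond
              ((fl> (fl+ zrq ziq) 4.0) i)
              (else (loop (add1 i)
                          (fl+ (fl- zrq ziq) cr)
                          (fl+ (fl* 2.0 (fl* zr zi)) ci)))))))))

;; Hypothetical helper: one computation in a future, one in the
;; main computation, results collected in a list
(define (mandelbrot-pair iterations)
  (let ([f (future (lambda () (mandelbrot iterations 62 501 1000)))])
    (list (mandelbrot iterations 62 500 1000)
          (touch f))))

;; Report the elapsed time for the parallel version
(time (mandelbrot-pair 10000000))
]
Timing the sequential @racket[list] expression from earlier in the
same way gives the baseline against which to judge the parallel run.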
As a general guideline, any operation that is inlined by the
@tech{JIT} compiler runs safely in parallel, while other operations
that are not inlined (including all operations if the JIT compiler is
disabled) are considered unsafe. The @exec{mzc} decompiler tool
annotates operations that can be inlined by the compiler (see
@secref[#:doc '(lib "scribblings/raco/raco.scrbl") "decompile"]), so the
decompiler can be used to help predict parallel performance.
To more directly report what is happening in a program that uses
@racket[future] and @racket[touch], operations are logged when they
suspend a computation or synchronize with the main computation. For
example, running the original @racket[mandelbrot] in a future produces
the following output in the @racket['debug] log level:
@margin-note{To see @racket['debug] logging output on stderr, set the
@envvar{PLTSTDERR} environment variable to @tt{debug} or start
@exec{racket} with @Flag{W} @tt{debug}.}
@verbatim[#:indent 2]|{
future: 0 waiting for runtime at 1267392979341.989: *
}|
The message indicates which internal future-running task became
blocked on an unsafe operation, the time at which it blocked (in terms of
@racket[current-inexact-milliseconds]), and the operation that caused
the computation to block.
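Log messages can also be collected from within a program. The
following is a sketch under stated assumptions: the helper name
@racket[call-with-debug-log] is hypothetical, and whether any
@tt{future:} entries appear depends on the Racket version and on
whether the operation actually blocks there. Unrelated
@racket['debug] messages may show up as well:
@racketblock[
(require racket/future)

;; Hypothetical helper: runs thunk, then returns the messages of all
;; log events at 'debug level or above that were queued meanwhile.
(define (call-with-debug-log thunk)
  (define receiver (make-log-receiver (current-logger) 'debug))
  (thunk)
  ;; Drain the receiver without blocking; each event is a vector
  ;; whose second element is the message string
  (let loop ([messages '()])
    (define event (sync/timeout 0 receiver))
    (if event
        (loop (cons (vector-ref event 1) messages))
        (reverse messages))))

;; Print whatever was logged while a future ran a mixed-mode multiply
(for-each displayln
          (call-with-debug-log
           (lambda ()
             (touch (future (lambda () (* 2.0 3)))))))
]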
The first revision to @racket[mandelbrot] avoids suspending at
@racket[*], but produces many log entries of the form
@verbatim[#:indent 2]|{
future: 0 waiting for runtime at 1267392980465.066: [acquire_gc_page]
}|