183 lines
6.5 KiB
Racket
183 lines
6.5 KiB
Racket
#lang scribble/doc
|
|
@(require scribble/manual
|
|
"guide-utils.ss"
|
|
(for-label racket/flonum racket/future))
|
|
|
|
@title[#:tag "effective-futures"]{Parallelism with Futures}
|
|
|
|
The @racketmodname[racket/future] library provides support for
|
|
performance improvement through parallelism with the @racket[future]
|
|
and @racket[touch] functions. The level of parallelism available from
|
|
those constructs, however, is limited by several factors, and the
|
|
current implementation is best suited to numerical tasks.
|
|
|
|
@margin-note{Other functions, such as @racket[thread], support the
|
|
creation of reliably concurrent tasks. However, thread never run truly
|
|
in parallel, even if the hardware and operating system support
|
|
parallelism.}
|
|
|
|
As a starting example, the @racket[any-double?] function below takes a
|
|
list of numbers and determines whether any number in the list has a
|
|
double that is also in the list:
|
|
|
|
@racketblock[
|
|
(define (any-double? l)
|
|
(for/or ([i (in-list l)])
|
|
(for/or ([i2 (in-list l)])
|
|
(= i2 (* 2 i)))))
|
|
]
|
|
|
|
This function runs in quadratic time, so it can take a long time (on
|
|
the order of a second) on large lists like @racket[l1] and
|
|
@racket[l2]:
|
|
|
|
@racketblock[
|
|
(define l1 (for/list ([i (in-range 5000)])
|
|
(+ (* 2 i) 1)))
|
|
(define l2 (for/list ([i (in-range 5000)])
|
|
(- (* 2 i) 1)))
|
|
(or (any-double? l1)
|
|
(any-double? l2))
|
|
]
|
|
|
|
The best way to speed up @racket[any-double?] is to use a different
|
|
algorithm. However, on a machine that offers at least two processing
|
|
units, the example above can run in about half the time using
|
|
@racket[future] and @racket[touch]:
|
|
|
|
@racketblock[
|
|
(let ([f (future (lambda () (any-double? l2)))])
|
|
(or (any-double? l1)
|
|
(touch f)))
|
|
]
|
|
|
|
The future @racket[f] runs @racket[(any-double? l2)] in parallel to
|
|
@racket[(any-double? l1)], and the result for @racket[(any-double?
|
|
l2)] becomes available about the same time that it is demanded by
|
|
@racket[(touch f)].
|
|
|
|
Futures run in parallel as long as they can do so safely, but the
|
|
notion of ``safe'' for parallelism is inherently tied to the system
|
|
implementation. The distinction between ``safe'' and ``unsafe''
|
|
operations may be far from apparent at the level of a Racket program.
|
|
|
|
Consider the following core of a Mandelbrot-set computation:
|
|
|
|
@racketblock[
|
|
(define (mandelbrot iterations x y n)
|
|
(let ((ci (- (/ (* 2.0 y) n) 1.0))
|
|
(cr (- (/ (* 2.0 x) n) 1.5)))
|
|
(let loop ((i 0) (zr 0.0) (zi 0.0))
|
|
(if (> i iterations)
|
|
i
|
|
(let ((zrq (* zr zr))
|
|
(ziq (* zi zi)))
|
|
(cond
|
|
((> (+ zrq ziq) 4.0) i)
|
|
(else (loop (add1 i)
|
|
(+ (- zrq ziq) cr)
|
|
(+ (* 2.0 zr zi) ci)))))))))
|
|
]
|
|
|
|
The expressions @racket[(mandelbrot 10000000 62 500 1000)] and
|
|
@racket[(mandelbrot 10000000 62 501 1000)] each take a while to
|
|
produce an answer. Computing them both, of course, takes twice as
|
|
long:
|
|
|
|
@racketblock[
|
|
(list (mandelbrot 10000000 62 500 1000)
|
|
(mandelbrot 10000000 62 501 1000))
|
|
]
|
|
|
|
Unfortunately, attempting to run the two computations in parallel with
|
|
@racket[future] does not improve performance:
|
|
|
|
@racketblock[
|
|
(let ([f (future (lambda () (mandelbrot 10000000 62 501 1000)))])
|
|
(list (mandelbrot 10000000 62 500 1000)
|
|
(touch f)))
|
|
]
|
|
|
|
One problem is that the @racket[*] and @racket[/] operations in the
|
|
first two lines of @racket[mandelbrot] involve a mixture of exact and
|
|
inexact real numbers. Such mixtures typically trigger a slow path in
|
|
execution, and the general slow path is not safe for
|
|
parallelism. Consequently, the future created in this example is
|
|
almost immediately suspended, and it cannot resume until
|
|
@racket[touch] is called.
|
|
|
|
Changing the first two lines of @racket[mandelbrot] addresses that
|
|
first the problem:
|
|
|
|
@racketblock[
|
|
(define (mandelbrot iterations x y n)
|
|
(let ((ci (- (/ (* 2.0 (->fl y)) (->fl n)) 1.0))
|
|
(cr (- (/ (* 2.0 (->fl x)) (->fl n)) 1.5)))
|
|
....))
|
|
]
|
|
|
|
With that change, @racket[mandelbrot] computations can run in
|
|
parallel. Nevertheless, performance still does not improve. The
|
|
problem is that most every arithmetic operation in this example
|
|
produces an inexact number whose storage must be allocated. Especially
|
|
frequent allocation triggers communication between parallel tasks that
|
|
defeats any performance improvement.
|
|
|
|
By using @tech{flonum}-specific operations (see
|
|
@secref["fixnums+flonums"]), we can re-write @racket[mandelbot] to use
|
|
much less allocation:
|
|
|
|
@racketblock[
|
|
(define (mandelbrot iterations x y n)
|
|
(let ((ci (fl- (fl/ (* 2.0 (->fl y)) (->fl n)) 1.0))
|
|
(cr (fl- (fl/ (* 2.0 (->fl x)) (->fl n)) 1.5)))
|
|
(let loop ((i 0) (zr 0.0) (zi 0.0))
|
|
(if (> i iterations)
|
|
i
|
|
(let ((zrq (fl* zr zr))
|
|
(ziq (fl* zi zi)))
|
|
(cond
|
|
((fl> (fl+ zrq ziq) 4.0) i)
|
|
(else (loop (add1 i)
|
|
(fl+ (fl- zrq ziq) cr)
|
|
(fl+ (fl* 2.0 (fl* zr zi)) ci)))))))))
|
|
]
|
|
|
|
This conversion can speed @racket[mandelbrot] by a factor of 8, even
|
|
in sequential mode, but avoiding allocation also allows
|
|
@racket[mandelbrot] to run usefully faster in parallel.
|
|
|
|
As a general guideline, any operation that is inlined by the
|
|
@tech{JIT} compiler runs safely in parallel, while other operations
|
|
that are not inlined (including all operations if the JIT compiler is
|
|
disabled) are considered unsafe. The @exec{mzc} decompiler tool
|
|
annotates operations that can be inlined by the compiler (see
|
|
@secref[#:doc '(lib "scribblings/raco/raco.scrbl") "decompile"]), so the
|
|
decompiler can be used to help predict parallel performance.
|
|
|
|
To more directly report what is happening in a program that uses
|
|
@racket[future] and @racket[touch], operations are logged when they
|
|
suspend a computation or synchronize with the main computation. For
|
|
example, running the original @racket[mandelbrot] in a future produces
|
|
the following output in the @racket['debug] log level:
|
|
|
|
@margin-note{To see @racket['debug] logging output on stderr, set the
|
|
@envvar{PLTSTDERR} environment variable to @tt{debug} or start
|
|
@exec{racket} with @Flag{W} @tt{debug}.}
|
|
|
|
@verbatim[#:indent 2]|{
|
|
future: 0 waiting for runtime at 1267392979341.989: *
|
|
}|
|
|
|
|
The message indicates which internal future-running task became
|
|
blocked on an unsafe operation, the time it blocked (in terms of
|
|
@racket[current-inexact-miliseconds]), and the operation that caused
|
|
the computation it to block.
|
|
|
|
The first revision to @racket[mandelbrot] avoids suspending at
|
|
@racket[*], but produces many log entries of the form
|
|
|
|
@verbatim[#:indent 2]|{
|
|
future: 0 waiting for runtime at 1267392980465.066: [acquire_gc_page]
|
|
}|
|