From 2ddfa89a7a31b37a3c63008f57514af33380eb95 Mon Sep 17 00:00:00 2001 From: Matthew Flatt Date: Mon, 1 Mar 2010 01:45:49 +0000 Subject: [PATCH] add missing doc files svn: r18401 --- collects/scribblings/guide/futures.scrbl | 182 +++++++++++++++++++ collects/scribblings/reference/futures.scrbl | 85 +++++++++ 2 files changed, 267 insertions(+) create mode 100644 collects/scribblings/guide/futures.scrbl create mode 100644 collects/scribblings/reference/futures.scrbl diff --git a/collects/scribblings/guide/futures.scrbl b/collects/scribblings/guide/futures.scrbl new file mode 100644 index 0000000000..a78fd074eb --- /dev/null +++ b/collects/scribblings/guide/futures.scrbl @@ -0,0 +1,182 @@ +#lang scribble/doc +@(require scribble/manual + "guide-utils.ss" + (for-label scheme/flonum scheme/future)) + +@title[#:tag "effective-futures"]{Parallelism with Futures} + +The @schememodname[scheme/future] library provides support for +performance improvement through parallelism with the @scheme[future] +and @scheme[touch] functions. The level of parallelism available from +those constructs, however, is limited by several factors, and the +current implementation is best suited to numerical tasks. + +@margin-note{Other functions, such as @scheme[thread], support the +creation of reliably concurrent tasks. However, threads never run truly +in parallel, even if the hardware and operating system support +parallelism.} + +As a starting example, the @scheme[any-double?] function below takes a +list of numbers and determines whether any number in the list has a +double that is also in the list: + +@schemeblock[ +(define (any-double? 
l) + (for/or ([i (in-list l)]) + (for/or ([i2 (in-list l)]) + (= i2 (* 2 i))))) +] + +This function runs in quadratic time, so it can take a long time (on +the order of a second) on large lists like @scheme[l1] and +@scheme[l2]: + +@schemeblock[ +(define l1 (for/list ([i (in-range 5000)]) + (+ (* 2 i) 1))) +(define l2 (for/list ([i (in-range 5000)]) + (- (* 2 i) 1))) +(or (any-double? l1) + (any-double? l2)) +] + +The best way to speed up @scheme[any-double?] is to use a different +algorithm. However, on a machine that offers at least two processing +units, the example above can run in about half the time using +@scheme[future] and @scheme[touch]: + +@schemeblock[ +(let ([f (future (lambda () (any-double? l2)))]) + (or (any-double? l1) + (touch f))) +] + +The future @scheme[f] runs @scheme[(any-double? l2)] in parallel to +@scheme[(any-double? l1)], and the result for @scheme[(any-double? +l2)] becomes available about the same time that it is demanded by +@scheme[(touch f)]. + +Futures run in parallel as long as they can do so safely, but the +notion of ``safe'' for parallelism is inherently tied to the system +implementation. The distinction between ``safe'' and ``unsafe'' +operations may be far from apparent at the level of a Scheme program. + +Consider the following core of a Mandelbrot-set computation: + +@schemeblock[ +(define (mandelbrot iterations x y n) + (let ((ci (- (/ (* 2.0 y) n) 1.0)) + (cr (- (/ (* 2.0 x) n) 1.5))) + (let loop ((i 0) (zr 0.0) (zi 0.0)) + (if (> i iterations) + i + (let ((zrq (* zr zr)) + (ziq (* zi zi))) + (cond + ((> (+ zrq ziq) 4.0) i) + (else (loop (add1 i) + (+ (- zrq ziq) cr) + (+ (* 2.0 zr zi) ci))))))))) +] + +The expressions @scheme[(mandelbrot 10000000 62 500 1000)] and +@scheme[(mandelbrot 10000000 62 501 1000)] each take a while to +produce an answer. 
Computing them both, of course, takes twice as +long: + +@schemeblock[ +(list (mandelbrot 10000000 62 500 1000) + (mandelbrot 10000000 62 501 1000)) +] + +Unfortunately, attempting to run the two computations in parallel with +@scheme[future] does not improve performance: + +@schemeblock[ + (let ([f (future (lambda () (mandelbrot 10000000 62 501 1000)))]) + (list (mandelbrot 10000000 62 500 1000) + (touch f))) +] + +One problem is that the @scheme[*] and @scheme[/] operations in the +first two lines of @scheme[mandelbrot] involve a mixture of exact and +inexact real numbers. Such mixtures typically trigger a slow path in +execution, and the general slow path is not safe for +parallelism. Consequently, the future created in this example is +almost immediately suspended, and it cannot resume until +@scheme[touch] is called. + +Changing the first two lines of @scheme[mandelbrot] addresses that +first problem: + +@schemeblock[ +(define (mandelbrot iterations x y n) + (let ((ci (- (/ (* 2.0 (->fl y)) (->fl n)) 1.0)) + (cr (- (/ (* 2.0 (->fl x)) (->fl n)) 1.5))) + ....)) +] + +With that change, @scheme[mandelbrot] computations can run in +parallel. Nevertheless, performance still does not improve. The +problem is that almost every arithmetic operation in this example +produces an inexact number whose storage must be allocated. Especially +frequent allocation triggers communication between parallel tasks that +defeats any performance improvement. 
+ +By using @tech{flonum}-specific operations (see +@secref["fixnums+flonums"]), we can re-write @scheme[mandelbrot] to use +much less allocation: + +@schemeblock[ +(define (mandelbrot iterations x y n) + (let ((ci (fl- (fl/ (* 2.0 (->fl y)) (->fl n)) 1.0)) + (cr (fl- (fl/ (* 2.0 (->fl x)) (->fl n)) 1.5))) + (let loop ((i 0) (zr 0.0) (zi 0.0)) + (if (> i iterations) + i + (let ((zrq (fl* zr zr)) + (ziq (fl* zi zi))) + (cond + ((fl> (fl+ zrq ziq) 4.0) i) + (else (loop (add1 i) + (fl+ (fl- zrq ziq) cr) + (fl+ (fl* 2.0 (fl* zr zi)) ci))))))))) +] + +This conversion can speed up @scheme[mandelbrot] by a factor of 8, even +in sequential mode, but avoiding allocation also allows +@scheme[mandelbrot] to run usefully faster in parallel. + +As a general guideline, any operation that is inlined by the +@tech{JIT} compiler runs safely in parallel, while other operations +that are not inlined (including all operations if the JIT compiler is +disabled) are considered unsafe. The @exec{mzc} decompiler tool +annotates operations that can be inlined by the compiler (see +@secref[#:doc '(lib "scribblings/mzc/mzc.scrbl") "decompile"]), so the +decompiler can be used to help predict parallel performance. + +To more directly report what is happening in a program that uses +@scheme[future] and @scheme[touch], operations are logged when they +suspend a computation or synchronize with the main computation. 
For +example, running the original @scheme[mandelbrot] in a future produces +the following output at the @scheme['debug] log level: + +@margin-note{To see @scheme['debug] logging output on stderr, set the +@envvar{PLTSTDERR} environment variable to @tt{debug} or start +@exec{mzscheme} with @Flag{W} @tt{debug}.} + +@verbatim[#:indent 2]|{ + future: 0 waiting for runtime at 1267392979341.989: * +}| + +The message indicates which internal future-running task became +blocked on an unsafe operation, the time it blocked (in terms of +@scheme[current-inexact-milliseconds]), and the operation that caused +the computation to block. + +The first revision to @scheme[mandelbrot] avoids suspending at +@scheme[*], but produces many log entries of the form + +@verbatim[#:indent 2]|{ + future: 0 waiting for runtime at 1267392980465.066: [acquire_gc_page] +}| diff --git a/collects/scribblings/reference/futures.scrbl b/collects/scribblings/reference/futures.scrbl new file mode 100644 index 0000000000..5cc5fb11a1 --- /dev/null +++ b/collects/scribblings/reference/futures.scrbl @@ -0,0 +1,85 @@ +#lang scribble/doc +@(require "mz.ss" + (for-label scheme + scheme/base + scheme/contract + scheme/future)) + +@(define future-eval (make-base-eval)) +@(interaction-eval #:eval future-eval (require scheme/future)) + +@title[#:tag "futures"]{Futures for Parallelism} + +@note-lib[scheme/future] + +@margin-note{Currently, parallel support for @scheme[future] is +enabled by default for Windows, Linux x86/x86_64, and Mac OS X +x86/x86_64. To enable support for other platforms, use +@DFlag{enable-futures} with @exec{configure} when building PLT +Scheme.} + +The @scheme[future] and @scheme[touch] functions from +@schememodname[scheme/future] provide access to parallelism as +supported by the hardware and operating system. +In contrast to @scheme[thread], which provides concurrency for +arbitrary computations without parallelism, @scheme[future] provides +parallelism for limited computations. 
A future executes its work in +parallel (assuming that support for parallelism is available) until it +detects an attempt to perform an operation that is too complex for the +system to run safely in parallel. Similarly, work in a future is +suspended if it depends in some way on the current continuation, such +as raising an exception. A suspended computation for a future is +resumed when @scheme[touch] is applied to the future descriptor. + +``Safe'' parallel execution of a future means that all operations +provided by the system must be able to enforce contracts and produce +results as documented. ``Safe'' does not preclude concurrent access to +mutable data that is visible in the program. For example, a +computation in a future might use @scheme[set!] to modify a shared +variable, in which case concurrent assignment to the variable can be +visible in other futures and threads. Furthermore, guarantees about +the visibility of effects and ordering are determined by the operating +system and hardware---which rarely support, for example, the guarantee +of sequential consistency that is provided for @scheme[thread]-based +concurrency. At the same time, operations that seem obviously safe may +have a complex enough implementation internally that they cannot run +in parallel. See also @guidesecref["effective-futures"]. + +@deftogether[( +@defproc[(future [thunk (-> any)]) future?] +@defproc[(touch [f future?]) any] +)]{ + + The @scheme[future] procedure returns a future-descriptor value that + encapsulates @scheme[thunk]. The @scheme[touch] function forces the + evaluation of the @scheme[thunk] inside the given future, returning + the values produced by @scheme[thunk]. After @scheme[touch] forces + the evaluation of a @scheme[thunk], the resulting values are retained + by the future descriptor in place of @scheme[thunk], and additional + @scheme[touch]es of the future descriptor return those values. 
+ + Between a call to @scheme[future] and @scheme[touch] for a given + future, the given @scheme[thunk] may run speculatively in parallel to + other computations, as described above. + +@interaction[ +#:eval future-eval +(let ([f (future (lambda () (+ 1 2)))]) + (list (+ 3 4) (touch f))) +]} + + +@defproc[(future? [v any/c]) boolean?]{ + Returns @scheme[#t] if @scheme[v] is a future-descriptor value, + @scheme[#f] otherwise. +} + +@defproc[(processor-count) exact-positive-integer?]{ + Returns the number of parallel computation units (e.g., processors + or cores) that are available on the current machine. +} + + +@; ---------------------------------------------------------------------- + +@close-eval[future-eval]