update compilation info, especially for CS

Update the Guide's performance section with current information for
Racket CS, and also document the Racket CS compilation mode and
inspection environment variables. Make a couple of environment
variables work more consistently: PLTDISABLEGC for CS and PLT_ZO_PATH
for BC.
Matthew Flatt 2020-06-14 14:26:22 -06:00
parent 6309c64f6c
commit 2a7d94d89c
16 changed files with 291 additions and 55 deletions

View File

@ -5,7 +5,8 @@
compiler/decompile
compiler/compilation-path
racket/pretty
racket/format)
racket/format
"../private/chez.rkt")
(define (get-name)
(string->symbol (short-program+command-name)))
@ -27,6 +28,8 @@
(pretty-print-columns num))]
[("--linklet") "Decompile to linklets"
(set! to-linklets? #t)]
[("--no-disassemble") "Show machine code as-is"
(current-can-disassemble #f)]
#:args source-or-bytecode-file
source-or-bytecode-file))

View File

@ -189,7 +189,7 @@
`((begin-for-all
(define (.get-syntax-literal! pos)
....
,(decompile-data-linklet l)
,@(decompile-data-linklet l)
....)))
null))))

View File

@ -4,7 +4,10 @@
racket/promise)
(provide decompile-chez-procedure
unwrap-chez-interpret-jitified)
unwrap-chez-interpret-jitified
current-can-disassemble)
(define current-can-disassemble (make-parameter #t))
(define (decompile-chez-procedure p)
(unless (procedure? p)
@ -106,7 +109,8 @@
null))
;; Show machine/assembly code:
(cond
[(force disassemble-bytes)
[(and (current-can-disassemble)
(force disassemble-bytes))
=> (lambda (disassemble-bytes)
(define o (open-output-bytes))
(parameterize ([current-output-port o])

View File

@ -13,4 +13,4 @@
(define pkg-authors '(mflatt))
(define version "1.7")
(define version "1.8")

View File

@ -100,7 +100,7 @@ variants versus the @tech{CS} variant.
@; ----------------------------------------------------------------------
@section[#:tag "JIT"]{The Bytecode and Just-in-Time (JIT) Compilers}
@section[#:tag "JIT"]{Bytecode, Machine Code, and Just-in-Time (JIT) Compilers}
Every definition or expression to be evaluated by Racket is compiled
to an internal bytecode format, although ``bytecode'' may actually be
@ -131,7 +131,7 @@ compiled to native code via a @deftech{just-in-time} or @deftech{JIT}
compiler. The @tech{JIT} compiler substantially speeds programs that
execute tight loops, arithmetic on small integers, and arithmetic on
inexact real numbers. Currently, @tech{JIT} compilation is supported
for x86, x86_64 (a.k.a. AMD64), ARM, and 32-bit PowerPC processors.
for x86, x86_64 (a.k.a. AMD64), 32-bit ARM, and 32-bit PowerPC processors.
The @tech{JIT} compiler can be disabled via the
@racket[eval-jit-enabled] parameter or the @DFlag{no-jit}/@Flag{j}
command-line flag for @exec{racket}. Setting @racket[eval-jit-enabled]
@ -146,6 +146,10 @@ not counting the bodies of any lexically nested procedures. The
overhead for @tech{JIT} compilation is normally so small that it is
difficult to detect.
For information about viewing intermediate Racket code
representations, especially for the @tech{CS} variant, see
@refsecref["compiler-inspect"].
@; ----------------------------------------------------------------------
@section[#:tag "modules-performance"]{Modules and Performance}
@ -350,8 +354,8 @@ bindings.
A @deftech{fixnum} is a small exact integer. In this case, ``small''
depends on the platform. For a 32-bit machine, numbers that can be
expressed in 30 bits plus a sign bit are represented as fixnums. On a
64-bit machine, 62 bits plus a sign bit are available.
expressed in 29-30 bits plus a sign bit are represented as fixnums. On
a 64-bit machine, 60-62 bits plus a sign bit are available.
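Because the fixnum range differs by platform and by Racket variant, a program can probe it instead of assuming a width. The following is a minimal sketch; the helper name `fixnum-magnitude-bits` is hypothetical and is not part of any Racket library.

```racket
#lang racket/base

;; Hypothetical helper: counts how many bits (not counting the sign bit)
;; a non-negative fixnum can use, by doubling until `fixnum?` fails.
(define (fixnum-magnitude-bits)
  (let loop ([n 1] [bits 0])
    (if (fixnum? n)
        (loop (* 2 n) (add1 bits))
        bits)))

;; The result depends on the platform and the Racket variant.
(fixnum-magnitude-bits)
```

Going by the ranges stated above, the result should fall between 29 and 62.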
A @deftech{flonum} is used to represent any inexact real number. They
correspond to 64-bit IEEE floating-point numbers on all platforms.
@ -379,8 +383,7 @@ typically cheap to use.
@tech{flonum}-specific operations.}
The @racketmodname[racket/flonum] library provides flonum-specific
operations, and combinations of flonum operations allow the @tech{JIT}
compiler for the @tech{3m} and @tech{CGC} variants of Racket
operations, and combinations of flonum operations allow the compiler
to generate code that avoids boxing and unboxing intermediate
results. Besides results within immediate combinations,
flonum-specific results that are bound with @racket[let] and consumed
@ -388,14 +391,37 @@ by a later flonum-specific operation are unboxed within temporary
storage. @margin-note*{Unboxing applies most reliably to uses of a
flonum-specific operation with two arguments.}
Finally, the compiler can detect some flonum-valued loop
accumulators and avoid boxing of the accumulator. The bytecode
decompiler (see @secref[#:doc '(lib "scribblings/raco/raco.scrbl")
"decompile"]) annotates combinations where the JIT can avoid boxes with
@racketidfont{#%flonum}, @racketidfont{#%as-flonum}, and
@racketidfont{#%from-flonum}.
accumulators and avoid boxing of the accumulator.
@margin-note*{Unboxing of local bindings and accumulators is not
supported by the @tech{3m} variant's JIT for PowerPC.}
@margin-note{Unboxing of local bindings and accumulators is not
supported by the JIT for PowerPC.}
For some loop patterns, the compiler may need hints to enable
unboxing. For example:
@racketblock[
(define (flvector-sum vec init)
(let loop ([i 0] [sum init])
(if (fx= i (flvector-length vec))
sum
(loop (fx+ i 1) (fl+ sum (flvector-ref vec i))))))
]
The compiler may not be able to unbox @racket[sum] in this example for
two reasons: it cannot determine locally that its initial value from
@racket[init] will be a flonum, and it cannot tell locally that the
@racket[eq?] identity of the result @racket[sum] is irrelevant.
Changing the reference @racket[init] to @racket[(fl+ init)] and
changing the result @racket[sum] to @racket[(fl+ sum)] gives the
compiler hints and license to unbox @racket[sum].
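The two changes just described can be sketched as follows. This assumes a Racket version in which `fl+` accepts a single argument, and whether `sum` is actually unboxed still depends on the compiler.

```racket
#lang racket/base
(require racket/flonum racket/fixnum)

;; Same loop as above, with the two hints applied:
;; `(fl+ init)` tells the compiler that the initial value is a flonum, and
;; `(fl+ sum)` makes the `eq?` identity of the result irrelevant.
(define (flvector-sum vec init)
  (let loop ([i 0] [sum (fl+ init)])
    (if (fx= i (flvector-length vec))
        (fl+ sum)
        (loop (fx+ i 1) (fl+ sum (flvector-ref vec i))))))

(flvector-sum (flvector 1.0 2.0 3.5) 0.0) ; => 6.5
```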
The bytecode decompiler (see @secref[#:doc '(lib
"scribblings/raco/raco.scrbl") "decompile"]) for the @tech{3m} variant
annotates combinations where the JIT can avoid boxes with
@racketidfont{#%flonum}, @racketidfont{#%as-flonum}, and
@racketidfont{#%from-flonum}. For the @tech{CS} variant, the
``bytecode'' decompiler shows machine code, but install the
@filepath{disassemble} package to potentially see the machine code as
machine-specific assembly code. See also @refsecref["compiler-inspect"].
The @racketmodname[racket/unsafe/ops] library provides unchecked
fixnum- and flonum-specific operations. Unchecked flonum-specific
@ -475,7 +501,6 @@ string or byte string, write a constant @tech{regexp} using an
(regexp-match? pattern-rx str)))
]
@; ----------------------------------------------------------------------
@section[#:tag "gc-perf"]{Memory Management}
@ -611,7 +636,7 @@ Imagine you're designing a data structure that needs to
hold onto some value temporarily but then should clear a field or
somehow break a link to avoid referencing that value so it can be
collected. Weak boxes are a good way to test that your data structure
properly clears the value. This is, you might write a test case
properly clears the value. That is, you might write a test case
that builds a value, extracts some other value from it
(that you hope becomes unreachable), puts the extracted value into a weak-box,
and then checks to see if the value disappears from the box.
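A minimal sketch of such a test case follows. The `holder` structure and its `payload` field are hypothetical stand-ins for the data structure under test. Whether the weak box is empty after a single collection can depend on the collector, so a real test may need to allow for that.

```racket
#lang racket/base

;; Hypothetical data structure with a field that should be cleared after use.
(struct holder ([payload #:mutable]))

(define h (holder (make-vector 100 'data)))

;; Put the extracted value into a weak box, so that the box itself
;; does not keep the value reachable.
(define wb (make-weak-box (holder-payload h)))

;; The operation under test: break the link to the value.
(set-holder-payload! h #f)

;; Request a major collection, then check whether the value disappeared.
(collect-garbage)
(define cleared? (not (weak-box-value wb)))
```

If `cleared?` is `#f`, something still references the value, which may mean the data structure did not release it.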
@ -651,31 +676,51 @@ only the most recently allocated objects, and long pauses for infrequent
For some applications, such as animations and games,
long pauses due to a major collection can interfere
unacceptably with a program's operation. To reduce major-collection
pauses, the Racket garbage collector supports @deftech{incremental
garbage-collection} mode. In incremental mode, minor collections
create longer (but still relatively short) pauses by performing extra
work toward the next major collection. If all goes well, most of a
major collection's work has been performed by minor collections the
time that a major collection is needed, so the major collection's
pause is as short as a minor collection's pause. Incremental mode
tends to run more slowly overall, but it can
provide much more consistent real-time behavior.
pauses, the @tech{3m} garbage collector supports @deftech{incremental
garbage-collection} mode, and the @tech{CS} garbage collector supports
a useful approximation:
If the @envvar{PLT_INCREMENTAL_GC} environment variable is set
to a value that starts with @litchar{1}, @litchar{y}, or @litchar{Y}
when Racket starts, incremental mode is permanently enabled. Since
incremental mode is only useful for certain parts of some programs,
however, and since the need for incremental mode is a property of a
program rather than its environment, the preferred way to enable
incremental mode is with @racket[(collect-garbage 'incremental)].
@itemlist[
@item{In @tech{3m}'s incremental mode, minor collections create longer
(but still relatively short) pauses by performing extra work
toward the next major collection. If all goes well, most of a
major collection's work has been performed by minor collections
by the time that a major collection is needed, so the major
collection's pause is as short as a minor collection's pause.
Incremental mode tends to run more slowly overall, but it can
provide much more consistent real-time behavior.}
@item{In @tech{CS}'s incremental mode, objects are never promoted out
of the category of ``recently allocated,'' although there are
degrees of ``recently'' so that most minor collections can still
skip recent-but-not-too-recent objects. In the common case that
most of the memory used for an animation or game is allocated on
startup (including its code and the code of the Racket runtime
system), a major collection may never become necessary.}
]
If the @envvar{PLT_INCREMENTAL_GC} environment variable is set to a
value that starts with @litchar{0}, @litchar{n}, or @litchar{N} when
Racket starts, incremental mode is permanently disabled. For
@tech{3m}, if the @envvar{PLT_INCREMENTAL_GC} environment variable is
set to a value that starts with @litchar{1}, @litchar{y}, or
@litchar{Y} when Racket starts, incremental mode is permanently
enabled. Since incremental mode is only useful for certain parts of
some programs, however, and since the need for incremental mode is a
property of a program rather than its environment, the preferred way
to enable incremental mode is with @racket[(collect-garbage
'incremental)].
Calling @racket[(collect-garbage 'incremental)] does not perform an
immediate garbage collection, but instead requests that each minor
collection perform incremental work up to the next major collection.
The request expires with the next major collection. Make a call to
collection perform incremental work up to the next major collection
(unless incremental mode is permanently disabled). The request
expires with the next major collection. Make a call to
@racket[(collect-garbage 'incremental)] in any repeating task within
an application that needs to be responsive in real time. Force a
full collection with @racket[(collect-garbage)] just before an initial
an application that needs to be responsive in real time. Force a full
collection with @racket[(collect-garbage)] just before an initial
@racket[(collect-garbage 'incremental)] to initiate incremental mode
from an optimal state.
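The usage pattern described above can be sketched as follows. The names `run-frames` and `step!` are hypothetical, and a real application would do its per-frame work where `step!` is called.

```racket
#lang racket/base

;; Hypothetical driver for a repeating, real-time task.
(define (run-frames n step!)
  ;; Force a full collection first, so that incremental mode
  ;; starts from a good state.
  (collect-garbage)
  (for ([i (in-range n)])
    ;; Renew the request on every iteration, since it expires
    ;; with the next major collection.
    (collect-garbage 'incremental)
    (step! i)))

(define frames-run 0)
(run-frames 3 (lambda (i) (set! frames-run (add1 frames-run))))
```

Renewing the request inside the loop is cheap, because `(collect-garbage 'incremental)` does not itself perform a collection.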
@ -688,5 +733,5 @@ times, enable @tt{debug}-level logging output for the
runs @filepath{main.rkt} with garbage-collection logging to stderr
(while preserving @tt{error}-level logging for all topics). Minor
collections are reported by @litchar{min} lines, incremental-mode minor
collection are reported with @litchar{mIn} lines, and major
collections on @tech{3m} are reported with @litchar{mIn} lines, and major
collections are reported with @litchar{MAJ} lines.

View File

@ -2,6 +2,7 @@
(require scribble/manual)
(provide inside-doc
guide-doc
reference-doc)
(define inside-doc

View File

@ -12,9 +12,12 @@
The @exec{raco decompile} command takes the path of a bytecode file (which usually
has the file extension @filepath{.zo}) or a source file with an
associated bytecode file (usually created with @exec{raco make}) and
converts the bytecode file's content back to an approximation of Racket code. Decompiled
bytecode is mostly useful for checking the compiler's transformation
and optimization of the source program.
converts the bytecode file's content back to an approximation of Racket code.
When the ``bytecode'' file contains machine code, as for the @tech[#:doc guide-doc]{CS}
variant of Racket, then it cannot be converted back to an approximation of
Racket, but installing the @filepath{disassemble} package may enable disassembly
of the machine code. Decompilation is mostly useful for checking the
compiler's transformation and optimization of the source program.
The @exec{raco decompile} command accepts the following command-line flags:
@ -23,11 +26,16 @@ The @exec{raco decompile} command accepts the following command-line flags:
given file's path and an associated @filepath{.zo} file (if any)}
@item{@Flag{n} @nonterm{n} or @DFlag{columns} @nonterm{n} --- format
output for a display with @nonterm{n} columns}
@item{@DFlag{linklet} --- decompile only as far as linklets, instead
of decoding linklets to approximate Racket @racket[module] forms}
@item{@DFlag{no-disassemble} --- show machine code as-is in a byte string,
instead of attempting to disassemble}
]
Many forms in the decompiled code, such as @racket[module],
@racket[define], and @racket[lambda], have the same meanings as
always. Other forms and transformations are specific to the rendering
To the degree that it can be converted back to Racket code,
many forms in the decompiled code have the same meanings as
always, such as @racket[module], @racket[define], and @racket[lambda].
Other forms and transformations are specific to the rendering
of bytecode, and they reflect a specific execution model:
@itemize[
@ -125,8 +133,16 @@ Many forms in the decompiled code, such as @racket[module],
@item{A @racketidfont{#%decode-syntax} form corresponds to a syntax
object.}
@item{A @racketidfont{#%machine-code} form corresponds to machine code
that is not disassembled, where the machine code is in a byte string.}
@item{A @racketidfont{#%assembly-code} form corresponds to disassembled
machine code, where the assembly code is shown as a sequence of strings.}
]
@history[#:changed "1.8" @elem{Added @DFlag{no-disassemble}.}]
@; ------------------------------------------------------------
@section{API for Decompiling}

View File

@ -0,0 +1,132 @@
#lang scribble/doc
@(require "mz.rkt")
@title[#:tag "compiler"]{Controlling and Inspecting Compilation}
Racket programs and expressions are compiled automatically and
on-the-fly. The @exec{raco make} tool (see @secref[#:doc raco-doc
"make"]) can compile a Racket module to a compiled @filepath{.zo}
file, but that kind of ahead-of-time compilation simply allows a
program to start more quickly, and it does not affect the
performance of a Racket program.
@; ------------------------------------------------------------
@section[#:tag "compiler-modes"]{Compilation Modes}
All Racket variants support a machine-independent compilation mode,
which generates compiled @filepath{.zo} files that work with all
Racket variants on all platforms. To select machine-independent
compilation mode, set the @racket[current-compile-target-machine]
parameter to @racket[#f] or supply the @DFlag{compile-any}/@Flag{M}
flag on startup. See @racket[current-compile-target-machine] for more
information.
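As a rough illustration of selecting machine-independent mode from within a program, the sketch below compiles one expression with `current-compile-target-machine` set to `#f`. The expression and the use of a fresh base namespace are arbitrary choices made for the example.

```racket
#lang racket/base

;; Compile a small expression in machine-independent mode.
(define compiled
  (parameterize ([current-namespace (make-base-namespace)]
                 [current-compile-target-machine #f])
    (compile '(lambda (x) (+ x 1)))))

(compiled-expression? compiled) ; => #t
```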
Other compilation modes depend on the Racket variant
(3m/CGC versus CS).
@subsection[#:tag "3m-compiler-modes"]{3m and CGC Compilation Modes}
The 3m and CGC variants of Racket support two
compilation modes: bytecode and machine-independent. The bytecode
format is also machine-independent in the sense that it works the same
on all operating systems for the 3m and/or CGC variants
of Racket, but it does not work with the CS variant of Racket.
Bytecode is further compiled to machine code at run time, unless the
JIT compiler is disabled. See @racket[eval-jit-enabled].
@subsection[#:tag "cs-compiler-modes"]{CS Compilation Modes}
The CS variant of Racket supports several compilation modes:
machine code, machine-independent, interpreted, and JIT. Machine code
is the primary mode, and the machine-independent mode is the same as
for 3m and CGC. Interpreted mode uses an interpreter at
the level of core @tech{linklet} forms with no compilation. JIT mode
triggers compilation of individual function forms on demand.
The default mode is a hybrid of machine-code and interpreter modes,
where interpreter mode is used only for the outer contour of an
especially large linklet, and machine-code mode is used for functions
that are small enough within that outer contour. ``Small enough'' is
determined by the @envvar-indexed{PLT_CS_COMPILE_LIMIT} environment
variable, and the default value of 10000 means that most Racket
modules have no interpreted component.
JIT compilation mode is used only if the @envvar-indexed{PLT_CS_JIT}
environment variable is set on startup, otherwise pure interpreter
mode is used only if @envvar-indexed{PLT_CS_INTERP} is set on startup,
and the default hybrid machine code and interpreter mode is used if
@envvar-indexed{PLT_CS_MACH} is set and @envvar{PLT_CS_JIT} is not set
or if none of those environment variables is set. A module compiled in
any mode can be loaded into the CS variant of Racket independent of
the current compilation mode.
The @envvar{PLT_CS_DEBUG} environment variable, as described in
@secref["debugging"], affects only compilation in machine-code mode.
Generated machine code is much larger when @envvar{PLT_CS_DEBUG} is
enabled, but performance is not otherwise affected.
@; ------------------------------------------------------------
@section[#:tag "compiler-inspect"]{Inspecting Compiler Passes}
When the @envvar-indexed{PLT_LINKLET_SHOW} environment variable is set
on startup, the Racket process's standard output shows intermediate
compiled forms whenever a Racket form is compiled. For all Racket
variants, the output shows one or more @tech{linklets} that are
generated from the original Racket form.
For the CS variant of Racket, a ``schemified'' version of the linklet
is also shown as the translation of the @racket[linklet] form to a
Chez Scheme procedure form. The following environment variables imply
@envvar{PLT_LINKLET_SHOW} and show additional intermediate compiled
forms or adjust the way forms are displayed:
@itemlist[
@item{@envvar-indexed{PLT_LINKLET_SHOW_GENSYM} --- prints full
generated names, instead of abbreviations that may conflate
different symbols}
@item{@envvar-indexed{PLT_LINKLET_SHOW_PRE_LIFT} --- shows a
schemified form before closure transformations are applied}
@item{@envvar-indexed{PLT_LINKLET_SHOW_PRE_JIT} --- shows a
schemified form before a transformation to JIT mode, which
applies only when @envvar{PLT_CS_JIT} is set}
@item{@envvar-indexed{PLT_LINKLET_SHOW_LAMBDA} --- shows individual
schemified forms that are compiled within a larger form that
has an interpreted outer contour}
@item{@envvar-indexed{PLT_LINKLET_SHOW_POST_LAMBDA} --- shows an
outer form after inner individual forms are compiled}
@item{@envvar-indexed{PLT_LINKLET_SHOW_POST_INTERP} --- shows an
outer form after its transformation to interpretable form}
@item{@envvar-indexed{PLT_LINKLET_SHOW_JIT_DEMAND} --- shows JIT
compilation of forms that were previously prepared by
compilation with @envvar{PLT_CS_JIT} set}
@item{@envvar-indexed{PLT_LINKLET_SHOW_KNOWN} --- shows recorded
known-binding information alongside a schemified form}
@item{@envvar-indexed{PLT_LINKLET_SHOW_CP0} --- shows a schemified
form after transformation by Chez Scheme's front-end
optimizer}
@item{@envvar-indexed{PLT_LINKLET_SHOW_ASSEMBLY} --- shows the
compiled form of a schemified linklet in Chez Scheme's
abstraction of machine instructions}
]
When the @envvar-indexed{PLT_LINKLET_TIMES} environment variable is
set on startup, then Racket prints cumulative timing information about
compilation and evaluation times on exit. When the
@envvar-indexed{PLT_EXPANDER_TIMES} environment variable is set,
information about macro-expansion time is printed on exit.

View File

@ -5,12 +5,17 @@
Racket's built-in debugging support is limited to context (i.e.,
``stack trace'') information that is printed with an exception. In
some cases, disabling the @tech{JIT} compiler can affect context
information. The @racketmodname[errortrace] library supports more
consistent (independent of the @tech{JIT} compiler) and precise context
information. The @racketmodname[racket/trace] library provides simple
some cases, for 3m and CGC variants of Racket, disabling the
@tech{JIT} compiler can affect context information. For the CS variant
of Racket, setting the @envvar-indexed{PLT_CS_DEBUG} environment
variable causes compilation to record expression-level context
information, instead of just function-level information.
The @racketmodname[errortrace] library supports more consistent
(independent of the compiler) and precise context
information. The @racketmodname[racket/trace] library provides simple
tracing support. Finally, the @seclink[#:doc '(lib
"scribblings/drracket/drracket.scrbl") "top" #:indirect? #t]{DrRacket} programming environment
provides much more debugging support.
"scribblings/drracket/drracket.scrbl") "top" #:indirect? #t]{DrRacket}
programming environment provides much more debugging support.
@include-section["trace.scrbl"]

View File

@ -3,6 +3,13 @@
@title[#:tag "eval"]{Evaluation and Compilation}
@guideintro["reflection"]{dynamic evaluation}
Racket provides programmatic control over evaluation through
@racket[eval] and related functions. See @secref["compiler"] for
information about extra-linguistic facilities related to the Racket
compiler.
@defparam[current-eval proc (any/c . -> . any)]{
A @tech{parameter} that determines the current @deftech{evaluation handler}.
@ -373,7 +380,13 @@ path. (The directory need not exist.)}
A list of relative paths, which defaults to @racket[(list
(string->path "compiled"))]. It is used by the @tech{compiled-load
handler} (see @racket[current-load/use-compiled]).}
handler} (see @racket[current-load/use-compiled]).
If the @envvar-indexed{PLT_ZO_PATH} environment variable is set on
startup, it supplies a path instead of @racket["compiled"] to
use for the initial parameter value.
@history[#:changed "7.7.0.9" @elem{Added @envvar{PLT_ZO_PATH}.}]}
@defparam*[current-compiled-file-roots paths (listof (or/c path-string? 'same)) (listof (or/c path? 'same))]{

View File

@ -219,6 +219,9 @@ to a value that starts with @litchar{1}, @litchar{y}, or @litchar{Y} to
request incremental mode at all times, but calling
@racket[(collect-garbage 'incremental)] in a program with a periodic
task is generally a better mechanism for requesting incremental mode.
Set the @as-index{@envvar{PLT_INCREMENTAL_GC}} environment variable
to a value that starts with @litchar{0}, @litchar{n}, or @litchar{N} to
disable incremental-mode requests.
Each garbage collection logs a message (see @secref["logging"]) at the
@racket['debug] level with topic @racket['GC]. In Racket 3m and CS

View File

@ -162,3 +162,6 @@
(history #:changed "7.0.0.13" @elem{Allow one argument, in addition to allowing two or more.}
arg ...))
(provide envvar-indexed)
(define (envvar-indexed s)
(as-index (envvar s)))

View File

@ -10,4 +10,5 @@
@include-section["help.scrbl"]
@include-section["interactive.scrbl"]
@include-section["debugging.scrbl"]
@include-section["compiler.scrbl"]
@include-section["kernel.scrbl"]

View File

@ -255,7 +255,7 @@ flags:
@item{@FlagFirst{c} or @DFlagFirst{no-compiled} : Disables loading
of compiled byte-code @filepath{.zo} files, by initializing
@racket[current-compiled-file-paths] to @racket[null].
@racket[use-compiled-file-paths] to @racket[null].
Use judiciously: this effectively ignores the content of all
@filepath{compiled} subdirectories, so that any used modules are
compiled on the fly---even @racketmodname[racket/base] and

View File

@ -843,6 +843,9 @@
(#%memv (string-ref s 0) '(#\0 #\n #\N)))
(set-incremental-collection-enabled! #f)))
(when (getenv "PLTDISABLEGC")
(collect-request-handler void))
(when version?
(display (banner)))
(call/cc ; Chez Scheme's `call/cc`, used here to escape from the Racket-thread engine loop

View File

@ -1299,6 +1299,13 @@ static int run_from_cmd_line(int argc, char *_argv[],
if (no_compiled)
scheme_set_compiled_file_paths(scheme_make_null());
else {
char *s;
s = getenv("PLT_ZO_PATH");
if (s)
scheme_set_compiled_file_paths(scheme_make_pair(scheme_make_path(s),
scheme_make_null()));
}
/* Setup compiled-file search path: */
if (!compiled_paths) {