Update and revise docs.

original commit: 3cc51f20ac787c87b3d337d96e2afa0eecd5452d
This commit is contained in:
Eli Barzilay 2011-09-06 16:32:20 -04:00
parent 71b66bbf9f
commit 9f36d5a2cb
5 changed files with 139 additions and 137 deletions

View File

@ -1,45 +1,45 @@
#lang scribble/doc #lang scribble/doc
@(require scribble/manual @(require scribble/manual
(for-label scheme profile/analyzer)) (for-label racket/base profile/analyzer))
@title[#:tag "analyzer"]{Analyzing Profile Data} @title[#:tag "analyzer"]{Analyzing Profile Data}
@defmodule[profile/analyzer] @defmodule[profile/analyzer]
Once a profile run is done, and the results are collected, the next Once a profile run is done and the results are collected, the next
step is to analyze the data. In this step the sample time are step is to analyze the data. In this step sample times are computed
computed and summed, a call-graph representing the observed function and summed, a call-graph representing observed function calls is
calls is built, and per-node and per-edge information is created. built, and per-node and per-edge information is created. This is the
This is the job of the main function provided by job of the main function provided by @racket[profile/analyzer].
@racket[profile/analyzer].
@defproc[(analyze-samples [raw-sample-data any/c]) @defproc[(analyze-samples [raw-sample-data any/c])
profile?]{ profile?]{
This function consumes the raw result of the This function consumes the raw result of the
@seclink["sampler"]{sampler} (which is given in an undocumented form), @seclink["sampler"]{sampler} (given in an undocumented form), analyzes
analyzes it, and returns a @racket[profile] value holding the analyzed it, and returns a @racket[profile] value holding the analyzed results.
results. Without this function, the results of the sampler are Without this function, the results of the sampler should be considered
meaningless.} meaningless.}
@defstruct[profile ([total-time exact-nonnegative-integer?] @defstruct*[profile ([total-time exact-nonnegative-integer?]
[cpu-time exact-nonnegative-integer?] [cpu-time exact-nonnegative-integer?]
[sample-number exact-nonnegative-integer?] [sample-number exact-nonnegative-integer?]
[thread-times (listof (cons exact-nonnegative-integer? [thread-times (listof (cons exact-nonnegative-integer?
exact-nonnegative-integer?))] exact-nonnegative-integer?))]
[nodes (listof node?)] [nodes (listof node?)]
[*-node node?])]{ [*-node node?])]{
Represents the analyzed profile result. Represents an analyzed profile result.
@itemize[ @itemize[
@item{@racket[total-time] is the total observed time (in milliseconds) @item{@racket[total-time] is the total observed time (in milliseconds)
included in the profile. This is different than the actual time the included in the profile run. This can be different from the actual
profiling took, due to unaccounted-for time spent in untracked time the profiling took, due to unaccounted-for time spent in
threads. (E.g., the sampler thread itself.)} untracked threads. (E.g., time spent in the sampler thread
itself.)}
@item{@racket[cpu-time] is the actual cpu time consumed by the process @item{@racket[cpu-time] is the actual cpu time consumed by the process
during the profiler's work.} during the profiler's work.}
@ -61,69 +61,70 @@ Represents the analyzed profile result.
amount of time (time spent either in the function or in its callees) amount of time (time spent either in the function or in its callees)
as a secondary key.} as a secondary key.}
@item{@racket[*-node] holds a ``special'' node value that is @item{@racket[*-node] holds a ``special'' root node value that is
constructed for every graph. This node is used as the caller for constructed for every call graph. This node is used as the caller
all top-level function nodes and as the callee for all leaf nodes. for all top-level function nodes and as the callee for all leaf
It can therefore be used to start a scan of the call graph. In nodes. It can therefore be used to start a recursive scan of the
addition, the times associated with its "callers and callees" call graph. In addition, the times associated with its ``callers''
actually represent the time these functions spent being the root of and ``callees'' actually represent the time these functions spent
the computation or its leaf. (This can be different from a node's being the root of the computation or its leaf. (This can be
``self'' time, since it is divided by the number of instances a different from a node's ``self'' time, since it is divided by the
function had on the stack for every sample --- so for recursive number of instances a function had on the stack in each sample---so
functions this value is different from.)} for recursive functions this value is always different from the
``self'' time.)}
]} ]}
@defstruct[node ([id (or/c #f symbol?)] @defstruct*[node ([id (or/c #f symbol?)]
[src (or/c #f srcloc?)] [src (or/c #f srcloc?)]
[thread-ids (listof exact-nonnegative-integer?)] [thread-ids (listof exact-nonnegative-integer?)]
[total exact-nonnegative-integer?] [total exact-nonnegative-integer?]
[self exact-nonnegative-integer?] [self exact-nonnegative-integer?]
[callers (listof edge?)] [callers (listof edge?)]
[callees (listof edge?)])]{ [callees (listof edge?)])]{
Represents a function call node in the call graph of an analyzed Represents a function call node in the call graph of an analyzed
profile result. profile result.
@itemize[ @itemize[
@item{The @racket[id] and @racket[src] field hold a symbol naming the @item{The @racket[id] and @racket[src] fields hold a symbol naming the
function and/or its source location as a @racket[srcloc] value. function and/or its source location as a @racket[srcloc] value.
This is the same as the results of This is the same as the results of
@racket[continuation-mark-set->context], so at most of of these can @racket[continuation-mark-set->context], so at most one of these can
be @racket[#f], except for the special @racket[*-node] (see the be @racket[#f], except for the special @racket[*-node] (see the
@racket[profile] struct) that can be identified by both of these @racket[profile] struct) that can be identified by both being
being @racket[#f].} @racket[#f].}
@item{@racket[thread-ids] holds a list of thread identifiers that were @item{@racket[thread-ids] holds a list of thread identifiers that were
observed executing this function.} observed executing this function.}
@item{@racket[total] holds the total time (in milliseconds) where this @item{@racket[total] holds the total time (in milliseconds) that this
function was anywhere on the stack. It is common to see a few function was anywhere on the stack. It is common to see a few
toplevel functions that have close to a 100% total time, but toplevel functions that have close to a 100% total time, but
otherwise small @racket[self] times --- these functions are the ones otherwise small @racket[self] times---these functions are the ones
that derive the work that was done, but they don't do any hard work that initiate the actual work, but they don't do any hard work
directly.} directly.}
@item{@racket[self] holds the total time (in milliseconds) where this @item{@racket[self] holds the total time (in milliseconds) that this
function was observed as the leaf of the stack. It represents the function was observed as the leaf of the stack. It represents the
actual work done by this function, rather than @racket[total] that actual work done by this function, rather than the @racket[total]
represents the work done by both the function and its callees.} time spent by both the function and its callees.}
@item{@racket[callers] and @racket[callees] hold the list of caller @item{@racket[callers] and @racket[callees] hold the list of callers
and callee nodes. The nodes are not actually held in these lists, and callees. The nodes are not actually held in these lists,
instead, @racket[edge] values are used --- and provide information instead, @racket[edge] values are used---and provide information
specific to an edge in the call-graph.} specific to each edge in the call-graph.}
]} ]}
@defstruct[edge ([total exact-nonnegative-integer?] @defstruct*[edge ([total exact-nonnegative-integer?]
[caller node?] [caller node?]
[caller-time exact-nonnegative-integer?] [caller-time exact-nonnegative-integer?]
[callee node?] [callee node?]
[callee-time exact-nonnegative-integer?])]{ [callee-time exact-nonnegative-integer?])]{
Represents an edge between two function call nodes in the call graph Represents an edge between two function call nodes in the call graph
of an analyzed profile result. of an analyzed profile result.

View File

@ -1,15 +1,15 @@
#lang scribble/doc #lang scribble/doc
@(require scribble/manual @(require scribble/manual
(for-label scheme)) (for-label racket/base))
@title{Profile: Statistical Profiler} @title{Profile: Statistical Profiler}
The @racket[profile] collection implements a statistical profiler. The @racketmodname[profile] collection implements a statistical
The profiling is done by running a background thread that collects profiler. The profiling is done by running a background thread that
stack snapshots via @racket[continuation-mark-set->context], meaning collects stack snapshots via @racket[continuation-mark-set->context],
that the result is an estimate of the execution costs and it is meaning that the result is an estimate of the execution costs and it
limited to the kind of information that is limited to the kind of information that
@racket[continuation-mark-set->context] produces (most notably being @racket[continuation-mark-set->context] produces (most notably being
limited to functions calls, and subject to compiler optimizations); limited to functions calls, and subject to compiler optimizations);
but the result is often useful. In practice, since this method does but the result is often useful. In practice, since this method does

View File

@ -1,7 +1,7 @@
#lang scribble/doc #lang scribble/doc
@(require scribble/manual @(require scribble/manual
(for-label scheme (for-label racket/base
profile/analyzer profile/analyzer
(prefix-in text: profile/render-text) (prefix-in text: profile/render-text)
(prefix-in graphviz: profile/render-graphviz))) (prefix-in graphviz: profile/render-graphviz)))
@ -9,11 +9,11 @@
@title[#:tag "renderers"]{Profile Renderers} @title[#:tag "renderers"]{Profile Renderers}
After collecting the profile samples and analyzing the data, the last After collecting the profile samples and analyzing the data, the last
aspect of profiling is to render the results. The profile collection step of the profiling process is to render the results. The profile
provides several renderers, each providing a rendering function that collection provides several renderers, each providing a rendering
consumes a @racket[profile] instance. See the function that consumes a @racket[profile] instance. See the
@seclink["analyzer"]{analyzer} section for a description of the @seclink["analyzer"]{analyzer} section for a description of the
@racket[profile] struct if you want to implement your own renderer. @racket[profile] struct if you want to implement a new renderer.
@;-------------------------------------------------------------------- @;--------------------------------------------------------------------
@section{Textual Rendering} @section{Textual Rendering}
@ -29,7 +29,7 @@ consumes a @racket[profile] instance. See the
Prints the given @racket[profile] results as a textual table. Prints the given @racket[profile] results as a textual table.
The printout begins with some general facts about the profile, and The printout begins with some general details about the profile, and
then a table that represents the call-graph is printed. Each row in then a table that represents the call-graph is printed. Each row in
this table looks like: this table looks like:
@ -38,17 +38,17 @@ this table looks like:
[N1] N2(N3%) N4(N5%) A ...path/to/source.rkt:12:34 [N1] N2(N3%) N4(N5%) A ...path/to/source.rkt:12:34
C [M3] M4%} C [M3] M4%}
Where actual numbers appear in the printout. The meaning of the where actual numbers appear in the printout. The meaning of the
numbers and labels is as follows: numbers and labels is as follows:
@itemize[ @itemize[
@item{@tt{A} --- the name of the function that this node represents, @item{@tt{A} --- the name of the function that this node represents,
followed by the source location for the function if it is known. followed by the source location for the function if it is known.
The name can be ``???'' for functions with no identifier, but in The name can be ``???'' for anonymous functions, but in this case
this case the source location will identify them.} the source location will identify them.}
@item{@tt{N1} --- an index number associated with this node. This is @item{@tt{N1} --- an index number associated with this node. This is
important in references to this function, since the symbolic names useful in references to this function, since the symbolic names are
are not unique (and some can be missing). The number itself has no not unique (and some can be missing). The number itself has no
significance, it simply goes from 1 up.} significance.}
@item{@tt{N2} --- the time (in milliseconds) that this function has @item{@tt{N2} --- the time (in milliseconds) that this function has
been anywhere in a stack snapshot. This is the total time that the been anywhere in a stack snapshot. This is the total time that the
execution was somewhere in this function or in its callees. execution was somewhere in this function or in its callees.
@ -59,13 +59,12 @@ numbers and labels is as follows:
whole execution.} whole execution.}
@item{@tt{N4} --- the time (in milliseconds) that this function has @item{@tt{N4} --- the time (in milliseconds) that this function has
been at the top of the stack snapshot. This is the time that this been at the top of the stack snapshot. This is the time that this
function consumed doing work itself rather than calling other function was itself doing work rather than calling other functions.
functions. (Corresponds to the @racket[node-self] field.)} (Corresponds to the @racket[node-self] field.)}
@item{@tt{N5} --- this is the percentage of @tt{N4} out of the total @item{@tt{N5} --- this is the percentage of @tt{N4} out of the total
observed time of the profile. Functions with high values here can observed time of the profile. Functions with high values here can
be good candidates for optimization, But, of course, they can be good candidates for optimization, But, of course, they can
represent doing real work due to one of its callers that need to be represent doing real work for a caller that needs to be optimized.}
optimized.}
@item{@tt{B} and @tt{C} --- these are labels for the callers and @item{@tt{B} and @tt{C} --- these are labels for the callers and
callees of the function. Any number of callers and callees can callees of the function. Any number of callers and callees can
appear here (including 0). The function itself can also appear in appear here (including 0). The function itself can also appear in

View File

@ -1,7 +1,7 @@
#lang scribble/doc #lang scribble/doc
@(require scribble/manual @(require scribble/manual
(for-label scheme profile/sampler profile/analyzer)) (for-label racket/base profile/sampler profile/analyzer))
@title[#:tag "sampler"]{Collecting Profile Information} @title[#:tag "sampler"]{Collecting Profile Information}
@ -13,7 +13,7 @@
[super-cust custodian? (current-custodian)]) [super-cust custodian? (current-custodian)])
((symbol?) (any/c) . ->* . any/c)]{ ((symbol?) (any/c) . ->* . any/c)]{
Creates a sample collector thread, which tracks the given Creates a stack-snapshot collector thread, which tracks the given
@racket[to-track] value every @racket[delay] seconds. The @racket[to-track] value every @racket[delay] seconds. The
@racket[to-track] value can be either a thread (track just that @racket[to-track] value can be either a thread (track just that
thread), a custodian (track all threads managed by the custodian), or thread), a custodian (track all threads managed by the custodian), or
@ -26,33 +26,33 @@ consisting of a symbol and an optional argument, and can affect the
sampler. The following messages are currently supported: sampler. The following messages are currently supported:
@itemize[ @itemize[
@item{@racket['pause] and @racket['resume] will stop or resume data @item{@racket['pause] and @racket['resume] will stop or resume
collection. These messages can be nested. Note that the thread will snapshot collection. These messages can be nested. Note that the
continue running it will just stop collecting snapshots.} thread will continue running---it will just stop collecting
snapshots.}
@item{@racket['stop] kills the controlled thread. It should be called @item{@racket['stop] kills the sampler thread. It should be called
when no additional data should be collected. (This is currently when no additional data should be collected. (This is currently
irreversible: there is no message to start a new sampler thread.)} irreversible: there is no message to start a new sampler thread.)}
@item{@racket['set-tracked!] with a value will change the tracked @item{@racket['set-tracked!] with a value will change the tracked
objects (initially specified as the @racket[to-track] argument) to object(s) which were initially specified as the @racket[to-track]
the given value.} argument.}
@item{@racket['set-tracked!] with a value will change the delay that @item{@racket['set-tracked!] with a numeric value will change the
the sampler us taking between snapshots. Note that although delay that the sampler is taking between snapshots. Note that
changing this means that the snapshots are not uniformly although changing this means that the snapshots are not uniformly
distributed, the results will still be sensible --- this is because distributed, the results will still be correct: the cpu time between
the cpu time between samples is taken into account when the samples is taken into account when the collected data is analyzed.}
resulting data is analyzed.}
@item{Finally, a @racket['get-snapshots] message will make the @item{Finally, a @racket['get-snapshots] message will make the
controller return the currently collected data. Note that this can controller return the currently collected data. Note that this can
be called multiple times, each call will return the data thatis be called multiple times, each call will return the data that is
collected up to that point in time. In addition, it can be (and collected up to that point in time. In addition, it can be (and
usually is) called after the sampler was stopped. usually is) called after the sampler was stopped.
The value that is returned should be considered as an undocumented The value that is returned should be considered as an undocumented
internal detail of the profiler, to be sent to internal detail of the profiler, intended to be sent to
@racket[analyze-samples] for analysis. The reason this is not done @racket[analyze-samples] for analysis. The reason this is not done
automatically, is that a future extension might allow you to combine automatically, is that a future extension might allow you to combine
several sampler results, making it possible to combine a profile several sampler results, making it possible to combine a profile

View File

@ -1,8 +1,8 @@
#lang scribble/doc #lang scribble/doc
@(require scribble/manual @(require scribble/manual
(for-label scheme profile profile/sampler (for-label racket/base profile profile/sampler
(only-in profile/analyzer analyze-samples) (only-in profile/analyzer analyze-samples profile?)
(prefix-in text: profile/render-text))) (prefix-in text: profile/render-text)))
@title{Toplevel Interface} @title{Toplevel Interface}
@ -10,7 +10,7 @@
@defmodule[profile] @defmodule[profile]
This module provides one procedure and one macro that are convenient This module provides one procedure and one macro that are convenient
high-level entry points for timing expressions. This hides the high-level entry points for profiling expressions. It abstracts over
details that are available through other parts of the library, and is details that are available through other parts of the library, and is
intended as a convenient tool for profiling code. intended as a convenient tool for profiling code.
@ -25,53 +25,55 @@ intended as a convenient tool for profiling code.
#f]) #f])
void?]{ void?]{
Executes the given thunk while collecting profiling data, and render Executes the given @racket[thunk] and collect profiling data during
this data when done. Keyword arguments can customize the profiling: execution, eventually analyzing and rendering this. Keyword arguments
can customize the profiling:
@itemize[ @itemize[
@item{The profiler works by @racket[create-sampler] starting a @item{The profiler works by starting a ``sampler'' thread to
``sampler'' thread whose job is to collect stack samples periodically collect stack snapshots (using
periodically (using @racket[continuation-mark-set->context]). @racket[continuation-mark-set->context]). To determine the
@racket[delay] determines the amount of time the sampler frequency of these collections, the sampler thread sleeps
@racket[sleep]s for between samples. Note that this is will be @racket[delay] seconds between collections. Note that this is will
close, but not identical to, the frequency in which data is actually be close, but not identical to, the frequency in which data is
sampled.} actually sampled. (The @racket[delay] value is passed on to
@racket[create-sampler], which creates the sampler thread.)}
@item{When the profiled computation takes a short amount of time, the @item{Due to the statistical nature of the profiler, longer executions
collected data will not be accurate. In this case, you can specify result in more accurate analysis. You can specify a number of
an @racket[iterations] argument to repeat the evaluation a number of @racket[iterations] to repeat the @racket[thunk] to collect more
times which will improve the accuracy of the resulting report.} data.}
@item{Normally, the sampler collects snapshots of the @item{Normally, the sampler collects only snapshots of the
@racket[current-thread]'s stack. If there is some computation that @racket[current-thread]'s stack. Profiling a computation that
happens on a different thread, that work will not be reflected in creates threads will therefore lead to bad analysis: the timing
the results: the only effect will be suspiciously small value for results will be correct, but because the profiler is unaware of
the observed time, because the collected data is taking into account other threads the observed time will be suspiciously small, and work
the cpu time that the thread actually performed (it uses done in other threads will not be included in the results. To track
@racket[current-process-milliseconds] with the running thread as an all threads, specify a non-@racket[#f] value for the
argument). Specifying a non-@racket[#f] value for the @racket[threads?] argument---this will execute the computation in a
@racket[threads?] argument will arrange for all threads that are fresh custodian, and keep track of all threads under this
started during the evaluation to be tracked. Note that this means custodian.}
that the computation will actually run in a new sub-custodian, as
this is the only way to be able to track such threads.}
@item{Once the computation has finished, the sampler is stopped, and @item{Once the computation is done and the sampler is stopped, the
the accumulated data is collected. It is then analyzed by accumulated data is analyzed (by @racket[analyze-samples]) and the
@racket[analyze-samples], and the analyzed profile data is fed into resulting profile value is sent to the @racket[renderer] function.
a renderer. Use an identity function (@racket[values]) to get the See @secref["renderers"] for available renderers. You can also use
analyzed result, and render it yourself, or use one of the existing @racket[values] as a ``renderer''---in this case the
renderers (see @secref["renderers"]).} @racket[profile-thunk] returns the analyzed information which can
now be rendered multiple times, or saved for future rendering.}
@item{The @racket[periodic-renderer] argument can be set to a list @item{To provide feedback information during execution, specify a
holding a delay time and a renderer. In this case, the given @racket[periodic-renderer]. This should be a list holding a delay
renderer will be called periodically. This is useful for cases time (in seconds) and a renderer function. The delay determines the
where you want a dynamically updated display of the results. This frequency in which the renderer is called, and it should be larger
delay should be larger than the sampler delay.} than the sampler delay (usually much larger since it can involve
more noticeable overhead, and it is intended for a human observer).}
]} ]}
@defform[(profile expr keyword-arguments ...)]{ @defform[(profile expr keyword-arguments ...)]{
A macro version of @racket[profile-thunk]. The keyword arguments can A macro version of @racket[profile-thunk]. Keyword arguments can be
be specified in the same was as for a function call: they can appear specified as in a function call: they can appear before and/or after
before and/or after the expression to be profiled.} the expression to be profiled.}