
and functionality improvements (including support for measuring coverage), primitive argument-checking fixes, and object-file changes resulting in reduced load times (and some backward incompatibility): - annotations are now preserved in object files for debug only, for profiling only, for both, or not at all, depending on the settings of generate-inspector-information and compile-profile. in particular, when inspector information is not enabled but profiling is, source information does not leak into error messages and inspector output, though it is still available via the profile tools. The mechanics of this involved repurposing the fasl a? parameter to hold an annotation flags value when it is not #f and remaking annotations with new flags if necessary before emitting them. compile.ss, fasl.ss, misc.ms - altered a number of mats to produce correct results even when the 's' directory is profiled. misc.ms, cp0.ms, record.ms - profile-release-counters is now generation-friendly; that is, it doesn't look for dropped code objects in generations that have not been collected since the last call to profile-release-counters. also, it no longer allocates memory when it releases counters. pdhtml.ss, gc.c, gcwrapper.c, globals.h, prim5.c - removed unused entry points S_ifile, S_ofile, and S_iofile alloc.c, externs.h - mats that test loading profile info into the compiler's database to guide optimization now weed out preexisting entries, in case the 's' directory is profiled. 4.ms, mat.ss, misc.ms, primvars.ms - counters for dropped code objects are now released at the start of each mat group. mat.ss - replaced ehc (enable-heap-check) option with hci (heap-check-interval) option that allows heap checks to be performed periodically rather than on each collection. hci=0 is equivalent to ehc=f (disabling heap checks) and hci=1 is equivalent to ehc=t (enabling heap checks every collection), while hci=100 enables heap checks only every 100th collection. 
allx and bullyx mats use this feature to reduce heap-checking overhead to a more reasonable level. this is particularly important when the 's' directory is profiled, since the amount of static memory to be checked is greatly increased due to the counters. mats/Mf-base, mat.ss, primvars.ms - added a mat that calls #%show-allocation, which was otherwise not being tested. misc.ms - removed a broken primvars mat and updated two others. in each case, the mat was looking for information about primitives in the wrong (i.e., old) place and silently succeeding when it didn't find any primitives to tests. the revised mats (along with a few others) now check to make sure at least one identifier has the information they look for. the removed mat was checking for library information that is now compiled in, so the mat is now unnecessary. the others were (not) doing argument-error checks. fixing these turned up a handful of problems that have also been fixed: a couple of unbound variables in the mat driver, two broken primdata declarations, a tardy argument check by profile-load-data, and a bug in char-ready?, which was requiring an argument rather than defaulting it to the current input port. primdata.ss, pdhtml.ss, io.ms, primdvars.ms, 4.ms, 6.ms, misc.ms, patch* - added initial support for recording coverage information. when the new parameter generate-covin-files is set, the compiler generates .covin files containing the universe of all source objects for which profile forms are present in the expander output. when profiling and generation of covin files are enabled in the 's' directory, the mats optionally generate .covout files for each mat file giving the subset of the universe covered by the mat file, along with an all.covout in each mat output directory aggregating the coverage for the directory and another all.covout in the top-level mat directory aggregating the coverage for all directories. 
back.ss, compile.ss, cprep.ss, primdata.ss, s/Mf-base, mat.ss, mats/Mf-base, mats/primvars.ms - support for generating covout files is now built in. with-coverage-output gathers and dumps coverage information, and aggregate-coverage-output combines (aggregates) covout files. pdhtml.ss, primdata.ss, compile.ss, mat.ss, mats/Mf-base, primvars.ms - profile-clear now adjusts active coverage trackers to avoid losing coverage information. pdhtml.ss, prim5.c - nested with-coverage calls are now supported. pdhtml.ss - switched to a more compact representation for covin and covout files; reduces disk space (compressed or not) by about a factor of four and read time by about a factor of two with no increase in write time. primdata.ss, pdhtml.ss, cprep.ss, compile.ss, mat.ss, mats/Mf-base - added support for determining coverage for an entire run, including coverage for expressions hit during boot time. 'all' mats now produce run.covout files in each output directory, and 'allx' mats produce an aggregate run.covout file in the mat directory. pdhtml.ss, mat.ss, mats/Mf-base - profile-release-counters now adjusts active coverage trackers to account for the counters that have been released. pdhtml.ss, prim5.c - replaced the artificial "examples" target with a real "build-examples" target so make won't think it always has to mats that depend upon the examples directory having been compiled. mats make clean now runs make clean in the examples directory. mats/Mf-base importing a library from an object file now just visits the object file rather than doing a full load so that the run-time code for the library is not retained. The run-time code is still read because the current fasl format forces the entire file to be read, but not retaining the code can lower heap size and garbage-collection cost, particularly when many object-code libraries are imported. The downside is that the file must be revisited if the run-time code turns out to be required. 
This change exposed several places where the code was failing to check if a revisit is needed. syntax.ss, 7.ms, 8.ms, misc.ms, root-experr* - fixed typos: was passing unquoted load rather than quoted load to $load-library along one path (where it is loading source code and therefore irrelevant), and was reporting src-path rather than obj-path in a message about failing to define a library. syntax.ss - compile-file and friends now put all recompile information in the first fasl object after the header so the library manager can find it without loading the entire fasl file. The library manager now does so. It also now checks to see if library object files need to be recreated before loading them rather than loading them and possibly recompiling them after discovering they are out of date, since the latter requires loading the full object file even if it's out of date, while the former takes advantage of the ability to extract just recompile information. as well as reducing overhead, this eliminates possibly undesirable side effects, such as creation and registration of out-of-date nongenerative record-type descriptors. because the library manager expects to find recompile information at the front of an object file, it will not find all recompile information if object files are "catted" together. also, compile-file has to hold in memory the object code for all expressions in the file so that it can emit the unified recompile information, rather than writing to the object file incrementally, which can significantly increase the memory required to compile a large file full of individual top-level forms. This does not affect top-level programs, which were already handled as a whole, or a typical library file that contains just a single library form. 
compile.ss, syntax.ss - the library manager now checks include files before library dependencies when compile-imported-libraries is false (as it already did when compile-imported-libraries is true) in case a source change affects the set of imported libraries. (A library change can affect the set of include files as well, but checking dependencies before include files can cause unneeded libraries to be loaded.) The include-file check is based on recompile-info rather than dependencies, but the library checks are still based on dependencies. syntax.ss - fixed check for binding of scheme-version. (the check prevents premature treatment of recompile-info records as Lexpand forms to be passed to $interpret-backend.) scheme.c - strip-fasl-file now preserves recompile-info when compile-time info is stripped. strip.ss - removed include-req* from library/ct-info and ctdesc records; it is no longer needed now that all recompile information is maintained separately. expand-lang.ss, syntax.ss, compile.ss, cprep.ss, syntax.ss - changed the fasl format and reworked a lot of code in the expander, compiler, fasl writer, and fasl reader to allow the fasl reader to skip past run-time information when it isn't needed and compile-time information when it isn't needed. Skipping past still involves reading and decoding when encrypted, but the fasl reader no longer parses or allocates code and data in the portions to be skipped. Side effects of associating record uids with rtds are also avoided, as are the side effects of interning symbols present only in the skipped data. Skipping past code objects also reduces or eliminates the need to synchronize data and instruction caches. Since the fasl reader no longer returns compile-time (visit) or run-time (revisit) code and data when not needed, the fasl reader no longer wraps these objects in a pair with a 0 or 1 visit or revisit marker. 
To support this change, the fasl writer generates separate top-level fasl entries (and graphs) for separate forms in the same top-level source form (e.g., begin or library). This reliably breaks eq-ness of shared structure across these forms, which was previously broken only when visit or revisit code was loaded at different times (this is an incompatible change). Because of the change, fasl "groups" are no longer needed, so they are no longer handled. 7.ss, cmacros.ss, compile.ss, expand-lang.ss, strip.ss, externs.h, fasl.c, scheme.c, hash.ms - the change above is surfaced in an optional fasl-read "situation" argument (visit, revisit, or load). The default is load. visit causes it to skip past revisit code and data; revisit causes it to skip past visit code and data; and load causes it not to skip past either. visit-revisit data produced by (eval-when (visit revisit) ---) is never skipped. 7.ss, primdata.ss, io.stex - to improve compile-time and run-time error checking, the Lexpand recompile-info, library/rt-info, library-ct-info, and program-info forms have been replaced with list-structured forms, e.g., (recompile-info ,rcinfo). expand-lang.ss, compile.ss, cprep.ss, interpret.ss, syntax.ss - added visit-compiled-from-port and revisit-compiled-from-port to complement the existing load-compiled-from-port. 7.ss, primdata.ss, 7.ms, system.stex - increased amount read when seeking an lz4-encrypted input file from 32 to 1024 bytes at a time compress-io.c - replaced the fasl a? parameter value #t with an "all" flag value so it's value is consistently a mask. cmacros.ss, fasl.ss, compile.ss - split off profile mats into a separate file misc.ms, profile.ms (new), root-experr*, mats/Mf-base - added coverage percent computations to mat allx/bullyx output mat.ss, mats/Mf-base, primvars.ms - replaced coverage tables with more generic and generally useful source tables, which map source objects to arbitrary values. 
pdhtml.ss, compile.ss, cprep.ss, primdata.ss, mat.ss, mats/Mf-base, primvars.ms, profile.ms, syntax.stex - reduced profile counting overhead by using calls to fold-left instead of calls to apply and map and by using fixnum operations for profile counts on 64-bit machines. pdhtml.ss - used a critical section to fix a race condition in the calculations of profile counts that sometimes resulted in bogus (including negative) counts, especially when the 's' directory is profiled. pdhtml.ss - added discard flag to declaration for hashtable-size primdata.ss - redesigned the printed representation of source tables and rewrote get-source-table! to read and store incrementally to reduce memory overhead. compile.ss - added generate-covin-files to the set of parameters preserved by compile-file, etc. compile.ss, system.stex - moved covop argument before the undocumented machine and hostop arguments to compile-port and compile-to-port. removed the undocumented ofn argument from compile-to-port; using (port-name ip) instead. compile.ss, primdata.ss, 7.ms, system.stex - compile-port now tries to come up with a file position to supply to make-read, which it can do if the port's positions are character positions (presently string ports) or if the port is positioned at zero. compile.ss - audited the argument-type-error fuzz mat exceptions and fixed a host of problems this turned up (entries follow). added #f as an invalid argument for every type for which #f is indeed invalid to catch places where the maybe- prefix was missing on the argument type. the mat tries hard to determine if the condition raised (if any) as the result of an invalid argument is appropriate and redirects the remainder to the mat-output (.mo) file prefixed with 'Expected error', causing them to show up in the expected error output so developers will be encouraged to audit them in the future. primvars.ms, mat.ss - added an initial symbol? 
test on machine type names so we produce an invalid machine type error message rather than something confusing like "machine type #f is not supported". compile.ss - fixed declarations for many primitives that were specified as accepting arguments of more general types than they actually accept, such as number -> real for various numeric operations, symbol -> endianness for various bytevector operations, time -> time-utc for time-utc->date, and list -> list-of-string-pairs for default-library-search-handler. also replaced some of the sub-xxxx types with specific types such as sub-symbol -> endianness in utf16->string, but only where they were causing issues with the primvars argument-type-error fuzz mat. (this should be done more generally.) primdata.ss - fixed incorrect who arguments (was map instead of fold-right, current-date instead of time-utc->date); switched to using define-who/set-who! generally. 4.ss, date.ss - append! now checks all arguments before any mutation 5_2.ss - with-source-path now properly supplies itself as who for the string? argument check; callers like load now do their own checks. 7.ss - added missing integer? check to $fold-bytevector-native-ref whose lack could have resulted in a compile-time error. cp0.ss - fixed typo in output-port-buffer-mode error message io.ss - fixed who argument (was fx< rather than fx<?) library.ss - fixed declaration of first source-file-descriptor argument (was sfd, now string) primdata.ss - added missing article 'a' in a few error messages prims.ss - fixed the copy-environment argument-type error message for the list of symbols argument. syntax.ss - the environment procedure now catches exceptions that occur and reraises the exception with itself as who if the condition isn't already a who condition. syntax.ss - updated experr and allx patch files for changes to argument-count fuzz mat and fixes for problems turned up by them. 
root-experr*, patch* - fixed a couple of issues setting port sizes: string and bytevector output port put handlers don't need room to store the character or byte, so they now set the size to the buffer length rather than one less. binary-file-port-clear-output now sets the index rather than size to zero; setting the size to zero is inappropriate for some types of ports and could result in loss of buffering and even suppression of future output. removed a couple of redundant sets of the size that occur immediately after setting the buffer. io.ss - it is now possible to return from a call to with-profile-tracker multiple times and not double-count (or worse) any counts. pdhtml.ss, profile.ms - read-token now requires a file position when it is handed a source-file descriptor (since the source-file descriptor isn't otherwise useful), and the source-file descriptor argument can no longer be #f. the input file position plays the same role as the input file position in get-datum/annotations. these extra read-token arguments are now documented. read.ss, 6.ms, io.stex - the source-file descriptor argument to get-datum/annotations can no longer be #f. it was already documented that way. read.ss - read-token and do-read now look for the character-positions port flag before asking if the port has port-position, since the latter is slightly more expensive. read.ss - rd-error now reports the current port position if it can be determined when fp isn't already set, i.e., when reading from a port without character positions (presently any non string port) and fp has not been passed in explicitly (to read-token or get-datum/annotations). the port position might not be a character position, but it should be better than nothing. read.ss - added comment noting an invariant for s_profile_release_counters. 
prim5.c - restored accidentally dropped fasl-write formdef and dropped duplicate fasl-read formdef io.stex - added a 'coverage' target that tests the coverage of the Scheme-code portions of Chez Scheme by the mats. Makefile.in, Makefile-workarea.in - added .PHONY declarations for all of the targets in the top-level and workarea make files, and renamed the create-bintar, create-rpm, and create-pkg targets bintar, rpm, and pkg. Makefile.in, Makefile-workarea.in - added missing --retain-static-relocation command-line argument and updated the date scheme.1.in - removed a few redundant conditional variable settings configure - fixed declaration of condition wait (timeout -> maybe-timeout) primdata.ss original commit: 88501743001393fa82e89c90da9185fc0086fbcb
2620 lines
115 KiB
Plaintext
\documentclass{releasenotes}

\thisversion{Version 9.5.3}
\thatversion{Version 8.4}
\pubmonth{September}
\pubyear{2019}

\begin{document}

\maketitle

% \tableofcontents

\section{Overview}

This document outlines the changes made to {\ChezScheme} for
{\thisversion} since {\thatversion}.

{\thisversion} is supported for the following platforms.
The Chez Scheme machine type (returned by the \scheme{machine-type}
procedure) is given in parentheses.

\begin{itemize}
\item Linux x86, nonthreaded (i3le) and threaded (ti3le)
\item Linux x86\_64, nonthreaded (a6le) and threaded (ta6le)
\item MacOS X x86, nonthreaded (i3osx) and threaded (ti3osx)
\item MacOS X x86\_64, nonthreaded (a6osx) and threaded (ta6osx)
\item Linux ARMv6 (32-bit), nonthreaded (arm32le)
\item Linux PowerPC (32-bit), nonthreaded (ppc32le) and threaded (tppc32le)
\item Windows x86, nonthreaded (i3nt) and threaded (ti3nt)
\item Windows x86\_64, nonthreaded (a6nt) and threaded (ta6nt) [experimental]
%\item OpenBSD x86, nonthreaded (i3ob) and threaded (ti3ob)
%\item OpenBSD x86\_64, nonthreaded (a6ob) and threaded (ta6ob)
%\item FreeBSD x86, nonthreaded (i3fb) and threaded (ti3fb)
%\item FreeBSD x86\_64, nonthreaded (a6fb) and threaded (ta6fb)
%\item NetBSD x86, nonthreaded (i3nb) and threaded (ti3nb)
%\item NetBSD x86\_64, nonthreaded (a6nb) and threaded (ta6nb)
%\item OpenSolaris x86, nonthreaded (i3s2) and threaded (ti3s2)
%\item OpenSolaris x86\_64, nonthreaded (a6s2) and threaded (ta6s2)
\end{itemize}

This document contains three sections describing significant
(1) \href[static]{section:functionality}{functionality changes},
(2) \href[static]{section:bugfixes}{bugs fixed}, and
(3) \href[static]{section:performance}{performance enhancements}.
A version number listed in parentheses in the header for a change
indicates the first minor release or internal prerelease to support
the change.

More information on {\ChezScheme} and {\PetiteChezScheme} can
be found at \hyperlink{http://www.scheme.com/}{http://www.scheme.com},
and extensive documentation is available in
\TSPL{4}{th} (available directly from MIT Press or from online and local retailers)
and the \CSUG{9}.
Online versions of both books can be found at
\hyperlink{http://www.scheme.com/}{http://www.scheme.com}.

%-----------------------------------------------------------------------------
\section{Functionality Changes}\label{section:functionality}

\subsection{Coverage support and source tables (9.5.3)}

When the new parameter \scheme{generate-covin-files} is set to \scheme{#t}
rather than the default \scheme{#f}, file compilation routines such as
\scheme{compile-file} and \scheme{compile-library} produce coverage
information (\scheme{.covin}) files that can be used in conjunction with
profile information to measure coverage of a source-code base.
Coverage information is also written out when the optional \var{covop}
argument is supplied to \scheme{compile-port} and \scheme{compile-to-port}.

A covin file contains a printed representation of a \emph{source
table} mapping each profiled source object in the code base to a
count of zero.
Source tables generally associate source objects with arbitrary values
and are allocated and manipulated with hashtable-like operations specific
to source tables.

Profile information can be tracked even through releasing and clearing
of profile counters via the new procedure \scheme{with-profile-tracker},
which produces a source table.

Coverage of a source-code base can thus be measured by comparing
the set of source objects in the covin-file source tables for one
or more source files with the set of source objects in the source
tables produced by one or more runs of tests run with profile
information tracked by \scheme{with-profile-tracker}.

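The comparison described above might be sketched as follows.
This is only an illustration under several assumptions: the file names
\scheme{app.ss}, \scheme{app.covin}, and \scheme{app.so} and the procedure
\scheme{run-tests} are hypothetical, the covin file is assumed to be
uncompressed text, and the source-table procedures used here
(\scheme{make-source-table}, \scheme{get-source-table!},
\scheme{source-table-ref}, \scheme{source-table-dump}, and
\scheme{source-table-size}) should be checked against the \CSUG{9}.

\schemedisplay
;; Compile with profiling enabled so that profile forms are present;
;; generate-covin-files additionally requests the covin file.
(parameterize ([compile-profile #t]
               [generate-covin-files #t])
  (compile-file "app.ss"))

;; Read the universe of profiled source objects from the covin file.
(define universe (make-source-table))
(call-with-input-file "app.covin"
  (lambda (ip) (get-source-table! ip universe)))

;; with-profile-tracker is assumed to return a source table followed
;; by the values of the thunk; keep only the table.
(define hits
  (call-with-values
    (lambda ()
      (with-profile-tracker
        (lambda ()
          (load "app.so")
          (run-tests))))      ; run-tests is hypothetical
    (lambda (table . ignored) table)))

;; Count the universe entries that were executed at least once.
(define covered
  (fold-left
    (lambda (n entry)
      (if (> (source-table-ref hits (car entry) 0) 0) (+ n 1) n))
    0
    (source-table-dump universe)))

(printf "~s of ~s source objects covered~%"
  covered (source-table-size universe))
\endschemedisplay
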
\subsection{Importing a library from an object file now visits the file (9.5.3)}

As described in Section~\ref{sec:faster-object-file-loading},
importing a library from an object file now causes the object file
to be visited rather than fully loaded.
If the run-time information is needed, i.e., if the library is
invoked, the file will be revisited.
This is typically transparent to the program, but problems can arise
if the program changes its current directory (via
\scheme{current-directory}) prior to invoking a library, and the
object file cannot be found.

\subsection{Recompile information (9.5.3)}

As described in Section~\ref{sec:faster-object-file-loading}, all
recompile information is now placed at the front of each object
file where it can be read without the need to scan through the
remainder of the file.
Because the library manager expects to find recompile information
at the front of an object file, it will not find all recompile
information if object files are concatenated together.

Also, the compiler has to hold in memory the object code for all
expressions in a file so that it can emit the unified recompile
information, rather than writing to the object file incrementally,
which can significantly increase the memory required to compile a
large file full of individual top-level forms.
This does not affect top-level programs, which were already handled
as a whole, or a typical library file that contains just a single
library form.

\subsection{Optional new \protect\scheme{fasl-read} situation argument (9.5.3)}

It is now possible to direct \scheme{fasl-read} to read only visit
(compile-time) or revisit (run-time) objects via the optional new
situation argument.
Situation \scheme{visit} causes the fasl reader to skip over
revisit (run-time-only) objects, while
\scheme{revisit} causes the fasl reader to skip over
visit (compile-time-only) objects.
Situation \scheme{load} doesn't skip over any objects.

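As a rough sketch of the three situations, the following hypothetical
helper counts the objects \scheme{fasl-read} returns from an object file.
The name \scheme{count-fasl-objects} and the file \scheme{app.so} are
assumptions made for illustration, and the situation is assumed to be
passed as a symbol.

\schemedisplay
;; Count the objects that fasl-read returns from the file at path
;; for the given situation: visit, revisit, or load.
(define (count-fasl-objects path situation)
  (let ([ip (open-file-input-port path (file-options compressed))])
    (let loop ([n 0])
      (if (eof-object? (fasl-read ip situation))
          (begin (close-port ip) n)
          (loop (+ n 1))))))

(count-fasl-objects "app.so" 'visit)    ; skips run-time-only objects
(count-fasl-objects "app.so" 'revisit)  ; skips compile-time-only objects
(count-fasl-objects "app.so" 'load)     ; skips nothing
\endschemedisplay
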
\subsection{Optional \protect\scheme{read-token} \protect\var{sfd} and \protect\var{bfp} arguments (9.5.3)}

In addition to the optional input-port argument, \scheme{read-token} now takes
optional \var{sfd} (source-file-descriptor) and \var{bfp} (beginning-file-position)
arguments.
If either is provided, both must be provided.
Specifying \var{sfd} and \var{bfp} improves the quality of error messages,
guarantees the \scheme{read-token} \var{start} and \var{end} return values can be determined,
and eliminates the overhead of asking for a file position on each call
to \scheme{read-token}.
\var{bfp} is normally 0 for the first call
to \scheme{read-token} at the start of a file,
and the \var{end} return value of the preceding
call for each subsequent call.

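The threading of \var{bfp} from one call to the next might look like the
following sketch.
The procedure name \scheme{tokenize} is hypothetical, and the sketch assumes
that \scheme{read-token} returns the four values \var{type}, \var{value},
\var{start}, and \var{end} and reports the type \scheme{eof} at end of file.

\schemedisplay
;; Return a list of (type value start end) entries for the file at
;; path, passing each call's end position as the next call's bfp.
(define (tokenize path)
  (let ([sfd (call-with-port (open-file-input-port path)
               (lambda (bip) (make-source-file-descriptor path bip)))]
        [ip (open-input-file path)])
    (let loop ([bfp 0] [tokens '()])
      (let-values ([(type value start end) (read-token ip sfd bfp)])
        (if (eq? type 'eof)
            (begin (close-port ip) (reverse tokens))
            (loop end (cons (list type value start end) tokens)))))))
\endschemedisplay
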
\subsection{Compression format and level (9.5.3)}

Support for LZ4 compression has been added.
LZ4 is now the default format when compressing files (including
object files produced by the compiler) and bytevectors, while
{\tt gzip} is still supported and can be enabled by setting
the new \scheme{compress-format} parameter to the symbol \scheme{gzip} instead of the
default \scheme{lz4}. Reading in compressed mode
infers the format, so reading {\tt gzip}-compressed files will still
work without changing \scheme{compress-format}. Reading LZ4-format
files tends to be much faster than reading {\tt gzip}-format files,
while {\tt gzip}-compressed files tend to be smaller.
In particular, object files created by the compiler now tend to be
larger but load more quickly.

The new \scheme{compress-level} parameter can be used to control
the amount of time spent on file compression (but not
bytevector compression).
It can be set to one of the symbols \scheme{low},
\scheme{medium}, \scheme{high}, and \scheme{maximum}, which are
listed in order from shortest to longest compression time and least
to greatest effectiveness.
The default value is \scheme{medium}.

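A minimal sketch of how the two parameters might be used with compressed
file ports follows.
The helper names \scheme{write-compressed} and \scheme{read-compressed} and
the file names are hypothetical; the sketch assumes the parameters are
consulted while the port is open, so the whole write is kept inside the
\scheme{parameterize}.

\schemedisplay
;; Write bv to path as a compressed file using the given format
;; (lz4 or gzip) and level (low, medium, high, or maximum).
(define (write-compressed path bv format level)
  (parameterize ([compress-format format]
                 [compress-level level])
    (let ([op (open-file-output-port path
                (file-options compressed no-fail))])
      (put-bytevector op bv)
      (close-port op))))

;; The format is inferred on input, so no parameter is set here.
(define (read-compressed path)
  (let* ([ip (open-file-input-port path (file-options compressed))]
         [bv (get-bytevector-all ip)])
    (close-port ip)
    bv))

(define data (make-bytevector 100000 7))
(write-compressed "data.lz4" data 'lz4 'medium)
(write-compressed "data.gz" data 'gzip 'maximum)
(equal? (read-compressed "data.lz4") (read-compressed "data.gz"))
\endschemedisplay
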
\subsection{Mutexes and condition variables can have names (9.5.3)}

The procedures \scheme{make-mutex} and \scheme{make-condition} now
accept an optional argument \scheme{name}, which must be a symbol
that identifies the object or \scheme{#f} for no name. The name is
printed every time the mutex or condition object is printed, which
is useful for debugging.

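For instance, in a threaded build, named and unnamed objects might be
created as in the sketch below.
The names \scheme{queue-lock} and \scheme{queue-ready} are chosen only for
illustration, and the exact printed form is not shown because it is not
specified here.

\schemedisplay
(define queue-lock (make-mutex 'queue-lock))
(define queue-ready (make-condition 'queue-ready))
(define anonymous-lock (make-mutex #f))   ; explicitly unnamed

;; The printed forms of the first two objects include their names.
(printf "~s ~s ~s~%" queue-lock queue-ready anonymous-lock)
\endschemedisplay
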
\subsection{Improved packaging support (9.5.1)}

The Chez Scheme \scheme{Makefile} has been enhanced with new targets for
creating binary packages for Unix-like operating systems.
The \scheme{create-tarball} target generates a binary tarball package for
distribution, the \scheme{create-rpm} target generates a Linux RPM package, and
the \scheme{create-pkg} target generates a macOS package file.

\subsection{Library search handler (9.5.1)}

The new \scheme{library-search-handler} parameter controls how library source
or object code is located when \scheme{import}, \scheme{compile-whole-program},
or \scheme{compile-whole-library} is used to load a library.
The value of the \scheme{library-search-handler} parameter must be a procedure
expecting four arguments: the \var{who} argument is a symbol that provides
context in \scheme{import-notify} messages, the \var{library} argument is the
name of the desired library, the \var{directories} argument is a list of source and
object directory pairs in the form returned by \scheme{library-directories},
and the \var{extensions} argument is a list of source and object extension
pairs in the form returned by \scheme{library-extensions}.
The default value of the \scheme{library-search-handler} parameter is the newly exposed
\scheme{default-library-search-handler} procedure.

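One plausible use is to wrap the existing handler, as in this sketch, which
logs each search and then defers to the handler that was previously
installed.
The logging format is an arbitrary choice; the wrapper tail-calls the
previous handler so that whatever values it returns pass through unchanged.

\schemedisplay
(library-search-handler
  (let ([previous (library-search-handler)])
    (lambda (who library directories extensions)
      (printf "~s: searching for library ~s~%" who library)
      (previous who library directories extensions))))
\endschemedisplay
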
\subsection{Ftype guardians (9.5.1)}

Applications that manage memory outside the Scheme heap can leverage
new support for ftype guardians to help perform reference counting.
An ftype guardian is like an ordinary guardian except that it does
not necessarily save from collection each ftype pointer registered
with it but instead decrements (atomically) a reference count at
the head of the object to which the ftype pointer points.
If the reference count becomes zero as a result of the decrement,
it preserves the object so that it can be retrieved from the guardian
and freed; otherwise it allows it to be collected.

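The following sketch shows one way such a guardian might be set up.
It assumes the \scheme{ftype-guardian} syntax described in the \CSUG{9} and
a reference count stored in a word-sized first field; the ftype
\scheme{counted} and the helpers \scheme{make-counted} and
\scheme{free-dead-objects} are hypothetical.

\schemedisplay
(define-ftype counted
  (struct
    [refcount uptr]   ; reference count at the head of the object
    [payload int]))

(define counted-guardian (ftype-guardian counted))

;; Allocate an object outside the Scheme heap with a count of one
;; and register the pointer with the guardian.
(define (make-counted n)
  (let ([p (make-ftype-pointer counted
             (foreign-alloc (ftype-sizeof counted)))])
    (ftype-set! counted (refcount) p 1)
    (ftype-set! counted (payload) p n)
    (counted-guardian p)
    p))

;; Free every object the guardian has preserved, i.e., every object
;; whose count dropped to zero.
(define (free-dead-objects)
  (let loop ()
    (let ([p (counted-guardian)])
      (when p
        (foreign-free (ftype-pointer-address p))
        (loop)))))
\endschemedisplay
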
\subsection{Recompile information and whole-program optimization (9.5.1)}

\scheme{compile-whole-program} and \scheme{compile-whole-library}
now propagate recompile information from the named \scheme{wpo}
file to the object file to support \scheme{maybe-compile-program}
and \scheme{maybe-compile-library} in the case where the new object
file overwrites the original object file.

\subsection{Directly accessing the value of compile-time values (9.5.1)}

The value of a compile-time value created by \scheme{make-compile-time-value}
can be retrieved via the new procedure \scheme{compile-time-value-value}.
The new predicate \scheme{compile-time-value?} can be used to determine if
an object is a compile-time value.

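A small illustration, with an arbitrary payload chosen for the example:

\schemedisplay
(define ctv (make-compile-time-value '(max-depth . 8)))

(compile-time-value? ctv)                ; true for the wrapper
(compile-time-value? '(max-depth . 8))   ; false for the bare payload
(compile-time-value-value ctv)           ; recovers (max-depth . 8)
\endschemedisplay
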
\subsection{Extracting a subset of hashtable entries (9.5.1)}

The new \scheme{hashtable-cells} function is similar to
\scheme{hashtable-entries}, but it returns a vector of cells instead
of two vectors. An optional argument to \scheme{hashtable-keys},
\scheme{hashtable-values}, \scheme{hashtable-entries}, or \scheme{hashtable-cells}
limits the size of the result vector.

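The sketch below illustrates both features on a small table.
It assumes that each cell is a pair holding a key and its value; which
entries are selected when the size limit applies is not specified here.

\schemedisplay
(define ht (make-eqv-hashtable))
(hashtable-set! ht 1 'one)
(hashtable-set! ht 2 'two)
(hashtable-set! ht 3 'three)

;; One cell per entry when no limit is given.
(define cells (hashtable-cells ht))
(vector-length cells)                     ; 3

;; The optional argument caps the size of the result vector.
(vector-length (hashtable-keys ht 2))     ; 2
(vector-length (hashtable-cells ht 10))   ; 3, since only 3 entries exist
\endschemedisplay
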
\subsection{Profile data retained for reclaimed code (9.5.1)}

Profile data is now retained indefinitely even for code objects
that have been reclaimed by the garbage collector.
Previously, the counters holding the data were reclaimed by the
collector along with the code objects.
This makes profile output more complete and accurate, but it does
represent a potential space leak in programs that create or load
and release code dynamically.
Such programs can avoid the potential space leak by releasing the
counters explicitly via the new procedure
\scheme{profile-release-counters}.

\subsection{Procedure source location without inspector information (9.5.1)}

When \scheme{generate-inspector-information} is set to \scheme{#f} and
\scheme{generate-procedure-source-information} is set to \scheme{#t},
source location information is preserved for a procedure, even though
other inspector information is not preserved.

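For example, a file might be compiled with this combination of settings as
follows, where \scheme{app.ss} is a hypothetical file name:

\schemedisplay
;; Keep procedure source locations but drop other inspector information.
(parameterize ([generate-inspector-information #f]
               [generate-procedure-source-information #t])
  (compile-file "app.ss"))
\endschemedisplay
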
\subsection{Atomic compare-and-set (9.5.1)}

The new procedures \scheme{box-cas!} and \scheme{vector-cas!}
atomically update a box or vector with a given new value when the
current content is \scheme{eq?} to a given old value. Atomicity is
guaranteed even if multiple threads attempt to update the same box or
vector.

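A typical use is a retry loop, sketched here for a shared counter.
The helper \scheme{box-increment!} is hypothetical, and the sketch assumes
that \scheme{box-cas!} returns a true value only when the update takes place
and that the counter stays within the fixnum range so that the \scheme{eq?}
comparison is reliable.

\schemedisplay
;; Atomically add one to the fixnum stored in box b, retrying if
;; another thread changed the box between the read and the update.
(define (box-increment! b)
  (let retry ()
    (let ([old (unbox b)])
      (unless (box-cas! b old (+ old 1))
        (retry)))))

(define counter (box 0))
(box-increment! counter)
(box-increment! counter)
(unbox counter)   ; 2
\endschemedisplay
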
\subsection{Foreign-procedure thread activation (9.5.1)}

A new \scheme{__collect_safe} foreign-procedure convention, which can
be combined with other conventions, causes a foreign-procedure call to
deactivate the current thread during the call so that other threads can
perform a garbage collection. Similarly, the \scheme{__collect_safe}
convention modifier for callables causes the current thread to be
activated on entry to the callable, and the activation state is
reverted on exit from the callable; this activation makes callables
work from threads that are otherwise unknown to the Scheme system.

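As an illustration, a blocking C library call might be declared as in the
sketch below so that other threads can collect while it runs.
The shared-object name is platform-specific and is only an assumption here,
as is the choice of the C \scheme{sleep} function.

\schemedisplay
(load-shared-object "libc.so.6")   ; assumed name; varies by platform

;; The calling thread is deactivated for the duration of the call.
(define c-sleep
  (foreign-procedure __collect_safe "sleep" (unsigned) unsigned))

(c-sleep 1)
\endschemedisplay
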
\subsection{Garbage collection and threads (9.5.1)}

A new \scheme{collect-rendezvous} function performs a garbage
collection in the same way as when the system determines that a
collection should occur. For many purposes,
\scheme{collect-rendezvous} is a variant of \scheme{collect} that
works when multiple threads are active. More precisely, the
\scheme{collect-rendezvous} function invokes the collect-request
handler (in an unspecified thread) after synchronizing all active
threads and temporarily deactivating all but the one used to call the
collect-request handler.

\subsection{Foreign-procedure struct arguments and results (9.5.1)}
|
|
|
|
A new \scheme{(& \var{ftype})} form allows a struct or union to be
|
|
passed between Scheme and a foreign procedure. The Scheme-side
|
|
representation of a \scheme{(& \var{ftype})} argument is the
|
|
same as a \scheme{(* \var{ftype})} argument, but where
|
|
\scheme{(* \var{ftype})} passes an address between the Scheme and C
|
|
worlds, \scheme{(& \var{ftype})} passes a copy of the data at the
|
|
address. When \scheme{(& \var{ftype})} is used as a result type,
|
|
an extra \scheme{(* \var{ftype})} argument must be provided to receive
|
|
the copied result, and the directly returned result is unspecified.
|
|
|
|
\subsection{Record equality and hashing (9.5, 9.5.1)}
|
|
|
|
Several new procedures and parameters allow a program to control what
|
|
\scheme{equal?} and \scheme{equal-hash} do when applied
|
|
to structures containing record instances.
|
|
The procedures \scheme{record-type-equal-procedure} and
|
|
\scheme{record-type-hash-procedure} can be used to customize the
|
|
handling of records of specific types by \scheme{equal?} and \scheme{equal-hash}, and
|
|
the procedures \scheme{record-equal-procedure} and
|
|
\scheme{record-hash-procedure} can be used to look up the
|
|
applicable (possibly inherited) equality and hashing procedures
|
|
for specific record instances.
|
|
The parameters \scheme{default-record-equal-procedure} and
|
|
\scheme{default-record-hash-procedure} can be used to control
|
|
the default behavior when comparing or hashing records without
|
|
type-specific equality and hashing procedures.
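
The following sketch illustrates one way the type-specific procedures
might be installed.
The record type \scheme{point} is hypothetical; the equality procedure
is assumed to receive the two records and a procedure for comparing
components, and the hashing procedure is assumed to receive the record
and a procedure for hashing components.

\schemedisplay
(define-record-type point (fields x y))

; compare two points field by field using the supplied eql? procedure
(record-type-equal-procedure (record-type-descriptor point)
  (lambda (p1 p2 eql?)
    (and (eql? (point-x p1) (point-x p2))
         (eql? (point-y p1) (point-y p2)))))

; hash a point by combining the hashes of its fields
(record-type-hash-procedure (record-type-descriptor point)
  (lambda (p hash)
    (+ (hash (point-x p)) (hash (point-y p)))))

(equal? (make-point 1 2) (make-point 1 2)) ; #t with the procedures above
\endschemedisplay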
|
|
|
|
\subsection{Immutable vectors, fxvectors, bytevectors, strings, and boxes (9.5)}
|
|
|
|
Support for immutable vectors, fxvectors, bytevectors, strings, and boxes
|
|
has been added.
|
|
Immutable vectors are created via \scheme{vector->immutable-vector},
|
|
and immutable fxvectors, bytevectors, and strings are created by similarly named
|
|
procedures.
|
|
Immutable boxes are created via \scheme{box-immutable}.
|
|
Any attempt to modify an immutable object causes an exception to be raised.
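
A short sketch of the intended usage follows.
It assumes safe code (i.e., not optimize-level 3), so that the attempted
mutation is checked, and it uses \scheme{guard} only to show that an
exception is raised.

\schemedisplay
(define v (vector->immutable-vector (vector 1 2 3)))
(define b (box-immutable 'fixed))

(immutable-vector? v) ; #t
(vector-ref v 0)      ; 1
(unbox b)             ; fixed

; the attempted mutation raises an exception, caught here by guard
(guard (c [#t 'rejected])
  (vector-set! v 0 'changed)) ; rejected
\endschemedisplay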
|
|
|
|
\subsection{Ephemeron pairs and hashtables (9.5)}
|
|
|
|
Support for ephemeron pairs has been added, along with eq and eqv
|
|
hashtables that use ephemeron pairs to combine keys and values. An
|
|
ephemeron pair avoids the ``key in value'' problem of weak pairs,
|
|
where a weakly held key is paired to a value that refers back to the
|
|
key, in which case the key remains reachable as long as the pair is
|
|
reachable. In an ephemeron pair, the cdr of the pair is not considered
|
|
reachable by the garbage collector until both the pair and the car of
|
|
the pair have been found reachable. An ephemeron hashtable implements
|
|
a weak mapping where referencing a key in a value does not prevent the
|
|
mapping from being removed from the table.
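
The sketch below shows the ``key in value'' shape described above, using
\scheme{ephemeron-cons} and \scheme{make-ephemeron-eq-hashtable}.
It does not attempt to demonstrate reclamation, which depends on when
the collector runs.

\schemedisplay
(define key (list 'key))

; the cdr refers back to the car, as in the key-in-value problem
(define ep (ephemeron-cons key (list 'value key)))
(ephemeron-pair? ep) ; #t
(eq? (car ep) key)   ; #t

; the same shape in an ephemeron hashtable
(define table (make-ephemeron-eq-hashtable))
(hashtable-set! table key (list 'value key))
(hashtable-ref table key #f) ; the value, while key is otherwise reachable
\endschemedisplay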
|
|
|
|
\subsection{Optional timeout for \protect\scheme{condition-wait} (9.5)}
|
|
|
|
The \scheme{condition-wait} procedure now takes an optional
|
|
\var{timeout} argument and returns a boolean indicating whether the
|
|
thread was awakened by the condition before the timeout. The
|
|
\var{timeout} can be a time record of type \scheme{time-duration} or
|
|
\scheme{time-utc}, or it can be \scheme{#f} for no timeout (the
|
|
default).
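
For illustration, the following sketch waits on a condition that no
other thread signals, so the wait is expected to end by timeout.
It assumes a threaded build of the system; the 10-millisecond duration
is arbitrary.

\schemedisplay
(define m (make-mutex))
(define c (make-condition))

; make-time takes the time type, nanoseconds, and seconds
(define awakened?
  (with-mutex m
    (condition-wait c m (make-time 'time-duration 10000000 0))))

awakened? ; #f when the timeout expires before the condition is signaled
\endschemedisplay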
|
|
|
|
\subsection{\protect\scheme{date-dst?} and \protect\scheme{date-zone-name} (9.5)}
|
|
|
|
The new primitive procedures \scheme{date-dst?} and
|
|
\scheme{date-zone-name} access time-zone information for a
|
|
\scheme{date} record that is created without an explicit
|
|
zone offset. The zone-offset argument to \scheme{make-date}
|
|
is now optional.
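
A brief sketch, with an arbitrary date chosen for illustration; the
results depend on the local time zone, so none are shown.

\schemedisplay
; nanosecond second minute hour day month year, with no zone offset
(define d (make-date 0 0 30 12 1 7 2017))

(date-dst? d)      ; a boolean that depends on the local time zone
(date-zone-name d) ; a string naming the zone, or #f if unavailable
\endschemedisplay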
|
|
|
|
\subsection{\protect\scheme{procedure-arity-mask} (9.5)}
|
|
|
|
The new primitive procedure \scheme{procedure-arity-mask} takes a
|
|
procedure \var{p} and returns a two's complement bitmask representing
|
|
the argument counts accepted by \var{p}.
|
|
For example, the arity mask for a two-argument procedure such as
|
|
\var{cons} is $4$ (only bit two set),
|
|
while the arity mask for a procedure that accepts one or more arguments,
|
|
such as \var{list*}, is $-2$ (all but bit 0 set).
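
As a sketch of how the mask might be consulted, the hypothetical helper
\scheme{accepts-arg-count?} below (not part of the system) tests a
single bit of the mask.

\schemedisplay
(define (accepts-arg-count? p n)
  ; bit n of the mask is set when p accepts exactly n arguments
  (bitwise-bit-set? (procedure-arity-mask p) n))

(procedure-arity-mask (lambda (x y) x)) ; 4
(procedure-arity-mask
  (case-lambda [(a) a] [(a b c) a]))    ; 10, i.e., bits 1 and 3
(accepts-arg-count? cons 2) ; #t
(accepts-arg-count? cons 3) ; #f
\endschemedisplay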
|
|
|
|
\subsection{Bytevector compression (9.5)}
|
|
|
|
The new primitive procedures \scheme{bytevector-compress} and
|
|
\scheme{bytevector-uncompress} expose for bytevectors the kind of
|
|
compression functionality that is used for files with the
|
|
\scheme{compressed} option.
|
|
|
|
\subsection{Line caching and source objects (9.5)}
|
|
|
|
The \scheme{locate-source} function accepts an optional argument that
|
|
enables the use of a cache for line information, so that a source file
|
|
does not have to be consulted each time to compute line information.
|
|
To further avoid file and caching issues, a source object has optional
|
|
beginning-line and beginning-column components. Source objects with line
|
|
and column components take more space, but they allow reporting of line and column
|
|
information even if a source file is later modified or becomes unavailable.
|
|
The value of the \scheme{current-make-source-object} parameter is used by the
|
|
reader to construct source objects for programs, and the parameter can be
|
|
modified to collect line and column information eagerly. The value of the
|
|
\scheme{current-locate-source-object-source} parameter is used for
|
|
error reporting, instead of calling \scheme{locate-source} or
|
|
\scheme{locate-source-object-source} directly, so that just-in-time
|
|
source-location lookup can be adjusted, too.
|
|
|
|
\subsection{High-precision clock time in Windows 8 and up (9.5)}
|
|
|
|
When running on Windows 8 and up, Chez Scheme uses the high-precision
|
|
clock time function for the current date and time.
|
|
|
|
\subsection{Printing of non-standard (extended) identifiers (9.5)}
|
|
|
|
Chez Scheme extends the syntax of identifiers as described in the
|
|
introduction to the Chez Scheme User's Guide, except within forms prefixed
|
|
by \scheme{#!r6rs}, which is implied in a library or top-level program.
|
|
Prior to Version~9.5, the printer always printed such identifiers using
|
|
hex scalar value escapes as necessary to render them with valid R6RS identifier syntax.
|
|
When the new parameter \scheme{print-extended-identifiers} is set
|
|
to \scheme{#t}, these identifiers are printed without escapes, e.g.,
|
|
\scheme{1+} prints as \scheme{1+} rather than as \scheme{\x31;+}.
|
|
The default value of this parameter is \scheme{#f}.
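
The effect can be observed with a sketch along the following lines,
which captures the printed form in a string.

\schemedisplay
(parameterize ([print-extended-identifiers #t])
  (with-output-to-string
    (lambda () (write '1+)))) ; "1+"
\endschemedisplay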
|
|
|
|
\subsection{Expression-editor Unicode support (9.5)}
|
|
|
|
The expression editor now supports Unicode characters under Linux and MacOS~X
|
|
except that combining characters are not treated correctly for
|
|
line-wrapping.
|
|
|
|
\subsection{Extensions to whole-program, whole-library optimization (9.3.1, 9.3.4)}
|
|
|
|
\scheme{compile-whole-program} now supports incomplete
|
|
whole-program optimization, i.e., whole program optimization that
|
|
incorporates only libraries for which wpo files are available while
|
|
leaving separate libraries for which only object files are available.
|
|
In addition, imported libraries can be left visible for run-time
|
|
use by the \scheme{environment} procedure or for dynamically loaded
|
|
object files that might require them.
|
|
The new procedure \scheme{compile-whole-library} supports the combination
|
|
of groups of libraries separate from programs and unconditionally
|
|
leaves all imported libraries visible.
|
|
|
|
\subsection{24-, 40-, 48-, and 56-bit bit-field containers (9.3.3)}
|
|
|
|
The total size of the fields within an ftype \scheme{bits} can now be
|
|
24, 40, 48, or 56 (as well as 8, 16, 32, and 64).
|
|
|
|
\subsection{Object-counting for static-generation collections (9.3.3)}
|
|
|
|
Object counting (see \scheme{object-counts} below) is now enabled for
|
|
all collections targeting the static generation.
|
|
|
|
\subsection{Support for off-line profile-dump processing (9.3.2)}
|
|
|
|
Previously, the output of \scheme{profile-dump} was not specified.
|
|
It is now specified to be a list of source-object, profile-count pairs.
|
|
In addition, \scheme{profile-dump-html}, \scheme{profile-dump-list},
|
|
and \scheme{profile-dump-data} all now take an optional \var{dump}
|
|
argument, which is a list of source-object, profile-count pairs in
|
|
the form returned by \scheme{profile-dump} and defaults to the current
|
|
value of \scheme{(profile-dump)}.
|
|
|
|
With these changes, it is now possible to obtain a dump from
|
|
\scheme{profile-dump} in one process, and write it to a fasl file
|
|
(using \scheme{fasl-write}) for subsequent off-line processing in
|
|
another process, where it can be read from the fasl file (using
|
|
\scheme{fasl-read}) and processed using \scheme{profile-dump-html},
|
|
\scheme{profile-dump-list}, \scheme{profile-dump-data} or some
|
|
custom mechanism.
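
The workflow just described might be sketched as follows.
The two procedure names are hypothetical, and the call to
\scheme{profile-dump-list} assumes that the optional \var{dump} argument
follows a \var{warn?} argument.

\schemedisplay
; in the profiled process: save the current dump to a fasl file
(define (save-profile-dump path)
  (let ([op (open-file-output-port path (file-options no-fail))])
    (fasl-write (profile-dump) op)
    (close-port op)))

; in a later process: read the dump back and process it
(define (report-profile-dump path)
  (let* ([ip (open-file-input-port path)]
         [dump (fasl-read ip)])
    (close-port ip)
    (profile-dump-list #t dump)))
\endschemedisplay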
|
|
|
|
\subsection{More support for controlling return of memory to the O/S (9.3.2)}
|
|
|
|
A new parameter, \scheme{release-minimum-generation}, determines when
|
|
the collector attempts to return unneeded virtual memory to the O/S.
|
|
It defaults to the value of \scheme{collect-maximum-generation}, so the
|
|
collector attempts to return memory to the O/S only when performing a
|
|
maximum-generation collection.
|
|
It can be set to a lower generation number to cause the collector to
|
|
do so for younger generations as well.
|
|
|
|
\subsection{sstats changes (9.3.1)}
|
|
|
|
The vector-based sstats structure has been replaced with a record type.
|
|
The time fields are all time objects, and the bytes and count fields
|
|
are now exact integers.
|
|
\scheme{time-difference} no longer coerces negative results to zero.
|
|
|
|
\subsection{\protect\scheme{library-group} eliminated (9.3.1)}
|
|
|
|
With the extensions to \scheme{compile-whole-program} and the
|
|
addition of \scheme{compile-whole-library}, as described above,
|
|
support for whole-program and whole-library optimization now subsumes
|
|
the functionality of the experimental \scheme{library-group} form,
|
|
and the form has been eliminated.
|
|
This is an \emph{incompatible change}.
|
|
|
|
\subsection{Support for Version~7 interaction-environment semantics eliminated (9.3.1)}
|
|
|
|
Prior to Version~8, the semantics of the interaction environment
|
|
used by the read-eval-print loop (REPL), aka waiter, and by
|
|
\scheme{load}, \scheme{compile}, and \scheme{interpret} without
|
|
explicit environment arguments treated all variables in the environment
|
|
as mutable, including those bound to primitives.
|
|
This meant that top-level references to primitive names could not
|
|
be optimized by the compiler because their values might change at
|
|
run time, except that, at optimize-level 2 and above, the compiler
|
|
did treat primitive names as always having their original values.
|
|
|
|
In Version 8 and subsequent versions, primitive bindings in the
|
|
interaction environment are immutable, as if imported directly from
|
|
the immutable Scheme environment.
|
|
That is, they cannot be assigned, although they can be replaced
|
|
with new bindings with a top-level definition.
|
|
|
|
To provide temporary backward compatibility, the
|
|
\scheme{--revert-interaction-semantics} command-line option and
|
|
\scheme{revert-interaction-semantics} parameter allowed programmers
|
|
to revert the interaction environment to Version~7 semantics.
|
|
This functionality has now been eliminated and along with it the
|
|
special treatment of primitive bindings at optimize level 2 and
|
|
above.
|
|
|
|
This is an \emph{incompatible change}.
|
|
|
|
\subsection{Explicit specification of profile source locations (9.3.1)}
|
|
|
|
Version 9.3.1 augments existing support for explicit source-code
|
|
annotations with additional features targeted at source profiling
|
|
for externally generated programs, including programs generated by
|
|
language front ends that target Scheme and use Chez Scheme as the
|
|
back end.
|
|
Included is a \scheme{profile} expression that explicitly associates
|
|
a specified source object with a profile count (of times the
|
|
expression is evaluated), a \scheme{generate-profile-forms} parameter
|
|
that controls whether the compiler (also) associates profile counts
|
|
with source locations implicitly identified by annotated expressions
|
|
in the input, and a finer-grained method for marking whether an
|
|
individual annotation should be used for debugging, profiling, or
|
|
both.
|
|
|
|
\subsection{``Maybe'' file (re)compilation (9.3.1)}
|
|
|
|
When \scheme{compile-imported-libraries} is set to \scheme{#t},
|
|
libraries required indirectly by one of the
|
|
file-compilation procedures, e.g., \scheme{compile-library},
|
|
\scheme{compile-program}, and \scheme{compile-file}, are automatically
|
|
compiled if and only if the object file is not present, is older than
|
|
the source (main and include) files, or some library upon which
|
|
they depend has been or needs to be recompiled.
|
|
|
|
Version 9.3.1 adds three new procedures: \scheme{maybe-compile-library},
\scheme{maybe-compile-program}, and \scheme{maybe-compile-file},
|
|
that perform a similar analysis and compile the library, program,
|
|
or file only under similar circumstances.
|
|
|
|
\subsection{New primitives for querying memory utilization (9.3.1)}
|
|
|
|
Three new primitives have been added to allow a Scheme process to
|
|
track usage of virtual memory for its heap.
|
|
|
|
\scheme{current-memory-bytes} returns the total number of bytes of
|
|
virtual memory used or reserved to represent the Scheme heap.
|
|
This differs from \scheme{bytes-allocated}, which returns the number
|
|
of bytes currently occupied by Scheme objects.
|
|
\scheme{current-memory-bytes} additionally includes memory used for
|
|
heap management as well as memory held in reserve to satisfy future
|
|
allocation requests.
|
|
|
|
\scheme{maximum-memory-bytes} returns the maximum number of bytes
|
|
of virtual memory occupied or reserved for the Scheme heap by the
|
|
calling process since the last call to \scheme{reset-maximum-memory-bytes!}
|
|
or, if \scheme{reset-maximum-memory-bytes!} has never been called,
|
|
since system start-up.
|
|
|
|
\scheme{reset-maximum-memory-bytes!} resets the maximum memory bytes
|
|
to the current memory bytes.
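
A small sketch of how the three primitives might be used together with
\scheme{bytes-allocated} follows; the allocation is arbitrary, and the
numbers reported vary by machine and session.

\schemedisplay
(reset-maximum-memory-bytes!)
(define big (make-vector 1000000 0)) ; allocate something sizable

(list (bytes-allocated)       ; bytes occupied by Scheme objects
      (current-memory-bytes)  ; bytes used or reserved for the heap
      (maximum-memory-bytes)) ; high-water mark since the reset
\endschemedisplay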
|
|
|
|
\subsection{Unicode 7.0 support (9.3.1)}
|
|
|
|
The character sets, character classes, and word-breaking algorithms
|
|
for character, string, and Unicode-related bytevector operations
|
|
have now been updated to Unicode 7.0.
|
|
|
|
\subsection{Linux PowerPC (32-bit) support (9.3)}
|
|
|
|
Support for running {\ChezScheme} on 32-bit PowerPC processors
|
|
running Linux has been added, with machine types ppc32le (nonthreaded)
|
|
and tppc32le (threaded).
|
|
C~code intended to be linked with these versions of the system
|
|
should be compiled using the GNU C~compiler's \scheme{-m32} option.
|
|
|
|
\subsection{Printed representation of procedures (9.2.1)}
|
|
|
|
The printed representation of a procedure now includes the source
|
|
file and beginning file position when available.
|
|
|
|
\subsection{I/O errors writing to the console error port (9.2.1)}
|
|
|
|
The default exception handler now catches I/O exceptions that occur
|
|
when it attempts to display a condition and, if an I/O exception
|
|
does occur, resets as if by calling the \scheme{reset} procedure.
|
|
The intent is to avoid an infinite regression (ultimately ending
|
|
in exhaustion of memory) in which the process repeatedly recurs
|
|
back to the default exception handler trying to write to a console-error
|
|
port (typically stderr) that is no longer writable, e.g., due to
|
|
the other end of a pipe or socket having been closed.
|
|
|
|
\subsection{C locking macros (9.2.1)}
|
|
|
|
The header file scheme.h distributed with Chez Scheme now includes
|
|
several new lock-related macros:
|
|
\scheme{INITLOCK} (corresponding to \scheme{ftype-init-lock!}),
|
|
\scheme{SPINLOCK} (\scheme{ftype-spin-lock!}),
|
|
\scheme{UNLOCK} (\scheme{ftype-unlock!}),
|
|
\scheme{LOCKED_INCR} (\scheme{ftype-locked-incr!}), and
|
|
\scheme{LOCKED_DECR} (\scheme{ftype-locked-decr!}).
|
|
All take a pointer to an iptr or uptr.
|
|
\scheme{LOCKED_INCR} and \scheme{LOCKED_DECR} also take an
|
|
\scheme{lvalue} argument that is set to true (nonzero) if the result
|
|
of the increment or decrement is zero, otherwise false (zero).
|
|
|
|
\subsection{New \protect\scheme{compile-to-file} procedure (9.2.1)}
|
|
|
|
The new procedure \scheme{compile-to-file} is similar to
|
|
\scheme{compile-to-port} with the output port replaced with an
|
|
output pathname.
|
|
|
|
\subsection{Whole-program optimization (9.2)}
|
|
|
|
Version 9.2 includes support for whole-program optimization of a top-level
|
|
program and the libraries upon which it depends at run time based on ``wpo''
|
|
(whole-program-optimization) files produced as a byproduct of compiling
|
|
the program and libraries when the parameter \scheme{generate-wpo-files}
|
|
is set to \scheme{#t}.
|
|
The new procedure \scheme{compile-whole-program} takes as input
|
|
a wpo file for a top-level program, combines it with the wpo files for
|
|
any libraries the program requires at run time, and produces a single
|
|
object file containing a self-contained program.
|
|
In so doing, it discards unused code and optimizes across program and
|
|
library boundaries, potentially reducing program load time, run time,
|
|
and memory requirements.
|
|
|
|
\scheme{compile-file}, \scheme{compile-program}, \scheme{compile-library},
|
|
and \scheme{compile-script} produce wpo files as well as ordinary
|
|
object files when the new \scheme{generate-wpo-files} parameter is set
|
|
to \scheme{#t} (the default is \scheme{#f}).
|
|
\scheme{compile-port} and \scheme{compile-to-port} do so when passed
|
|
an optional \var{wpo output port}.
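
A build step using these features might look like the following sketch.
The procedure name and the file names \scheme{mylib.ss},
\scheme{main.ss}, and \scheme{main-all.so} are hypothetical.

\schemedisplay
(define (build-whole-program)
  ; produce wpo files alongside the ordinary object files
  (parameterize ([generate-wpo-files #t])
    (compile-library "mylib.ss")
    (compile-program "main.ss"))
  ; combine main.wpo with the wpo files of the libraries it requires
  (compile-whole-program "main.wpo" "main-all.so"))
\endschemedisplay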
|
|
|
|
\subsection{Type-specific symbol-hashtable operators (9.2)\label{sec:symbol-hashtables}}
|
|
|
|
A new set of primitives that operate on symbol
|
|
hashtables has been added:
|
|
|
|
\schemedisplay
|
|
symbol-hashtable?
|
|
symbol-hashtable-ref
|
|
symbol-hashtable-set!
|
|
symbol-hashtable-contains?
|
|
symbol-hashtable-cell
|
|
symbol-hashtable-update!
|
|
symbol-hashtable-delete!
|
|
\endschemedisplay
|
|
|
|
These are like their generic counterparts but operate only on symbol
|
|
hashtables, i.e., hashtables created with \scheme{symbol-hash} as
|
|
the hash function and \scheme{eq?}, \scheme{eqv?}, \scheme{equal?},
|
|
or \scheme{symbol=?} as the equivalence function.
|
|
|
|
These primitives are more efficient at optimize-level 3 than their
|
|
generic counterparts when both are applied to symbol hashtables.
|
|
The performance of symbol hashtables has been improved even when the new
|
|
operators are not used (Section~\ref{sec:symbol-hashtable-performance}).
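
For example, the following sketch creates a symbol hashtable and uses
several of the new operators, which are assumed to take the same
arguments as their generic counterparts.

\schemedisplay
(define ht (make-hashtable symbol-hash eq?))
(symbol-hashtable? ht)                 ; #t
(symbol-hashtable-set! ht 'apples 3)
(symbol-hashtable-update! ht 'apples (lambda (n) (+ n 1)) 0)
(symbol-hashtable-ref ht 'apples #f)   ; 4
(symbol-hashtable-contains? ht 'pears) ; #f
\endschemedisplay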
|
|
|
|
\subsection{\protect\scheme{strip-fasl-file} is now machine-independent (9.2)}
|
|
|
|
\scheme{strip-fasl-file} can now strip fasl files created for a machine
|
|
type other than the machine type of the calling process as long as the
|
|
Chez Scheme version is the same.
|
|
|
|
\subsection{\protect\scheme{source-file-descriptor} and \protect\scheme{locate-source} (9.2)}
|
|
|
|
The new procedure \scheme{source-file-descriptor} can be used to construct
|
|
a custom source-file descriptor or reconstruct a source-file descriptor
|
|
from values previously extracted from another source-file descriptor.
|
|
It takes two arguments: a string \var{path} and exact nonnegative integer
|
|
\var{checksum} and returns a new source-file descriptor.
|
|
|
|
The new procedure \scheme{locate-source} can be used to determine a full
|
|
path, line number, and character position from a source-file descriptor
|
|
and file position.
|
|
It accepts two arguments: a source-file descriptor \var{sfd} and an
|
|
exact nonnegative integer file position \var{fp}.
|
|
It returns zero values if the unmodified file is not found in the source
|
|
directories and three values (string \var{path}, exact nonnegative
|
|
integer \var{line}, and exact nonnegative integer \var{char}) if the
|
|
file is found.
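
Because \scheme{locate-source} returns either zero or three values, a
caller might receive them with \scheme{case-lambda}, as in the sketch
below.
The helper \scheme{describe-position}, the path, and the checksum are
placeholders.

\schemedisplay
(define (describe-position sfd fp)
  (call-with-values
    (lambda () (locate-source sfd fp))
    (case-lambda
      [() 'source-not-found]
      [(path line char) (list path line char)])))

; a descriptor for a file that is assumed not to exist
(describe-position (source-file-descriptor "no-such-file.ss" 0) 0)
\endschemedisplay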
|
|
|
|
\subsection{Compressed compiled scripts and partially compressed files (9.2)}
|
|
|
|
Support for creating and handling files that begin with uncompressed
|
|
data and end with compressed data has been added in the form of the
|
|
new procedure \scheme{port-file-compressed!} that takes a port and,
|
|
if not already set up to read or write compressed data, sets it up
|
|
to do so.
|
|
The port must be a file port pointing to a regular file, i.e., a
|
|
file on disk rather than a socket or pipe, and the port must not be
|
|
an input/output port.
|
|
The port can be a binary or textual port.
|
|
If the port is an output port, subsequent output sent to the port
|
|
will be compressed.
|
|
If the port is an input port, subsequent input will be decompressed
|
|
if and only if the port is currently pointing at compressed data.
|
|
|
|
When the parameter \scheme{compile-compressed} is set to \scheme{#t},
|
|
the \scheme{compile-script} and \scheme{compile-program} procedures
|
|
take advantage of this functionality to copy the \scheme{#!} prefix,
|
|
if present in the source file, uncompressed in the object file while
|
|
compressing the object code emitted for the program, thus reducing
|
|
the size of the resulting file without preventing the \scheme{#!}
|
|
line from being read and interpreted properly by the operating
|
|
system.
|
|
|
|
\subsection{Change in library import handling (9.2)}
|
|
|
|
In previous releases, when an object file was found before the
|
|
corresponding source file in the library directories, the object file was
|
|
older, and the parameter \scheme{compile-imported-libraries} was not set,
|
|
the object file was loaded rather than the source file.
|
|
The (newer) source file is now loaded instead, just as it would be if
|
|
the source file is found before the corresponding, older object file.
|
|
This is an \emph{incompatible change}.
|
|
|
|
\subsection{Change in fasl-strip options (9.1)}
|
|
|
|
\scheme{strip-fasl-file} now supports stripping of all compile-time
|
|
information and no longer supports stripping of just library visit code.
|
|
Stripping all compile-time information nearly always results in smaller
|
|
object files than stripping just library visit code, with a corresponding
|
|
reduction in the memory required when the resulting
|
|
file is loaded.
|
|
|
|
To reflect this, the old fasl-strip option \scheme{library-visit-code}
|
|
has been eliminated, and the new fasl-strip option
|
|
\scheme{compile-time-information} has been added.
|
|
This is an \emph{incompatible change} in that code that previously
|
|
used the fasl-strip option \scheme{library-visit-code} will
|
|
have to be modified to omit the option or to replace it with
|
|
\scheme{compile-time-information}.
|
|
|
|
\subsection{Library loading (9.1)}
|
|
|
|
Visiting (via \scheme{visit}) a library no longer loads the library's
|
|
run-time information (invoke dependencies and invoke code), and revisiting
|
|
(via \scheme{revisit}) a library no longer loads the library's
|
|
compile-time information (import and visit dependencies and import and
|
|
visit code).
|
|
|
|
When a library is invoked due to a run-time dependency of another
|
|
library or a top-level program on the library, the library is now
|
|
``revisited'' (as if via \scheme{revisit}) rather than ``loaded''
|
|
(as if via \scheme{load}).
|
|
As a result, the compile-time information is not loaded, which can result
|
|
in substantial reductions in both library invocation time and memory
|
|
footprint.
|
|
|
|
If a library is revisited, either explicitly or as the result of run-time
|
|
dependency, a subsequent import of the library causes it to be
|
|
``visited'' (as if via \scheme{visit}) if the same object file can be
|
|
found at the same path and the visit code has not been stripped.
|
|
The compile-time code can alternatively be loaded explicitly from the same or a
|
|
different file via a direct call to \scheme{visit}.
|
|
|
|
While this change is mostly transparent (ignoring the reduced invocation
|
|
time and memory footprint), it is an \emph{incompatible change} in the
|
|
sense that the system potentially reads the file twice and can run
|
|
code that is marked using \scheme{eval-when} as both visit
|
|
and revisit code.
|
|
|
|
\subsection{Finding objects in the heap (9.1)}
|
|
|
|
Version 9.1 includes support for a new heap inspection tool that
|
|
allows a programmer to look for objects in the heap according to
|
|
arbitrary predicates.
|
|
The new procedure \scheme{make-object-finder} takes a predicate \var{pred} and two optional
|
|
arguments: a starting point \var{x} and a maximum generation \var{g}.
|
|
The starting point defaults to the value of the procedure \scheme{oblist},
|
|
and the maximum generation defaults to the value of the parameter
|
|
\scheme{collect-maximum-generation}.
|
|
\scheme{make-object-finder} returns an object finder \var{p} that can be used to
|
|
search for objects satisfying \var{pred} within the starting-point object \var{x}.
|
|
Immediate objects and objects in generations older than \var{g} are treated
|
|
as leaves.
|
|
\var{p} is a procedure accepting no arguments.
|
|
If an object \var{y} satisfying \var{pred} can be found starting with \var{x},
|
|
\var{p} returns a list whose first element is \var{y} and whose remaining
|
|
elements represent the path of objects from \var{x} to \var{y}, listed
|
|
in reverse order.
|
|
\var{p} can be invoked multiple times to find additional objects satisfying
|
|
the predicate, if any.
|
|
\var{p} returns \scheme{#f} if no more objects matching the predicate
|
|
can be found.
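
As a small sketch, the following searches a freshly allocated structure
for one particular object; the names \scheme{needle} and
\scheme{haystack} are illustrative only.

\schemedisplay
(define needle (list 'needle))
(define haystack (vector 'a (list 'b needle) 'c))

; search within haystack for the object that is eq? to needle
(define find-needle
  (make-object-finder (lambda (x) (eq? x needle)) haystack))

(define path (find-needle))
(eq? (car path) needle) ; #t, the found object comes first
\endschemedisplay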
|
|
|
|
\var{p} maintains internal state recording where it has been so that it
|
|
can restart at the point of the last found object and not return
|
|
the same object twice.
|
|
The state can be several times the size of the starting-point object
|
|
\var{x} and all that is reachable from \var{x}.
|
|
|
|
The interactive inspector provides a convenient interface to the object
|
|
finder in the form of \scheme{find} and \scheme{find-next} commands.
|
|
The \scheme{find} command evaluates its first argument, which should
|
|
evaluate to the desired predicate, and treats its second argument, if
|
|
present, as the maximum generation, overriding the default.
|
|
The starting point \var{x} is the object upon which the
|
|
inspector is currently focused.
|
|
If an object is found, the inspector's new focus is the found object,
|
|
the parent focus (obtainable via the \scheme{up} command) is the first
|
|
element in the (reversed) path, the parent's parent is the next element,
|
|
and so on up to \var{x}.
|
|
The \scheme{find-next} command repeats the last find, as if by an explicit
|
|
invocation of the same object finder.
|
|
|
|
Relocation tables for static code objects are discarded by default, which
|
|
prevents object finders from providing accurate results when static code
|
|
objects are involved.
|
|
That is, they will not find any objects pointed to directly from a code
|
|
object that has been promoted to the static generation.
|
|
If this is a problem, the command-line argument
|
|
\scheme{--retain-static-relocation} can be used to prevent the relocation
|
|
tables from being discarded.
|
|
|
|
\subsection{Object counts (9.1)}
|
|
|
|
The new procedure \scheme{object-counts} can be used to determine,
|
|
for each type of object, the number and size in bytes of objects of
|
|
that type in each generation.
|
|
Its return value has the following structure:
|
|
|
|
\schemedisplay
|
|
((\var{type} (\var{generation} \var{count} . \var{bytes}) \dots) \dots)
|
|
\endschemedisplay
|
|
|
|
\var{type} is either the name of a primitive type, represented as a
|
|
symbol, e.g., \scheme{pair}, or a record-type descriptor (rtd).
|
|
\var{generation} is a nonnegative fixnum between 0 and the value
|
|
of \scheme{(collect-maximum-generation)}, inclusive, or the symbol
|
|
\scheme{static} representing the static generation.
|
|
\var{count} and \var{bytes} are nonnegative fixnums.
|
|
|
|
Object counts are accurate for a generation $n$ immediately after
|
|
a collection of generation $n$ or higher if enabled during that
|
|
collection.
|
|
Object counts are enabled by setting the parameter
|
|
\scheme{enable-object-counts} to \scheme{#t}.
|
|
The command-line option \scheme{--enable-object-counts} can be used to
|
|
set this parameter to \scheme{#t} on startup.
|
|
Object counts are not enabled by default since they add overhead to
|
|
garbage collection.
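
One way to obtain counts for a single type is sketched below: object
counts are enabled for the duration of one explicit collection, and the
entry for pairs is then extracted from the result.

\schemedisplay
; enable counting only for this collection
(parameterize ([enable-object-counts #t])
  (collect (collect-maximum-generation)))

; entries for pairs, each of the form (generation count . bytes)
(cond
  [(assq 'pair (object-counts)) => cdr]
  [else '()])
\endschemedisplay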
|
|
|
|
To make the information more useful in the presence of ftype pointers,
|
|
the ftype descriptors produced by \scheme{define-ftype} for each
|
|
defined ftype now carry the name of the ftype rather than a generic
|
|
name like \scheme{ftd-struct}.
|
|
(Ftype descriptors are subtypes of record-type descriptors and can appear
|
|
as types in the \scheme{object-counts} return value.)
|
|
|
|
\subsection{Native-eol style is now none (9.1)}
|
|
|
|
To simplify interaction with tools that naively expose multiple-character
|
|
end-of-line sequences such as CRLF as separate characters to the user, the
|
|
native end-of-line style (\scheme{native-eol-style}) is now \scheme{none}
|
|
on all machine types.
|
|
This is an \emph{incompatible change}.
|
|
|
|
\subsection{Library-requirements options (9.1)}
|
|
|
|
In previous releases, the \scheme{library-requirements} procedure
returned a list of all libraries required by the specified library,
|
|
returns a list of all libraries required by the specified library,
|
|
whether they are needed when the specified library is imported,
|
|
visited, or invoked.
|
|
While this remains the default behavior, \scheme{library-requirements}
|
|
now takes an optional ``options'' argument.
|
|
This must be a library-requirements-options enumeration set, i.e., the
|
|
value of a \scheme{library-requirements-options} form with some subset of
|
|
the options \scheme{import}, \scheme{visit@visit}, \scheme{invoke@visit},
|
|
and \scheme{invoke}. \scheme{import} includes the libraries
|
|
that must be imported when the specified library is imported;
|
|
\scheme{visit@visit} includes the libraries that must be visited when
|
|
the specified library is visited; \scheme{invoke@visit} includes the libraries
|
|
that must be invoked when the specified library is visited; and
|
|
\scheme{invoke} includes the libraries that must be invoked when
|
|
the specified library is invoked.
|
|
The default behavior is obtained by supplying an enumeration set containing all
|
|
of these options.
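
For instance, the calls below sketch the default behavior and a request
restricted to one option; the library \scheme{(rnrs)} is used only as a
convenient example.

\schemedisplay
; all requirements, the default
(library-requirements '(rnrs))

; only the libraries that must be invoked when (rnrs) is invoked
(library-requirements '(rnrs)
  (library-requirements-options invoke))
\endschemedisplay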
|
|
|
|
\subsection{Nested object size and composition (9.1)}
|
|
|
|
Two new procedures, \scheme{compute-size} and
|
|
\scheme{compute-composition}, can be used to determine the
|
|
size and make-up of nested objects within the heap.
|
|
|
|
Both take an object and an optional generation.
|
|
The generation must be a fixnum between 0 and the value of
|
|
\scheme{(collect-maximum-generation)}, inclusive, or the symbol \scheme{static}.
|
|
It defaults to the value of \scheme{(collect-maximum-generation)}.
|
|
|
|
\scheme{compute-size} returns the number of bytes occupied by the object
|
|
and everything to which it points, ignoring objects in generations older
|
|
than the specified generation.
|
|
|
|
\scheme{compute-composition} returns an association list giving the
|
|
number and number of bytes of each type of object that the specified
|
|
object is constructed from, ignoring objects in generations older than
|
|
the specified generation. The association list maps type names (e.g.,
|
|
pair and flonum) or record-type descriptors to a pair of fixnums
|
|
giving the count and bytes.
|
|
Types with zero counts are not included in the list.
|
|
|
|
A surprising number of objects effectively point indirectly to a large
|
|
percentage of all objects in the heap due to the attachment of top-level
|
|
environment bindings to symbols, but the generation argument can be used
|
|
in combination with explicit calls to collect (with automatic collections
|
|
disabled) to measure precisely how much space is allocated to freshly
|
|
allocated structures.
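
A minimal sketch of this technique follows.
It assumes that no other threads are running and that installing \scheme{void} as the collect-request handler is an acceptable way to suppress automatic collections for the duration of the measurement; the names \scheme{fresh-size} and \scheme{fresh-composition} are purely illustrative.

\schemedisplay
(define fresh-size #f)
(define fresh-composition #f)

(parameterize ([collect-request-handler void]) ; no automatic collections
  (collect)                     ; promote existing objects out of generation 0
  (let ([v (make-vector 1000 0)])              ; freshly allocated structure
    ;; count only objects still in generation 0
    (set! fresh-size (compute-size v 0))
    (set! fresh-composition (compute-composition v 0))))
\endschemedisplay

The exact byte counts depend on the machine type, so none are shown here.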
|
|
|
|
When used directly from the REPL with no other threads running,
|
|
\scheme{(compute-size (oblist) 'static)} effectively gives the size of
|
|
the entire heap, and \scheme{(compute-composition (oblist) 'static)}
|
|
effectively gives the composition of the entire heap.
|
|
|
|
The inspector makes the aggregate size of an object similarly available
|
|
through the \scheme{size} inspector-object message and the corresponding
|
|
\scheme{size} interactive-inspector command, with the twist that it
|
|
does not include objects whose sizes were previously requested in the
|
|
same session, making it possible to see the effectively smaller sizes
|
|
of what the programmer perceives to be substructures in shared and
|
|
cyclic structures.
|
|
|
|
These procedures potentially allocate a large amount of memory and
|
|
so should be used only when the information returned by the
|
|
procedure \scheme{object-counts} (see preceding entry) does not suffice.
|
|
|
|
Relocation tables for static code objects are discarded by default,
|
|
which prevents these procedures from providing accurate results when
|
|
static code objects are involved.
|
|
That is, they will not find any objects pointed to directly from a code
|
|
object that has been promoted to the static generation.
|
|
If accurate sizes and compositions for static code objects are
|
|
required, the command-line argument \scheme{--retain-static-relocation}
|
|
can be used to prevent the relocation tables from being discarded.
|
|
|
|
\subsection{Showing expander and optimizer output (9.1)}
|
|
|
|
When the parameter \scheme{expand-output} is set to a textual output
|
|
port, the output of the expander is printed to the port as a side effect
|
|
of running \scheme{compile}, \scheme{interpret}, or any of the file
|
|
compiling primitives, e.g., \scheme{compile-file} or
|
|
\scheme{compile-library}.
|
|
Similarly, when the parameter \scheme{expand/optimize-output} is set to a
|
|
textual output port, the output of the source optimizer is printed.
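
For instance, the following sketch directs both kinds of output to the current output port while a small expression is compiled and run; the expression itself is an arbitrary choice, and the precise form of the printed output may vary between releases.

\schemedisplay
(define result
  (parameterize ([expand-output (current-output-port)]
                 [expand/optimize-output (current-output-port)])
    ;; prints the expander output and the source optimizer output
    ;; as a side effect, then evaluates the expression
    (compile '(let ([x 3]) (+ x 4)))))
\endschemedisplay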
|
|
|
|
\subsection{Undefined-variable warnings (9.1)}
|
|
|
|
When \scheme{undefined-variable-warnings} is set to \scheme{#t}, the
|
|
compiler issues a warning message whenever it cannot determine that
|
|
a variable bound by \scheme{letrec}, \scheme{letrec*}, or an internal
|
|
definition will not be referenced before it is defined.
|
|
The default value is \scheme{#f}.
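
As a sketch of a case in which a warning would be expected, the \scheme{letrec} below calls \scheme{a} while initializing \scheme{b}, and \scheme{a} references \scheme{b}.
The expression is wrapped in a \scheme{lambda} so that it is compiled but not run; the name \scheme{suspect} is illustrative, and the wording of the warning is not reproduced here.

\schemedisplay
(define suspect
  (parameterize ([undefined-variable-warnings #t])
    (compile
      '(lambda ()
         (letrec ([a (lambda () b)]
                  [b (a)])        ; b is referenced before it is defined
           b)))))
\endschemedisplay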
|
|
|
|
Regardless of the setting of this parameter, the compiler inserts code
|
|
to check for the error, except at optimize level 3.
|
|
The check is fairly inexpensive and does not typically inhibit inlining
|
|
or other optimizations.
|
|
In code that must be carefully tuned, however, it is sometimes useful
|
|
to reorder bindings or make other changes to eliminate the checks.
|
|
Enabling this warning can facilitate this process.
|
|
|
|
The checks are also visible in the output of \scheme{expand/optimize}.
|
|
|
|
\subsection{Detecting accidental use of generative record types (9.1)}
|
|
|
|
When the new boolean parameter \scheme{require-nongenerative-clause}
|
|
is set to \scheme{#t}, a \scheme{define-record-type} without a
|
|
\scheme{nongenerative} clause is treated as a syntax error.
|
|
This allows the programmer to detect accidental use of generative
|
|
record types.
|
|
Generative record types are rarely useful and are less efficient
|
|
than nongenerative types, since generative record types require the
|
|
construction of a record-type-descriptor each time a
|
|
\scheme{define-record-type} form is evaluated rather than once,
|
|
at compile time.
|
|
To support the rare need for a generative record type while still
|
|
allowing accidental generativity to be detected,
|
|
\scheme{define-record-type} has been extended to allow a generative
|
|
record type to be explicitly declared with a \scheme{nongenerative}
|
|
clause with \scheme{#f} for the uid, i.e., \scheme{(nongenerative #f)}.
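
The sketch below assumes the forms are entered one at a time at the REPL (or loaded from a source file), so that the parameter is set before the record definitions are expanded; the record-type names are hypothetical.

\schemedisplay
(require-nongenerative-clause #t)

;; nongenerative, with a uid chosen by the expander
(define-record-type point
  (nongenerative)
  (fields x y))

;; explicitly generative: accepted even with the parameter set
(define-record-type scratch
  (nongenerative #f)
  (fields value))

;; with the parameter set, omitting the clause is a syntax error:
;; (define-record-type oops (fields a))
\endschemedisplay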
|
|
|
|
\subsection{Improved support for cross compilation (9.1)}
|
|
|
|
Cross-compilation support has been improved in two ways: (1) it is
|
|
now possible to cross-compile a library and import it later in a
|
|
separate process for cross-compilation of dependent libraries, and
|
|
(2) the code produced for the target machine when cross compiling is no
|
|
longer less efficient than code produced natively on the target
|
|
machine.
|
|
|
|
\subsection{Linux ARMv6 (32-bit) support (9.1)}
|
|
|
|
Support for running {\ChezScheme} on ARMv6 processors running Linux
|
|
has been added, with machine type arm32le (32-bit nonthreaded).
|
|
C~code intended to be linked with these versions of the system
|
|
should be compiled using the GNU C~compiler's \scheme{-m32} option.
|
|
|
|
\subsection{Source information in ftype ref/set! error messages (9.0)}
|
|
|
|
When available at compile time, source information is now included
|
|
in run-time error messages produced when \scheme{ftype-&ref},
|
|
\scheme{ftype-ref}, \scheme{ftype-set!}, and the locked ftype
|
|
operations are handed invalid inputs, e.g., ftype pointers of some
|
|
unexpected type, RHS values of some unexpected type, or improper
|
|
indices.
|
|
|
|
\subsection{\protect\scheme{compile-to-port} top-level-program dependencies (9.0)}
|
|
|
|
When passed a single \scheme{top-level-program} form,
|
|
\scheme{compile-to-port} now returns a list of the libraries the
|
|
top-level program requires at run time, as with \scheme{compile-program}.
|
|
Otherwise, the return value is unspecified.
|
|
|
|
\subsection{Better feedback for record-type mismatches (9.0)}
|
|
|
|
When \scheme{make-record-type} or \scheme{make-record-type-descriptor}
|
|
detects an incompatibility between two record types with the same
|
|
UID, the resulting error messages provide more information to
|
|
describe the mismatch, i.e., whether the parent, fields, flags, or
|
|
mutability differ.
|
|
|
|
\subsection{\protect\scheme{enable-cross-library-optimization} parameter (9.0)}
|
|
|
|
When a library is compiled, information is stored with the object
|
|
code to enable propagation of constants and inlining of procedures
|
|
defined in the library into dependent libraries.
|
|
The new parameter \scheme{enable-cross-library-optimization}, whose
|
|
value defaults to \scheme{#t}, can be set to \scheme{#f} to prevent
|
|
this information from being stored and disable the corresponding
|
|
optimizations.
|
|
This might be done to reduce the size of the object files or to
|
|
reduce the potential for exposure of near-source information via
|
|
the object file.
|
|
|
|
\subsection{Stripping object files (9.0)}
|
|
|
|
The new procedure \scheme{strip-fasl-file} allows the removal of
|
|
source information of various sorts from a compiled object (fasl) file
|
|
produced by \scheme{compile-file} or one of the other file compiling
|
|
procedures.
|
|
It also allows removal of library visit code, i.e., the code
|
|
required to compile (but not run) dependent libraries.
|
|
|
|
\scheme{strip-fasl-file} accepts three arguments: an input pathname,
|
|
an output pathname, and a fasl-strip-options enumeration set,
|
|
created by \scheme{fasl-strip-options} with zero or more of the
|
|
following options.
|
|
|
|
\begin{description}
|
|
\item[\scheme{inspector-source}:]
|
|
Strip inspector source information.
|
|
|
|
\item[\scheme{source-annotations}:]
|
|
Strip source annotations.
|
|
|
|
\item[\scheme{profile-source}:]
|
|
Strip source file and character position information from profiled
|
|
code objects.
|
|
|
|
\item[\scheme{library-visit-code}:]
|
|
This strips library visit code from compiled libraries.
|
|
\end{description}
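
The following sketch shows one possible use.
The file names \scheme{demo.ss}, \scheme{demo.so}, and \scheme{demo-stripped.so} are hypothetical, and the choice of options is only an example.

\schemedisplay
;; write a tiny source file and compile it
(with-output-to-file "demo.ss"
  (lambda () (write '(define (square x) (* x x))))
  'replace)
(compile-file "demo.ss" "demo.so")

;; produce a copy without inspector source or source annotations
(strip-fasl-file "demo.so" "demo-stripped.so"
  (fasl-strip-options inspector-source source-annotations))

;; the stripped file still loads and runs
(load "demo-stripped.so")
(square 5) ;=> 25
\endschemedisplay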
|
|
|
|
\subsection{Ftype array bound of zero (9.0)}
|
|
|
|
The bound of an ftype array can now be zero and, when zero, is
|
|
treated as unbounded in the sense that no run-time upper-bound
|
|
checks are performed for accesses to the array.
|
|
This simplifies the creation of ftype arrays whose actual bounds
|
|
are determined dynamically.
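
For example, a length-prefixed buffer might be sketched as follows.
The ftype name \scheme{packet}, its field names, and the payload size are assumptions made for illustration; because no upper-bound check is performed, the program itself must ensure that enough memory has been allocated.

\schemedisplay
(define-ftype packet
  (struct
    [len unsigned-32]
    [data (array 0 unsigned-8)])) ; bound determined at run time

(define n 16)                     ; payload size chosen dynamically
(define p
  (make-ftype-pointer packet
    (foreign-alloc (+ (ftype-sizeof packet) n))))

(ftype-set! packet (len) p n)
(ftype-set! packet (data 5) p 42) ; no run-time upper-bound check
(define byte5 (ftype-ref packet (data 5) p))

(foreign-free (ftype-pointer-address p))
\endschemedisplay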
|
|
|
|
\subsection{\protect\scheme{compile-profile} no longer implies \protect\scheme{generate-inspector-information} (9.0)}
|
|
|
|
In previous releases, profile and inspector source information was
|
|
gathered and stored together so that compiling with profiling enabled
|
|
required that inspector information also be stored with each code object.
|
|
This is no longer the case.
|
|
|
|
\subsection{\protect\scheme{case} now uses \protect\scheme{member} (9.0)}
|
|
|
|
\scheme{case} now uses \scheme{member} rather than \scheme{memv} for key
|
|
comparisons, a generalization that allows \scheme{case} to be used for
|
|
strings, lists, vectors, etc., rather than just atomic values.
|
|
This adds no overhead when keys are comparable with \scheme{memv},
|
|
since the compiler converts calls to \scheme{member} into calls to
|
|
\scheme{memv} (or \scheme{memq}, or even individual inline pointer
|
|
comparisons) when it can determine the more expensive test is not
|
|
required.
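
A short sketch of the generalized behavior, using the \scheme{case} exported by \scheme{(chezscheme)}; the procedure name \scheme{classify} is illustrative.

\schemedisplay
(define (classify x)
  (case x
    [("yes" "y") 'affirmative]    ; string keys
    [("no" "n") 'negative]
    [((1 2) #(3 4)) 'compound]    ; list and vector keys
    [(0 1) 'small]                ; keys that memv could handle
    [else 'unknown]))

(classify (string-append "y" "es")) ;=> affirmative
(classify (list 1 2))               ;=> compound
(classify 1)                        ;=> small
\endschemedisplay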
|
|
|
|
The \scheme{case} syntax exported by the \scheme{(rnrs)} and
|
|
\scheme{(rnrs base)} libraries still uses \scheme{memv} for
|
|
compatibility with the R6RS standard.
|
|
|
|
\subsection{\protect\scheme{write} and \protect\scheme{display} and foreign addresses (9.0)}
|
|
|
|
The \scheme{write} and \scheme{display} procedures now recognize
|
|
foreign addresses that happen to look like Scheme objects and print
|
|
them as \scheme{#<foreign>}; previously, \scheme{write} and
|
|
\scheme{display} would attempt to treat the addresses as Scheme
|
|
objects, typically leading to invalid memory references.
|
|
Some foreign addresses are indistinguishable from fixnums and
|
|
still print as fixnums.
|
|
|
|
\subsection{Profile-directed optimization (9.0)}
|
|
|
|
Compiled code can be instrumented to gather two kinds of
|
|
execution counts, source-level and block-level, via different settings
|
|
of the \scheme{compile-profile} parameter.
|
|
When \scheme{compile-profile} is set to the symbol \scheme{source}
|
|
at compile time, source execution counts are gathered by the generated
|
|
code, and when \scheme{compile-profile} is set to \scheme{block},
|
|
block execution counts are gathered.
|
|
Setting it to \scheme{#f} (the default) disables instrumentation.
|
|
|
|
Source counts are identical to the source counts gathered by generated
|
|
code in previous releases when compiled with
|
|
\scheme{compile-profile} set to \scheme{#t}, and \scheme{#t}
|
|
can still be used in place of \scheme{source} for backward
|
|
compatibility.
|
|
Source counts can be viewed by the programmer at the end of the run
|
|
of the generated code via \scheme{profile-dump-list} and
|
|
\scheme{profile-dump-html}.
|
|
|
|
Block counts are per \emph{basic block}.
|
|
Basic blocks are individual sequences of straight-line code and are
|
|
the building blocks of the machine code generated by the compiler.
|
|
Counting the number of times a block is executed is thus equivalent
|
|
to counting the number of times the instructions within it are
|
|
executed.
|
|
|
|
There is no mechanism for the programmer to view block counts, but
|
|
both block counts and source counts can now be saved after a sample
|
|
run of the generated code for use in guiding various optimizations
|
|
during a subsequent compilation of the same code.
|
|
|
|
The source counts can be used by ``profile-aware macros,'' i.e.,
|
|
macros whose expansion is guided by profiling information.
|
|
A profile-aware macro can use profile information to optimize
|
|
the code it produces.
|
|
For example, a macro defining an abstract datatype might choose
|
|
representations and algorithms based on the frequencies
|
|
of its operations.
|
|
Similarly, a macro, like \scheme{case}, that performs a set of
|
|
disjoint tests might choose to order those tests based on which are
|
|
most likely to succeed.
|
|
Indeed, the built-in \scheme{case} now does just that.
|
|
A new syntactic form, \scheme{exclusive-cond}, abstracts a common
|
|
use case for profile-aware macros.
|
|
|
|
The block counts are used to guide certain low-level optimizations,
|
|
such as block ordering and register allocation.
|
|
|
|
The procedure \scheme{profile-dump-data} writes to a specified file
|
|
the profile data collected during the run of a program compiled
|
|
with \scheme{compile-profile} set to either \scheme{source} or
|
|
\scheme{block}.
|
|
It is similar to \scheme{profile-dump-list} or \scheme{profile-dump-html}
|
|
but stores the profile data in a machine readable form.
|
|
|
|
The procedure \scheme{profile-load-data} loads one or more files
|
|
previously created by \scheme{profile-dump-data} into an internal
|
|
database.
|
|
|
|
The database associates \emph{weights} with source locations or
|
|
blocks, where a weight is a flonum representing the ratio of the
|
|
location's count versus the maximum count.
|
|
When multiple profile data sets are loaded, the weights for each
|
|
location are averaged across the data sets.
|
|
|
|
The procedure \scheme{profile-query-weight} accepts a source object
|
|
and returns the weight associated with the location identified by
|
|
the source object, or \scheme{#f} if no weight is associated with
|
|
the location.
|
|
This procedure is intended to be used by a profile-aware macro on
|
|
pieces of its input to optimize code based on profile data previously
|
|
stored by \scheme{profile-dump-data} and loaded by
|
|
\scheme{profile-load-data}.
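
The sketch below shows what a small profile-aware macro might look like.
The macro name \scheme{if-r} and the helper \scheme{weight} are hypothetical; the helper extracts a source object from each branch via \scheme{syntax->annotation} and \scheme{annotation-source} and treats a missing annotation or missing weight as 0.0.

\schemedisplay
(define-syntax if-r
  (lambda (x)
    ;; weight of a piece of the macro input, or 0.0 if unknown
    (define (weight stx)
      (let ([a (syntax->annotation stx)])
        (or (and a (profile-query-weight (annotation-source a)))
            0.0)))
    (syntax-case x ()
      [(_ test conseq alt)
       (if (< (weight #'conseq) (weight #'alt))
           ;; the alternative ran more often: test for it first
           #'(if (not test) alt conseq)
           #'(if test conseq alt))])))

(if-r (odd? 3) 'odd 'even) ;=> odd
\endschemedisplay

With no profile data loaded, both weights are 0.0 and the macro expands into an ordinary \scheme{if}.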
|
|
|
|
The procedure \scheme{profile-clear-data} clears the database.
|
|
|
|
The new \scheme{exclusive-cond} syntax is similar to \scheme{cond}
|
|
except it assumes the tests performed by the clauses are disjoint
|
|
and reorders them based on available profiling data.
|
|
Because the tests might be reordered, the order in which side effects
|
|
of the test expressions occur is undefined.
|
|
The built-in \scheme{case} form is implemented in terms of
|
|
\scheme{exclusive-cond}.
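
As a brief sketch, the tests in the following hypothetical procedure are disjoint, so the result does not depend on the order in which they are tried.

\schemedisplay
(define (kind x)
  (exclusive-cond
    [(fixnum? x) 'fixnum]
    [(flonum? x) 'flonum]
    [(string? x) 'string]
    [else 'other]))

(kind 3.5) ;=> flonum
\endschemedisplay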
|
|
|
|
\subsection{New \protect\scheme{ssize_t} foreign type (9.0)}
|
|
|
|
A new foreign type, \scheme{ssize_t}, is now supported.
|
|
It is the signed analogue of \scheme{size_t}.
|
|
|
|
\subsection{Guardian representatives (9.0)}
|
|
|
|
When a guardian is passed a second, \emph{representative},
|
|
argument, the representative is returned from the guardian in place
|
|
of the guarded object when the guarded object is no longer accessible.
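
A minimal sketch: the guarded list below has no other references, so after a collection the guardian typically returns the representative symbol rather than the list.
The symbol \scheme{data-dropped} is an arbitrary choice.

\schemedisplay
(define g (make-guardian))

;; register a fresh object along with a representative
(g (list 'some 'data) 'data-dropped)

(collect)
(define dropped (g)) ; typically data-dropped once the list is inaccessible
\endschemedisplay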
|
|
|
|
\subsection{Library reloading on dependency change (9.0)}
|
|
|
|
A library initially imported from an object file is now reimported from
|
|
source when a dependency (another library or include file) has changed
|
|
since the library was compiled.
|
|
|
|
\subsection{Expression-editor filename completion (8.9.5)}
|
|
|
|
The expression editor now performs filename- rather than
|
|
command-completion within string constants.
|
|
It looks only at the current line to determine whether the cursor is
|
|
within a string constant; this can lead to the wrong kind of
|
|
completion for strings that cross line boundaries.
|
|
|
|
\subsection{New lock mechanisms and elimination of old lock mechanism (8.9.5)}
|
|
|
|
The built-in ftype \scheme{ftype-lock} has been eliminated along
|
|
with the corresponding procedures, \scheme{acquire-lock},
|
|
\scheme{release-lock}, and \scheme{initialize-lock}.
|
|
This is an incompatible change, although defining
|
|
\scheme{ftype-lock} and the associated procedures is straightforward
|
|
using the forms described below.
|
|
|
|
The functionality has been replaced and generalized by four new syntactic
|
|
forms that operate on lock fields wherever they appear within a foreign
|
|
type:
|
|
|
|
\schemedisplay
|
|
(ftype-init-lock! \var{T} (\var{a} ...) \var{e})
|
|
(ftype-lock! \var{T} (\var{a} ...) \var{e})
|
|
(ftype-spin-lock! \var{T} (\var{a} ...) \var{e})
|
|
(ftype-unlock! \var{T} (\var{a} ...) \var{e})
|
|
\endschemedisplay
|
|
|
|
The access chain \scheme{\var{a} \dots} must specify a word-size
|
|
integer represented using the native endianness, i.e., a \scheme{uptr}
|
|
or \scheme{iptr}.
|
|
It is a syntax violation when this is not the case.
|
|
|
|
For each of the forms, the expression \var{e} is evaluated first
|
|
and must evaluate to an ftype pointer \var{p} of type \var{T}.
|
|
|
|
\scheme{ftype-init-lock!} initializes the specified field of the foreign
|
|
object to which \var{p} points, puts the field into the unlocked state,
|
|
and returns an unspecified value.
|
|
|
|
If the field is in the unlocked state, \scheme{ftype-lock!} puts it
|
|
into the locked state and returns \scheme{#t}.
|
|
If the field is already in the locked state, \scheme{ftype-lock!}
|
|
returns \scheme{#f}.
|
|
|
|
\scheme{ftype-spin-lock!} loops until the lock is in the unlocked
|
|
state, then puts it into the locked state and returns an unspecified
|
|
value.
|
|
\emph{This operation will never return if no other thread or process
|
|
unlocks the field, causing interrupts and requests for collection to
|
|
be ignored.}
|
|
|
|
Finally, \scheme{ftype-unlock!} puts the field into the unlocked state
|
|
(regardless of the current state) and returns an unspecified value.
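
The following sketch exercises the lock forms on a single thread.
The ftype name \scheme{guarded} and its field names are hypothetical.

\schemedisplay
(define-ftype guarded
  (struct
    [lock uptr]      ; word-size, native-endian lock field
    [payload int]))

(define p
  (make-ftype-pointer guarded
    (foreign-alloc (ftype-sizeof guarded))))

(ftype-init-lock! guarded (lock) p)             ; now unlocked
(define first-try (ftype-lock! guarded (lock) p))  ; #t, lock acquired
(define second-try (ftype-lock! guarded (lock) p)) ; #f, already locked
(ftype-unlock! guarded (lock) p)                ; unlocked again
(define third-try (ftype-lock! guarded (lock) p))  ; #t

(foreign-free (ftype-pointer-address p))
\endschemedisplay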
|
|
|
|
An additional pair of syntactic forms can be used when just an
|
|
atomic increment or decrement is required:
|
|
|
|
\schemedisplay
|
|
(ftype-locked-incr! \var{T} (\var{a} ...) \var{e})
|
|
(ftype-locked-decr! \var{T} (\var{a} ...) \var{e})
|
|
\endschemedisplay
|
|
|
|
As for the first set of forms, the access chain \scheme{\var{a} \dots}
|
|
must specify a word-size integer represented using the native endianness.
|
|
|
|
\subsection{\protect\scheme{ftype-pointer-null?}, \protect\scheme{ftype-pointer=?} (8.9.5)}
|
|
|
|
The new procedure \scheme{ftype-pointer-null?} can be used to compare the
|
|
address of its single argument, which must be an ftype pointer, against 0.
|
|
It returns \scheme{#t} if the address is 0 and \scheme{#f} otherwise.
|
|
Similarly, \scheme{ftype-pointer=?} can be used to compare the
|
|
addresses of two ftype-pointer arguments.
|
|
It returns \scheme{#t} if the addresses are the same and \scheme{#f}
|
|
otherwise.
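
A small sketch, using a hypothetical ftype \scheme{cell}:

\schemedisplay
(define-ftype cell (struct [value int]))

(define null-ptr (make-ftype-pointer cell 0))
(define p (make-ftype-pointer cell (foreign-alloc (ftype-sizeof cell))))
(define q (make-ftype-pointer cell (ftype-pointer-address p)))

(ftype-pointer-null? null-ptr) ;=> #t
(ftype-pointer-null? p)        ;=> #f
(ftype-pointer=? p q)          ;=> #t
(ftype-pointer=? p null-ptr)   ;=> #f
\endschemedisplay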
|
|
|
|
These are potentially more efficient than extracting ftype-pointer
|
|
addresses first, which might result in bignum allocation for addresses
|
|
outside the fixnum range,
|
|
although the compiler also now
|
|
tries to avoid allocation when the result of a call to
|
|
\scheme{ftype-pointer-address} is directly compared with 0 or with the
|
|
result of another call to \scheme{ftype-pointer-address}, as described
|
|
in Section~\ref{ftpaopt}.
|
|
|
|
\subsection{\protect\scheme{gensym}'s new optional unique-name argument (8.9.5)}
|
|
|
|
\scheme{gensym} now accepts a second optional argument, the unique
|
|
name to use.
|
|
It must be a string and should not be used by any other gensym intended
|
|
to be distinct from the new gensym.
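
For example, with an arbitrary pretty name and a unique name chosen for illustration:

\schemedisplay
(define g (gensym "tmp" "tmp-demo-unique-1"))

(gensym? g)                ;=> #t
(symbol->string g)         ;=> "tmp"
(gensym->unique-string g)  ;=> "tmp-demo-unique-1"
\endschemedisplay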
|
|
|
|
\subsection{GC times now maintained with finer granularity (8.9.5)}
|
|
|
|
In previous releases, collection times as reported by \scheme{statistics}
|
|
or printed by \scheme{display-statistics} were gathered internally
|
|
with millisecond granularity at each collection, possibly leading to
|
|
significant inaccuracies over the course of many collections.
|
|
They are now maintained using high-resolution timers with generally
|
|
much better accuracy.
|
|
|
|
\subsection{New time types for tracking collection times (8.9.5)}
|
|
|
|
New time types \scheme{time-collector-cpu} and \scheme{time-collector-real}
|
|
have been added.
|
|
When \scheme{current-time} is passed one of these types, a time
|
|
object of the specified type is returned and represents the time
|
|
(cpu or real) spent during collection.
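
A brief sketch of reading the collector times after forcing a collection; the variable names are illustrative, and the values obtained naturally vary from run to run.

\schemedisplay
(collect)

(define gc-cpu (current-time 'time-collector-cpu))
(define gc-real (current-time 'time-collector-real))

(time-type gc-cpu)        ;=> time-collector-cpu
(time-second gc-cpu)      ; whole seconds spent in the collector
(time-nanosecond gc-cpu)  ; remaining nanoseconds
\endschemedisplay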
|
|
|
|
Previously, this information was available only via the
|
|
\scheme{statistics} or \scheme{display-statistics} procedures, and then
|
|
with lower precision.
|
|
|
|
\subsection{New storage-management introspection procedures (8.9.5)}
|
|
|
|
Three new storage-management introspection procedures have been
|
|
added:
|
|
|
|
\schemedisplay
|
|
(collections)
|
|
(initial-bytes-allocated)
|
|
(bytes-deallocated)
|
|
\endschemedisplay
|
|
|
|
\scheme{collections} returns the number of collections performed so
|
|
far by the current Scheme process.
|
|
|
|
\scheme{initial-bytes-allocated} returns the number of bytes
|
|
allocated after loading the boot files and before running any
|
|
non-boot user code.
|
|
|
|
\scheme{bytes-deallocated} returns the total number of bytes
|
|
deallocated by the collector.
|
|
|
|
Previously, this information was available only via the
|
|
\scheme{statistics} or \scheme{display-statistics}
|
|
procedures.
|
|
|
|
\subsection{New time-object manipulation procedures (8.9.5)}
|
|
|
|
Three new procedures for performing arithmetic on time objects have
|
|
been added, per SRFI~19:
|
|
|
|
\schemedisplay
|
|
(time-difference \var{t1} \var{t2}) ;=> \var{t3}
|
|
(add-duration \var{t1} \var{t2}) ;=> \var{t3}
|
|
(subtract-duration \var{t1} \var{t2}) ;=> \var{t3}
|
|
\endschemedisplay
|
|
|
|
\scheme{time-difference} takes two time objects \var{t1} and \var{t2},
|
|
which must have the same time type, and returns the result of subtracting
|
|
\var{t2} from \var{t1}, represented as a new time object with type
|
|
\scheme{time-duration}.
|
|
\scheme{add-duration} adds time object \var{t2}, which must be of type
|
|
\scheme{time-duration}, to time object \var{t1}, producing a new time object
|
|
\var{t3} with the same type as \var{t1}.
|
|
\scheme{subtract-duration} subtracts time object \var{t2}, which must be
|
|
of type \scheme{time-duration}, from time object \var{t1}, producing a new
|
|
time object \var{t3} with the same type as \var{t1}.
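
The relationships among these procedures can be sketched as follows, using two readings of the monotonic clock; the variable names and the intervening work are arbitrary.

\schemedisplay
(define t0 (current-time 'time-monotonic))
(define work (make-vector 100000 0))   ; something to time
(define t1 (current-time 'time-monotonic))

(define d (time-difference t1 t0))     ; t1 minus t0
(time-type d)                          ;=> time-duration

(time=? (add-duration t0 d) t1)        ;=> #t
(time=? (subtract-duration t1 d) t0)   ;=> #t
\endschemedisplay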
|
|
|
|
SRFI~19 also names destructive versions of these operators:
|
|
|
|
\schemedisplay
|
|
(time-difference! \var{t1} \var{t2}) ;=> \var{t3}
|
|
(add-duration! \var{t1} \var{t2}) ;=> \var{t3}
|
|
(subtract-duration! \var{t1} \var{t2}) ;=> \var{t3}
|
|
\endschemedisplay
|
|
|
|
These are available as well in {\ChezScheme} but are actually
|
|
nondestructive, i.e., entirely equivalent to the nondestructive
|
|
versions.
|
|
|
|
\subsection{Better reporting of profile counts (8.9.4, 8.9.5)}
|
|
|
|
The compiler now collects and reports profile counts for every
|
|
source expression that is not determined to be dead either at
|
|
compile time or by the time the profile information is obtained via
|
|
\scheme{profile-dump-list} or \scheme{profile-dump-html}.
|
|
Previously, the compiler suppressed profile counts for constants and
|
|
variable references in contexts where the information was likely (though
|
|
not guaranteed) to be redundant, and it dropped profile counts for some
|
|
forms that were optimized away, such as inlined calls, folded calls,
|
|
or useless code.
|
|
Furthermore, profile counts now uniformly represent the number of times
|
|
a source expression's evaluation was started, which was not always the
|
|
case before.
|
|
|
|
A small related enhancement has been made in the HTML output produced
|
|
by \scheme{profile-dump-html}.
|
|
Hovering over a source expression now shows, in addition to the count,
|
|
the starting position (line number and character) of the source expression
|
|
to which the count belongs.
|
|
This is useful for identifying when a source expression does not have its
|
|
own count but instead inherits the count (and color) from an enclosing
|
|
expression.
|
|
|
|
\subsection{Virtual registers (8.9.4)}
|
|
|
|
A limited set of \emph{virtual registers} is now supported by the compiler
|
|
for use by programs that require high-speed, global, and mutable storage
|
|
locations.
|
|
Referencing or assigning a virtual register is potentially faster and
|
|
never slower than accessing an assignable local or global variable,
|
|
and the code sequences for doing so are generally smaller.
|
|
Assignment is potentially significantly faster because there is no need
|
|
to track pointers from the virtual registers to young objects, as there
|
|
is for variable locations that might reside in older generations.
|
|
On threaded versions of the system, virtual registers are ``per thread''
|
|
and thus serve as thread-local storage in a manner that is less expensive
|
|
than thread parameters.
|
|
|
|
The interface consists of three procedures:
|
|
|
|
\scheme{(virtual-register-count)} returns the number of virtual registers.
|
|
As of this writing, the count is set at 16. This number is fixed, i.e.,
|
|
cannot be changed except by recompiling {\ChezScheme} from source.
|
|
|
|
\scheme{(set-virtual-register! \var{k} \var{x})} stores \var{x} in virtual
|
|
register \var{k}.
|
|
\var{k} must be a fixnum between 0 (inclusive) and the value of
|
|
\scheme{(virtual-register-count)} (exclusive).
|
|
|
|
\scheme{(virtual-register \var{k})} returns the value most recently
|
|
stored in virtual register \var{k} (on the current thread, in threaded
|
|
versions of the system).
|
|
|
|
To get the fastest possible speed out of the latter two procedures,
|
|
\var{k} should be a constant embedded right in the call
|
|
(or propagatable via optimization to the call).
|
|
To avoid putting these constants in the source code, programmers should
|
|
consider using identifier macros to give names to virtual registers, e.g.:
|
|
|
|
\schemedisplay
|
|
(define-syntax foo
|
|
(identifier-syntax
|
|
[id (virtual-register 0)]
|
|
[(set! id e) (set-virtual-register! 0 e)]))
|
|
(set! foo 'hello)
|
|
foo ;=> hello
|
|
\endschemedisplay
|
|
|
|
Virtual registers must be treated as an application-level resource, i.e.,
|
|
libraries intended to be used by multiple applications should generally
|
|
not use virtual registers to avoid conflicts with the application's use of
|
|
the registers.
|
|
|
|
\subsection{24-, 40-, 48-, and 56-bit integer values (8.9.3)}
|
|
|
|
Support for storing and extracting 24-, 40-, 48-, and 56-bit integers
|
|
to and from records, bytevectors, and foreign types (ftypes) has been
|
|
added.
|
|
For records and ftypes, this is accomplished by declaring a field
|
|
to be of type
|
|
\scheme{integer-24}, \scheme{unsigned-24},
|
|
\scheme{integer-40}, \scheme{unsigned-40},
|
|
\scheme{integer-48}, \scheme{unsigned-48},
|
|
\scheme{integer-56}, or \scheme{unsigned-56}.
|
|
For bytevectors, this is accomplished via the following new
|
|
primitives:
|
|
|
|
\schemedisplay
|
|
bytevector-24-ref
|
|
bytevector-24-set!
|
|
bytevector-40-ref
|
|
bytevector-40-set!
|
|
bytevector-48-ref
|
|
bytevector-48-set!
|
|
bytevector-56-ref
|
|
bytevector-56-set!
|
|
\endschemedisplay
|
|
|
|
Similarly, support has been added for sending and receiving
|
|
24-, 40-, 48-, and 56-bit integers to and from foreign code via
|
|
\scheme{foreign-procedure} and \scheme{foreign-callable}.
|
|
Arguments and return values of type \scheme{integer-24} and
|
|
\scheme{unsigned-24} are passed as 32-bit quantities, while
|
|
those of type \scheme{integer-40}, \scheme{unsigned-40},
|
|
\scheme{integer-48}, \scheme{unsigned-48}, \scheme{integer-56},
|
|
and \scheme{unsigned-56} are passed as 64-bit quantities.
|
|
|
|
For unpacked ftypes, a 48-bit (6-byte) quantity is aligned
|
|
on an even two-byte boundary, while a
|
|
24-bit (3-byte), 40-bit (5-byte), or 56-bit (7-byte) quantity
|
|
is aligned on an arbitrary byte boundary.
|
|
|
|
\subsection{New \protect\scheme{pariah} expression (8.9.3)}
|
|
|
|
A \scheme{pariah} expression:
|
|
|
|
\schemedisplay
|
|
(pariah \var{expr} \var{expr} \dots)
|
|
\endschemedisplay
|
|
|
|
is syntactically similar and semantically equivalent to a \scheme{begin}
|
|
expression but tells the compiler that the expressions within are
|
|
relatively unlikely to be executed.
|
|
This information is currently used by the compiler for prioritizing
|
|
allocation of registers to variables and for putting pariah code
|
|
out-of-line in an attempt to reduce instruction cache misses for the
|
|
remaining code.
|
|
|
|
A \scheme{pariah} form is generally most usefully wrapped around the
|
|
consequent or alternative of an \scheme{if} expression to identify which
|
|
is the less likely path.
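
For instance, in the hypothetical procedure below the alternative is marked as the unlikely path:

\schemedisplay
(define (checked-car x)
  (if (pair? x)
      (car x)
      (pariah                      ; rarely executed
        (display "checked-car: not a pair")
        (newline)
        #f)))

(checked-car '(a b)) ;=> a
\endschemedisplay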
|
|
|
|
The compiler implicitly treats as pariah code any code that leads
|
|
up to an unconditional call to \scheme{raise}, \scheme{error},
|
|
\scheme{errorf}, \scheme{assertion-violation}, etc., so it is not
|
|
necessary to wrap a \scheme{pariah} around such a call.
|
|
|
|
At some point, there will likely be an option for gathering similar
|
|
information automatically via profiling.
|
|
In the meantime, we are interested in feedback about whether the
|
|
mechanism is beneficial and whether the benefit of using the
|
|
\scheme{pariah} form outweighs the programming overhead.
|
|
|
|
\subsection{Improved automatic library recompilation (8.9.2)}
|
|
|
|
Local imports within a library now trigger automatic recompilation
|
|
of the library when the imported library has been recompiled or needs
|
|
to be recompiled, in the same manner as imports listed directly in the
|
|
importing library's \scheme{library} form.
|
|
Changes in include files also trigger automatic recompilation.
|
|
|
|
(Automatic recompilation of a library is enabled when an import of
|
|
the library, e.g., in another library or in a top-level program, is
|
|
compiled and the parameter \scheme{compile-imported-libraries} is set
|
|
to a true value.)
|
|
|
|
\subsection{Redundant profile information (8.9.2)}
|
|
|
|
Profiling information is no longer produced for constants and variable
|
|
references where the information is likely to be redundant.
|
|
It is still produced in contexts where the counts are likely to differ
|
|
from those of the enclosing form, e.g., where a constant or variable
|
|
reference occurs in the consequent or alternative of an \scheme{if}
|
|
expression.
|
|
This change brings the profiling information largely in sync with
|
|
Version~8.4.1 and earlier, though Version~8.9.2 retains source information
|
|
in a few cases where it is inappropriately discarded by Version~8.4.1's
|
|
compiler, and Version~8.9.2 discards source information in a few cases
|
|
where the code has been optimized away.
|
|
|
|
\subsection{New \protect\scheme{compile-to-port} procedure (8.9.2)}
|
|
|
|
The procedure \scheme{compile-to-port} is like \scheme{compile-port}
|
|
but, instead of taking an input port from which it reads expressions
|
|
to be compiled, takes a list of expressions to be compiled.
|
|
As with \scheme{compile-port}, the second argument must be a binary
|
|
output port.
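
A minimal sketch, writing the compiled code to a hypothetical file \scheme{hello.so} and then loading it:

\schemedisplay
(define loaded? #f)

(call-with-port
  (open-file-output-port "hello.so" (file-options no-fail))
  (lambda (op)
    ;; first argument: a list of expressions; second: a binary output port
    (compile-to-port '((set! loaded? #t)) op)))

(load "hello.so")
loaded? ;=> #t
\endschemedisplay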
|
|
|
|
\subsection{Debug levels (8.9.1)}
|
|
|
|
Newly introduced debug levels control the amount of debugging support
|
|
embedded in the code generated by the compiler.
|
|
The current debug level is controlled by the parameter
|
|
\scheme{debug-level} and must be set when the compiler is run to have
|
|
any effect on the generated code.
|
|
Valid debug levels are~0, 1, 2, and~3, and the default is~1.
|
|
At present, the only difference between debug levels is whether calls to
|
|
certain error-producing routines, like \scheme{error}, whether explicit
|
|
or as the result of an implicit run-time check (such as the pair check
|
|
in \scheme{car}), are treated as tail calls even when not in tail position.
|
|
At debug levels 0 and 1, they are treated as tail calls, and at debug
|
|
levels 2 and 3, they are treated as nontail calls.
|
|
Treating them as tail calls is more efficient, but treating them as
|
|
nontail calls leaves more information on the stack, which affects what
|
|
can be shown by the inspector.
|
|
|
|
For example, assume \scheme{f} is defined as follows:
|
|
|
|
\schemedisplay
|
|
(define f
  (lambda (x)
    (unless (pair? x) (error #f "oops"))
    (car x)))
|
|
\endschemedisplay
|
|
|
|
and is called with a non-pair argument, e.g.:
|
|
|
|
\schemedisplay
|
|
(f 3)
|
|
\endschemedisplay
|
|
|
|
If the debug level is 2 or more at the time the definition is compiled,
|
|
the call to \scheme{f} will still be on the stack when the exception
|
|
is raised by \scheme{error} and will thus be visible to the inspector:
|
|
|
|
\schemedisplay
|
|
> (f 3)
Exception: oops
Type (debug) to enter the debugger.
> (debug)
debug> i
#<continuation in f>                    : sf
  0: #<continuation in f>
  1: #<system continuation in new-cafe>
#<continuation in f>                    : s
  continuation:          #<system continuation in new-cafe>
  procedure code:        (lambda (x) (if (...) ...) (car x))
  call code:             (error #f "oops")
  frame and free variables:
  0. x:                  3
|
|
\endschemedisplay
|
|
|
|
On the other hand, if the debug level is 1 (the default) or 0 at the
|
|
time the definition of \scheme{f} is compiled, the call to \scheme{f}
|
|
will no longer be on the stack:
|
|
|
|
\schemedisplay
|
|
> (f 3)
Exception: oops
Type (debug) to enter the debugger.
> (debug)
debug> i
#<system continuation in new-cafe>      : sf
  0: #<system continuation in new-cafe>
|
|
\endschemedisplay
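
Because the debug level matters only while code is being compiled, one
way to obtain the first behavior without changing the level globally is
to bind \scheme{debug-level} around the compilation itself.
The sketch below assumes the procedure is compiled with \scheme{compile};
the same approach should apply to \scheme{compile-file}.

\schemedisplay
(define f
  (parameterize ([debug-level 2])
    ; debug-level is consulted while the lambda expression is compiled
    (compile
      '(lambda (x)
         (unless (pair? x) (error #f "oops"))
         (car x)))))

; outside the parameterize form the previous level is back in effect
(debug-level)
\endschemedisplay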
|
|
|
|
\subsection{Cost centers (8.9.1)}
|
|
|
|
Cost centers are used to track the bytes allocated, instructions executed,
|
|
and/or cpu time elapsed while evaluating selected sections of code.
|
|
Cost centers are created via the procedure \scheme{make-cost-center}, and
|
|
costs are tracked via the procedure \scheme{with-cost-center}.
|
|
|
|
Allocation and instruction counts are tracked only for code instrumented
|
|
for that purpose.
|
|
This instrumentation is controlled by the \scheme{generate-allocation-counts}
|
|
and \scheme{generate-instruction-counts} parameters.
|
|
Instrumentation is disabled by default.
|
|
Built-in procedures are not instrumented, nor is interpreted code or
|
|
non-Scheme code.
|
|
Elapsed time is tracked only when the optional \scheme{timed?} argument to
|
|
\scheme{with-cost-center} is provided and is not false.
|
|
|
|
The \scheme{with-cost-center} procedure accurately tracks costs, subject
|
|
to the caveats above, even when reentered with the same cost center, used
|
|
simultaneously in multiple threads, and exited or reentered one or more
|
|
times via continuation invocation.
|
|
|
|
\textbf{thread parameter:} \scheme{generate-allocation-counts}
|
|
|
|
When this parameter has a true value, the compiler inserts a short sequence of
|
|
instructions at each allocation point in generated code to track the amount of
|
|
allocation that occurs.
|
|
This parameter is initially false.
|
|
|
|
\textbf{thread parameter:} \scheme{generate-instruction-counts}
|
|
|
|
When this parameter has a true value, the compiler inserts a short
|
|
sequence of instructions in each block of generated code to track the
|
|
number of instructions executed by that block.
|
|
This parameter is initially false.
|
|
|
|
\textbf{procedure:} \scheme{(make-cost-center)}
|
|
|
|
Creates a new \scheme{cost-center} object with all of its recorded costs
|
|
set to zero.
|
|
|
|
\textbf{procedure:} \scheme{(cost-center? \var{obj})}
|
|
|
|
Returns \scheme{#t} if \var{obj} is a \scheme{cost-center} object, otherwise
|
|
returns \scheme{#f}.
|
|
|
|
\textbf{procedure:} \scheme{(with-cost-center \var{cost-center} \var{thunk})}\\
|
|
\textbf{procedure:} \scheme{(with-cost-center \var{timed?} \var{cost-center} \var{thunk})}
|
|
|
|
This procedure invokes \var{thunk} without arguments and returns its
|
|
values.
|
|
It also tracks, dynamically, the bytes allocated, instructions executed,
|
|
and cpu time elapsed while evaluating the invocation of \var{thunk} and
|
|
adds the tracked costs to the cost center's running record of these costs.
|
|
|
|
Allocation counts are tracked only for code compiled with the parameter
|
|
\scheme{generate-allocation-counts} set to true, and
|
|
instruction counts are tracked only for code compiled with
|
|
\scheme{generate-instruction-counts} set to true.
|
|
Cpu time is tracked only if \var{timed?} is provided and not false and
|
|
includes cpu time spent in instrumented, uninstrumented, and non-Scheme
|
|
code.
|
|
|
|
\textbf{procedure:} \scheme{(cost-center-instruction-count \var{cost-center})}
|
|
|
|
This procedure returns the count of instructions executed, as recorded by
|
|
\var{cost-center}.
|
|
|
|
\textbf{procedure:} \scheme{(cost-center-allocation-count \var{cost-center})}
|
|
|
|
This procedure returns the bytes allocated recorded by \var{cost-center}.
|
|
|
|
\textbf{procedure:} \scheme{(cost-center-time \var{cost-center})}
|
|
|
|
This procedure returns the cpu time recorded by \var{cost-center}.
|
|
|
|
\textbf{procedure:} \scheme{(reset-cost-center! \var{cost-center})}
|
|
|
|
This procedure resets the costs recorded by \var{cost-center} to zero.
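
The following sketch shows how the pieces described above might be
combined.
The procedure \scheme{make-countdown} and the variables \scheme{cc} and
\scheme{result} are hypothetical; the procedure is compiled with both
instrumentation parameters enabled so that its allocation and instruction
counts can be tracked.
The actual counts depend on the machine type and are not shown.

\schemedisplay
(define cc (make-cost-center))

; compile the code to be measured with both kinds of instrumentation enabled
(define make-countdown
  (parameterize ([generate-allocation-counts #t]
                 [generate-instruction-counts #t])
    (compile
      '(lambda (n)
         (let loop ([i 0] [ls '()])
           (if (= i n) ls (loop (+ i 1) (cons i ls))))))))

; pass a true timed? argument so that cpu time is tracked as well
(define result
  (with-cost-center #t cc (lambda () (make-countdown 1000))))

(cost-center-allocation-count cc)   ; bytes allocated by instrumented code
(cost-center-instruction-count cc)  ; instructions executed by instrumented code
(cost-center-time cc)               ; cpu time recorded

(reset-cost-center! cc)             ; recorded costs are reset to zero
\endschemedisplay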
|
|
|
|
\subsection{Experimental access to hardware performance counters (8.9.1)}
|
|
|
|
Two system primitives, \scheme{#%$read-time-stamp-counter} and
|
|
\scheme{#%$read-performance-monitoring-counter}, provide access to the
|
|
x86 and x86\_64 hardware time-stamp counter register and to the
|
|
model-specific performance monitoring registers.
|
|
|
|
These primitives rely on instructions that might be restricted to run only in
|
|
kernel mode, depending on kernel configuration.
|
|
The performance monitoring counters must also be configured to enable
|
|
monitoring and to specify which event to monitor.
|
|
This can be configured only by instructions executed in kernel mode.
|
|
|
|
\textbf{procedure:} \scheme{(#%$read-time-stamp-counter)}
|
|
|
|
This procedure returns the current value of the time-stamp counter for
|
|
the processor core executing this code.
|
|
A general protection fault, which manifests as an invalid memory
|
|
reference exception, results if this operation is not permitted by
|
|
the operating system.
|
|
|
|
Since multiple processes might run on the same core between reads of
|
|
the time-stamp counter, the counter does not necessarily reflect time
|
|
spent only in the current process.
|
|
Also, on machines with multiple cores, the executing process might be
|
|
swapped to a different core with a different time-stamp counter.
|
|
|
|
\textbf{procedure:} \scheme{(#%$read-performance-monitoring-counter \var{counter})}
|
|
|
|
This procedure returns the current value of the model-specific
|
|
performance monitoring register specified by \var{counter}.
|
|
\var{counter} must be a fixnum and should specify a valid performance
|
|
monitoring register.
|
|
Allowable values depend on the processor model.
|
|
A general protection fault, which manifests as an invalid memory
|
|
reference exception, results if this operation is not permitted by
|
|
the operating system or if the specified counter does not exist.
|
|
|
|
In order to get meaningful results, the performance monitoring registers
|
|
must be enabled, and the event to be monitored must be configured by
|
|
the performance monitoring control register.
|
|
This configuration can be done only by code run in kernel mode.
|
|
|
|
Since multiple processes might run on the same core between reads of
|
|
a performance monitoring register, the register does not necessarily reflect
|
|
only the activities of the current process.
|
|
Also, on machines with multiple cores, the executing process might be
|
|
swapped to a different core with its own set of performance monitoring
|
|
registers and possibly a different configuration for those registers.
|
|
|
|
\subsection{New inspector functionality (8.9.1)}
|
|
|
|
Within the interactive inspector, closure and frame variables can now
|
|
be set by name, and the forward (f) and back (b) commands can now be
|
|
used to move among the frames that comprise a continuation.
|
|
|
|
A new show-local (sl) command can be used to look at just the local
|
|
variables of a stack frame.
|
|
This contrasts with the show (s) command, which shows the free variables
|
|
of the frame's closure as well.
|
|
|
|
Errors occurring during inspection, such as attempts to assign immutable
|
|
variables, are handled more smoothly than in previous versions.
|
|
|
|
\subsection{Fasl support for records with non-ptr fields (8.4.1)}
|
|
|
|
The fasl writer and reader now support records with non-ptr fields,
|
|
e.g., integer-32, wchar, etc., allowing constant record instances with
|
|
such fields to appear in source code (or be introduced as constants
|
|
by macros) into code to be compiled via \scheme{compile-file},
|
|
\scheme{compile-library}, \scheme{compile-program},
|
|
\scheme{compile-script}, or \scheme{compile-port}.
|
|
Ftype-pointer fields are not supported, since storing addresses
|
|
in fasl files does not generally make sense.
|
|
|
|
%-----------------------------------------------------------------------------
|
|
\section{Bug Fixes}\label{section:bugfixes}
|
|
|
|
\subsection{Clear-output bug (9.5.3)}
|
|
|
|
A bug has been fixed in which a call to \scheme{clear-output-port}
|
|
on a port could lead to unexpected behavior involving the port,
|
|
including loss of buffering or suppression of future output to the
|
|
port.
|
|
|
|
\subsection{Various argument type-error issues (9.5.3)}
|
|
|
|
A variety of primitive argument type-checking issues have been
|
|
fixed, including missing checks, misleading error messages,
|
|
and checks made later than appropriate, i.e., after the primitive
|
|
has already had side effects.
|
|
|
|
\subsection{\protect\scheme{__collect_safe}, x86\_64, and floating-point arguments or results (9.5.3)}
|
|
|
|
The \scheme{__collect_safe} mode for a foreign call or callable now
|
|
correctly preserves floating-point registers used for arguments or
|
|
results while activating or deactivating a thread on x86\_64.
|
|
|
|
\subsection{\protect\scheme{putenv} memory leak (9.5.3)}
|
|
|
|
\scheme{putenv} now calls the host system's \scheme{setenv} instead of
|
|
\scheme{putenv} on non-Windows hosts and avoids allocating memory that
|
|
is never freed, although \scheme{setenv} might do so.
|
|
|
|
\subsection{String ports from immutable strings (9.5.4)}
|
|
|
|
A bug that miscalculated the buffer size for
|
|
\scheme{open-string-input-port} given an immutable string has been
|
|
fixed.
|
|
|
|
\subsection{Multiplying $-2^{30}$ with itself on 64-bit platforms (9.5.3)}
|
|
|
|
A bug that produced the wrong sign when multiplying $-2^{30}$ with
|
|
itself on 64-bit platforms has been fixed.
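
For reference, the correct product is $2^{60}$, which is positive.
Assuming the usual 61-bit fixnum range on 64-bit platforms, this value is
one more than the largest fixnum, so the multiplication sits right at the
boundary between fixnums and bignums.
A quick check, with \scheme{n} as an illustrative name:

\schemedisplay
(define n (- (expt 2 30)))

(positive? (* n n))        ; the product of two negative numbers is positive
(= (* n n) (expt 2 60))    ; and it equals 2 to the 60th power
\endschemedisplay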
|
|
|
|
\subsection{Compiler dropping effects from record-accessor calls (9.5.3)}
|
|
|
|
A bug that could cause the source optimizer to drop effects within
|
|
the argument of a record-accessor call has been fixed.
|
|
|
|
\subsection{Welcome text in macOS package file (9.5.2)}
|
|
|
|
The welcome text and copyright year in the macOS package file were
|
|
corrected.
|
|
|
|
\subsection{Fasl representation change for recursive ftypes (9.5.2)}
|
|
|
|
A bug in the reading of mutually recursive ftype definitions from
|
|
compiled files has been fixed.
|
|
The bug was triggered by recursive ftype definitions in which one
|
|
of the mutually recursive ftypes is a subtype of another, as in:
|
|
|
|
\schemedisplay
|
|
(define-ftype
  [A (* B)]
  [B (struct [h A])])
|
|
\endschemedisplay
|
|
|
|
It manifested in the fasl reader raising bogus ``incompatible record
type'' exceptions when two or more references to one of the ftypes
occur in separate compiled files or in separate top-level forms
|
|
of a file compiled via \scheme{compile-file}.
|
|
The bug could also have affected other record-type descriptors with
|
|
cycles involving parent rtds and ``extra'' fields as well as fasl
|
|
output created via \scheme{fasl-write}.
|
|
|
|
\subsection{Unbound object resulting from libraries combined with \protect\scheme{compile-whole-library} (9.5.1)}
|
|
|
|
A bug in \scheme{compile-whole-library} that allowed the invoke code for a
|
|
library included in the combined library body to be executed without first
|
|
invoking its binary library dependencies has been fixed.
|
|
This bug could arise when a member of a combined library was invoked without
|
|
invoking the requirements of the other libraries it was combined with. For
|
|
instance, consider the case where libraries \scheme{(A)} and \scheme{(B)} are
|
|
combined and \scheme{(B)} has dependencies on library \scheme{(A)} and binary
|
|
library \scheme{(C)}.
|
|
One possible sort order of this graph is \scheme{(C)}, \scheme{(A)},
|
|
\scheme{(B)}, where the invoke code for \scheme{(A)} and \scheme{(B)} are
|
|
combined into a single block of invoke code. If library \scheme{(A)} is
|
|
invoked first, it will implicitly cause the invoke code for \scheme{(B)} to be
|
|
invoked without invoking the code for \scheme{(C)}.
|
|
We address this by adding explicit dependencies between \scheme{(A)} and all
the binary libraries that precede it, and between each of the other libraries
clustered with \scheme{(A)} and \scheme{(A)} itself, so that no matter which
library in the cluster is invoked first, \scheme{(A)} will be invoked, causing
all binary libraries that precede \scheme{(A)} to be invoked.
|
|
It is also possible for a similar problem to exist between clusters, where
|
|
invoking a later cluster may invoke an earlier cluster without invoking the
|
|
binary dependencies for the earlier cluster.
|
|
We address this issue by adding an invoke requirement between each cluster and
|
|
the first library in the cluster that precedes it.
|
|
These extended invoke requirements are also added to the import requirements
|
|
for each library, and the dependency graph is enhanced with import requirement
|
|
links to ensure these are taken into account during the topological sort.
|
|
|
|
|
|
\subsection{Automatic recompilation and missing include files (9.5.1)}
|
|
|
|
A bug in automatic recompilation involving missing include files
|
|
has been fixed.
|
|
The bug caused automatic recompilation to fail, often with an
|
|
exception in \scheme{file-modification-time}, when a file specified
|
|
by an absolute pathname or pathname starting with "./" or "../" was
|
|
included via \scheme{include} during a previous compilation run and
|
|
is no longer present.
|
|
|
|
\subsection{Invalid memory reference instantiating \protect\scheme{foreign-callable} code object (9.5.1)}
|
|
|
|
A bug that caused evaluation of a \scheme{foreign-callable} expression in
|
|
code that has been collected into the static generation (e.g., when the
|
|
\scheme{foreign-callable} form appears in code compiled to a boot file)
|
|
to result in an invalid memory reference has been fixed.
|
|
|
|
\subsection{Invalid constant-folding of some calls to \protect\scheme{apply} (9.5.1)}
|
|
|
|
A bug in the source optimizer (cp0) allowed constant-folding of some calls to
|
|
\scheme{apply} where the last argument is not known to be a list. For example,
|
|
cp0 incorrectly reduced
|
|
\scheme{(apply zero? 0)} to \scheme{#t}
|
|
and reduced
|
|
\scheme{(lambda (x) (apply box? x) x)} to \scheme{(lambda (x) x)},
|
|
but now preserves these calls to \scheme{apply} so that they may raise an
|
|
exception.
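
The preserved behavior can be observed with a small helper.
The name \scheme{raises?} is hypothetical; it simply reports whether
calling a thunk raises an exception.
With the fix in place, the first two calls below are expected to report an
exception, while the third, whose last argument is a proper list, is not.

\schemedisplay
; returns #t if calling thunk raises an exception, #f otherwise
(define raises?
  (lambda (thunk)
    (guard (c [#t #t])
      (thunk)
      #f)))

(raises? (lambda () (apply zero? 0)))
(raises? (lambda () ((lambda (x) (apply box? x) x) 0)))
(raises? (lambda () (apply zero? '(0))))
\endschemedisplay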
|
|
|
|
\subsection{Disk-relative filenames in Windows (9.5.1)}
|
|
|
|
In Windows, filenames that start with a disk designator but no
|
|
directory separator are now treated as relative paths. For example,
|
|
\scheme{(path-absolute? "C:")} now returns \scheme{#f}, and
|
|
\scheme{(directory-list "C:")} now lists the files in the current
|
|
directory on disk C instead of the files in the root directory of disk
|
|
C.
|
|
|
|
In addition, \scheme{file-access-time}, \scheme{file-change-time},
|
|
\scheme{file-directory?}, \scheme{file-exists?},
|
|
\scheme{file-modification-time}, and \scheme{get-mode} no longer
|
|
remove trailing directory separators on Windows.
|
|
|
|
\subsection{Globally unique names on non-Windows systems no longer contain the IP address (9.5.1)}
|
|
|
|
The globally unique names of gensyms no longer contain the IP address
|
|
on non-Windows systems. Windows systems already used a universally
|
|
unique identifier.
|
|
|
|
\subsection{Invalid memory reference from \protect\scheme{fxvector} calls (9.5)}
|
|
|
|
A compiler bug that could result in an invalid memory reference or
|
|
some other unpleasant behavior for calls to \scheme{fxvector} in
|
|
which the nested subexpression to compute the new value to be stored
|
|
is nontrivial has been fixed.
|
|
This bug could also affect calls to \scheme{vector-set-fixnum!} and possibly
|
|
other primitive operations.
|
|
|
|
\subsection{Incorrect return code when \protect\scheme{exit} is called with multiple arguments (9.5)}
|
|
|
|
A bug in the implementation of the default exit handler with multiple
|
|
values has been fixed.
|
|
|
|
\subsection{Boot files containing compiled library code fail to load (9.5)}
|
|
|
|
Compiled library code may now appear within fasl objects loaded during
|
|
the boot process, provided that they are appended to the end of the base boot
|
|
file or appear within a later boot file.
|
|
|
|
\subsection{Misleading cyclic dependency error (9.5)}
|
|
|
|
The library system no longer reports a cyclic dependency error
|
|
during the second and subsequent attempts to visit or invoke a
|
|
library after the first attempt fails for some reason other than
|
|
an actual cyclic dependency.
|
|
The fix also allows a library to be visited or invoked successfully
|
|
on the second or subsequent attempt if the visit or invoke failed
|
|
for a transient reason, such as a missing or incorrect version in
|
|
an imported library.
|
|
|
|
\subsection{Incomplete handling of import specs within standalone export forms (9.5)}
|
|
|
|
A bug that limited the \scheme{(import \var{import-spec} \dots)} form within a
|
|
standalone \scheme{export} form to \scheme{(import \var{import-spec})} has been
|
|
fixed.
|
|
|
|
\subsection{Permission denied after deleting files or directories in Windows (9.5)}
|
|
|
|
In Windows, deleting a file or directory briefly leaves the file or
|
|
directory in a state where a subsequent create operation fails with
|
|
permission denied. This race condition is now mitigated.
|
|
[This bug applies to all versions up to 9.5 on Windows 7 and later.]
|
|
|
|
\subsection{Incorrect handling of offset in
|
|
\protect\scheme{date->time-utc} on Windows (9.5)}
|
|
|
|
A bug when \scheme{date->time-utc} is called on Windows with a
|
|
date-zone-offset smaller than the system's time-zone offset has been
|
|
fixed.
|
|
[This bug dated back to Version 9.5.]
|
|
|
|
\subsection{Compiler mishandling of fx /carry operations (9.5)}
|
|
|
|
A bug in the source optimizer that caused an internal compiler error when
|
|
folding certain calls to \scheme{fx+/carry}, \scheme{fx-/carry}, and
|
|
\scheme{fx*/carry} has been fixed.
|
|
[This bug dated back to Version 9.1.]
|
|
|
|
\subsection{Compiler mishandling of nested \protect\scheme{call-with-values} calls (9.5)}
|
|
|
|
A bug that caused an internal compiler error when optimizing certain
|
|
nested calls to \scheme{call-with-values} has been fixed.
|
|
[This bug dated back to Version 8.9.1.]
|
|
|
|
\subsection{Incorrect expansion of \protect\scheme{define-values} of no values (9.5)}
|
|
|
|
A bug in the expansion of \scheme{define-values} that caused it to produce
|
|
a non-definition form when used to define no values has been fixed.
|
|
[This bug dated back to at least Version 8.4.]
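
A minimal illustration of the affected form, using the hypothetical
procedure \scheme{no-values-demo}: the \scheme{define-values} form below
binds no variables but appears among the definitions of a body, so it
needs to expand into a definition.

\schemedisplay
(define no-values-demo
  (lambda ()
    ; defines no variables, but must still expand into a definition
    (define-values () (values))
    (define y 2)
    y))

(no-values-demo)
\endschemedisplay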
|
|
|
|
\subsection{Optimizer dropping \protect\scheme{pariah} forms (9.5)}
|
|
|
|
A bug in the source optimizer that caused pariah forms to be ignored
|
|
has been fixed.
|
|
[This bug dated back to at least Version 9.3.1.]
|
|
|
|
\subsection{Invalid memory references involving complex numbers (9.5)}
|
|
|
|
A bug on 64-bit platforms that occasionally caused invalid memory
|
|
references when operating on inexact complex numbers or the imaginary parts
|
|
of inexact complex numbers has been fixed.
|
|
[This bug dated back to Version 8.9.1.]
|
|
|
|
\subsection{Overflow detection for left-shift operations on fixnums (9.5)}
|
|
|
|
A bug that caused \scheme{fxsll}, \scheme{fxarithmetic-shift-left},
|
|
and \scheme{fxarithmetic-shift} to fail to detect overflow in certain
|
|
cases has been fixed.
|
|
[This bug dated back to Version 4.0.]
|
|
|
|
\subsection{Missing \protect\scheme{enum-set-indexer} argument check (9.5)}
|
|
|
|
A missing argument check that resulted in the procedure returned by \scheme{enum-set-indexer}
|
|
causing an invalid memory reference when passed a non-symbol argument has been fixed.
|
|
[This bug dated back to Version 7.5.]
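
The sketch below exercises an indexer with the three kinds of argument.
The name \scheme{index-of} is hypothetical.
With the argument check in place, the last call is expected to raise an
exception, which the \scheme{guard} form catches, rather than cause an
invalid memory reference.

\schemedisplay
(define index-of (enum-set-indexer (make-enumeration '(a b c))))

(index-of 'b)   ; a symbol in the universe yields its index
(index-of 'z)   ; a symbol outside the universe yields #f

; a non-symbol argument is the case covered by the new check
(guard (c [#t 'exception-raised])
  (index-of 3))
\endschemedisplay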
|
|
|
|
\subsection{Storage for inaccessible mutexes and conditions is reclaimed (9.5)}
|
|
|
|
The C heap storage for inaccessible mutexes and conditions is now reclaimed.
|
|
[This bug dated back to Version 6.5.]
|
|
|
|
\subsection{Missing guardian entries when a thread exits (9.5)}
|
|
|
|
A bug that caused guardian entries for a thread to be lost when a
|
|
thread exits has been fixed.
|
|
[This bug dated back to Version 6.5.]
|
|
|
|
\subsection{Incorrect code for certain nested \protect\scheme{if} patterns (9.5)}
|
|
|
|
A bug in the source optimizer that produced incorrect code for certain
|
|
nested \scheme{if} patterns has been fixed.
|
|
For example, the code generated for the following expression:
|
|
|
|
\schemedisplay
|
|
(if (if (if (if (zero? (a)) #f #t) (begin (b) #t) #f)
        (c)
        #f)
    (x)
    (y))
|
|
\endschemedisplay
|
|
|
|
inappropriately evaluated the subexpression \scheme{(b)} when the
|
|
subexpression \scheme{(a)} evaluates to 0 and not when \scheme{(a)}
|
|
evaluates to 1.
|
|
[This bug dated back to Version 9.0.]
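
The intended behavior can be checked with a small harness.
The procedure \scheme{run} and its local definitions of \scheme{a},
\scheme{b}, \scheme{c}, \scheme{x}, and \scheme{y} are hypothetical
stand-ins; \scheme{run} returns the value of the expression together
with a flag recording whether \scheme{(b)} was evaluated.

\schemedisplay
(define run
  (lambda (a-val)
    (let ([b-called? #f])
      (let ([a (lambda () a-val)]
            [b (lambda () (set! b-called? #t))]
            [c (lambda () #t)]
            [x (lambda () 'x)]
            [y (lambda () 'y)])
        (let ([result (if (if (if (if (zero? (a)) #f #t) (begin (b) #t) #f)
                              (c)
                              #f)
                          (x)
                          (y))])
          (list result b-called?))))))

(run 0)   ; (b) must not be evaluated: (y #f)
(run 1)   ; (b) must be evaluated: (x #t)
\endschemedisplay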
|
|
|
|
\subsection{Leaked or unexpected \protect\scheme{cpvalid-defer} form (9.5)}
|
|
|
|
A bug in the pass of the compiler that inserts valid checks for
|
|
\scheme{letrec} and \scheme{letrec*} bindings has been fixed.
|
|
The bug resulted in an internal compiler exception with a condition
|
|
message regarding a leaked or unexpected \scheme{cpvalid-defer} form.
|
|
[This bug dated back to Version 6.9c.]
|
|
|
|
\subsection{\protect\scheme{string->number} and reader numeric syntax issues (9.4)}
|
|
|
|
\scheme{string->number} and the reader previously treated all complex
|
|
numbers written in polar notation that Chez Scheme cannot represent
|
|
exactly as inexact, even with an explicit \scheme{#e} prefix.
|
|
For such numbers with the \scheme{#e} prefix, \scheme{string->number}
|
|
now returns \scheme{#f} and the reader now raises an exception with
|
|
condition type \scheme{&implementation-restriction}.
|
|
Both still return an inexact representation for such numbers written without
|
|
the \scheme{#e} prefix, even if R6RS requires an exact result, i.e.,
|
|
even if they have no decimal point, exponent, or mantissa width.
|
|
|
|
Ratios with an exponent, like \scheme{1/2e10}, are non-standard and
|
|
now cause the procedure \scheme{string->number} imported from
|
|
\scheme{(rnrs)} to return \scheme{#f}.
|
|
When the reader encounters a ratio followed by an exponent while in R6RS
|
|
mode (i.e., when reading a library or top-level program and not following
|
|
an \scheme{#!chezscheme}, or when following an explicit \scheme{#!r6rs}),
|
|
it raises an exception.
|
|
|
|
Positive or negative zero followed by a large exponent now properly
|
|
produces zero rather than an infinity, e.g., \scheme{0e3000} now produces
|
|
\scheme{0} rather than \scheme{+inf.0}.
|
|
|
|
A rounding bug converting some small ratios into floating point numbers,
|
|
when those numbers fall into the range of denormalized floats, has
|
|
been fixed.
|
|
This bug also affected the reading of and conversion of strings into
|
|
denormalized floating-point numbers.
|
|
[Some of these bugs dated back to Version 3.0.]
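
Two of the cases above can be tried directly at the REPL.
This is only a sketch: the particular strings are illustrative, and
according to the description above the second call is expected to return
\scheme{#f}.

\schemedisplay
; zero followed by a large exponent is a zero, not an infinity
(zero? (string->number "0e3000"))

; an exact prefix on a polar number that cannot be represented exactly
(string->number "#e1@1")
\endschemedisplay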
|
|
|
|
\subsection{\protect\scheme{date->time-utc} ignoring zone-offset field (9.4)}
|
|
|
|
\scheme{date->time-utc} has been fixed to properly take into account the
|
|
zone-offset field.
|
|
[This bug dated back to Version 8.0.]
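
As a worked example, midnight on January 2, 1970 in a zone one hour east
of UTC corresponds to 23:00 UTC on January 1, that is, 23 hours after the
epoch.
The variable \scheme{d} is an illustrative name.

\schemedisplay
; arguments: nsec sec min hour day mon year zone-offset (in seconds)
(define d (make-date 0 0 0 0 2 1 1970 3600))

; with the zone-offset field taken into account, this is 23 hours of seconds
(time-second (date->time-utc d))
(= (time-second (date->time-utc d)) (* 23 3600))
\endschemedisplay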
|
|
|
|
\subsection{\protect\scheme{wchar} and \protect\scheme{wchar_t} record field types fail to inline in Windows (9.4)}
|
|
|
|
On Windows, the source optimizer has been fixed to handle \scheme{wchar} and
|
|
\scheme{wchar_t} record field types.
|
|
|
|
\subsection{Path-related procedures cause invalid memory reference with non-string arguments in Windows (9.4)}
|
|
|
|
On Windows, the path-related procedures now raise an appropriate exception when the path argument is not a string.
|
|
|
|
\subsection{Mutex acquisition bug (9.4)}
|
|
|
|
A bug in the handling of mutexes has been fixed.
|
|
The bug typically presented as a spurious ``recursively locked'' exception.
|
|
|
|
\subsection{\protect\scheme{dynamic-wind} mistakenly enabling interrupts (9.3.3)}
|
|
|
|
A bug causing \scheme{dynamic-wind} to unconditionally enable
|
|
interrupts upon a nonlocal exit from the body thunk has been fixed.
|
|
Interrupts are now properly enabled only when the optional
|
|
\var{critical?} argument is supplied and is not false.
|
|
[This bug dated back to Version 6.9c.]
|
|
|
|
\subsection{Incorrect optimization of various primitives (9.3.1)}
|
|
|
|
Mistakes in our primitive database that caused the source optimizer
|
|
to treat \scheme{append}, \scheme{append!}, \scheme{list*},
|
|
\scheme{cons*}, and \scheme{record-type-parent} as always returning
|
|
true values have been fixed, along with mistakes that caused the
|
|
source optimizer to treat \scheme{null-environment},
|
|
\scheme{source-object-bfp}, \scheme{source-object-efp}, and
|
|
\scheme{source-object-sfd} as not requiring argument checks.
|
|
[This bug dated back to Version 6.0.]
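
Each of the primitives in the first group can in fact return \scheme{#f},
which is why treating their results as always true was unsound.
In the sketch below, the record type \scheme{point} is hypothetical, and
every \scheme{if} expression should take its alternative.

\schemedisplay
; each of these calls returns #f, so each if takes its alternative
(if (append '() #f) 'true 'false)
(if (cons* #f) 'true 'false)
(if (list* #f) 'true 'false)

; a record type without a parent has #f as its parent
(define-record-type point (fields x y))
(if (record-type-parent (record-type-descriptor point)) 'true 'false)
\endschemedisplay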
|
|
|
|
\subsection{Increased allocation ceiling under 32-bit Windows (9.3.1)}
|
|
|
|
We have worked around a limitation in the number of distinct allocation
|
|
areas the Windows VirtualAlloc function permits to be allocated by
|
|
allocating fewer, larger chunks of memory, effectively increasing the
|
|
maximum size of the heap to the full amount permitted by the operating
|
|
system.
|
|
|
|
\subsection{Syntax errors for \protect\scheme{let} and \protect\scheme{let*} (9.2.1)}
|
|
|
|
The expander now handles \scheme{let} and \scheme{let*} in such a
|
|
way that certain syntax errors previously reported as syntax errors
|
|
in \scheme{lambda} are now reported properly as syntax errors in
|
|
\scheme{let} or \scheme{let*}. This includes duplicate identifier
|
|
errors for \scheme{let} and errors involving internal definitions
|
|
for both \scheme{let} and \scheme{let*}.
|
|
|
|
\subsection{Dropped \protect\scheme{profile-dump-html} calls (9.0)}
|
|
|
|
A bug that caused effect-context calls to \scheme{profile-dump-html}
|
|
to be dropped at optimize-level 3 has been fixed.
|
|
[This bug dated back to Version 7.5.]
|
|
|
|
\subsection{Proper treatment of imported meta bindings (8.9.3)}
|
|
|
|
A deficiency in the handling of library dependencies that prevented meta
|
|
definitions exported in one library from being used reliably by a macro
|
|
defined in another library has been fixed.
|
|
Handling imported meta bindings involves tracking
|
|
visit-visit-requirements, which for a library \scheme{(A)} is the set of
|
|
libraries that must be visited (rather than invoked) when \scheme{(A)}
|
|
is visited.
|
|
An attempt to assign a meta variable imported from a library now results
|
|
in a syntax error.
|
|
[This bug dated back to Version 7.9.1.]
|
|
|
|
\subsection{Reexport of identifiers with properties (8.9.3)}
|
|
|
|
A bug that prevented an identifier given a property via
|
|
\scheme{define-property} from being exported from a library \scheme{(A)},
|
|
imported into and reexported from a second library \scheme{(B)}, and
|
|
imported from both \scheme{(A)} and \scheme{(B)} into and reexported
|
|
from a third library \scheme{(C)} has been fixed.
|
|
[This bug dated back to Version 8.1.]
|
|
|
|
\subsection{Cyclic record-type descriptors (8.4.1)}
|
|
|
|
The fasl (fast load) format used for compiled files now supports cyclic
|
|
record-type descriptors (RTDs), which are produced for recursive ftype
|
|
definitions.
|
|
Previously, compiling a file containing a recursive ftype definition
|
|
and subsequently loading the file resulted in corruption of the ftype
|
|
descriptor used to typecheck ftype pointers, potentially leading to
|
|
incorrect behavior or invalid memory references.
|
|
[This bug dated back to Version 8.2.]
|
|
|
|
\subsection{Invalid folding of record accesses (8.4.1)}
|
|
|
|
A bug that caused the optimizer to fold calls to record accessors applied
|
|
to a constant value of the wrong type, sometimes resulting in compile-time
|
|
invalid memory references or other compile-time errors, has been fixed.
|
|
[This bug dated back to Version 8.4.]
|
|
|
|
\subsection{4GB+ allocation for Windows x86\_64 (8.4.1)}
|
|
|
|
A bug that prevented objects larger than 4GB from being created under Windows
|
|
x86\_64 has been fixed.
|
|
[This bug dated back to Version 8.4.]
|
|
|
|
%-----------------------------------------------------------------------------
|
|
\section{Performance Enhancements}\label{section:performance}
|
|
|
|
\subsection{Faster object-file loading (9.5.3)}\label{sec:faster-object-file-loading}
|
|
|
|
Visiting an object file (to obtain only compile-time information and
|
|
code) and revisiting an object file (to obtain only run-time information
|
|
and code) is now faster, because revisions to the fasl format, fasl
|
|
writer, and fasl reader allow run-time code to be seeked past when
|
|
visiting and compile-time code to be seeked past when revisiting.
|
|
For compressed object files (the default), seeking still requires
|
|
reading all of the data, but the cost of parsing the fasl format and
|
|
building objects in the skipped portions is avoided, as are certain
|
|
side effects, such as associating record type descriptors with their
|
|
uids.
|
|
|
|
Similarly, recompile information is now placed at the front of each
|
|
object file where it can be loaded separately from
|
|
the remainder of an object file without even seeking past the other
|
|
portions of the file.
|
|
Recompile information is used by \scheme{import} (when
|
|
\scheme{compile-imported-libraries} is \scheme{#t}) and by maybe-compile
|
|
routines such as \scheme{maybe-compile-program} to help determine
|
|
whether recompilation is necessary.
|
|
|
|
Importing a library from an object file now causes the object file
|
|
to be visited rather than fully loaded. (Libraries were already
|
|
just revisited when required for their run-time code, e.g., when
|
|
used from a top-level program.)
|
|
|
|
Together these changes can significantly reduce compile-time and
|
|
run-time overhead, particularly in applications that make use of
|
|
a large number of libraries.
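
The distinction between visiting and revisiting can be seen with a small
object file.
This is a sketch under stated assumptions: the file names \scheme{demo.ss}
and \scheme{demo.so}, the macro \scheme{twice}, and the variable
\scheme{counter} are hypothetical, and the forms are meant to be entered
one at a time at the REPL.

\schemedisplay
; write a small source file containing one macro and one variable definition
(with-output-to-file "demo.ss"
  (lambda ()
    (write '(define-syntax twice
              (syntax-rules () [(_ e) (begin e e)])))
    (newline)
    (write '(define counter 0))
    (newline))
  'replace)

(compile-file "demo.ss" "demo.so")

; visit loads only the compile-time portion (the macro twice)
(visit "demo.so")

; revisit loads only the run-time portion (the variable counter)
(revisit "demo.so")

(twice (set! counter (+ counter 1)))
counter
\endschemedisplay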
|
|
|
|
\subsection{Faster \protect\scheme{profile-release-counters} (9.5.3)}
|
|
|
|
\scheme{profile-release-counters} is now generation-friendly, meaning
|
|
it does not incur any overhead for code objects in generations that
|
|
have not been collected since the last call to \scheme{profile-release-counters}.
|
|
Also, it no longer allocates memory when counters are released.
|
|
|
|
\subsection{Reduced cost for obtaining profile counts (9.5.3)}
|
|
|
|
The cost of obtaining profile counts via \scheme{profile-dump} and
|
|
other mechanisms has been reduced significantly.
|
|
|
|
\subsection{Better code for \protect\scheme{bytevector} (9.5.1)}
|
|
|
|
The compiler now generates better inline code for the \scheme{bytevector}
|
|
procedure.
|
|
Instead of one byte memory write for each argument, it writes up
|
|
to four (32-bit machines) or eight (64-bit machines) bytes at a
|
|
time, which almost always results in fewer instructions and fewer
|
|
writes.
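
For instance, a call like the one below is the kind that benefits: by the
description above, its eight bytes can be stored with two writes on a
32-bit machine or one on a 64-bit machine, rather than eight single-byte
writes.
This is only an illustration of such a call; the exact instruction sequence
depends on the target machine.

\schemedisplay
(define bv (bytevector 1 2 3 4 5 6 7 8))
(bytevector-length bv)   ; 8
(bytevector-u8-ref bv 3) ; 4
\endschemedisplay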
|
|
|
|
\subsection{\protect\scheme{vector-for-each} and \protect\scheme{string-for-each} improvement (9.5.1)}
|
|
|
|
The last call to the procedure passed to \scheme{vector-for-each}
|
|
or \scheme{string-for-each} is now reliably implemented as a tail
|
|
call, as was already the case for \scheme{for-each}.
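
A small sketch of why this matters: in the loop below, each step re-enters
the loop from the last (and only) call made by \scheme{vector-for-each}.
Because that call is a tail call, no continuation frame should accumulate
per step.
The procedure name \scheme{count-down} is made up for this example.

\schemedisplay
;; count-down is a hypothetical helper used only for illustration
(define (count-down n)
  (if (fx= n 0)
      'done
      (vector-for-each
        (lambda (i) (count-down (fx- i 1)))
        (vector n))))

(count-down 1000000)
\endschemedisplay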
|
|
|
|
\subsection{Lambda commonization (9.5.1)}
|
|
|
|
After running the main source optimization pass (cp0), the
|
|
compiler optionally runs a \emph{commonization} pass, which
|
|
commonizes code for similar lambda expressions.
|
|
The parameter \scheme{commonization-level} controls whether the
|
|
commonization pass is run and, if so, how aggressive it is.
|
|
The parameter's value must be a nonnegative exact integer ranging
|
|
from 0 through 9. When the parameter is set to 0, the default,
|
|
commonization is not run. Otherwise, higher values result in more
|
|
commonization.
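
The following sketch shows one way to enable the pass for a single
compilation using \scheme{parameterize}.
The level 4 and the example procedure are arbitrary choices for
illustration; whether the two similar inner \scheme{lambda} expressions
are actually commonized is up to the compiler.

\schemedisplay
(commonization-level) ; 0 unless changed elsewhere

(define p
  (parameterize ([commonization-level 4])
    (compile
      '(lambda (a b)
         (list (map (lambda (x) (+ x 1)) a)
               (map (lambda (x) (+ x 2)) b))))))

(p '(1 2) '(10 20)) ; ((2 3) (12 22))
\endschemedisplay

Commonization affects only the generated code, so the result of the call
is the same at every level.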
|
|
|
|
\subsection{Improved compile times (9.5.1)}
|
|
|
|
Compile times are now lower, sometimes by an order of magnitude or
|
|
more, for procedures with thousands of parameters, local variables,
|
|
and compiler-introduced temporaries.
|
|
For such procedures, the register/frame allocator proactively spills
|
|
variables with large live ranges, cutting down on the size and cost
|
|
of building the conflict graph used to represent pairs of variables
|
|
that are live at the same time and therefore cannot share a location.
|
|
|
|
\subsection{Improved oblist management (9.3.3)}
|
|
|
|
As a result of improvements in the handling of the oblist (symbol table),
|
|
the storage for a symbol is often reclaimed more quickly after it
|
|
becomes inaccessible, less space is set aside for the oblist at
|
|
start-up, oblist lookups are faster when the oblist contains a large
|
|
number of symbols, and the minimum cost of a maximum-generation
|
|
collection has been cut significantly, down from tens of microseconds
|
|
to just a handful on contemporary hardware.
|
|
|
|
\subsection{Reduced maximum-generation collection overhead (9.3.3)}
|
|
|
|
Various changes in the storage manager have reduced the amount of
|
|
extra memory required for managing heap storage and increased the
|
|
likelihood that memory can be returned to the O/S as the heap
|
|
shrinks.
|
|
Returning memory to the O/S is now faster, so the minimum time for
|
|
a maximum-generation collection, or any other collection where
|
|
release of memory to the O/S is enabled, has been cut.
|
|
|
|
\subsection{Faster library load times (9.3.1)}
|
|
|
|
Libraries now load faster at both compile and run time, with more
|
|
pronounced improvements when dozens of libraries or more are being
|
|
loaded.
|
|
|
|
\subsection{Partially static record instances (9.3.1)}
|
|
|
|
The source optimizer now maintains information about partially static
|
|
record instances to eliminate field accesses and type checks when a
|
|
binding site for a record instance is visible to the access or checking
|
|
code.
|
|
For example,
|
|
|
|
\schemedisplay
(let ()
  (import scheme)
  (define-record foo ([immutable ptr a] [immutable ptr b]))
  (define (inc r) (make-foo (foo-a r) (+ (foo-b r) 1)))
  (lambda (x)
    (let* ([r (make-foo 37 x)]
           [r (inc r)]
           [r (inc r)])
      r)))
\endschemedisplay
|
|
|
|
is reduced by the source optimizer down to:
|
|
|
|
\schemedisplay
|
|
(lambda (x) ($record '#<record type foo> 37 (+ (+ x 1) 1)))
|
|
\endschemedisplay
|
|
|
|
where \scheme{$record} is a low-level primitive for creating record
|
|
instances.
|
|
That is, the source optimizer eliminates the intermediate record
|
|
structures, record references, and type checks, in addition to
|
|
creating the record-type descriptor at compile time, eliminating
|
|
the record-constructor descriptor, record constructor, and record
|
|
accessors produced by expansion of the record definition.
|
|
|
|
\subsection{More source-optimizer improvements (9.3.1)}
|
|
|
|
The source optimizer now handles \scheme{apply} with a known-list
|
|
final argument, e.g., a constant list or list constructed directly
|
|
within the apply operation via \scheme{cons}, \scheme{list}, or
|
|
\scheme{list*} (\scheme{cons*}) as if it were an ordinary call,
|
|
i.e., without the \scheme{apply} and without the constant list
|
|
wrapper or list constructor.
|
|
For example:
|
|
|
|
\schemedisplay
(apply apply apply + (list 1 (cons 2 (list x (cons* 4 '(5 6))))))
\endschemedisplay
|
|
|
|
folds down to \scheme{(+ 18 x)}.
|
|
While not common at the source level, patterns like this can
|
|
materialize as the result of other source optimizations,
|
|
particularly inlining.
|
|
|
|
The source optimizer now also reduces applications of \scheme{car} and
|
|
\scheme{cdr} to the list-building operators \scheme{cons} and
|
|
\scheme{list}, e.g.:
|
|
|
|
\schemedisplay
(car (cons \var{e_1} \var{e_2})) ;-> (begin \var{e_2} \var{e_1})
(car (list \var{e_1} \var{e_2} \var{e_3})) ;-> (begin \var{e_2} \var{e_3} \var{e_1})
(cdr (list \var{e_1} \var{e_2} \var{e_3})) ;-> (begin \var{e_1} (list \var{e_2} \var{e_3}))
\endschemedisplay
|
|
|
|
discarding side-effect-free expressions in the \scheme{begin} forms
|
|
where appropriate.
|
|
It treats similarly calls of \scheme{vector-ref} on \scheme{vector};
|
|
\scheme{list-ref} on \scheme{list}, \scheme{list*}, and \scheme{cons*};
|
|
\scheme{string-ref} on \scheme{string}; and \scheme{fxvector-ref}
|
|
on \scheme{fxvector}, taking care with \scheme{string-ref} and
|
|
\scheme{fxvector-ref} not to optimize when doing so might mask an
|
|
invalid type of argument to a safe constructor.
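
By analogy with the \scheme{car} and \scheme{cdr} cases, a definition like
the first one below is a candidate for having its vector allocation
removed.
The second form shows the kind of case where the optimizer has to hold
back: in safe code, the \scheme{fxvector} constructor must still reject
the non-fixnum argument.
The procedure name \scheme{middle} is made up for this sketch, and whether
a given call is actually rewritten is up to the optimizer.

\schemedisplay
;; middle is a hypothetical procedure used only for illustration
(define (middle a b c)
  (vector-ref (vector a b c) 1))

(middle 'x 'y 'z) ; y

;; at the default (safe) optimize level, the bad element is still reported
(guard (c [#t 'constructor-error])
  (fxvector-ref (fxvector 1 'two 3) 0)) ; constructor-error
\endschemedisplay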
|
|
|
|
Finally, the source optimizer now removes certain unnecessary
|
|
\scheme{let} bindings within the constraints of evaluation-order
|
|
preservation.
|
|
For example,
|
|
|
|
\schemedisplay
(let ([x \var{e_1}] [y \var{e_2}]) (list (cons x y) 7))
\endschemedisplay
|
|
|
|
reduces to:
|
|
|
|
\schemedisplay
(list (cons \var{e_1} \var{e_2}) 7)
\endschemedisplay
|
|
|
|
Such bindings commonly arise from inlining. Eliminating them tends
|
|
to make the output of \scheme{expand/optimize} more readable.
|
|
|
|
The impact on performance is minimal, but it can result in smaller
|
|
expressions and thus enable more inlining within the same size limits.
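
One way to observe reductions of this kind is to pass a quoted form to
\scheme{expand/optimize} and inspect what comes back.
The two forms below are arbitrary examples chosen for this sketch.
No output is shown because the exact printed form can vary with the
release and parameter settings.

\schemedisplay
;; apply with a list built in place, plus car applied to cons
(expand/optimize
  '(lambda (x)
     (apply + (list 1 (car (cons x 2))))))

;; let bindings that are each used once, in evaluation order
(expand/optimize
  '(lambda (a b)
     (let ([x (car a)] [y (car b)])
       (list (cons x y) 7))))
\endschemedisplay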
|
|
|
|
\subsection{Improved foreign-pointer address handling (9.3.1)}
|
|
|
|
Various composed operations on ftypes now avoid allocating
|
|
and dereferencing intermediate ftype pointers, i.e., \scheme{ftype-ref},
|
|
\scheme{ftype-set!}, \scheme{ftype-init-lock!}, \scheme{ftype-lock!},
|
|
\scheme{ftype-unlock!}, \scheme{ftype-spin-lock!},
|
|
\scheme{ftype-locked-incr!}, or \scheme{ftype-locked-decr!} applied
|
|
directly to the result of \scheme{ftype-ref}, \scheme{ftype-&ref}, or
|
|
\scheme{make-ftype-pointer}.
|
|
|
|
\subsection{New source optimizations (9.2.1)}
|
|
|
|
The source optimizer does a few new optimizations: it folds
|
|
calls to \scheme{symbol->string}, \scheme{string->symbol}, and
|
|
\scheme{gensym->unique-string} if the argument is known at compile
|
|
time and has the right type; it folds zero-argument calls to
|
|
\scheme{vector}, \scheme{string}, \scheme{bytevector}, and
|
|
\scheme{fxvector}; and it discards subsumed case-lambda clauses,
|
|
e.g., the second clause in
|
|
\scheme{(case-lambda [(x . y) \var{e_1}] [(x y) \var{e_2}])}.
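
The forms below illustrate each case.
In the \scheme{case-lambda} example, any call with two arguments already
matches the first clause, so the second clause can never be selected and
is safe to discard.
The name \scheme{f} is made up for this sketch.

\schemedisplay
(symbol->string 'abc) ; "abc", computable at compile time
(vector)              ; #()

;; f is a hypothetical procedure; its second clause is subsumed by the first
(define f
  (case-lambda
    [(x . y) 'first]
    [(x y) 'second]))

(f 1 2) ; first
\endschemedisplay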
|
|
|
|
\subsection{Reduced stack requirements after large apply (9.2)}
|
|
|
|
A call to \scheme{apply} with a very long argument list can cause a
|
|
large chunk of memory to be allocated for the topmost portion of
|
|
the stack.
|
|
This space is now reclaimed during the next collection.
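
A sketch of the situation described above follows.
The list length is an arbitrary choice, and the explicit call to
\scheme{collect} is included only to mark where the reclamation would
happen; normally the next automatic collection serves the same purpose.

\schemedisplay
(define big (iota 100000)) ; 0, 1, ..., 99999
(apply + big)              ; 4999950000
(collect)                  ; the extra stack space can be reclaimed here
\endschemedisplay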
|
|
|
|
\subsection{Improved symbol-hashtables performance (9.2)\label{sec:symbol-hashtable-performance}}
|
|
|
|
The performance of operations on symbol hashtables has been improved
|
|
generally over previous releases by eliminating call overhead for the
|
|
hash and equality functions.
|
|
Further improvements are possible with the use of the new type-specific
|
|
symbol-hashtable operators (Section~\ref{sec:symbol-hashtables}).
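
A brief sketch of both styles of access is given below.
The table contents are made up for illustration.

\schemedisplay
(define ht (make-hashtable symbol-hash eq?))
(symbol-hashtable? ht)             ; #t

;; type-specific operators
(symbol-hashtable-set! ht 'apple 3)
(symbol-hashtable-ref ht 'apple 0) ; 3
(symbol-hashtable-ref ht 'pear 0)  ; 0

;; the generic operators continue to work on the same table
(hashtable-ref ht 'apple 0)        ; 3
\endschemedisplay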
|
|
|
|
\subsection{Reduced library-invocation time, memory consumption (9.1)}
|
|
|
|
The amount of time required to invoke a library and the amount of memory
|
|
occupied by the library when the library is invoked as the result of a
|
|
run-time dependency of another library or a top-level program have both
|
|
been reduced by ``revisiting'' rather than ``invoking'' the library,
|
|
effectively leaving the compile-time information on disk until it is
needed, if ever.
|
|
|
|
\subsection{Discarding relocation tables for static code objects (9.1)}
|
|
|
|
Unless the command-line parameter \scheme{--retain-static-relocation}
|
|
is supplied, the collector now discards relocation tables for code
|
|
objects when the code objects are promoted to the static generation,
|
|
either at boot time via heap compaction or via a call to \scheme{collect}
|
|
with the symbol \scheme{static} as the target generation.
|
|
This results in a significant reduction in the memory occupied by the
|
|
code object (around 20\% in our tests).
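
The second route mentioned above, an explicit promotion to the static
generation, might look like the following sketch.
Collecting the maximum generation is a choice made for this example, so
that all live objects are promoted.

\schemedisplay
;; promote live objects, including code objects, to the static generation
(collect (collect-maximum-generation) 'static)
\endschemedisplay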
|
|
|
|
\subsection{Guardian registration (9.1)}
|
|
|
|
The code to register an object with a guardian is now open-coded, at
|
|
the cost of some additional work during the next collection.
|
|
The result is a modest net improvement in registration overhead (around
|
|
15\% in our tests).
|
|
Of potentially greater importance when threaded, each registration no
|
|
longer requires synchronization.
|
|
|
|
\subsection{Generated code improvements (9.1)}
|
|
|
|
The compiler generates better code in several small ways, resulting
|
|
in small decreases in code size and corresponding small
|
|
performance improvements in the range of 1--5\% in our tests.
|
|
|
|
\subsection{Reduced collector overhead for large heaps (9.0)}
|
|
|
|
In previous releases, a factor in collector performance was the
|
|
overall size of the heap (measured both in number of pages and the
|
|
amount of virtual memory spanned by the heap).
|
|
Through various changes to the data structures used to support the
|
|
storage manager, this factor has been eliminated, which can
|
|
significantly reduce the cost of collecting a younger generation
|
|
with a small number of accessible objects relative to overall heap
|
|
size.
|
|
In our experiments, the minimum cost of collection on contemporary
|
|
hardware exceeded 100 microseconds for heaps of 64MB or more and 5
|
|
milliseconds for heaps of 1GB or more.
|
|
The minimum cost grew in proportion to the heap size from there.
|
|
This is now fixed for all heap sizes at just a few microseconds.
|
|
|
|
\subsection{Reduced mutation overhead (9.0)}
|
|
|
|
Improvements in the compiler and storage manager have been made to
|
|
reduce the cost of tracking possible pointers from older to younger
|
|
generations when objects are mutated.
|
|
|
|
\subsection{Improved foreign-pointer address handling (8.9.5)\label{ftpaopt}}
|
|
|
|
Ftype pointers with constant addresses are now created at compile
|
|
time, with ftype-pointer address checks optimized away as well.
|
|
|
|
Bignum allocation overhead is avoided for addresses outside the
|
|
fixnum range when the results of two \scheme{ftype-pointer-address}
|
|
calls are directly compared or the result of one
|
|
\scheme{ftype-pointer-address} call is directly compared with 0.
|
|
That is, comparisons like:
|
|
|
|
\schemedisplay
(= (ftype-pointer-address x) 0)
(= (ftype-pointer-address x) (ftype-pointer-address y))
\endschemedisplay
|
|
|
|
are effectively optimized to:
|
|
|
|
\schemedisplay
(ftype-pointer-null? x)
(ftype-pointer=? x y)
\endschemedisplay
|
|
|
|
This optimization is performed when the comparison procedure is
|
|
\scheme{=}, \scheme{eqv?}, or \scheme{equal?} and the arguments
|
|
are given in either order.
|
|
The optimization is also performed when \scheme{zero?} is applied directly
|
|
to the result of \scheme{ftype-pointer-address}.
|
|
|
|
Bignum allocation overhead is also avoided at optimize-level~3
|
|
when \scheme{ftype-pointer-address} is used in combination with
|
|
\scheme{make-ftype-pointer} to effect a type cast, as in:
|
|
|
|
\schemedisplay
(make-ftype-pointer T (ftype-pointer-address x))
\endschemedisplay
|
|
|
|
Both bignum and ftype-pointer allocation are avoided when the result
|
|
of such a cast is used directly as the base pointer in an
|
|
\scheme{ftype-ref}, \scheme{ftype-&ref}, \scheme{ftype-set!},
|
|
\scheme{ftype-locked-incr!}, \scheme{ftype-locked-decr!},
|
|
\scheme{ftype-init-lock!}, \scheme{ftype-lock!}, \scheme{ftype-spin-lock!},
|
|
or \scheme{ftype-unlock!} form, as in:
|
|
|
|
\schemedisplay
(ftype-ref T (fld) (make-ftype-pointer T (ftype-pointer-address x)))
\endschemedisplay
|
|
|
|
These optimizations do not occur when the calls to
|
|
\scheme{ftype-pointer-address} are not nested directly within the outer
|
|
form, as when a \scheme{let} binding is used to name the result of the
|
|
\scheme{ftype-pointer-address} call, e.g.:
|
|
|
|
\schemedisplay
(let ([addr (ftype-pointer-address x)]) (= addr 0))
\endschemedisplay
|
|
|
|
In other places where \scheme{ftype-pointer-address} is used, the compiler
|
|
now open-codes the extraction and (if necessary) bignum allocation,
|
|
reducing overhead by the cost of a procedure call.
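
The following self-contained sketch puts the forms discussed in this
section into a runnable setting.
The ftype \scheme{pt} and the variable \scheme{p} are made up for the
example.
The directly nested and \scheme{let}-bound comparisons return the same
answer; only the former is eligible for the optimization.

\schemedisplay
;; pt is a hypothetical ftype used only for illustration
(define-ftype pt (struct [x int] [y int]))

(define p
  (make-ftype-pointer pt (foreign-alloc (ftype-sizeof pt))))

(ftype-set! pt (x) p 7)
(ftype-ref pt (x) p)            ; 7

;; direct nesting: eligible for the comparison optimization
(= (ftype-pointer-address p) 0) ; #f
(ftype-pointer-null? p)         ; #f

;; named intermediate: same answer, but not eligible
(let ([addr (ftype-pointer-address p)])
  (= addr 0))                   ; #f

(foreign-free (ftype-pointer-address p))
\endschemedisplay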
|
|
|
|
\subsection{Improved performance when profiling (8.9.5)}
|
|
|
|
In addition to improvements in the tracking of profile counts, the
|
|
run-time overhead for gathering profile information has gone down by
|
|
5--10\% in our tests and is now typically around 10\% of the total
|
|
unprofiled run time.
|
|
(Unprofiled code is also slightly faster, but by less than 2\% in
|
|
our tests.)
|
|
|
|
\subsection{New compiler back-end (8.9.1, 8.9.2, 8.9.5)}
|
|
|
|
Versions starting with 8.9.1 employ a new compiler back end that is
|
|
structured as a series of nanopasses and replaces the old linear-time
|
|
register allocator with a graph-coloring register allocator.
|
|
Compilation with the new back end is substantially slower (up to a factor
|
|
of two) than with the old back end, while code generated with the new
|
|
back end is faster (14--40\% depending on architecture and optimization
|
|
level) in our tests.
|
|
These improvements are independent of improvements
|
|
resulting from cross-library constant folding and inlining
|
|
(Section~\ref{subsection:clcfai}).
|
|
The code generated for a specific program might be faster or slower.
|
|
|
|
\subsection{Open-coding of \protect\scheme{make-guardian} (8.9.4)}
|
|
|
|
Calls to \scheme{make-guardian} are now open-coded by the compiler to
|
|
expose the implicit resulting \scheme{case-lambda} expression so that
|
|
calls to the guardian can themselves be inlined, thus reducing the overhead
|
|
for registering objects with a guardian and querying the guardian for
|
|
resurrected objects.
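
For reference, the two kinds of guardian call mentioned above look like
this in use.
The registered object is made up for the sketch, and the comments describe
typical rather than guaranteed behavior, since the timing of collections
is up to the system.

\schemedisplay
(define g (make-guardian))

;; one-argument call: register an otherwise unreferenced object
(g (list 'temporary 'data))

;; zero-argument call: query for resurrected objects
(g)       ; normally #f, since no collection has occurred yet
(collect)
(g)       ; typically the registered list, once the collector has
          ; determined that it is inaccessible
\endschemedisplay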
|
|
|
|
\subsection{Improved open-coding of \protect\scheme{make-parameter} and \protect\scheme{make-thread-parameter} (8.9.4)}
|
|
|
|
\scheme{make-parameter} and \scheme{make-thread-parameter}
|
|
are now open-coded in all cases to expose the implicit resulting
|
|
\scheme{case-lambda} expression.
|
|
(They were already open-coded when the second, \emph{filter},
|
|
argument was a \scheme{lambda} expression or primitive name.)
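
The sketch below shows a case that falls outside the earlier rule: the
filter is supplied as a reference to a user-defined procedure rather than
as a \scheme{lambda} expression or primitive name.
The names \scheme{check-level} and \scheme{verbosity} are made up for this
example.

\schemedisplay
;; check-level and verbosity are hypothetical names
(define (check-level x)
  (unless (and (fixnum? x) (fx<= 0 x 3))
    (error 'verbosity "invalid level" x))
  x)

(define verbosity (make-parameter 1 check-level))

(verbosity)                                ; 1
(parameterize ([verbosity 3]) (verbosity)) ; 3
\endschemedisplay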
|
|
|
|
\subsection{Cross-library constant folding and inlining (8.9.2)\label{subsection:clcfai}}
|
|
|
|
The compiler now propagates constants and inlines simple procedures
|
|
across library boundaries.
|
|
A simple procedure is one that, after optimization of the exporting
|
|
library, is smaller than a given threshold, contains no free references
|
|
to other bindings in the exporting library, and contains no constants
|
|
that cannot be copied without breaking pointer identity.
|
|
The size threshold is determined, as for inlining within a library or
|
|
other compilation unit, by the parameter \scheme{cp0-score-limit}.
|
|
In this case, the size threshold is determined based on the size
|
|
\emph{before} inlining rather than the size \emph{after} inlining,
|
|
which is often more conservative.
|
|
Omitting larger procedures that might generate less code when inlined in
|
|
a particular context reduces the amount of information that must be stored
|
|
in the exporting library's object code to support cross-library inlining.
|
|
|
|
One particularly useful benefit of this optimization is that record
|
|
predicates, accessors, mutators, and (depending on protocols)
|
|
constructors created by a record definition in one library and exported
|
|
by another are inlined in the importing library, just as if the record
|
|
type were defined in the importing library.
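
As a small sketch of the record case, the two libraries below are made up
for illustration.
The accessors \scheme{point-x} and \scheme{point-y} exported by the first
library are candidates for inlining into \scheme{norm1} in the second.
Whether they are actually inlined depends on the size threshold set by
\scheme{cp0-score-limit} and the other conditions described above.

\schemedisplay
;; (demo point) and (demo util) are hypothetical library names
(library (demo point)
  (export make-point point? point-x point-y)
  (import (chezscheme))
  (define-record-type point (fields x y)))

(library (demo util)
  (export norm1)
  (import (chezscheme) (demo point))
  ;; point-x and point-y come from another library
  (define (norm1 p)
    (+ (abs (point-x p)) (abs (point-y p)))))

(import (demo point) (demo util))
(norm1 (make-point 3 -4)) ; 7
\endschemedisplay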
|
|
|
|
\end{document}
|