Commit Graph

47 Commits

Author SHA1 Message Date
Matthew Flatt
540c58bbe8 use POPCNT instruction when available on x86_64
On x86_64, a POPCNT instruction is usually available, and it can speed
up `fxpopcount` operations by a factor of 2-3.

Since POPCNT isn't always available, code using `fxpopcount` is
compiled to a call to a generic implementation. The linker substitutes
a POPCNT instruction when it determines at runtime that POPCNT is
available.

Some measurements on a 2018 MacBook Pro (2.7 GHz Core i7) using the
program below:

 popcnt = this implementation, POPCNT discovered
 nocnt  = this implementation, POPCNT considered unavailable
 optcnt = compile to use POPCNT directly (no linker work)
 cpcnt  = compile to inlined generic (no linker work, no POPCNT)

Since the generic implementation is always a 64-bit popcount, it's not
as good as an inlined version for `fxpopcount32`, but otherwise the
link-edit approach to POPCNT works well:

            fxpopcount      fxpopcount32
 popcnt:       0.098s
 nocnt:        0.284s
 optcnt        0.109s  [slower means noise?]
 cpcnt:        0.279s         0.188s

 (optimize-level 3)
 (time
  (let loop ([v #f] [i 100000000])
    (if (fx= i 0)
        v
        (loop (fxpopcount i) (fx- i 1)))))

original commit: 5f090e509f8fe5edc777ed9f0463b20c2e571336
2020-01-11 11:04:48 -07:00
Matthew Flatt
81ea967aea add stencil vectors and fxpopcount
original commit: ec766fca869b5e0407c4f54230b72619af73b40b
2020-01-06 05:34:28 -07:00
Matthew Flatt
c8ea435c85 make strings within symbols always immutable
original commit: 7859d16dac7bae6ab836e2200003583dc572deba
2019-12-16 17:11:49 -07:00
Matthew Flatt
de2dedcdd7 Add uninterned symbols
Uninterned symbols are slightly more expensive to allocate than 0- or
1-argument calls to `gensym`, but they're much cheaper to hash (and
print). They're also more consistently distinct when unfasled, and the
fasled form is determinsitic.

original commit: 3167083008031b1f880e76a6f573563c7d9c888c
2019-12-04 12:43:35 -07:00
Matthew Flatt
18d18b7ff6 add pseudo-random generator API
The MRG32k3a generator is fast when using unboxed floating-point
arithemtic. Since the Scheme compiler doesn't yet support that,
build MRG32k3a into the kernel and provide access via
`pseudo-random-generator` functions.

original commit: 3dd74679a6c2705440488d8c07c47852eb50a94b
2019-10-07 10:58:39 -06:00
Matthew Flatt
174c416f9e repair for opportunistic 1-shot
If normal 1-shot continuations are mixed with opportunistic 1-shot
continuations created by `call-setting-continuation-attachment`, then
promoting an opportunistic 1-shot at a GC is wrong unless the whole
chain is promoted.

original commit: 2dfac475666763b60935e382386af4438f3029e0
2019-09-24 11:41:50 -06:00
Matthew Flatt
71846161f9 Merge branch 'bsd' of github.com:mflatt/ChezScheme
original commit: 198477a40c2c580924d95491e63d80e1f9a39c0d
2019-07-05 07:30:37 -06:00
Matthew Flatt
c38194c0ca adjust build for BSDs, MinGW cross-compile, and more configuration
Includes joint work with @abmclin, @pmatos, and @jessealama.

original commit: 70559d074f70dcadec5cea3619f75f91fcda77eb
2019-07-03 18:54:04 -06:00
Paulo Matos
a3f325bbea mark functions that never return as NORETURN
original commit: 6377313ecb063273b573139c9e91de263e191e60
2019-07-02 11:30:59 -06:00
Matthew Flatt
dd0fe4ac40 unbreak MSVC build
Move `NORETURN` of 2e3a618b00 to start of function declaration, where
it works for both GCC and MSVC.

original commit: 10fc4a2406ecd34fa686d9d643ee63d7c12d6f97
2019-06-23 05:57:53 -06:00
Matthew Flatt
9f1fe73797 change build to use archives instead of merging objects
Merging ".o" files to one "kernel.o" can be convenient for further
linking, but it requires running `ld` directly. Running `ld` directly
sometimes runs into a mismatch between the C compiler and the default
`ld`. It's better to use the more typical approach of collecting
objects into an archive.

original commit: 7d5b60c7566570655e567495d86d546101cf8fb4
2019-06-21 18:53:33 -06:00
Matthew Flatt
a043c4b3a8 mark functions that never return as NORETURN
@pmatos did all the work here in racket/ChezScheme#8 and
racket/racket#2344.

original commit: 2e3a618b0072d547b6c5abe6dd8dbac36a98c10e
2019-06-21 14:26:01 -06:00
Matthew Flatt
2cf27c4727 Merge github.com:cisco/ChezScheme
original commit: 8118200e237d756f83be54e8bf3eabb4af2388ed
2019-05-22 10:46:59 -06:00
dyb
82b2cda639 compress-level parameter, improvement in lz4 compression, and various other related improvements
- added compress-level parameter to select a compression level for
  file writing and changed the default for lz4 compression to do a
  better job compressing.  finished splitting glz input routines
  apart from glz output routines and did a bit of other restructuring.
  removed gzxfile struct-as-bytevector wrapper and moved its fd
  into glzFile.  moved DEACTIVATE to before glzdopen_input calls
  in S_new_open_input_fd and S_compress_input_fd, since glzdopen_input
  reads from the file and could block.  the compress format and now
  level are now recorded directly the thread context.  replaced
  as-gz? flag bit in compressed bytevector header word with a small
  number of bits recording the compression format at the bottom of
  the header word.  flushed a couple of bytevector compression mats
  that depended on the old representation.  (these last few changes
  should make adding new compression formats easier.)  added
  s-directory build options to choose whether to compress and, if
  so, the format and level.
    compress-io.h, compress-io.c, new-io.c, equates.h, system.h,
    scheme.c, gc.c,
    io.ss, cmacros.ss, back.ss, bytevector.ss, primdata.ss, s/Mf-base,
    io.ms, mat.ss, bytevector.ms, root-experr*,
    release_notes.stex, io.stex, system.stex, objects.stex
- improved the effectiveness of LZ4 boot-file compression to within
  15% of gzip by increasing the lz4 output-port in_buffer size to
  1<<18.  With the previous size (1<<14) LZ4-compressed boot files
  were about 50% larger.  set the lz4 input-port in_buffer and
  out_buffer sizes to 1<<12 and 1<<14.  there's no clear win at
  present for larger input-port buffer sizes.
    compress-io.c
- To reduce the memory hit for the increased output-port in_buffer
  size and the corresponding increase in computed out_buffer size,
  one output-side out_buffer is now allocated (lazily) per thread
  and stored in the thread context.  The other buffers are now
  directly a part of the lz4File_out and lz4File_in structures
  rather than allocated separately.
    compress-io.c, scheme.c, gc.c,
    cmacros.ss
- split out the buffer emit code from glzwrite_lz4 into a
  separate glzemit_lz4 helper that is now also used by gzclose
  so we can avoid dealing with a NULL buffer in glzwrite_lz4.
  glzwrite_lz4 also uses it to writing large buffers directly and
  avoid the memcpy.
    compress-io.c
- replaced lz4File_out and lz4File_in mode enumeration with the
  compress format and inputp boolean.  using switch to check and
  raising exceptions for unexpected values to further simplify
  adding new compression formats in the future.
    compress-io.c
- replaced the never-defined struct lz4File pointer in glzFile
  union with the more specific struct lz4File_in_r and Lz4File_out_r
  pointers.
    compress-io.h, compress-io.c
- added free of lz4 structures to gzclose.  also changed file-close
  logic generally so that (1) port is marked closed before anything is
  freed to avoid dangling pointers in the case of an interrupt or
  error, and (2) structures are freed even in the case of a write
  or close error, before the error is reported.  also now mallocing
  glz and lz4 structures after possibility of errors have passed where
  possible and freeing them when not.
    compress-io.c,
    io.ss
- added return-value checks to malloc calls and to a couple of other
  C-library calls.
    compress-io.c
- corrected EINTR checks to look at errno rather than return codes.
    compress-io.c
- added S_ prefixes to the glz* exports
    externs.h, compress-io.c, new-io.c, scheme.c, fasl.c
- added entries for mutex-name and mutex-thread
    threads.stex

original commit: 722ffabef4c938bc92c0fe07f789a9ba350dc6c6
2019-04-18 05:47:19 -07:00
Matthew Flatt
e622a495b6 Add LZ4 support and use it by default for compressing files
original commit: 8858b34bd92ac8d2b6511dc9ca17ebfa06a1bd93
2019-04-06 07:32:37 +02:00
Matthew Flatt
8b68320dcb Merge branch 'lz4' of https://github.com/mflatt/ChezScheme
original commit: f74329a3254dbdfda1c4f86585a2d5028bbe03a3
2019-03-20 15:49:49 -06:00
Matthew Flatt
8ab973300d Add LZ4 support and use it by default for compressing files
original commit: bbcd7fc2188e798ce53b765db0808e9ea6510350
2019-03-20 13:35:04 -06:00
Matthew Flatt
8070a7b910 Merge branch 'eqfl' of github.com:mflatt/ChezScheme
original commit: 8b36396eacb139e0fff70efcd2c9dc842815324f
2019-01-22 05:57:17 -07:00
Matthew Flatt
b27f3c0a94 Merge branch 'phantom' of github.com:mflatt/ChezScheme
original commit: 743a56d8f1920620e8f6e14edca7984101425e14
2019-01-20 07:56:59 -07:00
Matthew Flatt
538def47de add phantom bytevectors
original commit: 001917fd98ac6a0f13ccab902e15b9d2169c4b9c
2019-01-20 07:41:09 -07:00
dyb
ee9a4b3f59 profile counts are now maintained even for code that has been
reclaimed by the collector and must be released explicitly by the
programmer via (profile-release-counters).
  pdhtml.ss, primdata.ss,
  globals.h, externs.h, fasl.c, prim5.c, prim.c, alloc.c, scheme.c,
  misc.ms,
  release_notes.stex, system.stex

original commit: 68e20f721618dbaf4c1634067c2bee24a493a750
2019-01-17 09:43:18 -08:00
Matthew Flatt
3e297e025e adjust make-arity-wrapper to enforce the supplied arity mask
original commit: a9ec7da3ea3b8edc665b060bcba675248119d260
2019-01-15 11:56:03 -07:00
Matthew Flatt
7c548bb3a1 update vfasl merge
original commit: 99dac3f53f4a7d2b2c373489135e5d270c256726
2018-12-28 08:39:21 -06:00
Matthew Flatt
545a465cf4 Merge ../ChezScheme-vfasl
original commit: dbe15d6cae6f23c4e218974ac83f36a935292ad2
2018-12-24 05:28:16 -07:00
Matthew Flatt
a993c9c11e combine multiple fasl to one vfasl when possible
original commit: d8d4400b42196088defac994b7f97a26446d8ed2
2018-12-21 08:58:28 -07:00
Matthew Flatt
f0376299a8 experiment with a different fasl format
original commit: e2c50bd7ae5b323fcc796eb78d892f4a2c487dfc
2018-12-20 20:27:41 -07:00
Matthew Flatt
5cace8bee3 repairs
original commit: a7c8036d40fc3c92b6b08ba8d1a62f76f2d5fab6
2018-12-20 20:24:35 -07:00
Matthew Flatt
ed1d5c982d Merge ../ChezScheme-vfasl
original commit: 78ba118cbde76dd42bc4275ccc76219e159e04d7
2018-12-20 17:51:38 -07:00
Matthew Flatt
c90bd7bb6d experiment with a different fasl format
original commit: 6e32ed2a43f6b3d8531e98dfa52a56594dd6a2f4
2018-12-20 17:47:01 -07:00
Matthew Flatt
efb93d2653 Merge branch 'gcbt' of github.com:mflatt/ChezScheme
original commit: 51c6b2a880000ce754e1595f4481957e9fc7f722
2018-12-16 07:00:22 -07:00
dyb
19f3c85fe2 attempted partial fix for github issue 352
- when thread_get_room exhausts the local allocation area, it now
  goes through a common path with S_get_more_room to allocate a new
  local allocation area when appropriate.  this can greatly reduce
  the use of global allocation (and the number of tc mutex acquires
  in threaded builds) when a lot of small objects are allocated by
  C code with no intervening Scheme-side allocation or dirty writes.
    alloc.c, types.h, externs.h

original commit: 93dfa7674a95837e5a22bc622fecc50b0224f60d
2018-10-05 09:03:30 -07:00
Matthew Flatt
95d3146c16 Merge branch 'cm' of github.com:mflatt/ChezScheme
original commit: 9d8e3e99e79c1a2fa2cd20849c99f05b91db70d9
2018-07-25 16:07:41 -06:00
Matthew Flatt
f919bbcab6 add support for continuation attachments
original commit: 32669b104ef1119aea21f8592cee09d55f696afa
2018-07-25 06:33:46 -06:00
Matthew Flatt
48228739fe add object-references to reflect GC's tracing of objects
The `object-references` function is intended to support debugging of
memory leaks by providing a mapping from each live object to the
object that retained it.

original commit: 61f6602b7e6c388c529f3c5995dcf71a7c42e005
2018-07-16 18:08:48 -06:00
Bob Burger
8885445d6d Improved Unicode support for command-line arguments, environment variables, the C interface and error messages, and the Windows registry, DLL loading, and process creation
original commit: aa1c2c4ec95c286a12730ea75588a18dd9fb9d59
2018-06-14 14:24:15 -04:00
Matthew Flatt
22d4fd9978 Add __thread foreign-call convention
See the `foreign-callable` docs for a good example use.

original commit: e3463c78c511ad861dfa49865bb447e9777f9eb8
2018-03-14 17:20:33 -06:00
Matthew Flatt
743800bbb5 support struct args to and results from foreign procedures
original commit: f0a94bdb9f57c1bf7ffbb66693fb5476a6f0e65b
2018-03-12 21:01:47 -06:00
dyb
e5ee788d6f - renamed s_gettime => S_gettime to remain consistent with the
convention that the only undocumented externs are prefixed with
  S_.
    externs.h, stats.c, thread.c

original commit: 898c3941d807a18c393a4575a72283e3671ee7e5
2017-10-11 16:35:24 -04:00
Matthew Flatt
1932612543 add bytevector-compress and bytevector-decompress
original commit: aa062c09c9f0d129250db84aeb0a5190647c5574
2017-06-21 17:52:28 -06:00
Bob Burger
cbae4b9d77 mutexes and conditions are now freed when no longer used
added $close-resurrected-mutexes&conditions and $keep-live

original commit: 8d9aa4dffc371fc365020e5dac62270dae2aaa95
2017-04-13 09:41:58 -04:00
Bob Burger
831ea8ad18 changed copyright year to 2017
7.ss, scheme.1.in, comments of many files

original commit: 06f858f9a505b9d6fb6ca1ac97234927cb2dc641
2017-04-06 11:41:33 -04:00
Kent Dybvig
9cd0199a39 merge @mflatt immutable-vector, immutable-string, immutable-bytevector,
immutable-fxvector, and immutable-box support

original commit: 547fce9b99c6566d5cb3f7af9ca84654e798486e
2017-03-15 11:09:57 -04:00
Bob Burger
288243924f added optional timeout to condition-wait
original commit: 26d4dfc437fde2ba85d2d1a9ccadc2fa36979da0
2017-03-14 12:53:10 -04:00
dybvig
aa6a006e39 fixed a few comments to refer to scheme.c rather than main.c
externs.h, globals.h, thread.c

original commit: 1a05b09f421fb19d38f3fdfa1f8485fc7943068b
2017-03-13 11:25:25 -04:00
Andy Keep
7389a3ba16 - reworked the S_create_thread_object to print an error and exit when
allocating the thread context fails from Sactivate_thread.  before
  this change, the error was raised on the main thread, which resulted
  in strange behavior at best.  also added who argument to
  S_create_thread_object to allow it to report either Sactivate_thread
  or fork-thread led to the error.
    externs.h, schsig.c, scheme.c, thread.c

original commit: 89691cee27ee7d5e9bffae530b6346e96f8cc7ad
2016-08-05 00:04:00 -04:00
dybvig
b4d452cc71 - eliminated a couple of thread-safety issues and limitations on the
sizes of pathnames produced by expansion of tilde (home-directory)
  prefixes by replacing S_pathname, S_pathname_impl, and S_homedir
  with S_malloc_pathname, which always mallocs space for the result.
  one thread-safety issue involved the use of static strings for expanded
  pathnames and affected various file-system operations.  the other
  affected the file open routines and involved use of the incoming
  pathname while deactivated.  the incoming pathname is sometimes if not
  always a pointer into a Scheme bytevector, which can be overwritten if a
  collection occurs while the thread is deactivated.  the size limitation
  corresponded to the use of the static strings, which were limited to
  PATH_MAX bytes.  (PATH_MAX typically isn't actually the maximum path
  length in contemporary operating systems.)  eliminated similar issues
  for wide pathnames under Windows by adding S_malloc_wide_pathname.
  consumers of the old routines have been modified to use the new
  routines and to free the result strings.  the various file operations
  now consistently treat a pathname with an unresolvable home directory
  as a pathname that happens to start with a tilde.  eliminated unused
  foreign-symbol binding of "(cs)pathname" to S_pathname.
    io.c, externs.h, new_io.c, prim5.c, scheme.c, prim.c
- various places where a call to close or gzclose was retried when
  the close operation was interrupted no longer do so, since this can
  cause problems when another thread has reallocated the same file
  descriptor.
    new_io.c
- now using vcvarsall type x86_amd64 rather than amd64 when the
  former appears to supported and the latter does not, as is the
  case with VS Express 2015.
    c/Mf-a6nt, c/Mf-ta6nt
- commented out one of the thread mats that consistently causes
  indefinite delays under Windows and OpenBSD due to starvation.
    thread.ms
- increased wait time for a couple of subprocess responses
    6.ms
- added call to collector to close files opened during iconv mats
  specifically for when mats are run under Windows with no iconv dll.
    io.ms

original commit: ad44924307c576eb2fc92e7958afe8b615a7f48b
2016-06-16 23:04:32 -04:00
dyb
1356af91b3 initial upload of open-source release
original commit: 47a210c15c63ba9677852269447bd2f2598b51fe
2016-04-26 10:04:54 -04:00