racket/makefiles
Matthew Flatt 540c58bbe8 use POPCNT instruction when available on x86_64
On x86_64, a POPCNT instruction is usually available, and it can speed
up `fxpopcount` operations by a factor of 2-3.

Since POPCNT isn't always available, code using `fxpopcount` is
compiled to a call to a generic implementation. The linker substitutes
a POPCNT instruction when it determines at runtime that POPCNT is
available.
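
As a rough illustration of what a generic (no-POPCNT) 64-bit popcount
has to compute, here is a portable sketch in Scheme. The name
`generic-popcount64` is hypothetical, and this is not the code that
this commit adds; it only shows the kind of shift-and-mask work that a
single POPCNT instruction replaces. It assumes a nonnegative argument
below 2^64.

```scheme
;; Hypothetical helper, not part of Chez Scheme: a portable 64-bit
;; popcount using the classic parallel bit-summing steps.
;; Assumes 0 <= x < 2^64.
(define (generic-popcount64 x)
  (let* (;; sum adjacent bits into 2-bit fields
         [x (- x (logand (ash x -1) #x5555555555555555))]
         ;; sum adjacent 2-bit fields into 4-bit fields
         [x (+ (logand x #x3333333333333333)
               (logand (ash x -2) #x3333333333333333))]
         ;; sum adjacent 4-bit fields into bytes
         [x (logand (+ x (ash x -4)) #x0F0F0F0F0F0F0F0F)])
    ;; the multiply accumulates all byte sums into byte 7
    (logand (ash (* x #x0101010101010101) -56) #xFF)))
```

For a nonnegative fixnum `n`, `(generic-popcount64 n)` should agree
with `(fxpopcount n)`, which gives a quick way to cross-check the
sketch.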

Some measurements on a 2018 MacBook Pro (2.7 GHz Core i7) using the
program below:

 popcnt = this implementation, POPCNT discovered
 nocnt  = this implementation, POPCNT considered unavailable
 optcnt = compile to use POPCNT directly (no linker work)
 cpcnt  = compile to inlined generic (no linker work, no POPCNT)

Since the generic implementation is always a 64-bit popcount, it's not
as good as an inlined version for `fxpopcount32`, but otherwise the
link-edit approach to POPCNT works well:

            fxpopcount      fxpopcount32
 popcnt:       0.098s
 nocnt:        0.284s
 optcnt:       0.109s  [slower means noise?]
 cpcnt:        0.279s         0.188s

 (optimize-level 3)
 (time
  (let loop ([v #f] [i 100000000])
    (if (fx= i 0)
        v
        (loop (fxpopcount i) (fx- i 1)))))
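
The program above only exercises `fxpopcount`. A minimal sketch of a
matching loop for the `fxpopcount32` column follows, assuming that
column was measured by substituting `fxpopcount32` in the same loop
(the commit message does not say so). The wrapper name
`popcount32-loop` is hypothetical; the iteration count is a parameter
so the loop can be tried at a small size first.

```scheme
(optimize-level 3)

;; Assumed variant of the benchmark loop, with fxpopcount32 in place
;; of fxpopcount. Every i stays below 2^32 for the count used here.
(define (popcount32-loop n)
  (let loop ([v #f] [i n])
    (if (fx= i 0)
        v
        (loop (fxpopcount32 i) (fx- i 1)))))

(time (popcount32-loop 100000000))
```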

original commit: 5f090e509f8fe5edc777ed9f0463b20c2e571336
2020-01-11 11:04:48 -07:00
installsh                  Bash test(1) does not allow bare numbers with ==, so use -eq              2016-05-22 17:41:40 -04:00
Makefile-csug.in           Makefile-csug.in install target is now consistent with the project        2018-03-28 09:25:20 -07:00
Makefile-release_notes.in  - Updated CSUG to replace \INSERTREVISIONMONTHSPACEYEAR with the current  2017-10-13 23:50:20 -04:00
Makefile-workarea.in       minor build and new-release updates                                       2019-03-19 23:23:10 -07:00
Makefile.in                adjust build for BSDs, MinGW cross-compile, and more configuration        2019-07-03 18:54:04 -06:00
Mf-boot.in                 changed copyright year to 2017                                            2017-04-06 11:41:33 -04:00
Mf-install.in              use POPCNT instruction when available on x86_64                           2020-01-11 11:04:48 -07:00