racket

Author	SHA1	Message	Date
Matthew Flatt	7768b09118	unbox local floating-point arithmetic Avoid allocating a flonum object for floating-point calculations that are consumed only by other floating-point calculations. For this first cut, unboxing applies only to fl+, fl-, fl*, fl/, flabs, fl<, fl<=, fl=, fl>, fl>=, bytevector-ieee-double-[native-]ref, and bytevector-ieee-double-[native-]set!. Local variables can be unboxed in the same way as implicit temporaries, and loop arguments can be unboxed, but values in a closure and function-call arguments are always boxed. arm32 support is mostly in place, but not yet right. ppc32 support is not yet implemented. This commit includes a small change that is incompatible with previous Chez Scheme versions: `(fl= +nan.0)` (and similar for other comparisons) produces true instead of false. original commit: 36459e43f10705aa3e383376ca7d54cf2998b7ee	2020-05-31 17:08:38 -06:00
Matthew Flatt	540c58bbe8	use POPCNT instruction when available on x86_64 On x86_64, a POPCNT instruction is usually available, and it can speed up `fxpopcount` operations by a factor of 2-3. Since POPCNT isn't always available, code using `fxpopcount` is compiled to a call to a generic implementation. The linker substitutes a POPCNT instruction when it determines at runtime that POPCNT is available. Some measurements on a 2018 MacBook Pro (2.7 GHz Core i7) using the program below: popcnt = this implementation, POPCNT discovered nocnt = this implementation, POPCNT considered unavailable optcnt = compile to use POPCNT directly (no linker work) cpcnt = compile to inlined generic (no linker work, no POPCNT) Since the generic implementation is always a 64-bit popcount, it's not as good as an inlined version for `fxpopcount32`, but otherwise the link-edit approach to POPCNT works well: fxpopcount fxpopcount32 popcnt: 0.098s nocnt: 0.284s optcnt 0.109s [slower means noise?] cpcnt: 0.279s 0.188s (optimize-level 3) (time (let loop ([v #f] [i 100000000]) (if (fx= i 0) v (loop (fxpopcount i) (fx- i 1))))) original commit: 5f090e509f8fe5edc777ed9f0463b20c2e571336	2020-01-11 11:04:48 -07:00
Matthew Flatt	81ea967aea	add stencil vectors and fxpopcount original commit: ec766fca869b5e0407c4f54230b72619af73b40b	2020-01-06 05:34:28 -07:00
Bob Burger	831ea8ad18	changed copyright year to 2017 7.ss, scheme.1.in, comments of many files original commit: 06f858f9a505b9d6fb6ca1ac97234927cb2dc641	2017-04-06 11:41:33 -04:00
dyb	1356af91b3	initial upload of open-source release original commit: 47a210c15c63ba9677852269447bd2f2598b51fe	2016-04-26 10:04:54 -04:00

5 Commits
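
The POPCNT commit in the log above describes compiling `fxpopcount` to a call to a generic implementation and substituting a faster path once availability is known at runtime. The sketch below illustrates that general idea only; it is not the Chez Scheme linker mechanism. It is written in Python, and every name in it (`popcount64_generic`, `popcount64_fast`, `select_popcount`) is hypothetical. It assumes `int.bit_count` (Python 3.10+) as a stand-in for the hardware instruction.

```python
# Illustrative sketch only: runtime selection between a generic 64-bit
# popcount and a faster built-in. All names here are hypothetical.

M64 = (1 << 64) - 1  # mask for the low 64 bits


def popcount64_generic(x: int) -> int:
    """Count set bits in the low 64 bits of x using the SWAR technique."""
    x &= M64
    # Sum adjacent bits into 2-bit fields.
    x = x - ((x >> 1) & 0x5555555555555555)
    # Sum adjacent 2-bit fields into 4-bit fields.
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    # Sum adjacent 4-bit fields into per-byte counts.
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    # Multiply to accumulate all byte counts into the top byte.
    return ((x * 0x0101010101010101) & M64) >> 56


def popcount64_fast(x: int) -> int:
    """Stand-in for the hardware path; requires int.bit_count (3.10+)."""
    return (x & M64).bit_count()


def select_popcount(fast_available: bool):
    """Pick the implementation once, analogous to a link-time substitution."""
    return popcount64_fast if fast_available else popcount64_generic


# Availability is detected once at startup; callers use `popcount` only.
popcount = select_popcount(hasattr(int, "bit_count"))
```

Because the generic routine always processes 64 bits, a 32-bit variant gains nothing from it, which matches the commit's remark that the approach is not as good as an inlined version for `fxpopcount32`.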