cs: switch to stencil-vector HAMT
With recent improvements, the run-time performance of vector-stencil HAMTs for immutable hash tables seems close enough (on microbenchmarks) to the Patricia-trie implementation to be worthwhile, since they use less memory. Performance remains better in most cases than the traditional Racket implementation. The table at the end of this message summarizes relative performance on microbenchmarks. Overall, though, immutable hash-table operations are already so fast that these difference very rarely translate to measurable differences in overall run times --- not even for the macro expander, which relies heavily on immutable hash tables to represent scope sets. Stencil-vector HAMTs tend to take about 1/3 the space of Patricia tries, and those space savings can turn into run-time improvements in applications by reducing GC time. I've observed a 10% reduction in compile time for some programs. When building a full Racket distribution, run time shrinks by about 2 minutes out of 80 minutes, probbaly because just average memory use goes down by 10%. DrRacket's initial memory footprint goes down by about 37M out of 657M (a 5% savings). Mincrobenchmark relative performance, normalized to previous Racket CS implementation (measured on 2018 MacBook Pro, 2.7 GHz Core i7; Chez Scheme can substitute POPCNT instructions at link time): patricia = previous Racket CS implementation as a Patricia Trie stencil = new Racket CS implementation as a stencil-vector HAMT racket = traditional Racket implementation patricia stencil racket set-in-empty:eq#t: ==| ==| ==|= set-many:eq#t: ==| ==|== ==|======== set-many-in-order:eq#t: ==| ==| ==|==== set-same:eq#t: ==| == ==|= set-in-empty:eq: ==| == ==|= set-many:eq: ==| ==|== ==|======== set-many-in-order:eq: ==| ==|= ==|===== set-same:eq: ==| == ==|= set-in-empty:eqv: ==| ==| ==|== set-many:eqv: ==| ==|== ==|========= set-many-in-order:eqv: ==| ==|= ==|===== set-same:eqv: ==| ==| ==|= set-in-empty:equal: ==| ==|== ==|=== set-many:equal: ==| ==|== ==|===== set-many-in-order:equal: ==| ==|= ==|=== set-same:equal: ==| ==|= ==|=== ref:eq#t: ==| ==| ==|= ref-fail:eq#t: ==| ==| == ref:eq: ==| ==| ==|= ref-fail:eq: ==| ==| == ref:eqv: ==| ==| ==|==== ref-fail:eqv: ==| ==| ==| ref:equal: ==| ==| ==|=== ref-large:equal: ==| ==| == ref-fail:equal: ==| ==| ==|=== ref-large-fail:equal: ==| ==| == removes:eq#t: ==| ==|=== ==|=========== add+remove:eq#t: ==| ==|= ==|======= removes:eq: ==| ==|==== ==|============ add+remove:eq: ==| ==|= ==|======= removes:eqv: ==| ==|=== ==|============= add+remove:eqv: ==| ==| ==|======== removes:equal: ==| ==|== ==|======= add+remove:equal: ==| ==|= ==|====== iterate-keys:eq: ==| ==| ==|= iterate-vals:eq#t: ==| ==|= ==|= iterate-vals:eq: ==| ==|= ==|= iterate-unsafe-keys:eq: ==| ==| ==|======= iterate-unsafe-vals:eq#t: ==| ==| ==| iterate-unsafe-vals:eq: ==| ==|= ==| for-each:eq: ==| ==| ==|========== subset-lil-shared:eq: ==| ==| ==|= subset-lil-unshared:eq: ==| ==| ==|== subset-lil-not:eq: ==| == == subset-med+lil-shared:eq: ==| ==|==== ==|= subset-med+med-shared:eq: ==| ==|= ==|= subset-big-same:eq: ==| ==| ==|=============== subset-big+lil-shared:eq: ==| ==|=== ==|==== subset-big+med-shared:eq: ==| ==|== ==|=== subset-big-unshared:eq: ==| ==| ==|==
This commit is contained in:
parent
b8398f796c
commit
33f8173970
|
@ -12,7 +12,7 @@
|
||||||
|
|
||||||
(define collection 'multi)
|
(define collection 'multi)
|
||||||
|
|
||||||
(define version "7.6.0.1")
|
(define version "7.6.0.2")
|
||||||
|
|
||||||
(define deps `("racket-lib"
|
(define deps `("racket-lib"
|
||||||
["racket" #:version ,version]))
|
["racket" #:version ,version]))
|
||||||
|
|
|
@ -19,7 +19,7 @@
|
||||||
(loop (hash-set ht KEY (MAKE-VAL 'true))
|
(loop (hash-set ht KEY (MAKE-VAL 'true))
|
||||||
(sub1 i)))))
|
(sub1 i)))))
|
||||||
|
|
||||||
'set-many
|
'set-many-in-order
|
||||||
(times
|
(times
|
||||||
(for ([i (in-range Q)])
|
(for ([i (in-range Q)])
|
||||||
(let loop ([ht EMPTY] [i K])
|
(let loop ([ht EMPTY] [i K])
|
||||||
|
@ -28,7 +28,7 @@
|
||||||
(loop (hash-set ht (MAKE-KEY i) (MAKE-VAL 'true))
|
(loop (hash-set ht (MAKE-KEY i) (MAKE-VAL 'true))
|
||||||
(sub1 i))))))
|
(sub1 i))))))
|
||||||
|
|
||||||
'set-many-in-order
|
'set-many
|
||||||
(times
|
(times
|
||||||
(for ([i (in-range Q)])
|
(for ([i (in-range Q)])
|
||||||
(let loop ([ht EMPTY] [l shuffled])
|
(let loop ([ht EMPTY] [l shuffled])
|
||||||
|
|
|
@ -2,7 +2,7 @@
|
||||||
;; Check to make we're using a build of Chez Scheme
|
;; Check to make we're using a build of Chez Scheme
|
||||||
;; that has all the features we need.
|
;; that has all the features we need.
|
||||||
(define-values (need-maj need-min need-sub need-dev)
|
(define-values (need-maj need-min need-sub need-dev)
|
||||||
(values 9 5 3 10))
|
(values 9 5 3 11))
|
||||||
|
|
||||||
(unless (guard (x [else #f]) (eval 'scheme-fork-version-number))
|
(unless (guard (x [else #f]) (eval 'scheme-fork-version-number))
|
||||||
(error 'compile-file
|
(error 'compile-file
|
||||||
|
|
|
@ -174,8 +174,9 @@
|
||||||
[val-i (fx- i child-count)]) ; same as key index
|
[val-i (fx- i child-count)]) ; same as key index
|
||||||
(bnode-val-local-index-ref n child-count key-count val-i))]
|
(bnode-val-local-index-ref n child-count key-count val-i))]
|
||||||
[else
|
[else
|
||||||
;; Complicated case: we have to figure out how many
|
;; Complicated case that we expect to be rare: figure out how many
|
||||||
;; previous keys have values
|
;; previous keys have values, since we don't know how the key/value
|
||||||
|
;; index maps to a key/value bit
|
||||||
(let* ([child-count (hamt-mask->child-count mask)]
|
(let* ([child-count (hamt-mask->child-count mask)]
|
||||||
[key-count (hamt-mask->key-count mask)]
|
[key-count (hamt-mask->key-count mask)]
|
||||||
[key-i (fx- i child-count)])
|
[key-i (fx- i child-count)])
|
||||||
|
@ -561,7 +562,10 @@
|
||||||
|
|
||||||
;; ----------------------------------------
|
;; ----------------------------------------
|
||||||
;; unsafe iteration; position is a stack
|
;; unsafe iteration; position is a stack
|
||||||
;; represented by a list of (cons node index)
|
;; of the form
|
||||||
|
;; - '()
|
||||||
|
;; - (cons indent (cons node stack))
|
||||||
|
;; - (cons (box assoc-list) stack)
|
||||||
|
|
||||||
(define (unsafe-intmap-iterate-first h)
|
(define (unsafe-intmap-iterate-first h)
|
||||||
(and (not (intmap-empty? h))
|
(and (not (intmap-empty? h))
|
||||||
|
@ -573,12 +577,12 @@
|
||||||
(let ([mask (stencil-vector-mask n)])
|
(let ([mask (stencil-vector-mask n)])
|
||||||
(let ([child-count (hamt-mask->child-count mask)]
|
(let ([child-count (hamt-mask->child-count mask)]
|
||||||
[key-count (hamt-mask->key-count mask)])
|
[key-count (hamt-mask->key-count mask)])
|
||||||
(let ([stack (cons (cons n (fx+ key-count child-count -1)) stack)])
|
(let ([stack (cons (fx+ key-count child-count -1) (cons n stack))])
|
||||||
(if (fx= key-count 0)
|
(if (fx= key-count 0)
|
||||||
(unsafe-node-iterate-first (bnode-child-index-ref n (fx- child-count 1)) stack)
|
(unsafe-node-iterate-first (bnode-child-index-ref n (fx- child-count 1)) stack)
|
||||||
stack))))]
|
stack))))]
|
||||||
[(cnode? n)
|
[(cnode? n)
|
||||||
(cons (box (cnode-content n))
|
(cons (#%box (cnode-content n))
|
||||||
stack)]))
|
stack)]))
|
||||||
|
|
||||||
(define (unsafe-intmap-iterate-next h pos)
|
(define (unsafe-intmap-iterate-next h pos)
|
||||||
|
@ -590,32 +594,31 @@
|
||||||
;; Stack is empty, so we're done
|
;; Stack is empty, so we're done
|
||||||
#f]
|
#f]
|
||||||
[else
|
[else
|
||||||
(let ([p (car pos)]
|
(let ([i (car pos)]
|
||||||
[stack (cdr pos)])
|
[stack (cdr pos)])
|
||||||
(cond
|
(cond
|
||||||
[(box? p)
|
[(fixnum? i)
|
||||||
;; in a cnode
|
(let ([n (car stack)])
|
||||||
(let ([new-p (cdr (unbox p))])
|
|
||||||
(if (null? new-p)
|
|
||||||
;; Exhausted this node, so return to parent node
|
|
||||||
(unsafe-node-iterate-next stack)
|
|
||||||
;; still in cnode:
|
|
||||||
(cons (box new-p) stack)))]
|
|
||||||
[else
|
|
||||||
(let ([n (car p)]
|
|
||||||
[i (cdr p)])
|
|
||||||
(cond
|
(cond
|
||||||
[(fx= 0 i)
|
[(fx= 0 i)
|
||||||
;; Exhausted this node, so return to parent node
|
;; Exhausted this node, so return to parent node
|
||||||
(unsafe-node-iterate-next stack)]
|
(unsafe-node-iterate-next (cdr stack))]
|
||||||
[else
|
[else
|
||||||
;; Move to next (lower) index in the current node
|
;; Move to next (lower) index in the current node
|
||||||
(let ([i (fx1- i)])
|
(let ([i (fx1- i)])
|
||||||
(let ([child-count (hamt-mask->child-count (stencil-vector-mask n))]
|
(let ([child-count (hamt-mask->child-count (stencil-vector-mask n))]
|
||||||
[stack (cons (cons n i) stack)])
|
[stack (cons i stack)])
|
||||||
(if (fx< i child-count)
|
(if (fx< i child-count)
|
||||||
(unsafe-node-iterate-first (bnode-child-index-ref n i) stack)
|
(unsafe-node-iterate-first (bnode-child-index-ref n i) stack)
|
||||||
stack)))]))]))]))
|
stack)))]))]
|
||||||
|
[else
|
||||||
|
;; in a cnode
|
||||||
|
(let ([new-p (cdr (#%unbox i))])
|
||||||
|
(if (null? new-p)
|
||||||
|
;; Exhausted this node, so return to parent node
|
||||||
|
(unsafe-node-iterate-next stack)
|
||||||
|
;; still in cnode:
|
||||||
|
(cons (#%box new-p) stack)))]))]))
|
||||||
|
|
||||||
(define (unsafe-intmap-iterate-key h pos)
|
(define (unsafe-intmap-iterate-key h pos)
|
||||||
(eqtype-dispatch
|
(eqtype-dispatch
|
||||||
|
@ -924,84 +927,83 @@
|
||||||
|
|
||||||
(define (bnode-entry-at-position n pos mode fail)
|
(define (bnode-entry-at-position n pos mode fail)
|
||||||
(let* ([mask (stencil-vector-mask n)]
|
(let* ([mask (stencil-vector-mask n)]
|
||||||
[child-count (hamt-mask->child-count mask)])
|
[child-count (hamt-mask->child-count mask)]
|
||||||
(let loop ([i 0] [pos pos])
|
[key-count (hamt-mask->key-count mask)])
|
||||||
(cond
|
(cond
|
||||||
[(fx= i child-count)
|
[(fx< pos key-count)
|
||||||
(let ([key-count (hamt-mask->key-count mask)])
|
(let ([get-key (lambda () (hamt-unwrap-key (bnode-key-index-ref n (fx+ pos child-count))))]
|
||||||
(cond
|
[get-value (lambda () (bnode-val-index-ref n (fx+ pos child-count)))])
|
||||||
[(fx< pos key-count)
|
(case mode
|
||||||
(let ([get-key (lambda () (hamt-unwrap-key (bnode-key-index-ref n (fx+ pos child-count))))]
|
[(key) (get-key)]
|
||||||
[get-value (lambda () (bnode-val-index-ref n (fx+ pos child-count)))])
|
[(val) (get-value)]
|
||||||
(case mode
|
[(both) (values (get-key) (get-value))]
|
||||||
[(key) (get-key)]
|
[else (cons (get-key) (get-value))]))]
|
||||||
[(val) (get-value)]
|
[else
|
||||||
[(both) (values (get-key) (get-value))]
|
(let loop ([i 0] [pos (fx- pos key-count)])
|
||||||
[else (cons (get-key) (get-value))]))]
|
(cond
|
||||||
[else fail]))]
|
[(fx= i child-count)
|
||||||
[else
|
fail]
|
||||||
(let ([c (bnode-child-index-ref n i)])
|
[else
|
||||||
(cond
|
(let ([c (bnode-child-index-ref n i)])
|
||||||
[(bnode? c)
|
(cond
|
||||||
(let ([sz (hamt-count c)])
|
[(bnode? c)
|
||||||
(if (fx>= pos sz)
|
(let ([sz (hamt-count c)])
|
||||||
(loop (fx+ i 1) (fx- pos sz))
|
(if (fx>= pos sz)
|
||||||
(bnode-entry-at-position c pos mode fail)))]
|
(loop (fx+ i 1) (fx- pos sz))
|
||||||
[else
|
(bnode-entry-at-position c pos mode fail)))]
|
||||||
(let* ([alist (cnode-content c)]
|
[else
|
||||||
[len (length alist)])
|
(let* ([alist (cnode-content c)]
|
||||||
(if (fx>= pos len)
|
[len (length alist)])
|
||||||
(loop (fx+ i 1) (fx- pos len))
|
(if (fx>= pos len)
|
||||||
(let ([p (list-ref alist pos)])
|
(loop (fx+ i 1) (fx- pos len))
|
||||||
(case mode
|
(let ([p (list-ref alist pos)])
|
||||||
[(key) (car p)]
|
(case mode
|
||||||
[(val) (cdr p)]
|
[(key) (car p)]
|
||||||
[(both) (values (car p) (cdr p))]
|
[(val) (cdr p)]
|
||||||
[else p]))))]))]))))
|
[(both) (values (car p) (cdr p))]
|
||||||
|
[else p]))))]))]))])))
|
||||||
|
|
||||||
(define (bnode-unsafe-intmap-iterate-key pos)
|
(define (bnode-unsafe-intmap-iterate-key pos)
|
||||||
(let ([p (car pos)])
|
(let ([i (car pos)])
|
||||||
(cond
|
(cond
|
||||||
[(box? p)
|
[(fixnum? i)
|
||||||
;; in a cnode
|
(let ([h (cadr pos)])
|
||||||
(caar (unbox p))]
|
(hamt-unwrap-key (bnode-key-index-ref h i)))]
|
||||||
[else
|
[else
|
||||||
(let ([h (car p)])
|
;; in a cnode
|
||||||
(hamt-unwrap-key (bnode-key-index-ref h (cdr p))))])))
|
(caar (#%unbox i))])))
|
||||||
|
|
||||||
(define (bnode-unsafe-intmap-iterate-value pos)
|
(define (bnode-unsafe-intmap-iterate-value pos)
|
||||||
(let ([p (car pos)])
|
(let ([i (car pos)])
|
||||||
(cond
|
(cond
|
||||||
[(box? p)
|
[(fixnum? i)
|
||||||
;; in a cnode
|
(bnode-val-index-ref (cadr pos) i)]
|
||||||
(cdar (unbox p))]
|
|
||||||
[else
|
[else
|
||||||
(bnode-val-index-ref (car p) (cdr p))])))
|
;; in a cnode
|
||||||
|
(cdar (#%unbox i))])))
|
||||||
|
|
||||||
(define (bnode-unsafe-intmap-iterate-key+value pos)
|
(define (bnode-unsafe-intmap-iterate-key+value pos)
|
||||||
(let ([p (car pos)])
|
(let ([i (car pos)])
|
||||||
(cond
|
(cond
|
||||||
[(box? p)
|
[(fixnum? i)
|
||||||
;; in a cnode
|
(let ([n (cadr pos)])
|
||||||
(let ([pr (car (unbox p))])
|
|
||||||
(values (car pr) (cdr pr)))]
|
|
||||||
[else
|
|
||||||
(let ([n (car p)]
|
|
||||||
[i (cdr p)])
|
|
||||||
(values (hamt-unwrap-key (bnode-key-index-ref n i))
|
(values (hamt-unwrap-key (bnode-key-index-ref n i))
|
||||||
(bnode-val-index-ref n i)))])))
|
(bnode-val-index-ref n i)))]
|
||||||
|
[else
|
||||||
|
;; in a cnode
|
||||||
|
(let ([pr (car (#%unbox i))])
|
||||||
|
(values (car pr) (cdr pr)))])))
|
||||||
|
|
||||||
(define (bnode-unsafe-intmap-iterate-pair pos)
|
(define (bnode-unsafe-intmap-iterate-pair pos)
|
||||||
(let ([p (car pos)])
|
(let ([i (car pos)])
|
||||||
(cond
|
(cond
|
||||||
[(box? p)
|
[(fixnum? i)
|
||||||
;; in a cnode
|
(let ([n (cadr pos)])
|
||||||
(car (unbox p))]
|
|
||||||
[else
|
|
||||||
(let ([n (car p)]
|
|
||||||
[i (cdr p)])
|
|
||||||
(cons (hamt-unwrap-key (bnode-key-index-ref n i))
|
(cons (hamt-unwrap-key (bnode-key-index-ref n i))
|
||||||
(bnode-val-index-ref n i)))])))
|
(bnode-val-index-ref n i)))]
|
||||||
|
[else
|
||||||
|
;; in a cnode
|
||||||
|
(car (#%unbox i))])))
|
||||||
|
|
||||||
(define (bnode=? a b eql? shift)
|
(define (bnode=? a b eql? shift)
|
||||||
(or
|
(or
|
||||||
|
|
|
@ -1,21 +1,23 @@
|
||||||
|
|
||||||
;; We have several implementations of immutable hash tables. Pick one...
|
;; We have several implementations of immutable hash tables. Pick one...
|
||||||
|
|
||||||
(include "rumble/patricia.ss")
|
;; (include "rumble/patricia.ss")
|
||||||
;;
|
;;
|
||||||
;; This Patricia-trie implementation is the prettiest and fastest. It
|
;; This Patricia-trie implementation is the prettiest and fastest. It
|
||||||
;; uses the most memory, though --- typically much more than the
|
;; uses the most memory, though --- typically much more than the
|
||||||
;; vector-stencil HAMT.
|
;; vector-stencil HAMT.
|
||||||
|
|
||||||
;; (include "rumble/hamt-stencil.ss")
|
(include "rumble/hamt-stencil.ss")
|
||||||
;;
|
;;
|
||||||
;; This HAMT implementation using stencil vectors tends to use the
|
;; This HAMT implementation using stencil vectors tends to use the
|
||||||
;; last memory, often by a lot. It's slower than the Patricia-tree
|
;; least memory, often 1/3 the space of the Patricia-trie
|
||||||
;; implementation, though, especially for `hash-keys-subset?`.
|
;; implementation. It's slower than the Patricia-tree implementation
|
||||||
|
;; for some operations, up to a factor of 2 for `hash-set` or
|
||||||
|
;; `hash-keys-subset?`.
|
||||||
|
|
||||||
;; (include "rumble/hamt-vector.ss")
|
;; (include "rumble/hamt-vector.ss")
|
||||||
;;
|
;;
|
||||||
;; This HAMT implementaiton uses plain vectors instead of stencil
|
;; This HAMT implementaiton uses plain vectors instead of stencil
|
||||||
;; vectors. Its speed and memory use are intermediate, but its speed
|
;; vectors. Its speed and memory use are both worse than the
|
||||||
;; is closer to the stencil-vector HAMT implementation, and memory use
|
;; stencil-vector HAMT implementation, but it was the original source
|
||||||
;; is closer to the Patrica trie implementation.
|
;; of the stencil-vector implementation.
|
||||||
|
|
|
@ -16,7 +16,7 @@
|
||||||
#define MZSCHEME_VERSION_X 7
|
#define MZSCHEME_VERSION_X 7
|
||||||
#define MZSCHEME_VERSION_Y 6
|
#define MZSCHEME_VERSION_Y 6
|
||||||
#define MZSCHEME_VERSION_Z 0
|
#define MZSCHEME_VERSION_Z 0
|
||||||
#define MZSCHEME_VERSION_W 1
|
#define MZSCHEME_VERSION_W 2
|
||||||
|
|
||||||
/* A level of indirection makes `#` work as needed: */
|
/* A level of indirection makes `#` work as needed: */
|
||||||
#define AS_a_STR_HELPER(x) #x
|
#define AS_a_STR_HELPER(x) #x
|
||||||
|
|
Loading…
Reference in New Issue
Block a user