cs: switch to stencil-vector HAMT

With recent improvements, the run-time performance of vector-stencil
HAMTs for immutable hash tables seems close enough (on
microbenchmarks) to the Patricia-trie implementation to be worthwhile,
since they use less memory. Performance remains better in most cases
than the traditional Racket implementation.

The table at the end of this message summarizes relative performance
on microbenchmarks. Overall, though, immutable hash-table operations
are already so fast that these difference very rarely translate to
measurable differences in overall run times --- not even for the macro
expander, which relies heavily on immutable hash tables to represent
scope sets.

Stencil-vector HAMTs tend to take about 1/3 the space of Patricia
tries, and those space savings can turn into run-time improvements in
applications by reducing GC time. I've observed a 10% reduction in
compile time for some programs. When building a full Racket
distribution, run time shrinks by about 2 minutes out of 80 minutes,
probbaly because just average memory use goes down by 10%. DrRacket's
initial memory footprint goes down by about 37M out of 657M (a 5%
savings).

Mincrobenchmark relative performance, normalized to previous Racket CS
implementation (measured on 2018 MacBook Pro, 2.7 GHz Core i7; Chez
Scheme can substitute POPCNT instructions at link time):

 patricia = previous Racket CS implementation as a Patricia Trie
 stencil = new Racket CS implementation as a stencil-vector HAMT
 racket = traditional Racket implementation

                           patricia  stencil  racket
       set-in-empty:eq#t:  ==|       ==|      ==|=
           set-many:eq#t:  ==|       ==|==    ==|========
  set-many-in-order:eq#t:  ==|       ==|      ==|====
           set-same:eq#t:  ==|       ==       ==|=
         set-in-empty:eq:  ==|       ==       ==|=
             set-many:eq:  ==|       ==|==    ==|========
    set-many-in-order:eq:  ==|       ==|=     ==|=====
             set-same:eq:  ==|       ==       ==|=
        set-in-empty:eqv:  ==|       ==|      ==|==
            set-many:eqv:  ==|       ==|==    ==|=========
   set-many-in-order:eqv:  ==|       ==|=     ==|=====
            set-same:eqv:  ==|       ==|      ==|=
      set-in-empty:equal:  ==|       ==|==    ==|===
          set-many:equal:  ==|       ==|==    ==|=====
 set-many-in-order:equal:  ==|       ==|=     ==|===
          set-same:equal:  ==|       ==|=     ==|===
                ref:eq#t:  ==|       ==|      ==|=
           ref-fail:eq#t:  ==|       ==|      ==
                  ref:eq:  ==|       ==|      ==|=
             ref-fail:eq:  ==|       ==|      ==
                 ref:eqv:  ==|       ==|      ==|====
            ref-fail:eqv:  ==|       ==|      ==|
               ref:equal:  ==|       ==|      ==|===
         ref-large:equal:  ==|       ==|      ==
          ref-fail:equal:  ==|       ==|      ==|===
    ref-large-fail:equal:  ==|       ==|      ==
            removes:eq#t:  ==|       ==|===   ==|===========
         add+remove:eq#t:  ==|       ==|=     ==|=======
              removes:eq:  ==|       ==|====  ==|============
           add+remove:eq:  ==|       ==|=     ==|=======
             removes:eqv:  ==|       ==|===   ==|=============
          add+remove:eqv:  ==|       ==|      ==|========
           removes:equal:  ==|       ==|==    ==|=======
        add+remove:equal:  ==|       ==|=     ==|======
         iterate-keys:eq:  ==|       ==|      ==|=
       iterate-vals:eq#t:  ==|       ==|=     ==|=
         iterate-vals:eq:  ==|       ==|=     ==|=
  iterate-unsafe-keys:eq:  ==|       ==|      ==|=======
iterate-unsafe-vals:eq#t:  ==|       ==|      ==|
  iterate-unsafe-vals:eq:  ==|       ==|=     ==|
             for-each:eq:  ==|       ==|      ==|==========
    subset-lil-shared:eq:  ==|       ==|      ==|=
  subset-lil-unshared:eq:  ==|       ==|      ==|==
       subset-lil-not:eq:  ==|       ==       ==
subset-med+lil-shared:eq:  ==|       ==|====  ==|=
subset-med+med-shared:eq:  ==|       ==|=     ==|=
      subset-big-same:eq:  ==|       ==|      ==|===============
subset-big+lil-shared:eq:  ==|       ==|===   ==|====
subset-big+med-shared:eq:  ==|       ==|==    ==|===
  subset-big-unshared:eq:  ==|       ==|      ==|==
This commit is contained in:
Matthew Flatt 2020-01-11 10:44:09 -07:00
parent b8398f796c
commit 33f8173970
6 changed files with 98 additions and 94 deletions

View File

@ -12,7 +12,7 @@
(define collection 'multi) (define collection 'multi)
(define version "7.6.0.1") (define version "7.6.0.2")
(define deps `("racket-lib" (define deps `("racket-lib"
["racket" #:version ,version])) ["racket" #:version ,version]))

View File

@ -19,7 +19,7 @@
(loop (hash-set ht KEY (MAKE-VAL 'true)) (loop (hash-set ht KEY (MAKE-VAL 'true))
(sub1 i))))) (sub1 i)))))
'set-many 'set-many-in-order
(times (times
(for ([i (in-range Q)]) (for ([i (in-range Q)])
(let loop ([ht EMPTY] [i K]) (let loop ([ht EMPTY] [i K])
@ -28,7 +28,7 @@
(loop (hash-set ht (MAKE-KEY i) (MAKE-VAL 'true)) (loop (hash-set ht (MAKE-KEY i) (MAKE-VAL 'true))
(sub1 i)))))) (sub1 i))))))
'set-many-in-order 'set-many
(times (times
(for ([i (in-range Q)]) (for ([i (in-range Q)])
(let loop ([ht EMPTY] [l shuffled]) (let loop ([ht EMPTY] [l shuffled])

View File

@ -2,7 +2,7 @@
;; Check to make we're using a build of Chez Scheme ;; Check to make we're using a build of Chez Scheme
;; that has all the features we need. ;; that has all the features we need.
(define-values (need-maj need-min need-sub need-dev) (define-values (need-maj need-min need-sub need-dev)
(values 9 5 3 10)) (values 9 5 3 11))
(unless (guard (x [else #f]) (eval 'scheme-fork-version-number)) (unless (guard (x [else #f]) (eval 'scheme-fork-version-number))
(error 'compile-file (error 'compile-file

View File

@ -174,8 +174,9 @@
[val-i (fx- i child-count)]) ; same as key index [val-i (fx- i child-count)]) ; same as key index
(bnode-val-local-index-ref n child-count key-count val-i))] (bnode-val-local-index-ref n child-count key-count val-i))]
[else [else
;; Complicated case: we have to figure out how many ;; Complicated case that we expect to be rare: figure out how many
;; previous keys have values ;; previous keys have values, since we don't know how the key/value
;; index maps to a key/value bit
(let* ([child-count (hamt-mask->child-count mask)] (let* ([child-count (hamt-mask->child-count mask)]
[key-count (hamt-mask->key-count mask)] [key-count (hamt-mask->key-count mask)]
[key-i (fx- i child-count)]) [key-i (fx- i child-count)])
@ -561,7 +562,10 @@
;; ---------------------------------------- ;; ----------------------------------------
;; unsafe iteration; position is a stack ;; unsafe iteration; position is a stack
;; represented by a list of (cons node index) ;; of the form
;; - '()
;; - (cons indent (cons node stack))
;; - (cons (box assoc-list) stack)
(define (unsafe-intmap-iterate-first h) (define (unsafe-intmap-iterate-first h)
(and (not (intmap-empty? h)) (and (not (intmap-empty? h))
@ -573,12 +577,12 @@
(let ([mask (stencil-vector-mask n)]) (let ([mask (stencil-vector-mask n)])
(let ([child-count (hamt-mask->child-count mask)] (let ([child-count (hamt-mask->child-count mask)]
[key-count (hamt-mask->key-count mask)]) [key-count (hamt-mask->key-count mask)])
(let ([stack (cons (cons n (fx+ key-count child-count -1)) stack)]) (let ([stack (cons (fx+ key-count child-count -1) (cons n stack))])
(if (fx= key-count 0) (if (fx= key-count 0)
(unsafe-node-iterate-first (bnode-child-index-ref n (fx- child-count 1)) stack) (unsafe-node-iterate-first (bnode-child-index-ref n (fx- child-count 1)) stack)
stack))))] stack))))]
[(cnode? n) [(cnode? n)
(cons (box (cnode-content n)) (cons (#%box (cnode-content n))
stack)])) stack)]))
(define (unsafe-intmap-iterate-next h pos) (define (unsafe-intmap-iterate-next h pos)
@ -590,32 +594,31 @@
;; Stack is empty, so we're done ;; Stack is empty, so we're done
#f] #f]
[else [else
(let ([p (car pos)] (let ([i (car pos)]
[stack (cdr pos)]) [stack (cdr pos)])
(cond (cond
[(box? p) [(fixnum? i)
;; in a cnode (let ([n (car stack)])
(let ([new-p (cdr (unbox p))])
(if (null? new-p)
;; Exhausted this node, so return to parent node
(unsafe-node-iterate-next stack)
;; still in cnode:
(cons (box new-p) stack)))]
[else
(let ([n (car p)]
[i (cdr p)])
(cond (cond
[(fx= 0 i) [(fx= 0 i)
;; Exhausted this node, so return to parent node ;; Exhausted this node, so return to parent node
(unsafe-node-iterate-next stack)] (unsafe-node-iterate-next (cdr stack))]
[else [else
;; Move to next (lower) index in the current node ;; Move to next (lower) index in the current node
(let ([i (fx1- i)]) (let ([i (fx1- i)])
(let ([child-count (hamt-mask->child-count (stencil-vector-mask n))] (let ([child-count (hamt-mask->child-count (stencil-vector-mask n))]
[stack (cons (cons n i) stack)]) [stack (cons i stack)])
(if (fx< i child-count) (if (fx< i child-count)
(unsafe-node-iterate-first (bnode-child-index-ref n i) stack) (unsafe-node-iterate-first (bnode-child-index-ref n i) stack)
stack)))]))]))])) stack)))]))]
[else
;; in a cnode
(let ([new-p (cdr (#%unbox i))])
(if (null? new-p)
;; Exhausted this node, so return to parent node
(unsafe-node-iterate-next stack)
;; still in cnode:
(cons (#%box new-p) stack)))]))]))
(define (unsafe-intmap-iterate-key h pos) (define (unsafe-intmap-iterate-key h pos)
(eqtype-dispatch (eqtype-dispatch
@ -924,84 +927,83 @@
(define (bnode-entry-at-position n pos mode fail) (define (bnode-entry-at-position n pos mode fail)
(let* ([mask (stencil-vector-mask n)] (let* ([mask (stencil-vector-mask n)]
[child-count (hamt-mask->child-count mask)]) [child-count (hamt-mask->child-count mask)]
(let loop ([i 0] [pos pos]) [key-count (hamt-mask->key-count mask)])
(cond (cond
[(fx= i child-count) [(fx< pos key-count)
(let ([key-count (hamt-mask->key-count mask)]) (let ([get-key (lambda () (hamt-unwrap-key (bnode-key-index-ref n (fx+ pos child-count))))]
(cond [get-value (lambda () (bnode-val-index-ref n (fx+ pos child-count)))])
[(fx< pos key-count) (case mode
(let ([get-key (lambda () (hamt-unwrap-key (bnode-key-index-ref n (fx+ pos child-count))))] [(key) (get-key)]
[get-value (lambda () (bnode-val-index-ref n (fx+ pos child-count)))]) [(val) (get-value)]
(case mode [(both) (values (get-key) (get-value))]
[(key) (get-key)] [else (cons (get-key) (get-value))]))]
[(val) (get-value)] [else
[(both) (values (get-key) (get-value))] (let loop ([i 0] [pos (fx- pos key-count)])
[else (cons (get-key) (get-value))]))] (cond
[else fail]))] [(fx= i child-count)
[else fail]
(let ([c (bnode-child-index-ref n i)]) [else
(cond (let ([c (bnode-child-index-ref n i)])
[(bnode? c) (cond
(let ([sz (hamt-count c)]) [(bnode? c)
(if (fx>= pos sz) (let ([sz (hamt-count c)])
(loop (fx+ i 1) (fx- pos sz)) (if (fx>= pos sz)
(bnode-entry-at-position c pos mode fail)))] (loop (fx+ i 1) (fx- pos sz))
[else (bnode-entry-at-position c pos mode fail)))]
(let* ([alist (cnode-content c)] [else
[len (length alist)]) (let* ([alist (cnode-content c)]
(if (fx>= pos len) [len (length alist)])
(loop (fx+ i 1) (fx- pos len)) (if (fx>= pos len)
(let ([p (list-ref alist pos)]) (loop (fx+ i 1) (fx- pos len))
(case mode (let ([p (list-ref alist pos)])
[(key) (car p)] (case mode
[(val) (cdr p)] [(key) (car p)]
[(both) (values (car p) (cdr p))] [(val) (cdr p)]
[else p]))))]))])))) [(both) (values (car p) (cdr p))]
[else p]))))]))]))])))
(define (bnode-unsafe-intmap-iterate-key pos) (define (bnode-unsafe-intmap-iterate-key pos)
(let ([p (car pos)]) (let ([i (car pos)])
(cond (cond
[(box? p) [(fixnum? i)
;; in a cnode (let ([h (cadr pos)])
(caar (unbox p))] (hamt-unwrap-key (bnode-key-index-ref h i)))]
[else [else
(let ([h (car p)]) ;; in a cnode
(hamt-unwrap-key (bnode-key-index-ref h (cdr p))))]))) (caar (#%unbox i))])))
(define (bnode-unsafe-intmap-iterate-value pos) (define (bnode-unsafe-intmap-iterate-value pos)
(let ([p (car pos)]) (let ([i (car pos)])
(cond (cond
[(box? p) [(fixnum? i)
;; in a cnode (bnode-val-index-ref (cadr pos) i)]
(cdar (unbox p))]
[else [else
(bnode-val-index-ref (car p) (cdr p))]))) ;; in a cnode
(cdar (#%unbox i))])))
(define (bnode-unsafe-intmap-iterate-key+value pos) (define (bnode-unsafe-intmap-iterate-key+value pos)
(let ([p (car pos)]) (let ([i (car pos)])
(cond (cond
[(box? p) [(fixnum? i)
;; in a cnode (let ([n (cadr pos)])
(let ([pr (car (unbox p))])
(values (car pr) (cdr pr)))]
[else
(let ([n (car p)]
[i (cdr p)])
(values (hamt-unwrap-key (bnode-key-index-ref n i)) (values (hamt-unwrap-key (bnode-key-index-ref n i))
(bnode-val-index-ref n i)))]))) (bnode-val-index-ref n i)))]
[else
;; in a cnode
(let ([pr (car (#%unbox i))])
(values (car pr) (cdr pr)))])))
(define (bnode-unsafe-intmap-iterate-pair pos) (define (bnode-unsafe-intmap-iterate-pair pos)
(let ([p (car pos)]) (let ([i (car pos)])
(cond (cond
[(box? p) [(fixnum? i)
;; in a cnode (let ([n (cadr pos)])
(car (unbox p))]
[else
(let ([n (car p)]
[i (cdr p)])
(cons (hamt-unwrap-key (bnode-key-index-ref n i)) (cons (hamt-unwrap-key (bnode-key-index-ref n i))
(bnode-val-index-ref n i)))]))) (bnode-val-index-ref n i)))]
[else
;; in a cnode
(car (#%unbox i))])))
(define (bnode=? a b eql? shift) (define (bnode=? a b eql? shift)
(or (or

View File

@ -1,21 +1,23 @@
;; We have several implementations of immutable hash tables. Pick one... ;; We have several implementations of immutable hash tables. Pick one...
(include "rumble/patricia.ss") ;; (include "rumble/patricia.ss")
;; ;;
;; This Patricia-trie implementation is the prettiest and fastest. It ;; This Patricia-trie implementation is the prettiest and fastest. It
;; uses the most memory, though --- typically much more than the ;; uses the most memory, though --- typically much more than the
;; vector-stencil HAMT. ;; vector-stencil HAMT.
;; (include "rumble/hamt-stencil.ss") (include "rumble/hamt-stencil.ss")
;; ;;
;; This HAMT implementation using stencil vectors tends to use the ;; This HAMT implementation using stencil vectors tends to use the
;; last memory, often by a lot. It's slower than the Patricia-tree ;; least memory, often 1/3 the space of the Patricia-trie
;; implementation, though, especially for `hash-keys-subset?`. ;; implementation. It's slower than the Patricia-tree implementation
;; for some operations, up to a factor of 2 for `hash-set` or
;; `hash-keys-subset?`.
;; (include "rumble/hamt-vector.ss") ;; (include "rumble/hamt-vector.ss")
;; ;;
;; This HAMT implementaiton uses plain vectors instead of stencil ;; This HAMT implementaiton uses plain vectors instead of stencil
;; vectors. Its speed and memory use are intermediate, but its speed ;; vectors. Its speed and memory use are both worse than the
;; is closer to the stencil-vector HAMT implementation, and memory use ;; stencil-vector HAMT implementation, but it was the original source
;; is closer to the Patrica trie implementation. ;; of the stencil-vector implementation.

View File

@ -16,7 +16,7 @@
#define MZSCHEME_VERSION_X 7 #define MZSCHEME_VERSION_X 7
#define MZSCHEME_VERSION_Y 6 #define MZSCHEME_VERSION_Y 6
#define MZSCHEME_VERSION_Z 0 #define MZSCHEME_VERSION_Z 0
#define MZSCHEME_VERSION_W 1 #define MZSCHEME_VERSION_W 2
/* A level of indirection makes `#` work as needed: */ /* A level of indirection makes `#` work as needed: */
#define AS_a_STR_HELPER(x) #x #define AS_a_STR_HELPER(x) #x