correct some hash-table locking cases

BC did not take a lock for `hash-clear!`. Maybe the intent was that a
lock isn't needed from the perspective of `hash-clear!` when it isn't
traversing the table, but failing to take a lock before modifying the
table can break other operations (that do take a lock) in progress.
Furthermore, some iterations that intentionally did not take a lock
also did not guard against changes well enough. Various repairs here
avoid crashes, and now `hash-clear!` consistently takes a lock.

CS had a similar (but more subtle and more limited) instance of the
problem in `hash-map` and `hash-for-each`. That problem is corrected
(without adding any new lock acquisitions).

The existing CS implementation of `equal-hash-code` for hash tables
takes a lock more than the BC implementation. A possibly surprising
result: when attempting to add a mutable hash table to itself as part
of a key, the `hash-set!` can block forever on the hash table's lock,
instead of doing something more random as a result of having a mutated
key. The documentation now notes that possibility.

Related to #3738
This commit is contained in:
Matthew Flatt 2021-03-21 12:39:12 -06:00
parent bcd16840da
commit ef0ff679d0
7 changed files with 142 additions and 37 deletions

View File

@ -3,11 +3,6 @@
@title[#:tag "hashtables"]{Hash Tables}
@(define (concurrency-caveat)
@elemref['(caveat "concurrency")]{caveats concerning concurrent modification})
@(define (mutable-key-caveat)
@elemref['(caveat "mutable-keys")]{caveat concerning mutable keys})
@(define (see-also-caveats)
@t{See also the @concurrency-caveat[] and the @mutable-key-caveat[] above.})
@(define (see-also-concurrency-caveat)
@ -67,18 +62,20 @@ when they are @racket[equal?].
modification:}} A mutable hash table can be manipulated with
@racket[hash-ref], @racket[hash-set!], and @racket[hash-remove!]
concurrently by multiple threads, and the operations are protected by
a table-specific semaphore as needed. Three caveats apply, however:
a table-specific semaphore as needed. Several caveats apply, however:
@itemize[
@item{If a thread is terminated while applying @racket[hash-ref],
@racket[hash-ref-key], @racket[hash-set!], @racket[hash-remove!],
@racket[hash-ref!], or @racket[hash-update!] to a hash table that
@racket[hash-ref!], @racket[hash-update!], or @racket[hash-clear!]
to a hash table that
uses @racket[equal?] or @racket[eqv?] key comparisons, all current
and future operations on the hash table may block indefinitely.}
@item{The @racket[hash-map], @racket[hash-for-each], and @racket[hash-clear!] procedures do
not use the table's semaphore to guard the traversal as a whole.
not use the table's semaphore to guard the traversal as a whole
(if a traversal is needed, in the case of @racket[hash-clear!]).
Changes by one thread to a hash table can affect the keys and values
seen by another thread part-way through its traversal of the same
hash table.}
@ -89,6 +86,13 @@ a table-specific semaphore as needed. Three caveats apply, however:
of their functionality, which means that the update as a whole is not
``atomic.''}
@item{Adding a mutable hash table as a key in itself is trouble on
the grounds that the key is being mutated (see the caveat below),
but it is also a kind of concurrent use of the hash table: computing
a hash table's hash code may require waiting on the table's
semaphore, but the semaphore is already held for modifying the hash
table, so the hash-table addition can block indefinitely.}
]
@elemtag['(caveat "mutable-keys")]{@bold{Caveat concerning mutable

View File

@ -175,3 +175,10 @@
(provide envvar-indexed)
(define (envvar-indexed s)
(as-index (envvar s)))
(provide concurrency-caveat
mutable-key-caveat)
@(define (concurrency-caveat)
@elemref['(caveat "concurrency")]{caveats concerning concurrent modification})
@(define (mutable-key-caveat)
@elemref['(caveat "mutable-keys")]{caveat concerning mutable keys})

View File

@ -5,6 +5,11 @@
@(define set-eval (make-base-eval))
@examples[#:hidden #:eval set-eval (require racket/set)]
@(define (hash-set-caveats)
@elem{For @tech{hash sets}, see also the @concurrency-caveat[]
for hash tables, which applies to
hash sets.})
A @deftech{set} represents a collection of distinct elements. The following
datatypes are all sets:
@ -354,7 +359,8 @@ Produces a set that includes @racket[v] plus all elements of
Adds the element @racket[v] to @racket[st]. This operation runs in constant
time for @tech{hash sets}. Has no fallback.
}
@hash-set-caveats[]}
@defproc[(set-remove [st generic-set?] [v any/c]) generic-set?]{
@ -368,7 +374,8 @@ Produces a set that includes all elements of @racket[st] except
Removes the element @racket[v] from @racket[st]. This operation runs in constant
time for @tech{hash sets}. Has no fallback.
}
@hash-set-caveats[]}
@defproc[(set-empty? [st generic-set?]) boolean?]{
@ -462,7 +469,8 @@ Removes all elements from @racket[st].
Supported for any @racket[st] that @impl{implements} @racket[set-remove!] and either
@supp{supports} @racket[set->stream] or @impl{implements} @racket[set-first] and either @racket[set-count] or @racket[set-empty?].
}
@hash-set-caveats[]}
@defproc[(set-union [st0 generic-set?] [st generic-set?] ...) generic-set?]{
@ -506,7 +514,7 @@ total size of the @racket[st]s.
Supported for any @racket[st] that @impl{implements} @racket[set-add!] and @supp{supports} @racket[set->stream].
}
@hash-set-caveats[]}
@defproc[(set-intersect [st0 generic-set?] [st generic-set?] ...) generic-set?]{
@ -541,7 +549,8 @@ size of @racket[st0].
Supported for any @racket[st] that @impl{implements} @racket[set-remove!] and @supp{supports} @racket[set->stream].
}
@hash-set-caveats[]}
@defproc[(set-subtract [st0 generic-set?] [st generic-set?] ...) generic-set?]{
@ -576,7 +585,8 @@ size of @racket[st0].
Supported for any @racket[st] that @impl{implements} @racket[set-remove!] and @supp{supports} @racket[set->stream].
}
@hash-set-caveats[]}
@defproc[(set-symmetric-difference [st0 generic-set?] [st generic-set?] ...) generic-set?]{
@ -616,7 +626,8 @@ total size of the @racket[st]s.
Supported for any @racket[st] that @impl{implements} @racket[set-remove!] and @supp{supports} @racket[set->stream].
}
@hash-set-caveats[]}
@defproc[(set=? [st generic-set?] [st2 generic-set?]) boolean?]{

View File

@ -704,4 +704,60 @@
;; ----------------------------------------
(for ([make-hash (in-list (list make-hash make-weak-hash make-ephemeron-hash))]
[hash-clear! (in-list (list hash-clear!
(lambda (ht)
(hash-for-each ht (lambda (k v) (hash-remove! ht k))))))]
[op (in-list (list
(lambda (ht ht2) (hash-set! ht ht #t))
(lambda (ht ht2) (equal? ht ht2))
(lambda (ht ht2) (equal-hash-code ht))
(lambda (ht ht2) (equal-secondary-hash-code ht))
(lambda (ht ht2) (hash-map ht (lambda (k v) (hash-clear! ht) k)))
(lambda (ht ht2) (hash-for-each ht (lambda (k v) (hash-clear! ht) k)))))])
(define amok? #f)
(define ht (make-hash))
(define ht2 (make-hash))
(struct a (x)
#:property prop:equal+hash (list (lambda (a1 a2 eql?)
(when amok?
(hash-clear! ht))
(eql? (a-x a1) (a-x a2)))
(lambda (a1 hc)
(when amok?
(hash-clear! ht))
(a-x a1))
(lambda (a2 hc)
(when amok?
(hash-clear! ht))
(a-x a2))))
(define saved null)
(define (save v)
(set! saved (cons v saved))
v)
(for ([i (in-range 1000)])
(hash-set! ht (save (a i)) #t)
(hash-set! ht2 (save (a i)) #t))
(set! amok? #t)
;; This operation can get stuck or raise an exception,
;; but it should not crash
(let* ([fail? #f]
[t (thread
(lambda ()
(with-handlers ([exn:fail:contract? void]
[exn:fail? (lambda (x)
(set! fail? #t)
(raise x))])
(op ht ht2))))])
(sync (system-idle-evt))
(test #f `(no-crash? ,op) fail?)))
;; ----------------------------------------
(report-errs)

View File

@ -647,6 +647,9 @@ int scheme_hash_table_equal_rec(Scheme_Hash_Table *t1, Scheme_Object *orig_t1,
return 0;
if (!scheme_recur_equal(val1, val2, eql))
return 0;
/* since we didn't take a lock, the size could have changed */
if (i > t1->size) i = t1->size;
}
}
@ -1823,6 +1826,8 @@ static uintptr_t equal_hash_key(Scheme_Object *o, uintptr_t k, Hash_Info *hi)
MZ_MIX(vk);
k += vk; /* can't mix k, because the key order shouldn't matter */
hi->depth = old_depth; /* also needed to avoid order-sensitivity */
/* since we didn't take a lock, the size could have changed */
if (i > ht->size) i = ht->size;
}
}
@ -1904,6 +1909,8 @@ static uintptr_t equal_hash_key(Scheme_Object *o, uintptr_t k, Hash_Info *hi)
MZ_MIX(vk);
k += vk; /* can't mix k, because the key order shouldn't matter */
hi->depth = old_depth; /* also needed to avoid order-sensitivity */
/* since we didn't take a lock, the size could have changed */
if (i > ht->size) i = ht->size;
}
}
}
@ -2313,6 +2320,8 @@ static uintptr_t equal_hash_key2(Scheme_Object *o, Hash_Info *hi)
k += equal_hash_key2(key, hi);
k += equal_hash_key2(val, hi);
hi->depth = old_depth;
/* since we didn't take a lock, the size could have changed */
if (i > ht->size) i = ht->size;
}
}
@ -2392,6 +2401,8 @@ static uintptr_t equal_hash_key2(Scheme_Object *o, Hash_Info *hi)
k += equal_hash_key2(key, hi);
hi->depth = old_depth;
}
/* since we didn't take a lock, the size could have changed */
if (i > ht->size) i = ht->size;
}
}

View File

@ -2949,9 +2949,15 @@ static Scheme_Object *hash_table_clear_bang(int argc, Scheme_Object *argv[])
}
if (SCHEME_BUCKTP(v)) {
scheme_clear_bucket_table((Scheme_Bucket_Table *)v);
Scheme_Bucket_Table *t = (Scheme_Bucket_Table *)v;
if (t->mutex) scheme_wait_sema(t->mutex, 0);
scheme_clear_bucket_table(t);
if (t->mutex) scheme_post_sema(t->mutex);
} else{
scheme_clear_hash_table((Scheme_Hash_Table *)v);
Scheme_Hash_Table *t = (Scheme_Hash_Table *)v;
if (t->mutex) scheme_wait_sema(t->mutex, 0);
scheme_clear_hash_table(t);
if (t->mutex) scheme_post_sema(t->mutex);
}
return scheme_void;
@ -3100,6 +3106,8 @@ static Scheme_Object *do_map_hash_table(int argc,
} else
_scheme_apply_multi(f, 2, p);
}
/* since we didn't take a lock, the size could have changed */
if (i > hash->size) i = hash->size;
}
}
} else if (SCHEME_HASHTP(obj)) {
@ -3132,6 +3140,8 @@ static Scheme_Object *do_map_hash_table(int argc,
} else
_scheme_apply_multi(f, 2, p);
}
/* since we didn't take a lock, the size could have changed */
if (i > hash->size) i = hash->size;
}
}
} else {

View File

@ -470,11 +470,17 @@
;; of calling `hash-iterate-...` for each step
(let vec-loop ([old-n 0] [try? #t])
(let ([vec (prepare-iterate! ht old-n)])
(cond
[(fx>= old-n (#%vector-length vec))
;; If `old-n` is not zero, the hash table changed while we
;; iterated, which is possible since we haven't taken a lock
(if map? '() (void))]
[else
(let loop ([i old-n])
(cond
[(= i (#%vector-length vec))
[(fx= i (#%vector-length vec))
(if try?
(vec-loop i (> i old-n))
(vec-loop i (fx> i old-n))
(if map? '() (void)))]
[else
(let ([p (#%vector-ref vec i)])
@ -489,7 +495,7 @@
(loop (fx+ i 1)))]
[else
(|#%app| proc key val)
(loop (fx+ i 1))])))])))))
(loop (fx+ i 1))])))]))]))))
;; In sorted hash-table travesals, make some effort to sort the key.
;; This attempt is useful for making hash-table traversals more