Clojure

Transient hashmaps mishandle hash collisions

Details

  • Type: Defect
  • Status: Closed
  • Priority: Major
  • Resolution: Completed
  • Affects Version/s: Release 1.3
  • Fix Version/s: Release 1.4
  • Component/s: None
  • Labels: None
  • Patch: Code and Test
  • Approval: Ok

Description

Clojure 1.2.1 and 1.3.0-beta1 both exhibit the following behavior:

user> (let [m (into {} (for [x (range 100000)] [(rand) (rand)]))]
        (println (count (distinct (map hash (keys m)))))
        ((juxt count identity) (persistent!
                                 (reduce dissoc! (transient m) (keys m)))))
99999
[2 {0.42548900739367024 0.8725039567983159}]

We create a large map with random keys and values and check how many unique hash codes its keys produce. Then we iterate over all the keys, dissoc!-ing each one out of a transient version of the map. The resulting map has one element in it (wrong: it should be empty, since we dissoc!-ed every key) and reports its count as two (also wrong, though it is not clear whether it should be zero or one given the other breakage). As far as I can tell, each duplicated hash value is represented once in the output map, and the map's count is the number of keys that hashed to a duplicated value.

The problem seems to be restricted to transients: if we remove the transient/persistent! pair and use dissoc instead of dissoc!, the resulting map is always empty, as sketched below.
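
For contrast, here is a minimal sketch of that persistent-only variant (illustrative, not taken verbatim from the report):

;; Same reduction without the transient/persistent! pair: removing every
;; key with plain dissoc reliably yields the empty map.
(let [m (into {} (for [x (range 100000)] [(rand) (rand)]))]
  ((juxt count identity) (reduce dissoc m (keys m))))
;;=> [0 {}]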

Inspired by discussion at http://groups.google.com/group/clojure/browse_thread/thread/313ac122667bb4b5/c3e7faa8635403f1

Attachments

  1. clj-829-2.diff
    10/Sep/11 3:48 AM
    2 kB
    Christophe Grand
  2. clj-829.diff
    08/Sep/11 1:15 PM
    2 kB
    Christophe Grand

Activity

Alan Malloy added a comment -

By the way, since this involves randomness, on occasion it doesn't fail. With the input as given it seems to fail around 80% of the time, but if you want to be sure to reproduce you can add another 0 to the input size.
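
One way to take the randomness out of the reproduction is to build the map from keys that are known to collide. A hypothetical sketch along those lines (the key set and map size are made up; "Aa" and "BB" share a Java String hashCode and therefore a Clojure hash):

;; More than 8 entries forces a hash map, and the colliding pair ends up
;; in a hash-collision node, the code path this ticket is about.
(assert (= (hash "Aa") (hash "BB")))

(let [m (zipmap (concat ["Aa" "BB"] (map str (range 20))) (range))]
  ((juxt count identity)
   (persistent! (reduce dissoc! (transient m) (keys m)))))
;; Expected [0 {}]; an affected build (1.2.1 / 1.3.0-beta1) can instead
;; return a non-empty map and/or a wrong count.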

Aaron Bedra added a comment -

Thanks for the update. I was able to reproduce with the extra zero. Moving this ticket to Release.Next so that it will ship with 1.3.

Christophe Grand added a comment -

Sorry for the delay. Here is a fix and a reduced test case.

Stuart Halloway added a comment -

I have checked behavior with an independent test

https://github.com/clojure/test.generative/commit/b1350235eb219b76f5b7c4cc21c8255f567892b3

which looks good, but I don't have the context to evaluate the code (particularly the array allocation) in the time I have available today.
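
For readers without test.generative at hand, a rough plain-Clojure stand-in for that kind of check might look like this (iteration and map sizes are arbitrary):

;; Generative-style smoke test of the property under discussion: building a
;; random map and dissoc!-ing every key from a transient copy must leave {}.
(dotimes [_ 20]
  (let [m (into {} (for [_ (range 10000)] [(rand) (rand)]))]
    (assert (empty? (persistent! (reduce dissoc! (transient m) (keys m)))))))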

Rich Hickey added a comment -

The array alloc looks suspicious:

+ Object[] newArray = new Object[2*(array.length+1)]; // make room for next assoc
+ System.arraycopy(array, 0, newArray, 0, 2*count);

Should it not be array.length + 2 (or 2*(count + 1), whichever)?
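
For context, a sketch of the copy in question, written here as Clojure interop rather than the Java in the patch (illustration only; cnt stands for the node's live key/value pair count):

;; A collision node keeps its cnt key/value pairs flat in an Object array,
;; so slots 0 .. (2*cnt - 1) are live. Sizing the copy as 2*(cnt + 1)
;; reserves exactly one spare pair for the next assoc, whereas
;; 2*(alength array + 1) grows relative to the array's capacity instead.
(defn copy-with-room-for-next-assoc [^objects array cnt]
  (let [new-array (object-array (* 2 (inc cnt)))]
    (System/arraycopy array 0 new-array 0 (* 2 cnt))
    new-array))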

Christophe Grand added a comment -

Thanks Rich for spotting this copy&paste error.

I attached an updated patch.

The problem reported by Alan was twofold:

  • the misplaced removedLeaf.val = removedLeaf was causing the global count to be incorrect
  • the missing array copy in ensureEditable was causing the seq returned by (keys m) to be mutated (shortened), which is why not all keys ended up being dissoced (a small check of this invariant follows below).

Mutable code is hard; someone should invent a cool language with sane state management.
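
A small runnable check of the invariant the second bullet describes, reusing the colliding keys from the earlier sketch (hypothetical, not part of the attached patch): edits to a transient must never leak into a seq already obtained from the persistent map.

;; On a fixed build the keys seq keeps all 22 entries no matter what is done
;; to the transient; the missing copy let the collision node's shared array
;; be mutated underneath such a seq, so it could come up short.
(let [m  (zipmap (concat ["Aa" "BB"] (map str (range 20))) (range))
      ks (keys m)]
  (persistent! (reduce dissoc! (transient m) ks))
  (count ks))
;;=> 22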

Stuart Halloway added a comment -

New tests continue to pass.

It seems to me that the allocation in the second path is too big by two in some cases (e.g. it gets called in the dissoc! path when the array needs to be copied but does not need to grow). That might be considered innocuous, though. The same method is called in the assoc! path, where the +2 is needed, so avoiding two objects' worth of allocation would require a more substantial patch.

Christophe Grand added a comment -

The larger array allocation is similar to the one performed in BitmapIndexedNode.ensureEditable. The thinking behind this heuristic is that the copying dominates the allocation, and I would rather allocate one slightly larger array than have to allocate two arrays when the node grows.

Anyway, it only concerns collision nodes, so it is a rare occurrence: I won't argue further if switching to a more conservative allocation helps the patch land in 1.3.

