<< Back to previous view

[CLJ-829] Transient hashmaps mishandle hash collisions Created: 24/Aug/11  Updated: 01/Mar/13  Resolved: 07/Oct/11

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: Release 1.4

Type: Defect Priority: Major
Reporter: Alan Malloy Assignee: Christophe Grand
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-829-2.diff     File clj-829.diff    
Patch: Code and Test
Approval: Ok

 Description   

Clojure 1.2.1 and 1.3.0-beta1 both exhibit the following behavior:

user> (let [m (into {} (for [x (range 100000)] [(rand) (rand)]))]
(println (count (distinct (map hash (keys m)))))
((juxt count identity) (persistent!
(reduce dissoc! (transient m) (keys m)))))
99999
[2 {0.42548900739367024 0.8725039567983159}]

We create a large transient map with random keys and values, and check to see how many unique hashcodes we get. Then, we iterate over all the keys, dissoc'ing each out of the transient map. The resulting map has one element in it (wrong - it should be empty, since we dissoc'ed all the keys), and reports its count as being two (wrong - not sure whether it should be zero or one given the other breakage). As far as I can tell, each duplicated hash value is represented once in the output map, and the map's count is the number of keys that hashed to something duplicated.

The problem seems to be restricted to transients, as if we remove the transient/persistent! pair and use dissoc instead of dissoc!, the map is always empty.

Inspired by discussion at http://groups.google.com/group/clojure/browse_thread/thread/313ac122667bb4b5/c3e7faa8635403f1



 Comments   
Comment by Alan Malloy [ 25/Aug/11 12:26 PM ]

By the way, since this involves randomness, on occasion it doesn't fail. With the input as given it seems to fail around 80% of the time, but if you want to be sure to reproduce you can add another 0 to the input size.

Comment by Aaron Bedra [ 26/Aug/11 9:20 AM ]

Thanks for the update. I was able to reproduce with the extra zero. Moving this ticket to Release.Next so that it will ship with 1.3.

Comment by Christophe Grand [ 08/Sep/11 1:15 PM ]

Sorry for the delay. Here is a fix and a reduced test case.

Comment by Stuart Halloway [ 09/Sep/11 3:50 PM ]

I have checked behavior with an independent test

https://github.com/clojure/test.generative/commit/b1350235eb219b76f5b7c4cc21c8255f567892b3

which looks good, but don't have context to evaluate the code (particularly the array allocation) in the time I have available today.

Comment by Rich Hickey [ 09/Sep/11 4:14 PM ]

The array alloc looks suspicious:

+ Object[] newArray = new Object[2*(array.length+1)]; // make room for next assoc
+ System.arraycopy(array, 0, newArray, 0, 2*count);

should it not be array.length + 2, (or 2*(count + 1), whichever?

Comment by Christophe Grand [ 10/Sep/11 3:48 AM ]

Thanks Rich for spotting this copy&paste error.

I attached an updated patch.

The problem reported by Alan was double:

  • the misplaced removedLeaf.val = removedLeaf was causing the global count to be incorrect
  • the missing array copy in ensureEditable was causing the seq returned by (keys m) to be mutated (shortened) and this is why all values were not dissoced.

Mutable code is hard, one should invent a cool language with sane state management

Comment by Stuart Halloway [ 10/Sep/11 2:54 PM ]

New tests continue to pass.

It seems to me that the allocation in the second path is too big by two in some cases (e.g. it gets called in the dissoc! path when the array needs to be copied, but not get bigger). But this might be considered innocuous. The same method is called in the assoc! path where the +2 is needed, so avoiding two objects worth of allocation would require a more substantial patch.

Comment by Christophe Grand [ 12/Sep/11 3:39 AM ]

The larger array allocation is similar to the one performed in BitmapIndexedNode.ensureEditable. My line of thought behind this heuristic is that the copying dominates the allocation and that I prefer one slightly larger array than having to allocates two arrays in case of growth.

Anyway it's on collision nodes so it's a rare occurence: I won't bother arguing further if switching to a more conservative allocation helps the patch landing in 1.3.

Generated at Tue Jul 22 08:25:59 CDT 2014 using JIRA 4.4#649-r158309.