Problem: clojure.lang.APersistentVector#hashCode contain a deliberate data race on hash computation. However, the code as written does not follow safe practices for the intended data race. Specifically, the problem arises because the hashCode() (and hasheq()) method make multiple reads of the (unsynchronized) _hash field. The JMM permits these reads to return different values. Specifically, the last read in the return may return the pre-computed value -1, which is not the desired hash value. This problem also applies to APersistentMap, APersistentSet, and PersistentQueue.
See: http://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html for a good description of the problem.
Fix: The main fix is to read the cached hash field only once and return the value of the local computation, not the value of the field.
A secondary change that is also beneficial is to use the default initializer value (which has special ordering in the JMM to the beginning of the thread) rather than setting and using -1 as the sentinel value.
In both cases these changes follow the canonical idioms used in java.lang.String for lazy hash computation. The patch covers both.
Patch: clj-2091-default-initialization.diff - note that this patch will indicate whitespace errors when applied due to the wacky line endings in PersistentQueue. The problem here is really the PQ formatting, not the patch.
Prescreened by: Alex Miller
There are some hash-related tests already but I also spot-checked that hash computations are returning the same value with and without the patch for the collections in question.