clojure.core/sort is not thread-safe on Java collections with backing arrays
Description
Environment
Attachments
Activity

Stuart Halloway July 30, 2015 at 3:11 PM
I think this should be a docstring change, if anything at all.

Alex Miller June 19, 2015 at 5:53 PM
Well, that's kind of true. The former use did not force realization of the whole seq, just the first element. That said, from a quick test the extra cost seems small on a set (vector seqs are actually faster due to their structure).

Nicola Mometto June 19, 2015 at 5:02 PM
Alex, no additional sequence is being created, the seq call was already there

Alex Miller June 19, 2015 at 4:59 PM
I guess what I'm saying is, we should not make the performance worse for persistent collections in order to make it safer for arbitrary Java collections. But it may still be useful to make it safer without affecting persistent collection performance and/or updating the docstring.

Alex Miller June 19, 2015 at 4:55 PM
The docstring says "If coll is a Java array, it will be modified. To avoid this, sort a copy of the array." which also seems like solid advice in this case.
Creating a sequence view of the input collection would significantly alter the performance characteristics.
Details
Assignee
UnassignedUnassignedReporter
Nicola MomettoNicola MomettoLabels
Approval
TriagedPatch
CodePriority
Minor
Details
Details
Assignee
Reporter

If a (mutable) Java collection that exposes it's backing array is passed to c.c/sort in multiple threads, the collection will be concurrently modified in multiple threads.
Approach: Convert coll to a seq before converting it to an array, thus preserving the original collection.
Patch: 0001-CLJ-1763-make-sort-thread-safe.patch
Alternate approaches:
1. Document in sort that, like Java arrays, Java collections backed by arrays are modified in-place.
2. Change RT.toArray() to defensively copy the array returned from a (non-IPersistentCollection) Java collection. This has a number of potential ramifications as this method is called from several paths.
3. For non-Clojure collections, could also use Collections.sort() instead of dumping to array and using Arrays.sort().