[CLJ-1087] clojure.data/diff uses set union on key seqs Created: 15/Oct/12 Updated: 05/Jun/15
clojure.data/diff, on line 118, defines:
Since keys returns a key seq, this seems like an error. clojure.set/union has strange and inconsistent behavior with regard to non-sets, and in this case the two key seqs are concatenated. Based on a cursory benchmark, it seems that this bug is a slight performance gain when the maps have no common keys, and a significant performance loss when the maps have the same keys. The results are still correct because of the merging reduce in diff-associative.
The patch is easy (just call set on each key seq).
|Comment by Andy Fingerhut [ 15/Oct/12 2:52 PM ]|
clj-1087-diff-perf-enhance-patch-v1.txt dated Oct 15 2012 implements Tom's suggested performance enhancement, although not exactly in the way he suggested. It does calculate the union of the two key sequences.
|Comment by Andy Fingerhut [ 05/Jun/15 2:03 AM ]|
I would suggest that perhaps bigger than the issue of performance here is that Clojure is relying on its own undocumented behavior that clojure.set/union works on some non-set arguments.