<< Back to previous view

[CLJ-1087] clojure.data/diff uses set union on key seqs Created: 15/Oct/12  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Tom Jack Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: performance

Attachments: Text File clj-1087-diff-perf-enhance-patch-v1.txt    
Patch: Code

 Description   

clojure.data/diff, on line 118, defines:

java.util.Map
(diff-similar [a b]
  (diff-associative a b (set/union (keys a) (keys b))))

Since keys returns a key seq, this seems like an error. clojure.set/union has strange and inconsistent behavior with regard to non-sets, and in this case the two key seqs are concatenated. Based on a cursory benchmark, it seems that this bug is a slight performance gain when the maps have no common keys, and a significant performance loss when the maps have the same keys. The results are still correct because of the merging reduce in diff-associative.

The patch is easy (just call set on each key seq).



 Comments   
Comment by Andy Fingerhut [ 15/Oct/12 2:52 PM ]

clj-1087-diff-perf-enhance-patch-v1.txt dated Oct 15 2012 implements Tom's suggested performance enhancement, although not exactly in the way he suggested. It does calculate the union of the two key sequences.

Generated at Wed Jul 23 21:01:04 CDT 2014 using JIRA 4.4#649-r158309.