<< Back to previous view

[CLJ-1295] Speed up dissoc on array-maps Created: 15/Nov/13  Updated: 29/Apr/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Andy Fingerhut Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: ft, performance

Attachments: File clj-1295-1.diff    
Patch: Code
Approval: Triaged


In latest Clojure master as of Nov 15 2013, the method without() in PersistentArrayMap.java first searches for a matching key using indexOf(key) and saves the result in i.

If a matching key was found, the code then copies the old array to the new smaller one, but unnecessarily repeats the comparison of every key in the map to the key being removed, even though its location is already stored in i.

Screened by: Alex Miller

Comment by Andy Fingerhut [ 15/Nov/13 7:05 PM ]

The patch clj-1295-1.diff changes PersistentArrayMap's without() to use System.arraycopy to copy only the necessary parts from the current array to newArray, similar to PersistentHashMap's method removePair().

Benchmark 1 has strings for keys, which are relatively slow to compare to each other.

(def m1 (array-map "abcdef" 1 "abcdeg" 2 "abcdeh" 3 "abcdei" 4))
(time (dotimes [i 100000000] (dissoc m1 "abcdei")))

1.6.0-alpha2 with no changes:
"Elapsed time: 29663.443 msecs"
"Elapsed time: 29490.225 msecs"
"Elapsed time: 29600.138 msecs"
"Elapsed time: 29627.948 msecs"

1.6.0-alpha2 with patch clj-1295-1.diff:
"Elapsed time: 6362.006 msecs"
"Elapsed time: 6121.006 msecs"
"Elapsed time: 6163.377 msecs"
"Elapsed time: 6155.299 msecs"
"Elapsed time: 6395.224 msecs"

Averages about 21% of the run time before the change.

Benchmark 2 has keywords for keys, which are compared via Java ==, so as fast as comparison can get.

(def m2 (array-map :abcdef 1 :abcdeg 2 :abcdeh 3 :abcdei 4))
(time (dotimes [i 100000000] (dissoc m2 :abcdei)))

1.6.0-alpha2 with no changes:
"Elapsed time: 5033.863 msecs"
"Elapsed time: 5028.327 msecs"
"Elapsed time: 5045.019 msecs"
"Elapsed time: 5004.751 msecs"
"Elapsed time: 5039.143 msecs"

1.6.0-alpha2 with patch clj-1295-1.diff:
"Elapsed time: 2874.748 msecs"
"Elapsed time: 2862.878 msecs"
"Elapsed time: 2887.778 msecs"
"Elapsed time: 2874.196 msecs"
"Elapsed time: 2861.807 msecs"

Averages about 57% of the run time before the change.

Comment by Ghadi Shayban [ 08/Dec/14 9:47 AM ]

A nice boost, but probably obsoleted by CLJ-1517

Comment by Andy Fingerhut [ 08/Dec/14 11:10 AM ]

Always happy to be obsoleted by something even better

Comment by Ghadi Shayban [ 08/Dec/14 11:43 AM ]

I dunno, seems like CLJ-1517 phase 1 is just vectors not maps, so this is still easy low-hanging fruit.

Comment by Alex Miller [ 08/Dec/14 1:07 PM ]

There will be a second ticket for maps, both will be targeted for 1.8.

Generated at Wed Oct 07 03:38:56 CDT 2015 using JIRA 4.4#649-r158309.