[CLJ-1365] New collection hash functions are too slow Created: 20/Feb/14 Updated: 11/Mar/14 Resolved: 11/Mar/14
|Affects Version/s:||Release 1.6|
|Fix Version/s:||Release 1.6|
|Attachments:||clj-1365-v1.patch clj-1365-v2.patch testclj1365.zip|
As reported ( https://groups.google.com/d/msg/clojure-dev/t6LAmVe-RLM/ekLTKxYfU5UJ ) by Mark Engelberg, the new collection hashing functions are slower than invoking the Murmur3 functions directly. See the attached zip for performance tests.
Approach: Made mix-collection-hash, hash-ordered-coll, and hash-unordered-coll use primitive type hints to avoid the bulk of the time.
|Comment by Alex Miller [ 20/Feb/14 11:26 AM ]|
Added to 1.6 release.
|Comment by Alex Miller [ 20/Feb/14 12:40 PM ]|
Made hash functions inline for performance.
|Comment by Rich Hickey [ 20/Feb/14 7:55 PM ]|
This looks like bad benchmarking.
(dotimes [_ 10] (let [x 1 y 1] (time (dotimes [n 1000000000] (clojure.lang.Murmur3/mixCollHash x y)))))
(dotimes [_ 10] (let [x 1 y 1] (time (dotimes [n 1000000000] #_(clojure.lang.Murmur3/mixCollHash x y)))))
take the same time on my machine.
I'd need to see tests where the return was definitely used, it seems this is just more easily ignored by hotspot when not used.
We probably only need to hint count and the return for decent results.
|Comment by Alex Miller [ 20/Feb/14 8:55 PM ]|
It was reported by Mark Engelberg in his Instaparse rework - he observed these calls taking noticeably longer and overall times 10-20% down. I will ask him to chime in here.
|Comment by Rich Hickey [ 04/Mar/14 8:44 AM ]|
Could someone please test hinting hint count and the return? I'd hate for the answer to anyone's perf issues be inlining.
|Comment by Alex Miller [ 04/Mar/14 9:06 AM ]|
I will provide some more data for consideration of the options.
|Comment by Alex Miller [ 04/Mar/14 11:07 AM ]|
Test project for different variants
|Comment by Alex Miller [ 04/Mar/14 11:11 AM ]|
Attached a test project with different variants for testing and better benchmarking. To run:
From these results, I infer that the unhinted version is slower (21 ms) than a static call (7 ms). Inlining gives you same perf as static. Hinting inputs and return gives almost the same perf (9 ms).