Clojure

Better reduce performance

Details

  • Type: Enhancement Enhancement
  • Status: Closed Closed
  • Priority: Minor Minor
  • Resolution: Declined
  • Affects Version/s: Backlog
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None
  • Patch:
    Code and Test

Description

Currently reduce is implemented in Clojure by calling seq on a collection. Construction of an entire new seq is an unnecessary and expensive operation, particularly for the key persistent collections (map, set and vector suffer - list is not a problem because it implements ISeq directly for free).

This patch proposes to improve this by doing the following:
1. Make use of a Java interface for reducible collections (IReduce)
2. Make persistent collections support implementations of reduce
directly by implementing IReduce
3. Modify the internal-reduce protocol to operate on concrete
collections (not just seqs) and make use of IReduce implementations
where available
4. Change reduce itself to call internal-reduce directly rather than
calling seq first

Activity

Hide
Rich Hickey added a comment -
Show
Rich Hickey added a comment - Similar objective now achieved via a protocol: https://github.com/clojure/clojure/commit/1f90942690d5395330347cb31fdb3d69cea1ec56
Hide
Mike Anderson added a comment -

Thanks guys I've reviewed the patch and can confirm that it looks like exactly the changes I intended to make. I am happy for them to be committed as per my signed CA.

Show
Mike Anderson added a comment - Thanks guys I've reviewed the patch and can confirm that it looks like exactly the changes I intended to make. I am happy for them to be committed as per my signed CA.
Hide
Brian Taylor added a comment -

The patch contains only code that Mike wrote.

Show
Brian Taylor added a comment - The patch contains only code that Mike wrote.
Hide
Andy Fingerhut added a comment -

Correcting Mike's name and email address in patch.

Show
Andy Fingerhut added a comment - Correcting Mike's name and email address in patch.
Hide
Andy Fingerhut added a comment -

clj-894-better-reduce-performance-patch1.txt was created from pulling Brian Taylor's Clojure github repo, creating a diff of it, applying it to latest master as of Feb 22, 2012, and making a properly formatted patch from it. It has only a single commit, and has Mike Anderson's name and email address in it, along with a date of 4 Dec 2011, when it appears he made his final commit of this work. It compiles and tests cleanly, and I have run Mike's timing experiments and seen similar speed improvements.

No docstrings need changing that I can think of. Mike and I have both signed CAs. I don't see Brian Taylor's name on the list of those who signed a CA, but as long as the patch contains only code that Mike wrote, it should be clean. Patch status is "Code and Test".

Show
Andy Fingerhut added a comment - clj-894-better-reduce-performance-patch1.txt was created from pulling Brian Taylor's Clojure github repo, creating a diff of it, applying it to latest master as of Feb 22, 2012, and making a properly formatted patch from it. It has only a single commit, and has Mike Anderson's name and email address in it, along with a date of 4 Dec 2011, when it appears he made his final commit of this work. It compiles and tests cleanly, and I have run Mike's timing experiments and seen similar speed improvements. No docstrings need changing that I can think of. Mike and I have both signed CAs. I don't see Brian Taylor's name on the list of those who signed a CA, but as long as the patch contains only code that Mike wrote, it should be clean. Patch status is "Code and Test".
Hide
Mike Anderson added a comment -

Awesome, thanks Brian! I must admit that my git-fu is not yet good enough to do things like this, so your help is much appreciated!

Show
Mike Anderson added a comment - Awesome, thanks Brian! I must admit that my git-fu is not yet good enough to do things like this, so your help is much appreciated!
Hide
Brian Taylor added a comment -

Reviewers:

To make this change easier to review (and for a cleaner history), I mashed all of the commits that make up this change together and rebased that onto the latest from clojure/clojure master.

https://github.com/netguy204/clojure/tree/better-reduce
https://github.com/netguy204/clojure/commit/b262a69c32d26bb6f3c5ba711310361d77609a22

Hope it helps.

Show
Brian Taylor added a comment - Reviewers: To make this change easier to review (and for a cleaner history), I mashed all of the commits that make up this change together and rebased that onto the latest from clojure/clojure master. https://github.com/netguy204/clojure/tree/better-reduce https://github.com/netguy204/clojure/commit/b262a69c32d26bb6f3c5ba711310361d77609a22 Hope it helps.
Hide
Mike Anderson added a comment -

better_reduce branch now updated to merge in the latest changes from the clojure/clojure master branch.

Show
Mike Anderson added a comment - better_reduce branch now updated to merge in the latest changes from the clojure/clojure master branch.
Hide
Mike Anderson added a comment - - edited

Performance testing shows approx 2x speedup for relevant operations:

Clojure 1.3.0
(def ms (zipmap (range 100) (range 100)))
#'user/ms
user=> (dotimes [i 10] (time (dotimes [i 100000] (reduce (fn [acc [k
v]] (+ acc v)) 0 ms))))
"Elapsed time: 646.512414 msecs"
"Elapsed time: 553.264593 msecs"
"Elapsed time: 544.999672 msecs"
"Elapsed time: 510.606868 msecs"
"Elapsed time: 513.192451 msecs"
"Elapsed time: 537.665664 msecs"
"Elapsed time: 524.166514 msecs"
"Elapsed time: 512.966964 msecs"
"Elapsed time: 501.677219 msecs"
"Elapsed time: 496.379194 msecs"

Clojure 1.4.0-alpha2
user=> (def ms (zipmap (range 100) (range 100)))
#'user/ms
user=> (dotimes [i 10] (time (dotimes [i 100000] (reduce (fn [acc [k
v]] (+ acc v)) 0 ms))))
"Elapsed time: 548.822683 msecs"
"Elapsed time: 469.275299 msecs"
"Elapsed time: 464.742096 msecs"
"Elapsed time: 443.500129 msecs"
"Elapsed time: 431.272138 msecs"
"Elapsed time: 430.514649 msecs"
"Elapsed time: 432.753752 msecs"
"Elapsed time: 431.195876 msecs"
"Elapsed time: 435.254274 msecs"
"Elapsed time: 433.516375 msecs"

Clojure 1.4.0 snapshot (with better reduce modifications)
(def ms (zipmap (range 100) (range 100)))
#'user/ms
user=> (dotimes [i 10] (time (dotimes [i 100000] (reduce (fn [acc [k
v]] (+ acc v)) 0 ms))))
"Elapsed time: 202.29696 msecs"
"Elapsed time: 186.589505 msecs"
"Elapsed time: 179.691805 msecs"
"Elapsed time: 184.191644 msecs"
"Elapsed time: 183.265131 msecs"
"Elapsed time: 180.162578 msecs"
"Elapsed time: 182.323219 msecs"
"Elapsed time: 181.810649 msecs"
"Elapsed time: 182.896285 msecs"
"Elapsed time: 187.30153 msecs"

Show
Mike Anderson added a comment - - edited Performance testing shows approx 2x speedup for relevant operations: Clojure 1.3.0 (def ms (zipmap (range 100) (range 100))) #'user/ms user=> (dotimes [i 10] (time (dotimes [i 100000] (reduce (fn [acc [k v]] (+ acc v)) 0 ms)))) "Elapsed time: 646.512414 msecs" "Elapsed time: 553.264593 msecs" "Elapsed time: 544.999672 msecs" "Elapsed time: 510.606868 msecs" "Elapsed time: 513.192451 msecs" "Elapsed time: 537.665664 msecs" "Elapsed time: 524.166514 msecs" "Elapsed time: 512.966964 msecs" "Elapsed time: 501.677219 msecs" "Elapsed time: 496.379194 msecs" Clojure 1.4.0-alpha2 user=> (def ms (zipmap (range 100) (range 100))) #'user/ms user=> (dotimes [i 10] (time (dotimes [i 100000] (reduce (fn [acc [k v]] (+ acc v)) 0 ms)))) "Elapsed time: 548.822683 msecs" "Elapsed time: 469.275299 msecs" "Elapsed time: 464.742096 msecs" "Elapsed time: 443.500129 msecs" "Elapsed time: 431.272138 msecs" "Elapsed time: 430.514649 msecs" "Elapsed time: 432.753752 msecs" "Elapsed time: 431.195876 msecs" "Elapsed time: 435.254274 msecs" "Elapsed time: 433.516375 msecs" Clojure 1.4.0 snapshot (with better reduce modifications) (def ms (zipmap (range 100) (range 100))) #'user/ms user=> (dotimes [i 10] (time (dotimes [i 100000] (reduce (fn [acc [k v]] (+ acc v)) 0 ms)))) "Elapsed time: 202.29696 msecs" "Elapsed time: 186.589505 msecs" "Elapsed time: 179.691805 msecs" "Elapsed time: 184.191644 msecs" "Elapsed time: 183.265131 msecs" "Elapsed time: 180.162578 msecs" "Elapsed time: 182.323219 msecs" "Elapsed time: 181.810649 msecs" "Elapsed time: 182.896285 msecs" "Elapsed time: 187.30153 msecs"
Hide
Mike Anderson added a comment -

Link to experimental implementation on GitHub:

https://github.com/mikera/clojure/tree/better-reduce

Show
Mike Anderson added a comment - Link to experimental implementation on GitHub: https://github.com/mikera/clojure/tree/better-reduce
Hide
Mike Anderson added a comment -

Link to initial discussion on Clojure Dev Google Group

http://groups.google.com/group/clojure-dev/browse_frm/thread/bdab332c4e47721f?hl=en

Show
Mike Anderson added a comment - Link to initial discussion on Clojure Dev Google Group http://groups.google.com/group/clojure-dev/browse_frm/thread/bdab332c4e47721f?hl=en

People

Vote (2)
Watch (6)

Dates

  • Created:
    Updated:
    Resolved: