[CLJ-991] partition-by reducer Created: 10/May/12 Updated: 04/Mar/13 |
|
| Status: | Open |
| Project: | Clojure |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Enhancement | Priority: | Minor |
| Reporter: | Kevin Downey | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | enhancement, patch, patch, | ||
| Attachments: |
|
| Patch: | Code and Test |
| Comments |
| Comment by Rich Hickey [ 14/Aug/12 1:52 PM ] |
|
I'd like to see something much faster than this. |
| Comment by Kevin Downey [ 15/Aug/12 1:58 AM ] |
|
For reference here is a benchmark of a non-reducers (seq based) process that uses partition-by user=> (def x (vec (range 1e6)))
#'user/x
user=> (bench (reduce + (map count (partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count : 60
Execution time mean : 1.072157 sec 95.0% CI: (1.070606 sec, 1.073381 sec)
Execution time std-deviation : 165.818282 ms 95.0% CI: (163.873585 ms, 168.271261 ms)
Execution time lower ci : 972.562000 ms 95.0% CI: (972.562000 ms, 973.301850 ms)
Execution time upper ci : 1.419148 sec 95.0% CI: (1.419148 sec, 1.419148 sec)
Found 7 outliers in 60 samples (11.6667 %)
low-severe 2 (3.3333 %)
low-mild 5 (8.3333 %)
Variance from outliers : 85.8489 % Variance is severely inflated by outliers
nil
user=>
Same again using r/partition-by from reducer-partition-by.diff user=> (bench (r/reduce + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count : 60
Execution time mean : 1.418350 sec 95.0% CI: (1.417738 sec, 1.418948 sec)
Execution time std-deviation : 66.736477 ms 95.0% CI: (66.186568 ms, 67.610777 ms)
Execution time lower ci : 1.370419 sec 95.0% CI: (1.370419 sec, 1.370419 sec)
Execution time upper ci : 1.544151 sec 95.0% CI: (1.544151 sec, 1.544156 sec)
Found 10 outliers in 60 samples (16.6667 %)
low-severe 2 (3.3333 %)
low-mild 8 (13.3333 %)
Variance from outliers : 33.5591 % Variance is moderately inflated by outliers
nil
user=>
(using https://github.com/hugoduncan/criterium for benchmarking) |
| Comment by Kevin Downey [ 15/Aug/12 2:17 AM ] |
|
same again for r/partition-by from reducers-partition-by2.diff user=> (bench (r/reduce + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count : 180
Execution time mean : 307.596806 ms 95.0% CI: (307.271339 ms, 307.961550 ms)
Execution time std-deviation : 34.060809 ms 95.0% CI: (33.613169 ms, 34.416837 ms)
Execution time lower ci : 285.339333 ms 95.0% CI: (285.339333 ms, 285.339333 ms)
Execution time upper ci : 385.087950 ms 95.0% CI: (385.087950 ms, 385.087950 ms)
Found 10 outliers in 60 samples (16.6667 %)
low-severe 4 (6.6667 %)
low-mild 6 (10.0000 %)
Variance from outliers : 73.8053 % Variance is severely inflated by outliers
nil
user=>
same again driven using r/fold (for grins) instead of r/reduce user=> (bench (r/fold + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count : 360
Execution time mean : 215.214486 ms 95.0% CI: (214.915417 ms, 215.664236 ms)
Execution time std-deviation : 36.588464 ms 95.0% CI: (36.305548 ms, 36.847846 ms)
Execution time lower ci : 185.575000 ms 95.0% CI: (185.575000 ms, 185.575000 ms)
Execution time upper ci : 287.605175 ms 95.0% CI: (286.547833 ms, 287.605175 ms)
Found 6 outliers in 60 samples (10.0000 %)
low-severe 3 (5.0000 %)
low-mild 3 (5.0000 %)
Variance from outliers : 87.6303 % Variance is severely inflated by outliers
nil
user=>
reducers-partition-by2.diff is faster, but I am not wild about introducing a type and a protocol. also not sure about the CollFold impl. |
| Comment by Rich Hickey [ 15/Aug/12 6:58 AM ] |
|
Let's leave fold out for this first patch, please. |
| Comment by Kevin Downey [ 15/Aug/12 2:34 PM ] |
|
reducer-partition-by3.diff is a cleaned up version of reducer-partition-by2.diff without fold |
| Comment by Devin Walters [ 20/Oct/12 7:07 PM ] |
|
Per Andy Fingerhut's email reducer-partition-by3.diff was failing to apply. This patch should apply cleanly to current master. |
| Comment by Andy Fingerhut [ 01/Nov/12 6:59 PM ] |
|
Presumptuously changing Approval from Incomplete to None, since the reason for its being marked Incomplete seems to have been addressed with the latest patch. |
| Comment by Kevin Downey [ 04/Mar/13 2:49 PM ] |
|
should this be assigned to someone? |