<< Back to previous view

[CLJ-991] partition-by reducer Created: 10/May/12  Updated: 05/Mar/15  Resolved: 05/Mar/15

Status: Resolved
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Kevin Downey Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: reducers

Attachments: File reducer-partition-by2.diff     File reducer-partition-by3.diff     File reducer-partition-by4.diff     File reducer-partition-by.diff    
Patch: Code and Test

 Comments   
Comment by Rich Hickey [ 14/Aug/12 1:52 PM ]

I'd like to see something much faster than this.

Comment by Kevin Downey [ 15/Aug/12 1:58 AM ]

For reference here is a benchmark of a non-reducers (seq based) process that uses partition-by

user=> (def x (vec (range 1e6)))
#'user/x
user=> (bench (reduce + (map count (partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 60
             Execution time mean : 1.072157 sec  95.0% CI: (1.070606 sec, 1.073381 sec)
    Execution time std-deviation : 165.818282 ms  95.0% CI: (163.873585 ms, 168.271261 ms)
         Execution time lower ci : 972.562000 ms  95.0% CI: (972.562000 ms, 973.301850 ms)
         Execution time upper ci : 1.419148 sec  95.0% CI: (1.419148 sec, 1.419148 sec)

Found 7 outliers in 60 samples (11.6667 %)
	low-severe	 2 (3.3333 %)
	low-mild	 5 (8.3333 %)
 Variance from outliers : 85.8489 % Variance is severely inflated by outliers
nil
user=>

Same again using r/partition-by from reducer-partition-by.diff

user=> (bench (r/reduce + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 60
             Execution time mean : 1.418350 sec  95.0% CI: (1.417738 sec, 1.418948 sec)
    Execution time std-deviation : 66.736477 ms  95.0% CI: (66.186568 ms, 67.610777 ms)
         Execution time lower ci : 1.370419 sec  95.0% CI: (1.370419 sec, 1.370419 sec)
         Execution time upper ci : 1.544151 sec  95.0% CI: (1.544151 sec, 1.544156 sec)

Found 10 outliers in 60 samples (16.6667 %)
	low-severe	 2 (3.3333 %)
	low-mild	 8 (13.3333 %)
 Variance from outliers : 33.5591 % Variance is moderately inflated by outliers
nil
user=> 

(using https://github.com/hugoduncan/criterium for benchmarking)

Comment by Kevin Downey [ 15/Aug/12 2:17 AM ]

same again for r/partition-by from reducers-partition-by2.diff

user=> (bench (r/reduce + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 180
             Execution time mean : 307.596806 ms  95.0% CI: (307.271339 ms, 307.961550 ms)
    Execution time std-deviation : 34.060809 ms  95.0% CI: (33.613169 ms, 34.416837 ms)
         Execution time lower ci : 285.339333 ms  95.0% CI: (285.339333 ms, 285.339333 ms)
         Execution time upper ci : 385.087950 ms  95.0% CI: (385.087950 ms, 385.087950 ms)

Found 10 outliers in 60 samples (16.6667 %)
	low-severe	 4 (6.6667 %)
	low-mild	 6 (10.0000 %)
 Variance from outliers : 73.8053 % Variance is severely inflated by outliers
nil
user=> 

same again driven using r/fold (for grins) instead of r/reduce

user=> (bench (r/fold + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 360
             Execution time mean : 215.214486 ms  95.0% CI: (214.915417 ms, 215.664236 ms)
    Execution time std-deviation : 36.588464 ms  95.0% CI: (36.305548 ms, 36.847846 ms)
         Execution time lower ci : 185.575000 ms  95.0% CI: (185.575000 ms, 185.575000 ms)
         Execution time upper ci : 287.605175 ms  95.0% CI: (286.547833 ms, 287.605175 ms)

Found 6 outliers in 60 samples (10.0000 %)
	low-severe	 3 (5.0000 %)
	low-mild	 3 (5.0000 %)
 Variance from outliers : 87.6303 % Variance is severely inflated by outliers
nil
user=> 

reducers-partition-by2.diff is faster, but I am not wild about introducing a type and a protocol. also not sure about the CollFold impl.

Comment by Rich Hickey [ 15/Aug/12 6:58 AM ]

Let's leave fold out for this first patch, please.

Comment by Kevin Downey [ 15/Aug/12 2:34 PM ]

reducer-partition-by3.diff is a cleaned up version of reducer-partition-by2.diff without fold

Comment by Devin Walters [ 20/Oct/12 7:07 PM ]

Per Andy Fingerhut's email reducer-partition-by3.diff was failing to apply. This patch should apply cleanly to current master.

Comment by Andy Fingerhut [ 01/Nov/12 6:59 PM ]

Presumptuously changing Approval from Incomplete to None, since the reason for its being marked Incomplete seems to have been addressed with the latest patch.

Comment by Kevin Downey [ 04/Mar/13 2:49 PM ]

should this be assigned to someone?

Comment by Andy Fingerhut [ 03/Sep/14 8:09 PM ]

Patch reducer-partition-by4.diff dated Oct 20 2012 no longer applies cleanly to latest master after some commits were made to Clojure on Sep 3, 2014.

I have not checked whether this patch is straightforward to update, nor do I know if there is interest in an updated patch, or whether transducers are the preferred way to go over reducers, and thus all reducers-related tickets will be closed. See the section "Updating stale patches" at http://dev.clojure.org/display/community/Developing+Patches for suggestions on how to update patches.

Comment by Kevin Downey [ 05/Mar/15 11:41 AM ]

the single arity for completion added to reducing functions in 1.7 makes partition-by easier to write, and low and behold clojure.core/parition-by can return a partitioning transducing function





[CLJ-1660] Unify Stepper and MultiStepper in LazyTransformer Created: 16/Feb/15  Updated: 04/Mar/15  Resolved: 04/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: None

Attachments: Text File CLJ-1660-v1.patch    
Patch: Code

 Description   

This seemed worthwhile to me mainly because some of the stepper logic is actually pretty fiddly, so it seems better not to have it duplicated.



 Comments   
Comment by Alex Miller [ 04/Mar/15 1:06 PM ]

Based on CLJ-1669 direction which eliminates this code.





Generated at Thu Mar 05 18:15:11 CST 2015 using JIRA 4.4#649-r158309.