take-nth transducer could be faster without rem
Description
Environment
Mac OS X 10.10.2, JDK 1.8.0_31
Attachments
- 20 Feb 2015, 06:32 PM
Activity
Renzo Borgatti March 27, 2018 at 5:07 PM
GIGO case, but rem is also responsible for:
user=> (take-nth 2.5 (range 10))
(0 3 6 9)
user=> (sequence (take-nth 2.5) (range 10))
(0 5)
The patch by Steve (CLJ-1665-faster-take-nth-transducer-without-rem.patch) is just missing a cast to int to solve the above:
(defn take-nth [n]
(fn [rf]
(let [n (int n)
iv (volatile! 1)]
(fn
([] (rf))
([result] (rf result))
([result input]
(let [i (vswap! iv dec)]
(if (zero? i)
(do (vreset! iv n)
(rf result input))
result)))))))

Michael Blume February 20, 2015 at 6:55 PM
Nice =)
I would say that the transducer version really ought to match the collection version as closely as possible, but I don't think there's actually a way to write a transducer that transforms a finite sequence into an infinite sequence, so no luck there.
Maybe while we're at it we should change both the transducer and the collection arities to throw on zero?

Steve Miner February 20, 2015 at 6:41 PM
I didn't worry about (take-nth 0) case, but my patch does give a different result. The current implementation gets a divide by zero error (from rem). My patched version returns just the first element once. The regular collection version returns an infinite sequence of the first element. I doubt anyone expects a sensible answer from the 0 case so I didn't try to do anything special with it.

Steve Miner February 20, 2015 at 6:34 PM
Patch attached. It's about 25% faster on a simple test like:
(time (transduce (take-nth 13) + (range 1e7)))
Details
Assignee
UnassignedUnassignedReporter
Steve MinerSteve MinerLabels
Approval
TriagedPatch
CodePriority
MajorAffects versions
Details
Details
Assignee
Reporter

Labels
Approval
Patch
Priority

The take-nth transducer is calling rem on each index, which is relatively expensive compared to a zero? test. It could just count down from N instead as the step size is fixed.