Clojure

Reify the result of range

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: Release 1.7
  • Fix Version/s: Release 1.7
  • Component/s: None
  • Labels:
    None
  • Patch:
    Code and Test
  • Approval:
    Incomplete

Description

Currently range simply returns a lazy seq. If the return value of range were reified into a type (as it is in ClojureScript) we could optimize many functions on that resulting type. Some operations such as count and nth become O(1) in this case, while others such as reduce could receive a performance boost do to the reduced number of allocations.

Approach: this patch revives the unused (but previously existing) clojure.lang.Range class. This class acts as a lazy seq and implements several other appropriate interfaces such as Counted and Indexed. This type is implemented in Java since range is needed fairly on in core.clj before deftype is defined. The attached patch uses Numbers.* methods for all math due to the input types to range being unknown. The class also supplies a .iterator() method which allows for allocation free reducing over range.

Note: this code keeps backwards compatibility with the existing range code. This means some parts of the class (mostly relating to a step size of 0) are a bit more complex than desired, but these bits were needed to get all the tests to pass.

Note: this code does not preserve the chunked-seq nature of the original range. The fact that range used to return chunked seqs was not published in the doc strings and so it was removed to allow for simpler code in Range.java.

Performance:
(timings done at the repl run via java -jar)

(dotimes [x 100] (time (dotimes [x 1] (count (range (* 1024 1024))))))
master => 80-110ms
patch => 0.014ms


(dotimes [x 100] (time (dotimes [x 1] (reduce + (map inc (range (* 1024 1024)))))))
master => 76-87ms
patch => 340-360ms


(dotimes [x 100] (time (dotimes [x 1] (reduce + (map inc (map inc (range (* 1024 1024))))))))
master => 97-123ms
patch=> 490-577ms



(dotimes [x 100] (time (dotimes [x 1] (count (filter odd? (range (* 1024 1024)))))))
master => 87-104ms
patch => 370-330ms


(dotimes [x 100] (time (dotimes [x 1] (transduce (comp (map inc) (map inc)) + (range (* 1024 1024))))))
master=>76-116ms
patch => 44ms-59ms

Patch: reified-range4.diff (some tests fail)

  1. patch.diff
    29/Aug/14 3:34 PM
    9 kB
    Timothy Baldridge
  2. range-patch3.diff
    19/Sep/14 10:02 AM
    11 kB
    Timothy Baldridge
  3. reified-range4.diff
    31/Oct/14 11:30 AM
    16 kB
    Timothy Baldridge

Activity

Hide
Alex Miller added a comment -

1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?

Show
Alex Miller added a comment - 1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable. 2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end 3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable. 4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1? 5) _hash and _hasheq should be transient. 6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once. 7) Why the change in test/clojure/test_clojure/serialization.clj ? 8) Can you squash into a single commit?
Hide
Timothy Baldridge added a comment -

1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Show
Timothy Baldridge added a comment - 1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this? 2) exception added 3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too. 4) fixed hash init values, defaults to -1 like ASeq 5) hash fields are now transient 6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it? 7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN. 8) squashed.
Hide
Alex Miller added a comment -

6) might be useful if you're walking through it with nth, which hits count everytime, but doubt that's common
7) yep, reasonable

Show
Alex Miller added a comment - 6) might be useful if you're walking through it with nth, which hits count everytime, but doubt that's common 7) yep, reasonable
Hide
Andy Fingerhut added a comment - - edited

I have already pointed out to Edipo in personal email the guidelines on what labels to use for Clojure JIRA tickets here: http://dev.clojure.org/display/community/Creating+Tickets

Show
Andy Fingerhut added a comment - - edited I have already pointed out to Edipo in personal email the guidelines on what labels to use for Clojure JIRA tickets here: http://dev.clojure.org/display/community/Creating+Tickets
Hide
Timothy Baldridge added a comment -

New patch with IReduce directly on Range instead of relying on iterators

Show
Timothy Baldridge added a comment - New patch with IReduce directly on Range instead of relying on iterators
Hide
Alex Miller added a comment - - edited

The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is.

Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.

Show
Alex Miller added a comment - - edited The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is. Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.
Hide
Timothy Baldridge added a comment -

Looks like chunked seqs do make lazy seq code about 5x faster in these tests.

Show
Timothy Baldridge added a comment - Looks like chunked seqs do make lazy seq code about 5x faster in these tests.
Hide
Ghadi Shayban added a comment -

I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.

Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

Show
Ghadi Shayban added a comment - I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge. Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)
Hide
Timothy Baldridge added a comment -

I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

Show
Timothy Baldridge added a comment - I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)
Hide
Alex Miller added a comment -

I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.

On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Show
Alex Miller added a comment - I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty. On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.
Hide
Ghadi Shayban added a comment -

Tim, I'm not suggesting unboxed math, but the singular fast-path of all-Longs that you and Alex describe. I mistakenly lower-cased Long/Float.

Show
Ghadi Shayban added a comment - Tim, I'm not suggesting unboxed math, but the singular fast-path of all-Longs that you and Alex describe. I mistakenly lower-cased Long/Float.
Hide
Timothy Baldridge added a comment -

Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.

Show
Timothy Baldridge added a comment - Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.
Hide
Nicola Mometto added a comment -

As discussed with Tim in #clojure, the current patch should not change ArrayChunk's reduce impl, that's an error.

Show
Nicola Mometto added a comment - As discussed with Tim in #clojure, the current patch should not change ArrayChunk's reduce impl, that's an error.

People

Vote (1)
Watch (4)

Dates

  • Created:
    Updated: