
[CLJ-1192] vec function is substantially slower than into function Created: 06/Apr/13  Updated: 24/Oct/14  Resolved: 24/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Luke VanderHart Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: performance

Approval: Incomplete

 Description   

(vec coll) and (into [] coll) do exactly the same thing. However, due to into using transients, it is substantially faster. On my machine:

(time (dotimes [_ 100] (vec (range 100000))))
"Elapsed time: 732.56 msecs"

(time (dotimes [_ 100] (into [] (range 100000))))
"Elapsed time: 491.411 msecs"

This is consistently repeatable.

Since vec's sole purpose is to transform collections into vectors, it should do so at the maximum speed available.



 Comments   
Comment by Andy Fingerhut [ 07/Apr/13 5:50 PM ]

I am pretty sure that Clojure 1.5.1 also uses transient vectors for (vec (range n)) (and probably some earlier versions of Clojure did, too).

Look at vec in core.clj. It checks whether its arg is a java.util.Collection, which lazy seqs are, so it calls (clojure.lang.LazilyPersistentVector/create coll).

LazilyPersistentVector's create method checks whether its argument is an ISeq, which lazy seqs are, so it calls PersistentVector.create(RT.seq(coll)).

All 3 of PersistentVector's create() methods use transient vectors to build up the result.

I suspect the difference in run times is due not to transients, but to the way into uses reduce; it may also have something to do with the possibly unnecessary call to RT.seq in LazilyPersistentVector's create method (in this case, at least – it is likely needed for other types of arguments).

Comment by Alan Malloy [ 14/Jun/13 2:17 PM ]

I'm pretty sure the difference is that into uses reduce: since reducers were added in 1.5, chunked sequences know how to reduce themselves without creating unnecessary cons cells. PersistentVector/create doesn't use reduce, so it has to allocate a cons cell for each item in the sequence.
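
An illustrative micro-check of the two traversal styles described here (timings are machine-dependent; this is not the actual core code, just a sketch of the two paths): reduce lets a chunked seq process itself chunk by chunk, while explicit first/next traversal (roughly what PersistentVector.create(ISeq) does) forces a seq step per element.

(time (reduce conj [] (range 1000000)))        ; chunk-aware reduce path
(time (loop [s (seq (range 1000000)), v []]    ; per-element seq traversal
        (if s
          (recur (next s) (conj v (first s)))
          v)))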

Comment by Gary Fredericks [ 08/Sep/13 1:55 PM ]

Is there any downside to (defn vec [coll] (into [] coll)) (or the inlined equivalent)?
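
For reference, a minimal sketch of what "the inlined equivalent" might look like (hypothetical; vec* is not core's actual definition):

;; Hypothetical transient-based equivalent of (into [] coll):
(defn vec* [coll]
  (persistent! (reduce conj! (transient []) coll)))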

Comment by Ghadi Shayban [ 11/Apr/14 5:13 PM ]

While I agree that there are improvements and possibly low-hanging fruit, FWIW this commit comment (https://github.com/clojure/tools.analyzer/commit/cf7dda81a22f4c9c1fe64c699ca17e7deed61db4#commitcomment-5989545) showed a 5% slowdown from a few call sites in tools.analyzer.

This ticket's benchmark is incomplete in that it covers a single type of argument (a chunked range), and flawed in that it also times the expense of realizing the range. (That could be a legitimate benchmark case, but it shouldn't be the only one.)
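
A fairer variant along those lines (illustrative) realizes the input once, outside the timed loop, so only vec and into are measured:

(let [xs (doall (range 100000))]       ; realize the range up front
  (time (dotimes [_ 100] (vec xs)))
  (time (dotimes [_ 100] (into [] xs))))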

Sorry to rain on a parade. I promise I like speed too!

Comment by Greg Chapman [ 25/Apr/14 5:23 PM ]

One thing to note is that vec has a subtle difference from into when the collection is an Object array of length <= 32. In that case, vec aliases the supplied array rather than copying it (this is noted in the warning here: http://clojuredocs.org/clojure_core/clojure.core/vec). I believe I read somewhere that this behavior is intentional, but I can't find the citation.
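
A short REPL sketch of that difference (assuming the documented aliasing for Object arrays of length <= 32; see the link above):

(def arr (object-array [1 2 3]))
(def v1 (vec arr))        ; aliases arr
(def v2 (into [] arr))    ; copies arr
(aset arr 0 99)
(nth v1 0)                ;=> 99, because v1 shares the array
(nth v2 0)                ;=> 1, because v2 copied the elements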

Comment by Andy Fingerhut [ 25/Apr/14 10:18 PM ]

Greg, CLJ-893 might be what you remember. That is the ticket that was closed by a patch updating the documentation of vec.

Comment by Mike Anderson [ 18/May/14 7:41 AM ]

I think there are quite a few performance improvements that can be made to vec in general. For example, if given a List it should use PersistentVector.create(List) rather than producing an unnecessary seq, which is what appears to happen at the moment. Also, it should probably return the same object if passed an existing IPersistentVector.

Basically there are a number of cases that we could be handling more efficiently....

I'm taking a look at this now.... will propose a quick patch if it seems there is a good solution.

Comment by Mike Anderson [ 24/Jul/14 4:01 AM ]

I've looked at this issue and it is quite complex. There are multiple types that potentially need to be converted into vectors, and doing so efficiently will often require making use of reduce-style operations on the source collections.

Doing this efficiently will probably in turn require making use of the IReduce interface, which doesn't yet seem to be fully utilised across the Clojure code base. If we do this, lots of operations (not just vec!) can be made faster, but it will be quite a major change.

I have a branch that implements some of this but would appreciate feedback if this is the right direction before I take it any further:
https://github.com/mikera/clojure/tree/clj-1192-vec-performance

Comment by Alex Miller [ 24/Jul/14 9:45 AM ]

Thanks Mike! It may take a few days before I can get back to you about this.

Comment by Mike Anderson [ 25/Jul/14 3:44 AM ]

Basically the approach I am proposing is:

  • Make various collections implement IReduce efficiently (if they don't already). Especially applied to chunked seqs etc.
  • Have RT.reduce(...) methods that implement reduce on the Java side
  • Make the Clojure side use IReduce where relevant (should be as simple as extending the existing protocols)
  • Implement vec (and other similar operations) in terms of IReduce - which will solve this specific issue

If we really care about pushing vector performance even further, we can also consider:

  • Create specialised small vector types where appropriate - e.g. a specialised SmallPersistentVector class for <32 elements. This should outperform the more generic PersistentVector which is better suited for large vectors.
  • Some dedicated construction functions that know how to efficiently exploit knowledge about the data source (e.g. creating a vec from a segment of a big Object array can be done with a bunch of arraycopys into 32-element chunks and then constructing a PersistentVector around these)

This should give us a decent speedup overall (of course it would need benchmarking... but I'd hope to see some sort of measurable improvement on a macro benchmark like building and testing Clojure).
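
For concreteness, a minimal Clojure-side sketch of the "implement vec in terms of IReduce" bullet above (hypothetical; the branch linked earlier is the actual proposal):

;; Hypothetical reduce-based vec: returns an existing vector unchanged
;; and lets sources that can reduce themselves (IReduce, chunked seqs) do so.
(defn reduce-vec [coll]
  (if (vector? coll)
    coll
    (persistent! (reduce conj! (transient []) coll))))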

Comment by Alex Miller [ 24/Oct/14 8:36 AM ]

I believe this will be covered by a combination of CLJ-1546 (which makes vec work in more cases and faster on everything but chunked seqs) and CLJ-1515 (which will make the primary chunked seq use case of range faster) so marking as a dupe. After those are done, would consider more targeted ticket for whatever case is left.





[CLJ-1571] Transducer of partition-by over take gives wrong answer Created: 20/Oct/14  Updated: 21/Oct/14  Resolved: 21/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Rich Hickey
Resolution: Completed Votes: 1
Labels: transducers

Attachments: Text File 0001-CLJ-1571-fix-regression-introduced-by-43cc1854508d65.patch     Text File CLJ-1571.patch    
Approval: Ok

 Description   
(partition-by pos? (take 2 [-1 1]))
=> ((-1) (1))
(sequence (comp (take 2) (partition-by pos?)) [-1 1])
=> ([-1])


 Comments   
Comment by Nicola Mometto [ 21/Oct/14 7:49 AM ]

Given that it works fine when using transduce instead of sequence, the bug might be in LazyTransformer rather than in partition-by.

(into [] (comp (take 2) (partition-by pos?)) [-1 1])
=> [[-1] [1]]

Comment by Ghadi Shayban [ 21/Oct/14 9:21 AM ]

Patch fixes the test case, but needs eyes; I certainly may have broken something. This highlights the importance of CLJ-1554: something similar to the existing defequiv tests for reducers, but between #'into and #'sequence, also covering edge cases in reduced unwrapping.

Comment by Alex Miller [ 21/Oct/14 9:41 AM ]

Thanks Ghadi. This bug was found by the tests I wrote for CLJ-1554, so yes.

Comment by Nicola Mometto [ 21/Oct/14 9:53 AM ]

Applying this patch causes a regression in the laziness of sequence.
The lines that Ghadi removed for this patch were added by Rich in this commit https://github.com/clojure/clojure/commit/43cc1854508d655e58e377f84836ba128971f90c to address http://dev.clojure.org/jira/browse/CLJ-1497

Example of the regression:
current master:

user=>  (sequence (take 2) (map #(do (println (str "~" %)) %) (iterate inc 1)))
~1
~2
(1 2)

with this patch:

user=>  (sequence (take 2) (map #(do (println (str "~" %)) %) (iterate inc 1)))
~1
~2
~3
(1 2)

Comment by Nicola Mometto [ 21/Oct/14 10:03 AM ]

Patch 0001-CLJ-1571-fix-regression-introduced-by-43cc1854508d65.patch addresses this issue while preserving the current laziness of `sequence`.

Comment by Alex Miller [ 21/Oct/14 11:09 AM ]

Rich has a (different) patch for this on the way.

Comment by Alex Miller [ 21/Oct/14 1:16 PM ]

Fixed directly by Rich in commit https://github.com/clojure/clojure/commit/38d7572e4254afdd7f02b78095ccdb27065754d2





[CLJ-1569] transduce does not respect the init arity of transducers Created: 19/Oct/14  Updated: 20/Oct/14  Resolved: 20/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Daniel James Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: transducers


 Description   

Note: I initially raised this issue for discussion on the mailing list
https://groups.google.com/d/msg/clojure/uVKP4_0KMwQ/-oUJahvUarIJ

transduce and other transducible processes currently ignore the 'init' arity of transducers. The current implementation of transduce takes the 'init' from the reducing function before it is transformed by the transducer, rather than from the reducing function after it is transformed.

The current implementation of transduce is equivalent to the following (simplified for exposition purposes):

Current implementation of transduce
(defn transduce
  ([xform f coll]
     (transduce xform f (f) coll))
  ([xform f init coll]
     (let [rf (xform f)]
       (rf (reduce rf init coll)))))

The arity 3 case uses (f) to construct the seed value of the reduction. The arity 4 case uses the explicitly provided seed, init.

I would like to propose an alternate implementation of transduce, one which makes use of the transducer when seeding the reduction.

Proposed implementation of transduce
(defn alt-transduce
  ([xform f coll]
     (let [rf (xform f)]
       (rf (reduce rf (rf) coll))))
  ([xform f init coll]
     (let [rf (xform
               (fn
                 ([] init)
                 ([result] (f result))
                 ([result input] (f result input))))]
       (rf (reduce rf (rf) coll)))))

Now, the arity 3 case uses (xform f) to construct the seed value of the reduction. The arity 4 case combines both f and init into a new reducing function that is given to xform. Both of these ensure that the init arity of the transducer is used.

As into is implemented in terms of transduce, it is also taken care of. However, sequence is separate, and would also have to be tweaked to respect the init arity.



 Comments   
Comment by Daniel James [ 19/Oct/14 1:24 PM ]

As a small addition, I just wanted to point out an example of where the current implementation raised curiosity:
https://groups.google.com/d/msg/clojure/M-13lRPfguc/IspgdpKDaGsJ

Comment by Alex Miller [ 20/Oct/14 9:12 AM ]

In transduce, the transducer is applied to the elements of the input and should not be entangled with the accumulation at all (either in initializing it or the act of accumulation). f is the final reducing function that deals with accumulation and initialization.

Comment by Daniel James [ 20/Oct/14 10:00 AM ]

Hi Alex,

I feel that you've misunderstood my proposal.

Could you explain how you consider

(defn init-with [x]
  (fn [rf]
    (fn
      ([] (rf (rf) x))
      ([result] (rf result))
      ([result input] (rf result input)))))

to be “entangled with the accumulation at all (either in initializing it or the act of accumulation).”

This seems like a completely legitimate transducer to me. It makes use of the init arity, while remaining oblivious to the accumulation.

Your explanation also seems to be at odds with

http://clojure.org/transducers

The inner function is defined with 3 arities used for different purposes:

  • Init (arity 0) - in most cases, this will just call the init arity on the nested transform xf, which will eventually call out to the transducing process to supply an initial value. It is also a place to establish the initial reducing state for the transducer.

Comment by Alex Miller [ 20/Oct/14 11:57 AM ]

By "entangling" I mean that in your alternate transduce you invoke the xform to obtain the initial value: ((xform f)) instead of (f). Transducers should not know about or be involved in the accumulating process.

The transducers page is in error and I will correct it (I wrote it; the error is mine).

Comment by Daniel James [ 20/Oct/14 3:25 PM ]

Ok, at the risk of belaboring the point (I have enough self-awareness to realize that I am probably about to do exactly that…) I feel that you are still missing something here.

Permit me to try one more time to explain my position.

Consider map

the map transducer
(defn map [f]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input] (rf result (f input))))))

It defines all three arities: init, step, and completion. It doesn’t have anything to do in the init arity, and so the only thing it can do is “call the init arity on the nested transform rf, which will eventually call out to the transducing process.” (taken from your update to http://clojure.org/transducers)

Saying that transducers should not be involved in the accumulating process has the right spirit, but you are missing something. The transducer is involved, but in a strictly constrained way. The transducer’s responsibility is to carefully thread the accumulator value around. Sure, it should not know what the value is, or what type it has, but it is still there. Every arity of map has access to it! In the init arity, map delegates to rf to construct it. In the completion arity, map has the result, but the only valid thing it can do with it is to pass it on to rf. Again, in the step arity, map has the result, and again the only legitimate thing it can do with it is to thread it through to rf.

Now consider the identity transducer:

the identity transducer
(def identity
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input] (rf result input)))))

This is a transducer in its purest form. All it has to do is correctly thread the accumulation value around. It doesn’t and shouldn’t know any details of what that value is; nonetheless, it still has the responsibility of threading that value correctly.

In each arity the identity transducer does the ‘trivial’ thing. In my post to the mailing list, I illustrated three examples of transducers that do something beyond the trivial thing in each of the three arities. (I’ll copy them here for completeness.)

non trivial threading of the accumulator in the init arity
(defn init-with
  [x]
  (fn [rf]
    (fn
      ([] (rf (rf) x))
      ([result] (rf result))
      ([result input]
         (rf result input)))))
non trivial threading of the accumulator in the completion arity
(defn complete-with
  [x]
  (fn [rf]
    (fn
      ([] (rf))
      ([result]
         (rf (rf result x)))
      ([result input]
         (rf result input)))))
non trivial threading of the accumulator in the step arity
(defn dupl
  []
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input]
         (rf (rf result input)
             input)))))

I would consider all of these to be perfectly valid transducers. However, unless I’ve misunderstood, you appear to be taking issue with init-with. If so, I’m very curious as to why!

a closer look at the init arity of init-with
(defn init-with
  [x]
  (fn [rf]
    (fn
      ([] (rf (rf) x))
      ...

Rather than just delegating to (rf), it threads that value immediately into rf with (rf (rf) x). So I don’t agree at all that any of these, init-with, complete-with, or dupl, are “entangled” with the accumulation value or the accumulation process. They are completely oblivious to both its value and its type!

So, returning to transduce,

the first case of an alternate transduce
(defn alt-transduce
  ([xform f coll]
     (let [rf (xform f)]
       (rf (reduce rf (rf) coll))))
  ...

A valid transducer is one that threads the accumulation value correctly. Therefore, ((xform f)) is (f) threaded through xform. All the transducers in clojure.core have the trivial ([] (rf)), so ((xform f)) built from these core transducers degenerates into (identity (f)).
However, since transduce, into, and sequence never even invoke the init arity, it raises the question: why require that transducers have that arity in the first place? Personally, I think the init arity is great, as it enables a transducer such as init-with (while remaining stateless), but that requires transducible processes to actually make use of the init arity! Hence my raising this issue.
It seems troubling to me that complete-with works perfectly fine in the current framework, yet init-with, its dual, does not.
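
To make the asymmetry concrete (a hypothetical REPL session, using init-with and complete-with as defined above):

(transduce (complete-with 10) conj [1 2 3])
;=> [1 2 3 10]   ; the completion arity is invoked, so complete-with works
(transduce (init-with 10) conj [1 2 3])
;=> [1 2 3]      ; the init arity is never invoked, so the 10 is silently dropped
;; the proposed alt-transduce would instead yield [10 1 2 3]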

I recognize that the various discussions around ‘typing transducers’ have made various approximations at elucidating the properties of transducers, but I feel strongly that the discussions around rank-2 polymorphism have some bearing on exactly this issue. In fact, they say rather a lot about correctly threading the accumulation value through transducers without ever “entangling” it in the precise accumulation process of wherever a transducer is being used.

And on this, it appears that Rich Hickey agrees: “The rank-2 type in particular captures an important property.” (http://conscientiousprogrammer.com/blog/2014/08/07/understanding-cloure-transducers-through-types/#comment-1533318972) Maybe I’ve got him all wrong, but as of right now I’m pretty convinced I don’t. Still, I’m willing to be convinced otherwise.

Comment by Alex Miller [ 20/Oct/14 10:03 PM ]

Rich asked me to decline the ticket because the init arity of the xform should not be involved in the reducing function accumulation.

Comment by Daniel James [ 20/Oct/14 10:34 PM ]

Ok, as you can guess I’m a little perplexed by that design choice, but I’ll accept it.

I’d appreciate any further insight you can offer on why this design choice has been taken.
Is the init arity simply a case of compatibility, despite it not being used? Is this a case of attempting to prevent the transducer writer from erroneously corrupting a transducible process? Is init-with actually considered to be an invalid transducer, and thus the only way to implement something equivalent would be as a stateful transducer?




