ClojureScript

Iterator based reduce path

Details

  • Type: Enhancement Enhancement
  • Status: Closed Closed
  • Priority: Minor Minor
  • Resolution: Completed
  • Affects Version/s: 1.9.562
  • Fix Version/s: 1.9.655
  • Component/s: None
  • Labels:
  • Patch:
    Code
  • Approval:
    Accepted

Description

In Clojure, when reducing collections which do not support IReduce but are Iterable, reduce uses the iter-reduce function to efficiently reduce these collections. Maps based on PersistentArrayMap and PersistentHashMap and the PersistentHashSet set type would immediately benefit from this. PersistentTreeMap and PersistentTreeSet currently do not support IIterator and would continue to fall back to the current seq-reduce.

  1. CLJS-2112.1.patch
    23/Jun/17 10:38 AM
    3 kB
    Thomas Mulvaney
  2. CLJS-2112.patch
    21/Jun/17 7:22 AM
    3 kB
    Thomas Mulvaney

Activity

Hide
Thomas Mulvaney added a comment -

Some crude benchmarks suggest a 2-4x improvement under V8:

== Maps ==
Patch:
[r (mk-map 4)], (reduce (fn [_ _] nil) r), 1000000 runs, 416 msecs
[r (mk-map 4)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 508 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) r), 1000000 runs, 775 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 762 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) r), 1000 runs, 38 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) nil r), 1000 runs, 31 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) r), 1000 runs, 212 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 220 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) r), 1000 runs, 1784 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 1839 msecs

Master:
[r (mk-map 4)], (reduce (fn [_ _] nil) r), 1000000 runs, 1084 msecs
[r (mk-map 4)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 1133 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) r), 1000000 runs, 1999 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 2335 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) r), 1000 runs, 47 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) nil r), 1000 runs, 48 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) r), 1000 runs, 759 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 670 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) r), 1000 runs, 7860 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 8406 msecs

== Sets ==
Patch:
[r (into #{} (range 4))], (reduce (fn [_ _] nil) r), 1000000 runs, 516 msecs
[r (into #{} (range 4))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 523 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) r), 1000000 runs, 657 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 811 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) r), 1000 runs, 28 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) nil r), 1000 runs, 19 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) r), 1000 runs, 267 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 323 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) r), 1000 runs, 1568 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 1852 msecs

Master:
[r (into #{} (range 4))], (reduce (fn [_ _] nil) r), 1000000 runs, 1362 msecs
[r (into #{} (range 4))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 1361 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) r), 1000000 runs, 2361 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 2295 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) r), 1000 runs, 74 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) nil r), 1000 runs, 150 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) r), 1000 runs, 884 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 859 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) r), 1000 runs, 8202 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 7649 msecs
Show
Thomas Mulvaney added a comment - Some crude benchmarks suggest a 2-4x improvement under V8:
== Maps ==
Patch:
[r (mk-map 4)], (reduce (fn [_ _] nil) r), 1000000 runs, 416 msecs
[r (mk-map 4)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 508 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) r), 1000000 runs, 775 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 762 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) r), 1000 runs, 38 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) nil r), 1000 runs, 31 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) r), 1000 runs, 212 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 220 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) r), 1000 runs, 1784 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 1839 msecs

Master:
[r (mk-map 4)], (reduce (fn [_ _] nil) r), 1000000 runs, 1084 msecs
[r (mk-map 4)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 1133 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) r), 1000000 runs, 1999 msecs
[r (mk-map 8)], (reduce (fn [_ _] nil) nil r), 1000000 runs, 2335 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) r), 1000 runs, 47 msecs
[r (mk-map 100)], (reduce (fn [_ _] nil) nil r), 1000 runs, 48 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) r), 1000 runs, 759 msecs
[r (mk-map 1000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 670 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) r), 1000 runs, 7860 msecs
[r (mk-map 10000)], (reduce (fn [_ _] nil) nil r), 1000 runs, 8406 msecs

== Sets ==
Patch:
[r (into #{} (range 4))], (reduce (fn [_ _] nil) r), 1000000 runs, 516 msecs
[r (into #{} (range 4))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 523 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) r), 1000000 runs, 657 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 811 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) r), 1000 runs, 28 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) nil r), 1000 runs, 19 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) r), 1000 runs, 267 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 323 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) r), 1000 runs, 1568 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 1852 msecs

Master:
[r (into #{} (range 4))], (reduce (fn [_ _] nil) r), 1000000 runs, 1362 msecs
[r (into #{} (range 4))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 1361 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) r), 1000000 runs, 2361 msecs
[r (into #{} (range 8))], (reduce (fn [_ _] nil) nil r), 1000000 runs, 2295 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) r), 1000 runs, 74 msecs
[r (into #{} (range 100))], (reduce (fn [_ _] nil) nil r), 1000 runs, 150 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) r), 1000 runs, 884 msecs
[r (into #{} (range 1000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 859 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) r), 1000 runs, 8202 msecs
[r (into #{} (range 10000))], (reduce (fn [_ _] nil) nil r), 1000 runs, 7649 msecs
Hide
David Nolen added a comment -

For patches like this it's important to provide more context to aid assessment. Does this actually match Clojure's approach? Is there any reason for these collection types to not implement IReduce instead?

Show
David Nolen added a comment - For patches like this it's important to provide more context to aid assessment. Does this actually match Clojure's approach? Is there any reason for these collection types to not implement IReduce instead?
Hide
Thomas Mulvaney added a comment -

The iter-reduce in the patch is basically a 1-1 copy of what is in Clojure's clojure.core.protocols namespace.

In Clojure the reduce path way looks like: Check if coll is an instance of IReduce, if so use it otherwise call clojure.core.protocols/coll-reduce which will fall back on iter-reduce, seq-reduce , etc.

Is there any reason not to implement IReduce for the collections instead? The collections mentioned above don't have IReduce in Clojure - not sure if that is a good reason.

Should implementing IReduce for all collections be chosen it should be fairly simple:

PAM, PHM and PTM's IReduce would be basically a copy of their IKVReduce implementation. PHS and PTS's IReduce implementation would be a call to their backing maps new -reduce method.

Show
Thomas Mulvaney added a comment - The iter-reduce in the patch is basically a 1-1 copy of what is in Clojure's clojure.core.protocols namespace. In Clojure the reduce path way looks like: Check if coll is an instance of IReduce, if so use it otherwise call clojure.core.protocols/coll-reduce which will fall back on iter-reduce, seq-reduce , etc. Is there any reason not to implement IReduce for the collections instead? The collections mentioned above don't have IReduce in Clojure - not sure if that is a good reason. Should implementing IReduce for all collections be chosen it should be fairly simple: PAM, PHM and PTM's IReduce would be basically a copy of their IKVReduce implementation. PHS and PTS's IReduce implementation would be a call to their backing maps new -reduce method.
Hide
A. R added a comment -

I wonder what the performance would be if we just used IKVReduce, like

(-reduce [coll f]
    (reduce-kv
      (fn [xs k v]
        (f xs [k v]))
      coll))

Could be added to all those IKVReduce or to the reduce function more globally.

I think in CLJ the iter reduce was mostly done because it works with the Java collections? (Just a wild guess).

Show
A. R added a comment - I wonder what the performance would be if we just used IKVReduce, like
(-reduce [coll f]
    (reduce-kv
      (fn [xs k v]
        (f xs [k v]))
      coll))
Could be added to all those IKVReduce or to the reduce function more globally. I think in CLJ the iter reduce was mostly done because it works with the Java collections? (Just a wild guess).
Hide
Thomas Mulvaney added a comment -

@aralo That works great for implementing the 3-arity version of reduce. The example you give is for the the 2-arity (no initial input) which reduce-kv does not support.

But yes, that would be an elegant implementation for the 3-arity reduce method.

If IKVReduce was to also support 2-arity, both forms of reduce on maps could be implemented this way.

Show
Thomas Mulvaney added a comment - @aralo That works great for implementing the 3-arity version of reduce. The example you give is for the the 2-arity (no initial input) which reduce-kv does not support. But yes, that would be an elegant implementation for the 3-arity reduce method. If IKVReduce was to also support 2-arity, both forms of reduce on maps could be implemented this way.
Hide
A. R added a comment - - edited

Ah, right, the 2-arity version. I never use that. Way too confusing. I think this implementation using kv-reduce would satisfy the behavior semantics of reduce:

(let [find-max-entry (fn
                       ([] :no-max)
                       ([[max-key max-val :as entry] [k v]]
                        (if (< max-val v)
                          [k v]
                          entry)))
      res (reduce-kv
            (fn [xs k v]
              (if (identical? lookup-sentinel xs)
                [k v]
                (find-max-entry xs [k v])))
            lookup-sentinel
            {:a 1, :b 2})]
  (if (identical? res lookup-sentinel)
    (find-max-entry)
    res))

(try with empty map, one map entry and as given with 2 map entries). How is the performance for this?

Show
A. R added a comment - - edited Ah, right, the 2-arity version. I never use that. Way too confusing. I think this implementation using kv-reduce would satisfy the behavior semantics of reduce:
(let [find-max-entry (fn
                       ([] :no-max)
                       ([[max-key max-val :as entry] [k v]]
                        (if (< max-val v)
                          [k v]
                          entry)))
      res (reduce-kv
            (fn [xs k v]
              (if (identical? lookup-sentinel xs)
                [k v]
                (find-max-entry xs [k v])))
            lookup-sentinel
            {:a 1, :b 2})]
  (if (identical? res lookup-sentinel)
    (find-max-entry)
    res))
(try with empty map, one map entry and as given with 2 map entries). How is the performance for this?
Hide
David Nolen added a comment -

Ok, I'm fine with the basic approach of the patch as is. Minor nit, `.hasNext` method calls need `^boolean` hint for `if` test to be optimized.

Show
David Nolen added a comment - Ok, I'm fine with the basic approach of the patch as is. Minor nit, `.hasNext` method calls need `^boolean` hint for `if` test to be optimized.
Hide
Thomas Mulvaney added a comment -

Updated patch attached with .hasNext calls annotated as boolean.

Show
Thomas Mulvaney added a comment - Updated patch attached with .hasNext calls annotated as boolean.
Hide
David Nolen added a comment -
Show
David Nolen added a comment - fixed https://github.com/clojure/clojurescript/commit/4e48a3448efa24987cfef6fb676439a7b131f017 Added ^boolean hints in the following commit.
Hide
Thomas Mulvaney added a comment -

Great, thanks!

Show
Thomas Mulvaney added a comment - Great, thanks!

People

Vote (0)
Watch (2)

Dates

  • Created:
    Updated:
    Resolved: