Clojure

into does not work with IReduceInit

Details

  • Type: Defect Defect
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Completed
  • Affects Version/s: Release 1.7
  • Fix Version/s: Release 1.7
  • Component/s: None
  • Labels:
  • Patch:
    Code and Test
  • Approval:
    Ok

Description

This should work:

(into []
  (reify clojure.lang.IReduceInit
    (reduce [_ f start]
      (reduce f start (range 10)))))
IllegalArgumentException Don't know how to create ISeq from: user$eval5$reify__6
	clojure.lang.RT.seqFrom (RT.java:506)
	clojure.lang.RT.seq (RT.java:487)
	clojure.core/seq--seq--4091 (core.clj:135)
	clojure.core.protocols/seq-reduce (protocols.clj:30)
	clojure.core.protocols/fn--6422 (protocols.clj:42)
	clojure.core.protocols/fn--6369/f--6255--auto----G--6364--6382 (protocols.clj:13)
	clojure.core/reduce (core.clj:6469)
	clojure.core/into (core.clj:6550)

Cause: CollReduce only supports IReduce, not IReduceInit so when reduce calls into it, it falls back to trying to obtain a seq representation which fails.

Proposed: Extend CollReduce to IReduceInit and in the non-init arity, cast to IReduce. Also, now that CollReduce supports both IReduceInit and Iterable, a coll that implements both makes the path through CollReduce nondeterministic. transduce does an explicit check that prefers IReduceInit - the patch copies that approach to reduce as well.

Another consequence of this change is that since PersistentVector implements IReduce but throws on the non-init path, there are some test breakages. To address this, CLJ-1619 (which implements the non-init reduce) must be applied first.

Patch: clj-1572-4.patch
Depends on: CLJ-1619 being applied first

  1. clj-1572.patch
    24/Oct/14 10:40 AM
    0.8 kB
    Alex Miller
  2. clj-1572-2.patch
    05/Nov/14 1:56 PM
    2 kB
    Alex Miller
  3. clj-1572-3.patch
    12/Nov/14 1:13 PM
    2 kB
    Alex Miller
  4. clj-1572-4.patch
    14/Nov/14 9:18 AM
    2 kB
    Alex Miller
  5. CLJ-1572-alternative-POC.patch
    17/Nov/14 3:46 PM
    5 kB
    Nicola Mometto

Activity

Hide
Alex Miller added a comment -

into calls reduce which calls into CollReduce. CollReduce extends to IReduce, but not to IReduceInit. If CollReduce were extended to IReduceInit for the arity it can support, into work as expected in the given example. Patch clj-1572.patch does this.

Show
Alex Miller added a comment - into calls reduce which calls into CollReduce. CollReduce extends to IReduce, but not to IReduceInit. If CollReduce were extended to IReduceInit for the arity it can support, into work as expected in the given example. Patch clj-1572.patch does this.
Hide
Ghadi Shayban added a comment -

It is also possible that core/reduce needs the same special casing of IReduceInit that transduce has to allow for a deterministic dispatch when transduce is called with (mapcat f), as mapcat calls reduce.

Show
Ghadi Shayban added a comment - It is also possible that core/reduce needs the same special casing of IReduceInit that transduce has to allow for a deterministic dispatch when transduce is called with (mapcat f), as mapcat calls reduce.
Hide
Stuart Halloway added a comment -

Can someone please expand on Ghadi's comment with an example of the problem?

Show
Stuart Halloway added a comment - Can someone please expand on Ghadi's comment with an example of the problem?
Hide
Ghadi Shayban added a comment -

Example of something that is Iterable & ReduceInit:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L122-L128

Let's call that r/range in this example:
(transduce (mapcat r/range) + 0 [5 5 5 5 5])

The when the mapcat transducer encounters r/range, the inner reduce call will dispatch through CollReduce upon Iterable, rather than IReduceInit.

the inner call to reduce within cat:
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7243

Show
Ghadi Shayban added a comment - Example of something that is Iterable & ReduceInit: https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L122-L128 Let's call that r/range in this example: (transduce (mapcat r/range) + 0 [5 5 5 5 5]) The when the mapcat transducer encounters r/range, the inner reduce call will dispatch through CollReduce upon Iterable, rather than IReduceInit. the inner call to reduce within cat: https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7243
Hide
Alex Miller added a comment -

To restate the issue from Ghadi for my own sake:

The CollReduce protocol extends to IReduce, IReduceInit and Iterable. Because these are all interfaces, its possible for a custom coll to implement two or more of them. In that case, Clojure will arbitrarily pick which protocol impl is called - this can result in the Iterable version being called instead of IReduce/IReduceInit (which should be preferred).

transduce avoids this by explicitly checking for IReduceInit and preferring it over CollReduce.

Ghadi is suggesting that reduce should also make this preference (currently it does not).

Show
Alex Miller added a comment - To restate the issue from Ghadi for my own sake: The CollReduce protocol extends to IReduce, IReduceInit and Iterable. Because these are all interfaces, its possible for a custom coll to implement two or more of them. In that case, Clojure will arbitrarily pick which protocol impl is called - this can result in the Iterable version being called instead of IReduce/IReduceInit (which should be preferred). transduce avoids this by explicitly checking for IReduceInit and preferring it over CollReduce. Ghadi is suggesting that reduce should also make this preference (currently it does not).
Hide
Nicola Mometto added a comment -

If CollReduce could be direcly backed by the IReduce interface, this would remove the need for explicit IReduceInit checking at the callsite.

It's already possible to (defprotocol CollReduce :on-interface clojure.lang.IReduce ..), I'm proposing adding the ability to map the "reduce" method to the coll-reduce protocol-fn aswell and go with this solution

Show
Nicola Mometto added a comment - If CollReduce could be direcly backed by the IReduce interface, this would remove the need for explicit IReduceInit checking at the callsite. It's already possible to (defprotocol CollReduce :on-interface clojure.lang.IReduce ..), I'm proposing adding the ability to map the "reduce" method to the coll-reduce protocol-fn aswell and go with this solution
Hide
Alex Miller added a comment -

CollReduce extends to two interfaces (IReduceInit and Iterable) and for some impls this is ambiguous under the CollReduce protocol. The check in reduce and transduce is to force the choice of IReduceInit so it is not ambiguous. I think your suggestion re-introduces that issue? Or maybe I'm just not understanding what you mean.

Show
Alex Miller added a comment - CollReduce extends to two interfaces (IReduceInit and Iterable) and for some impls this is ambiguous under the CollReduce protocol. The check in reduce and transduce is to force the choice of IReduceInit so it is not ambiguous. I think your suggestion re-introduces that issue? Or maybe I'm just not understanding what you mean.
Hide
Nicola Mometto added a comment -

Turns out defprotocol already has that capability via :on metadata field.

The attached patch is a proof of concept of my proposal, if there's interest in this approach I can fix the deftype/record/reify method parser to automatically pick the var name rather than having to specify the method name.

Show
Nicola Mometto added a comment - Turns out defprotocol already has that capability via :on metadata field. The attached patch is a proof of concept of my proposal, if there's interest in this approach I can fix the deftype/record/reify method parser to automatically pick the var name rather than having to specify the method name.
Hide
Nicola Mometto added a comment -

Ah, I see now the issue. Disregard my patch then.

Show
Nicola Mometto added a comment - Ah, I see now the issue. Disregard my patch then.
Hide
Ghadi Shayban added a comment -

Note that unless this patch is applied, a plain reduce over an Eduction goes through the seq/iterator path of CollReduce, and not eduction's native IReduceInit path.

Show
Ghadi Shayban added a comment - Note that unless this patch is applied, a plain reduce over an Eduction goes through the seq/iterator path of CollReduce, and not eduction's native IReduceInit path.
Hide
Ghadi Shayban added a comment -

with this patch + CLJ-1546

(reduce + [1 2 3]) doesn't work anymore, breaking a few tests.

Show
Ghadi Shayban added a comment - with this patch + CLJ-1546 (reduce + [1 2 3]) doesn't work anymore, breaking a few tests.
Hide
Ghadi Shayban added a comment - - edited

Should have left a bit more detail.
https://github.com/clojure/clojure/commit/ad7d9c46992cac0e812ce3dd47584c9bb2fda11f

This might not have anything to do with CLJ-1546, just happened to have them both applied. Seems like vectors are both IReduce+IReduceInit, but throw on the IReduce impl.

Vectors were made IReduce before IReduce was split into IReduceInit.

Show
Ghadi Shayban added a comment - - edited Should have left a bit more detail. https://github.com/clojure/clojure/commit/ad7d9c46992cac0e812ce3dd47584c9bb2fda11f This might not have anything to do with CLJ-1546, just happened to have them both applied. Seems like vectors are both IReduce+IReduceInit, but throw on the IReduce impl. Vectors were made IReduce before IReduce was split into IReduceInit.
Hide
Nicola Mometto added a comment -

I've opened CLJ-1619 with a patch implementing the no-init arity of reduce for PersistentVector

Show
Nicola Mometto added a comment - I've opened CLJ-1619 with a patch implementing the no-init arity of reduce for PersistentVector
Hide
Nicola Mometto added a comment -

An alternative fix would be to just make PersistentVectors IReduceInit rather than IReduce but I don't see the point in doing that since the implementation is trivial.

Show
Nicola Mometto added a comment - An alternative fix would be to just make PersistentVectors IReduceInit rather than IReduce but I don't see the point in doing that since the implementation is trivial.
Hide
Ghadi Shayban added a comment -

Nicola, that is my impression, that Rich intended PersistentVector to be IReduceInit but not IReduce. But he changed it before that interface was split. Would still need some sort of way to handle the existing no-init case, which he mentioned was unfortunate at the conj.

Show
Ghadi Shayban added a comment - Nicola, that is my impression, that Rich intended PersistentVector to be IReduceInit but not IReduce. But he changed it before that interface was split. Would still need some sort of way to handle the existing no-init case, which he mentioned was unfortunate at the conj.
Hide
Nicola Mometto added a comment -

Ghadi, what would the rationale be for PV not supporting the no-init arity? I'm not aware of any technical issues caused by my patch for CLJ-1619 but maybe you know about one?

Show
Nicola Mometto added a comment - Ghadi, what would the rationale be for PV not supporting the no-init arity? I'm not aware of any technical issues caused by my patch for CLJ-1619 but maybe you know about one?
Hide
Ghadi Shayban added a comment -

No I may just be confused.

Rubber-ducking aloud so that I can be corrected:

A call to init reduce with an 'init' supplied is unambiguous.

It's the responsibility of an IReduce to do what is appropriate with 'f', whereas with "improved" reduce and transduce (f) becomes the init. (c.c.reducers/reduce being the improved reduce)

IReduce implementations must preserve compatibility with core/reduce's docstring in the 0 and 1 arity cases.

If coll contains no
items, f must accept no arguments as well, and reduce returns the
result of calling f with no arguments. If coll has only 1 item, it
is returned and f is not called. If val is supplied, returns the
result of applying f to val and the first item in coll, then
applying f to that result and the 2nd item, etc. If coll contains no
items, returns val and f is not called.

A protocol's dispatch is non-deterministic when invoked upon things with multiple paths. One way to resolve the ambiguities in CollReduce is to extend CollReduce directly to any IReduce/IReduceInit impl and not rely upon friends like Iterable.

For example CLJ-1515 needs the same CollReduce extension as CLJ-1603's Iterable/Repeat/Cycle got.

(extend-protocol p/CollReduce
  clojure.lang.LongRange
  (coll-reduce [this f] (.reduce this f)
  ..

  clojure.lang.Cycle
  ...)
;; etc.
Show
Ghadi Shayban added a comment - No I may just be confused. Rubber-ducking aloud so that I can be corrected: A call to init reduce with an 'init' supplied is unambiguous. It's the responsibility of an IReduce to do what is appropriate with 'f', whereas with "improved" reduce and transduce (f) becomes the init. (c.c.reducers/reduce being the improved reduce) IReduce implementations must preserve compatibility with core/reduce's docstring in the 0 and 1 arity cases.
If coll contains no items, f must accept no arguments as well, and reduce returns the result of calling f with no arguments. If coll has only 1 item, it is returned and f is not called. If val is supplied, returns the result of applying f to val and the first item in coll, then applying f to that result and the 2nd item, etc. If coll contains no items, returns val and f is not called.
A protocol's dispatch is non-deterministic when invoked upon things with multiple paths. One way to resolve the ambiguities in CollReduce is to extend CollReduce directly to any IReduce/IReduceInit impl and not rely upon friends like Iterable. For example CLJ-1515 needs the same CollReduce extension as CLJ-1603's Iterable/Repeat/Cycle got.
(extend-protocol p/CollReduce
  clojure.lang.LongRange
  (coll-reduce [this f] (.reduce this f)
  ..

  clojure.lang.Cycle
  ...)
;; etc.
Hide
Nicola Mometto added a comment -

I feel like all those issues introduced by the non-deterministic dispatch of protocol functions in case of multiple available implementations, could (and should?) be solved by a prefer-method-like capability for protocols.
This way we could say have a bunch of hints like (prefer-dispatch CollReduce IReduce Iterable) and be done with it.

Show
Nicola Mometto added a comment - I feel like all those issues introduced by the non-deterministic dispatch of protocol functions in case of multiple available implementations, could (and should?) be solved by a prefer-method-like capability for protocols. This way we could say have a bunch of hints like (prefer-dispatch CollReduce IReduce Iterable) and be done with it.

People

Vote (0)
Watch (3)

Dates

  • Created:
    Updated:
    Resolved: