Clojure

mapcat is too eager

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Minor Minor
  • Resolution: Unresolved
  • Affects Version/s: Release 1.5
  • Fix Version/s: None
  • Component/s: None
  • Labels:
  • Patch:
    Code and Test

Description

The following expression prints 1234 and returns 1:

(first (mapcat #(do (print %) [%]) '(1 2 3 4 5 6 7)))

The reason is that (apply concat args) is not maximally lazy in its arguments, and indeed will realize the first four before returning the first item. This in turn is essentially unavoidable for a variadic concat.

This could either be fixed just in mapcat, or by adding a new function (to clojure.core?) that is a non-variadic equivalent to concat, and reimplementing mapcat with it:

(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq (when-let [[x & xs] (seq s)] (concat x (join xs)))))

Activity

Hide
Gary Fredericks added a comment -

I realized that concat could actually be made lazier without changing its semantics, if it had a single [& args] clause that was then implemented similarly to join above.

Show
Gary Fredericks added a comment - I realized that concat could actually be made lazier without changing its semantics, if it had a single [& args] clause that was then implemented similarly to join above.
Hide
John Jacobsen added a comment -

I lost several hours understanding this issue last month [1, 2] before seeing this ticket in Jira today... +1.

[1] http://eigenhombre.com/2013/07/13/updating-the-genome-decoder-resulting-consequences/

[2] http://clojurian.blogspot.com/2012/11/beware-of-mapcat.html

Show
John Jacobsen added a comment - I lost several hours understanding this issue last month [1, 2] before seeing this ticket in Jira today... +1. [1] http://eigenhombre.com/2013/07/13/updating-the-genome-decoder-resulting-consequences/ [2] http://clojurian.blogspot.com/2012/11/beware-of-mapcat.html
Alex Miller made changes -
Field Original Value New Value
Summary mapcat is not very lazy mapcat is too eager
Issue Type Defect [ 1 ] Enhancement [ 4 ]
Alex Miller made changes -
Labels lazy
Hide
Gary Fredericks added a comment -

Updated join code to be actually valid.

Show
Gary Fredericks added a comment - Updated join code to be actually valid.
Gary Fredericks made changes -
Description The following expression prints {{1234}} and returns {{1}}:

{code}
(first (mapcat #(do (print %) [%]) '(1 2 3 4 5 6 7)))
{code}

The reason is that {{(apply concat args)}} is not maximally lazy in its arguments, and indeed will realize the first four before returning the first item. This in turn is essentially unavoidable for a variadic {{concat}}.

This could either be fixed just in {{mapcat}}, or by adding a new function (to {{clojure.core}}?) that is a non-variadic equivalent to {{concat}}, and reimplementing {{mapcat}} with it:

{code}
(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq (concat (first s) (concats (rest s)))))
{code}
The following expression prints {{1234}} and returns {{1}}:

{code}
(first (mapcat #(do (print %) [%]) '(1 2 3 4 5 6 7)))
{code}

The reason is that {{(apply concat args)}} is not maximally lazy in its arguments, and indeed will realize the first four before returning the first item. This in turn is essentially unavoidable for a variadic {{concat}}.

This could either be fixed just in {{mapcat}}, or by adding a new function (to {{clojure.core}}?) that is a non-variadic equivalent to {{concat}}, and reimplementing {{mapcat}} with it:

{code}
(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq (when-let [[x & xs] (seq s)] (concat x (join xs)))))
{code}
Hide
Ghadi Shayban added a comment -

The version of join in the description is not maximally lazy either, and will realize two of the underlying collections. Reason: destructuring the seq results in a call to 'nth' for 'x' and 'nthnext' for 'xs'. nthnext is not maximally lazy.

(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq
   (when-let [s (seq s)] 
     (concat (first s) (join (rest s))))))
Show
Ghadi Shayban added a comment - The version of join in the description is not maximally lazy either, and will realize two of the underlying collections. Reason: destructuring the seq results in a call to 'nth' for 'x' and 'nthnext' for 'xs'. nthnext is not maximally lazy.
(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq
   (when-let [s (seq s)] 
     (concat (first s) (join (rest s))))))
Leonid Beschastny made changes -
Comment [ Alternatively, we could re-implement `mapcat` using `lazy-seq` explicitly:

{code}
(defn mapcat
  [f coll]
  (lazy-seq
    (when-let [[x & xs] (seq coll)]
      (concat (f x) (lazy-mapcat f xs)))))
{code} ]
Ghadi Shayban made changes -
Attachment CLJ-1218-lazier-mapcat.patch [ 16691 ]
Ghadi Shayban made changes -
Patch Code and Test [ 10002 ]
Hide
Ghadi Shayban added a comment -

Though the docstring makes no lazyiness promises (except declaring its specific implementation), this seems like a defect, not an enhancement. mapcat realizes 4 underlying collections at minimum:

boot.user=> (defn notifying-seq [cb!] (lazy-seq (cb!) (cons :ignored (notifying-seq cb!))))                                                              

#'boot.user/notifying-seq                                                                                                                                

boot.user=> (let [a (atom 0)] 
  (seq (mapcat (constantly [:ignored :ignored])
               (notifying-seq #(swap! a inc))))
  @a)
4
Show
Ghadi Shayban added a comment - Though the docstring makes no lazyiness promises (except declaring its specific implementation), this seems like a defect, not an enhancement. mapcat realizes 4 underlying collections at minimum:
boot.user=> (defn notifying-seq [cb!] (lazy-seq (cb!) (cons :ignored (notifying-seq cb!))))                                                              

#'boot.user/notifying-seq                                                                                                                                

boot.user=> (let [a (atom 0)] 
  (seq (mapcat (constantly [:ignored :ignored])
               (notifying-seq #(swap! a inc))))
  @a)
4

People

Vote (4)
Watch (5)

Dates

  • Created:
    Updated: