Clojure

mapcat is too eager

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Minor Minor
  • Resolution: Unresolved
  • Affects Version/s: Release 1.5
  • Fix Version/s: None
  • Component/s: None
  • Labels:
  • Patch:
    Code and Test

Description

The following expression prints 1234 and returns 1:

(first (mapcat #(do (print %) [%]) '(1 2 3 4 5 6 7)))

The reason is that (apply concat args) is not maximally lazy in its arguments, and indeed will realize the first four before returning the first item. This in turn is essentially unavoidable for a variadic concat.

This could either be fixed just in mapcat, or by adding a new function (to clojure.core?) that is a non-variadic equivalent to concat, and reimplementing mapcat with it:

(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq (when-let [[x & xs] (seq s)] (concat x (join xs)))))

Activity

Hide
Ghadi Shayban added a comment -

Though the docstring makes no lazyiness promises (except declaring its specific implementation), this seems like a defect, not an enhancement. mapcat realizes 4 underlying collections at minimum:

boot.user=> (defn notifying-seq [cb!] (lazy-seq (cb!) (cons :ignored (notifying-seq cb!))))                                                              

#'boot.user/notifying-seq                                                                                                                                

boot.user=> (let [a (atom 0)] 
  (seq (mapcat (constantly [:ignored :ignored])
               (notifying-seq #(swap! a inc))))
  @a)
4
Show
Ghadi Shayban added a comment - Though the docstring makes no lazyiness promises (except declaring its specific implementation), this seems like a defect, not an enhancement. mapcat realizes 4 underlying collections at minimum:
boot.user=> (defn notifying-seq [cb!] (lazy-seq (cb!) (cons :ignored (notifying-seq cb!))))                                                              

#'boot.user/notifying-seq                                                                                                                                

boot.user=> (let [a (atom 0)] 
  (seq (mapcat (constantly [:ignored :ignored])
               (notifying-seq #(swap! a inc))))
  @a)
4
Hide
Ghadi Shayban added a comment -

The version of join in the description is not maximally lazy either, and will realize two of the underlying collections. Reason: destructuring the seq results in a call to 'nth' for 'x' and 'nthnext' for 'xs'. nthnext is not maximally lazy.

(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq
   (when-let [s (seq s)] 
     (concat (first s) (join (rest s))))))
Show
Ghadi Shayban added a comment - The version of join in the description is not maximally lazy either, and will realize two of the underlying collections. Reason: destructuring the seq results in a call to 'nth' for 'x' and 'nthnext' for 'xs'. nthnext is not maximally lazy.
(defn join
  "Lazily concatenates a sequence-of-sequences into a flat sequence."
  [s]
  (lazy-seq
   (when-let [s (seq s)] 
     (concat (first s) (join (rest s))))))
Hide
Gary Fredericks added a comment -

Updated join code to be actually valid.

Show
Gary Fredericks added a comment - Updated join code to be actually valid.
Hide
John Jacobsen added a comment -

I lost several hours understanding this issue last month [1, 2] before seeing this ticket in Jira today... +1.

[1] http://eigenhombre.com/2013/07/13/updating-the-genome-decoder-resulting-consequences/

[2] http://clojurian.blogspot.com/2012/11/beware-of-mapcat.html

Show
John Jacobsen added a comment - I lost several hours understanding this issue last month [1, 2] before seeing this ticket in Jira today... +1. [1] http://eigenhombre.com/2013/07/13/updating-the-genome-decoder-resulting-consequences/ [2] http://clojurian.blogspot.com/2012/11/beware-of-mapcat.html
Hide
Gary Fredericks added a comment -

I realized that concat could actually be made lazier without changing its semantics, if it had a single [& args] clause that was then implemented similarly to join above.

Show
Gary Fredericks added a comment - I realized that concat could actually be made lazier without changing its semantics, if it had a single [& args] clause that was then implemented similarly to join above.

People

Vote (4)
Watch (5)

Dates

  • Created:
    Updated: