<< Back to previous view

[CLJ-1669] Move LazyTransformer to an iterator strategy, extend eduction capabilities Created: 04/Mar/15  Updated: 10/Apr/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: transducers

Attachments: Text File clj-1669-2.patch     Text File clj-1669-3.patch     Text File clj-1669-4.patch     Text File clj-1669-5.patch     Text File clj-1669-6.patch     Text File clj-1669.patch    
Patch: Code and Test
Approval: Ok

  • LazyTransformer does a lot of work to be a seq. Instead, switch to creating a transforming iterator.
  • Change sequence to wrap iterator-seq around the transforming iterator.
  • Change the iterator-seq implementation to be chunked. IteratorSeq will no longer be used but is left in case of regressions for now.
  • Change Eduction to provide iteration directly via the transforming iterator.
  • Extend eduction to support multiple xforms.


(use 'criterium.core)
(def s (range 1000))
(def v (vec s))
(def s50 (range 50))
(def v50 (vec s50))

expr master s master v 1669-3 s 1669-3 v 1669-6 s 1669-6 v
non-chunking transform            
(into [] (->> s (interpose 5) (partition-all 2))) 466 us 459 us 525 us 508 us 476 us 501 us
(into [] (->> s (eduction (interpose 5) (partition-all 2)))) * 113 us 112 us 117 us 122 us 108 us 108 us
1 chunking transform            
(into [] (map inc s)) 28 us 31 us 30 us 31 us 30 us 31 us
(into [] (map inc) s) 17 us 19 us 19 us 20 us 19 us 17 us
(into [] (sequence (map inc) s)) 58 us 46 us 142 us 148 us 94 us 67 us
(into [] (eduction (map inc) s)) 21 us 20 us 25 us 21 us 23 us 21 us
(doall (map inc (eduction (map inc) s))) 219 us 208 us 204 us 191 us 117 us 97 us
2 chunking transforms        
(into [] (map inc (map inc s))) 49 us 50 us 50 us 50 us 54 us 55 us
(into [] (comp (map inc) (map inc)) s) 23 us 23 us 28 us 23 us 23 us 23 us
(into [] (sequence (comp (map inc) (map inc)) s)) 73 us 58 us 144 us 135 us 104 us 82 us
(into [] (eduction (map inc) (map inc) s)) * 54 us 51 us 54 us 29 us 55 us 30 us
(doall (map inc (eduction (map inc) (map inc) s))) * 230 us 213 us 213 us 196 us 124 us 104 us
expand transform            
(into [] (mapcat range (map inc s50))) 83 us 81 us 80 us 84 us 71 us 72 us
(into [] (sequence (comp (map inc) (mapcat range)) s50)) 122 us 117 us 256 us 254 us 161 us 156 us
(into [] (eduction (map inc) (mapcat range) s50)) * 78 us 79 us 80 us 82 us 60 us 61 us
materialized eduction            
(sort (eduction (map inc) s)) ERR ERR 120 us 84 us 106 us 89 us
(->> s (filter odd?) (map str) (sort-by last)) 1.13 ms 1.21 ms 1.19 ms 1.20 ms 1.19 ms 1.20 ms
(->> s (eduction (filter odd?) (map str)) (sort-by last)) ERR ERR 1.18 ms 1.17 ms 1.22 ms 1.23 ms
  • used comp to combine xforms as eduction only took one in the before case

Patch: clj-1669-6.patch

Screened by:

Comment by Michael Blume [ 05/Mar/15 3:52 PM ]

Nice, I like the direction on this.

CLJ-1515 currently breaks this patch (LongRange cannot be converted to Iterable), but I imagine that'll get better when it absorbs the changes from CLJ-1603

Comment by Alex Miller [ 06/Mar/15 8:11 AM ]

Yeah. colls should be mapped through RT.iter() to catch more cases.

Comment by Alex Miller [ 06/Mar/15 9:52 AM ]

To do:

  • remove Seqable from Eduction
  • support Iterable in RT.toArray()
  • more eduction pipeline tests that require realization at end
Comment by Alex Miller [ 06/Mar/15 1:00 PM ]

Perf numbers show pretty worse results from sequence, will dig in further.

Comment by Alex Miller [ 13/Mar/15 7:41 AM ]

For the s timings, we've actually introduced more steps into the stack:

OLD reduce with s:

   seq (range) - every transformation is another layer here

NEW reduce with s:

  TransformingIterator (handles N transformations in 1 step)
      seq (range)
Comment by Alex Miller [ 20/Mar/15 10:08 AM ]

Look at perf for:

  • ->> eduction transformation
  • transformation comparison that doesn't support chunking
  • more into vector iteration case
Comment by Alex Miller [ 21/Mar/15 8:45 AM ]

The -5 patch is same -3 except all uses of IteratorSeq have been replaced with a ChunkedCons that is effectively a chunked version of the old IteratorSeq. While no one calls it, I left IteratorSeq in the code base in case of regression.

Generally, the chunked iterator seq reduces the cost in a number of the worst cases and also is a clear benefit in making seqs over a result of eduction or sequence faster to traverse (as they are now chunked).

I think the one potential issue is that seqs over iterators are now chunked when they were not before which could change programs that expect their stateful iterator to be traversed one at a time. This change could be isolated to just to sequence and seq-iterator and mitigated by not changing RT.seqFrom() and seq-iterator to use the new chunking behavior only in sequence and/or with a new chunked-iterator-seq to make it more explicit. The sequence over xf is new so no possible regression there, everything else would just be opt-in.

Comment by Rich Hickey [ 27/Mar/15 9:49 AM ]

push as is but leave unresolved, for perf tweaks

Comment by Alex Miller [ 27/Mar/15 10:15 AM ]

clj-1669-6 is identical to clj-1669-5 but removes two commented out debugging lines that were inadvertently included.

[CLJ-1706] top level conditional splicing ignores all but first element Created: 15/Apr/15  Updated: 24/Apr/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: reader

Approval: Incomplete

user=> (def a (java.io.PushbackReader. (java.io.StringReader. "#?@(:clj [1 2])")))
user=> (read {:read-cond :allow} a)
user=> (read {:read-cond :allow} a)
RuntimeException EOF while reading  clojure.lang.Util.runtimeException (Util.java:221)

I believe using a conditional splicing not inside a list should be a read-time exception like

user=> `~@()
IllegalStateException splice not in list  clojure.lang.LispReader$SyntaxQuoteReader.syntaxQuote (LispReader.java:883)

Comment by Alex Miller [ 15/Apr/15 2:05 PM ]

pulling into 1.7 so we can discuss

Comment by Alex Miller [ 24/Apr/15 11:18 AM ]

Compiler.load() makes calls into LispReader.read() (static call). If the reader reads a top-level splicing conditional read form, that will read the entire form, then return the first spliced element. when LispReader.read() returns, the list carrying the other pending forms is lost.

I see two options:

1) Allow the compiler to call the LispReader with a mutable pendingForms list, basically maintaining that state across the static calls from outside the reader. makes the calling model more complicated and exposes the internal details of the pendingform stuff, but is probably the smaller change.

2) Enhance the LineNumberingPushbackReader in some way to remember the pending forms. This would probably allow us to remove the pending form stuff carried around throughout the LispReader and retain the existing (sensible) api. Much bigger change but probably better direction.

Comment by Nicola Mometto [ 24/Apr/15 11:24 AM ]

What about simply disallowing cond-splicing when top level?
Both proposed options are breaking changes since read currently only requires a java.io.PushbackReader

Comment by Alex Miller [ 24/Apr/15 11:42 AM ]

We want to allow top-level cond-splicing.

Comment by Nicola Mometto [ 24/Apr/15 11:52 AM ]

Would automatically wrapping a top-level cond-splicing in a (do ..) form be out of the question?

I'm personally opposed to supporting this feature as it would change the contract of c.c/read, complicate the implementation of LineNumberingPushbackReader or LispReader and complicate significantly the implementaion of tools.reader's reader types, for no significant benefit.
Is it really that important to be able to write

#~@(:clj [1 2])

rather than

(do #~@(:clj [1 2]))


[CLJ-1716] IExceptionInfo should print with its ex-data Created: 26/Apr/15  Updated: 28/Apr/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Gary Fredericks Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader

Attachments: Text File 0001-Add-tests-for-error-printing.patch     Text File 0002-CLJ-1716-print-ex-data-on-throwables.patch    
Patch: Code and Test
Approval: Incomplete


The new (for 1.7) data-reader printing format for exceptions does not include the ex-data when relevant:

user> *clojure-version*
{:major 1, :minor 7, :incremental 0, :qualifier "beta2"}
user> (pr (ex-info "msg" {:my-data 42}))
#error {
 :cause "msg"
 [{:type clojure.lang.ExceptionInfo
   :message "msg"
   :at [clojure.core$ex_info invoke "core.clj" 4591]}]
 [[clojure.core$ex_info invoke "core.clj" 4591]
  [user$eval9314 invoke "form-init6701752258113826186.clj" 1]
  ;; ...

Approach: If ExceptionInfo is caught, also print :data key with ex-data. Include :data key for each ExceptionInfo in via.


#error {
 :cause "msg"
 :data {:my-data 42}
 [{:type clojure.lang.ExceptionInfo
   :message "msg"
   :at [clojure.core$ex_info invoke "core.clj" 4591]}]
 [[clojure.core$ex_info invoke "core.clj" 4591]
  [user$eval7 invoke "NO_SOURCE_FILE" 4]
  ;; elided

Example with nested ExceptionInfo:

  (throw (ex-info "cause" {:a :b})) 
  (catch Exception e 
    (throw (ex-info "wrapped" {:c :d} e))))

;; yields: ???

Comment by Gary Fredericks [ 26/Apr/15 9:30 PM ]

Added two patch files, the first with tests for the existing code, the second adding the :data key and extra tests for that.

Comment by Alex Miller [ 28/Apr/15 9:55 AM ]

Screening comments:

  • Combine the two patches into a single patch
  • Include :data for all ExceptionInfo in :via (when appropriate) (and leave it where it is)
  • when you git format-patch, throw -W on there to get more context (I think it would help in this case)
  • everything else looks good

Generated at Tue Apr 28 15:07:26 CDT 2015 using JIRA 4.4#649-r158309.