<< Back to previous view

[CLJ-1684] Regression in TransformerIterator Created: 27/Mar/15  Updated: 27/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: regression, transducers

Attachments: Text File xf.patch    
Patch: Code


Error in build:

[java] ERROR in (seq-and-transducer) (TransformerIterator.java:86)
         [java] Uncaught exception, not in assertion.
         [java] expected: nil
         [java]   actual: java.lang.NullPointerException: null
         [java]  at clojure.lang.TransformerIterator.step (TransformerIterator.java:86)
         [java]     clojure.lang.TransformerIterator.hasNext (TransformerIterator.java:97)
         [java]     clojure.lang.RT.chunkIteratorSeq (RT.java:489)
         [java]     clojure.lang.RT.seqFrom (RT.java:518)
         [java]     clojure.lang.RT.seq (RT.java:509)
         [java]     clojure.core/seq (core.clj:135)
         [java]     clojure.core$print_sequential.invoke (core_print.clj:47)
         [java]     clojure.core/fn (core.clj:7362)
         [java]     clojure.lang.MultiFn.invoke (MultiFn.java:233)
         [java]     clojure.core$pr_on.invoke (core.clj:3549)
         [java]     clojure.core$pr.invoke (core.clj:3561)
         [java]     clojure.pprint$pprint_simple_default.invoke (dispatch.clj:144)

This error is due to an invalid transformation being passed to eduction. This isn't caught on construction so it shows up later at pprint time. You can reproduce with the bogus:

(pprint (eduction (constantly nil) nil))

But this is the printing of another error, from case like this:
(eduction (take 1) []) which seems to be throwing an exception only for eduction in the test. Still look at that.

[CLJ-1515] range should return something that implements IReduce Created: 29/Aug/14  Updated: 27/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Timothy Baldridge Assignee: Fogus
Resolution: Unresolved Votes: 2
Labels: None

Attachments: Text File clj-1515-10.patch     Text File clj-1515-11.patch     Text File clj-1515-12.patch     Text File clj-1515-13.patch     Text File clj-1515-2.patch     Text File clj-1515-3.patch     Text File clj-1515-4.patch     Text File clj-1515-5.patch     Text File clj-1515-6.patch     Text File clj-1515-7.patch     Text File clj-1515-8.patch     Text File clj-1515-9.patch     Text File CLJ-1515-deftype2.patch     Text File CLJ-1515-deftype3.patch     Text File CLJ-1515-deftype-nostructural-dup.patch     Text File CLJ-1515-deftype.patch     Text File clj-1515.patch     File patch.diff     File range-patch3.diff     File reified-range4.diff    
Patch: Code and Test
Approval: Screened


range should implement IReduce for fast reduction.

Patch: clj-1515-12.patch - Java range with long optimizations

Approach: The clj-1515-12 patch revives the unused clojure.lang.Range class. This class is similar in implementation to ChunkedCons. It differs by lazily constructing the first chunk (this is done in the existing impl by using a LazySeq wrapper) and by caching the next reference like other seq impls. The latter is done because range is frequently used in both chunked and unchunked traversal.

Additionally, 1515-12 contains an optimized version of range (LongRange) for the extremely common case where start, end, and step are all longs. This version uses primitive longs and primitive math and a customized long[] version of ArrayChunk for greater performance.

The special case of (range) is just handled with (iterate inc' 0) (which is further optimized for reduce in CLJ-1603).


  • CLJ-1515-deftype3.patch took the approach of using deftype to create a seqable, reducible entity that was not actually a seq. Based on work for CLJ-1603, it was established that due to historical requirements, this is not viable and the entity returned for range should implement the ISeq and Collection interfaces directly.
  • clj-1515-11.patch uses the approach of a "split" implementation - a LazySeq that has a fast reduce path. This is a minimal patch that provides no perf difference for sequence usage but a moderate improvement for reduce paths. One important difference vs clj-1515-12 is that the fast reduce path is only obtained on the initial range. Once you walk off the head, the performance will be the same as prior (seq perf).


criterium quick-bench with java 1.8.0-b132

code 1.7.0-master Java clj-1515-12 (w/CLJ-1603) split clj-1515-11
(count (range (* 1024 1024))) 62 ms 0 ms 58 ms
(reduce + (map inc (range (* 1024 1024)))) 49 ms 38 ms 50 ms
(reduce + (map inc (map inc (range (* 1024 1024))))) 66 ms 60 ms 67 ms
(count (keep odd? (range (* 1024 1024)))) 76 ms 49 ms 67 ms
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 46 ms 29 ms 34 ms
(reduce + 0 (range (* 2048 1024))) 64 ms 26 ms 40 ms
(reduce + 0 (rest (range (* 2048 1024)))) 60 ms 17 ms 57 ms
(doall (range 0 31)) 1.20 µs 0.72 µs 1.15 µs
(doall (range 0 32)) 1.22 µs 0.82 µs 1.18 µs
(doall (range 0 4096)) 147 µs 105 µs 145 µs
(into [] (map inc (range 31))) 1.80 µs 1.32 µs 1.88 µs
(into [] (map inc) (range 31)) 1.55 µs 0.72 µs 1.02 µs
(into [] (range 128)) 4.90 µs 2.14 µs 3.27 µs
(doall (range 1/2 1000 1/3)) 1.29 ms 1.24 ms 1.28 µs
(doall (range 0.5 1000 0.33)) 136 µs 151 µs 151 µs
(into [] (range 1/2 1000 1/3)) 1.31 ms 1.18 ms 1.26 ms
(into [] (range 0.5 1000 0.33)) 151 µs 91 µs 130 µs
(count (filter odd? (take (* 1024 1024) (range)))) 194 ms 163 ms 176 ms
(transduce (take (* 1024 1024)) + (range)) 67 ms 34 ms 68 ms

Performance notes:

  • 1515-12 (Java) - significant improvements in virtually all use cases, both seq and reduce. The final two cases with (range) leverage optimizations from CLJ-1603, which was included for those timings.
  • 1515-11 (split) - a much smaller patch that gives no seq benefits and smaller reduction benefits. Only the head of the range will receive the perf benefits (see (reduce + 0 (rest (range (* 2048 1024))))).

(range) and supports auto-promotion towards infinity in this patch, which seems to be implied by the doc string but was not actually implemented or tested correctly afaict.

Comment by Alex Miller [ 29/Aug/14 3:19 PM ]

1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?

Comment by Timothy Baldridge [ 29/Aug/14 3:40 PM ]

1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Comment by Alex Miller [ 29/Aug/14 3:49 PM ]

6) might be useful if you're walking through it with nth, which hits count everytime, but doubt that's common
7) yep, reasonable

Comment by Andy Fingerhut [ 18/Sep/14 6:52 AM ]

I have already pointed out to Edipo in personal email the guidelines on what labels to use for Clojure JIRA tickets here: http://dev.clojure.org/display/community/Creating+Tickets

Comment by Timothy Baldridge [ 19/Sep/14 10:02 AM ]

New patch with IReduce directly on Range instead of relying on iterators

Comment by Alex Miller [ 01/Oct/14 2:00 PM ]

The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is.

Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.

Comment by Timothy Baldridge [ 03/Oct/14 10:00 AM ]

Looks like chunked seqs do make lazy seq code about 5x faster in these tests.

Comment by Ghadi Shayban [ 03/Oct/14 10:22 AM ]

I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.

Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

Comment by Timothy Baldridge [ 03/Oct/14 10:38 AM ]

I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

Comment by Alex Miller [ 03/Oct/14 10:46 AM ]

I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.

On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Comment by Ghadi Shayban [ 03/Oct/14 11:00 AM ]

Tim, I'm not suggesting unboxed math, but the singular fast-path of all-Longs that you and Alex describe. I mistakenly lower-cased Long/Float.

Comment by Timothy Baldridge [ 31/Oct/14 11:30 AM ]

Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.

Comment by Nicola Mometto [ 14/Nov/14 12:51 PM ]

As discussed with Tim in #clojure, the current patch should not change ArrayChunk's reduce impl, that's an error.

Comment by Alex Miller [ 09/Dec/14 2:40 AM ]

Still a work in progress...

Comment by Nicola Mometto [ 09/Dec/14 8:44 AM ]

Alex, while this is still a work in progress, I see that the change on ArrayChunk#reduce from previous WIP patches not only has not been reverted but has been extended. I don't think the current approach makes sense as ArrayChunk#reduce is not part of the IReduce/IReduceInit contract but of the IChunk contract and changing the behaviour to be IReduce-like in its handling of reduced introduces the burden of having to use preserve-reduced on the reducing function to no apparent benefit.

Given that the preserve-reduced is done on the clojure side, it seems to me like directly invoking .reduce rather than routing through internal-reduce should be broken but I haven't tested it.

Comment by Alex Miller [ 09/Dec/14 9:49 AM ]

That's the work in progress part - I haven't looked at yet. I have not extended or done any work re ArrayChunk, just carried through what was on the prior patch. I'll be working on it again tomorrow.

Comment by Ghadi Shayban [ 10/Dec/14 11:14 PM ]

I am impressed and have learned a ton through this exercise.

quick review of clj-1515-2
1) withMeta gives the newly formed object the wrong meta.
2) LongRange/create() is the new 0-arity constructor for range, which sets the 'end' to Double/POSITIVE_INFINITY cast as a long. Current core uses Double/POSITIVE_INFINITY directly. Not sure how many programs rely upon iterating that far, or how they would break.
3) Relatedly, depending on the previous point: Because only all-long arguments receive chunking, the very common case of (range) with no args would be unchunked. Doesn't seem like too much of a stretch to add chunking to the other impl.
4) Though the commented invariants say that Range is never empty, the implementation uses a magic value of _count == 0 to mean not cached, which is surprising to me. hashcodes have the magic value of -1
5) s/instanceof Reduced/RT.isReduced
6) is the overflow behavior of "int count()" correct?

Comment by Alex Miller [ 11/Dec/14 12:06 AM ]

1) agreed!
2) Good point. I am definitely changing behavior on this (max of 9223372036854775807). I will look at whether this can be handled without affecting perf. Really, handling an infinite end point is not compatible with several things in LongRange.
3) I actually did implement chunking for the general Range and found it was slower (the original Clojure chunking is faster). LongRange is making up for that difference with improved primitive numerics.
4) Since empty is invalid, 0 and -1 are equally invalid. But I agree -1 conveys the intent better.
5) agreed
6) probably not. ties into 2/3.

Thanks for this, will address.

Comment by Alex Miller [ 11/Dec/14 12:11 AM ]

Added -4 patch that addresses 1,4,5 but not the (range) stuff.

Comment by Alex Miller [ 11/Dec/14 12:51 PM ]

Latest -7 patch addresses all feedback and perf #s updated.

Comment by Ghadi Shayban [ 22/Dec/14 10:59 PM ]

See CLJ-1572, I believe CollReduce needs to be extended directly to clojure.lang.{LongRange,Range} inside protocols.clj

Comment by Ghadi Shayban [ 29/Dec/14 12:29 PM ]

Seems like a missing benchmark is (range 0 31) without the doall. Current patch (-9) allocates the chunk buffer at range's construction time. Maybe this can be delayed?

Comment by Alex Miller [ 02/Jan/15 11:24 AM ]

Ghadi - you are right. I reworked the patch (new -10 version) so that the chunk is only created on demand. Basically, the chunking is only used when traversing via chunked seqs and in normal seq iteration or reduce, no explicit chunks are created. That improved several timings. I also added the bench for (range 0 31) which was greatly improved by lazily creating the chunk, so good point on that.

Comment by Michael Blume [ 03/Jan/15 1:38 PM ]

I'm looking at the implementation of equiv() and wondering if it's worth checking whether the other object is also a reified range and comparing the private parameters rather than iterating through the sequences.

Comment by Ghadi Shayban [ 18/Jan/15 7:26 PM ]

Attaching a simplified implementation as a deftype. I benchmarked a billion variants and dumped bytecode. I'm attaching the best performing combination, and it beats out the Java one in most cases.

Two deftypes, one specialized to primitive longs, the other generic math.
Implement: Seqable/Counted,IReduce,IReduceInit,Iterable
Also, marker interfaces Sequential and Serialized.

The implementations are very straightforward, but Seqable deserves mention.
'seq' delegates its implementation to the existing lazy-seq based range, which has been stripped of the strange corner cases and now only handles strictly ascending/descending ranges. It has been renamed range*, rather than range1 because all the arguments to range are already numbers. core references to range have been accordingly renamed.

The new range constructor is loaded after reduce & protocols load. It checks types, and delegates to either long-range or generic-range, which handle the wacky argument cases that are not strictly ascending or descending. I'm annoyed with the structural duplication of those conditionals, not sure how to solve it.

Inside the LongRange implementation of IReduce/IReduceInit, boxing is carefully controlled, and was verified through bytecode dumping.

I elaborated the benchmarks for comparison, and also included benchmarking without type specialization.

You discover strange things working on this stuff. Turns out having the comparator close over the 'end' field is beneficial: #(< % end).

Alex, I figured out why the Java versions had a nearly exactly 2x regression on (doall (range 0 31)), and I attached a change to 'dorun'. Instead of calling (seq coll) then (next coll), it effectively calls (next (seq coll)). I think the implementation assumes that calling 'seq' is evaluating the thunk inside LazySeq. This should also help other Seqable impls like Cycle/Iterate/Repeat CLJ-1603.

Results (criterium full runs):

code 1.7.0-alpha5 clj-1515-10.patch (Java) deftype specialized (attached) deftype unspecialized
(count (filter odd? (take (* 1024 1024) (range)))) 187.30 ms 194.92 ms 182.53 ms 184.13 ms
(transduce (take (* 1024 1024)) + (range)) 58.27 ms 89.37 ms 84.69 ms 84.72 ms
(count (range (* 1024 1024))) 62.34 ms 27.11 ns not run not run
(reduce + (map inc (range (* 1024 1024)))) 57.63 ms 46.14 ms 46.17 ms 52.94 ms
(reduce + (map inc (map inc (range (* 1024 1024))))) 82.83 ms 68.66 ms 64.36 ms 71.76 ms
(count (keep odd? (range (* 1024 1024)))) 76.09 ms 57.59 ms not run not run
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 52.17 ms 39.71 ms 28.83 ms 42.23 ms
(reduce + 0 (range (* 2048 1024))) 85.93 ms 38.03 ms 26.49 ms 42.43 ms
(doall (range 0 31)) 1.33 µs 2.89 µs 1.03 µs 1.08 µs
(doall (range 0 32)) 1.35 µs 2.97 µs 1.07 µs 1.10 µs
(doall (range 0 128)) 5.27 µs 11.93 µs 4.19 µs 4.29 µs
(doall (range 0 512)) 21.66 µs 47.33 µs 16.95 µs 17.66 µs
(doall (range 0 4096)) 171.30 µs 378.52 µs 135.45 µs 140.27 µs
(into [] (map inc (range 31))) 1.97 µs 1.57 µs 1.87 µs 1.95 µs
(into [] (map inc) (range 31)) 1.66 µs 824.67 ns 891.90 ns 1.11 µs
(into [] (range 128)) 5.11 µs 2.21 µs 2.43 µs 3.36 µs
(doall (range 1/2 1000 1/3)) 1.53 ms 1.67 ms 1.59 ms 1.52 ms
(doall (range 0.5 1000 0.33)) 164.83 µs 382.38 µs 149.38 µs 141.21 µs
(into [] (range 1/2 1000 1/3)) 1.53 ms 1.40 ms 1.43 ms 1.50 ms
(into [] (range 0.5 1000 0.33)) 157.44 µs 108.27 µs 104.18 µs 127.20 µs

Open Questions for screeners of the deftype patch:

1) What to do about the autopromotion at the Long/MAX_VALUE boundary? I've preserved the current behavior of Clojure 1.6.
2) Alex, I did not pull forward the filter chunk tweak you discovered
3) Is the structural duplication of the conditionals in generic-range long-range awful?
4) Are there any other missing interfaces? IMeta comes to mind. Not sure about IHashEq either.

Comment by Ghadi Shayban [ 18/Jan/15 7:30 PM ]

note with the deftype patch, the transduce over infinite (range) case is the only one where 'master' currently performs best. This is because the implementation was changed to (iterate inc' 0). When CLJ-1603 is applied that situation should improve better.

Comment by Ghadi Shayban [ 18/Jan/15 8:27 PM ]

fixup patch CLJ-1515-deftype

Comment by Ghadi Shayban [ 18/Jan/15 11:05 PM ]

new patch CLJ-1515-deftype-nostructural-dup.patch with the silly conditional duplication removed. Updated benchmarks including 'count' impls. These benchmarks run on a different machine with the same hardness. Blue ribbon.

code 1.7.0-alpha5 1.7.0-rangejava 1.7.0-rangespecial rangespecial / alpah5
(count (filter odd? (take (* 1024 1024) (range)))) 297.14 ms 333.93 ms 328.62 ms 1.10
(transduce (take (* 1024 1024)) + (range)) 105.55 ms 145.44 ms 164.70 ms 1.56
(count (range (* 1024 1024))) 108.92 ms 61.09 ns 26.61 ns 0
(reduce + (map inc (range (* 1024 1024)))) 97.67 ms 95.41 ms 84.62 ms 0.86
(reduce + (map inc (map inc (range (* 1024 1024))))) 140.21 ms 135.59 ms 116.38 ms 0.83
(count (keep odd? (range (* 1024 1024)))) 121.18 ms 104.63 ms 111.46 ms 0.92
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 100.40 ms 86.28 ms 67.17 ms 0.67
(reduce + 0 (range (* 2048 1024))) 131.77 ms 80.43 ms 63.24 ms 0.48
(doall (range 0 31)) 2.53 µs 4.36 µs 2.24 µs 0.89
(doall (range 0 32)) 2.37 µs 4.00 µs 1.99 µs 0.84
(doall (range 0 128)) 9.20 µs 14.98 µs 8.01 µs 0.87
(doall (range 0 512)) 37.28 µs 59.13 µs 35.16 µs 0.94
(doall (range 0 4096)) 331.28 µs 471.57 µs 291.76 µs 0.88
(into [] (map inc (range 31))) 2.83 µs 2.79 µs 2.67 µs 0.94
(into [] (map inc) (range 31)) 2.21 µs 1.39 µs 1.26 µs 0.57
(into [] (range 128)) 6.72 µs 3.25 µs 3.09 µs 0.46
(doall (range 1/2 1000 1/3)) 3.41 ms 4.04 ms 3.14 ms 0.92
(doall (range 0.5 1000 0.33)) 281.04 µs 530.92 µs 244.14 µs 0.87
(into [] (range 1/2 1000 1/3)) 3.32 ms 3.71 ms 2.99 ms 0.90
(into [] (range 0.5 1000 0.33)) 215.53 µs 165.93 µs 138.86 µs 0.64
Comment by Alex Miller [ 19/Jan/15 8:32 AM ]

This is looking very good and I think we should move forward with it as the preferred approach. Feel free to update the description appropriately. I'll file a separate ticket with the filter tweak. Some comments on the patch:

1) GenericRange/count - this math is broken if you start to mix infinity in there. I think just (count (seq this)) is safer.
2) GenericRange/iterable - I think if we are IReduceInit and Seqable, we can omit this. I had it in the Java one because I inherited Iterable via ASeq but that's not an issue in this impl.
3) LongRange/reduce - why Long/valueOf? Isn't start a long field?
4) LongRange/iterable - ditto #2
5) print-method - anytime I see print-method, there should probably be print-dup too. For example, this is broken: (binding [*print-dup* true] (println (range 0 10)))
6) Is serialization broken by this patch? Can you justify the test changes?

Comment by Ghadi Shayban [ 19/Jan/15 12:51 PM ]

regarding Serialization tests currently:

(defn roundtrip
  (let [rt (-> v serialize deserialize)
        rt-seq (-> v seq serialize deserialize)]        ;; this
    (and (= v rt)            ;; this fails because the test ^ calls seq first
      (= (seq v) (seq rt))   ;; this passes
      (= (seq v) rt-seq))))  ;; this passes

These new types are merely seqable, so (not= (LongRange. 0 10 1) (seq (LongRange. 0 10 1)))

Not sure how to handle this 100%. Nothing precludes the LongRange itself from roundtripping, but just that calling seq on it returns a separate object.

Comment by Alex Miller [ 23/Jan/15 9:57 AM ]

More comments...

1) Instead of extending IReduce and IReduceInit, just extend IReduce and implement both arities (IReduce extends IReduceInit).
2) I'm slightly troubled by the .invokePrim now. Did you look at a macro that does prim type-hinting maybe?
3) On the serialization thing, is the problem really with serialization or with equality? The test that's failing is (= v rt). Because these are not IPersistentCollections, pcequiv won't be used and it's just calling LongRange.equals(), which is not implemented and falls back to identity, right? Probably need to implement the hashCode and equals stuff. Might need IHashEq too.

user=> (= (range 5) (range 5))
Comment by Ghadi Shayban [ 23/Jan/15 12:13 PM ]

Took care of 1) and 3). Punting on hiding the invokePrim behind a macro. It may be shameful but it works really fast.

Another case found:
(assoc {'(0 1 2 3 4) :foo} (range 5) :bar) needs to properly overwrite keys in the map.

Cause: Might be irrelevant in the face of CLJ-1375. Util/equiv for maps doesn't use IPC/equiv unless the collection is also java.util.Collection or Map. If it is then it checks for IPersistentCollection. I added a check for IPersistentCollection first.


To get hasheq working, I added back the iterators for use by Murmur3/hashOrdered.

Comment by Ghadi Shayban [ 23/Jan/15 3:20 PM ]

Handle hash and equality not through IPersistentCollection. w/ test-cases too.

Comment by Alex Miller [ 20/Feb/15 11:39 AM ]

Ghadi, we need to have range implement IObj too so with-meta works.

Comment by Ghadi Shayban [ 20/Feb/15 12:09 PM ]

Will add pronto but maybe someone can clarify something: Adding a simple clojure.lang.IObj/withMeta to the deftype results in an AbstractMethodError when trying to print a range. This is because the impl of vary-meta used in the default printer assumes that if all IObjs are also IMetas. Seems like a problem with either vary-meta's impl or the interface separation.

I can fix by:
1) adding a _meta field to Ranges and handling IMeta as well as IObj.

;; vary-meta:
(with-meta obj (apply f (meta obj) args)))

;; default printer
(defmethod print-method :default [o, ^Writer w]
  (if (instance? clojure.lang.IObj o)
    (print-method (vary-meta o #(dissoc % :type)) w)
    (print-simple o w)))
Comment by Ghadi Shayban [ 20/Feb/15 12:25 PM ]

Ugh never mind that last bit I fell out of the hammock prematurely. IMeta << IObj.

Alex, CLJ-1603 needs IMeta/meta too.

Comment by Alex Miller [ 24/Feb/15 11:30 AM ]

current direction is pending results of where CLJ-1603 goes

Comment by Fogus [ 27/Mar/15 1:57 PM ]

Screened. Though the amount of duplicated code saddens me, I don't hold it against the implementer. It's a fairly straight-forward counting Impl with a chunk cache.

Comment by Alex Miller [ 27/Mar/15 3:44 PM ]

New patch -13 removes the array allocation and performs better, still need to update timings and consider the non-long case.

[CLJ-1424] Reader conditionals Created: 15/May/14  Updated: 27/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Ghadi Shayban Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: reader

Attachments: File clj-1424-10.diff     Text File clj-1424-11.patch     Text File clj-1424-12.patch     File CLJ-1424-2.diff     File clj-1424-3.diff     File clj-1424-4.diff     File clj-1424-5.diff     File clj-1424-6.diff     File clj-1424-7.diff     File clj-1424-8.diff     File clj-1424-9.diff     File clojure-feature-expressions.diff    
Patch: Code and Test
Approval: Ok


See http://dev.clojure.org/display/design/Reader+Conditionals for design.

The implementation details contain several major parts:

1) LispReader changes to implement new conditional read syntax and features.

In RT:

  • suppress-read - new dynamic var
  • readString() - new arity that takes opts

In LispReader:

  • read(...) entry point - added new variants that take opts. The opts-only form will decode :eof into the correct values for eofIsError and eofValue. Internal forms take both opts and pendingForms (list).
  • All reader invoke dispatch methods now take both opts and pendingForms which get drug around the call stack.
  • read(...) entry point will ensure installation of {:features #{:clj}}, appropriately merging others or none. There is no path that will pass :features inside Clojure but users can call LispReader with it directly.
  • readDelimitedList() was refactored so it can share logic with readCondDelimited().
  • Luke can provide more detail on the guts of reader conditional reading if needed.

New ReaderConditional and TaggedLiteral types.

In clojure.core:

  • tagged-literal?, tagged-literal, reader-conditional?, reader-conditional - new functions to support the preserved data form of these
  • new print-method impls for TaggedLiteral and ReaderConditional
  • read - added new arity that takes opts and explain opts in docstring
  • read-string - added new arity that takes opts (placement matches edn/read-string and is designed for partial application)

2) cljc file changes.
In Compiler:

  • OPTS_COND_ALLOWED - keep around a map {:read-cond :allow} to invoke reader when reading .cljc files.
  • readerOpts() - new private function to determine reader options to use based on file name (.clj vs .cljc)
  • load() - use readerOpts() and invoke LispReader with them
  • compile() - use readerOpts() and invoke LispReader with them

In RT:

  • load(...) - check first for .clj file, then for .cljc file when looking by ns

In clojure.main - update code to allow for .clj and .cljc files
In clojure.repl - update code to allow for .clj and .cljc files

3) Tests + infrastructure updates.

  • Move test/clojure/test_clojure/reader.clj to test/clojure/test_clojure/reader.cljc so that we can test reader conditionals.
  • Bump version of test.generative to get newer version that depends on newer tools.namespace which is able to find .cljc files as source files. The test script at src/script/run_test.clj uses this to find Clojure test files and without the bump, it will not find the new reader.cljc test ns. Note: if you use the ant build, you need to re-run antsetup.sh for this to take effect.
  • src/script/run_test.clj - as part of the version bump, move from deprecated namespace to new namespace
  • Added lots of tests in reader.cljc

Patch: clj-1424-12.patch

Screened by:

Related: CLJS-27, TRDR-14

Comment by Jozef Wagner [ 16/May/14 2:19 AM ]

Has there been a decision that CL syntax is going to be used? Related discussion can be found at design page, google groups discussion and another discussion.

Comment by Alex Miller [ 16/May/14 8:34 AM ]

No, no decisions on anything yet.

Comment by Ghadi Shayban [ 19/May/14 7:25 PM ]

Just to echo a comment from TRDR-14:

This is WIP and just one approach for feature expressions. There seem to be at least two couple diverging approaches emerging from the various discussion (Brandon Bloom's idea of read-time splicing being the other.)

In any case having all Clojure platforms be ready for the change is probably essential. Also backwards compatibility of feature expr code to Clojure 1.6 and below is also not trivial.

Comment by Kevin Downey [ 04/Aug/14 1:39 PM ]

if you have ever tried to do tooling for a language where the "parser" tossed out information or did some partial evaluation, it is a pain. this is basically what the #+cljs style feature expressions and bbloom's read time splicing both do with clojure's reader. I think resolving this at read time instead of having the compiler do it before macro expansion is a huge mistake and makes the reader much less useful for reading code.

Comment by Ghadi Shayban [ 04/Aug/14 2:00 PM ]

Kevin, what kind of tooling use case are you alluding to?

Comment by Kevin Downey [ 04/Aug/14 3:24 PM ]

any use case that involves reading code and not immediately handing it off to the compiler. if I wanted to write a little snippet to read in a function, add an unused argument to every arity then pprint it back, reader resolved feature expressions would not round trip.

if I want to write snippet of code to generate all the methods for a deftype (not a macro, just at the repl write a `for` expression) I can generate a clojure data structure, call pprint on it, then paste it in as code, reader feature expressions don't have a representation as data so I cannot do that, I would have to generate strings directly.

Comment by Alex Miller [ 22/Aug/14 9:10 AM ]

Changing Patch setting so this is not in Screenable yet (as it's still a wip).

Comment by Alex Miller [ 07/Nov/14 4:39 PM ]

Latest patch brings up to par with related patches in CLJS-27 and TRDR-14 and importantly adds support for loading .cljc files as Clojure files.

Comment by Andy Fingerhut [ 07/Nov/14 5:55 PM ]

Maybe undesirable behavior demonstrated below with latest Clojure master plus patch clj-1424-3.diff, due to the #+cljs skipping the comment, but not the (dec a). I thought it could be fixed simply by moving RT.suppressRead() check after (ret == r) check in read(), but that isn't correct.

user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs (dec a))")
(defn foo [a] (inc a))
user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs ; foo\n (dec a))")
(defn foo [a] (inc a) (dec a))
Comment by Alex Miller [ 21/Jan/15 4:28 PM ]

Added new clj-1424-4.diff which makes a couple of modifications:

  • removed support for and/or/not (#+ and #- remain)
  • *features* has been removed
  • if you wish to have a custom feature set while reading, there is a new option map that can be passed to read (this all parallels similar changes previously made to the edn reader)

Example of adding a "custom" feature to the feature set (which will always contain "clj" feature):

  {:features #{:custom}} 
  (java.io.PushbackReader. (java.io.StringReader. "[#+custom :x]")))
Comment by Andy Fingerhut [ 21/Jan/15 5:01 PM ]

Latest patch clj-1424-4.diff also exhibits maybe-undesirable behavior in which #+cljs can suppress an immediately following comment, rather than the form following it. See 07/Nov/14 comment with example above.

Comment by Alex Miller [ 21/Jan/15 6:16 PM ]

Thanks Andy, I'm aware. Haven't looked at it yet.

Comment by Luke VanderHart [ 25/Jan/15 9:26 PM ]

Patch clj-1424-5.diff modifies the code to use "read-conditionals", as outlined by Rich at: https://groups.google.com/d/msg/clojure-dev/LW0ocQ1RcYI/IBPPyfCpM3kJ

Comment by Alex Miller [ 26/Jan/15 12:33 PM ]

Some feedback:

1) Because pendingForms is an internal thing, I would make the read() that takes it non-public.
2) In readDelimitedList, I don't see the point of constructing a new LinkedList then checking if it's empty there. Should just make the add conditional on whether it's null or not.
3) You could treat pendingForms as a Deque (which LinkedList implements) and then use pop() instead of remove(0). The addAll(0, ...) is more painful to replicate though if you're sticking to Deque. I think I'd be tempted to just commit explicitly to LinkedList for pendingForms since we fully control the construction and use of it within the reader.
4) Might be nice to update the commented-out readers to support pendingForms as I did with opts. Or remove the updates for opts. Should either do all the mods or none on the commented-out code.
5) s/read-cond-splicing/read-cond-splice/ ? Seems like where it's used it should be a verb.
6) Should just use :default and make :else and :none throw exceptions. I think Rich mentioned :except or :exception too? or maybe I misheard that.
7) Should have some more tests to tweak the error cases - bad feature, uneven forms, default out of allowed position, bad contents for splice, etc.

Comment by Alex Miller [ 26/Jan/15 2:01 PM ]

From Chouser on the mailing list: "is it intentional that reading (clojure.core/read-cond ...) does not behave the same as (#? ...)? That is, (#? ...) can be read as c.c/read-cond depending on read options, but having been read, if it is printed again it doesn't round-trip back to #?. This is different, for example, from how #(...) is read as (fn* [] (...)), which then retains its meaning."

In shouldReadConditionally(), it looks like the == check vs READ_COND will not work. Instead of:
return (first == READ_COND || first == READ_COND_SPLICING);
return (READ_COND.equals(first) || READ_COND_SPLICING.equals(first));

For example, this test doesn't seem to give the right answer:

user=> (read-str-opts {:preserve-read-cond false} "(clojure.core/read-cond :clj :x :default :y)")
(clojure.core/read-cond :clj :x :default :y)    ;; should be :x
Comment by Michael Blume [ 26/Jan/15 3:27 PM ]

With this patch applied to master, lein check fails on instaparse:

Compiling namespace instaparse.abnf
Exception in thread "main" clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader, compiling:(abnf.clj:186:28)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3605)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3599)
	at clojure.lang.Compiler$DefExpr.eval(Compiler.java:436)
	at clojure.lang.Compiler.eval(Compiler.java:6772)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.RT.loadResourceScript(RT.java:384)
	at clojure.lang.RT.loadResourceScript(RT.java:375)
	at clojure.lang.RT.load(RT.java:459)
	at clojure.lang.RT.load(RT.java:425)
	at clojure.core$load$fn__5424.invoke(core.clj:5850)
	at clojure.core$load.doInvoke(core.clj:5849)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at user$eval52$fn__63.invoke(form-init5310597017138984927.clj:1)
	at user$eval52.invoke(form-init5310597017138984927.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6767)
	at clojure.lang.Compiler.eval(Compiler.java:6757)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.Compiler.loadFile(Compiler.java:7150)
	at clojure.main$load_script.invoke(main.clj:275)
	at clojure.main$init_opt.invoke(main.clj:280)
	at clojure.main$initialize.invoke(main.clj:308)
	at clojure.main$null_opt.invoke(main.clj:343)
	at clojure.main$main.doInvoke(main.clj:421)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader
	at clojure.lang.AFn.throwArity(AFn.java:429)
	at clojure.lang.AFn.invoke(AFn.java:36)
	at instaparse.cfg$eval800$safe_read_string__801.invoke(cfg.clj:163)
	at instaparse.cfg$process_string.invoke(cfg.clj:180)
	at instaparse.cfg$build_rule.invoke(cfg.clj:217)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:214)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:207)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30)
	at clojure.core.protocols$fn__6436.invoke(protocols.clj:59)
	at clojure.core.protocols$fn__6389$G__6384__6402.invoke(protocols.clj:13)
	at clojure.core$reduce.invoke(core.clj:6501)
	at clojure.core$into.invoke(core.clj:6582)
	at instaparse.cfg$ebnf.invoke(cfg.clj:277)
	at clojure.lang.AFn.applyToHelper(AFn.java:154)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3600)
	... 27 more
Comment by Michael Blume [ 26/Jan/15 3:29 PM ]

Aha, of course, Instaparse is calling into the LispReader$StringReader directly.

Is it worth providing versions of these methods with the old arities? Or should instaparse just not be using Clojure internals this way?

Comment by Michael Blume [ 26/Jan/15 3:33 PM ]


Comment by Alex Miller [ 26/Jan/15 3:33 PM ]

Instaparse is reaching pretty deep inside implementation details here, so I'd say this should expect to break. We could back-fill the old arities here but I'd really prefer not to if possible.

Comment by Luke VanderHart [ 27/Jan/15 11:23 AM ]

clj-1424-6.diff addresses all the issues mentioned above. Per a comment from Rich, it also adds tests to ensure that nested splices work properly (they do).

There were two things from your list I didn't do, Alex:

3) I kept pendingForms as a List. Because we aren't confining ourselves to a Deque interface, I don't see the benefit of calling pop() over remove(0) (with identical semantics) as justification for over-specifying the concrete type.

5) I kept "read-cond-splicing" since it parallels the form of "unquote-splicing". Seems that those should be consistent.

Comment by Luke VanderHart [ 30/Jan/15 9:03 PM ]

clj-1424-7.diff contains Rich's "reader-conditionals" proposal.

Comment by Alex Miller [ 06/Feb/15 3:45 PM ]

ReaderConditional / TaggedLiteral
1) when patch applied I see some whitespace errors in here, also line endings seem different, might want to check it
2) a common pattern in other Java classes is private constructor and public static create() method
3) could use Util.hash() to clean up the "null->0" logic in hashCode()

4) adds unused import: java.util.Iterator
5) it looks like returnOn flag could just be collapsed into checking if returnOnChar is non-null?
6) in readCondDelimited, EOF and FINISHED are never used and can be removed presumably

Comment by Luke VanderHart [ 07/Feb/15 1:49 PM ]

I have attached clj-1424-8.diff, which addresses your most recent comments, Alex. I formatted it using `git format patch` instead of `git diff` so it should have the email address added correctly.

Your comments are all addressed, with the exception of returnOn. I don't think that can be collapsed. You really need two values: one to say what character should cause a return, and one to say what value should be returned in that scenario. You could use a convention on the return value, I suppose, (e.g, null means a completed read) but there's already precedent for passing in the value to be returned (namely, eofValue).

Comment by Alex Miller [ 09/Feb/15 9:49 AM ]

Looks good. I think I actually mis-read what returnOn was doing, so np on that. I still see the whitespace issues and the CR/LF in those two files. Were you going to change those?

Comment by Alex Miller [ 09/Feb/15 4:24 PM ]

Added new -9 patch that squashes the last patch but is otherwise identical. The older patches in that diff were the source of still seeing whitespace errors on apply.

Comment by Rich Hickey [ 20/Feb/15 8:09 AM ]

I think in the first iteration we should allow reader conditionals only in .cljc files, and support only standard features :clj, :cljs and :clr.

Comment by Andy Fingerhut [ 27/Feb/15 10:35 AM ]

Comment on -10 patch:

The doc string additions for clojure.core/read are a bit cryptic for those not already steeped in the details. It would be good to at least mention 'reader conditionals' so people have something to Google for. Current additions:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow, or :preserve to keep all branches as data
+    :features - persistent set of feature keywords
+    :eof - on eof, return value unless :eofthrow, then throw

Alternate suggestion:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow to process reader conditionals, or :preserve to keep all branches as data.  Throw if reader conditional found and this key not present.
+    :features - persistent set of feature keywords for processing reader conditionals.
+    :eof - on eof, return value unless :eofthrow, then throw
Comment by Alex Miller [ 27/Feb/15 10:39 AM ]

Thanks Andy, will consider. Indeed, this will have lengthier docs on http://clojure.org/reader so I was trying to just hit the surface but that's a reasonable suggestion.

Comment by Fogus [ 06/Mar/15 11:10 AM ]

I've looked over the -10 patch and have some feedback.

  • clojure.core/read

I agree with Andy's recommendations about enhancing the docstring to mention reader conditionals.

  • clojure.core/read-string

I would also like to see a back-reference to the read function to refer to the available opts.

  • clojure.core/tagged-literal

A symbol is expected, so the docstring should probably mention that.

  • clojure.core/reader-conditional

I realize that the '?' implies that true/false is expected for the splicing? argument, but it might be worth saying so in the docstring.

  • clojure.core/print-method clojure.lang.TaggedLiteral
  • Question: The TaggedLiteral prints without a space, but the concrete realization will most likely print with a space (e.g. #uuid "..."). While this does not cause problems in round-tripping, should we be consistent, or was the space removed to distinguish between the TaggedLiteral and its concretion?
  • Compiler.readOpts

The #endsWith check is against the string "cljc" but elsewhere similar checks are against ".cljc". This is minor of course, but consistency is nice.

  • LispReader.read(rdr, opts)

Question: Do we really need to check for and cast to IPersistentMap? Would Map work just as well?

  • ConditionalReader.hasFeature() #1318

The code's explicitly checking for keyword-ness, but the error message thrown is abiguous. If a kw is expected then say so.

  • ConditionalReader.invoke() #1438

Can this error message be more clear? The term "read-cond form"" could just be "A reader condition's body" no? (or something like that)

  • ConditionalReader.readCondDelimited()
  • Question: If the conditional form given to read contains no features that can resolve (e.g. {{#?(:cljs [1 2 3])}} in a Clojure context) then ConditionalReader.readCondDelimited leaves the Reader instance in an empty state. I understand that the reason for this is that unpreserved conditional forms are effectively wiped from the stream leaving nothing left. This is effectively the same as running (read-string ""), but I wonder if we have enough context to report a more pointed error. Is it worth doing so?
Comment by Alex Miller [ 11/Mar/15 4:26 PM ]

Re fogus comments in new -11 patch:

  • clojure.core/read - updated docstring
  • clojure.core/read-string - clarified opts back-reference in docstring
  • clojure.core/tagged-literal - updated docstring
  • clojure.core/reader-conditional - updated docstring
  • clojure.core/print-method clojure.lang.TaggedLiteral - agreed, updated print form to match
  • Compiler.readerOpts - updated as suggested
  • LispReader.read(rdr, opts) - the IPersistentMap is intentional there and used via that interface, so no change
  • ConditionalReader.hasFeature() - improved error msg
  • ConditionalReader.invoke() - slightly altered error msg but I'm on the fence on this one. "form" is pretty precise there.
  • ConditionalReader.readCondDelimited() - this is in my opinion correct and expected as is
Comment by Fogus [ 20/Mar/15 12:33 PM ]

I'm happy with the latest changes.

Comment by Alex Miller [ 27/Mar/15 2:22 PM ]

-12 patch is same, just applies cleanly to current master

[CLJ-1602] vals and keys return values should implement IReduceInit or Iterable Created: 25/Nov/14  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1602-2.diff     File clj-1602-3.diff     File clj-1602-4.diff     File clj-1602-5.diff     File clj-1602-6.diff     File clj-1602.diff    
Patch: Code and Test
Approval: Ok


Background: clojure.core/keys calls RT.keys(Object) calls APersistentMap.KeySeq.create(ISeq). RT.keys() creates a sequence (of Map.Entry objects) and KeySeq just wraps it, calling .keyValue(). There is an equivalent vals -> RT.vals() -> ValSeq path. Both of these seq impls extend ASeq and provide Iterable implementations via SeqIterator (iterator wrapped over the seq).

Approach: The important thing here is to avoid creating the sequence and instead directly iterate/reduce over the map. Noting that CLJ-1499 provides support for making PHM directly Iterable and that KeySeq/ValSeq already implement Iterable, I chose to focus on making the instances returned from keys and vals support Iterable directly on the underlying map instead of via the seq.

RT.keys()/vals() created the seq and passed it to KeySeq/ValSeq which made it too late to directly cover the original map iterator. There are a few places that rely on passing a seq of Map.Entry to keys/vals (not just a map instance), so I check for IPersistentMap and in that case pass it directly to a new KeySeq factory method that remembers both the Iterable and the ISeq. There is also support in here for using a direct key/value iterator via IMapIterable as introduced in CLJ-1499.


  • Could potentially check for Map or Iterable instead of IPersistentMap in RT.keys()/vals(). Not sure how common it is to pass normal Java maps to keys/vals.
  • The direct Iterable support vanishes once you move off the head of the keys or vals seq. So (rest (keys map)) does not have Iterable support. This is not really possible unless you hold an Iterator and advance it along with the seq, but that seemed to introduce all sorts of possibilities for badness. Since maps are unordered, it seems weird to rely on any ordering or processing only parts of any map, so I suspect doing this would be quite rare.
  • This patch depends on CLJ-1499 for IMapIterable.

Performance: I tested perf using criterium to benchmark as follows:

(use 'criterium.core)
(def m (zipmap (range 1000) (range 1000)))
(bench (reduce + (keys m)))
(bench (reduce + (vals m)))
(bench expr) 1.6.0 1.7.0-alpha5 alpha5 + clj-1499 + patch
(reduce + (keys m)) 69 µs 73 µs 44 µs
(reduce + (vals m)) 75 µs 77 µs 50 µs


  • I added some basic tests for subseq and rsubseq as those both rely on the somewhat special behavior of keys accepting a seqable of Map.Entry objects (not just a map itself). There were no other tests for subseq or rsubseq already present.
  • Some coverage from CLJ-1499 tests d760db
  • Show that metadata works on keys/vals
  • Show that iterator works correctly after you step off the head of the seq, invalidating the iterable
  • Show that the fallthrough branch of iterator works (from PersistentTreeMap)

Patch: clj-1602-6.diff

Screened by:

  • the changes in CollReduce are unlikely to stand but are currently essential

Comment by Alex Miller [ 25/Nov/14 11:53 AM ]

Could leverage CLJ-1499 for the bulk of this, may pull that back from 1.8 into 1.7. Waiting on further work till that's answered.

Comment by Alex Miller [ 03/Dec/14 11:24 PM ]

I also have a patch that extends the CLJ-1499 iterators to support providing both key and val iterators that do not require creating and unpacking a Map.Entry. Unfortunately I only saw times that were ~48 µs on the perf benchmark in the description, so it's not a huge benefit (short-lived object allocation is cheap).

Comment by Alex Miller [ 09/Jan/15 9:15 AM ]

waiting on 1499 mods

Comment by Alex Miller [ 14/Jan/15 5:42 PM ]

Updated patch based on latest CLJ-1499-v10 patch.

Comment by Michael Blume [ 15/Jan/15 1:48 AM ]

I find it concerning that the v9 patch passed generative tests, shouldn't we have seen that?

Comment by Alex Miller [ 15/Jan/15 7:21 AM ]

Actually I did see it in the generative tests for this ticket so I moved those tests into the CLJ-1499 patch.

Comment by Michael Blume [ 15/Jan/15 10:10 AM ]

Ah, cool =)

Comment by Stuart Halloway [ 22/Mar/15 8:16 AM ]

clj-1602-5.diff merely updates -4 to apply on current master

Comment by Stuart Halloway [ 22/Mar/15 8:37 AM ]

requested some tests in the description

Comment by Alex Miller [ 23/Mar/15 1:51 PM ]

Updated to -6 patch that adds the requested tests.

[CLJ-1681] reflection warning throws NPE for literal nil arg Created: 24/Mar/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Michael Blume Assignee: Michael Blume
Resolution: Completed Votes: 0
Labels: regression, typehints

Attachments: Text File clj-1681-v1.patch     Text File clj-1681-v2.patch     Text File clj-1681-v3.patch    
Patch: Code and Test
Approval: Ok


Seen on another project, but reproducible with this:

user=> (set! *warn-on-reflection* true)
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
user=> (pst *e)
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:21:13)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6740)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyze (Compiler.java:6485)
	clojure.lang.Compiler$BodyExpr$Parser.parse (Compiler.java:5861)
	clojure.lang.Compiler$FnMethod.parse (Compiler.java:5296)
	clojure.lang.Compiler$FnExpr.parse (Compiler.java:3925)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6731)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
Caused by:
	clojure.lang.Compiler.getTypeStringForArgs (Compiler.java:2440)
	clojure.lang.Compiler$InstanceMethodExpr.<init> (Compiler.java:1490)
	clojure.lang.Compiler$HostExpr$Parser.parse (Compiler.java:1000)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6733)
	clojure.lang.Compiler.analyze (Compiler.java:6524)

Cause: Regression in Clojure 1.6+ from patch for CLJ-1248. In printing the reflection warning message, if the argument's java class is null, an NPE results.

Approach: Check for null before getting class name.

Patch: clj-1681-v3.patch

Screened by: Alex Miller

Comment by Alex Miller [ 24/Mar/15 3:54 PM ]

Patch welcome.

Comment by Alex Miller [ 24/Mar/15 4:31 PM ]

Need test in the patch and having a simple way to repro in the ticket would be great.

Comment by Nicola Mometto [ 25/Mar/15 3:46 PM ]

here's a repro:

Clojure 1.7.0-master-SNAPSHOT
user=> (set! *warn-on-reflection* true)
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
Comment by Michael Blume [ 25/Mar/15 3:49 PM ]

Thanks! I was having some trouble with that.

Comment by Nicola Mometto [ 25/Mar/15 3:54 PM ]

Michael, me and Alex were discussing this ticket in IRC and Alex was proposing replacing the if expression with an implementation like:

(arg.hasJavaClass() && arg.getJavaClass() != null) ? arg.getJavaClass().getName() : "unknown"
Comment by Michael Blume [ 25/Mar/15 3:58 PM ]

Hm, that makes sense, I was thinking print "nil" instead of "unknown" but I guess the fact that the user is passing nil doesn't actually tell us the expected class of the argument (apart from that it's not a primitive)

[CLJ-1603] cycle, iterate, repeat return vals should IReduceInit Created: 25/Nov/14  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Stuart Halloway
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File clj-1603-10.patch     Text File clj-1603-11.patch     Text File clj-1603-12-2.patch     Text File clj-1603-12.patch     Text File clj-1603-13.patch     Text File clj-1603-14.patch     Text File clj-1603-15.patch     Text File clj-1603-2.patch     Text File clj-1603-3-2.patch     Text File clj-1603-3-3.patch     Text File clj-1603-3-4.patch     Text File clj-1603-3.patch     Text File clj-1603-4.patch     Text File clj-1603-5.patch     Text File clj-1603-6.patch     Text File clj-1603-7.patch     Text File clj-1603-8.patch     Text File clj-1603-9.patch     Text File clj-1603.patch    
Patch: Code and Test
Approval: Ok



clj-1603-15.patch (Java approach)


There were a number of possible approaches for these enhancements:
1) Straight Java impl. The benefit of this approach is improving the performance of both the seq and reduce paths at the expense of writing a bunch of Java. See clj-1603-15.patch for the best of these impls.

2) Clojure deftype. This required moving cycle and iterate and providing a repeat1 implementation until deftype is defined (similar to the approach with reduce1). The deftype version returns a Seqable, IReduce object and has effectively the same former implementation for seq with a new fast implementation for IReduce. This makes reduce paths fast, but leaves seq paths about the same, with the benefit of no new Java code. The downside was that the new seqs did not implement the full range of interfaces from prior impls, which could potentially break code. See clj-1603-12-2.patch for a patch that covers this.

3) Add Iterable or IReduceInit directly to LazySeq. Conceptually, this does not make sense for general lazy seqs. Seqs materialize and cache each value once and doing this along with the ability to iterate/reduce introduces issues with caching (might as well use seqs for that) and synchronization. However, doing this for just these specific lazy seqs is possible - see latest patch.

Approach: Latest patch makes LazySeq implement IReduce and internal macro reducible-lazy-seq lets callers supply a function to implement the reduce path.

A few things to note:

  • Added some example-based tests for iterate, cycle, and repeat where I thought they were needed. Did not add generative tests - not clear to me what these would be that would actually be valuable. All of these functions are pretty simple and the examples cover the special cases.


Some example timing, all in µs:

Expression 1.6.0 1.7.0-alpha5 master + clj-1603-15 (Java) master + clj-1603-12-2 (deftype) master + clj-1603-14 (split)
(doall (take 1000 (repeat 1))) 87 93 63 89 92
(into [] (take 1000) (repeat 1)) n/a 67 25 27 33
(doall (repeat 1000 1)) 87 94 16 94 89
(into [] (repeat 1000 1)) 99 110 13 12 12
(reduce + 0 (repeat 1000 1)) 99 126 20 22 25
(into [] (take 1000) (repeat 1)) n/a 67 28 33 27
(doall (take 1000 (cycle [1 2 3]))) 101 106 85 108 103
(into [] (take 1000) (cycle [1 2 3])) n/a 73 38 45 44
(doall (take 1000 (iterate inc 0))) 93 98 75 123 116
(into [] (take 1000) (iterate inc 0)) n/a 85 38 40 39

Notes on timings above:

  • All reduce timings (with into) comparable across the impls and significantly better than the current behavior over seqs.
  • The Java impl is faster across the board with doall. doall repeatedly calls seq() and next() to walk the sequence. The Java class versions of Repeat, Cycle, and Iterate extend ASeq and seq() just return this. next() constructs and returns a new instance of the class, which is immutable. In the lazy seq versions, LazySeq is mutable and requires synchronization and handling the caching safely. So, simple immutable instance ftw here.
  • The Java finite repeat has an extra benefit from using a primitive long for the counter.
  • One performance difference that's not visible in the timings is that the Java implementations have the benefit of being both seqs and reducibles as they are traversed, so you can always get a fast reduce. The deftype and split impls are only reducible at the initial instance, walking off that initial head reverts to lazy seqs that are not quickly reducible.

Patch: The two patches in leading contention are:

  • clj-1603-15.patch - for the Java version
  • clj-1603-14.patch - for the split impl

Alex opinion: I have swung back and forth on this but my current recommendation is for the Java implementation (clj-1603-15.patch). It's faster for both seqs and reduce, both in the timings above and importantly in maintaining reducibility as they are traversed. There is more Java, but I've made my peace with that - the code maximally leverages existing ASeq infrastructure and the implementation is easy to understand.

Comment by Ghadi Shayban [ 25/Nov/14 11:01 AM ]

Stu, do you intend these to be in Java or Clojure? It could be trickier to implement in Clojure directly, as loading would have to be deferred until core_deftype loads. It's certainly tractable without breaking any backwards compatibility, and I've explored this while experimenting with Range as a deftype https://github.com/ghadishayban/clojure/commit/906cd6ed4d7c624dc4e553373dabfd57550eeff2

A macro to help with Seq&List participation could be certainly useful, as efficiently being both a Seq/List and IReduceInit isn't a party.

May be useful to list requirements for protocol/iface participation.

It seems like 'repeatedly' is another missing link in the IReduceInit story.

Rich mentioned the future integration of reduce-kv at the conj, it would also be useful to know how that could fit in.

---- Other concerns and ops that may belong better on the mailing list ----

In experimenting with more reducible sources, I put out a tiny repo (github.com/ghadishayban/reducers) a couple weeks ago that includes some sources and operations. The sources were CollReduce and not ISeq.

Relatedly, caching the hashcode as a Java `transient` field is not supported when implementing a collection using deftype (patch w/ test in CLJ-1573).

Iterate was one of them https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51
Repeatedly https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51

Reduce/transduce-based Operations that accept transducers:
some, any, yield-first https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L52-L80
(any could use a better name, equiv to (first (filter...)))
some and any have a symmetry like filter/remove.

Novelty maybe for 1.8:
A transducible context for Iterables similar to LazyTransformer:

The unless-reduced macro was very useful in implementing the collections:
It is different than the ensure-reduced and unreduced functions in core.

Comment by Alex Miller [ 25/Nov/14 12:01 PM ]

When we discussed this in the past, it was in the vein of reusing some of the range work (in Java) to implement cycle and iterate (per CLJ-1515).

Comment by Ghadi Shayban [ 25/Nov/14 9:20 PM ]

Never mind about 'repeatedly'. Being both ISeq and IReduceInit for repeatedly doesn't make sense for something that relies on side-effects. Current users of repeatedly can reduce over it many times and only realize the elements once.

Comment by Alex Miller [ 05/Dec/14 11:17 PM ]

attached wip Java impl and posted some example timings

Comment by Ghadi Shayban [ 11/Dec/14 4:35 PM ]

NB iterate in this patch does not cache the realized ISeq, but recalcs it at every call to realize the tail. This is not a change in the promised behavior (docstring says "f must be side-effect free") but an implementation change, as worth noting in the changelog.

Comment by Stuart Halloway [ 02/Jan/15 1:32 PM ]

It looks like all the reduce with no inital value paths are still seq-y, and slower, as shown by e.g.

(dotimes [_ 10]
    (repeat 10000000 1))))

(dotimes [_ 20]
    (repeat 10000000 1))))
Comment by Alex Miller [ 02/Jan/15 2:01 PM ]

On that example in master before / after patch I see:


  • no init = 844 ms
  • with init = 920 ms


  • no init = 124 ms
  • with init = 90 ms

Is that similar to what you see or not?

Comment by Alex Miller [ 02/Jan/15 4:21 PM ]

The clj-1603-3.patch has been updated to use effectively the same algorithm for both versions of reduce. With the -3 patch, I got ~96 ms on both examples in the prior comment. I re-ran the tests in the description and updated those as well (about the same as expected).

Comment by Stuart Halloway [ 16/Jan/15 1:18 PM ]

The tests do not seem to hit the unseeded reduce branches – do we even want these branches? The original ticket was for IReduceInit.

Comment by Michael Blume [ 18/Jan/15 1:48 PM ]

Probably worth noting – Git will happily apply the latest patch for CLJ-1603 on top of the latest patch for CLJ-1515, but the result does not compile because 1515 uses iterate and 1603 moves the definition of iterate lower in clojure.core. Not sure if this is worth fixing now or just noting for when they're actually applied.

Comment by Michael Blume [ 18/Jan/15 1:52 PM ]

Actually, here, this just moves the declare statement further up the file.

Comment by Michael Blume [ 18/Jan/15 2:19 PM ]

OK, no, the two patches are still incompatible even with the declaration order fixed:

[java] ERROR in (test-range) (LongRange.java:95)
     [java] expected: (= (take 3 (range 3 9 0)) (quote (3 3 3)))
     [java]   actual: java.lang.ClassCastException: clojure.core.InfiniteRepeat cannot be cast to clojure.lang.ISeq
     [java]  at clojure.lang.LongRange.create (LongRange.java:95)
Comment by Alex Miller [ 18/Jan/15 2:31 PM ]

The 1515 patch is actually being reworked right now - we will patch things up at application time if needed.

Comment by Alex Miller [ 19/Jan/15 10:12 AM ]

Removing screened marking so can be re-screened. Added new -7 patch that handles print-method, print-dup, and unmapping the deftype constructors so they're not visible. Thanks to Ghadi in CLJ-1515 for the idea on those.

Comment by Ghadi Shayban [ 20/Jan/15 8:08 AM ]

Review of -7 patch:

Seqable/seq implementations that return a separate ISeq like these do should forward a call to seq on the result, like eduction does. [1] (This is not necessary in these particular impls, as the LazySeqs returned are themselves ISeqs. Also because Cycle's deftype is only constructed for non-empty cycles, the fact that there is a guaranteed seq is implicit. Probably a best practice to add an innocuous seq call if users look to these impls as a recipe.)

The performance regression in (doall (repeat 1000 1)) should go away completely with the dorun tweak in CLJ-1515. This is because dorun is effectively calling seq twice (it calls seq, throws away the result, then calls next.)

minor nits
1) repeat1 seems to be identical to repeat-seq and has both arities necessary for the deftypes
2) inside FiniteRepeat s/(> i 0)/(pos? i) also inside the repeat constructor
3) some things are defn- , some are ^:private
4) Cycle/reduce the recur binding can be (recur rr (or (next s) coll)) rather than nil? check

[1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7324

Comment by Alex Miller [ 22/Jan/15 10:34 PM ]

Ghadi - good comments! Fixed 1,2,4. #3 - ^:private is because defn- is not yet defined. New -8 patch.

Comment by Alex Miller [ 23/Jan/15 10:03 AM ]


user=> (= (repeat 5 :a) (repeat 5 :a))
Comment by Alex Miller [ 23/Jan/15 3:04 PM ]

Updated to -9 patch that handles hash and equality for finite repeat case.

Comment by Ghadi Shayban [ 26/Jan/15 2:24 PM ]

metadata in the wrong place on #'repeat1

Comment by Alex Miller [ 26/Jan/15 3:27 PM ]

Thanks, fixed in -10.

Comment by Stuart Halloway [ 20/Feb/15 10:19 AM ]

The collections returned by these APIs promise several things the new deftypes do not deliver. For example, 1.6's repeat currently has the following ancestors that are lost in the propopsed deftype:

  • clojure.lang.ISeq
  • java.io.Serializable
  • clojure.lang.IPending
  • java.util.Collection
  • java.util.List
  • java.lang.Iterable
  • clojure.lang.Obj
  • clojure.lang.IPersistentCollection
  • clojure.lang.IMeta
  • clojure.lang.IObj

Losing metadata and serializability are certainly regressions, other stuff maybe as well. I suspect similar problems in all the other API returns.

We could improve testing by taking advantage of the property-checking fns already in test-clojure/data-structures, e.g. is-same-collection

Comment by Alex Miller [ 20/Feb/15 10:42 AM ]

I think we should consider carefully what is contractually required by the return of repeat. My opinion is that repeat must return something seqable, not literally a seq (or lazy seq). With that perspective, ISeq, IPending (there re lazy seq laziness), Collection, List, Iterable, and IPersistentCollection are non-essential. Obj is a concrete class and we shouldn't commit to that.

  • IObj is a gap, this should work but doesn't: (with-meta (repeat 1 :a) {:a 1})
  • IMeta doesn't need to be added, will never have meta right now and meta handles this correctly
  • Serializable - doesn't make sense for the infinite seqs but should probably fix for finite range
Comment by Alex Miller [ 20/Feb/15 11:03 AM ]

Added -11 patch that adds IObj for all the new deftypes and Serializable for FiniteRepeat. Still need to add more tests.

Comment by Stuart Halloway [ 20/Feb/15 12:27 PM ]

I think all the java. interfaces are mandatory for interop. Leaving out any one of the clojure. interfaces creates observable change in behavior composing with other core fns (admittedly the IPending case would be bizarre, who uses that?), so those seem mandatory too.

Agreed Obj should be irrelevant, and if it isn't the bug is elsewhere.

Comment by Alex Miller [ 27/Feb/15 9:41 AM ]

pending further discussion w/ stu

Comment by Alex Miller [ 13/Mar/15 7:02 AM ]

for a clean run and more explanation of the perf numbers

Comment by Alex Miller [ 17/Mar/15 4:23 PM ]

refreshed numbers in table

Comment by Alex Miller [ 18/Mar/15 9:24 AM ]

Refreshed perf numbers and older patch variants, added some more explanation of the perf numbers for comparison.

Comment by Stuart Halloway [ 22/Mar/15 7:50 AM ]

Screened clj-1603-3-3.patch

Comment by Nicola Mometto [ 23/Mar/15 2:00 PM ]

Shouldn't iterate cache its next slot? the current implementation recalculates it every time:

user=> (def coll (iterate #(do (println %) (inc %)) 0))
user=> (nth coll 3)
user=> (nth coll 3)

I realize the docstring for iterate explicitely warns that f must be side-effect free but this is just for demonstration purposes, imagine that f is a computationally-heavy function and you'll understand why I don't think the new proposed behaviour is desiderable.

Comment by Nicola Mometto [ 23/Mar/15 2:26 PM ]

I attached clj-1603-3-4.patch which is the same as clj-1603-3-3.patch except it caches the next slot of iterate.

Here is the diff between -3-3 and -3-4 for ease of review:

diff --git a/src/jvm/clojure/lang/Iterate.java b/src/jvm/clojure/lang/Iterate.java
index f23ddca..aeef998 100644
--- a/src/jvm/clojure/lang/Iterate.java
+++ b/src/jvm/clojure/lang/Iterate.java
@@ -16,6 +16,7 @@ public class Iterate extends ASeq implements IReduce {
 private final IFn f;      // never null
 private final Object seed;
+private volatile ISeq _next;
 private Iterate(IFn f, Object seed){
     this.f = f;
@@ -37,7 +38,14 @@ public Object first(){
 public ISeq next(){
-    return new Iterate(f, f.invoke(seed));
+    ISeq ret = _next;
+    if (ret == null)
+        synchronized(this) {
+            ret = _next;
+            if (ret == null)
+                _next = ret = new Iterate(f, f.invoke(seed));
+        }
+    return ret;
 public Iterate withMeta(IPersistentMap meta){
Comment by Alex Miller [ 23/Mar/15 3:32 PM ]

It's a good question. That impl kills all of the performance improvement for iterate on the seq use case (prob the current common case). I'll take a look at it though.

Comment by Nicola Mometto [ 23/Mar/15 3:57 PM ]

My opinion is that the performance improvement for iterate that the benchmark shows won't reflect a real performance improvement in "real" usage cases of iterate where `f` actually does some computation

Comment by Nicola Mometto [ 23/Mar/15 4:04 PM ]

This is also true for its reduce impl btw.
I'm sorry if I'm pointing this out late in the life of this ticket but it never occurred to me that, contrary to cycle, repeat or range (CLJ-1515), this change for iterate will possibly make accessing large sequences slower since we lose the caching aspect of lazy-seqs.

Comment by Alex Miller [ 23/Mar/15 4:58 PM ]

Added new -15 patch, which is the same as -3-3 but caches _next for all three types. This has no impact on the reduce performance and was 2-8 us slower in the timings for the seqs above. That seems like a small hit for the reduction in perf and memory on all cases where the seq is traversed multiple times. That seems like a reasonably common thing to do, particularly for finite repeat.

Comment by Alex Miller [ 23/Mar/15 5:02 PM ]

Stu - the only change in -15 vs -3-3 is in the next() methods of each new type. Each has a new "private volatile ISeq _next;"


public ISeq next(){
    if(_next == null) {
        ISeq next = current.next();
        if (next != null)
            _next = new Cycle(all, next);
            _next = new Cycle(all, all);
    return _next;


public ISeq next(){
    if(_next == null) {
        _next = new Iterate(f, f.invoke(seed));
    return _next;


public ISeq next() {
    if(_next == null) {
        if(count > 1)
            _next = new Repeat(count - 1, val);
        else if(count == INFINITE)
            _next = this;
    return _next;
Comment by Michael Blume [ 23/Mar/15 6:47 PM ]

For Cycle, would it be worth holding the head and linking back to it when we loop, so that the underlying sequence only has to be traversed once? Something like this


On my machine this saves about 10 microseconds on

(doall (take 1000 (cycle [1 2 3])))

Comment by Michael Blume [ 23/Mar/15 7:07 PM ]

Also, for the Cycle reduce implementations, it might make sense to assume the underlying sequence is going to be traversed many times, copy it into an array, and loop over it with an index.

Comment by Michael Blume [ 23/Mar/15 7:17 PM ]

like so: https://github.com/MichaelBlume/clojure/commit/array-loop

Comment by Alex Miller [ 23/Mar/15 8:28 PM ]

The problem with holding a pointer back to the head is that the intervening cycle can be arbitrarily large. You are then literally holding the head on an arbitrary size seq. A silly example that fails with the patch but works without it (depending on your jvm size):

(dorun (take 10000000 (cycle (range 10000000))))

I played with the cycle reduce too but it has the same effective potential issue of creating and holding an arbitrarily sized array. Actually this one is worse in that it realizes an arbitrarily large array before knowing how many inputs it will need - it could be a take 1. Also, things that formerly worked could blow up with an oome. What if someone relies on passing an infinite sequence to cycle in a fallthrough case? For example, this works with the current version but would blow up with the array.

(into [] (take 1000) (cycle (iterate inc 0)))

Neither of these seem worth the risk to me.

Comment by Michael Blume [ 23/Mar/15 10:53 PM ]

Yeah, fair enough.

Comment by Alex Miller [ 25/Mar/15 4:14 PM ]

updated version of the -15 patch that gives proper credit to Ghadi for part.

[CLJ-1683] test-reduce doesn't catch off-by-one error in IReduce Created: 25/Mar/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: None

Attachments: Text File clj-1683-v1.patch    
Patch: Code and Test
Approval: Ok


To implement the no-init arity of reduce, you have to use the first element of the sequence as an init value, and then you have to skip the first element when you iterate through the sequence. The tests in test-reduce, which focus on using reduce to sum up a bunch of numbers in a range, don't actually pin down this behavior, because the range used starts with zero, and an extra zero doesn't affect the sum.

Screened by: Alex Miller - this is a good tweak to get better tests for the hard this easy to mess up reduce behavior.

[CLJ-1667] Socket test can fail if hard-coded port is unavailable Created: 26/Feb/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Completed Votes: 0
Labels: io, test

Attachments: Text File socket-test.patch    
Patch: Code
Approval: Ok


I was unable to run the Clojure tests due to this problem. There is a test that hardcodes a port and something else on my machine happened to be using that port.

The patch avoids binding a hard-coded port in the test.

Patch: socket-test.patch

Screened by:

Comment by Andy Fingerhut [ 26/Feb/15 11:31 AM ]

I used to try running the prescreen tests in parallel for two different JDKs on the same machine, and I probably stopped doing that because of this. My use case is a very unusual one, and not a good reason to change this by itself, but my use case certainly made this conflict happen regularly.

Comment by Alex Miller [ 26/Feb/15 11:57 AM ]

No good reason not to fix it! Silly test.

[CLJ-1669] Move LazyTransformer to an iterator strategy, extend eduction capabilities Created: 04/Mar/15  Updated: 27/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: transducers

Attachments: Text File clj-1669-2.patch     Text File clj-1669-3.patch     Text File clj-1669-4.patch     Text File clj-1669-5.patch     Text File clj-1669-6.patch     Text File clj-1669.patch    
Patch: Code and Test
Approval: Vetted

  • LazyTransformer does a lot of work to be a seq. Instead, switch to creating a transforming iterator.
  • Change sequence to wrap iterator-seq around the transforming iterator.
  • Change the iterator-seq implementation to be chunked. IteratorSeq will no longer be used but is left in case of regressions for now.
  • Change Eduction to provide iteration directly via the transforming iterator.
  • Extend eduction to support multiple xforms.


(use 'criterium.core)
(def s (range 1000))
(def v (vec s))
(def s50 (range 50))
(def v50 (vec s50))

expr master s master v 1669-3 s 1669-3 v 1669-6 s 1669-6 v
non-chunking transform            
(into [] (->> s (interpose 5) (partition-all 2))) 466 us 459 us 525 us 508 us 476 us 501 us
(into [] (->> s (eduction (interpose 5) (partition-all 2)))) * 113 us 112 us 117 us 122 us 108 us 108 us
1 chunking transform            
(into [] (map inc s)) 28 us 31 us 30 us 31 us 30 us 31 us
(into [] (map inc) s) 17 us 19 us 19 us 20 us 19 us 17 us
(into [] (sequence (map inc) s)) 58 us 46 us 142 us 148 us 94 us 67 us
(into [] (eduction (map inc) s)) 21 us 20 us 25 us 21 us 23 us 21 us
(doall (map inc (eduction (map inc) s))) 219 us 208 us 204 us 191 us 117 us 97 us
2 chunking transforms        
(into [] (map inc (map inc s))) 49 us 50 us 50 us 50 us 54 us 55 us
(into [] (comp (map inc) (map inc)) s) 23 us 23 us 28 us 23 us 23 us 23 us
(into [] (sequence (comp (map inc) (map inc)) s)) 73 us 58 us 144 us 135 us 104 us 82 us
(into [] (eduction (map inc) (map inc) s)) * 54 us 51 us 54 us 29 us 55 us 30 us
(doall (map inc (eduction (map inc) (map inc) s))) * 230 us 213 us 213 us 196 us 124 us 104 us
expand transform            
(into [] (mapcat range (map inc s50))) 83 us 81 us 80 us 84 us 71 us 72 us
(into [] (sequence (comp (map inc) (mapcat range)) s50)) 122 us 117 us 256 us 254 us 161 us 156 us
(into [] (eduction (map inc) (mapcat range) s50)) * 78 us 79 us 80 us 82 us 60 us 61 us
materialized eduction            
(sort (eduction (map inc) s)) ERR ERR 120 us 84 us 106 us 89 us
(->> s (filter odd?) (map str) (sort-by last)) 1.13 ms 1.21 ms 1.19 ms 1.20 ms 1.19 ms 1.20 ms
(->> s (eduction (filter odd?) (map str)) (sort-by last)) ERR ERR 1.18 ms 1.17 ms 1.22 ms 1.23 ms
  • used comp to combine xforms as eduction only took one in the before case

Patch: clj-1669-6.patch

Screened by:

Comment by Michael Blume [ 05/Mar/15 3:52 PM ]

Nice, I like the direction on this.

CLJ-1515 currently breaks this patch (LongRange cannot be converted to Iterable), but I imagine that'll get better when it absorbs the changes from CLJ-1603

Comment by Alex Miller [ 06/Mar/15 8:11 AM ]

Yeah. colls should be mapped through RT.iter() to catch more cases.

Comment by Alex Miller [ 06/Mar/15 9:52 AM ]

To do:

  • remove Seqable from Eduction
  • support Iterable in RT.toArray()
  • more eduction pipeline tests that require realization at end
Comment by Alex Miller [ 06/Mar/15 1:00 PM ]

Perf numbers show pretty worse results from sequence, will dig in further.

Comment by Alex Miller [ 13/Mar/15 7:41 AM ]

For the s timings, we've actually introduced more steps into the stack:

OLD reduce with s:

   seq (range) - every transformation is another layer here

NEW reduce with s:

  TransformingIterator (handles N transformations in 1 step)
      seq (range)
Comment by Alex Miller [ 20/Mar/15 10:08 AM ]

Look at perf for:

  • ->> eduction transformation
  • transformation comparison that doesn't support chunking
  • more into vector iteration case
Comment by Alex Miller [ 21/Mar/15 8:45 AM ]

The -5 patch is same -3 except all uses of IteratorSeq have been replaced with a ChunkedCons that is effectively a chunked version of the old IteratorSeq. While no one calls it, I left IteratorSeq in the code base in case of regression.

Generally, the chunked iterator seq reduces the cost in a number of the worst cases and also is a clear benefit in making seqs over a result of eduction or sequence faster to traverse (as they are now chunked).

I think the one potential issue is that seqs over iterators are now chunked when they were not before which could change programs that expect their stateful iterator to be traversed one at a time. This change could be isolated to just to sequence and seq-iterator and mitigated by not changing RT.seqFrom() and seq-iterator to use the new chunking behavior only in sequence and/or with a new chunked-iterator-seq to make it more explicit. The sequence over xf is new so no possible regression there, everything else would just be opt-in.

Comment by Rich Hickey [ 27/Mar/15 9:49 AM ]

push as is but leave unresolved, for perf tweaks

Comment by Alex Miller [ 27/Mar/15 10:15 AM ]

clj-1669-6 is identical to clj-1669-5 but removes two commented out debugging lines that were inadvertently included.

[CLJ-1436] Deref throws an unhelpful error message when used on something not dereferencable Created: 03/Jun/14  Updated: 26/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Phillip Lord Assignee: Unassigned
Resolution: Duplicate Votes: 3
Labels: errormsgs, newbie

Attachments: Text File deref.patch     File tests-patch.diff    
Patch: Code and Test


Consider the following code:

(def x 1)
(def y (ref 2))

(+ @x y)

Clojure throws a ClassCastException on cast to Future. This is a very unhelpful error message; why a Future, why not Ref, Atom etc. It would be nice if this failed more gracefully.

Comment by Tobias Kortkamp [ 15/Sep/14 2:08 PM ]

Attached a patch with better error messages for deref. The above example now throws:

IllegalArgumentException class java.lang.Long is not derefable  clojure.core/deref (core.clj:2211)

and e.g.

(deref (delay 1) 500 :foo)


IllegalArgumentException class clojure.lang.Delay is not derefable with a timeout  clojure.core/deref (core.clj:2222)
Comment by Mark Nutter [ 16/Sep/14 6:57 PM ]

Patch file clj-1436-patch-2014-09-16.diff updates the deref function so that it checks whether its arg is a future before sending it to deref-future. It also updates the deref function to provide clearer error messages. If the arg is not a future, and does not implement IDeref, the patched version of deref throws an IllegalArgumentException with a message that the arg cannot be dereferenced because it is not a ref/future/etc.

Comment by Mark Nutter [ 16/Sep/14 7:00 PM ]

Oops, I had this page open from yesterday and didn't see the patch submitted by Tobias. His has everything mine does, so I'll withdraw mine.

Comment by Mark Nutter [ 17/Sep/14 5:13 AM ]

One suggestion: the error message might sound better as "IllegalArgumentException cannot dereference clojure.lang.Delay; not a future or reference type".

Comment by Mark Nutter [ 17/Sep/14 5:44 AM ]

Tobias' patch does not contain the tests I had in mine, so I'm re-submitting just the tests as tests-patch.diff. If you install the tests patch without installing the deref patch, the tests will fail with the error message "Wrong exception type when passing non-IDeref/non-future to deref/@". Applying the deref patch as well will allow the tests to pass.

Comment by Alex Miller [ 28/Dec/14 11:06 AM ]

This patch does not seem to consider performance implications of adding this check and its impact on inlining.

Comment by Nicola Mometto [ 25/Mar/15 5:45 PM ]

Dupe of CLJ-1162

Comment by Andy Fingerhut [ 26/Mar/15 10:13 AM ]

Maybe it shouldn't be the deciding factor when selecting which duplicate ticket to close, but CLJ-1162 has 1 vote, this one has 3. Closing this one as the duplicate is discarding those votes, unless (a) they are copied over to the older ticket, or (b) consider closing the older ticket as a duplicate of this one instead.

Comment by Alex Miller [ 26/Mar/15 10:59 AM ]

I pondered that myself - I usually look at votes, current status, and patch. In this case, I liked the patch on CLJ-1162 better (although they were pretty similar).

[CLJ-1671] Clojure stream socket repl Created: 09/Mar/15  Updated: 26/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: repl

Attachments: Text File clj-1671-2.patch     Text File clj-1671-3.patch     Text File clj-1671-4.patch    
Patch: Code
Approval: Incomplete


Programs often want to provide REPLs to users in contexts when a) network communication is desired, b) capturing stdio is difficult, or c) when more than one REPL session is desired. In addition, tools that want to support REPLs and simultaneous conversations with the host are difficult with a single stdio REPL as currently provided by Clojure.

The goal is to provide a simple streaming socket repl as part of Clojure. The socket repl should support both program-to-program communication (by exchanging data) and human-readable printing (which mirrors current behavior). The socket repl server will be started only when supplying a port to clojure.main or by explicitly starting it by calling the provided function.

Each socket connection will be given its own repl context and unique starting user namespace (user, user1, user2, etc) with separate repl stack and proper bindings. On session termination, the namespace will be removed. *in* and *out* will be bound to incoming and outgoing streams. Tools can communicate with the runtime while also providing a user repl by opening two connections to the same server.

There are two known cases where the repl interprets non-readable objects and prints them for human consumption in a non-readable way: objects with no specific print-method and when handling Throwables. Both cases (Object and Throwable) now have a print-method implementation to return a tagged-literal representation. By default the socket repl will print exceptions with the print-method data form. Optionally, the user can set a custom repl exception printer. A function that provides the current human readable exception printing will be provided.


  1. Socket server to accept connections and serve a repl to a client
    • Runtime configuration via data
  2. Client repl sessions should be independent
    • Separate user namespace
    • Separate bindings
    • Namespace removed on client shutdown
    • Communicate solely via data for program-to-program communication
  3. Stdio has both out and err streams but socket has only single out
  4. Repls should be nestable
    • Repl within a repl binds streams appropriately
    • Means of control (exit)


  1. Printing as data
    • Of object without print-method: as #object
    • Of Throwable: as #error
  2. Start socket server from command line
    • Configure: host, port, whether to prompt, error printing
  3. Start socket server programmatically
    • Configure: host, port, whether to prompt, error printing, whether to bind err to out
    • Control: close returned socket server to stop listening
  4. Start stdio repl from command line
    • Configure: whether to prompt, error printing
  5. On socket client accept
    • Create new user namespace
    • Bind error printer according to server config
  6. In socket client repl
    • Bind new error printer function
  7. On socket client disconnect
    • Remove user namespace
    • Close socket

Impl notes: This adds a new -s option to clojure.main that will start a socket server listening on a given host:port. Each client is given a new userN namespace (starting from user1). It binds *in*, *out*, and *err*. Each client connection consumes a daemon thread named "Socket REPL Client N" (matching the user namespace). On client disconnect, the user namespace is removed and thread will die.

There are two system properties that can be used to control whether the prompt and the default error printer:

  • clojure.repl.socket.prompt (default=false) - whether to print prompts
  • clojure.repl.socket.err-printer (default=clojure.main/err->map) - function to format exceptions

The existing stdio repl behaves the same as before, but it's behavior can be influenced by two new similar system properties:

  • clojure.repl.stdio.prompt (default=true)
  • clojure.repl.socket.err-printer (default=clojure.main/err-print)

You can also start a socket repl server programmatically (shown here with all kwarg options - pick the ones you need):

(def ss 
    :host "localhost"                   ;; default=<loopback>, accepts InetAddress or String
    :port 5555                          ;; default=0 (ephemeral)
    :use-prompt true                    ;; default=false
    :bind-err true                      ;; default=true, to bind \*err* to \*out*
    :err-printer clojure.main/err->map  ;; default=clojure.main/err->map, print Throwable to \*out*

The server socket is returned. Closing it will stop listening on the port (existing client connections will still be alive).

If you want to test with the server port, telnet makes a great client:

;; term 1:
$ java -cp target/classes -Dclojure.repl.socket.prompt=true clojure.main -s

;; term 2:
$ telnet 5555
Connected to localhost.
Escape character is '^]'.
user1=> (+ 1 1)
user1=> (/ 1 0)
#error {:cause "Divide by zero",
 [{:type java.lang.ArithmeticException,
   :message "Divide by zero",
   :at [clojure.lang.Numbers divide "Numbers.java" 158]}],
 [[clojure.lang.Numbers divide "Numbers.java" 158]
  [clojure.lang.Numbers divide "Numbers.java" 3808]
  [user1$eval1 invoke "NO_SOURCE_FILE" 1]
  [clojure.lang.Compiler eval "Compiler.java" 6784]
  [clojure.lang.Compiler eval "Compiler.java" 6747]
  [clojure.core$eval invoke "core.clj" 3078]
  [clojure.main$repl$read_eval_print__8287$fn__8290 invoke "main.clj" 265]
  [clojure.main$repl$read_eval_print__8287 invoke "main.clj" 265]
  [clojure.main$repl$fn__8296 invoke "main.clj" 283]
  [clojure.main$repl doInvoke "main.clj" 283]
  [clojure.lang.RestFn invoke "RestFn.java" 619]
  [clojure.main$socket_repl_server$fn__8342$fn__8344 invoke "main.clj" 450]
  [clojure.lang.AFn run "AFn.java" 22]
  [java.lang.Thread run "Thread.java" 724]]}
user1=> (println "hello")

A dynamic var clojure.main/*err-printer* is provided to customize printing of exceptions. It's bound by :err-printer if invoked programmatically or the system property if started from the command line, but it can be dynamically rebound during the session if desired:

user1=> (set! clojure.main/*err-printer* clojure.main/err-print)
#object[clojure.main$err_print 0x317bee01 "clojure.main$err_print@317bee01"]
user1=> (/ 1 0)
ArithmeticException Divide by zero  clojure.lang.Numbers.divide (Numbers.java:158)

Patch: clj-1671-4.patch

Comment by Timothy Baldridge [ 09/Mar/15 5:50 PM ]

Could we perhaps keep this as a contrib library? This ticket simply states "The goal is to provide a simple streaming socket repl as part of Clojure." What is the rationale for the "part of Clojure" bit?

Comment by Alex Miller [ 09/Mar/15 7:33 PM ]

We want this to be available as a Clojure.main option. It's all additive - why wouldn't you want it in the box?

Comment by Timothy Baldridge [ 09/Mar/15 10:19 PM ]

It never has really been too clear to me why some features are included in core, while others are kept in contrib. I understand that some are simply for historical reasons, but aside from that there doesn't seem to be too much of a philosophy behind it.

However it should be noted that since patches to clojure are much more guarded it's sometimes nice to have certain features in contrib, that way they can evolve with more rapidity than the one release a year that clojure has been going through.

But aside from those issues, I've found that breaking functionality into modules forces the core of a system to become more configurable. Perhaps I would like to use this repl socket feature, but pipe the data over a different communication protocol, or through a different serializer. If this feature were to be coded as a contrib library it would expose extension points that others could use to add additional functionality.

So I guess, all that to say, I'd prefer a tool I can compose rather than a pre-built solution.

Comment by Rich Hickey [ 10/Mar/15 6:25 AM ]

Please move discussions on the merits of the idea to the dev list. Comments should be about the work of resolving the ticket, approach taken by the patch, quality/perf issues etc.

Comment by Colin Jones [ 11/Mar/15 1:33 PM ]

I see that context (a) of the rationale is that network communication is desired, which sounds to me like users of this feature may want to communicate across hosts (whether in VMs or otherwise). Is that the case?

If so, it seems like specifying the address to bind to (e.g. "", "::", "", etc.) may become important as well as the existing port option. This way, someone who wants to communicate across hosts (or conversely, lock down access to local-only) can make that decision.

Comment by Alex Miller [ 11/Mar/15 2:07 PM ]

Colin - agreed. There are many ways to potentially customize what's in there so we need to figure out what's worth doing, both in the function and via the command line.

I think address is clearly worth having via the function and possibly in the command line too.

Comment by Kevin Downey [ 11/Mar/15 5:49 PM ]

I find the exception printing behavior really odd. for a machine you want an exception as data, but you also want some indication of if the data is an error or not, for a human you wanted a pretty printed stacktrace. making the socket repl default to printing errors this way seems to optimize for neither.

Comment by Rich Hickey [ 12/Mar/15 12:29 PM ]

Did you miss the #error tag? That indicates the data is an error. It is likely we will pprint the error data, making it not bad for both purposes

Comment by Alex Miller [ 13/Mar/15 11:29 AM ]

New -4 patch changes:

  • clojure.core/throwable-as-map now public and named clojure.core/Throwable->map
  • catch and ignore SocketException without printing in socket server repl (for client disconnect)
  • functions to print as message and as data are now: clojure.main/err-print and clojure.main/err->map. All defaults and docs updated.
Comment by David Nolen [ 18/Mar/15 12:44 PM ]

Is there any reason to not allow supplying :eval in addition to :use-prompt? In the case of projects like ClojureCLR + Unity eval generally must happen on the main thread. With :eval as something which can be configured, REPL sessions can queue forms to be eval'ed with the needed context (current ns etc.) to the main thread.

Comment by Kevin Downey [ 20/Mar/15 2:12 PM ]

I did see the #error tag, but throwables print with that tag regardless of if they are actually thrown or if they are just the value returned from a function. Admittedly returning exceptions as values is not something generally done, but the jvm does distinguish between a return value and a thrown exception. Having a repl that doesn't distinguish between the two strikes me as an odd design. The repl you get from clojure.main currently prints the message from a thrown uncaught throwable, and on master prints with #error if you have a throwable value, so it distinguishes between an uncaught thrown throwable and a throwable value. That obviously isn't great for tooling because you don't get a good data representation in the uncaught case.

It looks like the most recent patch does pretty print uncaught throwables, which is helpful for humans to distinguish between a returned value and an uncaught throwable.

Comment by Kevin Downey [ 25/Mar/15 1:10 PM ]

alex: saying this is all additive, when it has driven changes to how things are printed, using the global print-method, rings false to me

Comment by Sam Ritchie [ 25/Mar/15 1:15 PM ]

This seems like a pretty big last minute addition for 1.7. What's the rationale for adding it here vs deferring to 1.8, or trying it out as a contrib first?

Comment by Alex Miller [ 25/Mar/15 2:13 PM ]

Kevin: changing the fallthrough printing for things that are unreadable to be readable should be useful regardless of the socket repl. It shouldn't be a change for existing programs (unless they're relying on the toString of objects without print formats).

Sam: Rich wants it in the box as a substrate for tools.

Comment by Alex Miller [ 26/Mar/15 10:03 AM ]

Marking incomplete, pending at least the repl exit question.

[CLJ-878] DispatchReader always calls CtorReader when no dispatch macro found Created: 15/Nov/11  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Greg Chapman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader

Windows 7, Java 1.7


At the REPL, I accidentally typed:

user=> # "\w+"
RuntimeException Unsupported escape character: \w clojure.lang.Util.runtimeExce
ption (Util.java:156)
#<core$PLUS clojure.core$PLUS@6b7dc78>

You can see the confusing result (the REPL is also left in an unclosed string). After looking at LispReader.java, it seems to me DispatchReader ought to at least check for whitespace before calling CtorReader (perhaps better would be to check for a valid symbol character). Another example:

user=> (defrecord X [x])
user=> # "user.X"[5]
#user.X{:x 5}

I'm assuming the fact that the above successfully creates an X is accidental.

Comment by Nicola Mometto [ 25/Mar/15 5:33 PM ]

No longer reproducible

Comment by Alex Miller [ 25/Mar/15 9:33 PM ]

Seems at least partially reproducible to me. I see same behavior on first example. Second example works with a symbol {{# user.X[5]}} but fails with a reasonable error on the given case.

Comment by Nicola Mometto [ 25/Mar/15 9:56 PM ]

Ah, right.
The only way to fix the first error is to make `# "\w+"` a valid regex literal, that is, allowing whitespaces between # and the next dispatch char.
Is this something we want?

Comment by Alex Miller [ 25/Mar/15 10:06 PM ]


[CLJ-1670] Odd error when evaling a quoted object within a macro with a field name containing a '-' Created: 09/Mar/15  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Tim Engler Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: dash, macro, quote

linux gentoo


;;Somewhat self explanatory:

(deftype x [ this-is-a-bad-field ])

(defmacro xxx [] `(quote ~(->x 1)))
;;creates error: CompilerException java.lang.IllegalArgumentException: No matching field found: this-is-a-bad-field for class user.x, compiling:(NO_SOURCE_PATH:1:1)

;;But this works:
(deftype y [ ThisIsAGoodField ])

(defmacro yyy [] `(quote ~(->y 1)))

;;returns: #<y user.y@79fd87c8>

Comment by Alex Miller [ 09/Mar/15 12:41 PM ]

possibly a dupe of CLJ-1399

Comment by Tim Engler [ 09/Mar/15 8:21 PM ]

I applied the patch in one of the attachments of CLJ-1399, and this fixed it. So I agree, a dupe.

Comment by Nicola Mometto [ 25/Mar/15 5:22 PM ]

should be closed as dupe of CLJ-1399

[CLJ-809] fn's created using defn should not lexical shadow the var that holds them Created: 11/Jun/11  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Kevin Downey Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None


currently (defn foo [x] (foo x)) expands to something like (def foo
(fn foo [x] (foo x))) so the fn is bound to foo lexically in the scope
of the fn body.

because of this lexical shadowing self calls to fns defined with defn
do not incur the overhead of var dereferencing, I assume this is the
reason it was added.

the lexical shadowing also breaks memoization if you create the
memoized functions via (alter-var-root #'foo memoize) and several
macros here and there emit code in this style, since reusing defn is
the easiest way to get all the good juicy metadata bits like arglists,

1.3 has changes to eliminate some of the cost of going through the var.

since going through the var is much cheaper now it would be nice to
eliminate the lexical shadowing.


Comment by Kevin Downey [ 11/Jun/11 2:40 PM ]

tests indicate that accessing the fn through the var vs. accessing through the lexical binding have very similar access times


results in

$ java -jar clojure.jar ~/src/lexicaltest.clj
lexical binding run 0
"Elapsed time: 86.739 msecs"
lexical binding run 1
"Elapsed time: 8.062 msecs"
lexical binding run 2
"Elapsed time: 16.182 msecs"
var binding run 0
"Elapsed time: 61.136 msecs"
var binding run 1
"Elapsed time: 40.317 msecs"
var binding run 2
"Elapsed time: 11.641 msecs"

Comment by Jason Wolfe [ 13/Jun/11 5:05 PM ]

Here's another test that makes sure no JIT funny business (e.g. dead code elimination) is going on, and also looks at primitive hinted versions.


var-obj result, time: 14930352 ,"Elapsed time: 965.577 msecs"
lex-obj result, time: 14930352 ,"Elapsed time: 957.741 msecs"
var-prim result, time: 14930352 ,"Elapsed time: 770.411 msecs"
lex-prim result, time: 14930352 ,"Elapsed time: 113.412 msecs"

I'm not sure how to properly hint the var in the var-prim version to get truly comparable results. In any case, this use case should definitely be kept in mind.

Comment by Kevin Downey [ 13/Jun/11 5:35 PM ]

I think the point is we want to make sure the jvm can optimize both just as well, so making the code purposefully unoptimizable is kind of silly, yes?

Comment by Jason Wolfe [ 21/Jun/11 4:41 PM ]

@Kevin In your gist, the return value of foo is not used, and foo has no side effects. A sufficiently smart compiler could optimize the calls out entirely, which would not tell us much about the runtime of foo.

Comment by Aaron Bedra [ 28/Jun/11 6:49 PM ]

Rich: Given the performance testing (in the email thread) what is the next step here?

  • more thorough testing
  • patch?

Are there things that this might break that we need to consider?

Comment by Rich Hickey [ 29/Jul/11 7:39 AM ]

Someone needs to make a patch, and then test perf with and without the patch. These simulated tests aren't necessarily indicative. Naive fib is an ok test once you have a patch. And yes, more things might break - needs careful assessment.

Comment by Nicola Mometto [ 11/Apr/14 5:29 PM ]

If I understand this ticket correctly, it looks like this has been done at some point in time.

Clojure 1.7.0-master-SNAPSHOT
user=> (macroexpand-1 '(defn x []))
(def x (clojure.core/fn ([])))
Comment by Nicola Mometto [ 25/Mar/15 5:40 PM ]

This has already been fixed, should be closed

Comment by Alex Miller [ 25/Mar/15 9:22 PM ]

Already done probably long ago:

user=> (pprint (macroexpand '(defn foo [x] (foo x))))
(def foo (clojure.core/fn ([x] (foo x))))

[CLJ-1043] Unordered literals does not preserve left-to-right evaluation of arguments Created: 16/Aug/12  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Brandon Bloom Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: None


Given: (defn f [x] (println x) x)

#{(f 2) (f 1)}



But expected would be:


This issue is related to CLJS-288

Comment by Andy Fingerhut [ 23/Sep/12 6:01 PM ]

I have the same question as David Nolen for CLJS-288: Is this a bug, or just behavior you didn't expect?

It seems that vectors preserve the order of evaluation, so if you really want to control evaluation order you could use something like (set [(f 2) (f 1)]) or (set (map f [2 1])).

Comment by Brandon Bloom [ 23/Sep/12 7:38 PM ]

I'd consider the expected default behavior of any syntax or macro to evaluate each sub-form once each, from left to right. Conditional, repeated, or out-of-order evaluation should be documented as deviations from that norm. If you buy that, then this is either a code or a documentation bug. My vote is for code bug.

Comment by Nicola Mometto [ 25/Mar/15 5:05 PM ]

I don't think it's possible to preserve left-to-right evaluation of args on unordered literals, the original order will be lost during read-time, the compiler has no way to know about it.

Comment by Alex Miller [ 25/Mar/15 5:40 PM ]

Sets are unordered.

Comment by Brandon Bloom [ 25/Mar/15 8:22 PM ]

Sometime over the past two years, I've come around to the perspective that this is expected behavior. However, I still maintain that this should be documented somewhere!

On this page: http://clojure.org/evaluation

This paragraph:

Vectors, Sets and Maps yield vectors and (hash) sets and maps whose contents are the evaluated values of the objects they contain. The same is true of metadata maps. If the vector or map has metadata, the evaluated metadata map will become the metadata of the resulting value.

I'd propose inserting a sentence like:

Contents are evaluated sequentially in vectors, but in an unspecified order for sets and maps.

Comment by Alex Miller [ 25/Mar/15 9:11 PM ]

I added a comment to that page.

Comment by Brandon Bloom [ 25/Mar/15 9:19 PM ]

Awesome, thanks!

Small typo: "Vectors elements" should be "Vector elements".

[CLJ-1596] Using keywords in place of symbols for defrecord fields causes a compiler exception with incorrect line number Created: 20/Nov/14  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Kyle Kingsbury Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: compiler, defrecord

Approval: Triaged


Possibly related to http://dev.clojure.org/jira/browse/CLJ-1261: a defrecord like

(defn foo [x])

(defrecord Bar [:b])

Throws an exception, like you'd expect:

java.lang.ClassCastException: clojure.lang.Keyword cannot be cast to clojure.lang.IObj, compiling:(tesser/quantiles_test.clj:45:15)

However, this exception's line and character indicates the error is in the previous form: the defn, not the defrecord. This can be really tricky to figure out when the expressions are more complicated.

Comment by Alex Miller [ 20/Nov/14 4:17 PM ]

Related: CLJ-1261

Comment by Alex Miller [ 20/Nov/14 4:18 PM ]

Possibly fixed by CLJ-1561, not sure.

Comment by Nicola Mometto [ 25/Mar/15 5:55 PM ]

No longer reproducible, fixed by CLJ-1561

[CLJ-1322] doseq with several bindings causes "ClassFormatError: Invalid Method Code length" Created: 10/Jan/14  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Miikka Koskinen Assignee: Unassigned
Resolution: Unresolved Votes: 5
Labels: None

Clojure 1.5.1, java 1.7.0_25, OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1ubuntu0.12.04.2)

Attachments: Text File doseq-bench.txt     Text File doseq.patch     File script.clj    
Patch: Code
Approval: Incomplete


Important Perf Note the new impl is faster for collections that are custom-reducible but not chunked, and is also faster for large numbers of bindings. The original implementation is hand tuned for chunked collections, and wins for larger chunked coll/smaller binding count scenarios, presumably due to the fn call/return tracking overhead of reduce. Details are in the comments.
Screened By
Patch doseq.patch

user=> (def a1 (range 10))
user=> (doseq [x1 a1 x2 a1 x3 a1 x4 a1 x5 a1 x6 a1 x7 a1 x8 a1] (do))
CompilerException java.lang.ClassFormatError: Invalid method Code length 69883 in class file user$eval1032, compiling:(NO_SOURCE_PATH:2:1)

While this example is silly, it's a problem we've hit a couple of times. It's pretty surprising when you have just a couple of lines of code and suddenly you get the code length error.

Comment by Kevin Downey [ 18/Apr/14 12:20 AM ]

reproduces with jdk 1.8.0 and clojure 1.6

Comment by Nicola Mometto [ 22/Apr/14 5:35 PM ]

A potential fix for this is to make doseq generate intermediate fns like `for` does instead of expanding all the code directly.

Comment by Ghadi Shayban [ 25/Jun/14 8:39 PM ]

Existing doseq handles chunked-traversal internally, deciding the
mechanics of traversal for a seq. In addition to possibly conflating
concerns, this is causing a code explosion blowup when more bindings are
added, approx 240 bytes of bytecode per binding (without modifiers).

This approach redefs doseq later in core.clj, after protocol-based
reduce (and other modern conveniences like destructuring.)

It supports the existing :let, :while, and :when modifiers.

New is a stronger assertion that modifiers cannot come before binding
expressions. (Same semantics as let, i.e. left to right)

valid: [x coll :when (foo x)]
invalid: [:when (foo x) x coll]

This implementation does not suffer from the code explosion problem.
About 25 bytes of bytecode + 1 fn per binding.

Implementing this without destructuring was not a party, luckily reduce
is defined later in core.

Comment by Andy Fingerhut [ 26/Jun/14 12:25 AM ]

For anyone reviewing this patch, note that there are already many tests for correct functionality of doseq in file test/clojure/test_clojure/for.clj. It may not be immediately obvious, but every test for 'for' defined with deftest-both is a test for 'for' and also for 'doseq'.

Regarding the current implementation of doseq: it in't simply that it is too many bytes per binding, it is that the code size doubles with each additional binding. See these results, which measures size of the macroexpanded form rather than byte code size, but those two things should be fairly linearly related to each other here:

(defn formsize [form]
  (count (with-out-str (print (macroexpand form)))))

user=> (formsize '(doseq [x (range 10)] (print x)))
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))

Here are results for the same expressions after Ghadi's patch doseq.patch dated June 25 2014:

user=> (formsize '(doseq [x (range 10)] (print x)))
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))

It would be good to see some performance results with and without this patch, too.

Comment by Stuart Halloway [ 28/Jun/14 2:21 PM ]

In the tests below, the new impl is called "doseq2", vs. the original impl "doseq"

(def hund (into [] (range 100)))
(def ten (into [] (range 10)))
(def arr (int-array 100))
(def s "superduper")

;; big seq, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a (range 100000000)])))
;; 1.2 sec

(dotimes [_ 5]
  (time (doseq2 [a (range 100000000)])))
;; 1.8 sec

;; small unchunked reducible, few bindings: doseq2 wins
(dotimes [_ 5]
  (time (doseq [a s b s c s])))
;; 0.5 sec

(dotimes [_ 5]
  (time (doseq2 [a s b s c s])))
;; 0.2 sec

(dotimes [_ 5]
  (time (doseq [a arr b arr c arr])))
;; 40 msec

(dotimes [_ 5]
  (time (doseq2 [a arr b arr c arr])))
;; 8 msec

;; small chunked reducible, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a hund b hund c hund])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a hund b hund c hund])))
;; 8 msec

;; more bindings: doseq2 wins bigger and bigger
(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten ])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten ])))
;; 0.4 msec

(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten e ten])))
;; 18 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten e ten])))
;; 1 msec
Comment by Ghadi Shayban [ 28/Jun/14 6:23 PM ]

Hmm, I cannot reproduce your results.

I'm not sure whether you are testing with lein, on what platform, what jvm opts.

Can we test using this little harness instead directly against clojure.jar? I've attached a the harness and two runs of results (one w/ default heap, the other 3GB w/ G1GC)

I added a medium and small (range) too.

Anecdotally, I see doseq2 outperform in all cases except the small range. Using criterium shows a wider performance gap favoring doseq2.

I pasted the results side by side for easier viewing.

core/doseq                          doseq2
"Elapsed time: 1610.865146 msecs"   "Elapsed time: 2315.427573 msecs"
"Elapsed time: 2561.079069 msecs"   "Elapsed time: 2232.479584 msecs"
"Elapsed time: 2446.674237 msecs"   "Elapsed time: 2234.556301 msecs"
"Elapsed time: 2443.129809 msecs"   "Elapsed time: 2224.302855 msecs"
"Elapsed time: 2456.406103 msecs"   "Elapsed time: 2210.383112 msecs"

;; med range, few bindings:
core/doseq                          doseq2
"Elapsed time: 28.383197 msecs"     "Elapsed time: 31.676448 msecs"
"Elapsed time: 13.908323 msecs"     "Elapsed time: 11.136818 msecs"
"Elapsed time: 18.956345 msecs"     "Elapsed time: 11.137122 msecs"
"Elapsed time: 12.367901 msecs"     "Elapsed time: 11.049121 msecs"
"Elapsed time: 13.449006 msecs"     "Elapsed time: 11.141385 msecs"

;; small range, few bindings:
core/doseq                          doseq2
"Elapsed time: 0.386334 msecs"      "Elapsed time: 0.372388 msecs"
"Elapsed time: 0.10521 msecs"       "Elapsed time: 0.203328 msecs"
"Elapsed time: 0.083378 msecs"      "Elapsed time: 0.179116 msecs"
"Elapsed time: 0.097281 msecs"      "Elapsed time: 0.150563 msecs"
"Elapsed time: 0.095649 msecs"      "Elapsed time: 0.167609 msecs"

;; small unchunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 2.351466 msecs"      "Elapsed time: 2.749858 msecs"
"Elapsed time: 0.755616 msecs"      "Elapsed time: 0.80578 msecs"
"Elapsed time: 0.664072 msecs"      "Elapsed time: 0.661074 msecs"
"Elapsed time: 0.549186 msecs"      "Elapsed time: 0.712239 msecs"
"Elapsed time: 0.551442 msecs"      "Elapsed time: 0.518207 msecs"

core/doseq                          doseq2
"Elapsed time: 95.237101 msecs"     "Elapsed time: 55.3067 msecs"
"Elapsed time: 41.030972 msecs"     "Elapsed time: 30.817747 msecs"
"Elapsed time: 42.107288 msecs"     "Elapsed time: 19.535747 msecs"
"Elapsed time: 41.088291 msecs"     "Elapsed time: 4.099174 msecs"
"Elapsed time: 41.03616 msecs"      "Elapsed time: 4.084832 msecs"

;; small chunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 31.793603 msecs"     "Elapsed time: 40.082492 msecs"
"Elapsed time: 17.302798 msecs"     "Elapsed time: 28.286991 msecs"
"Elapsed time: 17.212189 msecs"     "Elapsed time: 14.897374 msecs"
"Elapsed time: 17.266534 msecs"     "Elapsed time: 10.248547 msecs"
"Elapsed time: 17.227381 msecs"     "Elapsed time: 10.022326 msecs"

;; more bindings:
core/doseq                          doseq2
"Elapsed time: 4.418727 msecs"      "Elapsed time: 2.685198 msecs"
"Elapsed time: 2.421063 msecs"      "Elapsed time: 2.384134 msecs"
"Elapsed time: 2.210393 msecs"      "Elapsed time: 2.341696 msecs"
"Elapsed time: 2.450744 msecs"      "Elapsed time: 2.339638 msecs"
"Elapsed time: 2.223919 msecs"      "Elapsed time: 2.372942 msecs"

core/doseq                          doseq2
"Elapsed time: 28.869393 msecs"     "Elapsed time: 2.997713 msecs"
"Elapsed time: 22.414038 msecs"     "Elapsed time: 1.807955 msecs"
"Elapsed time: 21.913959 msecs"     "Elapsed time: 1.870567 msecs"
"Elapsed time: 22.357315 msecs"     "Elapsed time: 1.904163 msecs"
"Elapsed time: 21.138915 msecs"     "Elapsed time: 1.694175 msecs"
Comment by Ghadi Shayban [ 28/Jun/14 6:47 PM ]

It's good that the benchmarks contain empty doseq bodies in order to isolate the overhead of traversal. However, that represents 0% of actual real-world code.

At least for the first benchmark (large chunked seq), adding in some tiny amount of work did not change results signifantly. Neither for (map str [a])

(range 10000000) =>  (map str [a])
"Elapsed time: 586.822389 msecs"
"Elapsed time: 563.640203 msecs"
"Elapsed time: 369.922975 msecs"
"Elapsed time: 366.164601 msecs"
"Elapsed time: 373.27327 msecs"
"Elapsed time: 419.704021 msecs"
"Elapsed time: 371.065783 msecs"
"Elapsed time: 358.779231 msecs"
"Elapsed time: 363.874448 msecs"
"Elapsed time: 368.059586 msecs"

nor for intrisics like (inc a)

(range 10000000)
"Elapsed time: 317.091849 msecs"
"Elapsed time: 272.360988 msecs"
"Elapsed time: 215.501737 msecs"
"Elapsed time: 206.639181 msecs"
"Elapsed time: 206.883343 msecs"
"Elapsed time: 241.475974 msecs"
"Elapsed time: 193.154832 msecs"
"Elapsed time: 198.757873 msecs"
"Elapsed time: 197.803042 msecs"
"Elapsed time: 200.603786 msecs"

I still see reduce-based doseq ahead of the original, except for small seqs

Comment by Ghadi Shayban [ 04/Aug/14 2:55 PM ]

A form like the following will not work with this patch:

(go (doseq [c chs] (>! c :foo)))

as the go macro doesn't traverse fn boundaries. The only such code I know is core.async/mapcat*, a private fn supporting a fn that is marked deprecated.

Comment by Ghadi Shayban [ 07/Aug/14 2:09 PM ]

I see #'clojure.core/run! was just added, which has a similar limitation

Comment by Rich Hickey [ 29/Aug/14 8:19 AM ]

Please consider Ghadi's feedback, esp re: closures.

Comment by Ghadi Shayban [ 22/Sep/14 4:36 PM ]

The current expansion of a doseq [1] under a go form is less than ideal due to the amount of control flow. 14 states in the state machine vs. 7 with loop/recur

[1] Comparison of macroexpansion of (go ... doseq) vs (go ... loop/recur)

Comment by Nicola Mometto [ 25/Mar/15 6:07 PM ]

Related: CLJ-77

[CLJ-1646] Small filter performance enhancement Created: 19/Jan/15  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Critical
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: performance

Attachments: Text File clj-1646.patch    
Patch: Code
Approval: Triaged


I found when working on other tickets for 1.7 that filter repeats each call to .nth on the chunk.
Reusing the result is a small perf boost for all filter uses.

Timing with criterium quick-bench:

What stock patch applied comment
(into [] (filter odd? (range 1024))) 54.3 µs 50.2 µs 7.5% reduction

Approach: storing the nth value as to avoid double lookup.

Patch: clj-1646.patch
Screened by:

Comment by Michael Blume [ 25/Mar/15 5:09 PM ]

This seems have been rolled into the patch for CLJ-1515 – should it be closed?

Comment by Alex Miller [ 25/Mar/15 5:13 PM ]

I'm not sure yet which patch will be applied for 1515. I will close this if it's included.

[CLJ-1057] Var's .setDynamic does not set :dynamic in metadata Created: 02/Sep/12  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Trivial
Reporter: Brandon Bloom Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None

((juxt (comp :dynamic meta) #(.isDynamic %)) #'*agent*)

Comment by Timothy Baldridge [ 03/Dec/12 9:10 AM ]

This is actually an enhancement as no where in the clojure code is provision made for syncing var's metadata and dynamic state. .isDynamic is the authoritative source, and the calling of .setDynamic is configured by the compiler. If you'd like to see this change, please, feel free to bring it up on clojure-dev for a discussion.

Comment by Nicola Mometto [ 25/Mar/15 4:52 PM ]

Dupe of CLJ-859

Comment by Alex Miller [ 25/Mar/15 5:02 PM ]

dupe of CLJ-859

[CLJ-270] defn-created fns inherit old metadata from the Var they are assigned to Created: 13/Feb/10  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Anonymous Assignee: Rich Hickey
Resolution: Not Reproducible Votes: 0
Labels: None

  • What (small set of) steps will reproduce the problem?

user> (def #^{:foo "bar"} x 5)
user> (meta #'x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_FILE", :line 1, :foo "bar"}
user> (defn x [] 5)
user> (meta #'x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_FILE", :line 1, :arglists ([])}
user> (meta x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_FILE", :line 1, :foo "bar"}

  • What is the expected output? What do you see instead?

I expect (meta #'x) to evaluate to the value of the final (meta x) in the above.

  • What version are you using?

Current master (commit 61202d2ff6925002400a9843e8fbd080f3bef3a5).

  • Was this discussed on the group? If so, please provide a link to the discussion.


Prompted by discussion at


  • Initial attempt at a diagnosis:

I think this is due to DefExpr's eval method binding the Var to init.eval() first and attaching the supplied metadata to the Var later – see Compiler.java lines 341-352. (Note the Var is always already in place when init.eval() is called, regardless of whether it existed prior to the evaluation of the def / defn.) Thus the init expression supplied by defn sees the old (and wrong) metadata on the Var.

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/270
dont-copy-val-metadata-onto-new-var-value-in-defn.patch - https://www.assembla.com/spaces/clojure/documents/dwK4yssayr37y_eJe5d-aX/download/dwK4yssayr37y_eJe5d-aX
0001-set-meta-on-vars-before-evaluating-their-init-see-27.patch - https://www.assembla.com/spaces/clojure/documents/arrhbiAI4r35lQeJe5cbLr/download/arrhbiAI4r35lQeJe5cbLr

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

stu said: [file:dwK4yssayr37y_eJe5d-aX]

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

stu said: The problem happens with defn, but not with def+fn, so I think the original diagnosis is incorrect.

Another stab at diagnosis: The defn macro copies metadata from a var onto its new value, so if you defn a var that already exists, the old var metadata becomes metadata on the new value. I don't know why this would be the right thing to do, and if you eliminate this behavior (see patch) all the Clojure and Contrib tests still pass.

If this is correct, please assign back to me and I will write tests. If this is wrong, please tell me what's going on here so I know how to write the tests.

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: Child association with ticket #363 was added

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: I think the original diagnosis is correct. Note that the behavior differ when def iscompiled:

user=> (def #^{:foo "bar"} x 5)
user=> (let [] (defn x [] 5))
user=> (meta x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_PATH", :line 83, :arglists ([])}
user=> (meta #'x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_PATH", :line 83, :arglists ([])}

See also http://groups.google.com/group/clojure/browse_thread/thread/6afd81896ca368b2#

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: [file:arrhbiAI4r35lQeJe5cbLr]

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: My patch aligns DefExpr.eval with DefExpr.emit (first set meta then eval the init value).

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: Forget it, my current patch is broken.

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: My current patch is broken because of #352, what is the rational for #352?

Comment by Nicola Mometto [ 25/Mar/15 4:51 PM ]

not reproducible, metadata is no longer attached to the fns, could be closed

Comment by Alex Miller [ 25/Mar/15 5:01 PM ]

no longer relevant or reproducible

[CLJ-451] fn literals lack name/arglists/namespace metadata Created: 05/Oct/10  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Anonymous Assignee: Unassigned
Resolution: Declined Votes: 1
Labels: None


I would expect (meta (fn not-so-anonymous [a b c])) to include {:name not-so-anonymous :arglists ([a b c])} alongside line number information and possibly namespace/file as well, but currently it only includes :line.

Comment by Assembla Importer [ 05/Oct/10 12:29 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/451

Comment by Nicola Mometto [ 25/Mar/15 4:48 PM ]

Since metadata is no longer attached by the compiler directly to fns but to vars, I guess this ticket is no longer relevant

Comment by Alex Miller [ 25/Mar/15 4:51 PM ]

see comments

[CLJ-1226] set! of a deftype field using field-access syntax causes ClassCastException Created: 26/Jun/13  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: compiler, deftype, interop

Attachments: Text File 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r.patch     Text File 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r-v2.patch    
Patch: Code and Test


Clojure 1.6.0-master-SNAPSHOT

user=> (defprotocol p (f [_]))
user=> (deftype t [^:unsynchronized-mutable x] p (f [_] (set! (.x _) 1)))
user=> (f (t. 1))
ClassCastException user.t cannot be cast to compile__stub.user.t user.t (NO_SOURCE_FILE:1

After patch:
Clojure 1.6.0-master-SNAPSHOT
user=> (defprotocol p (f [_]))
user=> (deftype t [^:unsynchronized-mutable x] p (f [_] (set! (.x _) 1)))
user=> (f (t. 1))

Patch: 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r-v2.patch

Comment by Nicola Mometto [ 27/Jun/13 5:30 AM ]

This patch offers a better workaround for CLJ-1075, making it possible to write
(deftype foo [^:unsynchronized-mutable x] MutableX (set-x [this v] (try (set! (.x this) v)) v))

Comment by Nicola Mometto [ 25/Mar/15 4:39 PM ]

Updated patch to apply to current master

[CLJ-1115] multi arity into Created: 25/Nov/12  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Trivial
Reporter: Yongqian Li Assignee: Unassigned
Resolution: Declined Votes: 1
Labels: None

Attachments: File multi-arity-into.diff    
Patch: Code and Test


Any reason why into isn't multi arity?

(into to & froms) => (reduce into to froms)

(into #{} [3 3 4] [2 1] ["a"]) looks better than (reduce into #{} [[3 3 4] [2 1] ["a"]])

Comment by Timothy Baldridge [ 27/Nov/12 11:25 AM ]

Seems to be a valid enhancement. I can't see any issues we'd have with it. Vetted.

Comment by Timothy Baldridge [ 29/Nov/12 2:06 PM ]

Added patch & test. This patch retains the old performance characteristics of into in the case that there is only one collection argument. For example: (into [] [1 2 3]) .

Since the multi-arity version will be slightly slower, I opted to provide it as a second body instead of unifying both into a single body. If someone has a problem with this, I can rewrite the patch. At least this way, into won't get slower.

Comment by Rich Hickey [ 09/Dec/12 7:39 AM ]

This is a good example of an idea for an enhancement I haven't approved, and thus is not yet vetted.

Comment by Yongqian Li [ 05/Feb/14 1:16 AM ]

What's the plan for this bug?

Comment by Alex Miller [ 05/Feb/14 11:27 AM ]

@Yongqian This ticket is one of several hundred open enhancements right now. It is hard to predict when it will be considered. Voting (or encouraging others to vote) on an issue will raise it's priority. The list at http://dev.clojure.org/display/community/Screening+Tickets indicates how we prioritize triage.

Comment by Andy Fingerhut [ 06/Aug/14 2:12 PM ]

Patch multi-arity-into.diff dated Nov 29 2012 no longer applied cleanly to latest Clojure master due to commits made earlier today. Those changes add another arity to the function clojure.core/into, so the multi-arity generalization of into seems unlikely to be compatible with those changes.

Comment by Nicola Mometto [ 25/Mar/15 3:33 PM ]

This is probably out of the question now that into has an additional arity for transducers

Comment by Alex Miller [ 25/Mar/15 3:41 PM ]

That is true. Also, you can use the cat transducer to do something like this:

(into #{} cat [[3 3 4] [2 1] ["a"]])

[CLJ-273] def with a function value returns meta {:macro false}, but def itself doesn't have meta Created: 23/Feb/10  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Anonymous Assignee: Rich Hickey
Resolution: Not Reproducible Votes: 0
Labels: None


On the master (1.2) branch, if you create a def with an initial function value, {:macro false} is added to the metadata of the return value for def. However, if you look again at the metadata on the var itself, the {:macro false} is not present! This breaks the use of contrib's defalias when aliasing macros, because the new alias is marked as {:macro false}.

The code below demonstrates the issue, which was introduced in http://github.com/richhickey/clojure/commit/430dd4fa711d0008137d7a82d4b4cd27b6e2d6d1, "metadata for fns."

;; all running on 1.2, DIFF noted in comments

(defmacro foo [])
-> #'user/foo

(meta (def bar (.getRoot #'foo)))
-> {:macro false, :ns #<Namespace user>, :name bar, :file "NO_SOURCE_PATH", :line 83}
;; DIFF: where did that :macro false come from??

(def bar (.getRoot #'foo))
-> #'user/bar

(meta #'bar)
-> {:ns #<Namespace user>, :name bar, :file "NO_SOURCE_PATH", :line 84}
;; LIKE 1.1, but really weird: now the :macro false is gone again!

Comment by Assembla Importer [ 24/Aug/10 9:32 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/273

Comment by Kevin Downey [ 17/Apr/14 10:27 PM ]

this belongs in the isssues for one of the new contrib projects (if defalias ever got moved)

Comment by Alex Miller [ 17/Apr/14 10:35 PM ]

actually, it sounds like defalias is just a place where the issue is observed but the issue is in core. The old contrib/def.clj moved to core.incubator, but defalias did not go with it.

Comment by Nicola Mometto [ 18/Apr/14 9:12 AM ]

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Var.java#L270 this is probably relevant

Comment by Nicola Mometto [ 25/Mar/15 11:03 AM ]

as of 1.7.0-master-SNAPSHOT this is not reproducible (even replacing .getRoot with .getRawRoot)

Comment by Alex Miller [ 25/Mar/15 12:49 PM ]

see comment

[CLJ-1680] quot and rem handle doubles badly Created: 24/Mar/15  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Francis Avila Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: math

Attachments: Text File clj-1680_no_div0.patch    
Patch: Code and Test
Approval: Triaged


quot and rem in the doubles case (where any one of the arguments is a floating point) gives strange results for non-finite arguments:

(quot Double/POSITIVE_INFINITY 2) ; Java: Infinity
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(quot 0 Double/NaN) ; Java: NaN
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(rem Double/POSITIVE_INFINITY 2) ; Java: NaN
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(rem 0 Double/NaN) ; Java: NaN
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(rem 1 Double/POSITIVE_INFINITY) ; The strangest one. Java: 1.0
=> NaN

quot and rem also do divide-by-zero checks for doubles, which is inconsistent with how doubles act for division:

(/ 1.0 0)
=> NaN
(quot 1.0 0) ; Java: NaN
ArithmeticException Divide by zero  clojure.lang.Numbers.quotient (Numbers.java:176)
(rem 1.0 0); Java: NaN
ArithmeticException Divide by zero  clojure.lang.Numbers.remainder (Numbers.java:191)

Attached patch does not address this because I'm not sure if this is intended behavior. There were no tests asserting any of the behavior mentioned above.

Fundamentally the reason for this behavior is that the implementation for quot and rem sometimes (when result if division larger than a long) take a double, coerce it to BigDecimal, then BigInteger, then back to a double. The coersion means it can't handle nonfinite intermediate values. All of this is completely unnecessary, and I think is just leftover detritus from when these methods used to return a boxed integer type (long or BigInteger). That changed at this commit to return primitive doubles but the method body was not refactored extensively enough.

The method bodies should instead be simply:

static public double quotient(double n, double d){
    if(d == 0)
        throw new ArithmeticException("Divide by zero");
    double q = n / d;
    return (q >= 0) ? Math.floor(q) : Math.ceil(q);

static public double remainder(double n, double d){
    if(d == 0)
        throw new ArithmeticException("Divide by zero");
    return n % d;

Which is what the attached patch does. (And I'm not even sure the d==0 check is appropriate.)

Even if exploding on non-finite results is a desirable property of quot and rem, there is no need for the BigDecimal+BigInteger coersion. I can prepare a patch that preserves existing behavior but is more efficient.

More discussion at Clojure dev.

Comment by Francis Avila [ 24/Mar/15 12:55 PM ]

More testing revealed that n % d does not preserve the relation (= n (+ (* d (quot n d)) (rem n d))) as well as (n - d * (quot n d)), which doesn't make sense to me since that is the very relation the spec says % preserves. % is apparently not simply Math.IEEEremainder() with a different quotient rounding.

Test case: (rem 40.0 0.1) == 0.0; 40.0 % 0.1 == 0.0999... (Smaller numerators will still not land at 0 precisely, but land closer than % does.)

Updated patch which rolls back some parts of the simplification to remainder and adds this test case.

[CLJ-1682] clojure.set/intersection occasionally allows non-set arguments. Created: 24/Mar/15  Updated: 24/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Valerie Houseman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs

Approval: Triaged


clojure.set/intersection, by intent and documentation, is meant to be operations between two sets. However, it sometimes allows (and returns correct opreations upon) non-set arguments. This confuses the intention that non-set arguments are not to be used.

Here's an example with Set vs. KeySeq:
If there happens to be an intersection, you'll get a result. This may lead someone coding this to think that's okay, or to not notice they've used an incompatible data type. As soon as the intersection is empty, however, an appropriate type error ensues, albeit by accident because the first argument to clojure.core/disj should be a set.

user=> (require '[clojure.set :refer [intersection]])
user=> (intersection #{:key_1 :key_2} (keys {:key_1 "na"}))   ;This works, but shouldn't
user=> (intersection #{:key_1 :key_2} (keys {:key_3 "na"}))   ;This fails, because intersection assumes the second argument was a Set
ClassCastException clojure.lang.APersistentMap$KeySeq cannot be cast to clojure.lang.IPersistentSet  clojure.core/disj (core.clj:1449)

(disj (keys {:key_1 "na"}) #{:key_1 :key_2})   ;The assumption that intersection made
ClassCastException clojure.lang.APersistentMap$KeySeq cannot be cast to clojure.lang.IPersistentSet  clojure.core/disj (core.clj:1449)

Enforcing type security on a library that's clearly meant for a particular type seems like the responsible thing to do. It prevents buggy code from being unknowingly accepted as correct, until the right data comes along to step on the bear trap.

Comment by Valerie Houseman [ 24/Mar/15 6:11 PM ]

Please reroute to the CLJ project, as I lack access to do so - my bad.

Comment by Andy Fingerhut [ 24/Mar/15 7:19 PM ]

CLJ-810 was similar, except it was for function clojure.set/difference. That one was declined with the comment "set/difference's behavior is not documented if you don't pass in a set." I do not know what core team will judge ought to be done with this ticket, but wanted to provide some history.

Dynalint [1] and I think perhaps Dire [2] can be used to add dynamic argument checking to core functions.

[1] https://github.com/frenchy64/dynalint
[2] https://github.com/MichaelDrogalis/dire

Comment by Alex Miller [ 24/Mar/15 9:00 PM ]

Now that `set` is faster for sets, I think we could actually add checking for sets in some places where we might not have before. So, it's worth looking at with fresh eyes.

[CLJ-1237] reduce gives a SO for pathological seqs Created: 27/Jul/13  Updated: 24/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Gary Fredericks Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: None


Attachments: Text File CLJ-1237c.patch    
Patch: Code and Test


reduce gives a StackOverflowError on long sequences that contain many transitions between chunked and unchunked:

(->> (repeat 50000 (cons :x [:y]))
     (apply concat)
     (reduce (constantly nil)))
;; throws StackOverflowError

Such a sequence is well behaved under most other sequence operations, and its underlying structure can even be masked such that reduce succeeds:

(->> (repeat 50000 (cons :x [:y]))
     (apply concat)
     (take 10000000)
     (reduce (constantly nil)))
;; => nil

I don't think Clojure developers normally worry about mixing chunked and unchunked seqs, so the existence of such a sequence is not at all unreasonable (and indeed this happened to me at work and was very difficult to debug).

It seems obvious what causes this given the implementation of reduce – it bounces back and forth between the chunked impl and the unchunked impl, consuming more and more stack as it goes. Without proper tail call optimization, it's not obvious to me what a good fix would be.

Presumed bad solutions

Degrade to naive impl after first chunk

In the IChunkedSeq implementation, instead of calling internal-reduce when the
sequence stops being chunked, it could have an (inlined?) unoptimized implementation,
ensuring that no further stack space is taken up. This retains the behavior that a
generic seq with a chunked tail will still run in an optimized fashion, but a seq with
two chunked portions would only be optimized the first time.

Use clojure.core/trampoline

This would presumably work, but requires wrapping the normal return values from all
implementations of internal-reduce.

Proposed Solution

(attached as CLJ-1237c.patch)

Similar to using trampoline, but create a special type (Unreduced) that signals
an implementation change. The two implementation-change points in internal-reduce
(in the IChunkedSeq impl and the Object impl) are converted to return an instance
of Unreduced instead of a direct call to internal-reduce.

Then seq-reduce is converted to check for instances of Unreduced before returning,
and recurs if it finds one.


  • Only requires one additional check in most cases
  • Reduces stack usage for existing heterogeneous reductions that weren't extreme enough to crash
  • Should be compatible with 3rd-party implementations of internal-reduce, which can still use the old style (direct recursive calls to internal-reduce) or the optimized style if desired.


  • internal-reduce is slightly more complicated
  • There's an extra check at the end of seq-reduce – is that a performance worry?

Comment by Gary Fredericks [ 25/Aug/13 4:13 PM ]

Added patch.

Comment by Gary Fredericks [ 24/Mar/15 11:33 AM ]

My coworker just ran into this, independently.

[CLJ-1679] Add fast path in seq comparison for structurally sharing seqs Created: 21/Mar/15  Updated: 23/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: collections, seq

Attachments: Text File 0001-CLJ-1679-do-pointer-checks-in-seq-equality.patch     Text File CLJ-1679-v2.patch     Text File CLJ-1679-v3.patch    
Patch: Code and Test


Currently comparing two non identical seqs requires iterating through both seqs comparing value by value, ignoring the possibility of seq `a` and `b` having the same (pointer-equal) rest.

The proposed patch adds a pointer equality check on the seq tails that can make the equality short-circuit if the test returns true, which is helpful when comparing large (or possibly infinite) seqs that share a common subseq.

After this patch, comparisons like

(let [x (range)] (= x (cons 0 (rest x))))
which currently don't terminate, return true in constant time.

Patch: CLJ-1679-v3.patch

Comment by Michael Blume [ 23/Mar/15 12:52 PM ]

When this test fails (it fails on my master, but I've got a bunch of other development patches, I'm still figuring out where the conflict is), it fails by hanging forever. Maybe it'd be better to check equality in a future and time out the future?

Comment by Michael Blume [ 23/Mar/15 1:01 PM ]

like so =)

Comment by Nicola Mometto [ 23/Mar/15 1:11 PM ]

Makes sense, thanks for the updated patch

Comment by Michael Blume [ 23/Mar/15 1:24 PM ]

Hm, previous patch had a problem where the reporting logic still tries to force the sequence and OOMs, this patch prevents that.

Comment by Michael Blume [ 23/Mar/15 1:41 PM ]

ok, looks like CLJ-1515, CLJ-1603, and this patch, all combine to fail together, though any two of them work fine.

Comment by Michael Blume [ 23/Mar/15 1:43 PM ]

(And really there's nothing wrong with the source of this patch, it still works nicely to short-circuit = where there's structural sharing, it's just that the other two patches break structural sharing for ranges, so the test fails)

Comment by Nicola Mometto [ 23/Mar/15 2:02 PM ]

I see, I guess we'll have to change the test if the patches for those tickets get applied.

Generated at Sat Mar 28 04:27:54 CDT 2015 using JIRA 4.4#649-r158309.