<< Back to previous view

[CLJ-1684] Transducer eduction test is wrong Created: 27/Mar/15  Updated: 31/Mar/15  Resolved: 31/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: regression, transducers

Attachments: Text File clj-1684.patch    
Patch: Code and Test
Approval: Ok

 Description   

Error in build:

[java] ERROR in (seq-and-transducer) (TransformerIterator.java:86)
         [java] Uncaught exception, not in assertion.
         [java] expected: nil
         [java]   actual: java.lang.NullPointerException: null
         [java]  at clojure.lang.TransformerIterator.step (TransformerIterator.java:86)
         [java]     clojure.lang.TransformerIterator.hasNext (TransformerIterator.java:97)
         [java]     clojure.lang.RT.chunkIteratorSeq (RT.java:489)
         [java]     clojure.lang.RT.seqFrom (RT.java:518)
         [java]     clojure.lang.RT.seq (RT.java:509)
         [java]     clojure.core/seq (core.clj:135)
         [java]     clojure.core$print_sequential.invoke (core_print.clj:47)
         [java]     clojure.core/fn (core.clj:7362)
         [java]     clojure.lang.MultiFn.invoke (MultiFn.java:233)
         [java]     clojure.core$pr_on.invoke (core.clj:3549)
         [java]     clojure.core$pr.invoke (core.clj:3561)
         [java]     clojure.pprint$pprint_simple_default.invoke (dispatch.clj:144)

There is an error in the generative transducer eduction test that was added in CLJ-1669. This code in transducers.clj:

(apply eduction (into actions [coll]))

is not invoking eduction with actions and a collection. Rather it is putting the actions inside the collection and effectively using no transformation at all as in {{(eduction [1 2])}}, which always passes. The error seen is due to a bad function being passed - if actions happens to be nil (which I think is happening while shrinking another failure), something like (eduction (constantly nil) []) is being called, which we would not expect to work in the first place.

After fixing the bad eduction handling, I was only able to trigger failures with a high number of iterations and a very large number of transformations. The errors reported under these conditions are difficult to understand, I believe because they are hitting StackOverflow errors and the stack traces are being removed by the JVM.

I did some more investigation into whether we are actually generating useful generative tests and found that due to the large stack of transformations, virtually all tests were just producing exceptions, rather than more interesting behavior. I capped the number of transformations to 5 and saw much more useful and interesting tests being generated. I've also doubled the number of transducer tests being run in the patch and ran it locally with a much higher number with no failures.

Patch: clj-1684.patch






[CLJ-1602] vals and keys return values should implement IReduceInit or Iterable Created: 25/Nov/14  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1602-2.diff     File clj-1602-3.diff     File clj-1602-4.diff     File clj-1602-5.diff     File clj-1602-6.diff     File clj-1602.diff    
Patch: Code and Test
Approval: Ok

 Description   

Background: clojure.core/keys calls RT.keys(Object) calls APersistentMap.KeySeq.create(ISeq). RT.keys() creates a sequence (of Map.Entry objects) and KeySeq just wraps it, calling .keyValue(). There is an equivalent vals -> RT.vals() -> ValSeq path. Both of these seq impls extend ASeq and provide Iterable implementations via SeqIterator (iterator wrapped over the seq).

Approach: The important thing here is to avoid creating the sequence and instead directly iterate/reduce over the map. Noting that CLJ-1499 provides support for making PHM directly Iterable and that KeySeq/ValSeq already implement Iterable, I chose to focus on making the instances returned from keys and vals support Iterable directly on the underlying map instead of via the seq.

RT.keys()/vals() created the seq and passed it to KeySeq/ValSeq which made it too late to directly cover the original map iterator. There are a few places that rely on passing a seq of Map.Entry to keys/vals (not just a map instance), so I check for IPersistentMap and in that case pass it directly to a new KeySeq factory method that remembers both the Iterable and the ISeq. There is also support in here for using a direct key/value iterator via IMapIterable as introduced in CLJ-1499.

Questions/notes:

  • Could potentially check for Map or Iterable instead of IPersistentMap in RT.keys()/vals(). Not sure how common it is to pass normal Java maps to keys/vals.
  • The direct Iterable support vanishes once you move off the head of the keys or vals seq. So (rest (keys map)) does not have Iterable support. This is not really possible unless you hold an Iterator and advance it along with the seq, but that seemed to introduce all sorts of possibilities for badness. Since maps are unordered, it seems weird to rely on any ordering or processing only parts of any map, so I suspect doing this would be quite rare.
  • This patch depends on CLJ-1499 for IMapIterable.

Performance: I tested perf using criterium to benchmark as follows:

(use 'criterium.core)
(def m (zipmap (range 1000) (range 1000)))
(bench (reduce + (keys m)))
(bench (reduce + (vals m)))
(bench expr) 1.6.0 1.7.0-alpha5 alpha5 + clj-1499 + patch
(reduce + (keys m)) 69 µs 73 µs 44 µs
(reduce + (vals m)) 75 µs 77 µs 50 µs

Tests:

  • I added some basic tests for subseq and rsubseq as those both rely on the somewhat special behavior of keys accepting a seqable of Map.Entry objects (not just a map itself). There were no other tests for subseq or rsubseq already present.
  • Some coverage from CLJ-1499 tests d760db
  • Show that metadata works on keys/vals
  • Show that iterator works correctly after you step off the head of the seq, invalidating the iterable
  • Show that the fallthrough branch of iterator works (from PersistentTreeMap)

Patch: clj-1602-6.diff

Screened by:

  • the changes in CollReduce are unlikely to stand but are currently essential


 Comments   
Comment by Alex Miller [ 25/Nov/14 11:53 AM ]

Could leverage CLJ-1499 for the bulk of this, may pull that back from 1.8 into 1.7. Waiting on further work till that's answered.

Comment by Alex Miller [ 03/Dec/14 11:24 PM ]

I also have a patch that extends the CLJ-1499 iterators to support providing both key and val iterators that do not require creating and unpacking a Map.Entry. Unfortunately I only saw times that were ~48 µs on the perf benchmark in the description, so it's not a huge benefit (short-lived object allocation is cheap).

Comment by Alex Miller [ 09/Jan/15 9:15 AM ]

waiting on 1499 mods

Comment by Alex Miller [ 14/Jan/15 5:42 PM ]

Updated patch based on latest CLJ-1499-v10 patch.

Comment by Michael Blume [ 15/Jan/15 1:48 AM ]

I find it concerning that the v9 patch passed generative tests, shouldn't we have seen that?

Comment by Alex Miller [ 15/Jan/15 7:21 AM ]

Actually I did see it in the generative tests for this ticket so I moved those tests into the CLJ-1499 patch.

Comment by Michael Blume [ 15/Jan/15 10:10 AM ]

Ah, cool =)

Comment by Stuart Halloway [ 22/Mar/15 8:16 AM ]

clj-1602-5.diff merely updates -4 to apply on current master

Comment by Stuart Halloway [ 22/Mar/15 8:37 AM ]

requested some tests in the description

Comment by Alex Miller [ 23/Mar/15 1:51 PM ]

Updated to -6 patch that adds the requested tests.





[CLJ-1681] reflection warning throws NPE for literal nil arg Created: 24/Mar/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Michael Blume Assignee: Michael Blume
Resolution: Completed Votes: 0
Labels: regression, typehints

Attachments: Text File clj-1681-v1.patch     Text File clj-1681-v2.patch     Text File clj-1681-v3.patch    
Patch: Code and Test
Approval: Ok

 Description   

Seen on another project, but reproducible with this:

user=> (set! *warn-on-reflection* true)
true
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
user=> (pst *e)
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:21:13)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6740)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyze (Compiler.java:6485)
	clojure.lang.Compiler$BodyExpr$Parser.parse (Compiler.java:5861)
	clojure.lang.Compiler$FnMethod.parse (Compiler.java:5296)
	clojure.lang.Compiler$FnExpr.parse (Compiler.java:3925)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6731)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
Caused by:
NullPointerException
	clojure.lang.Compiler.getTypeStringForArgs (Compiler.java:2440)
	clojure.lang.Compiler$InstanceMethodExpr.<init> (Compiler.java:1490)
	clojure.lang.Compiler$HostExpr$Parser.parse (Compiler.java:1000)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6733)
	clojure.lang.Compiler.analyze (Compiler.java:6524)

Cause: Regression in Clojure 1.6+ from patch for CLJ-1248. In printing the reflection warning message, if the argument's java class is null, an NPE results.

Approach: Check for null before getting class name.

Patch: clj-1681-v3.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 24/Mar/15 3:54 PM ]

Patch welcome.

Comment by Alex Miller [ 24/Mar/15 4:31 PM ]

Need test in the patch and having a simple way to repro in the ticket would be great.

Comment by Nicola Mometto [ 25/Mar/15 3:46 PM ]

here's a repro:

Clojure 1.7.0-master-SNAPSHOT
user=> (set! *warn-on-reflection* true)
true
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
Comment by Michael Blume [ 25/Mar/15 3:49 PM ]

Thanks! I was having some trouble with that.

Comment by Nicola Mometto [ 25/Mar/15 3:54 PM ]

Michael, me and Alex were discussing this ticket in IRC and Alex was proposing replacing the if expression with an implementation like:

(arg.hasJavaClass() && arg.getJavaClass() != null) ? arg.getJavaClass().getName() : "unknown"
Comment by Michael Blume [ 25/Mar/15 3:58 PM ]

Hm, that makes sense, I was thinking print "nil" instead of "unknown" but I guess the fact that the user is passing nil doesn't actually tell us the expected class of the argument (apart from that it's not a primitive)





[CLJ-1603] cycle, iterate, repeat return vals should IReduceInit Created: 25/Nov/14  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Stuart Halloway
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File clj-1603-10.patch     Text File clj-1603-11.patch     Text File clj-1603-12-2.patch     Text File clj-1603-12.patch     Text File clj-1603-13.patch     Text File clj-1603-14.patch     Text File clj-1603-15.patch     Text File clj-1603-2.patch     Text File clj-1603-3-2.patch     Text File clj-1603-3-3.patch     Text File clj-1603-3-4.patch     Text File clj-1603-3.patch     Text File clj-1603-4.patch     Text File clj-1603-5.patch     Text File clj-1603-6.patch     Text File clj-1603-7.patch     Text File clj-1603-8.patch     Text File clj-1603-9.patch     Text File clj-1603.patch    
Patch: Code and Test
Approval: Ok

 Description   

Screened

clj-1603-15.patch (Java approach)

Alternatives:

There were a number of possible approaches for these enhancements:
1) Straight Java impl. The benefit of this approach is improving the performance of both the seq and reduce paths at the expense of writing a bunch of Java. See clj-1603-15.patch for the best of these impls.

2) Clojure deftype. This required moving cycle and iterate and providing a repeat1 implementation until deftype is defined (similar to the approach with reduce1). The deftype version returns a Seqable, IReduce object and has effectively the same former implementation for seq with a new fast implementation for IReduce. This makes reduce paths fast, but leaves seq paths about the same, with the benefit of no new Java code. The downside was that the new seqs did not implement the full range of interfaces from prior impls, which could potentially break code. See clj-1603-12-2.patch for a patch that covers this.

3) Add Iterable or IReduceInit directly to LazySeq. Conceptually, this does not make sense for general lazy seqs. Seqs materialize and cache each value once and doing this along with the ability to iterate/reduce introduces issues with caching (might as well use seqs for that) and synchronization. However, doing this for just these specific lazy seqs is possible - see latest patch.

Approach: Latest patch makes LazySeq implement IReduce and internal macro reducible-lazy-seq lets callers supply a function to implement the reduce path.

A few things to note:

  • Added some example-based tests for iterate, cycle, and repeat where I thought they were needed. Did not add generative tests - not clear to me what these would be that would actually be valuable. All of these functions are pretty simple and the examples cover the special cases.

Performance:

Some example timing, all in µs:

Expression 1.6.0 1.7.0-alpha5 master + clj-1603-15 (Java) master + clj-1603-12-2 (deftype) master + clj-1603-14 (split)
(doall (take 1000 (repeat 1))) 87 93 63 89 92
(into [] (take 1000) (repeat 1)) n/a 67 25 27 33
(doall (repeat 1000 1)) 87 94 16 94 89
(into [] (repeat 1000 1)) 99 110 13 12 12
(reduce + 0 (repeat 1000 1)) 99 126 20 22 25
(into [] (take 1000) (repeat 1)) n/a 67 28 33 27
(doall (take 1000 (cycle [1 2 3]))) 101 106 85 108 103
(into [] (take 1000) (cycle [1 2 3])) n/a 73 38 45 44
(doall (take 1000 (iterate inc 0))) 93 98 75 123 116
(into [] (take 1000) (iterate inc 0)) n/a 85 38 40 39

Notes on timings above:

  • All reduce timings (with into) comparable across the impls and significantly better than the current behavior over seqs.
  • The Java impl is faster across the board with doall. doall repeatedly calls seq() and next() to walk the sequence. The Java class versions of Repeat, Cycle, and Iterate extend ASeq and seq() just return this. next() constructs and returns a new instance of the class, which is immutable. In the lazy seq versions, LazySeq is mutable and requires synchronization and handling the caching safely. So, simple immutable instance ftw here.
  • The Java finite repeat has an extra benefit from using a primitive long for the counter.
  • One performance difference that's not visible in the timings is that the Java implementations have the benefit of being both seqs and reducibles as they are traversed, so you can always get a fast reduce. The deftype and split impls are only reducible at the initial instance, walking off that initial head reverts to lazy seqs that are not quickly reducible.

Patch: The two patches in leading contention are:

  • clj-1603-15.patch - for the Java version
  • clj-1603-14.patch - for the split impl

Alex opinion: I have swung back and forth on this but my current recommendation is for the Java implementation (clj-1603-15.patch). It's faster for both seqs and reduce, both in the timings above and importantly in maintaining reducibility as they are traversed. There is more Java, but I've made my peace with that - the code maximally leverages existing ASeq infrastructure and the implementation is easy to understand.



 Comments   
Comment by Ghadi Shayban [ 25/Nov/14 11:01 AM ]

Stu, do you intend these to be in Java or Clojure? It could be trickier to implement in Clojure directly, as loading would have to be deferred until core_deftype loads. It's certainly tractable without breaking any backwards compatibility, and I've explored this while experimenting with Range as a deftype https://github.com/ghadishayban/clojure/commit/906cd6ed4d7c624dc4e553373dabfd57550eeff2

A macro to help with Seq&List participation could be certainly useful, as efficiently being both a Seq/List and IReduceInit isn't a party.

May be useful to list requirements for protocol/iface participation.

It seems like 'repeatedly' is another missing link in the IReduceInit story.

Rich mentioned the future integration of reduce-kv at the conj, it would also be useful to know how that could fit in.

---- Other concerns and ops that may belong better on the mailing list ----

In experimenting with more reducible sources, I put out a tiny repo (github.com/ghadishayban/reducers) a couple weeks ago that includes some sources and operations. The sources were CollReduce and not ISeq.

Relatedly, caching the hashcode as a Java `transient` field is not supported when implementing a collection using deftype (patch w/ test in CLJ-1573).

Sources:
Iterate was one of them https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51
Repeatedly https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51

Reduce/transduce-based Operations that accept transducers:
some, any, yield-first https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L52-L80
(any could use a better name, equiv to (first (filter...)))
some and any have a symmetry like filter/remove.

Novelty maybe for 1.8:
A transducible context for Iterables similar to LazyTransformer:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L157-L161

The unless-reduced macro was very useful in implementing the collections:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L7-L15
It is different than the ensure-reduced and unreduced functions in core.

Comment by Alex Miller [ 25/Nov/14 12:01 PM ]

When we discussed this in the past, it was in the vein of reusing some of the range work (in Java) to implement cycle and iterate (per CLJ-1515).

Comment by Ghadi Shayban [ 25/Nov/14 9:20 PM ]

Never mind about 'repeatedly'. Being both ISeq and IReduceInit for repeatedly doesn't make sense for something that relies on side-effects. Current users of repeatedly can reduce over it many times and only realize the elements once.

Comment by Alex Miller [ 05/Dec/14 11:17 PM ]

attached wip Java impl and posted some example timings

Comment by Ghadi Shayban [ 11/Dec/14 4:35 PM ]

NB iterate in this patch does not cache the realized ISeq, but recalcs it at every call to realize the tail. This is not a change in the promised behavior (docstring says "f must be side-effect free") but an implementation change, as worth noting in the changelog.

Comment by Stuart Halloway [ 02/Jan/15 1:32 PM ]

It looks like all the reduce with no inital value paths are still seq-y, and slower, as shown by e.g.

(dotimes [_ 10]
  (time
   (reduce
    +
    (repeat 10000000 1))))

(dotimes [_ 20]
  (time
   (reduce
    +
    0
    (repeat 10000000 1))))
Comment by Alex Miller [ 02/Jan/15 2:01 PM ]

On that example in master before / after patch I see:

before:

  • no init = 844 ms
  • with init = 920 ms

after:

  • no init = 124 ms
  • with init = 90 ms

Is that similar to what you see or not?

Comment by Alex Miller [ 02/Jan/15 4:21 PM ]

The clj-1603-3.patch has been updated to use effectively the same algorithm for both versions of reduce. With the -3 patch, I got ~96 ms on both examples in the prior comment. I re-ran the tests in the description and updated those as well (about the same as expected).

Comment by Stuart Halloway [ 16/Jan/15 1:18 PM ]

The tests do not seem to hit the unseeded reduce branches – do we even want these branches? The original ticket was for IReduceInit.

Comment by Michael Blume [ 18/Jan/15 1:48 PM ]

Probably worth noting – Git will happily apply the latest patch for CLJ-1603 on top of the latest patch for CLJ-1515, but the result does not compile because 1515 uses iterate and 1603 moves the definition of iterate lower in clojure.core. Not sure if this is worth fixing now or just noting for when they're actually applied.

Comment by Michael Blume [ 18/Jan/15 1:52 PM ]

Actually, here, this just moves the declare statement further up the file.

Comment by Michael Blume [ 18/Jan/15 2:19 PM ]

OK, no, the two patches are still incompatible even with the declaration order fixed:

[java] ERROR in (test-range) (LongRange.java:95)
     [java] expected: (= (take 3 (range 3 9 0)) (quote (3 3 3)))
     [java]   actual: java.lang.ClassCastException: clojure.core.InfiniteRepeat cannot be cast to clojure.lang.ISeq
     [java]  at clojure.lang.LongRange.create (LongRange.java:95)
Comment by Alex Miller [ 18/Jan/15 2:31 PM ]

The 1515 patch is actually being reworked right now - we will patch things up at application time if needed.

Comment by Alex Miller [ 19/Jan/15 10:12 AM ]

Removing screened marking so can be re-screened. Added new -7 patch that handles print-method, print-dup, and unmapping the deftype constructors so they're not visible. Thanks to Ghadi in CLJ-1515 for the idea on those.

Comment by Ghadi Shayban [ 20/Jan/15 8:08 AM ]

Review of -7 patch:

Seqable/seq implementations that return a separate ISeq like these do should forward a call to seq on the result, like eduction does. [1] (This is not necessary in these particular impls, as the LazySeqs returned are themselves ISeqs. Also because Cycle's deftype is only constructed for non-empty cycles, the fact that there is a guaranteed seq is implicit. Probably a best practice to add an innocuous seq call if users look to these impls as a recipe.)

The performance regression in (doall (repeat 1000 1)) should go away completely with the dorun tweak in CLJ-1515. This is because dorun is effectively calling seq twice (it calls seq, throws away the result, then calls next.)

minor nits
1) repeat1 seems to be identical to repeat-seq and has both arities necessary for the deftypes
2) inside FiniteRepeat s/(> i 0)/(pos? i) also inside the repeat constructor
3) some things are defn- , some are ^:private
4) Cycle/reduce the recur binding can be (recur rr (or (next s) coll)) rather than nil? check

[1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7324

Comment by Alex Miller [ 22/Jan/15 10:34 PM ]

Ghadi - good comments! Fixed 1,2,4. #3 - ^:private is because defn- is not yet defined. New -8 patch.

Comment by Alex Miller [ 23/Jan/15 10:03 AM ]

Bah.

user=> (= (repeat 5 :a) (repeat 5 :a))
false
Comment by Alex Miller [ 23/Jan/15 3:04 PM ]

Updated to -9 patch that handles hash and equality for finite repeat case.

Comment by Ghadi Shayban [ 26/Jan/15 2:24 PM ]

metadata in the wrong place on #'repeat1

Comment by Alex Miller [ 26/Jan/15 3:27 PM ]

Thanks, fixed in -10.

Comment by Stuart Halloway [ 20/Feb/15 10:19 AM ]

The collections returned by these APIs promise several things the new deftypes do not deliver. For example, 1.6's repeat currently has the following ancestors that are lost in the propopsed deftype:

  • clojure.lang.ISeq
  • java.io.Serializable
  • clojure.lang.IPending
  • java.util.Collection
  • java.util.List
  • java.lang.Iterable
  • clojure.lang.Obj
  • clojure.lang.IPersistentCollection
  • clojure.lang.IMeta
  • clojure.lang.IObj

Losing metadata and serializability are certainly regressions, other stuff maybe as well. I suspect similar problems in all the other API returns.

We could improve testing by taking advantage of the property-checking fns already in test-clojure/data-structures, e.g. is-same-collection

Comment by Alex Miller [ 20/Feb/15 10:42 AM ]

I think we should consider carefully what is contractually required by the return of repeat. My opinion is that repeat must return something seqable, not literally a seq (or lazy seq). With that perspective, ISeq, IPending (there re lazy seq laziness), Collection, List, Iterable, and IPersistentCollection are non-essential. Obj is a concrete class and we shouldn't commit to that.

  • IObj is a gap, this should work but doesn't: (with-meta (repeat 1 :a) {:a 1})
  • IMeta doesn't need to be added, will never have meta right now and meta handles this correctly
  • Serializable - doesn't make sense for the infinite seqs but should probably fix for finite range
Comment by Alex Miller [ 20/Feb/15 11:03 AM ]

Added -11 patch that adds IObj for all the new deftypes and Serializable for FiniteRepeat. Still need to add more tests.

Comment by Stuart Halloway [ 20/Feb/15 12:27 PM ]

I think all the java. interfaces are mandatory for interop. Leaving out any one of the clojure. interfaces creates observable change in behavior composing with other core fns (admittedly the IPending case would be bizarre, who uses that?), so those seem mandatory too.

Agreed Obj should be irrelevant, and if it isn't the bug is elsewhere.

Comment by Alex Miller [ 27/Feb/15 9:41 AM ]

pending further discussion w/ stu

Comment by Alex Miller [ 13/Mar/15 7:02 AM ]

for a clean run and more explanation of the perf numbers

Comment by Alex Miller [ 17/Mar/15 4:23 PM ]

refreshed numbers in table

Comment by Alex Miller [ 18/Mar/15 9:24 AM ]

Refreshed perf numbers and older patch variants, added some more explanation of the perf numbers for comparison.

Comment by Stuart Halloway [ 22/Mar/15 7:50 AM ]

Screened clj-1603-3-3.patch

Comment by Nicola Mometto [ 23/Mar/15 2:00 PM ]

Shouldn't iterate cache its next slot? the current implementation recalculates it every time:

user=> (def coll (iterate #(do (println %) (inc %)) 0))
#'user/coll
user=> (nth coll 3)
0
1
2
3
user=> (nth coll 3)
0
1
2
3

I realize the docstring for iterate explicitely warns that f must be side-effect free but this is just for demonstration purposes, imagine that f is a computationally-heavy function and you'll understand why I don't think the new proposed behaviour is desiderable.

Comment by Nicola Mometto [ 23/Mar/15 2:26 PM ]

I attached clj-1603-3-4.patch which is the same as clj-1603-3-3.patch except it caches the next slot of iterate.

Here is the diff between -3-3 and -3-4 for ease of review:

diff --git a/src/jvm/clojure/lang/Iterate.java b/src/jvm/clojure/lang/Iterate.java
index f23ddca..aeef998 100644
--- a/src/jvm/clojure/lang/Iterate.java
+++ b/src/jvm/clojure/lang/Iterate.java
@@ -16,6 +16,7 @@ public class Iterate extends ASeq implements IReduce {
 
 private final IFn f;      // never null
 private final Object seed;
+private volatile ISeq _next;
 
 private Iterate(IFn f, Object seed){
     this.f = f;
@@ -37,7 +38,14 @@ public Object first(){
 }
 
 public ISeq next(){
-    return new Iterate(f, f.invoke(seed));
+    ISeq ret = _next;
+    if (ret == null)
+        synchronized(this) {
+            ret = _next;
+            if (ret == null)
+                _next = ret = new Iterate(f, f.invoke(seed));
+        }
+    return ret;
 }
 
 public Iterate withMeta(IPersistentMap meta){
Comment by Alex Miller [ 23/Mar/15 3:32 PM ]

It's a good question. That impl kills all of the performance improvement for iterate on the seq use case (prob the current common case). I'll take a look at it though.

Comment by Nicola Mometto [ 23/Mar/15 3:57 PM ]

My opinion is that the performance improvement for iterate that the benchmark shows won't reflect a real performance improvement in "real" usage cases of iterate where `f` actually does some computation

Comment by Nicola Mometto [ 23/Mar/15 4:04 PM ]

This is also true for its reduce impl btw.
I'm sorry if I'm pointing this out late in the life of this ticket but it never occurred to me that, contrary to cycle, repeat or range (CLJ-1515), this change for iterate will possibly make accessing large sequences slower since we lose the caching aspect of lazy-seqs.

Comment by Alex Miller [ 23/Mar/15 4:58 PM ]

Added new -15 patch, which is the same as -3-3 but caches _next for all three types. This has no impact on the reduce performance and was 2-8 us slower in the timings for the seqs above. That seems like a small hit for the reduction in perf and memory on all cases where the seq is traversed multiple times. That seems like a reasonably common thing to do, particularly for finite repeat.

Comment by Alex Miller [ 23/Mar/15 5:02 PM ]

Stu - the only change in -15 vs -3-3 is in the next() methods of each new type. Each has a new "private volatile ISeq _next;"

Cycle:

public ISeq next(){
    if(_next == null) {
        ISeq next = current.next();
        if (next != null)
            _next = new Cycle(all, next);
        else
            _next = new Cycle(all, all);
    }
    return _next;
}

Iterate:

public ISeq next(){
    if(_next == null) {
        _next = new Iterate(f, f.invoke(seed));
    }
    return _next;
}

Repeat:

public ISeq next() {
    if(_next == null) {
        if(count > 1)
            _next = new Repeat(count - 1, val);
        else if(count == INFINITE)
            _next = this;
    }
    return _next;
}
Comment by Michael Blume [ 23/Mar/15 6:47 PM ]

For Cycle, would it be worth holding the head and linking back to it when we loop, so that the underlying sequence only has to be traversed once? Something like this

https://github.com/MichaelBlume/clojure/commit/cache-cycle-head

On my machine this saves about 10 microseconds on

(doall (take 1000 (cycle [1 2 3])))

Comment by Michael Blume [ 23/Mar/15 7:07 PM ]

Also, for the Cycle reduce implementations, it might make sense to assume the underlying sequence is going to be traversed many times, copy it into an array, and loop over it with an index.

Comment by Michael Blume [ 23/Mar/15 7:17 PM ]

like so: https://github.com/MichaelBlume/clojure/commit/array-loop

Comment by Alex Miller [ 23/Mar/15 8:28 PM ]

The problem with holding a pointer back to the head is that the intervening cycle can be arbitrarily large. You are then literally holding the head on an arbitrary size seq. A silly example that fails with the patch but works without it (depending on your jvm size):

(dorun (take 10000000 (cycle (range 10000000))))

I played with the cycle reduce too but it has the same effective potential issue of creating and holding an arbitrarily sized array. Actually this one is worse in that it realizes an arbitrarily large array before knowing how many inputs it will need - it could be a take 1. Also, things that formerly worked could blow up with an oome. What if someone relies on passing an infinite sequence to cycle in a fallthrough case? For example, this works with the current version but would blow up with the array.

(into [] (take 1000) (cycle (iterate inc 0)))

Neither of these seem worth the risk to me.

Comment by Michael Blume [ 23/Mar/15 10:53 PM ]

Yeah, fair enough.

Comment by Alex Miller [ 25/Mar/15 4:14 PM ]

updated version of the -15 patch that gives proper credit to Ghadi for part.





[CLJ-1683] test-reduce doesn't catch off-by-one error in IReduce Created: 25/Mar/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: None

Attachments: Text File clj-1683-v1.patch    
Patch: Code and Test
Approval: Ok

 Description   

To implement the no-init arity of reduce, you have to use the first element of the sequence as an init value, and then you have to skip the first element when you iterate through the sequence. The tests in test-reduce, which focus on using reduce to sum up a bunch of numbers in a range, don't actually pin down this behavior, because the range used starts with zero, and an extra zero doesn't affect the sum.

Screened by: Alex Miller - this is a good tweak to get better tests for the hard this easy to mess up reduce behavior.






[CLJ-1667] Socket test can fail if hard-coded port is unavailable Created: 26/Feb/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Completed Votes: 0
Labels: io, test

Attachments: Text File socket-test.patch    
Patch: Code
Approval: Ok

 Description   

I was unable to run the Clojure tests due to this problem. There is a test that hardcodes a port and something else on my machine happened to be using that port.

The patch avoids binding a hard-coded port in the test.

Patch: socket-test.patch

Screened by:



 Comments   
Comment by Andy Fingerhut [ 26/Feb/15 11:31 AM ]

I used to try running the prescreen tests in parallel for two different JDKs on the same machine, and I probably stopped doing that because of this. My use case is a very unusual one, and not a good reason to change this by itself, but my use case certainly made this conflict happen regularly.

Comment by Alex Miller [ 26/Feb/15 11:57 AM ]

No good reason not to fix it! Silly test.





Generated at Wed Apr 01 04:31:13 CDT 2015 using JIRA 4.4#649-r158309.