<< Back to previous view

[CLJ-1713] Range is not serializable Created: 21/Apr/15  Updated: 24/Apr/15  Resolved: 24/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: regression

Attachments: Text File clj-1713.patch    
Patch: Code and Test
Approval: Ok

 Description   

Range is not serializable once it starts getting chunked.

(import [java.io ByteArrayOutputStream ObjectOutputStream])

(defn- serialize
  "Serializes a single object, returning a byte array."
  [v]
  (with-open [bout (ByteArrayOutputStream.)
              oos (ObjectOutputStream. bout)]
    (.writeObject oos v)
    (.flush oos)
    (.toByteArray bout)))

(serialize (range 10))  ;; works fine
;;=> #object["[B" 0x71c14b1d "[B@71c14b1d"]

(def r (range 50))
(nth r 35)  ;; 35
(serialize r)
;;=> NotSerializableException clojure.lang.LongRange$LongChunk  java.io.ObjectOutputStream.writeObject0 (ObjectOutputStream.java:1181)

Cause: LongChunk is not serializable.

Approach: Make it serializable.

Patch: clj-1713.patch



 Comments   
Comment by Fogus [ 24/Apr/15 9:03 AM ]

This patch couldn't be more trivial. Screened and ready to apply.





[CLJ-1711] Hash map and StructMap cannot be conjoined using "into" Created: 21/Apr/15  Updated: 24/Apr/15  Resolved: 24/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Immo Heikkinen Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: regression
Environment:

Clojure 1.7.0-beta1


Attachments: Text File clj-1711-fix-structmap-iterator.patch    
Patch: Code and Test
Approval: Ok

 Description   
user=> (defstruct foo :bar)
#'user/foo
user=> (into {} (struct-map foo :bar 1))

IllegalArgumentException Don't know how to create ISeq from: clojure.lang.Keyword  clojure.lang.RT.seqFrom (RT.java:528)

In Clojure 1.6 this returns {:bar 1} as expected.



 Comments   
Comment by Alex Miller [ 21/Apr/15 8:43 AM ]

Thanks for the report - looks like an issue with the iter/reduce path for structmaps which changed in 1.7. Another example yielding same error:

(reduce (fn [acc i] (inc acc)) 0 struct-map)
Comment by Immo Heikkinen [ 22/Apr/15 12:42 AM ]

This turned out to be pretty simple to fix so I gave it a go.

Comment by Alex Miller [ 22/Apr/15 10:51 AM ]

Thanks! I'll take a look at it tomorrow!

Comment by Alex Miller [ 23/Apr/15 11:16 AM ]

Looks great, thanks.





[CLJ-1709] Incorrect range contents and count with step != 1 Created: 18/Apr/15  Updated: 24/Apr/15  Resolved: 24/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Nelson Morris Assignee: Devin Walters
Resolution: Completed Votes: 0
Labels: regression
Environment:

clojure 1.7.0-beta1


Attachments: Text File CLJ-1709-1710.patch     Text File clj-1709-2.patch    
Patch: Code and Test
Approval: Ok

 Description   

From https://groups.google.com/d/msg/clojure/dweCm6Bd-vc/atritH--xUEJ.

(range 0 11 2)
;;=> (0 2 4 6 8)
;;Expected: (0 2 4 6 8 10)
(count (range 0 11 2))
;;=> 5
;;Expected: 6

Cause: absCount method in LongRange is not computing count correctly.

Approach: Fix computation in LongRange.absCount() and add a generative test that compares the long and non-long versions of range produce the same ranges.

Screened by: Alex Miller, also verified that generative test failed without the fix.

Patch: clj-1709-2.patch



 Comments   
Comment by Devin Walters [ 18/Apr/15 7:10 PM ]

See attached patch and http://dev.clojure.org/jira/browse/CLJ-1710.

Comment by Alex Miller [ 19/Apr/15 9:07 AM ]

This seems to be related to `clojure.lang.LongRange`. Assuming that, here is a test.check property for being equal to clojure.lang.Range:

(def longrange equals-range               
  (prop/for-all [start gen/int                                                                                                                                            
                 end gen/int                                                                                                                                              
                 step (gen/such-that #(> % 0)                                                                                                                             
                                     gen/nat)]                                                                                                                            
                (= (clojure.lang.Range/create start end step)                                                                                                             
                   (clojure.lang.LongRange/create start end step))))                                                                                                      
                                                                                                                                                                          
(tc/quick-check 100 longrange-equals-range)                                                                                                                               
{:result false, :seed 1429392529363, :failing-size 15, :num-tests 16, :fail [0 15 7], :shrunk {:total-nodes-visited 22, :depth 5, :result false, :smallest [0 4 3]}}
Comment by Alex Miller [ 19/Apr/15 9:08 AM ]

Can we add the test.check property to the patch please? Clojure uses test.check already so this dependency is already taken care of.

Comment by Devin Walters [ 19/Apr/15 4:35 PM ]

@Alex, sure. It looks like transducers is not taking advantage of clojure.test.check's clojure.test integration. Specifically, the use of defspec. Is there a good reason why this is so?

This is what it'd look like:

(defspec longrange-equals-range 100
  (prop/for-all [start gen/int
                 end gen/int
                 step (gen/such-that #(> % 0)
                                     gen/nat)]
                (= (clojure.lang.Range/create start end step)
                   (clojure.lang.LongRange/create start end step))))

When building:

[java] Testing clojure.test-clojure.sequences
[java] {:result true, :num-tests 100, :seed 1429478867534, :test-var longrange-equals-range}

Is this alright? Please advise.

Comment by Devin Walters [ 19/Apr/15 4:44 PM ]

@Alex, I went ahead and did it the way I mentioned above.

Added an updated patch to include generative test to verify LongRange is the same as Range.

Comment by Michael Blume [ 20/Apr/15 12:05 AM ]

@Devin, this may be off-topic, but I'm pretty sure I'm responsible for the lack of defspec in the transducers tests you're talking about, and the reason is simply that I couldn't find a way with defspec to get the clarity of the reporting to the level I wanted.

Comment by Michael Blume [ 20/Apr/15 12:06 AM ]

See CLJ-1621

Comment by Michael Blume [ 20/Apr/15 12:11 AM ]

(I've been meaning for ages to come up with a proposal for test.check itself that would handle that sort of reporting for the user, but never came up with anything I liked enough)

Comment by Devin Walters [ 22/Apr/15 7:47 PM ]

Hey Michael, thanks for the background. I discussed this briefly with Alex at Clojure/West. By the sound of it, using defspec here should be fine. He mentioned he'd take a peek at it.

Comment by Steve Miner [ 23/Apr/15 12:36 PM ]

Regarding the test code, you could use gen/s-pos-int instead of (gen/such-that #(> % 0) gen/nat).

Comment by Alex Miller [ 24/Apr/15 8:20 AM ]

Tweaked one line in the test per Steve's suggestion.

Comment by Nelson Morris [ 24/Apr/15 8:40 AM ]

Since this is still open to suggestions, if I were to write the spec over again I'd probably use (such-that #(not= 0 %) gen/int) for the step. A negative step will produce non-empty lists when end < start, and those might be worth checking. The case to avoid is one that makes an infinite range like (repeat 0 1 0), due to the equality check in the property. A more complicated generator could probably avoid that without a such-that filter, but not sure it would be worth the effort in this case.

Comment by Alex Miller [ 24/Apr/15 9:02 AM ]

Well Rich has now ok'ed it so would prefer no more changes to this patch.





[CLJ-1703] Pretty print #error data Created: 14/Apr/15  Updated: 24/Apr/15  Resolved: 24/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: print

Attachments: Text File clj-1703-2.patch     Text File clj-1703.patch    
Patch: Code
Approval: Ok

 Description   

Some of the work we were doing re socket repls got pushed out but it would still be nice to expose the pretty printed #error formatting as the current version is very hard to read.

1.7.0-beta1:

user=> *99

CompilerException java.lang.RuntimeException: Unable to resolve symbol: *99 in this context, compiling:(/private/var/folders/7r/_1fj0f517rgcxwx79mn79mfc0000gn/T/form-init8696775124159270468.clj:1:1263)
user=> (prn *e)
#error{:cause "Unable to resolve symbol: *99 in this context", :via [{:type clojure.lang.Compiler$CompilerException, :message "java.lang.RuntimeException: Unable to resolve symbol: *99 in this context, compiling:(/private/var/folders/7r/_1fj0f517rgcxwx79mn79mfc0000gn/T/form-init8696775124159270468.clj:1:1263)", :at [clojure.lang.Compiler analyze "Compiler.java" 6543]} {:type java.lang.RuntimeException, :message "Unable to resolve symbol: *99 in this context", :at [clojure.lang.Util runtimeException "Util.java" 221]}], :trace [[clojure.lang.Util runtimeException "Util.java" 221] [clojure.lang.Compiler resolveIn "Compiler.java" 7029] [clojure.lang.Compiler resolve "Compiler.java" 6973] [clojure.lang.Compiler analyzeSymbol "Compiler.java" 6934] [clojure.lang.Compiler analyze "Compiler.java" 6506] [clojure.lang.Compiler analyze "Compiler.java" 6485] [clojure.lang.Compiler eval "Compiler.java" 6796] [clojure.lang.Compiler eval "Compiler.java" 6755] [clojure.core$eval invoke "core.clj" 3082] [clojure.main$repl$read_eval_print__7057$fn__7060 invoke "main.clj" 240] [clojure.main$repl$read_eval_print__7057 invoke "main.clj" 240] [clojure.main$repl$fn__7066 invoke "main.clj" 258] [clojure.main$repl doInvoke "main.clj" 258] [clojure.lang.RestFn invoke "RestFn.java" 1523] [clojure.tools.nrepl.middleware.interruptible_eval$evaluate$fn__599 invoke "interruptible_eval.clj" 53] [clojure.lang.AFn applyToHelper "AFn.java" 152] [clojure.lang.AFn applyTo "AFn.java" 144] [clojure.core$apply invoke "core.clj" 628] [clojure.core$with_bindings_STAR_ doInvoke "core.clj" 1866] [clojure.lang.RestFn invoke "RestFn.java" 425] [clojure.tools.nrepl.middleware.interruptible_eval$evaluate invoke "interruptible_eval.clj" 51] [clojure.tools.nrepl.middleware.interruptible_eval$interruptible_eval$fn__641$fn__644 invoke "interruptible_eval.clj" 183] [clojure.tools.nrepl.middleware.interruptible_eval$run_next$fn__634 invoke "interruptible_eval.clj" 152] [clojure.lang.AFn run "AFn.java" 22] [java.util.concurrent.ThreadPoolExecutor runWorker "ThreadPoolExecutor.java" 1142] [java.util.concurrent.ThreadPoolExecutor$Worker run "ThreadPoolExecutor.java" 617] [java.lang.Thread run "Thread.java" 744]]}

Approach: Do some minimal formatting of the #error data, should be pretty close to (pprint (Throwable->map *e)).

w/patch:

user=> (prn *e)
#error {
 :cause "Unable to resolve symbol: *99 in this context"
 :via
 [{:type clojure.lang.Compiler$CompilerException
   :message "java.lang.RuntimeException: Unable to resolve symbol: *99 in this context, compiling:(NO_SOURCE_PATH:0:0)"
   :at [clojure.lang.Compiler analyze "Compiler.java" 6543]}
  {:type java.lang.RuntimeException
   :message "Unable to resolve symbol: *99 in this context"
   :at [clojure.lang.Util runtimeException "Util.java" 221]}]
 :trace
 [[clojure.lang.Util runtimeException "Util.java" 221]
  [clojure.lang.Compiler resolveIn "Compiler.java" 7029]
  [clojure.lang.Compiler resolve "Compiler.java" 6973]
  [clojure.lang.Compiler analyzeSymbol "Compiler.java" 6934]
  [clojure.lang.Compiler analyze "Compiler.java" 6506]
  [clojure.lang.Compiler analyze "Compiler.java" 6485]
  [clojure.lang.Compiler eval "Compiler.java" 6796]
  [clojure.lang.Compiler eval "Compiler.java" 6755]
  [clojure.core$eval invoke "core.clj" 3079]
  [clojure.main$repl$read_eval_print__7093$fn__7096 invoke "main.clj" 240]
  [clojure.main$repl$read_eval_print__7093 invoke "main.clj" 240]
  [clojure.main$repl$fn__7102 invoke "main.clj" 258]
  [clojure.main$repl doInvoke "main.clj" 258]
  [clojure.lang.RestFn invoke "RestFn.java" 421]
  [clojure.main$repl_opt invoke "main.clj" 324]
  [clojure.main$main doInvoke "main.clj" 422]
  [clojure.lang.RestFn invoke "RestFn.java" 397]
  [clojure.lang.Var invoke "Var.java" 375]
  [clojure.lang.AFn applyToHelper "AFn.java" 152]
  [clojure.lang.Var applyTo "Var.java" 700]
  [clojure.main main "main.java" 37]]}

Patch: clj-1703-2.patch



 Comments   
Comment by Rich Hickey [ 17/Apr/15 9:48 AM ]

Can we please put the kv pairs of via each on their own line?





[CLJ-1700] REPL evaluation of conditional reader forms fails Created: 12/Apr/15  Updated: 24/Apr/15  Resolved: 24/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Luke VanderHart Assignee: Unassigned
Resolution: Completed Votes: 3
Labels: reader
Environment:

1.7-beta1


Attachments: Text File clj-1700.patch    
Patch: Code
Approval: Ok

 Description   

When using reader conditionals, evaluating a reader conditional in a vanilla command-line REPL (not nRepl or anything like that) results in a "Conditional read not allowed" error message.

Loading the whole file with load-file works as expected.

This breaks the very normal workflow of eval-ing forms from a *.cljc file in a Clojure repl using (e.g.) inferior lisp.

Approach: clojure.main/repl (also used by swank I think) enables reader conditionals at the REPL.

Patch: clj-1700.patch



 Comments   
Comment by Colin Jones [ 21/Apr/15 12:53 AM ]

Looks/works great for me - I quite literally wrote the exact same patch before talking w/ Luke today about this.





[CLJ-1699] data_readers hard coded to .clj extension, should be extended to .cljc Created: 10/Apr/15  Updated: 24/Apr/15  Resolved: 24/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: David Nolen Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader

Attachments: Text File clj-1699.patch    
Patch: Code
Approval: Ok

 Description   

Currently using data_readers in ClojureScript is difficult because the extensions are not available to both compile time and runtime as they are in Clojure. This is fairly straightforward to remedy now given the presence of conditional reading - simply supply data_readers.cljc.

Approach: Find and read both data_readers.clj and data_readers.cljc. For cljc, allow reader conditionals.

Alternative: Another option would be to just allow reader conditionals on the existing data_readers.clj file. That's a simpler patch but possibly confusing given that conditionals are only available in .cljc files right now.

Patch: clj-1699.patch - tested with a variety of manual tests



 Comments   
Comment by David Nolen [ 10/Apr/15 4:23 PM ]

This could be solved trivially by concatenated data_readers.cljc resources to the return value of clojure.core/data-reader-urls.





[CLJ-1698] conditional reading bugs Created: 10/Apr/15  Updated: 24/Apr/15  Resolved: 24/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: David Nolen Assignee: Unassigned
Resolution: Completed Votes: 3
Labels: reader

Attachments: Text File 0001-CLJ-1698-fix-conditional-reading-bugs.patch    
Patch: Code and Test
Approval: Ok

 Description   

Bugs in conditional reading:

(ns bug)

#?(:cljs {'a 1 'b 2})

#?(:cljs (let [{{b :b} :a {d :d} :c} {}]))

Requiring / loading this file at the REPL results in the following exception:

CompilerException java.lang.IllegalArgumentException: Duplicate key: null, compiling:
	clojure.lang.Compiler.load (Compiler.java:7244)
	clojure.lang.RT.loadResourceScript (RT.java:371)
	clojure.lang.RT.loadResourceScript (RT.java:362)
	clojure.lang.RT.load (RT.java:446)
	clojure.lang.RT.load (RT.java:412)
	clojure.core/load/fn--5427 (core.clj:5862)
	clojure.core/load (core.clj:5861)
	clojure.core/load-one (core.clj:5667)
	clojure.core/load-lib/fn--5376 (core.clj:5707)
	clojure.core/load-lib (core.clj:5706)
	clojure.core/apply (core.clj:630)
	clojure.core/load-libs (core.clj:5745)
Caused by:
IllegalArgumentException Duplicate key: null
	clojure.lang.PersistentArrayMap.createWithCheck (PersistentArrayMap.java:70)
	clojure.lang.RT.map (RT.java:1537)
	clojure.lang.LispReader$MapReader.invoke (LispReader.java:1152)
	clojure.lang.LispReader.read (LispReader.java:255)
	clojure.lang.LispReader.access$800 (LispReader.java:40)
	clojure.lang.LispReader$ConditionalReader.readCondDelimited (LispReader.java:1376)
	clojure.lang.LispReader$ConditionalReader.invoke (LispReader.java:1448)
	clojure.lang.LispReader$DispatchReader.invoke (LispReader.java:684)
	clojure.lang.LispReader.read (LispReader.java:255)
	clojure.lang.LispReader.readDelimitedList (LispReader.java:1191)
	clojure.lang.LispReader$ListReader.invoke (LispReader.java:1040)
	clojure.lang.LispReader.read (LispReader.java:255)
(ns bug)

(def m #?(:cljs ^{:a :b} {}
          :clj  ^{:a :b} {}))
CompilerException java.lang.IllegalArgumentException: Metadata must be Symbol,Keyword,String or Map, compiling:(bug.cljc:3:25)
	clojure.lang.Compiler.load (Compiler.java:7244)
	clojure.lang.RT.loadResourceScript (RT.java:371)
	clojure.lang.RT.loadResourceScript (RT.java:362)
	clojure.lang.RT.load (RT.java:446)
	clojure.lang.RT.load (RT.java:412)
	clojure.core/load/fn--5427 (core.clj:5862)
	clojure.core/load (core.clj:5861)
	clojure.core/load-one (core.clj:5667)
	clojure.core/load-lib/fn--5376 (core.clj:5707)
	clojure.core/load-lib (core.clj:5706)
	clojure.core/apply (core.clj:630)
	clojure.core/load-libs (core.clj:5745)
Caused by:
IllegalArgumentException Metadata must be Symbol,Keyword,String or Map
	clojure.lang.LispReader$MetaReader.invoke (LispReader.java:790)
	clojure.lang.LispReader.read (LispReader.java:255)
	clojure.lang.LispReader.access$800 (LispReader.java:40)
	clojure.lang.LispReader$ConditionalReader.readCondDelimited (LispReader.java:1376)
	clojure.lang.LispReader$ConditionalReader.invoke (LispReader.java:1448)
	clojure.lang.LispReader$DispatchReader.invoke (LispReader.java:684)
	clojure.lang.LispReader.read (LispReader.java:255)
	clojure.lang.LispReader.readDelimitedList (LispReader.java:1191)
	clojure.lang.LispReader$ListReader.invoke (LispReader.java:1040)
	clojure.lang.LispReader.read (LispReader.java:255)
	clojure.lang.LispReader.read (LispReader.java:195)
	clojure.lang.Compiler.load (Compiler.java:7232)
(ns bug)

#?(:cljs {:a #_:b :c})
CompilerException java.lang.RuntimeException: Map literal must contain an even number of forms

Cause: Not properly handling suppress-read flag.

Patch: 0001-CLJ-1698-fix-conditional-reading-bugs.patch



 Comments   
Comment by Alex Miller [ 11/Apr/15 5:49 AM ]

For Clojure REPL repro:

(def opts {:features #{:clj} :read-cond :allow})
(read-string opts "#?(:cljs {'a 1 'b 2})")
(read-string opts "#?(:cljs (let [{{b :b} :a {d :d} :c} {}]))")
(read-string opts "(def m #?(:cljs ^{:a :b} {} :clj  ^{:a :b} {}))")
(read-string opts "(def m #?(:cljs ^{:a :b} {} :clj ^{:a :b} {}))")
(read-string opts "#?(:cljs {:a #_:b :c}")




[CLJ-1695] Variadic vector-of overload has poor performance Created: 03/Apr/15  Updated: 10/Apr/15  Resolved: 10/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Leon Grapenthin Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: performance

Attachments: Text File clj-1695-2.patch     Text File clj-1695.patch    
Patch: Code
Approval: Ok

 Description   
(time (do (apply vector-of :double (repeat 100000 0))
                nil))

Times after a few runs ~335 ms.

(time (do (into (vector-of :double) (repeat 100000 0))
                nil))

Times after a few runs ~5 ms.

Cause: The variadic case for vector-of is missing a type hint and uses reflection - this will be seen in any call to vector-of with more than 4 elements.

Approach: Switch interop call to instead use conj - there is no reason to be using interop here over standard conj on a vector. The after time for the first case above is 6-9 ms depending on GC (the fast reduce path in repeat reduces gc variability I think).

Patch: clj-1695-2.patch



 Comments   
Comment by Leon Grapenthin [ 03/Apr/15 4:57 PM ]

I thought seqs were passed pretty much "as is" with apply. E.g.:

(time (do (apply (fn [& args]
(into (vector-of :double) args))
(repeat 100000 0))
nil))

performs as fast as (into (vector-of :double) (repeat 100000 0)

Comment by Alex Miller [ 06/Apr/15 9:51 AM ]

I'm going to see if we can include this in 1.7, no guarantees. For now a workaround is any use of vector-of that creates, then fills the vector separately as you are doing with into.

The into case actually will get an additional benefit in 1.7 for sources that can reduce themselves faster (repeat happens to be one of those).

Comment by Michael Blume [ 06/Apr/15 11:02 AM ]

If we're adding a type-hint anyway, wouldn't it be more effective to type-hint the return value of vector-of?

Comment by Alex Miller [ 06/Apr/15 11:10 AM ]

I went back-and-forth on that but it didn't seem like that any other code could benefit from the return type hint?

Re-thinking, maybe the hint should actually be more generic and hint to clojure.lang.IPersistentCollection instead.

Comment by Michael Blume [ 06/Apr/15 11:15 AM ]

Actually I'm wrong, type-hinting the return of vector-of (as either Vec or IPersistentCollection) doesn't remove the reflection.

Comment by Alex Miller [ 06/Apr/15 11:37 AM ]

Attached new patch with more generic typehint, placed more specifically where it's needed.

Comment by Alex Miller [ 10/Apr/15 9:52 AM ]

Switched interop call to just





[CLJ-1694] cycle is too eager Created: 03/Apr/15  Updated: 10/Apr/15  Resolved: 10/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: regression

Attachments: Text File clj-1694.patch    
Patch: Code and Test
Approval: Ok

 Description   

Same as CLJ-1692, cycle realizes one element ahead in next(), which is different than before CLJ-1603 was applied:

(take 2 (cycle (map #(/ 42 %) '(2 1 0))))

Approach: Delay realization until first is called.

Patch: clj-1694.patch



 Comments   
Comment by Alex Miller [ 03/Apr/15 2:56 PM ]

Actually, the behavior here is no different vs before.

(take 2 (cycle (map #(/ 42 %) (range 31 -1 -1))))
;; exception

(take 2 (cycle (map #(/ 42 %) (range 32 -1 -1))))
;; works

Just a side effect of chunking and afaict not any different than before.

Comment by Nicola Mometto [ 03/Apr/15 3:21 PM ]

The regression does exist, just a bad example.
Here's a valid one:

Clojure 1.7.0-master-SNAPSHOT
user=> (take 2 (cycle (map #(/ 42 %) '(2 1 0))))
ArithmeticException Divide by zero  clojure.lang.Numbers.divide (Numbers.java:158)
Clojure 1.6.0
user=> (take 2 (cycle (map #(/ 42 %) '(2 1 0))))
(21 42)




[CLJ-1692] Iterate is too eager Created: 01/Apr/15  Updated: 10/Apr/15  Resolved: 10/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Ambrose Bonnaire-Sergeant Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: regression

Attachments: Text File clj-1692-2.patch     Text File clj-1692-3.patch     Text File clj1692.patch    
Patch: Code and Test
Approval: Ok

 Description   

1.7's iterate calls its function one extra time than in 1.6

;; clojure 1.6
user=> (take 2 (iterate zero? 0))
(0 true)

;; clojure 1.7-alpha6
user=> (take 2 (iterate zero? 0))
ClassCastException java.lang.Boolean cannot be cast to java.lang.Number  clojure.lang.Numbers.isZero (Numbers.java:92)

This is because Iterate.java calls its function in its "next" method rather than in its "first" method.

Approach: Iterate now holds a prevSeed that will allow computation of a lazily computed seed (the first of the iterate sequence). The very first node will be initialized with the initial seed. All other uses of seed need to ensure first is called to force realization before using seed.

Patch: clj-1692-3.patch

  • note there were two copies of test-iterate so one is removed in the test, allowing the more expansive one to be called.


 Comments   
Comment by Alex Miller [ 02/Apr/15 9:31 AM ]

What about something like clj-1692-2.patch instead? Seems simpler to me. Also, why did the first patch remove the iterate tests?

Comment by Nicola Mometto [ 02/Apr/15 9:55 AM ]

Alex, unless I'm reading it wrong, your patch addresses the symptom of this bug rather than its cause (iterate is now eager than it was before), I personally prefer Ambrose's approach.

WRT removing the iterate tests, it looks like test-iterate is duplicated at line 781 and at line 970 of sequences.clj, Ambrose's patch just removes the duplicate test.

Comment by Ambrose Bonnaire-Sergeant [ 02/Apr/15 10:57 AM ]

To be clear, removing the redundant deftest actually reveals several tests that were previously hidden.

Comment by Alex Miller [ 02/Apr/15 12:01 PM ]

Ok, fair points. I renamed some fields in the first patch to (in my opinion) better reflect their role. There were also several bugs related to using seed without guaranteeing it's computation, for example both of these threw exceptions:

(first (next (next (iterate inc 0))))
(into [] (take 3) (next (iterate inc 0)))

I added those tests as well. I re-ran the perf test from CLJ-1603 and there is definitely some degradation on (doall (take 1000 (iterate inc 0))) from 75 us to 85 us, but that's still a significant win over the prior version.

Comment by Fogus [ 03/Apr/15 1:36 PM ]

I think patch-3 is sound, and I suspect that only Ambrose would have caught it. Thanks!

Comment by Ambrose Bonnaire-Sergeant [ 03/Apr/15 1:48 PM ]

This issue was not caught by Typed Clojure, rather the peculiarities of its implementation https://github.com/clojure/core.typed/commit/1027f792accdc3e5da3475745da0106f0ad5e1cc#diff-eec0f851200aca22af17269fcab94719L1759

I was using iterate to dig down a structure, and was being very careful to only `take` exactly enough.





[CLJ-1691] Occasional failure of seq-and-transducer test Created: 01/Apr/15  Updated: 10/Apr/15  Resolved: 10/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Andy Fingerhut Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File build-failure2.txt     Text File clj-1691.patch    
Patch: Code
Approval: Ok

 Description   

I started a 'mvn clean ; mvn test' build loop on a machine yesterday. I saw 2 failures of the test seq-and-transducer in transducers.clj out of 520 runs (and no other failures). The first failure message appears below. The second is in the attachment build-fail2.txt. I haven't yet analyzed them carefully to see whether the generated test is useful.

[java] Testing clojure.test-clojure.transducers
     [java] 
     [java] FAIL in (seq-and-transducer) (transducers.clj:143)
     [java] {:coll [0 0],
     [java]  :actions
     [java]  (->>
     [java]   coll
     [java]   (partition-all 1)
     [java]   (interpose 1)
     [java]   dedupe
     [java]   (remove empty?)
     [java]   (take-while empty?)),
     [java]  :s
     [java]  #error{:cause "Don't know how to create ISeq from: java.lang.Long", :via [{:type java.lang.IllegalArgumentException, :message "Don't know how to create ISeq from: java.lang.Long", :at [clojure.lang.RT seqFrom "RT.java" 528]}], :trace [[clojure.lang.RT seqFrom "RT.java" 528] [clojure.lang.RT seq "RT.java" 509] [clojure.core$seq__4106 invoke "core.clj" 135] [clojure.core$empty_QMARK_ invoke "core.clj" 5946] [clojure.core$complement$fn__4358 invoke "core.clj" 1374] [clojure.core$filter$fn__4558 invoke "core.clj" 2683] [clojure.lang.LazySeq sval "LazySeq.java" 40] [clojure.lang.LazySeq seq "LazySeq.java" 49] [clojure.lang.RT seq "RT.java" 507] [clojure.core$seq__4106 invoke "core.clj" 135] [clojure.core$take_while$fn__4582 invoke "core.clj" 2771] [clojure.lang.LazySeq sval "LazySeq.java" 40] [clojure.lang.LazySeq seq "LazySeq.java" 49] [clojure.lang.RT seq "RT.java" 507] [clojure.core$seq__4106 invoke "core.clj" 135] [clojure.core$dorun invoke "core.clj" 3010] [clojure.core$doall invoke "core.clj" 3026] [clojure.test_clojure.transducers$apply_as_seq invoke "transducers.clj" 82] [clojure.test_clojure.transducers$build_results$fn__26512 invoke "transducers.clj" 115] [clojure.test_clojure.transducers$build_results invoke "transducers.clj" 115] [clojure.test_clojure.transducers$fn__26524 invoke "transducers.clj" 130] [clojure.test.check.rose_tree$fmap invoke "rose_tree.clj" 46] [clojure.core$partial$fn__4505 invoke "core.clj" 2491] [clojure.core$map$fn__4531 invoke "core.clj" 2622] [clojure.lang.LazySeq sval "LazySeq.java" 40] [clojure.lang.LazySeq seq "LazySeq.java" 49] [clojure.lang.RT seq "RT.java" 507] [clojure.core$seq__4106 invoke "core.clj" 135] [clojure.test.check.rose_tree$permutations$iter__26065__26069$fn__26070$iter__26089__26093$fn__26094 invoke "rose_tree.clj" 73] [clojure.lang.LazySeq sval "LazySeq.java" 40] [clojure.lang.LazySeq seq "LazySeq.java" 49] [clojure.lang.RT seq "RT.java" 507] [clojure.core$seq__4106 invoke "core.clj" 135] [clojure.core$map$fn__4531 invoke "core.clj" 2614] [clojure.lang.LazySeq sval "LazySeq.java" 40] [clojure.lang.LazySeq seq "LazySeq.java" 49] [clojure.lang.RT seq "RT.java" 507] [clojure.core$seq__4106 invoke "core.clj" 135] [clojure.core$map$fn__4531 invoke "core.clj" 2614] [clojure.lang.LazySeq sval "LazySeq.java" 40] [clojure.lang.LazySeq seq "LazySeq.java" 49] [clojure.lang.Cons next "Cons.java" 39] [clojure.lang.RT next "RT.java" 674] [clojure.core$next__4090 invoke "core.clj" 64] [clojure.core$nthnext invoke "core.clj" 3039] [clojure.test.check$shrink_loop invoke "check.clj" 94] [clojure.test.check$failure invoke "check.clj" 121] [clojure.test.check$quick_check doInvoke "check.clj" 65] [clojure.lang.RestFn invoke "RestFn.java" 425] [clojure.test_clojure.transducers$fn__26531 invoke "transducers.clj" 139] [clojure.test$test_var$fn__7628 invoke "test.clj" 704] [clojure.test$test_var invoke "test.clj" 704] [clojure.test$test_vars$fn__7650$fn__7655 invoke "test.clj" 722] [clojure.test$default_fixture invoke "test.clj" 674] [clojure.test$test_vars$fn__7650 invoke "test.clj" 722] [clojure.test$default_fixture invoke "test.clj" 674] [clojure.test$test_vars invoke "test.clj" 718] [clojure.test$test_all_vars invoke "test.clj" 728] [clojure.test$test_ns invoke "test.clj" 747] [clojure.core$map$fn__4531 invoke "core.clj" 2622] [clojure.lang.LazySeq sval "LazySeq.java" 40] [clojure.lang.LazySeq seq "LazySeq.java" 49] [clojure.lang.Cons next "Cons.java" 39] [clojure.lang.RT next "RT.java" 674] [clojure.core$next__4090 invoke "core.clj" 64] [clojure.core$reduce1 invoke "core.clj" 907] [clojure.core$reduce1 invoke "core.clj" 898] [clojure.core$merge_with doInvoke "core.clj" 2937] [clojure.lang.RestFn applyTo "RestFn.java" 139] [clojure.core$apply invoke "core.clj" 630] [clojure.test$run_tests doInvoke "test.clj" 762] [clojure.lang.RestFn applyTo "RestFn.java" 137] [clojure.core$apply invoke "core.clj" 628] [user$eval28346 invoke "run_test.clj" 7] [clojure.lang.Compiler eval "Compiler.java" 6792] [clojure.lang.Compiler load "Compiler.java" 7237] [clojure.lang.Compiler loadFile "Compiler.java" 7175] [clojure.main$load_script invoke "main.clj" 275] [clojure.main$script_opt invoke "main.clj" 337] [clojure.main$main doInvoke "main.clj" 421] [clojure.lang.RestFn invoke "RestFn.java" 408] [clojure.lang.Var invoke "Var.java" 379] [clojure.lang.AFn applyToHelper "AFn.java" 154] [clojure.lang.Var applyTo "Var.java" 700] [clojure.main main "main.java" 37]]},
     [java]  :xs (),
     [java]  :xi [],
     [java]  :xe [],
     [java]  :xt []}
     [java] 
     [java] expected: (:result res)
     [java]   actual: false


 Comments   
Comment by Ghadi Shayban [ 01/Apr/15 9:34 AM ]

Both failures seem like a test script problem: an 'empty?' check after an interpose of a scalar long.

Mildly surprised that these didn't show up earlier.

Comment by Nicola Mometto [ 01/Apr/15 9:48 AM ]

Looks like this is caused by the chunkedness of keep:

;; lists are not chunked
user=> (->> '([1] 1) (keep empty?) first)
()
;; vector seqs are chunked
user=> (->> '[[1] 1] (keep empty?) first)
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long  clojure.lang.RT.seqFrom (RT.java:528)

In the non-chunked example (mimicking how transducers work), the call to (empty? 1) never gets executed since only the first value is required, on the chunked example otoh (empty? 1) gets invoked causing the exception.





[CLJ-1685] :eof option in clojure.core/read not handled properly Created: 29/Mar/15  Updated: 10/Apr/15  Resolved: 10/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Adrian Medina Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: reader

Attachments: Text File 0001-CLJ-1685-correctly-handle-eof-option-in-read-read-st.patch     Text File 0001-CLJ-1685-correctly-handle-eof-option-in-read-read-st-v2.patch     Text File clj-1685-v3.patch    
Patch: Code and Test
Approval: Ok

 Description   

Example form which exhibits the behavior:

(read {:read-cond :allow :eof (Object.)} input)

When EOF is reached in the stream, instead of returning the :eof value specified the boolean value true is always returned instead. If you omit :eof from the option map given to clojure.core/read, false is consistently returned and no EOF error is thrown.
Patch: 0001-CLJ-1685-correctly-handle-eof-option-in-read-read-st-v2.patch

Note: Currently

(read {} stream)
behaves like
(read {:eof nil} stream)
rather than
(read stream)
, the proposed patch makes it believe like
(read {:eof :eofthrow} input)
, the proposed patch changes this so that the default behaviour is always to throw on eof unless a :eof option is explicitly included in the read opts.

Patch: clj-1685-v3.patch

Screened by: Alex Miller



 Comments   
Comment by Nicola Mometto [ 29/Mar/15 2:38 PM ]

Attached patch fixes the issue for both read and read-string

Comment by Andy Fingerhut [ 29/Mar/15 2:47 PM ]

Never try to race Nicola to a patch when he is on the task Thanks, Nicola.

Comment by Nicola Mometto [ 30/Mar/15 8:20 AM ]

Alex, currently calls to read/read-string with an empty options map behave as if {:eof nil} was passed, thus

user=> (read-string "")
RuntimeException EOF while reading  clojure.lang.Util.runtimeException (Util.java:221)
user=> (read-string {} "")
nil

i.e, :eof defaults to nil.
Is this intended? if not, the attached patch 0001-CLJ-1685-correctly-handle-eof-option-in-read-read-st-v2.patch
fixes this issue and changes the behaviour of read/read-string to default to :eofthrow rather than to nil

Comment by Alex Miller [ 06/Apr/15 11:30 AM ]

The -v3 patch is identical to -v2, but adds one clarifying docstring addition.





[CLJ-1684] Transducer eduction test is wrong Created: 27/Mar/15  Updated: 31/Mar/15  Resolved: 31/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: regression, transducers

Attachments: Text File clj-1684.patch    
Patch: Code and Test
Approval: Ok

 Description   

Error in build:

[java] ERROR in (seq-and-transducer) (TransformerIterator.java:86)
         [java] Uncaught exception, not in assertion.
         [java] expected: nil
         [java]   actual: java.lang.NullPointerException: null
         [java]  at clojure.lang.TransformerIterator.step (TransformerIterator.java:86)
         [java]     clojure.lang.TransformerIterator.hasNext (TransformerIterator.java:97)
         [java]     clojure.lang.RT.chunkIteratorSeq (RT.java:489)
         [java]     clojure.lang.RT.seqFrom (RT.java:518)
         [java]     clojure.lang.RT.seq (RT.java:509)
         [java]     clojure.core/seq (core.clj:135)
         [java]     clojure.core$print_sequential.invoke (core_print.clj:47)
         [java]     clojure.core/fn (core.clj:7362)
         [java]     clojure.lang.MultiFn.invoke (MultiFn.java:233)
         [java]     clojure.core$pr_on.invoke (core.clj:3549)
         [java]     clojure.core$pr.invoke (core.clj:3561)
         [java]     clojure.pprint$pprint_simple_default.invoke (dispatch.clj:144)

There is an error in the generative transducer eduction test that was added in CLJ-1669. This code in transducers.clj:

(apply eduction (into actions [coll]))

is not invoking eduction with actions and a collection. Rather it is putting the actions inside the collection and effectively using no transformation at all as in {{(eduction [1 2])}}, which always passes. The error seen is due to a bad function being passed - if actions happens to be nil (which I think is happening while shrinking another failure), something like (eduction (constantly nil) []) is being called, which we would not expect to work in the first place.

After fixing the bad eduction handling, I was only able to trigger failures with a high number of iterations and a very large number of transformations. The errors reported under these conditions are difficult to understand, I believe because they are hitting StackOverflow errors and the stack traces are being removed by the JVM.

I did some more investigation into whether we are actually generating useful generative tests and found that due to the large stack of transformations, virtually all tests were just producing exceptions, rather than more interesting behavior. I capped the number of transformations to 5 and saw much more useful and interesting tests being generated. I've also doubled the number of transducer tests being run in the patch and ran it locally with a much higher number with no failures.

Patch: clj-1684.patch






[CLJ-1683] test-reduce doesn't catch off-by-one error in IReduce Created: 25/Mar/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: None

Attachments: Text File clj-1683-v1.patch    
Patch: Code and Test
Approval: Ok

 Description   

To implement the no-init arity of reduce, you have to use the first element of the sequence as an init value, and then you have to skip the first element when you iterate through the sequence. The tests in test-reduce, which focus on using reduce to sum up a bunch of numbers in a range, don't actually pin down this behavior, because the range used starts with zero, and an extra zero doesn't affect the sum.

Screened by: Alex Miller - this is a good tweak to get better tests for the hard this easy to mess up reduce behavior.






[CLJ-1681] reflection warning throws NPE for literal nil arg Created: 24/Mar/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Michael Blume Assignee: Michael Blume
Resolution: Completed Votes: 0
Labels: regression, typehints

Attachments: Text File clj-1681-v1.patch     Text File clj-1681-v2.patch     Text File clj-1681-v3.patch    
Patch: Code and Test
Approval: Ok

 Description   

Seen on another project, but reproducible with this:

user=> (set! *warn-on-reflection* true)
true
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
user=> (pst *e)
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:21:13)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6740)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyze (Compiler.java:6485)
	clojure.lang.Compiler$BodyExpr$Parser.parse (Compiler.java:5861)
	clojure.lang.Compiler$FnMethod.parse (Compiler.java:5296)
	clojure.lang.Compiler$FnExpr.parse (Compiler.java:3925)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6731)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
Caused by:
NullPointerException
	clojure.lang.Compiler.getTypeStringForArgs (Compiler.java:2440)
	clojure.lang.Compiler$InstanceMethodExpr.<init> (Compiler.java:1490)
	clojure.lang.Compiler$HostExpr$Parser.parse (Compiler.java:1000)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6733)
	clojure.lang.Compiler.analyze (Compiler.java:6524)

Cause: Regression in Clojure 1.6+ from patch for CLJ-1248. In printing the reflection warning message, if the argument's java class is null, an NPE results.

Approach: Check for null before getting class name.

Patch: clj-1681-v3.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 24/Mar/15 3:54 PM ]

Patch welcome.

Comment by Alex Miller [ 24/Mar/15 4:31 PM ]

Need test in the patch and having a simple way to repro in the ticket would be great.

Comment by Nicola Mometto [ 25/Mar/15 3:46 PM ]

here's a repro:

Clojure 1.7.0-master-SNAPSHOT
user=> (set! *warn-on-reflection* true)
true
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
Comment by Michael Blume [ 25/Mar/15 3:49 PM ]

Thanks! I was having some trouble with that.

Comment by Nicola Mometto [ 25/Mar/15 3:54 PM ]

Michael, me and Alex were discussing this ticket in IRC and Alex was proposing replacing the if expression with an implementation like:

(arg.hasJavaClass() && arg.getJavaClass() != null) ? arg.getJavaClass().getName() : "unknown"
Comment by Michael Blume [ 25/Mar/15 3:58 PM ]

Hm, that makes sense, I was thinking print "nil" instead of "unknown" but I guess the fact that the user is passing nil doesn't actually tell us the expected class of the argument (apart from that it's not a primitive)





[CLJ-1677] Add setLineNumber to LineNumberingPushbackReader Created: 17/Mar/15  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader

Attachments: Text File clj-1677.patch    
Patch: Code and Test
Approval: Ok

 Description   

Add setLineNumber() to set line number on underlying LineNumberingReader.






[CLJ-1667] Socket test can fail if hard-coded port is unavailable Created: 26/Feb/15  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Completed Votes: 0
Labels: io, test

Attachments: Text File socket-test.patch    
Patch: Code
Approval: Ok

 Description   

I was unable to run the Clojure tests due to this problem. There is a test that hardcodes a port and something else on my machine happened to be using that port.

The patch avoids binding a hard-coded port in the test.

Patch: socket-test.patch

Screened by:



 Comments   
Comment by Andy Fingerhut [ 26/Feb/15 11:31 AM ]

I used to try running the prescreen tests in parallel for two different JDKs on the same machine, and I probably stopped doing that because of this. My use case is a very unusual one, and not a good reason to change this by itself, but my use case certainly made this conflict happen regularly.

Comment by Alex Miller [ 26/Feb/15 11:57 AM ]

No good reason not to fix it! Silly test.





[CLJ-1663] DynamicClassLoader delegates to parent classloader before checking in its URL list Created: 18/Feb/15  Updated: 20/Feb/15  Resolved: 20/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: classloader, regression

Attachments: Text File 0001-CLJ-1663-delegate-loadClass-to-super-classloader-bef.patch    
Patch: Code
Approval: Ok

 Description   

See Cursive #748. Cursive calls into Leiningen in-process, and before doing that it creates a new DynamicClassLoader which uses an IntelliJ PluginClassLoader as its parent. This is throwing a CNFE, although the URL containing the class is present in the DynamicClassLoader URL list.

Cause: The patch for CLJ-979 added an implementation of loadClass that delegates to "getParent().loadClass()", which in this case delegates to PluginClassLoader.loadClass(). This was incorrect as the current implementation should have been to delegate to the loadClass method of the superclass, which will take care of delegating to the loadClass method of the parent class loader if necessary.

Approach: The proposed patch replaces the call to "getParent().loadClass()" with a call "super.loadClass()" fixing this issue.

Screened by: Alex Miller



 Comments   
Comment by Colin Fleming [ 18/Feb/15 6:25 AM ]

Unfortunately getClassLoadingLock(name) is only available from Java 1.7+ and I am targeting Java 1.6. However reverting that part of the patch to synchronize on "this" as previously does indeed fix the original problem.

Comment by Nicola Mometto [ 18/Feb/15 7:22 AM ]

I'll revert the getClassLoadingLock change then, it was actually out of scope for this ticket.





[CLJ-1642] Add mention of new :warn-on-boxed option to doc string of unchecked-math Var Created: 15/Jan/15  Updated: 20/Feb/15  Resolved: 20/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Andy Fingerhut Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: docstring

Attachments: Text File clj-1642-v1.patch    
Patch: Code
Approval: Ok

 Description   

A small doc string enhancement about the new compiler behavior in Clojure 1.7 when *unchecked-math* is bound to :warn-on-boxed

https://github.com/clojure/clojure/blob/master/changes.md#13-warn-on-boxed-math

Patch: clj-1642-v1.patch
Screened by: Alex Miller



 Comments   
Comment by Andy Fingerhut [ 15/Jan/15 11:51 AM ]

Patch clj-1642-v1.patch dated Jan 15 2014 is one way to document the new :warn-on-boxed behavior.





[CLJ-1638] Regression - PersistentVector.create(List) was removed in 1.7.0-alpha5 Created: 12/Jan/15  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: collections, regression
Environment:

1.7.0-alpha5


Attachments: Text File clj-1638-2.patch     Text File clj-1638.patch    
Patch: Code
Approval: Ok

 Description   

For CLJ-1546, PersistentVector.create(List) was replaced with PersistentVector.create(ArrayList). At least one library (flatland) was calling this method directly and was broken by the change.

Approach: Change create(ArrayList) to more general prior method create(List).

Patch: clj-1638-2.patch

Screened by:



 Comments   
Comment by Rich Hickey [ 20/Feb/15 7:42 AM ]

Is there a good reason to have both PersistentVector.create(List)and PersistentVector.create(ArrayList)?

Comment by Fogus [ 27/Feb/15 9:08 AM ]

This couldn't possibly be more straight-forward.





[CLJ-1637] vec fails on MapEntry Created: 11/Jan/15  Updated: 20/Feb/15  Resolved: 20/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: regression
Environment:

1.7.0-alpha5


Attachments: Text File clj-1637-jdevuyst.patch     Text File clj-1637.patch     Text File clj-1637-with-test.patch    
Patch: Code
Approval: Ok

 Description   

After CLJ-1546:

(vec (first {1 2}))

Cause: (if (vector? coll) (with-meta coll nil) ...) checks that something is IPersistentVector, then sends it to something that takes IObj, so anything that is one but not both throws an error. In Clojure itself, this is the set of classes extending from AMapEntry.

Alternatives:

1. Make AMapEntry implement IObj - this fixes everything in Clojure and keeps the vec code as is but still leaves open this gap for any external implementation of IPersistentVector.
2. Check for this case explicitly in vec. (if (vector? coll) (if (instance? clojure.lang.IObj coll) (with-meta coll nil) (clojure.lang.LazilyPersistentVector/create coll)) ...). Perf testing shows no significant difference in performance with the change.
3. Pull the special check for vector? in vec.
4. Check for this case explicitly in vec and return the same instance if it's not an IObj. See clj-1637-jdevuyst.patch.

Approach: patch takes approach #2

Patch: clj-1637-with-test.patch

Screened by: Stu (also added test)



 Comments   
Comment by Nicola Mometto [ 11/Jan/15 8:19 AM ]

The correct fix is to probably make MapEntry an IObj

Comment by Nicola Mometto [ 11/Jan/15 8:21 AM ]

Actually, making AMapEntry an IObj rather than MapEntry would fix the issue for sorted-map kv-pairs too.

user=> (vec (first (sorted-map 1 1)))
ClassCastException clojure.lang.PersistentTreeMap$BlackVal cannot be cast to clojure.lang.IObj  clojure.core/with-meta--4121 (core.clj:216)
Comment by Alex Miller [ 11/Jan/15 8:35 AM ]

There are potentially a couple ways to fix this. I'll look at it next week.

Comment by Jonas De Vuyst [ 14/Jan/15 6:14 AM ]

Slightly modified patch. In the case where coll is a vector but not an IObj, simply return coll.

Comment by Jonas De Vuyst [ 14/Jan/15 7:17 AM ]

If desired I could also update the patch to fix an analogous—albeit somewhat theoretic—bug in `set`.

It does make me wonder if perhaps a `without-meta` function should be added to `clojure.core`.

I think making AMapEntry an IObj might make sense even after applying the above patches. In `AMapEntry`, perhaps `withMeta(m)` could be implemented as `asVector().withMeta(m)`. This, however, would require changing `asVector()` to return some `IObj ∩ IPersistentVector` type (e.g. `PersistentVector`). This would be straightforward to do, but requires deciding if this change in signature may be propagated to the static methods of `LazilyPersistentVector`.

Comment by Alex Miller [ 14/Jan/15 9:28 AM ]

This is a point of some debate, but the intention with the change in the implementation is to retain current behavior, which always gives you a new vector instance. It's not clear to me that there is any point in attaching meta to map entries (which also does not solve the problem for external IPersistentVector, non-IObj instances outside Clojure).

In any case, I'm going to update the description a bit to add this as an alternative.





[CLJ-1636] SeqIterator can return incorrect results Created: 10/Jan/15  Updated: 18/Jan/15  Resolved: 16/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Blocker
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: regression
Environment:

clojure-1.7.0-alpha5


Attachments: Text File 0001-CLJ-1636-don-t-use-this-as-a-sentinel-in-SeqIterator.patch     Text File 0001-fix-for-CLJ-1636.patch    
Patch: Code
Approval: Ok

 Description   

As of 1.7.0-alpha5, we are seeing SeqIterator return iterated results that do reflect the values of the underlying seq, in particular acting as if the seq contains a nil value when it does not. This problem is intermittent but has at times caused clojure master to fail in compilation (which is why this is marked as a blocker).

Two recent changes during 1.7 have created and exposed this problem:

1) This commit https://github.com/clojure/clojure/commit/43cc1854508d655e58e377f84836ba128971f90c changed the SeqIterator implementation to be lazier and to use "this" as a sentinel object in SeqIterator. (1.7.0-alpha2)
2) CLJ-1546 changed the implementation of vec such that PersistentHashMap and PersistentHashSet are now converted using iterator() rather than seq(). PHS/PHM use SeqIterator for their Iterator implementation. (1.7.0-alpha5)

Because of #2, we are now stressing #1 much more than before. In particular, things like defining defrecords rely heavily on vec (and set) of PHS and PHM.

Example stack trace: https://gist.github.com/puredanger/f56e3253f0668a515ec5 (seen compiling Clojure itself)

Cause: Setting seq==this; in the constructor of SeqIterator is allowing unsafe publication of the partially constructed "this" object, which can cause subtle problems in the hasNext() implementation. In particular, it seems that after inlining, on the first call, the seq==this condition when comparing the cached partially constructed instance in seq and the fully constructed version in this will return false, even though these have the same object identity. This causes the wrong path to be executed in hasNext().

Approach: Do not use this as a sentinel value.

Patch: 0001-CLJ-1636-don-t-use-this-as-a-sentinel-in-SeqIterator.patch

Screened by: Alex Miller



 Comments   
Comment by Colin Jones [ 11/Jan/15 12:40 AM ]

I was able to reproduce this (intermittently) earlier, but I've seen periods of many successful runs in a row (both with that patch reverted and with it in place), so it's been hard for me to trust what I'm seeing locally when it passes. I didn't see any evidence of AOT compilation happening (e.g. no classfiles under `target/`), so I'd have expected the new function `already-compiled?` in CLJ-1544 never to actually run.

It looks like the Cause section of the stacktrace is implicating an error in trying to `(resolve nil)`, where `nil` is an entry in an interfaces collection that should actually be empty. That's based on these two lines (along with the lines higher up in the cause):

...snip...
at clojure.core$set.invoke(core.clj:3944)
at clojure.core$emit_defrecord.invoke(core_deftype.clj:154)
...snip...

(https://github.com/clojure/clojure/blob/3e7cb1a5c840612ad41cf6e0be92480f798bc05d/src/clj/clojure/core_deftype.clj#L154)

The defrecord here in question looks like

(defrecord Foo [x y])

So the `opts+specs` var-arg argument to `defrecord` should be `nil` since there are no entries, which should mean the `interfaces` piece of the `parse-opts+specs` call should return an empty vector.

But that stacktrace confuses me, because it suggests that the `interfaces` vector, instead of being empty, contains a `nil` element. How can this be? Or what misstep have I made in tracing through this?

Comment by Alex Miller [ 11/Jan/15 12:54 AM ]

If the error is intermittent, then my pegging of CLJ-1544 may be wrong. For me, it was repeatable as of clojure commit e5a104e894ed82f244d69513918d570cee5df67d (when CLJ-1544 was applied) and I have not reproduced it prior.

Comment by Nicola Mometto [ 11/Jan/15 4:50 AM ]

Alex, just to be sure – you were able to reproduce this bug with clojure at e5a104e894ed82f244d69513918d570cee5df67d (CLJ-1544) ? I'd like to have confirmation so 9f277c80258b3d2951128ce26a07c30ad0b47af0 (CLJ-979) can be excluded as the culprit

Comment by Alex Miller [ 11/Jan/15 8:09 AM ]

Correct.

Comment by Nicola Mometto [ 11/Jan/15 8:15 AM ]

Well this is weird then.
The only way I can think of that would produce that exception is if this returned nil: https://github.com/clojure/clojure/blob/3e7cb1a5c840612ad41cf6e0be92480f798bc05d/src/clj/clojure/core_deftype.clj#L57

This means a scenario like this:

user=> (defrecord x [] +)
NullPointerException   clojure.lang.Compiler.maybeResolveIn (Compiler.java:7015)
user=> (.printStackTrace *e)
java.lang.NullPointerException, compiling:(NO_SOURCE_FILE:3:1)
	at clojure.lang.Compiler.macroexpand1(Compiler.java:6620)
	at clojure.lang.Compiler.macroexpand(Compiler.java:6678)
	at clojure.lang.Compiler.eval(Compiler.java:6752)
	at clojure.lang.Compiler.eval(Compiler.java:6731)
	at clojure.core$eval.invoke(core.clj:3076)
	at clojure.main$repl$read_eval_print__7044$fn__7047.invoke(main.clj:239)
	at clojure.main$repl$read_eval_print__7044.invoke(main.clj:239)
	at clojure.main$repl$fn__7053.invoke(main.clj:257)
	at clojure.main$repl.doInvoke(main.clj:257)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.main$repl_opt.invoke(main.clj:323)
	at clojure.main$main.doInvoke(main.clj:421)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at clojure.lang.Var.invoke(Var.java:375)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: java.lang.NullPointerException
	at clojure.lang.Compiler.maybeResolveIn(Compiler.java:7015)
	at clojure.core$ns_resolve.invoke(core.clj:4200)
	at clojure.core$ns_resolve.invoke(core.clj:4197)
	at clojure.core$resolve.invoke(core.clj:4206)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:485)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$reduce1.invoke(core.clj:899)
	at clojure.core$set.invoke(core.clj:3944)
	at clojure.core$emit_defrecord.invoke(core_deftype.clj:154)
	at clojure.core$defrecord.doInvoke(core_deftype.clj:374)
	at clojure.lang.RestFn.invoke(RestFn.java:497)
	at clojure.lang.Var.invoke(Var.java:401)
	at clojure.lang.AFn.applyToHelper(AFn.java:171)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.lang.Compiler.macroexpand1(Compiler.java:6607)
	... 16 more
nil

where a var is used as a protocol but no interface is present.

Comment by Nicola Mometto [ 11/Jan/15 8:18 AM ]

Alex, since I cannot reproduce, can you try getting the exception with a patched version of clojure that replaces https://github.com/clojure/clojure/blob/3e7cb1a5c840612ad41cf6e0be92480f798bc05d/src/clj/clojure/core_deftype.clj#L57
with something like

(or (:on (deref (resolve %))) 
    (println % @(resolve %)))

so we can get an idea of what's going on?

Comment by Nicola Mometto [ 11/Jan/15 11:38 AM ]

I was just able to reproduce this issue using clojure at commit 4afd4a7c14c48b5baf3c03196053066483cb4223

This means that CLJ-1544 is not responsable for this bug.

I can also confirm that this bug is intermittent, which makes figuring out what's going on really hard.

Comment by Nicola Mometto [ 11/Jan/15 12:10 PM ]

I still have absolutely no idea how this can happen but adding a bunch of printlns it turned out that for some reason in this binding of the deftype macro:

[interfaces methods opts] (parse-opts+specs opts+specs)

when opts+specs is nil, interfaces is sometimes [nil] as opposed to [].

This makes me think that there's some concurrency bug in the recent changes around the handling of vec, but this is just a guess.

Comment by Nicola Mometto [ 11/Jan/15 12:13 PM ]

I've restricted it down a bit and it looks like this part of opts+spec can bind interfaces to [nil] when impls is {}

interfaces (→ (map #(if (var? (resolve %))
                      (:on (deref (resolve %)))
                      %)
                   (keys impls))
             set
             (disj 'Object 'java.lang.Object)
             vec)
Comment by Nicola Mometto [ 11/Jan/15 12:25 PM ]

Here's an example output from my debugging tests, with the following patch applied:

diff --git a/src/clj/clojure/core_deftype.clj b/src/clj/clojure/core_deftype.clj
index 97e14cc..8f521eb 100644
--- a/src/clj/clojure/core_deftype.clj
+++ b/src/clj/clojure/core_deftype.clj
@@ -60,6 +60,8 @@ (defn- parse-opts+specs [opts+specs]
                        set
                        (disj 'Object 'java.lang.Object)
                        vec)
+        _ (when (nil? opts+specs)
+            (println impls interfaces))
         methods (map (fn [[name params & body]]
                        (cons name (maybe-destructured params body)))
                      (apply concat (vals impls)))]
{} [nil]
Exception in thread "main" java.lang.NullPointerException, compiling:(schema/utils.clj:68:1)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6716)
	at clojure.lang.Compiler.analyze(Compiler.java:6500)
	at clojure.lang.Compiler.analyze(Compiler.java:6461)
	at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5837)
	at clojure.lang.Compiler$LetExpr$Parser.parse(Compiler.java:6155)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6709)
	at clojure.lang.Compiler.analyze(Compiler.java:6500)
	at clojure.lang.Compiler.analyze(Compiler.java:6461)
	at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5837)
	at clojure.lang.Compiler$FnMethod.parse(Compiler.java:5272)
	at clojure.lang.Compiler$FnExpr.parse(Compiler.java:3901)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6707)
	at clojure.lang.Compiler.analyze(Compiler.java:6500)
	at clojure.lang.Compiler.eval(Compiler.java:6765)
	at clojure.lang.Compiler.load(Compiler.java:7195)
	at clojure.lang.RT.loadResourceScript(RT.java:370)
	at clojure.lang.RT.loadResourceScript(RT.java:361)
	at clojure.lang.RT.load(RT.java:440)
	at clojure.lang.RT.load(RT.java:411)
	at clojure.core$load$fn__5424.invoke(core.clj:5848)
	at clojure.core$load.doInvoke(core.clj:5847)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at clojure.core$load_one.invoke(core.clj:5653)
	at clojure.core$load_lib$fn__5373.invoke(core.clj:5693)
	at clojure.core$load_lib.doInvoke(core.clj:5692)
	at clojure.lang.RestFn.applyTo(RestFn.java:142)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$load_libs.doInvoke(core.clj:5731)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$require.doInvoke(core.clj:5814)
	at clojure.lang.RestFn.invoke(RestFn.java:457)
	at plumbing.core$eval13998$loading__5316__auto____13999.invoke(core.clj:1)
	at plumbing.core$eval13998.invoke(core.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6768)
	at clojure.lang.Compiler.eval(Compiler.java:6757)
	at clojure.lang.Compiler.load(Compiler.java:7195)
	at clojure.lang.RT.loadResourceScript(RT.java:370)
	at clojure.lang.RT.loadResourceScript(RT.java:361)
	at clojure.lang.RT.load(RT.java:440)
	at clojure.lang.RT.load(RT.java:411)
	at clojure.core$load$fn__5424.invoke(core.clj:5848)
	at clojure.core$load.doInvoke(core.clj:5847)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at clojure.core$load_one.invoke(core.clj:5653)
	at clojure.core$load_lib$fn__5373.invoke(core.clj:5693)
	at clojure.core$load_lib.doInvoke(core.clj:5692)
	at clojure.lang.RestFn.applyTo(RestFn.java:142)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$load_libs.doInvoke(core.clj:5731)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$require.doInvoke(core.clj:5814)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at user.fus_threading$eval13994.invoke(fus_threading.clj:6)
	at clojure.lang.Compiler.eval(Compiler.java:6768)
	at clojure.lang.Compiler.load(Compiler.java:7195)
	at clojure.lang.RT.loadResourceScript(RT.java:370)
	at clojure.lang.RT.loadResourceScript(RT.java:361)
	at clojure.lang.RT.load(RT.java:440)
	at clojure.lang.RT.load(RT.java:411)
	at clojure.core$load$fn__5424.invoke(core.clj:5848)
	at clojure.core$load.doInvoke(core.clj:5847)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at clojure.core$load_one.invoke(core.clj:5653)
	at clojure.core$load_lib$fn__5373.invoke(core.clj:5693)
	at clojure.core$load_lib.doInvoke(core.clj:5692)
	at clojure.lang.RestFn.applyTo(RestFn.java:142)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$load_libs.doInvoke(core.clj:5731)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$require.doInvoke(core.clj:5814)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at midje.repl$load_facts$fn__6148.invoke(repl.clj:206)
	at midje.repl$load_facts.doInvoke(repl.clj:192)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at user$eval6211.invoke(form-init7109545842773565024.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6768)
	at clojure.lang.Compiler.eval(Compiler.java:6758)
	at clojure.lang.Compiler.load(Compiler.java:7195)
	at clojure.lang.Compiler.loadFile(Compiler.java:7151)
	at clojure.main$load_script.invoke(main.clj:274)
	at clojure.main$init_opt.invoke(main.clj:279)
	at clojure.main$initialize.invoke(main.clj:307)
	at clojure.main$null_opt.invoke(main.clj:342)
	at clojure.main$main.doInvoke(main.clj:420)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: java.lang.NullPointerException
	at clojure.lang.Compiler.resolveIn(Compiler.java:6971)
	at clojure.lang.Compiler.resolve(Compiler.java:6949)
	at clojure.lang.Compiler$NewInstanceExpr.build(Compiler.java:7565)
	at clojure.lang.Compiler$NewInstanceExpr$DeftypeParser.parse(Compiler.java:7490)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6709)
	... 91 more
Comment by Nicola Mometto [ 11/Jan/15 12:32 PM ]

Further debugging is convincing me further that some of the recent changes around `vec` are causing this bug.

With the following patch:

diff --git a/src/clj/clojure/core_deftype.clj b/src/clj/clojure/core_deftype.clj
index 97e14cc..9478b04 100644
--- a/src/clj/clojure/core_deftype.clj
+++ b/src/clj/clojure/core_deftype.clj
@@ -53,13 +53,16 @@ (defn- parse-impls [specs]
 (defn- parse-opts+specs [opts+specs]
   (let [[opts specs] (parse-opts opts+specs)
         impls (parse-impls specs)
-        interfaces (-> (map #(if (var? (resolve %)) 
-                               (:on (deref (resolve %)))
-                               %)
-                            (keys impls))
-                       set
-                       (disj 'Object 'java.lang.Object)
-                       vec)
+        ks (keys impls)
+        interfaces' (map #(if (var? (resolve %))
+                            (:on (deref (resolve %)))
+                            %)
+                         ks)
+        interfaces'' (set interfaces')
+        interfaces''' (disj interfaces'' 'Object 'java.lang.Object)
+        interfaces (vec interfaces''')
+        _ (when (nil? opts+specs)
+            (println impls ks interfaces' interfaces'' interfaces''' interfaces))
         methods (map (fn [[name params & body]]
                        (cons name (maybe-destructured params body)))
                      (apply concat (vals impls)))]

I get this println when the NPE occurs:

{} nil () #{} #{} [nil]

Meaning that for some reson, `vec` of `#{}` returns `[nil]` rather than `[]`

Comment by Nicola Mometto [ 11/Jan/15 12:42 PM ]

I confirmed that the bug is in the vec function.
With the following patch, when the NPE occurs, the debug println is triggered:

diff --git a/src/jvm/clojure/lang/PersistentVector.java b/src/jvm/clojure/lang/PersistentVector.java
index 9804a0b..a460b6f 100644
--- a/src/jvm/clojure/lang/PersistentVector.java
+++ b/src/jvm/clojure/lang/PersistentVector.java
@@ -96,19 +96,22 @@ static public PersistentVector create(ArrayList list){
 static public PersistentVector create(Iterable items){
     // optimize common case
     if(items instanceof ArrayList)
         return create((ArrayList)items);

     Iterator iter = items.iterator();
     TransientVector ret = EMPTY.asTransient();
     while(iter.hasNext())
         ret = ret.conj(iter.next());
-    return ret.persistent();
+    PersistentVector r = ret.persistent();
+    if (RT.seq(r) != null && RT.seq(items) == null)
+        System.out.println("bug");
+    return r;
 }

 static public PersistentVector create(Object... items){
        TransientVector ret = EMPTY.asTransient();
        for(Object item : items)
                ret = ret.conj(item);
        return ret.persistent();
 }
Comment by Alex Miller [ 11/Jan/15 1:47 PM ]

And items is a PersistentSet? I've actually been looking at some weirdness on set iterators in the context of CLJ-1499 in consistency between seq and iterators.

Comment by Nicola Mometto [ 11/Jan/15 1:52 PM ]

I believe I have identified the bug, but I cannot make any sense out of it.

The bug apperas to be in SeqIterator.hasNext(), when the NPE occurs, after applying the following patch:

diff --git a/src/jvm/clojure/lang/SeqIterator.java b/src/jvm/clojure/lang/SeqIterator.java
index e6ad481..031fbc8 100644
--- a/src/jvm/clojure/lang/SeqIterator.java
+++ b/src/jvm/clojure/lang/SeqIterator.java
@@ -35,14 +35,18 @@ public SeqIterator(ISeq o){
 public boolean hasNext(){
        if(seq == this){
                seq = START;
                next = RT.seq(next);
                }
        else if(seq == next)
                next = RT.next(seq);
+    else if (RT.seq(next) == null)
+        System.out.println("this shouldn't happen: " + (this == seq));
+    if (RT.seq(next) == null && next != null)
+        System.out.println("bug: " + next);
        return next != null;
 }

 public Object next() throws NoSuchElementException {
        if(!hasNext())
                throw new NoSuchElementException();
        seq = next;

I get the following output:

this shouldn't happen: true
bug: #{}

I have absolutely no idea how it is possible that the last branch gets executed since it is true that seq == this thus the first branch should have been executed.

Comment by Nicola Mometto [ 11/Jan/15 1:58 PM ]

I don't know why, but with the attached patch the bug seems to go away.
This is probably just by accident though as I have no idea what changes between the code pre patch and the code post patch.

Comment by Alex Miller [ 11/Jan/15 3:21 PM ]

Without looking at the patch Id say that non deterministic bug plus impossible state smells like a concurrency / race condition problem.

Comment by Michael Blume [ 11/Jan/15 4:18 PM ]

This isn't the bug, per se, the thing I'm describing should not break anything, but why is the PersistentVector(Iterable) constructor being called on a PersistentHashSet? It looks like we could very easily do

--- i/src/jvm/clojure/lang/LazilyPersistentVector.java
+++ w/src/jvm/clojure/lang/LazilyPersistentVector.java
@@ -26,7 +26,7 @@ static public IPersistentVector createOwning(Object... items){
 static public IPersistentVector create(Object obj){
    if(obj instanceof IReduceInit)
        return PersistentVector.create((IReduceInit) obj);
-   else if(obj instanceof ISeq)
+   else if(obj instanceof Seqable)
        return PersistentVector.create(RT.seq(obj));
    else if(obj instanceof Iterable)
        return PersistentVector.create((Iterable)obj);

and treat the set directly as a seq. Is there some way that would be slower?

Comment by Michael Blume [ 11/Jan/15 4:45 PM ]

It may be that I've just tried it an insufficient number of times, but simply adding 'synchronized' to SeqIterator.hasNext appears to solve the problem. Again, this doesn't really tell us what the problem is.

ETA: Nope, fails sometimes even with synchronized.

Comment by Alex Miller [ 12/Jan/15 8:57 AM ]

The use of ISeq and not Seqable in LazilyPersistentVector is quite intentional. The idea here is that if something is already a seq (effectively a linked list), the best we're going to do is walk that chain. However, many things are Seqable that may have more efficient Iterable implementations which can (statefully) walk a data structure without creating all the intermediate objects required by seq. In particular, via CLJ-1499, map and set will soon be gaining direct Iterable implementations that walk the persistent tree without instantiating a seq object for every element. However, at the moment set and map will create a SeqIterator wrapped around the seq.

CLJ-1546 changed this path - it was walking the seq but is now walking it via SeqIterator. My working theory is that that switch has uncovered a latent race condition in SeqIterator that was never noticed before as the path wasn't exercised.

Note that because CLJ-1499 removes the reliance on SeqIterator, it would have avoided the bug in a different way! However, I have been seeing a number of weird things while doing dev on CLJ-1499 specifically around iterating over sets - the OO around iterator() and seq() in the APersistentSet/PersistentHashSet/PersistentTreeSet has some weird interactions.

I'm going to look a little closer at the suggested patch.

Comment by Nicola Mometto [ 12/Jan/15 9:01 AM ]

Alex, before you waste your time on my patch I want to clarify that I don't think that patch fixes the issue in any way, it's just a random change I made that happens to make the symptoms disappear on my system, I just attached it for debugging purposed.

Maybe there's a reason why the patch solves the bug or maybe it's just masking it, I still can't figure out why this apparent race condition happens.

Comment by Alex Miller [ 12/Jan/15 9:05 AM ]

Gotcha, thx.

Comment by Michael Blume [ 12/Jan/15 11:54 AM ]

Aha, makes sense, thanks =)

Comment by Andy Fingerhut [ 12/Jan/15 12:32 PM ]

I haven't dug into this, and don't have a solution, but SeqIterator's fields are not final, so there is no guarantee that the values assigned to a new instance's fields in its constructors will be visible to other threads, yes? And I believe that if those writes to the fields do eventually become visible, they need not become visible in the order that the assignments occur in the source code, but can become visible in any order.

Comment by Nicola Mometto [ 12/Jan/15 5:13 PM ]

By the way, I've been able to reproduce this bug using jdk 1.8 so it's not just with 1.7

Comment by Michael Blume [ 12/Jan/15 9:10 PM ]

SeqIterator seems to use 'this' as a sentinel value. If I replace it with an explicit 'new Object' sentinel, the problem appears to go away (~40 compilations without an NPE).

Making seq and next volatile doesn't help.

Interestingly, when I synchronize the entire SeqIterator class (both hasNext and next synchronized on this), the problem doesn't go away, so if this is a race condition, it's kind of a weird one.

I can insert a call to seq before the call to set here https://github.com/clojure/clojure/blob/clojure-1.7.0-alpha5/src/clj/clojure/core_deftype.clj#L60 and the problem doesn't go away. I can then print the result of seq before it's passed to set, and, of course, it's a nil.

So somehow, we're basically evaluating (-> nil set (disj 'Object 'java.lang.Object) vec) and getting [nil] instead of []

But when I actually evaluate that expression in a REPL (from 20 threads at once, 1M times in a row) it evaluates to empty vector every time.

So I'm confused.

Debug patch here if anyone wants to check my work: https://gist.githubusercontent.com/MichaelBlume/735c8f601210cfa1ecaf/raw/814f21e5e4abb2ca9d1d5330d0b4cc2b3a4424e6/gistfile1.txt

Comment by Alex Miller [ 12/Jan/15 9:53 PM ]

I'm with you. I'm starting to suspect that this is involved:

https://github.com/clojure/clojure/commit/43cc1854508d655e58e377f84836ba128971f90c

Comment by Andy Fingerhut [ 12/Jan/15 11:57 PM ]

Out of curiosity, I tried adding a field to the SeqIterator class that remembers the thread that constructed each instance, and then checks in the call to hasNext if the calling thread is the same as the creator thread, printing a message if they are ever different. I never saw that while building Clojure, nor running the command on the Midje project. That seems to rule out the possibility of the SeqIterator getting passed from one thread to another.

If that is always true, then I'm with Nicola: I would love an explanation of what is going on here to cause the debug print's he mentions in a comment above to print what they do when a failure occurs. It looks really, really wrong, as in wrong single-threaded program behavior.

Comment by Nicola Mometto [ 13/Jan/15 5:09 AM ]

Michael's and Andy's findings agree with what I observed. There's no multithreading involved yet somehow there's what appears to be a data race condition in SeqIterator.

Also, as Michael observed in his own testings, changing the sentinel from this to an Object instance (START) makes the issue go away, this is exactly what the patch I attached does.

I am also unable to reproduce this issue by repeatedly invoking vec on an empty set.

Comment by Nicola Mometto [ 13/Jan/15 5:15 AM ]

This just happened:

public boolean hasNext(){
    Object a = seq;
    Object b = this;
	if(seq == this){
		seq = START;
		next = RT.seq(next);
		}
	else {
        if (RT.seq(next) == null) {
            System.out.println (a);
            System.out.println (b);
        }
        if(seq == next)
            next = RT.next(seq);
    }
	return next != null;
}
clojure.lang.SeqIterator@3ddc6873
clojure.lang.SeqIterator@3ddc6873
Exception in thread "main" java.util.NoSuchElementException, compiling:(utils.clj:68:1)
Comment by Nicola Mometto [ 13/Jan/15 5:45 AM ]

I'm taking a shot in the dark but I tried reproducing this bug using -Xint (using the jvm interpreter) and I can't seem to be able to reproduce it after many runs.
As absurd as this sounds, I'm starting to think that some hotspot optimization is responsible for this nonsensical behaviour – this would explain the nondeterministic behaviour in the absence of multithreading.

Comment by Alex Miller [ 13/Jan/15 8:50 AM ]

I did some experiments myself yesterday and found many of the same things listed here - single-threaded usage with seemingly impossible results.

I am currently thinking that setting seq = this; in the constructor is unsafe publication of this. "this" is not valid until after the constructor completes, yet we have saved away a reference to it in a field.

Thus seq==this may possibly return false in hasNext() because it is comparing a partially constructed object with the fully constructed object (same object identity!). It may be that this doesn't happen until after hot spot/inlining. By turning on the inline diagnostics, I do see these SeqIterator methods as a hot spot that gets inlined at some point soon before I see the error manifest.

This is my best explanation of the results we're seeing and seems sufficient justification to change the implementation of SeqIterator to me.

Comment by Nicola Mometto [ 13/Jan/15 8:59 AM ]

From the Java Language Spec, section 15.8.3:
"The keyword this may be used only in the body of an instance method or default method, or in the body of a constructor of a class, or in an instance initializer of a class, or in the initializer of an instance variable of a class"

So I don't believe that the current usage of this is incorrect for this reason, however I agree that it's likely caused by some interaction of doing identity check with this and hot spot/inlining issue so I'm more confident that the patch I posted is a right fix for this bug.

Comment by Andy Fingerhut [ 13/Jan/15 10:30 AM ]

The only other scenario of multiple calls to next / hasNext that I could think of with a single thread is if some method call inside of hasNext causes next / hasNext to be called, nested, on the same SeqIterator instance. I did some instrumentation to check for nested calls, unlikely as that would seem from the source code, and saw a failure with no nested call to hasNext.

Comment by Alex Miller [ 13/Jan/15 4:28 PM ]

Midje stack trace (removed from original ticket, left here for safe-keeping). To reproduce:

// set your JAVA_HOME and PATH to JDK 1.7
git clone git@github.com:marick/Midje.git
cd Midje
git co e98cf87
lein with-profile 1.7 midje

Error:

Exception in thread "main" java.lang.NullPointerException, compiling:(fim_collection_diffs.clj:7:1)
	at clojure.lang.Compiler.macroexpand1(Compiler.java:6619)
	at clojure.lang.Compiler.macroexpand(Compiler.java:6677)
	at clojure.lang.Compiler.eval(Compiler.java:6751)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.RT.loadResourceScript(RT.java:370)
	at clojure.lang.RT.loadResourceScript(RT.java:361)
	at clojure.lang.RT.load(RT.java:440)
	at clojure.lang.RT.load(RT.java:411)
	at clojure.core$load$fn__5436.invoke(core.clj:5863)
	at clojure.core$load.doInvoke(core.clj:5862)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at clojure.core$load_one.invoke(core.clj:5653)
	at clojure.core$load_lib$fn__5383.invoke(core.clj:5708)
	at clojure.core$load_lib.doInvoke(core.clj:5707)
	at clojure.lang.RestFn.applyTo(RestFn.java:142)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$load_libs.doInvoke(core.clj:5746)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$require.doInvoke(core.clj:5829)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at midje.repl$load_facts$fn__6148.invoke(repl.clj:206)
	at midje.repl$load_facts.doInvoke(repl.clj:192)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at user$eval6211.invoke(form-init3965655274254111851.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6767)
	at clojure.lang.Compiler.eval(Compiler.java:6757)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.Compiler.loadFile(Compiler.java:7150)
	at clojure.main$load_script.invoke(main.clj:274)
	at clojure.main$init_opt.invoke(main.clj:279)
	at clojure.main$initialize.invoke(main.clj:307)
	at clojure.main$null_opt.invoke(main.clj:342)
	at clojure.main$main.doInvoke(main.clj:420)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: java.lang.NullPointerException
	at clojure.lang.Compiler.maybeResolveIn(Compiler.java:7014)
	at clojure.core$ns_resolve.invoke(core.clj:4200)
	at clojure.core$ns_resolve.invoke(core.clj:4197)
	at clojure.core$resolve.invoke(core.clj:4206)
	at clojure.core$map$fn__4529.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:485)
	at clojure.core$seq__4109.invoke(core.clj:135)
	at clojure.core$reduce1.invoke(core.clj:899)
	at clojure.core$set.invoke(core.clj:3944)
	at clojure.core$emit_defrecord.invoke(core_deftype.clj:154)
	at clojure.core$defrecord.doInvoke(core_deftype.clj:374)
	at clojure.lang.RestFn.invoke(RestFn.java:470)
	at clojure.lang.Var.invoke(Var.java:394)
	at clojure.lang.AFn.applyToHelper(AFn.java:165)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.lang.Compiler.macroexpand1(Compiler.java:6606)
Comment by Ghadi Shayban [ 13/Jan/15 4:32 PM ]

Not sure I fully understand the sad path that causes this bug, but START can safely be marked static final in the patch.

Comment by Andy Fingerhut [ 13/Jan/15 4:54 PM ]

Does this seem like it may be a bug in JIT compilation to anyone? I ask because as far as we can tell, this bug occurs completely within a single thread, and as far as I've read, the Java memory model should guarantee that operations in a single thread appear to occur in the order they happen in the source code.

Independent question: It seems that the assumption is that a SeqIterator is only ever accessed from 1 thread. Is it considered an internal implementation detail, and thus on documentation is needed about this assumption?

Comment by Alex Miller [ 13/Jan/15 5:22 PM ]

Re JIT - I wouldn't rule that out, but if it is, that doesn't help us make Clojure work again for everyone with existing JVMs. JCiP section 3.2.1 says "If the this reference escapes during construction, the object is considered not properly constructed." which sounds like what we're doing.

Re threading - I think that the use of iterators inside Clojure itself has (until recently) been unusual. Virtually everything is written to leverage the seq model and iterators were largely provided for Java compliance. With the creation of LazyTransformer and extension of reduce to iterators, this orientation has changed somewhat. However, I'd say that iterators are dirty stateful things and they should be consumed in localized contexts by no more than one thread at a time. If they are used in a thread-confined way and safely published between threads, SeqIterator seems ok.

Comment by Nicola Mometto [ 13/Jan/15 5:32 PM ]

0001-CLJ-1636-don-t-use-this-as-a-sentinel-in-SeqIterator.patch is the same as 0001-fix-for-CLJ-1636.patch except it makes START final as suggested by Ghadi

Comment by Ghadi Shayban [ 15/Jan/15 12:27 PM ]

Alex has screened this – which probably implies that the patch fixes the issue. Just to confirm does the patch clear up the issue for everyone else?

Comment by Michael Blume [ 15/Jan/15 12:48 PM ]

Does for me, yes =)

Comment by Andy Fingerhut [ 15/Jan/15 1:17 PM ]

If you want to collect test results, seems like it would be good for people to respond with the OS version and JVM version they tested on, and whether it was the Midje test in the description of this ticket that they tried, or some other test.

Comment by Andy Fingerhut [ 16/Jan/15 4:59 PM ]

Out of curiosity, does anyone have a smaller test case that causes this incorrect return value from the SeqIterator, without running the 'lein with-profile 1.7 midje' command on Midje? e.g. running a page or less worth of Clojure code?

Comment by Alex Miller [ 16/Jan/15 6:01 PM ]

While we've applied the patch, I would still love to understand wtf is happening here and would love to see that too. For me, I can quite reliably reproduce it building Clojure itself. To support the theory of it happening during an inlining transition, it's unlikely to reproduce outside the context of other code however. I see it get embedded inside a big wad of calling code when I watching inlining.

Comment by Nicola Mometto [ 18/Jan/15 11:36 AM ]

I've spent some time reading through both the jvm and the java specs and I can't find a reasonable explaination for what was happening, I can only think this is a bug in some hotspot inlining optimization.





[CLJ-1635] Make distinct/dedupe/interpose transducer tests play nicely with new reporting Created: 09/Jan/15  Updated: 20/Feb/15  Resolved: 20/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: test, transducers

Attachments: Text File clj-1635.patch    
Patch: Code and Test
Approval: Ok

 Description   

Fix interaction between CLJ-1601 (which introduced new transducers) and CLJ-1621 (which improved transducer tests) to improve test reporting for these new transducer arities as well.

Note from Alex M: I goofed these while rebasing CLJ-1601 after CLJ-1621 went in.

Patch: clj-1635.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 09/Jan/15 2:31 PM ]

My fault in the integration process! We'll try to get it fixed in 1.7. Thanks...





[CLJ-1633] PersistentList/creator doesn't handle ArraySeqs correctly Created: 07/Jan/15  Updated: 10/Jan/15  Resolved: 10/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: collections

Attachments: Text File 0001-CLJ-1633-fix-PersistentList-creator-handling-of-Arra.patch     Text File CLJ-1633-v2.patch     Text File CLJ-1633-v3.patch     Text File generative-seq-tests.patch     Text File generative-seq-tests-v2.patch    
Patch: Code and Test
Approval: Ok

 Description   

This should return '(2 3) but returns '(1 2 3) instead:

user=> ((fn [& args] (apply (fn [a & b] (println a b) (apply list b)) args)) 1 2 3)
1 (2 3)
(1 2 3)

Note that using vector rather than list returns the correct values:

user=> ((fn [& args] (apply (fn [a & b] (println a b) (apply vector b)) args)) 1 2 3)
1 (2 3)
[2 3]

The bug was reported in this stackoverflow question: https://stackoverflow.com/questions/27819418/strange-behaviour-of-clojure-trampoline and the bug identified in this comment: https://stackoverflow.com/questions/27819418/strange-behaviour-of-clojure-trampoline#comments-27821793

A simpler example of this bug:

user=> (apply list (next (clojure.lang.ArraySeq/create (object-array [1 2 3]))))
(1 2 3)

Patch: CLJ-1633-v3.patch



 Comments   
Comment by Michael Blume [ 07/Jan/15 12:28 PM ]

Very nice catch.

This makes me wonder if there's more we can do with generative testing to catch this class of bugs, maybe along the lines of zach tellman's collection-check

Comment by Alex Miller [ 07/Jan/15 2:27 PM ]

There's definitely more we can do. collection-check is great and I've started to integrate some of those ideas into Clojure's tests (see for example the new transducer tests that generate random chains of sequence operations and compare seq and transducer executions). If you have ideas about specific test areas, would be happy to see a jira/patch.

Comment by Michael Blume [ 07/Jan/15 4:59 PM ]

Updated test to get expected list inside the (= ...) form

Comment by Michael Blume [ 07/Jan/15 5:03 PM ]

Oops, rerolling again to apply cleanly to master

Comment by Nicola Mometto [ 07/Jan/15 5:04 PM ]

Thanks for the fix!

Comment by Michael Blume [ 07/Jan/15 5:40 PM ]

No problem =)

Comment by Michael Blume [ 07/Jan/15 5:42 PM ]

Ok, this is kind of crude, and mostly a proof of concept, but this does catch the bug.

Output:

[java] FAIL in (seq-gentest) (sequences.clj:105)
     [java] {:acts1 (-> [] next (->> (cons :foo)) (->> (cons :foo)) next),
     [java]  :acts2
     [java]  (->
     [java]   []
     [java]   next
     [java]   (->> (cons :foo))
     [java]   (->> (cons :foo))
     [java]   into-array-seq
     [java]   next
     [java]   (->> (apply list))),
     [java]  :result1 (:foo),
     [java]  :result2 (:foo :foo),
     [java]  :pass false}
     [java]
     [java] expected: (:result res)
     [java]   actual: false
Comment by Alex Miller [ 09/Jan/15 9:12 AM ]

Rich said to move this forward.

Comment by Nicola Mometto [ 09/Jan/15 5:51 PM ]

This ticket has been closed but no patch has been committed

Comment by Alex Miller [ 09/Jan/15 6:22 PM ]

Stu, doesn't look like this patch was applied but it was closed?





[CLJ-1621] Improve reporting in transducers generative test. Created: 21/Dec/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: transducers

Attachments: Text File transducer-reporting-v1.patch    
Patch: Code and Test
Approval: Ok

 Description   

If the transducers generative test breaks, you get output like this:

[java] {:test-var seq-and-transducer, :result #<ExceptionInfo clojure.lang.ExceptionInfo: Applied actions to coll as seq, sequence transducer, and into transducer and got different results. {:coll [-16 10 -8 8 -5], :actions map clojure.core$dec@782a4056,take 5,partition-by clojure.core$even_QMARK_@2200d281,partition-all 9,map clojure.core$inc@643b798d,drop 9,remove clojure.core$empty_QMARK_@4600f352,remove clojure.core$odd_QMARK_@32dd05af, :s #<ClassCastException java.lang.ClassCastException: clojure.lang.LazySeq cannot be cast to java.lang.Number>, :xs (), :xi [], :xt []}>, :seed 1419199634890, :failing-size 21, :num-tests 22, :fail [[-16 10 -8 8 -5] [{:desc map clojure.core$dec@782a4056, :xf #<core$map$fn__3669 clojure.core$map$fn__3669@28449652>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@506b8505>} {:desc take 5, :xf #<core$take$fn__3712 clojure.core$take$fn__3712@38934406>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@27ce0b6d>} {:desc partition-by clojure.core$even_QMARK_@2200d281, :xf #<core$partition_by$fn__5568 clojure.core$partition_by$fn__5568@5287c159>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@70961c7b>} {:desc partition-all 9, :xf #<core$partition_all$fn__5590 clojure.core$partition_all$fn__5590@3f869b0>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@6f99ed9f>} {:desc map clojure.core$inc@643b798d, :xf #<core$map$fn__3669 clojure.core$map$fn__3669@2f2c41d3>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@2fdbef8d>} {:desc drop 9, :xf #<core$drop$fn__3728 clojure.core$drop$fn__3728@4f7b4b50>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@214b9b5>} {:desc remove clojure.core$empty_QMARK_@4600f352, :xf #<core$filter$fn__3696 clojure.core$filter$fn__3696@6846d654>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@7df231c7>} {:desc remove clojure.core$odd_QMARK_@32dd05af, :xf #<core$filter$fn__3696 clojure.core$filter$fn__3696@5a8ce6dd>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@34ee9000>}]], :shrunk {:total-nodes-visited 32, :depth 12, :result #<ExceptionInfo clojure.lang.ExceptionInfo: Applied actions to coll as seq, sequence transducer, and into transducer and got different results. {:coll [0], :actions map clojure.core$inc@643b798d, :s (1), :xs (0), :xi [0], :xt [0]}>, :smallest [[0] [{:desc map clojure.core$inc@643b798d, :xf #<core$map$fn__3669 clojure.core$map$fn__3669@2f2c41d3>, :seq #<core$partial$fn__3652 clojure.core$partial$fn__3652@2fdbef8d>}]]}}
     [java]
     [java] ERROR in (seq-and-transducer) (core.clj:4566)
     [java] Uncaught exception, not in assertion.
     [java] expected: nil
     [java]   actual: clojure.lang.ExceptionInfo: Applied actions to coll as seq, sequence transducer, and into transducer and got different results.
     [java]  at clojure.core$ex_info.invoke (core.clj:4566)
     [java]     clojure.test_clojure.transducers$seq_and_transducer_same_result.invoke (transducers.clj:103)
     [java]     clojure.lang.AFn.applyToHelper (AFn.java:156)
...etc etc

This has a few problems:

  • when clojure functions are given as arguments, they're full object printouts, with classnames and memory addresses, this is kind of hard to read
  • the combination of the first problem found with the shrunk version means there's a lot of content to read
  • lack of pretty printing makes that content very hard to read
  • the traceback isn't actually that helpful – we know what failed already.

Approach: The attached patch encodes more descriptive info in the actions and does a better job of reporting the difference in an understandable manner:

[java] FAIL in (seq-and-transducer) (transducers.clj:135)
     [java] {:coll [0],
     [java]  :actions (->> coll (map inc)),
     [java]  :s (1),
     [java]  :xs (0),
     [java]  :xi [0],
     [java]  :xt [0]}

Patch: transducer-reporting-v1.patch

Screened by: Alex Miller






[CLJ-1619] PersistentVector implements IReduce but the no init arity throws Created: 17/Dec/14  Updated: 10/Jan/15  Resolved: 10/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File 0001-CLJ-1619-Implement-no-init-arity-of-reduce-for-Persi.patch     Text File 0001-Implement-no-init-arity-of-reduce-for-PersistentVect.patch    
Patch: Code and Test
Approval: Ok

 Description   

The reduce arity of IReduce in PersistentVector is implemented as: "throw new UnsupportedOperationException()".

After the CLJ-1572 patch is applied the following code will throw:

(reduce + [1 2])

Approach taken: Implement reduce(f) in PersistentVector.

Alternative: An alternate would be to change PersistentVector from IReduce to IReduceInit and remove the reduce without init function. In this case, reducing a vector would fall back to seqs.

Patch: 0001-CLJ-1619-Implement-no-init-arity-of-reduce-for-Persi.patch

Screened by: Stu



 Comments   
Comment by Alex Miller [ 18/Dec/14 10:59 AM ]

Is that return null there right? In the case of no elements, you should invoke f with no args right?

Comment by Nicola Mometto [ 18/Dec/14 11:04 AM ]

you're right, I didn't know that detail about the behaviour of reduce. Updated the patch to invoke (f) rather than returning nil when the coll is empty

Comment by Alex Miller [ 09/Jan/15 8:14 AM ]

Needs tests

Comment by Nicola Mometto [ 09/Jan/15 8:33 AM ]

Updated patch adding testcases for PersistentVector.reduce in the already existing reduce test

Comment by Stuart Halloway [ 09/Jan/15 11:50 AM ]

I don't understand the problem here. coll-reduce appears to cut off ever hitting this path (the tests call underlying interfaces directly).

  • Need a public API example showing the failure
  • Need tests covering main branches (i.e. the empty case)
Comment by Stuart Halloway [ 09/Jan/15 11:52 AM ]

nevermind, screening 1572





[CLJ-1618] Widen set to take Iterable/IReduceInit Created: 17/Dec/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File clj-1618.patch    
Patch: Code
Approval: Ok

 Description   

Similar to CLJ-1546 (same thing on vec), set should work on IReducibleInit or Iterable. Currently eduction will work via Iterable but through SeqIterator. set on an IReduceInit will throw an error.

user=> (set (eduction (map inc) (range 100)))  ;; works, but slower path
user=> (set (reify clojure.lang.IReduceInit  
       (reduce [_ f start]
         (reduce f start (range 10)))))
IllegalArgumentException Don't know how to create ISeq from: user$eval1198$reify__1199  clojure.lang.RT.seqFrom (RT.java:506)

Approach: Check for and use IReduceInit path if available, otherwise fallback to seq. Additionally, the patch adds a modification to return a set without it's meta (same approach as CLJ-1546) if a set is passed, which is fast constant time with no change in effective behavior.

Performance: (using Criterium quick-bench)

Timings done with either (count (set coll)) or (count (into #{} coll)):

coll 1.6.0 into 1.6.0 set 1.7.0-alpha4 set 1.7.0-alpha4+patch set
(set (range 100)) 15.4 µs 17.0 µs 11.4 µs 0.0 µs
(vec (range 1000000)) 360.7 ms 702.5 ms 391.1 ms 358.6 ms
(doall (range 1000000)) 363.6 ms 736.9 ms 387.5 ms 371.0 ms
(doall (range 5)) 404.9 ns 612.3 ns 481.9 ns 445.9 ns
(eduction (map identity) (vec (range 100))) n/a n/a 11.3 µs 8.7 µs

See also: CLJ-1546, CLJ-1384

Patch: clj-1618.patch

Screened by:






[CLJ-1606] Transducing an eduction finishes twice Created: 27/Nov/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Herwig Hochleitner Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: transducers
Environment:

1.7.0-alpha4


Attachments: Text File CLJ-1606-2.patch     Text File CLJ-1606-3.patch     Text File CLJ-1606-4.patch     Text File CLJ-1606-5.patch     Text File CLJ-1606.patch    
Patch: Code and Test
Approval: Ok

 Description   
> (transduce (map identity)
             (fn
               ([s] (println "Finishing") s)
               ([s i] s))
             nil
             (eduction (map identity) []))
Finishing
Finishing
nil

Cause: transduce passes (xf f) into .reduce of Eduction, which calls transduce, causing completing xf to be called more than once.

Proposed: Eduction reduce should use (completing f) instead of f to isolate completion of inner xf from outer xf.

Patch: CLJ-1606-5.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 27/Nov/14 11:01 PM ]

identity is not a valid xf - changed to (map identity)

Comment by Ghadi Shayban [ 27/Nov/14 11:34 PM ]

identity is a valid though nonsensical transducer. fix & test added.

Comment by Ghadi Shayban [ 28/Nov/14 12:06 AM ]

Simple reproduction similar to into:

(transduce (map dec)
           (completing conj! persistent!)
           (transient [])
           (eduction (map inc) (range 6)))

;; ClassCastException clojure.lang.PersistentVector cannot be cast to clojure.lang.ITransientCollection

into doesn't use completing, and conj! has an arity that hides the problem.

Comment by Alex Miller [ 28/Nov/14 8:54 AM ]

I removed trailing whitespace in the patch so it applies cleanly.

Comment by Ghadi Shayban [ 14/Dec/14 11:16 PM ]

This patch is a little more subtle than I thought. Completion of the eduction's rfn needs to be handled separately from the "outer" transduce's xform. Patch coming.

Comment by Ghadi Shayban [ 14/Dec/14 11:32 PM ]

New patch with tests that completes the inner xform without completing the passed in rfn

Comment by Ghadi Shayban [ 15/Dec/14 1:19 AM ]

both -3 and -2 are equivalent. -3 is probably better stylistically.

Comment by Alex Miller [ 15/Dec/14 8:37 AM ]

Added CLJ-1606-4.patch - identical to -3, just fixed whitespace error.

Comment by Andy Fingerhut [ 08/Jan/15 6:09 PM ]

There are two identically named attachments here (containing -2). It looks like it isn't the one under consideration, but it might be nice to remove or rename to avoid the name conflict.

Comment by Ghadi Shayban [ 08/Jan/15 6:24 PM ]

Andy, not sure how to do that, but in any case I just added -5 clarifying language in the comment

Comment by Alex Miller [ 08/Jan/15 6:41 PM ]

Ghadi, that was super confusing. Did you just add a new -5 patch? The -4 patch has already been screened, and you have not removed the duplicate -2 patch so I don't get what the -5 is. Can we just delete the -5 and older -2 patches?

Comment by Andy Fingerhut [ 08/Jan/15 6:44 PM ]

Sorry for adding to the confusion. Ghadi, instructions for deleting patches are in the "Removing patches" section on this wiki page: http://dev.clojure.org/display/community/Developing+Patches

Comment by Ghadi Shayban [ 08/Jan/15 6:50 PM ]

Sorry. Fine by me, though permissions prevent me from deleting one of the patches.

As I read through the screened patch I just tried to clarify the wording. This:

;; NB (completing f) isolates completion of inner xfns from outer xfns
became:
;; NB (completing f) isolates completion of inner rf from outer rf

Feel free to nix that -5 patch if that's worthless

Comment by Alex Miller [ 08/Jan/15 7:12 PM ]

Gotcha. I will take care of the further changes later tonight.

In the future, please don't modify screened patches without letting me know.





[CLJ-1604] AOT'ed code that defs a var with clojure.core symbol name causes IllegalStateException Created: 25/Nov/14  Updated: 21/Feb/15  Resolved: 20/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 3
Labels: aot, compiler

Attachments: Text File 0001-fix-AOT-bug-preventing-overriding-of-clojure.core-fu.patch     Text File 0001-fix-AOT-bug-preventing-overriding-of-clojure.core-fu-v2.patch     File 1604-context.diff    
Patch: Code
Approval: Ok

 Description   

AOT'ed code that defs a var that is also a symbol in clojure.core results in an exception at runtime. This problem can be avoided with (:refer-clojure :exclude ...) but this requires a library author to update and release a new version. AOT'ed applications must then wait for all transitive dependencies to update before they can update to a new Clojure version. For some users, this problem prevents them from trying or adopting new releases.

For example, the contrib library data.int-map defines an update function. clojure.core will also have a new update function as of 1.7.0. If this library is AOT'ed, then users of the clojure.data.int-map/update function will see the exception below. This situation can commonly occur when an application uses lein uberjar to compile all of the project+libs. In this case, applications or libraries that use data.int-map (either directly or indirectly) are affected.

java.lang.IllegalStateException: Attempting to call unbound fn: #'clojure.data.int-map/update
 at clojure.lang.Var$Unbound.throwArity (Var.java:43)
    clojure.lang.AFn.invoke (AFn.java:40)
    compiler_update_not_referenced_bug.core$foo.invoke (core.clj:5)

Reproduce with this sample project: https://github.com/yeller/compiler_update_not_referenced_bug

Cause: When AOT compiling a namespace, the def forms are hoisted into the ns__init class (in the example here, clojure.data.int_map__init). The static initializer in this class creates each var in the ns via a call to RT.var(ns, name). For data.int-map the static initializer will properly create the var for clojure.data.int-map/update. But when the ns is loaded (via the clojure.data.int_map.load() method), (refer-clojure) will be called, which will remap clojure.data.int-map/update to point to clojure.core/update.

This problem does not affect non-AOT loading (which doesn't use the ns__init class) and does not affect collisions from any other namespace. Only collisions from clojure.core create this possibility.

Proposed: The proposed patch explicitly refers the Var during ns__init.load() (after Clojure symbols are referred) rather than implicitly during ns__init static {}.

This change in behavior only happens during AOT in the specific case where a core symbol is being shadowed. In that case, clojure.core has already been loaded and v (the looked up var) will have ns=clojure.core. The currentNS will be (for example) data.int-map. If that's the case, and the sym has no ns, then the new logic will be emitted.

In the case of clojure.core itself, NO new bytecode is emitted. From tests on several projects, only shadowed vars during AOT get this additional bytecode.

Patch: 0001-fix-AOT-bug-preventing-overriding-of-clojure.core-fu-v2.patch

Screened by: Alex Miller



 Comments   
Comment by Andy Fingerhut [ 25/Nov/14 11:28 PM ]

When I try latest Clojure master plus patch CLJ-1604-only-core.patch with the small test project created by Tom Crayford to demonstrate this issue: https://github.com/yeller/compiler_update_not_referenced_bug

In that project, I get the same exception thrown when attempting 'lein do clean, uberjar, test' using this patch, as without it. It is because int-map/update in namespace compiler-update-not-referenced-bug.core is an unbound var.

Comment by Nicola Mometto [ 26/Nov/14 4:25 AM ]

Andy, you're right. For some reason I attached the wrong patch to the ticket, this is the correct one

Comment by Nicola Mometto [ 26/Nov/14 5:21 AM ]

I wasn't able to write a test for this, so here's a repl session using the clojure jar demonstrating this issue:

[˷/test]> ls
classes  clojure.jar  test.clj
[˷/test]> cat test.clj
(in-ns 'test)
(clojure.core/refer 'clojure.core)
(def foo "bar")
(def update "foo")
[˷/test]> java -cp classes:clojure.jar:. clojure.main
Clojure 1.7.0-master-SNAPSHOT
user=> (binding [*compile-files* true] (load "test"))
WARNING: update already refers to: #'clojure.core/update in namespace: test, being replaced by: #'test/update
nil
user=> test/foo
"bar"
user=> test/update
"foo"
user=>
[˷/test]> java -cp classes:clojure.jar:. clojure.main
Clojure 1.7.0-master-SNAPSHOT
user=> (load "test")
nil
user=> test/foo
"bar"
user=> test/update
CompilerException java.lang.RuntimeException: No such var: test/update, compiling: (NO_SOURCE_PATH:0:0)
user=>
Comment by Andy Fingerhut [ 26/Nov/14 10:39 AM ]

Thanks. I have not tried to assess the details of the change, other than to say that patch 0001-fix-AOT-bug-preventing-overriding-of-clojure.core-fu.patch dated 26 Nov 2014, when applied to latest Clojure master as of today, enables both 'lein do clean, test' and 'lein do clean, uberjar, test' to work as expected with Tom Crayford's test project, linked above, whereas 'lein do clean, uberjar, test' fails without this patch, due to a var being unbound that should have a value.

Comment by Andy Fingerhut [ 27/Nov/14 10:53 AM ]

Copying a comment here from CLJ-1591, since it is more appropriate here. It is responding to Tom Crayford's posting of his example project to demonstrate the issue: https://github.com/yeller/compiler_update_not_referenced_bug

Tom, looked at your project. Thanks for that. It appears not to have anything like (def inc inc) in it. It throws exception during test step of 'lein do clean, uberjar, test' consistently for me, too, but compiles with only warnings and passes tests with 'lein do clean, test'. I have more test results showing in which Clojure versions these results change. To summarize, the changes to Clojure that appear to make the biggest difference in the results are below (these should be added to the new ticket you create – you are welcome to do so):

Clojure 1.6.0, 1.7.0-alpha1, and later changes up through the commit with description "CLJ-1378: Allows FnExpr to override its reported class with a type hint": No errors or warnings for either lein command above.

Next commit with description "Add clojure.core/update, like update-in but takes a single key" that adds clojure.core/update: 'lein do clean, test' is fine, but 'lein do clean, uberjar' throws exception during compilation, probably due to CLJ-1241.

Next commit with description "fix CLJ-1241": 'lein do clean, test' and 'lein do clean, uberjar' give warnings about clojure.core/update, but no errors or exceptions. 'lein do clean, uberjar, test' throws exception during test step that is same as the one I see with Clojure 1.7.0-alpha4. Debug prints of values of clojure.core/update and int-map/update (in data.int-map and in Tom's namespace compiler-update-not-referenced-bug.core) show things look fine when printed inside data.int-map, and in Tom's namespace when not doing the uberjar, but when doing the uberjar, test, int-map/update is unbound in Tom's namespace.

In case it makes a difference, my testing was done with Mac OS X 10.9.5, Leiningen 2.5.0 on Java 1.7.0_45 Java HotSpot(TM) 64-Bit Server VM

Comment by Nicola Mometto [ 02/Dec/14 9:04 AM ]

The updated patch only emits the interning bytecode when necessary, avoiding the emission when a clojure.core var with the same name exists but is not mapped to the current namespace

Comment by Alex Miller [ 09/Jan/15 9:27 AM ]

Attached 1604-context.diff for purely informational purposes - same diff just more context in it for easier reading.

Comment by Tom Crayford [ 10/Jan/15 4:52 PM ]

Thought I'd add a minor note in here to say I tried testing this patch out on my app (which is where I discovered this AOT bug), and the bug doesn't turn up with this patch applied to clojure (tested by applying 0001-fix-AOT-bug-preventing-overriding-of-clojure.core-fu-v2.patch to 1.7-alpha5)

Comment by Adam Krieg [ 21/Feb/15 12:28 PM ]

I ran into this issue with Korma 0.4.0. I'm still running into it, but there is a twist.

My project depends on an artifact that was built with Clojure 1.7.0-alpha1. If I remove this dependency, everything is fine. However, with this dependency, I run into this issue, even if I declare a dependency on 1.7.0-master-SNAPSHOT in my project and exclude any dependency on clojure-1.7.0-alpha1.

I'm not sure if this is a Maven issue or a Clojure issue. Running Maven with debug on seems to show that it's using the correct version of Clojure.

I have created a dummy project that reproduces this issue if you are interested.

https://github.com/deaddowney/UpdateProblem

Check it out, run "mvn install", and you will get
java.lang.RuntimeException: No such var: korma.core/update
.





[CLJ-1603] cycle, iterate, repeat return vals should IReduceInit Created: 25/Nov/14  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Stuart Halloway
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File clj-1603-10.patch     Text File clj-1603-11.patch     Text File clj-1603-12-2.patch     Text File clj-1603-12.patch     Text File clj-1603-13.patch     Text File clj-1603-14.patch     Text File clj-1603-15.patch     Text File clj-1603-2.patch     Text File clj-1603-3-2.patch     Text File clj-1603-3-3.patch     Text File clj-1603-3-4.patch     Text File clj-1603-3.patch     Text File clj-1603-4.patch     Text File clj-1603-5.patch     Text File clj-1603-6.patch     Text File clj-1603-7.patch     Text File clj-1603-8.patch     Text File clj-1603-9.patch     Text File clj-1603.patch    
Patch: Code and Test
Approval: Ok

 Description   

Screened

clj-1603-15.patch (Java approach)

Alternatives:

There were a number of possible approaches for these enhancements:
1) Straight Java impl. The benefit of this approach is improving the performance of both the seq and reduce paths at the expense of writing a bunch of Java. See clj-1603-15.patch for the best of these impls.

2) Clojure deftype. This required moving cycle and iterate and providing a repeat1 implementation until deftype is defined (similar to the approach with reduce1). The deftype version returns a Seqable, IReduce object and has effectively the same former implementation for seq with a new fast implementation for IReduce. This makes reduce paths fast, but leaves seq paths about the same, with the benefit of no new Java code. The downside was that the new seqs did not implement the full range of interfaces from prior impls, which could potentially break code. See clj-1603-12-2.patch for a patch that covers this.

3) Add Iterable or IReduceInit directly to LazySeq. Conceptually, this does not make sense for general lazy seqs. Seqs materialize and cache each value once and doing this along with the ability to iterate/reduce introduces issues with caching (might as well use seqs for that) and synchronization. However, doing this for just these specific lazy seqs is possible - see latest patch.

Approach: Latest patch makes LazySeq implement IReduce and internal macro reducible-lazy-seq lets callers supply a function to implement the reduce path.

A few things to note:

  • Added some example-based tests for iterate, cycle, and repeat where I thought they were needed. Did not add generative tests - not clear to me what these would be that would actually be valuable. All of these functions are pretty simple and the examples cover the special cases.

Performance:

Some example timing, all in µs:

Expression 1.6.0 1.7.0-alpha5 master + clj-1603-15 (Java) master + clj-1603-12-2 (deftype) master + clj-1603-14 (split)
(doall (take 1000 (repeat 1))) 87 93 63 89 92
(into [] (take 1000) (repeat 1)) n/a 67 25 27 33
(doall (repeat 1000 1)) 87 94 16 94 89
(into [] (repeat 1000 1)) 99 110 13 12 12
(reduce + 0 (repeat 1000 1)) 99 126 20 22 25
(into [] (take 1000) (repeat 1)) n/a 67 28 33 27
(doall (take 1000 (cycle [1 2 3]))) 101 106 85 108 103
(into [] (take 1000) (cycle [1 2 3])) n/a 73 38 45 44
(doall (take 1000 (iterate inc 0))) 93 98 75 123 116
(into [] (take 1000) (iterate inc 0)) n/a 85 38 40 39

Notes on timings above:

  • All reduce timings (with into) comparable across the impls and significantly better than the current behavior over seqs.
  • The Java impl is faster across the board with doall. doall repeatedly calls seq() and next() to walk the sequence. The Java class versions of Repeat, Cycle, and Iterate extend ASeq and seq() just return this. next() constructs and returns a new instance of the class, which is immutable. In the lazy seq versions, LazySeq is mutable and requires synchronization and handling the caching safely. So, simple immutable instance ftw here.
  • The Java finite repeat has an extra benefit from using a primitive long for the counter.
  • One performance difference that's not visible in the timings is that the Java implementations have the benefit of being both seqs and reducibles as they are traversed, so you can always get a fast reduce. The deftype and split impls are only reducible at the initial instance, walking off that initial head reverts to lazy seqs that are not quickly reducible.

Patch: The two patches in leading contention are:

  • clj-1603-15.patch - for the Java version
  • clj-1603-14.patch - for the split impl

Alex opinion: I have swung back and forth on this but my current recommendation is for the Java implementation (clj-1603-15.patch). It's faster for both seqs and reduce, both in the timings above and importantly in maintaining reducibility as they are traversed. There is more Java, but I've made my peace with that - the code maximally leverages existing ASeq infrastructure and the implementation is easy to understand.



 Comments   
Comment by Ghadi Shayban [ 25/Nov/14 11:01 AM ]

Stu, do you intend these to be in Java or Clojure? It could be trickier to implement in Clojure directly, as loading would have to be deferred until core_deftype loads. It's certainly tractable without breaking any backwards compatibility, and I've explored this while experimenting with Range as a deftype https://github.com/ghadishayban/clojure/commit/906cd6ed4d7c624dc4e553373dabfd57550eeff2

A macro to help with Seq&List participation could be certainly useful, as efficiently being both a Seq/List and IReduceInit isn't a party.

May be useful to list requirements for protocol/iface participation.

It seems like 'repeatedly' is another missing link in the IReduceInit story.

Rich mentioned the future integration of reduce-kv at the conj, it would also be useful to know how that could fit in.

---- Other concerns and ops that may belong better on the mailing list ----

In experimenting with more reducible sources, I put out a tiny repo (github.com/ghadishayban/reducers) a couple weeks ago that includes some sources and operations. The sources were CollReduce and not ISeq.

Relatedly, caching the hashcode as a Java `transient` field is not supported when implementing a collection using deftype (patch w/ test in CLJ-1573).

Sources:
Iterate was one of them https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51
Repeatedly https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51

Reduce/transduce-based Operations that accept transducers:
some, any, yield-first https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L52-L80
(any could use a better name, equiv to (first (filter...)))
some and any have a symmetry like filter/remove.

Novelty maybe for 1.8:
A transducible context for Iterables similar to LazyTransformer:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L157-L161

The unless-reduced macro was very useful in implementing the collections:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L7-L15
It is different than the ensure-reduced and unreduced functions in core.

Comment by Alex Miller [ 25/Nov/14 12:01 PM ]

When we discussed this in the past, it was in the vein of reusing some of the range work (in Java) to implement cycle and iterate (per CLJ-1515).

Comment by Ghadi Shayban [ 25/Nov/14 9:20 PM ]

Never mind about 'repeatedly'. Being both ISeq and IReduceInit for repeatedly doesn't make sense for something that relies on side-effects. Current users of repeatedly can reduce over it many times and only realize the elements once.

Comment by Alex Miller [ 05/Dec/14 11:17 PM ]

attached wip Java impl and posted some example timings

Comment by Ghadi Shayban [ 11/Dec/14 4:35 PM ]

NB iterate in this patch does not cache the realized ISeq, but recalcs it at every call to realize the tail. This is not a change in the promised behavior (docstring says "f must be side-effect free") but an implementation change, as worth noting in the changelog.

Comment by Stuart Halloway [ 02/Jan/15 1:32 PM ]

It looks like all the reduce with no inital value paths are still seq-y, and slower, as shown by e.g.

(dotimes [_ 10]
  (time
   (reduce
    +
    (repeat 10000000 1))))

(dotimes [_ 20]
  (time
   (reduce
    +
    0
    (repeat 10000000 1))))
Comment by Alex Miller [ 02/Jan/15 2:01 PM ]

On that example in master before / after patch I see:

before:

  • no init = 844 ms
  • with init = 920 ms

after:

  • no init = 124 ms
  • with init = 90 ms

Is that similar to what you see or not?

Comment by Alex Miller [ 02/Jan/15 4:21 PM ]

The clj-1603-3.patch has been updated to use effectively the same algorithm for both versions of reduce. With the -3 patch, I got ~96 ms on both examples in the prior comment. I re-ran the tests in the description and updated those as well (about the same as expected).

Comment by Stuart Halloway [ 16/Jan/15 1:18 PM ]

The tests do not seem to hit the unseeded reduce branches – do we even want these branches? The original ticket was for IReduceInit.

Comment by Michael Blume [ 18/Jan/15 1:48 PM ]

Probably worth noting – Git will happily apply the latest patch for CLJ-1603 on top of the latest patch for CLJ-1515, but the result does not compile because 1515 uses iterate and 1603 moves the definition of iterate lower in clojure.core. Not sure if this is worth fixing now or just noting for when they're actually applied.

Comment by Michael Blume [ 18/Jan/15 1:52 PM ]

Actually, here, this just moves the declare statement further up the file.

Comment by Michael Blume [ 18/Jan/15 2:19 PM ]

OK, no, the two patches are still incompatible even with the declaration order fixed:

[java] ERROR in (test-range) (LongRange.java:95)
     [java] expected: (= (take 3 (range 3 9 0)) (quote (3 3 3)))
     [java]   actual: java.lang.ClassCastException: clojure.core.InfiniteRepeat cannot be cast to clojure.lang.ISeq
     [java]  at clojure.lang.LongRange.create (LongRange.java:95)
Comment by Alex Miller [ 18/Jan/15 2:31 PM ]

The 1515 patch is actually being reworked right now - we will patch things up at application time if needed.

Comment by Alex Miller [ 19/Jan/15 10:12 AM ]

Removing screened marking so can be re-screened. Added new -7 patch that handles print-method, print-dup, and unmapping the deftype constructors so they're not visible. Thanks to Ghadi in CLJ-1515 for the idea on those.

Comment by Ghadi Shayban [ 20/Jan/15 8:08 AM ]

Review of -7 patch:

Seqable/seq implementations that return a separate ISeq like these do should forward a call to seq on the result, like eduction does. [1] (This is not necessary in these particular impls, as the LazySeqs returned are themselves ISeqs. Also because Cycle's deftype is only constructed for non-empty cycles, the fact that there is a guaranteed seq is implicit. Probably a best practice to add an innocuous seq call if users look to these impls as a recipe.)

The performance regression in (doall (repeat 1000 1)) should go away completely with the dorun tweak in CLJ-1515. This is because dorun is effectively calling seq twice (it calls seq, throws away the result, then calls next.)

minor nits
1) repeat1 seems to be identical to repeat-seq and has both arities necessary for the deftypes
2) inside FiniteRepeat s/(> i 0)/(pos? i) also inside the repeat constructor
3) some things are defn- , some are ^:private
4) Cycle/reduce the recur binding can be (recur rr (or (next s) coll)) rather than nil? check

[1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7324

Comment by Alex Miller [ 22/Jan/15 10:34 PM ]

Ghadi - good comments! Fixed 1,2,4. #3 - ^:private is because defn- is not yet defined. New -8 patch.

Comment by Alex Miller [ 23/Jan/15 10:03 AM ]

Bah.

user=> (= (repeat 5 :a) (repeat 5 :a))
false
Comment by Alex Miller [ 23/Jan/15 3:04 PM ]

Updated to -9 patch that handles hash and equality for finite repeat case.

Comment by Ghadi Shayban [ 26/Jan/15 2:24 PM ]

metadata in the wrong place on #'repeat1

Comment by Alex Miller [ 26/Jan/15 3:27 PM ]

Thanks, fixed in -10.

Comment by Stuart Halloway [ 20/Feb/15 10:19 AM ]

The collections returned by these APIs promise several things the new deftypes do not deliver. For example, 1.6's repeat currently has the following ancestors that are lost in the propopsed deftype:

  • clojure.lang.ISeq
  • java.io.Serializable
  • clojure.lang.IPending
  • java.util.Collection
  • java.util.List
  • java.lang.Iterable
  • clojure.lang.Obj
  • clojure.lang.IPersistentCollection
  • clojure.lang.IMeta
  • clojure.lang.IObj

Losing metadata and serializability are certainly regressions, other stuff maybe as well. I suspect similar problems in all the other API returns.

We could improve testing by taking advantage of the property-checking fns already in test-clojure/data-structures, e.g. is-same-collection

Comment by Alex Miller [ 20/Feb/15 10:42 AM ]

I think we should consider carefully what is contractually required by the return of repeat. My opinion is that repeat must return something seqable, not literally a seq (or lazy seq). With that perspective, ISeq, IPending (there re lazy seq laziness), Collection, List, Iterable, and IPersistentCollection are non-essential. Obj is a concrete class and we shouldn't commit to that.

  • IObj is a gap, this should work but doesn't: (with-meta (repeat 1 :a) {:a 1})
  • IMeta doesn't need to be added, will never have meta right now and meta handles this correctly
  • Serializable - doesn't make sense for the infinite seqs but should probably fix for finite range
Comment by Alex Miller [ 20/Feb/15 11:03 AM ]

Added -11 patch that adds IObj for all the new deftypes and Serializable for FiniteRepeat. Still need to add more tests.

Comment by Stuart Halloway [ 20/Feb/15 12:27 PM ]

I think all the java. interfaces are mandatory for interop. Leaving out any one of the clojure. interfaces creates observable change in behavior composing with other core fns (admittedly the IPending case would be bizarre, who uses that?), so those seem mandatory too.

Agreed Obj should be irrelevant, and if it isn't the bug is elsewhere.

Comment by Alex Miller [ 27/Feb/15 9:41 AM ]

pending further discussion w/ stu

Comment by Alex Miller [ 13/Mar/15 7:02 AM ]

for a clean run and more explanation of the perf numbers

Comment by Alex Miller [ 17/Mar/15 4:23 PM ]

refreshed numbers in table

Comment by Alex Miller [ 18/Mar/15 9:24 AM ]

Refreshed perf numbers and older patch variants, added some more explanation of the perf numbers for comparison.

Comment by Stuart Halloway [ 22/Mar/15 7:50 AM ]

Screened clj-1603-3-3.patch

Comment by Nicola Mometto [ 23/Mar/15 2:00 PM ]

Shouldn't iterate cache its next slot? the current implementation recalculates it every time:

user=> (def coll (iterate #(do (println %) (inc %)) 0))
#'user/coll
user=> (nth coll 3)
0
1
2
3
user=> (nth coll 3)
0
1
2
3

I realize the docstring for iterate explicitely warns that f must be side-effect free but this is just for demonstration purposes, imagine that f is a computationally-heavy function and you'll understand why I don't think the new proposed behaviour is desiderable.

Comment by Nicola Mometto [ 23/Mar/15 2:26 PM ]

I attached clj-1603-3-4.patch which is the same as clj-1603-3-3.patch except it caches the next slot of iterate.

Here is the diff between -3-3 and -3-4 for ease of review:

diff --git a/src/jvm/clojure/lang/Iterate.java b/src/jvm/clojure/lang/Iterate.java
index f23ddca..aeef998 100644
--- a/src/jvm/clojure/lang/Iterate.java
+++ b/src/jvm/clojure/lang/Iterate.java
@@ -16,6 +16,7 @@ public class Iterate extends ASeq implements IReduce {
 
 private final IFn f;      // never null
 private final Object seed;
+private volatile ISeq _next;
 
 private Iterate(IFn f, Object seed){
     this.f = f;
@@ -37,7 +38,14 @@ public Object first(){
 }
 
 public ISeq next(){
-    return new Iterate(f, f.invoke(seed));
+    ISeq ret = _next;
+    if (ret == null)
+        synchronized(this) {
+            ret = _next;
+            if (ret == null)
+                _next = ret = new Iterate(f, f.invoke(seed));
+        }
+    return ret;
 }
 
 public Iterate withMeta(IPersistentMap meta){
Comment by Alex Miller [ 23/Mar/15 3:32 PM ]

It's a good question. That impl kills all of the performance improvement for iterate on the seq use case (prob the current common case). I'll take a look at it though.

Comment by Nicola Mometto [ 23/Mar/15 3:57 PM ]

My opinion is that the performance improvement for iterate that the benchmark shows won't reflect a real performance improvement in "real" usage cases of iterate where `f` actually does some computation

Comment by Nicola Mometto [ 23/Mar/15 4:04 PM ]

This is also true for its reduce impl btw.
I'm sorry if I'm pointing this out late in the life of this ticket but it never occurred to me that, contrary to cycle, repeat or range (CLJ-1515), this change for iterate will possibly make accessing large sequences slower since we lose the caching aspect of lazy-seqs.

Comment by Alex Miller [ 23/Mar/15 4:58 PM ]

Added new -15 patch, which is the same as -3-3 but caches _next for all three types. This has no impact on the reduce performance and was 2-8 us slower in the timings for the seqs above. That seems like a small hit for the reduction in perf and memory on all cases where the seq is traversed multiple times. That seems like a reasonably common thing to do, particularly for finite repeat.

Comment by Alex Miller [ 23/Mar/15 5:02 PM ]

Stu - the only change in -15 vs -3-3 is in the next() methods of each new type. Each has a new "private volatile ISeq _next;"

Cycle:

public ISeq next(){
    if(_next == null) {
        ISeq next = current.next();
        if (next != null)
            _next = new Cycle(all, next);
        else
            _next = new Cycle(all, all);
    }
    return _next;
}

Iterate:

public ISeq next(){
    if(_next == null) {
        _next = new Iterate(f, f.invoke(seed));
    }
    return _next;
}

Repeat:

public ISeq next() {
    if(_next == null) {
        if(count > 1)
            _next = new Repeat(count - 1, val);
        else if(count == INFINITE)
            _next = this;
    }
    return _next;
}
Comment by Michael Blume [ 23/Mar/15 6:47 PM ]

For Cycle, would it be worth holding the head and linking back to it when we loop, so that the underlying sequence only has to be traversed once? Something like this

https://github.com/MichaelBlume/clojure/commit/cache-cycle-head

On my machine this saves about 10 microseconds on

(doall (take 1000 (cycle [1 2 3])))

Comment by Michael Blume [ 23/Mar/15 7:07 PM ]

Also, for the Cycle reduce implementations, it might make sense to assume the underlying sequence is going to be traversed many times, copy it into an array, and loop over it with an index.

Comment by Michael Blume [ 23/Mar/15 7:17 PM ]

like so: https://github.com/MichaelBlume/clojure/commit/array-loop

Comment by Alex Miller [ 23/Mar/15 8:28 PM ]

The problem with holding a pointer back to the head is that the intervening cycle can be arbitrarily large. You are then literally holding the head on an arbitrary size seq. A silly example that fails with the patch but works without it (depending on your jvm size):

(dorun (take 10000000 (cycle (range 10000000))))

I played with the cycle reduce too but it has the same effective potential issue of creating and holding an arbitrarily sized array. Actually this one is worse in that it realizes an arbitrarily large array before knowing how many inputs it will need - it could be a take 1. Also, things that formerly worked could blow up with an oome. What if someone relies on passing an infinite sequence to cycle in a fallthrough case? For example, this works with the current version but would blow up with the array.

(into [] (take 1000) (cycle (iterate inc 0)))

Neither of these seem worth the risk to me.

Comment by Michael Blume [ 23/Mar/15 10:53 PM ]

Yeah, fair enough.

Comment by Alex Miller [ 25/Mar/15 4:14 PM ]

updated version of the -15 patch that gives proper credit to Ghadi for part.





[CLJ-1602] vals and keys return values should implement IReduceInit or Iterable Created: 25/Nov/14  Updated: 27/Mar/15  Resolved: 27/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1602-2.diff     File clj-1602-3.diff     File clj-1602-4.diff     File clj-1602-5.diff     File clj-1602-6.diff     File clj-1602.diff    
Patch: Code and Test
Approval: Ok

 Description   

Background: clojure.core/keys calls RT.keys(Object) calls APersistentMap.KeySeq.create(ISeq). RT.keys() creates a sequence (of Map.Entry objects) and KeySeq just wraps it, calling .keyValue(). There is an equivalent vals -> RT.vals() -> ValSeq path. Both of these seq impls extend ASeq and provide Iterable implementations via SeqIterator (iterator wrapped over the seq).

Approach: The important thing here is to avoid creating the sequence and instead directly iterate/reduce over the map. Noting that CLJ-1499 provides support for making PHM directly Iterable and that KeySeq/ValSeq already implement Iterable, I chose to focus on making the instances returned from keys and vals support Iterable directly on the underlying map instead of via the seq.

RT.keys()/vals() created the seq and passed it to KeySeq/ValSeq which made it too late to directly cover the original map iterator. There are a few places that rely on passing a seq of Map.Entry to keys/vals (not just a map instance), so I check for IPersistentMap and in that case pass it directly to a new KeySeq factory method that remembers both the Iterable and the ISeq. There is also support in here for using a direct key/value iterator via IMapIterable as introduced in CLJ-1499.

Questions/notes:

  • Could potentially check for Map or Iterable instead of IPersistentMap in RT.keys()/vals(). Not sure how common it is to pass normal Java maps to keys/vals.
  • The direct Iterable support vanishes once you move off the head of the keys or vals seq. So (rest (keys map)) does not have Iterable support. This is not really possible unless you hold an Iterator and advance it along with the seq, but that seemed to introduce all sorts of possibilities for badness. Since maps are unordered, it seems weird to rely on any ordering or processing only parts of any map, so I suspect doing this would be quite rare.
  • This patch depends on CLJ-1499 for IMapIterable.

Performance: I tested perf using criterium to benchmark as follows:

(use 'criterium.core)
(def m (zipmap (range 1000) (range 1000)))
(bench (reduce + (keys m)))
(bench (reduce + (vals m)))
(bench expr) 1.6.0 1.7.0-alpha5 alpha5 + clj-1499 + patch
(reduce + (keys m)) 69 µs 73 µs 44 µs
(reduce + (vals m)) 75 µs 77 µs 50 µs

Tests:

  • I added some basic tests for subseq and rsubseq as those both rely on the somewhat special behavior of keys accepting a seqable of Map.Entry objects (not just a map itself). There were no other tests for subseq or rsubseq already present.
  • Some coverage from CLJ-1499 tests d760db
  • Show that metadata works on keys/vals
  • Show that iterator works correctly after you step off the head of the seq, invalidating the iterable
  • Show that the fallthrough branch of iterator works (from PersistentTreeMap)

Patch: clj-1602-6.diff

Screened by:

  • the changes in CollReduce are unlikely to stand but are currently essential


 Comments   
Comment by Alex Miller [ 25/Nov/14 11:53 AM ]

Could leverage CLJ-1499 for the bulk of this, may pull that back from 1.8 into 1.7. Waiting on further work till that's answered.

Comment by Alex Miller [ 03/Dec/14 11:24 PM ]

I also have a patch that extends the CLJ-1499 iterators to support providing both key and val iterators that do not require creating and unpacking a Map.Entry. Unfortunately I only saw times that were ~48 µs on the perf benchmark in the description, so it's not a huge benefit (short-lived object allocation is cheap).

Comment by Alex Miller [ 09/Jan/15 9:15 AM ]

waiting on 1499 mods

Comment by Alex Miller [ 14/Jan/15 5:42 PM ]

Updated patch based on latest CLJ-1499-v10 patch.

Comment by Michael Blume [ 15/Jan/15 1:48 AM ]

I find it concerning that the v9 patch passed generative tests, shouldn't we have seen that?

Comment by Alex Miller [ 15/Jan/15 7:21 AM ]

Actually I did see it in the generative tests for this ticket so I moved those tests into the CLJ-1499 patch.

Comment by Michael Blume [ 15/Jan/15 10:10 AM ]

Ah, cool =)

Comment by Stuart Halloway [ 22/Mar/15 8:16 AM ]

clj-1602-5.diff merely updates -4 to apply on current master

Comment by Stuart Halloway [ 22/Mar/15 8:37 AM ]

requested some tests in the description

Comment by Alex Miller [ 23/Mar/15 1:51 PM ]

Updated to -6 patch that adds the requested tests.





[CLJ-1601] transducer arities for map-indexed, distinct, and interpose Created: 25/Nov/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: transducers

Attachments: Text File clj-1601-2.patch     Text File clj-1601-3.patch     Text File clj-1601-4.patch     Text File clj-1601.patch     Text File clj-1601-transient-distinct.patch    
Patch: Code and Test
Approval: Ok

 Description   
  • with generative tests
  • with examples demonstrating performance

Performance: Details in comments, summary:

(def v (vec (concat (range 1000) (range 1000))))
(into [] (distinct v))            ;; 821.3 µs
(into [] (distinct) v)            ;; 388.2 µs
(into [] (interpose nil v))       ;; 316.0 µs
(into [] (interpose nil) v)       ;; 35.5 µs
(into [] (map-indexed vector v))  ;; 76.8 µs
(into [] (map-indexed vector) v)  ;; 49.4 µs

Patch: clj-1601-4.patch

Screening note: We could use transients to improve performance of the distinct impl, except checking containment in a transient set is broken per CLJ-700 (which is not currently in 1.7). I have a new patch and direction on CLJ-700 that could provide a way to solve that if we want to move it back and push this further. Or we could just wait and refactor when CLJ-700 does go in.

Screened by:



 Comments   
Comment by Alex Miller [ 25/Nov/14 11:54 AM ]

working on this

Comment by Alex Miller [ 25/Nov/14 4:22 PM ]

Initial patch with impls. Tests and perf still to do.

Comment by Alex Miller [ 27/Nov/14 7:09 AM ]

Perf tests, summarized in description:

user=> (use 'criterium.core)
nil
user=> (def v (vec (concat (range 1000) (range 1000))))
#'user/v
user=> (quick-bench (into [] (distinct v)))
WARNING: Final GC required 10.433088780213309 % of runtime
Evaluation count : 744 in 6 samples of 124 calls.
             Execution time mean : 821.339608 µs
    Execution time std-deviation : 11.351053 µs
   Execution time lower quantile : 811.901435 µs ( 2.5%)
   Execution time upper quantile : 837.972000 µs (97.5%)
                   Overhead used : 1.794010 ns
nil
user=> (quick-bench (into [] (distinct) v))
WARNING: Final GC required 10.78492057474076 % of runtime
Evaluation count : 14028 in 6 samples of 2338 calls.
             Execution time mean : 43.630656 µs
    Execution time std-deviation : 170.185825 ns
   Execution time lower quantile : 43.433193 µs ( 2.5%)
   Execution time upper quantile : 43.853959 µs (97.5%)
                   Overhead used : 1.794010 ns
				   
user=> (quick-bench (into [] (interpose nil v)))
WARNING: Final GC required 10.79555726490133 % of runtime
Evaluation count : 1914 in 6 samples of 319 calls.
             Execution time mean : 316.024853 µs
    Execution time std-deviation : 9.077484 µs
   Execution time lower quantile : 310.139273 µs ( 2.5%)
   Execution time upper quantile : 330.917486 µs (97.5%)
                   Overhead used : 1.794010 ns

Found 1 outliers in 6 samples (16.6667 %)
	low-severe	 1 (16.6667 %)
 Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
nil
user=> (quick-bench (into [] (interpose nil) v))
WARNING: Final GC required 10.70401297525592 % of runtime
Evaluation count : 17022 in 6 samples of 2837 calls.
             Execution time mean : 35.592672 µs
    Execution time std-deviation : 560.066138 ns
   Execution time lower quantile : 35.252348 µs ( 2.5%)
   Execution time upper quantile : 36.553414 µs (97.5%)
                   Overhead used : 1.794010 ns

Found 1 outliers in 6 samples (16.6667 %)
	low-severe	 1 (16.6667 %)
 Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
nil

user=> (quick-bench (into [] (map-indexed vector v)))
WARNING: Final GC required 12.45755646853723 % of runtime
Evaluation count : 7338 in 6 samples of 1223 calls.
             Execution time mean : 76.807691 µs
    Execution time std-deviation : 381.019170 ns
   Execution time lower quantile : 76.433202 µs ( 2.5%)
   Execution time upper quantile : 77.170733 µs (97.5%)
                   Overhead used : 1.794010 ns
nil
user=> (quick-bench (into [] (map-indexed vector) v))
WARNING: Final GC required 11.38700971837483 % of runtime
Evaluation count : 12474 in 6 samples of 2079 calls.
             Execution time mean : 49.458043 µs
    Execution time std-deviation : 620.716737 ns
   Execution time lower quantile : 48.995801 µs ( 2.5%)
   Execution time upper quantile : 50.229507 µs (97.5%)
                   Overhead used : 1.794010 ns
Comment by Alex Miller [ 17/Dec/14 1:50 PM ]

Updated based on comment from Christophe Grand that java.util.HashSet used in distinct impl had different hash/equality semantics than the set used with sequences.

Comment by Nikita Prokopov [ 21/Dec/14 6:13 AM ]

This can be further improved by using transient set instead of persistent one in distinct:

distinct with persistent set, w/o transducers:  904.932406 µs
distinct with transient set,  w/o transducers:  755.338598 µs
distinct with persistent set, with transducers: 452.170600 µs
distinct with transient set,  with transducers: 293.258473 µs

Only caveat is that transient sets do not support contains? for now (see CLJ-700). This can be solved by using (.contains ^clojure.lang.ITransientSet set key)

I’m not sure what’s the best way to attach patch to this, for now attaching a patch that can be applied on top of Alex changes (clj-1601-transient-distinct.patch).

Comment by Alex Miller [ 22/Dec/14 8:32 AM ]

Hey Nikita, I'd rather fix CLJ-700 and use the normal functions rather than what you've done in the patch, which is why I hadn't done this before. I'm waiting to check with Rich whether we'll do that in 1.7 or wait till next release.

Comment by Michael Blume [ 09/Jan/15 10:23 AM ]

1601-3 no longer applies cleanly to master, I've got a reroll that does, is it ok to attach it even though the ticket is marked 'screened'?

Comment by Alex Miller [ 09/Jan/15 10:27 AM ]

Updated tests to apply cleanly to current master in -4 patch.

Comment by Alex Miller [ 09/Jan/15 10:33 AM ]

Ha, didn't see your comment Michael! I was working on the same thing.





[CLJ-1600] calling hashCode on clojure.lang.LazyTransformer causes a StackOverflowError Created: 24/Nov/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Sam Ritchie Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: transducers
Environment:

OS X 10.10.1, Macbook Pro,, Java 1.8.0_11, Clojure 1.7.0-alpha4


Attachments: Text File CLJ-1600-2.patch     Text File CLJ-1600.patch    
Patch: Code and Test
Approval: Ok

 Description   

Calling .hashCode on a an instance of clojure.lang.LazyTransformer causes a StackOverflowError:

user> (.hashCode (sequence (map identity) ["s"]))
StackOverflowError   clojure.lang.Util.hash (Util.java:161)

The trace is

Util.java:  161  clojure.lang.Util/hash
          LazyTransformer.java:  216  clojure.lang.LazyTransformer/hashCode
                     Util.java:  161  clojure.lang.Util/hash
          LazyTransformer.java:  216  clojure.lang.LazyTransformer/hashCode
                     Util.java:  161  clojure.lang.Util/hash
          LazyTransformer.java:  216  clojure.lang.LazyTransformer/hashCode
                     Util.java:  161  clojure.lang.Util/hash
          LazyTransformer.java:  216  clojure.lang.LazyTransformer/hashCode

Relevant lines:

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/LazyTransformer.java#L212
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Util.java#L161

Cause: Looks like "seq" returns "this":

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/LazyTransformer.java#L55

This does NOT occur on an empty sequence, as clojure.core/sequence short-circuits.

Proposal: compute and cache hash and hasheq using same algorithm used in other seqs

Patch: CLJ-1600-2.patch

Screened by: Alex Miller



 Comments   
Comment by Ghadi Shayban [ 25/Nov/14 1:18 AM ]

Patch with hashcode calculation and caching similar to ASeq. Might be worthwhile hoisting that into its own hashSeq method.

Comment by Alex Miller [ 25/Nov/14 10:13 AM ]

What's here looks good. Can we hook into existing tests that verify equals/hashcode and equiv/hasheq equivalence?

Comment by Ghadi Shayban [ 25/Nov/14 1:24 PM ]

Test case added. Verified case was failing with SO prior to patch.





[CLJ-1590] Some IReduce/IReduceInit implementors don't respect reduced Created: 14/Nov/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File 0001-ensure-IReduce-IReduceInit-implementors-respect-redu.patch     Text File clj-1537-gvec-ArraySeq.patch    
Patch: Code and Test
Approval: Ok

 Description   

Several reduce implementations don't properly respect reduced:

  • clojure.core.ArrayChunk's implementation of IChunk/reduce
  • VecSeq's impl of InternalReduce/reduce
  • APersistentVector's reduce with init doesn't unwrap reduced on last value
  • seqs of primitive arrays don't unwrap reduced on last value
  • PersistentList doesn't unwrap reduced on last value

Some examples:

user=> (transduce (take 1) conj (seq (long-array [1 2 3 4])))
#<Reduced@38f774f8: [1]>
user=> (.reduce (list 1 2 3 4 5) (fn [_ a] (if (= a 5) (reduced "foo"))) 1)
#<Reduced@753d01cc: "foo">

Patch: 0001-ensure-IReduce-IReduceInit-implementors-respect-redu.patch
Screened by: Alex Miller
See also: http://dev.clojure.org/jira/browse/CLJ-1537



 Comments   
Comment by Alex Miller [ 17/Nov/14 12:35 PM ]

The patch should only be considering the result of calling the reducing function, not checking the init value (this matches what we do elsewhere).

Also, needs at least some simple example tests.

Comment by Nicola Mometto [ 17/Nov/14 1:36 PM ]

While reworking my patch to address your comment, I discovered that PersistentList and APersistentList's IReduceInit/reduce implementation aren't handling correctly reduced when the reducing function returns one on the last iteration.

The attached patch fixes those too and contains testcases demonstrating the issues.

Comment by Nicola Mometto [ 17/Nov/14 1:39 PM ]

I haven't fixed the IReduce/IReduceInit implementations for range as that's in scope for http://dev.clojure.org/jira/browse/CLJ-1515

Comment by Nicola Mometto [ 17/Nov/14 1:59 PM ]

As Ghadi Shayban noticed, while reduce doesn't use IReduceInit's reduce impl for PersistentList, transduce does so this might be cause of serious bugs even from clojure code, not only when using `.reduce` calls

Comment by Alex Miller [ 17/Nov/14 2:47 PM ]

reduce will use IReduceInit's reduce impl for PersistentList, after CLJ-1572.





[CLJ-1589] Cleanup internal-reduce implementation Created: 14/Nov/14  Updated: 20/Feb/15  Resolved: 20/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File 0001-cleanup-internal-reduce-impl.patch    
Patch: Code
Approval: Ok

 Description   

Currently internal-reduce provides an implementation for ArraySeq and the ArraySeq_* prim classes.
Since those classes implement IReduce the current patch makes instances of those classes fallback on coll-reduce's IReduce impl (that simply invokes .reduce)

This change is desiderable because it removes unnecessary duplicated code, reducing the implementation surface and making it easier to follow reduce's code path. In addition to ArraySeq there will be (based on other tickets) more seq impls that also IReduce, so it would be good to re-route back through coll-reduce when we get combinations of potentially reducible sub-seqs.

Patch: 0001-cleanup-internal-reduce-impl.patch

  • This patch depends on the patch for CLJ-1590 since the current IReduce impl for those ArraySeq classes doesn't properly handle Reduced

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 14/Nov/14 10:28 PM ]

I'm not sure whether this should be in 1.7 or not, but I'm adding it there so we can have a discussion on it regardless.





[CLJ-1580] Transient collections should guarantee thread visibility Created: 05/Nov/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: transient

Attachments: Text File clj-1580-2.patch     Text File clj-1580.patch    
Patch: Code
Approval: Ok

 Description   

With changes from CLJ-1498, transients are still thread isolated but may move between threads during their lifetime which introduces new concurrency concerns, namely visibility of changes across threads.

Approach: Make all transient collection fields either final or volatile to ensure visibility across threads.

Patch: clj-1580-2.patch

Screened by:



 Comments   
Comment by Stuart Halloway [ 02/Jan/15 11:39 AM ]

Should ATtransientSet.impl be volatile also?

Comment by Alex Miller [ 02/Jan/15 11:57 AM ]

Added -2 patch that makes ATransientSet.impl volatile.





[CLJ-1578] 1.7.0-alpha3 breakage due to symbol conflicts Created: 31/Oct/14  Updated: 14/Nov/14  Resolved: 14/Nov/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Mike Anderson Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File 0001-CLJ-1578-don-t-throw-when-a-core-Var-replaces-anothe.patch    
Patch: Code
Approval: Ok

 Description   

I've been trying to build core.matrix with 1.7.0-alpha3 and I get a failures due to symbol conflicts with clojure.core (specifically the new update function).

java.lang.IllegalStateException: update already refers to: #'clojure.core.matrix.utils/update in namespace: clojure.core.matrix.impl.ndarray-magic
	at clojure.lang.Namespace.warnOrFailOnReplace(Namespace.java:88)
	at clojure.lang.Namespace.reference(Namespace.java:110)
	at clojure.lang.Namespace.refer(Namespace.java:168)
	at clojure.core$refer.doInvoke(core.clj:4071)
	at clojure.lang.RestFn.invoke(RestFn.java:439)
	at clojure.core.matrix.impl.ndarray_magic$eval9762$loading__5295__auto____9763.invoke(ndarray_magic.clj:1)
	at clojure.core.matrix.impl.ndarray_magic$eval9762.invoke(ndarray_magic.clj:1)

Simpler case to reproduce:

(ns foo)
(def inc dec) ;; gets a warning 
(ns bar (:require [foo :refer :all])) ;; gets another warning
(ns bar (:require [foo :refer :all])) ;; causes the exception (effectively a ns reload)

Cause: In the case of a load, foo/inc is replacing clojure.core/inc and that causes the expected warning. In the case of a reload, clojure.core/inc is replacing foo/inc - this case is not currently handled and falls into the error case.

Approach: In the case of clojure.core/inc replacing foo/inc (should only happen during a reload), ignore and issue neither warning or error.

Patch: 0001-CLJ-1578-don-t-throw-when-a-core-Var-replaces-anothe.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 01/Nov/14 7:12 AM ]

The warnings I would expect / the failures I would not. Can you boil down the reproduction of the exception somehow?

Comment by Nicola Mometto [ 01/Nov/14 7:32 AM ]

I have seen similar failures when re-compiling a namespace that shadows a core Var:

  • ns foo is created
  • ns foo maps 'update to #'clojure.core/update
  • ns foo interns 'update, the compiler emits a warning
  • ns foo now maps 'update to #'foo/update
  • ns foo is reloaded
  • ns foo tries to map 'update to #'clojure.core/update but it's already mapped to #'foo/update

The logic in clojure.lang.Namespace/warnOnReplace makes it so that shadowing a clojure.core Var produces a warning while shadowing a Var from another namespace produces an error, this is what happening after reloading the namespace.

I haven't looked into the core.matrix code but I highly suspect that's what's going on there.

Comment by Alex Miller [ 01/Nov/14 11:27 AM ]

Definitely interested in a patch for this for the special case of clojure.core.

Comment by Nicola Mometto [ 01/Nov/14 11:41 AM ]

The attached patch fixes this issue by making warnOrFailOnReplace silently ignore when a clojure.core Var shadows another Var, which should only happen on namespace reload.

Comment by Mike Anderson [ 03/Nov/14 12:29 AM ]

The simplest way I can find to reproduce the general issue at the REPL is as follows:

(ns foo)
(def inc dec) ;; gets a warning
(ns bar (:require [foo :refer :all])) ;; gets another warning
(ns bar (:require [foo :refer :all])) ;; causes the exception (effectively a ns reload)

Preventing the exception is the biggest priority, it would be really nice to be also suppress the warnings. There are often good reasons to re-use names in clojure.core so it shouldn't cause a non-suppressible warning.

Note that the Clojure library coding standards say "Use good names, and don't be afraid to collide with names in other namespaces" so it is very inconsistent to trigger warnings / exceptions when people do exactly this.





[CLJ-1572] into does not work with IReduceInit Created: 24/Oct/14  Updated: 10/Jan/15  Resolved: 10/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: transducers

Attachments: Text File clj-1572-2.patch     Text File clj-1572-3.patch     Text File clj-1572-4.patch     Text File CLJ-1572-alternative-POC.patch     Text File clj-1572.patch    
Patch: Code and Test
Approval: Ok

 Description   

This should work:

(into []
  (reify clojure.lang.IReduceInit
    (reduce [_ f start]
      (reduce f start (range 10)))))
IllegalArgumentException Don't know how to create ISeq from: user$eval5$reify__6
	clojure.lang.RT.seqFrom (RT.java:506)
	clojure.lang.RT.seq (RT.java:487)
	clojure.core/seq--seq--4091 (core.clj:135)
	clojure.core.protocols/seq-reduce (protocols.clj:30)
	clojure.core.protocols/fn--6422 (protocols.clj:42)
	clojure.core.protocols/fn--6369/f--6255--auto----G--6364--6382 (protocols.clj:13)
	clojure.core/reduce (core.clj:6469)
	clojure.core/into (core.clj:6550)

Cause: CollReduce only supports IReduce, not IReduceInit so when reduce calls into it, it falls back to trying to obtain a seq representation which fails.

Proposed: Extend CollReduce to IReduceInit and in the non-init arity, cast to IReduce. Also, now that CollReduce supports both IReduceInit and Iterable, a coll that implements both makes the path through CollReduce nondeterministic. transduce does an explicit check that prefers IReduceInit - the patch copies that approach to reduce as well.

Another consequence of this change is that since PersistentVector implements IReduce but throws on the non-init path, there are some test breakages. To address this, CLJ-1619 (which implements the non-init reduce) must be applied first.

Patch: clj-1572-4.patch
Depends on: CLJ-1619 being applied first



 Comments   
Comment by Alex Miller [ 24/Oct/14 10:40 AM ]

into calls reduce which calls into CollReduce. CollReduce extends to IReduce, but not to IReduceInit. If CollReduce were extended to IReduceInit for the arity it can support, into work as expected in the given example. Patch clj-1572.patch does this.

Comment by Ghadi Shayban [ 08/Nov/14 4:34 PM ]

It is also possible that core/reduce needs the same special casing of IReduceInit that transduce has to allow for a deterministic dispatch when transduce is called with (mapcat f), as mapcat calls reduce.

Comment by Stuart Halloway [ 10/Nov/14 11:02 AM ]

Can someone please expand on Ghadi's comment with an example of the problem?

Comment by Ghadi Shayban [ 10/Nov/14 11:14 AM ]

Example of something that is Iterable & ReduceInit:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L122-L128

Let's call that r/range in this example:
(transduce (mapcat r/range) + 0 [5 5 5 5 5])

The when the mapcat transducer encounters r/range, the inner reduce call will dispatch through CollReduce upon Iterable, rather than IReduceInit.

the inner call to reduce within cat:
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7243

Comment by Alex Miller [ 12/Nov/14 12:55 PM ]

To restate the issue from Ghadi for my own sake:

The CollReduce protocol extends to IReduce, IReduceInit and Iterable. Because these are all interfaces, its possible for a custom coll to implement two or more of them. In that case, Clojure will arbitrarily pick which protocol impl is called - this can result in the Iterable version being called instead of IReduce/IReduceInit (which should be preferred).

transduce avoids this by explicitly checking for IReduceInit and preferring it over CollReduce.

Ghadi is suggesting that reduce should also make this preference (currently it does not).

Comment by Nicola Mometto [ 17/Nov/14 3:06 PM ]

If CollReduce could be direcly backed by the IReduce interface, this would remove the need for explicit IReduceInit checking at the callsite.

It's already possible to (defprotocol CollReduce :on-interface clojure.lang.IReduce ..), I'm proposing adding the ability to map the "reduce" method to the coll-reduce protocol-fn aswell and go with this solution

Comment by Alex Miller [ 17/Nov/14 3:21 PM ]

CollReduce extends to two interfaces (IReduceInit and Iterable) and for some impls this is ambiguous under the CollReduce protocol. The check in reduce and transduce is to force the choice of IReduceInit so it is not ambiguous. I think your suggestion re-introduces that issue? Or maybe I'm just not understanding what you mean.

Comment by Nicola Mometto [ 17/Nov/14 3:46 PM ]

Turns out defprotocol already has that capability via :on metadata field.

The attached patch is a proof of concept of my proposal, if there's interest in this approach I can fix the deftype/record/reify method parser to automatically pick the var name rather than having to specify the method name.

Comment by Nicola Mometto [ 17/Nov/14 3:52 PM ]

Ah, I see now the issue. Disregard my patch then.

Comment by Ghadi Shayban [ 14/Dec/14 11:58 AM ]

Note that unless this patch is applied, a plain reduce over an Eduction goes through the seq/iterator path of CollReduce, and not eduction's native IReduceInit path.

Comment by Ghadi Shayban [ 17/Dec/14 5:03 PM ]

with this patch + CLJ-1546

(reduce + [1 2 3]) doesn't work anymore, breaking a few tests.

Comment by Ghadi Shayban [ 17/Dec/14 5:16 PM ]

Should have left a bit more detail.
https://github.com/clojure/clojure/commit/ad7d9c46992cac0e812ce3dd47584c9bb2fda11f

This might not have anything to do with CLJ-1546, just happened to have them both applied. Seems like vectors are both IReduce+IReduceInit, but throw on the IReduce impl.

Vectors were made IReduce before IReduce was split into IReduceInit.

Comment by Nicola Mometto [ 17/Dec/14 5:19 PM ]

I've opened CLJ-1619 with a patch implementing the no-init arity of reduce for PersistentVector

Comment by Nicola Mometto [ 17/Dec/14 5:20 PM ]

An alternative fix would be to just make PersistentVectors IReduceInit rather than IReduce but I don't see the point in doing that since the implementation is trivial.

Comment by Ghadi Shayban [ 22/Dec/14 3:04 PM ]

Nicola, that is my impression, that Rich intended PersistentVector to be IReduceInit but not IReduce. But he changed it before that interface was split. Would still need some sort of way to handle the existing no-init case, which he mentioned was unfortunate at the conj.

Comment by Nicola Mometto [ 22/Dec/14 3:11 PM ]

Ghadi, what would the rationale be for PV not supporting the no-init arity? I'm not aware of any technical issues caused by my patch for CLJ-1619 but maybe you know about one?

Comment by Ghadi Shayban [ 22/Dec/14 10:58 PM ]

No I may just be confused.

Rubber-ducking aloud so that I can be corrected:

A call to init reduce with an 'init' supplied is unambiguous.

It's the responsibility of an IReduce to do what is appropriate with 'f', whereas with "improved" reduce and transduce (f) becomes the init. (c.c.reducers/reduce being the improved reduce)

IReduce implementations must preserve compatibility with core/reduce's docstring in the 0 and 1 arity cases.

If coll contains no
items, f must accept no arguments as well, and reduce returns the
result of calling f with no arguments. If coll has only 1 item, it
is returned and f is not called. If val is supplied, returns the
result of applying f to val and the first item in coll, then
applying f to that result and the 2nd item, etc. If coll contains no
items, returns val and f is not called.

A protocol's dispatch is non-deterministic when invoked upon things with multiple paths. One way to resolve the ambiguities in CollReduce is to extend CollReduce directly to any IReduce/IReduceInit impl and not rely upon friends like Iterable.

For example CLJ-1515 needs the same CollReduce extension as CLJ-1603's Iterable/Repeat/Cycle got.

(extend-protocol p/CollReduce
  clojure.lang.LongRange
  (coll-reduce [this f] (.reduce this f)
  ..

  clojure.lang.Cycle
  ...)
;; etc.
Comment by Nicola Mometto [ 23/Dec/14 4:36 AM ]

I feel like all those issues introduced by the non-deterministic dispatch of protocol functions in case of multiple available implementations, could (and should?) be solved by a prefer-method-like capability for protocols.
This way we could say have a bunch of hints like (prefer-dispatch CollReduce IReduce Iterable) and be done with it.





[CLJ-1571] Transducer of partition-by over take gives wrong answer Created: 20/Oct/14  Updated: 21/Oct/14  Resolved: 21/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Rich Hickey
Resolution: Completed Votes: 1
Labels: transducers

Attachments: Text File 0001-CLJ-1571-fix-regression-introduced-by-43cc1854508d65.patch     Text File CLJ-1571.patch    
Approval: Ok

 Description   
(partition-by pos? (take 2 [-1 1]))
=> ((-1) (1))
(sequence (comp (take 2) (partition-by pos?)) [-1 1])
=> ([-1])


 Comments   
Comment by Nicola Mometto [ 21/Oct/14 7:49 AM ]

Given that it works fine when using transduce instead of sequence, the bug might be in LazyTransformer rather than in partition-by.

(into [] (comp (take 2) (partition-by pos?)) [-1 1])
=> [[-1] [1]]
Comment by Ghadi Shayban [ 21/Oct/14 9:21 AM ]

Patch fixes the test case, but needs eyes, I certainly may have broken something. This highlights the importance of CLJ-1554, something similar to the existing defequiv tests for reducers, but between #'into and #'sequence, also covering edge cases in reduced unwrapping.

Comment by Alex Miller [ 21/Oct/14 9:41 AM ]

Thanks Ghadi. This bug was found by the tests I wrote for CLJ-1554, so yes.

Comment by Nicola Mometto [ 21/Oct/14 9:53 AM ]

Applying this patch causes a regression in the lazyiness of sequence.
The lines that Ghadi removed for this patch were added by Rich in this commit https://github.com/clojure/clojure/commit/43cc1854508d655e58e377f84836ba128971f90c to address http://dev.clojure.org/jira/browse/CLJ-1497

Example of the regression:
current master:

user=>  (sequence (take 2) (map #(do (println (str "~" %)) %) (iterate inc 1)))
~1
~2
(1 2)

with this patch:

user=>  (sequence (take 2) (map #(do (println (str "~" %)) %) (iterate inc 1)))
~1
~2
~3
(1 2)
Comment by Nicola Mometto [ 21/Oct/14 10:03 AM ]

Patch 0001-CLJ-1571-fix-regression-introduced-by-43cc1854508d65.patch addresses this issue while preserving the current lazyness factor of `sequence`

Comment by Alex Miller [ 21/Oct/14 11:09 AM ]

Rich has a (different) patch for this on the way.

Comment by Alex Miller [ 21/Oct/14 1:16 PM ]

Fixed directly by Rich in commit https://github.com/clojure/clojure/commit/38d7572e4254afdd7f02b78095ccdb27065754d2





[CLJ-1568] Incorrect error locations reported in the stacktrace Created: 19/Oct/14  Updated: 20/Feb/15  Resolved: 20/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Bozhidar Batsov Assignee: Unassigned
Resolution: Completed Votes: 19
Labels: errormsgs, ft

Attachments: Text File 0001-CLJ-1568-fix-incorrect-error-locations.patch     Text File 0001-CLJ-1568-fix-incorrect-error-locations-v2.patch     Text File clj-1568.patch    
Patch: Code and Test
Approval: Ok

 Description   

The following code produces an incorrect stacktrace:

(ns clojure-demo.core)

(defn foo
  "I don't do a whole lot."
  [x]
  (println x "Hello, World!"))

(/ 1 0)
Exception in thread "main" java.lang.ArithmeticException: Divide by zero, compiling:(clojure_demo/core.clj:6:31)

The problem is actually on the 8th line. As a matter of fact - there's nothing at location 6:31.
This is a pretty serious problem as many tools parse stacktraces for error locations.
Here's a related discussion in cider's issue tracker.

Patch: clj-1568.patch
Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 19/Oct/14 1:39 PM ]

Maybe a dupe of CLJ-1561 ?

Comment by Andy Fingerhut [ 19/Oct/14 4:16 PM ]

I tried out the example given in the description, with the latest Clojure master as of today plus the patch for CLJ-1561 called 0002-Mark-line-number-after-emitting-children.patch, dated Oct 10 2014.

The line:column number 6:31 is the same for that patched version as it is in the ticket description, which is for Clojure 1.6.0.

The issue of misleading line:column numbers is common between the two tickets, but at least the proposed improvement in CLJ-1561's patch is not effective for improving this issue.

Comment by Bozhidar Batsov [ 20/Oct/14 1:36 AM ]

I know that the issue list for 1.7 is pretty much finalised, but I think that this issue and and CLJ-1561 should be fixed as soon as possible.
Correct error reporting is extremely important IMO.

Comment by Nicola Mometto [ 20/Oct/14 8:28 AM ]

Attached a patch that fixes the issue by consuming all the whitespaces before retrieving line/column info for the next form.

Comment by Alex Miller [ 20/Oct/14 8:39 AM ]

Are there possible downsides to more eagerly consuming whitespace as done in the patch?

Comment by Nicola Mometto [ 20/Oct/14 8:44 AM ]

I can't think of any

Comment by Paul Stadig [ 22/Oct/14 2:59 PM ]

The defect on master does not have effect when using compile:

user=> (require 'clojure-demo.core)

CompilerException java.lang.ArithmeticException: Divide by zero, compiling:(clojure_demo/core.clj:6:31) 
user=> (load "/clojure_demo/core")

CompilerException java.lang.ArithmeticException: Divide by zero, compiling:(clojure_demo/core.clj:6:31) 
user=> (compile "clojure_demo/core")

CompilerException java.lang.ArithmeticException: Divide by zero, compiling:(core.clj:8:1) 

With the patch applied all the line numbers are the same in all cases:

user=> (require 'clojure-demo.core)

CompilerException java.lang.ArithmeticException: Divide by zero, compiling:(clojure_demo/core.clj:8:1) 
user=> (load "/clojure_demo/core")

CompilerException java.lang.ArithmeticException: Divide by zero, compiling:(clojure_demo/core.clj:8:1) 
user=> (compile "clojure_demo/core")

CompilerException java.lang.ArithmeticException: Divide by zero, compiling:(core.clj:8:1) 

Agreed that this seems to be orthogonal to CLJ-1561.

Comment by Bozhidar Batsov [ 10/Jan/15 3:08 AM ]

Seems we need to add tests before this can be merged - https://groups.google.com/forum/#!topic/clojure-dev/7pFhG8LMvGo

Comment by Daniel Solano Gómez [ 11/Jan/15 12:20 AM ]

Well, I've been looking into adding tests to this patch, and I've made some interesting discoveries. The additions to Compiler.load() seem to work just fine. However, I'm not seeing much coming out of the changes to Compiler.compile. You can see in Paul's comment above that the compile call actually reports the correct location. I created a test that throws a NullPointerException during compilation, and in the case of compile, it is never wrapped in a CompilerException. I'll attach my test patch that contains the example code I have been working with.

Comment by Daniel Solano Gómez [ 11/Jan/15 12:26 AM ]

Patch with tests for code paths in the fix. The tests for the Compiler.compile changes are not showing what I had expected.

Comment by Alex Miller [ 12/Jan/15 8:07 AM ]

Whenever people feel this is ready for screening, just switch the Approval from Incomplete to Vetted.

Comment by Nicola Mometto [ 12/Jan/15 8:39 AM ]

Updated patch removing the changes to Compiler.compile as they seem to be useless, by Daniel's tests

Comment by Daniel Solano Gómez [ 14/Jan/15 10:26 AM ]

I have cleaned up my test code a bit and put together a combined patch that includes both the fix and the tests.

Comment by Bozhidar Batsov [ 14/Jan/15 11:12 AM ]

Great! Looking forward to seeing this merged.

Comment by Bozhidar Batsov [ 20/Feb/15 7:50 AM ]

Seems we can finally merge this!





[CLJ-1561] Incorrect line numbers are emitted Created: 10/Oct/14  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Paul Stadig Assignee: Unassigned
Resolution: Completed Votes: 24
Labels: errormsgs

Attachments: Text File clj-1561.patch    
Patch: Code and Test
Approval: Ok

 Description   

The Clojure JVM compiler marks the line number for a form before emitting the children for that form. Marking the line number before emitting children leads to incorrect line numbers when a runtime error occurs. For example, when

 (foo bar
      baz)

is emitted the compiler will visit the line number for the expression, then emit the children expressions ('bar' and 'baz') which will mark their own line numbers, then come back and emit the invoke bytecode for 'foo', but since the last line number to be marked was that of 'baz', if 'foo' throws an exception the line number of 'baz' will be reported instead of the line number for the expression as a whole.

This same issue was being manifested with special forms and inlined functions, and was especially bad in the case of the threading macro '->', because it is usually spread across several lines, and the line number reported could end up being very different than the line actually causing an exception.

A demonstration of the incorrect line numbers (and how the fix affects line numbers) can be seen here https://github.com/pjstadig/clojure-line-numbers

Patch: clj-1561.patch

Screened by: Alex Miller



 Comments   
Comment by Jozef Wagner [ 10/Oct/14 1:57 PM ]

additions in your patch mixes tabs and spaces. Could you please update the patch so that your added lines indent only with tab characters? Not everyone has tab set at 4 spaces...

Comment by Paul Stadig [ 10/Oct/14 2:42 PM ]

There's already a mixture of just tabs, just spaces, and tabs & spaces in Compiler.java. I'm not sure what the "standard" is, but I've changed the patch to match the surrounding lines.

Comment by Paul Stadig [ 10/Oct/14 2:42 PM ]

Patch with whitespace changes.

Comment by Alex Miller [ 20/Oct/14 8:38 AM ]

These changes will affect the line number tables for a variety of Clojure constructs when compiled. It would be very helpful to me to have a set of examples that covered each case touched in the patch so that I could compile them and look at the bytecode vs the source. This would greatly accelerate the screening process.

Comment by Paul Stadig [ 20/Oct/14 2:29 PM ]

Alex,
I have created a repo on github that has a sample file demonstrating the line number changes.

https://github.com/pjstadig/clojure-line-numbers

Hope that helps!

BTW, I'd be glad to do a skype call or hangout, if you have questions.

Comment by Alex Miller [ 20/Oct/14 2:34 PM ]

This is very helpful, thanks!!

Comment by Alex Miller [ 22/Oct/14 11:35 AM ]

In the hunk at 3191 in KeywordInvokeExpr, a call to visitLineNumber was added, but the prior call 4 lines earlier was not removed. Should it be?

Comment by Paul Stadig [ 22/Oct/14 12:05 PM ]

I left that in thinking that if something goes wrong with the getstatic instruction (null pointer exception? class cast exception?) it should report the line number of the KeywordInvokeExpr. It may be that there isn't a realistic possibility that anything could actually happen with that getstatic instruction, but that was the thought process.

My general rule of thumb was if an emit method emits any instructions before it calls the emit method on another expr, then it should mark its line number before and after the recursive emit call (assuming that the recursive emit call would mark its own line number). In cases where an emit method immediately calls another emit method, then I don't bother to mark a line number until afterwards.

Comment by Bozhidar Batsov [ 10/Jan/15 3:09 AM ]

Seems we need to add tests before this can be merged - https://groups.google.com/forum/#!topic/clojure-dev/7pFhG8LMvGo

Comment by Daniel Solano Gómez [ 11/Jan/15 8:42 PM ]

Here's an updated version of Paul's patch that applies cleanly to master.

I based my test code from his GitHub repository, but I have made a few changes:

  1. I have added test cases for when emitUnboxed or the reflecting branch of emit is called.
  2. The original keyword invocation test doesn't actually go through KeywordInvokeExpr. It is resolved by the change to InvokeExpr.emitArgsAndCall, so I have removed it.

Additionally, I have been unable to create test cases that verify that the changes for intrinsic predicates, KeywordInvokExprs, and for the changes inside of InvokeExpr.emit. These could be that they are redundant, but I am not sure.

Comment by Paul Stadig [ 12/Jan/15 5:51 AM ]

Thanks Daniel!

I applied the patches for my commit (as updated by Daniel) and Daniel's commit. The namespace docstring had shifted all the line numbers down by 4, and the tests were failing. I amended Daniel's commit with updated line numbers and combined both commits into a single patch file (clj-1561.patch).

I also reverted my commit and re-ran Daniel's tests to verify that they failed as expected.

I believe the combined patch file should include everything we need other than the difficult test cases that Daniel mentions in his comment.

Cheers!

Comment by Alex Miller [ 12/Jan/15 8:07 AM ]

Whenever people feel this is ready for screening, just switch the Approval from Incomplete to Vetted.

Comment by Paul Stadig [ 12/Jan/15 2:20 PM ]

I have added a test for KeywordInvokeExpr, and a new test to test the logic in InvokeExpr.emit.

I removed the changes I made to StaticMethodExpr.emitIntrinsicPredicate, since I was unable to find a way to create a test case. The issue is that we mark the lines of child expressions, then come back to the parent expression and try to invoke without marking its line, but in the case of intrinsic predicates there is no invocation. When we come back to the parent we directly emit some bytecodes. I don think the change I made in that case actually changed anything, but I removed it since there is no need for it.

Comment by Alex Miller [ 14/Jan/15 10:07 PM ]

Could we squash the patch into a single commit?

Comment by Alex Miller [ 15/Jan/15 9:21 AM ]

Also, in addition to squashing the patch, can we rename examples_clj_1561.clj to line_number_examples.clj or something w/o the ticket in it? I'm ready to mark as screened when those are done. (And if that can get done today, it's likely to get pushed forward tomorrow.)

Comment by Daniel Solano Gómez [ 15/Jan/15 9:24 AM ]

In case Paul doesn't get to it sooner, I can take care of it this afternoon.

Comment by Paul Stadig [ 15/Jan/15 10:40 AM ]

Alex,
I'd like to maintain attribution. I could squash my test related commits into Daniel's commit, so we'd have two commits, my commit for the change and Daniel's for the tests. Would that suffice? If so, I can take care of it now.

Comment by Alex Miller [ 15/Jan/15 11:13 AM ]

Sure, that's fine.

Comment by Alex Miller [ 15/Jan/15 11:14 AM ]

note that one of the commits has a whitespace error (I think Daniel's) that got removed in a later commit. In the end patch, please make sure it doesn't warn about any whitespace errors when applied.

Comment by Alex Miller [ 15/Jan/15 12:46 PM ]

Paul, this patch still seems to have examples_clj_1561.clj in it - would prefer name that is meaningful vs jira issue.

Comment by Paul Stadig [ 15/Jan/15 1:44 PM ]

Yeah, sorry, I'm still working on it. Should be up shortly.

Comment by Paul Stadig [ 15/Jan/15 2:05 PM ]

I squashed the commits down to two: my commit for the change, and Daniel's for the tests. I renamed the "examples_clj_1561" file to "line_number_examples". I removed some trailing whitespace from Daniel's commit. There are still whitespace errors (on my machine) from my commit because tabs are used for indenting. I used tabs to indent because that is how the lines surrounding were formatted, and Jozef had already complained about the formatting. It also appears that these whitespace errors are dependent on your git config. I have:

[core]
whitespace = trailing-space,space-before-tab,tab-in-indent

But when I comment that out of my gitconfig I get no whitespace errors. I guess the bottom line is if we want to agree on a standard config for that I'd be glad to accomodate, but I don't think we want to reformat Compiler.java, so the best choice seems to be to just match the surrounding context, which is what I've done. If you want me to do anything further with that, then let me know.

Ball is back in your court, Alex.





[CLJ-1557] Nested reduced is broken Created: 09/Oct/14  Updated: 29/Jan/15  Resolved: 10/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Christophe Grand Assignee: Rich Hickey
Resolution: Completed Votes: 0
Labels: transducers

Attachments: File re-reduced.diff    
Patch: Code
Approval: Vetted

 Description   

Re-reduced from composed transformation functions:

  • re-wraps the Reduced when it should not (take)
  • forget to unwrap the Reduced (partition-by, partition-all)
; nested reduced
=> (transduce (comp (take 1)) conj [:a])
[:a]
=> (transduce (comp (take 1) (take 1)) conj [:a])
#<Reduced@65979031: [:a]>
=> (transduce (comp (take 1) (take 1) (take 1)) conj [:a])
#<Reduced@fcbc8d1: #<Reduced@60bea99a: [:a]>>
=> (transduce (comp (take 1) (take 1) (take 1) (take 1)) conj [:a])
#<Reduced@6e9915bb: #<Reduced@5c712302: #<Reduced@472b9f70: [:a]>>>
 
; problems not appearing in all contexts
; not ok with transduce
=> (transduce (comp (partition-by keyword?) (take 1)) conj [] [:a])
#<Reduced@5156c42e: [[:a]]>
; but ok with sequence
=> (sequence (comp (partition-by keyword?) (take 1)) [:a])
([:a])
; well, not always
=> (sequence (comp (partition-by keyword?) (take 1)  (partition-by keyword?) (take 1)) [:a])
ClassCastException clojure.lang.Reduced cannot be cast to clojure.lang.LazyTransformer  clojure.lang.LazyTransformer$Stepper$1.invoke (LazyTransformer.java:104)

See also: https://groups.google.com/d/msg/clojure-dev/cWzMS_qqgcM/7IAhzMKzVigJ



 Comments   
Comment by Ghadi Shayban [ 09/Oct/14 11:11 PM ]

Same with partition-all

(transduce (comp (take 1) (partition-all 3) (take 1)) conj [] (range 15))
 #<Reduced@84f8976: [[0]]>
Comment by Christophe Grand [ 10/Oct/14 5:50 AM ]

patch for take, partition-by and partition-all

Comment by Nicola Mometto [ 29/Jan/15 12:09 PM ]

ticket was marked resolved but not closed





[CLJ-1554] Need to expand tests to cover transducers Created: 07/Oct/14  Updated: 14/Nov/14  Resolved: 14/Nov/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: transducers

Attachments: Text File clj-1554-2.patch     Text File clj-1554-3.patch     Text File clj-1554-4.patch     Text File clj-1554-5.patch     Text File clj-1554.patch    
Patch: Code and Test
Approval: Ok

 Description   

Attached patch contains both some generative and example tests for transducers. The generative tests build a series of sequence functions (take 5, filter odd?, etc) and apply them to a random vector of numbers as seq transformations, sequence of transducer, into of transducer, and transduce of transducer. The results are compared.

Note: these tests depend on the patch in CLJ-1349 to run as tests.

Patch: clj-1554-5.patch



 Comments   
Comment by Fogus [ 24/Oct/14 1:44 PM ]

I downloaded and applied this patch and its dependent patch (1349) and ran the tests. The coverage is a good start and the approach of verifying results against results gathered from other approaches is important. One note of style is that the use of `doall` is inconsistent in the `apply-as-*` functions. i would recommend that at least one other person screen this patch as my grasp of test.check is tenuous.

Comment by Alex Miller [ 24/Oct/14 2:52 PM ]

Updated patch slightly to clean up the doall stuff.

Comment by Guangyu Zhang [ 01/Nov/14 2:55 PM ]

What is clojure.test.check? You require it but never use it. This namespace doesn't exist, so I can't do individual test by (require 'clojure.test-clojure.transducers).

The error message:
CompilerException java.io.FileNotFoundException: Could not locate clojure/test/check__init.class or clojure/test/check.clj on classpath., compiling:(clojure/test_clojure/transducers.clj:1:1)

The way I used to do individual test is described in http://dev.clojure.org/display/community/Developing+Patches.

But there is no error when I run 'mvn package'.

Comment by Alex Miller [ 01/Nov/14 3:13 PM ]

As noted in the description, this patch depends on CLJ-1349 to be applied first.

Comment by Alex Miller [ 01/Nov/14 3:23 PM ]

After you apply CLJ-1349 you will also need to rerun antsetup.sh as it adds new dependencies.

Comment by Guangyu Zhang [ 02/Nov/14 12:43 AM ]

I did what you say, but the error still exists.
I can pass this test via 'ant test-example', but I can not do individual test.

To reproduce this problem:
Apply CLJ-1349 and CLJ-1554
$ ./antsetup.sh
$ ant
$ java -cp test:clojure-1.7.0-master-SNAPSHOT.jar clojure.main
user=> (require 'clojure.test-clojure.transducers)
CompilerException java.io.FileNotFoundException: Could not locate clojure/test/check__init.class or clojure/test/check.clj on classpath., compiling:(clojure/test_clojure/transducers.clj:1:1)

This should work:
$ java -cp /Users/guangyu/.m2/repository/org/clojure/test.check/0.5.9/test.check-0.5.9.jar:/Users/guangyu/.m2/repository/org/clojure/test.generative/0.5.1/test.generative-0.5.1.jar:test:clojure.jar clojure.main
user=> (require 'clojure.test-clojure.transducers)
nil

Maybe the document (http://dev.clojure.org/display/community/Developing+Patches) needs to be updated.

Comment by Guangyu Zhang [ 02/Nov/14 12:46 AM ]

There is no need to require clojure.test.check . I remove it and nothing happens.

Comment by Alex Miller [ 03/Nov/14 10:46 AM ]

That page is out of date with respect to running tests with either test.generative or test.check (which doesn't actually exist yet until CLJ-1349).

More complete recipe:

1. Apply CLJ-1349 and CLJ-1554 patches
2. ./antsetup.sh
3. ant
4. java -cp `cat maven-classpath`:target/classes:src:test clojure.main
5. (require 'clojure.test-clojure.transducers)
6. (clojure.test/run-tests 'clojure.test-clojure.transducers)

Works for me.

Confusingly, the patch in this test uses test.check, which is a generative test but run in the build (post CLJ-1349) as an example-based test. Stu and I are still talking about the best way to address that. One issue is that test.generative tests are time-based for intensity while test.check is iteration-based.

I will update the patch to remove the require of test.check.

Comment by Alex Miller [ 03/Nov/14 11:14 AM ]

I updated that testing page to cover test.generative as well.

Comment by Stuart Halloway [ 10/Nov/14 12:15 PM ]

Alex, would like to discuss two possible changes

  • make fbind create a symbolic rep of the work to do, so that failure messages are easier to read
  • whitelist the exceptions we expect, and check with a predicate in seq-and-transducer-same-result
Comment by Alex Miller [ 12/Nov/14 12:08 PM ]

Added new patch that whitelists only IllegalArgumentException and ClassCastException as the possible allowed exceptions in the transducer tests (they may vary between the transducer and non-transducer form).

The fbind does build a semantic description already in the :desc key which is used on error. Here's an example error - see the :actions key. That will be a list of the transformations applied (although shrinking often minimizes that list):

[java] Testing clojure.test-clojure.transducers
     [java] {:test-var seq-and-transducer, :result #<ExceptionInfo clojure.lang.ExceptionInfo: Applied actions to coll as seq, sequence transducer, and into transducer and got different results. {:coll [3 5 5 5 -2], :actions take 6, :s (3 5 5 5 -2), :xs (3 5 5), :xi [3 5 5], :xt [3 5 5]}>, :seed 1415806766835, :failing-size 6, :num-tests 7, :fail [[3 5 5 5 -2] [{:desc take 6, :xf #<core$take$fn__4550 clojure.core$take$fn__4550@4d186c57>, :seq #<core$partial$fn__4490 clojure.core$partial$fn__4490@44709ca4>}]], :shrunk {:total-nodes-visited 46, :depth 10, :result #<ExceptionInfo clojure.lang.ExceptionInfo: Applied actions to coll as seq, sequence transducer, and into transducer and got different results. {:coll [0 0], :actions take 2, :s (0 0), :xs (0), :xi [0], :xt [0]}>, :smallest [[0 0] [{:desc take 2, :xf #<core$take$fn__4550 clojure.core$take$fn__4550@5b938615>, :seq #<core$partial$fn__4490 clojure.core$partial$fn__4490@556733e4>}]]}}




[CLJ-1549] split IReduce Created: 06/Oct/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File clj-1549-2.patch     Text File clj-1549.patch    
Patch: Code
Approval: Ok

 Description   
  • IReduceInit should take arity-2 version from existing IReduce
  • IReduce should extend IReduceInit and add arity-1
  • new stuff should implement IReduceInit only (audit everything added for 1.7)
  • old stuff should not change or break

Patch: clj-1549-2.patch



 Comments   
Comment by Alex Miller [ 06/Oct/14 4:56 PM ]

Patch does as requested. Did not change the CollReduce extension which currently needs both arities:

(extend-protocol CollReduce
  ...

  clojure.lang.IReduce
  (coll-reduce
   ([coll f] (.reduce coll f))
   ([coll f val] (.reduce coll f val)))
Comment by Rich Hickey [ 07/Oct/14 8:29 AM ]

Can we please use the name IReduceInit instead of ILeftReduce?





[CLJ-1546] Widen vec to take Iterable/IReduce Created: 02/Oct/14  Updated: 11/Jan/15  Resolved: 10/Jan/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File clj-1546-2.patch     Text File clj-1546-3.patch     Text File clj-1546-4.patch     Text File clj-1546-5.patch     Text File clj-1546-6.patch     Text File clj-1546.patch     PNG File numbers.png    
Patch: Code and Test
Approval: Ok

 Description   

These examples should work but do not:

Something Iterable but not IReduce:

user> (def i (eduction (map inc) (range 100)))
#'user/i
user> (instance? java.util.Collection i)
false
user> (instance? Iterable i)
true
user> (vec i)
RuntimeException Unable to convert: class clojure.core.Iteration to Object[]

Something IReduceInit but not Iterable:

user=> (vec
  (reify clojure.lang.IReduceInit
    (reduce [_ f start]
      (reduce f start (range 10)))))
RuntimeException Unable to convert: class user$reify__15 to Object[]

Proposal: Add PersistentVector.create(Iterable) and PersistentVector.create(IReduceInit) to efficiently create PVs from those. See also the blog post http://insideclojure.org/2015/01/07/vec-perf/.

For performance, vec has several cases:
1) (vec) if vector?: return new vector w/o meta - this matches prior behavior but has a constant cost of a few ns, rather than linear cost. If not a vector, spill to LazilyPersistentVector.create(Object).

2) (LPV) instanceof IReduceInit: Anything reducible can reduce itself fastest. Right now this has a big benefit for PersistentList. on 1.7.0-alpha4 with list of size 1024, into=28 seconds, vec=18 seconds. After patch, vec=7 seconds. As more things become IReduce, they'll take this path as well. This is also the branch that handles the new Eduction and IReduceInit cases.

3) (LPV) instanceof ISeq: If the coll is a sequence already, best to just walk it rather than build an iterator or array from it. This calls into PersistentVector.create(ISeq). That implementation now contains an optimization to build into an array and construct the PersistentVector directly from the array for sequences <= 32 elements (which is common). Once that threshold is reached, it switches to building with transients. The benchmark shows that the patch makes vec substantially faster for all seqs and even faster than into in some cases.

4) (LPV) instanceof Iterable: For all non-Clojure collections (ArrayList) and current non-IReduce Clojure collections (PHM, PHS), this is fastest path. Iterators are preferred to seqs as they do not cache or hold onto the values as they go by. The PV.create() for Iterable uses transients. Due to slightly more overhead, small maps and sets are slightly slower but this is largely fixed by CLJ-1499 which adds direct iterators. ArrayLists with <= 32 are special-cased - we can toArray() and construct the PV with a seeded node in this case. This type and size is particularly common in real code. Even so, very small ArrayLists are a bit slower than they were due to increased number of conditional checks I think.

5) (LPV) otherwise RT.toArray(): catches Map, String, Object[], primitive array, etc. The important ones here are the arrays - they are slightly slower on small arrays due to overhead of checking more cases above, but big arrays are significantly faster than they were.

In addition, there was one hard-coded path in the Compiler into PersistentVector.create() and I re-routed that through LazilyPersistentVector instead as that code is now the place to choose the fastest path logic.

Patch: clj-1546-6.patch, see numbers.png for perf comparison

Screened by: Stu



 Comments   
Comment by Timothy Baldridge [ 02/Oct/14 9:44 AM ]

Is there a reason the final case for (vec something) can't just be a call to (into [] coll)? It seems a bit odd to do (to-array) on anything thats not a java collection or Iterable, when we have IReduce.

Comment by Rich Hickey [ 02/Oct/14 10:02 AM ]

re: Tim - yes, this needs to support IReduce (and thereby educe) as well

Comment by Alex Miller [ 14/Oct/14 9:56 AM ]

Added new patch that handles Iterable and IReduceInit in vec. It also makes calling with a vector much faster due to the first check. into is still faster for chunked seqs (due to special InternalReduce handling of chunking).

It would be possible to move more of the variant checking into LazilyPersistentVector or PersistentVector so it could be used in more contexts. I'm not sure how much to do with that.

It would also be possible to instead lean on reduce more from the Java side if there was a Java version of reduce (as defined in mikera's branch for http://dev.clojure.org/jira/browse/CLJ-1192 at https://github.com/mikera/clojure/compare/clj-1192-vec-performance. Something like that is the only way I can see of leveraging that same InternalReduce logic that makes into faster than vec.

Comment by Alex Miller [ 13/Nov/14 4:14 PM ]

Prior comments from Stu removed from description: "Open Question: Which branch should come first, Collection or IReduceInit? Collection reaches the fast path for small collections through LazilyPersistentVector, but IReduceInit should be faster for larger things. Related: Shouldn't the item count in LazilyPersistentVector be a bounded count?"

I have attached a new patch that simplifies the impl to do it in LazilyPersistentVector instead of in vec, which was easier due to "and" not being able yet when vec is implemented to do the length check.

I have also done a considerable amount of analysis on the matrix of incoming collections and best path to follow and also collected some data on what collections are commonly passed into vec. The current patch reflects those findings. Some highlights:

  • vec is called with PersistentVector in all projects I tested. The instanceof check takes that case from typically 100s of nanos to ~5 ns. So I do think it is worth doing.
  • vec is overwhelmingly called with small collections - in most cases the incoming collection is <10 elements. In cases where the collection is not a sequence, the path of creating the Vector with an owning array is the fastest option, beating even IReduce and transient building (as that path has some checks involved).
  • PersistentList is the only IReduce likely to be encountered by vec right now and adding that branch is a significant performance boost from prior impl and vs into. If maps and sets were IReduce, they would gain this as well.
  • chunked seqs will be significantly faster with into than vec as into goes through CollReduce and can leverage many optimizations on reducing through chunks that are not available to vec.
  • seqs in general though are now faster with vec than they were due to leveraging transients.
  • eduction results support IReduce and are also faster with vec than into.
  • range is currently slower with vec, but when range is IReduce, it will probably be faster with vec

In summary, some new conventional wisdom (after this patch) on (into []) vs vec:

  • vec is faster if passed a vector, an IReduce, or an array
  • into is faster when working with seqs, but even vec is better than it used to be and may even be faster for things like range in the future
Comment by Michael Blume [ 13/Nov/14 7:24 PM ]

Latest patch won't build for me when applied to master

compile-clojure:
     [java] Exception in thread "main" java.lang.ExceptionInInitializerError
     [java] 	at clojure.lang.Compile.<clinit>(Compile.java:29)
     [java] Caused by: java.lang.NoSuchMethodError: clojure.lang.LazilyPersistentVector.create(Ljava/util/Collection;)Lclojure/lang/IPersistentVector;, compiling:(clojure/core.clj:14:23)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7206)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:370)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:361)
     [java] 	at clojure.lang.RT.load(RT.java:440)
     [java] 	at clojure.lang.RT.load(RT.java:411)
     [java] 	at clojure.lang.RT.doInit(RT.java:448)
     [java] 	at clojure.lang.RT.<clinit>(RT.java:329)
     [java] 	... 1 more
     [java] Caused by: java.lang.NoSuchMethodError: clojure.lang.LazilyPersistentVector.create(Ljava/util/Collection;)Lclojure/lang/IPersistentVector;
     [java] 	at clojure.lang.LispReader$VectorReader.invoke(LispReader.java:1073)
     [java] 	at clojure.lang.LispReader.readDelimitedList(LispReader.java:1138)
     [java] 	at clojure.lang.LispReader$ListReader.invoke(LispReader.java:972)
     [java] 	at clojure.lang.LispReader.read(LispReader.java:183)
     [java] 	at clojure.lang.LispReader$WrappingReader.invoke(LispReader.java:535)
     [java] 	at clojure.lang.LispReader.readDelimitedList(LispReader.java:1138)
     [java] 	at clojure.lang.LispReader$MapReader.invoke(LispReader.java:1081)
     [java] 	at clojure.lang.LispReader.read(LispReader.java:183)
     [java] 	at clojure.lang.LispReader$MetaReader.invoke(LispReader.java:716)
     [java] 	at clojure.lang.LispReader.readDelimitedList(LispReader.java:1138)
     [java] 	at clojure.lang.LispReader$ListReader.invoke(LispReader.java:972)
     [java] 	at clojure.lang.LispReader.read(LispReader.java:183)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7190)
     [java] 	... 7 more
Comment by Alex Miller [ 13/Nov/14 7:28 PM ]

Did you clean first? I replaced that static method call there with a wider version but if you are cleaning fresh it should be fine.

Comment by Michael Blume [ 13/Nov/14 7:31 PM ]

Apologies, maven just wasn't doing a good job of tracking changes, running mvn clean fixes the build.

Comment by Alex Miller [ 25/Nov/14 9:58 AM ]

Added benchmark.png showing times (in ns), tested with criterium, for into and vec on different types and sizes on 1.7.0-alpha4 and then vec again after the patch.

Comment by Jonas De Vuyst [ 11/Jan/15 6:47 AM ]

This patch breaks (vec (first {1 2}))
; ClassCastException clojure.lang.MapEntry cannot be cast to clojure.lang.IObj

Comment by Alex Miller [ 11/Jan/15 8:12 AM ]

Thanks for the report! I will get that fixed in the next alpha.





[CLJ-1537] Audit IReduce usages for proper Reduced handling Created: 26/Sep/14  Updated: 14/Nov/14  Resolved: 14/Nov/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Timothy Baldridge Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File audit-ireduce.diff     Text File clj-1537-gvec-ArraySeq.patch     File clj-1537-v2.diff     File clj-1537-v3.diff    
Patch: Code

 Description   

Rich asked that we make sure that all usages of IReduce properly handle Reduced semantics.

Approach: I did a "Find Usages" in InteliJ and updated usages of IReduce as needed.

Example: Before the patch:

user=> (transduce (take 1) conj (seq (subvec [1 2 3 4 5] 1)))
#<Reduced@13df2a8c: #<Reduced@1ebea008: #<Reduced@72d6b3ba: #<Reduced@1787f2a0: [2]>>>>

user=> (transduce (take 1) conj '(1 2 3 4))
#<Reduced@51bd8b5c: #<Reduced@7b50df34: #<Reduced@1b410b60: #<Reduced@2462cb01: [1]>>>>

Patch: clj-1537-v3.diff
Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 30/Sep/14 5:59 PM ]

Should be same as audit-ireduce.diff but w/o whitespace diffs.

Comment by Alex Miller [ 30/Sep/14 8:25 PM ]

Should these changes be deref'ing ret?

Also, can you add an example to the description (not sure if it needs to be a test) of where these are an issue?

Comment by Timothy Baldridge [ 03/Oct/14 10:18 AM ]

Following the pattern here: https://github.com/clojure/clojure/blob/master/src/clj/clojure/core/protocols.clj#L85 they should deref the reduced value.

Comment by Timothy Baldridge [ 03/Oct/14 10:29 AM ]

Failure examples from master, are added to the description.

Comment by Timothy Baldridge [ 03/Oct/14 2:23 PM ]

clj-1537-v2.diff didn't properly deref the reduced box. clj-1537-v3.diff does this now.

Comment by Nicola Mometto [ 14/Nov/14 1:00 PM ]

I've reopened this issue as there are still instances of IReduce implementations that don't handle Reduced:

user=> (transduce (take 1) conj (seq (long-array [1 2 3 4])))
#<Reduced@38f774f8: [1]>
Comment by Nicola Mometto [ 14/Nov/14 1:00 PM ]

The attached patch (clj-1537-gvec-ArraySeq.patch) fixes the remaining IReduce impls that don't correctly handle Reduced

Comment by Nicola Mometto [ 14/Nov/14 6:08 PM ]

I'm closing this ticket again and opening a different ticket for the new patch, as asked privately by Alex Miller





[CLJ-1535] Make boxed math warning suppressible Created: 26/Sep/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: math

Attachments: Text File clj-1535-2.patch     Text File clj-1535-3.patch     Text File clj-1535.patch     Text File silence-boxed-patch-10-01-2014.patch    
Patch: Code and Test
Approval: Ok

 Description   

Clojure 1.7.0-alpha2 included a new warning that will notify on use of boxed math when unchecked-math is set to true (CLJ-1325). Based on feedback, would like to make these warnings optional.

Approach: Revert (set! *unchecked-math* true) to prior behavior. Only emit warnings when (set! *unchecked-math* :warn-on-boxed).

Patch: clj-1535-3.patch

Screened by: Stuart Halloway



 Comments   
Comment by Michael Blume [ 01/Oct/14 7:45 PM ]

So I decided to take a shot at writing a patch for this. This is my first Clojure core patch, so I've probably messed up some formatting, but the implementation was pretty simple and the tests pass.

I introduced a variable, clojure.core/silence-boxed which defaults false and, when true, silences boxed math warnings. If the reverse is preferred (warn-boxed or similar) I can do that too.

Comment by Alex Miller [ 01/Oct/14 8:34 PM ]

Hi Michael, we have other plans for how this should be implemented, so will likely not use your patch. In the future, it's always good to check if the ticket is already assigned to someone before working on it.

Comment by Alex Miller [ 07/Oct/14 8:12 AM ]

Added clj-1535-3.patch, which is exactly the same diff as clj-1535-2.patch, but just squashes into a single commit.





[CLJ-1529] Significantly improve compile time by reducing calls to Class.forName Created: 21/Sep/14  Updated: 14/Nov/14  Resolved: 14/Nov/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Zach Tellman Assignee: Unassigned
Resolution: Completed Votes: 28
Labels: compiler, performance

Attachments: File class-for-name.diff     File clj-1529-no-cache-2.diff     File clj-1529-no-cache.diff     PNG File clj-1529.png     File clj-1529-with-cache.diff     Text File maybe-class-cache-2.patch     Text File maybe-class-cache.patch    
Patch: Code
Approval: Ok

 Description   

Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

Proposed: Avoid calling Class.forName() on non-namespaced symbols that do not contain "." or start with "[", don't map to a Class in the ns, and are names in the current local env. Also, adjust the ordering of checks in tagToClass to check for hints before checking for class.

[Note that the latest variant of the patch moves the check from the original patch farther down in the logic to avoid changing the semantics. This still yields virtually all of the performance gains. See comments for details.]

Patch: clj-1529-no-cache-2.diff

Screened by: Stu Halloway. Note that for this change the patch ended up being so small it is easier follow the code than the prose description.

[1] https://github.com/ztellman/aleph



 Comments   
Comment by Ghadi Shayban [ 21/Sep/14 4:30 PM ]

One of our larger projects (not macro-laden) just went from 36 seconds to 23 seconds to start with this patch.

Comment by Ramsey Nasser [ 03/Oct/14 12:34 PM ]

I ported this patch to Clojure-CLR for the Unity integration project and we have seen significant speedups as well. I too agree that this is the behavior I expect as a user.

Comment by Alex Miller [ 06/Oct/14 12:19 PM ]

I ran this on a variety of open-source projects. I didn't find that it produced any unexpected behavior or test errors. Most projects were about 10% faster to run equivalent of "lein test" with a few as high as 35% faster.

Comment by Alex Miller [ 07/Oct/14 12:52 PM ]

We're interested in comparing this and the class caching in fastload branch to get something in for 1.7. Next step is to extract a patch of the stuff in fastload so we can compare them better.

Comment by Alex Miller [ 07/Oct/14 4:06 PM ]

Add maybe class cache patch from fastload branch

Comment by Alex Miller [ 08/Oct/14 8:57 AM ]

Times below to run "time lein test" on a variety of projects with columns:

  • master = current 1.7.0 master
  • maybe-cache = maybe-class-cache.patch extracted from Rich's fastload branch
  • class-for-name = class-for-name.diff from Zach
  • % maybe-cache = % improvement for maybe-cache over master
  • % class-for-name = % improvement for class-for-name patch over master (sorted desc)

project,master,maybe-cache,class-for-name,% maybe-cache,% class-for-name
aleph,25.605,16.572,14.460,35.278,43.527
riemann,40.550,27.656,24.734,31.798,39.004
lamina,37.247,30.072,29.045,19.263,22.021
trapperkeeper,11.979,11.158,10.3,6.854,14.016
plumbing,73.777,68.388,66.922,7.304,9.292
cheshire,5.583,5.089,5.086,8.848,8.902
tools.analyzer,5.411,5.289,5.023,2.255,7.171
core.async,19.161,18.090,17.942,5.589,6.362
tools.reader,4.686,4.435,4.401,5.356,6.082
clara-rules,43.964,42.140,41.542,4.149,5.509
core.typed,158.885,154.954,151.445,2.474,4.683
instaparse,9.286,8.922,8.859,3.920,4.598
schema,45.3,43.914,43.498,3.060,3.978
mandoline,76.295,74.831,74.425,1.919,2.451

The summary is that both patches improve times on all projects. In most cases, the improvement from either is <10% but the first few projects have greater improvements. The class-for-name patch has a bigger improvement in all projects than the maybe-cache patch (but maybe-cache has no change in semantics).

Comment by Nicola Mometto [ 08/Oct/14 9:03 AM ]

Are the two patches mutually exclusive?

Comment by Alex Miller [ 08/Oct/14 9:35 AM ]

They are non-over-lapping. I have not considered whether they could both be applied or whether that makes any sense.

Comment by Alex Miller [ 08/Oct/14 9:53 AM ]

The two patches both essentially cut off the same hot code path, just at different points (class-for-name is earlier), so applying them both effectively should give you about the performance of class-for-name.

Comment by Alex Miller [ 08/Oct/14 2:14 PM ]

Added a picture of the data for easier consumption.

Comment by Deepak Giridharagopal [ 10/Oct/14 4:35 PM ]

One of our bigger projects saw a reduction of startup time of 16% with class-for-name, 14% with maybe-cache, and a whopping 23% with both patches applied. This was actually starting up the program, as opposed to running "lein test", FWIW.

Maybe it's worth re-running the benchmarks with a "both-patches" variant?

Comment by Alex Miller [ 10/Oct/14 5:28 PM ]

Hey Deepak, I did actually run some of them with both patches and saw times similar to class-for-name.

Were your times consistent across restarts? The times in the data above are the best of 3 trials for every data point (although they were relatively consistent).

Comment by Deepak Giridharagopal [ 10/Oct/14 6:08 PM ]

Hi Alex, the tests I ran did 20-iteration loops, and I took the mean (though it was pretty consistent between restarts). I can redo stuff and upload the raw data for you if that will help.

Comment by Deepak Giridharagopal [ 10/Oct/14 6:43 PM ]

So repeating the experiment several times does in fact behave as you suspected...apologies for my previous LOLDATA.

Comment by Alex Miller [ 24/Oct/14 3:01 PM ]

maybe-class-cache-2.patch removes some debugging stuff

Comment by Alex Miller [ 27/Oct/14 4:41 PM ]

I've done more testing and made mods to both patches and moved them closer together.

On the maybe-class-cache patch (new version = clj-1529-with-cache.diff):
1) I found that adding a final else branch that returned null was an improvement - this avoids caching things that will never hit in the future (Cons, PersistentList, Symbols with namespaces, etc). That's both a perf improvement (avoids hashing those things) and a memory improvement.
2) The tagToClass improvement from Zach's patch is orthogonal and also valid here so I added it.
3) I added Zach's check, but moved the placement lower so that it doesn't alter semantics. It helps avoid caching locals that aren't classes.

On the class-for-name patch (new version = clj-1529-no-cache.diff):
1) Same change as #3 above - moved check lower to avoid semantics change.

With these changes, both patches have tagToClass and local checks, neither patch changes semantics, and the only difference is whether to keep or remove the cache.

aleph timings (for "lein test"):

  • 1.7 master = 25.415 s
  • 1.7 + clj-1529-with-cache.diff = 14.329 s
  • 1.7 + clj-1529-no-cache.diff = 14.808 s

lamina timings (for "lein test"):

  • 1.7 master = 37.340 s
  • 1.7 + clj-1529-with-cache.diff = 28.680 s
  • 1.7 + clj-1529-no-cache.diff = 28.759 s

The cache helps slightly in both cases, but it does not seem worth adding the new dynamic var and unbounded memory use inherent in the cache.

Comment by Alex Miller [ 05/Nov/14 11:40 AM ]

Talked to Rich, focusing on no-cache patch. Added new version that fixes tabbing and restores Zach's name to the patch, which seems appropriate.





[CLJ-1524] SeqIterator constructor change broke binary compatibility in 1.7.0-alpha2 Created: 09/Sep/14  Updated: 09/Sep/14  Resolved: 09/Sep/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1524.diff    
Patch: Code
Approval: Ok

 Description   

Running code AOT-compiled against Clojure 1.6.0 (or older) with 1.7.0-alpha2 runtime will encounter this error with SeqIterator:

CompilerException java.lang.NoSuchMethodError: clojure.lang.SeqIterator.<init>(Lclojure/lang/ISeq;)V, compiling:(form-init5913779045640355531.clj:1:11)

Cause: This is due to a type change in the constructor of SeqIterator from ISeq to Object (commit: https://github.com/clojure/clojure/commit/43cc1854508d655e58e377f84836ba128971f90c ).

Proposed: Add the ISeq constructor back so that calls into that constructor retain backwards binary compatibility.

Patch: clj-1524.diff

Screened by:

More: From Datomic mailing list - https://groups.google.com/forum/#!topic/datomic/KZqhY6hUHz0



 Comments   
Comment by Alex Miller [ 09/Sep/14 11:06 AM ]

Patch not applied, but similar change applied directly here:

https://github.com/clojure/clojure/commit/ba41f25b6f3f32729c55f7f7ceb179be597acf94





[CLJ-1518] Patch for removing transient thread owner check broke rrb-vector Created: 03/Sep/14  Updated: 04/Sep/14  Resolved: 04/Sep/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1518.diff    
Patch: Code
Approval: Ok

 Description   

The patch for CLJ-1498 changed the field types inside the persistent data structures, which inadvertently broke core.rrb-vector, which relies on reusing some of those internals. It is not necessary to change the type to satisfy the patch, so we would like to rollback that aspect of the change to minimize breakage.



 Comments   
Comment by Alex Miller [ 03/Sep/14 2:05 PM ]

In the patch I rolled back the changes in the Persistent*.java from CLJ-1498 and re-applied. The only "real" changes after the rollback are in ensureEditable(). Tests were left of course.





[CLJ-1515] range should return something that implements IReduce Created: 29/Aug/14  Updated: 10/Apr/15  Resolved: 10/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Timothy Baldridge Assignee: Unassigned
Resolution: Completed Votes: 2
Labels: None

Attachments: Text File clj-1515-10.patch     Text File clj-1515-11.patch     Text File clj-1515-12.patch     Text File clj-1515-13.patch     Text File clj-1515-14.patch     Text File clj-1515-2.patch     Text File clj-1515-3.patch     Text File clj-1515-4.patch     Text File clj-1515-5.patch     Text File clj-1515-6.patch     Text File clj-1515-7.patch     Text File clj-1515-8.patch     Text File clj-1515-9.patch     Text File CLJ-1515-deftype2.patch     Text File CLJ-1515-deftype3.patch     Text File CLJ-1515-deftype-nostructural-dup.patch     Text File CLJ-1515-deftype.patch     Text File clj-1515.patch     File patch.diff     File range-patch3.diff     File reified-range4.diff    
Patch: Code and Test
Approval: Ok

 Description   

range should implement IReduce for fast reduction.

Patch: clj-1515-14.patch - Java range with long optimizations

Approach: The clj-1515-14 patch revives the unused clojure.lang.Range class. This class is similar in implementation to ChunkedCons. It differs by lazily constructing the first chunk (this is done in the existing impl by using a LazySeq wrapper) and by caching the next reference like other seq impls. The latter is done because range is frequently used in both chunked and unchunked traversal.

Additionally, 1515-14 contains an optimized version of range (LongRange) for the extremely common case where start, end, and step are all longs. This version uses primitive longs and primitive math and a customized version of ArrayChunk for greater performance.

The special case of (range) is just handled with (iterate inc' 0) (which was further optimized for reduce in CLJ-1603).

Alternatives:

  • CLJ-1515-deftype3.patch took the approach of using deftype to create a seqable, reducible entity that was not actually a seq. Based on work for CLJ-1603, it was established that due to historical requirements, this is not viable and the entity returned for range should implement the ISeq and Collection interfaces directly.
  • clj-1515-11.patch uses the approach of a "split" implementation - a LazySeq that has a fast reduce path. This is a minimal patch that provides no perf difference for sequence usage but a moderate improvement for reduce paths. One important difference vs clj-1515-12 is that the fast reduce path is only obtained on the initial range. Once you walk off the head, the performance will be the same as prior (seq perf).

Performance

criterium quick-bench with java 1.8.0-b132

code 1.7.0-master Java clj-1515-14 split clj-1515-11
(count (range (* 1024 1024))) 63 ms 0 ms 58 ms
(reduce + (map inc (range (* 1024 1024)))) 50 ms 33 ms 50 ms
(reduce + (map inc (map inc (range (* 1024 1024))))) 68 ms 52 ms 67 ms
(count (keep odd? (range (* 1024 1024)))) 73 ms 52 ms 67 ms
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 46 ms 26 ms 34 ms
(reduce + 0 (range (* 2048 1024))) 67 ms 19 ms 40 ms
(reduce + 0 (rest (range (* 2048 1024)))) 67 ms 19 ms 57 ms
(doall (range 0 31)) 1.04 µs 0.75 µs 1.15 µs
(doall (range 0 32)) 1.08 µs 0.76 µs 1.18 µs
(doall (range 0 4096)) 135 µs 96 µs 145 µs
(into [] (map inc (range 31))) 1.81 µs 1.30 µs 1.88 µs
(into [] (map inc) (range 31)) 1.71 µs 0.75 µs 1.02 µs
(into [] (range 128)) 4.82 µs 2.20 µs 3.27 µs
(doall (range 1/2 1000 1/3)) 1.24 ms 1.24 ms 1.28 µs
(doall (range 0.5 1000 0.33)) 126 µs 147 µs 151 µs
(into [] (range 1/2 1000 1/3)) 1.26 ms 1.20 ms 1.26 ms
(into [] (range 0.5 1000 0.33)) 148 µs 91 µs 130 µs
(count (filter odd? (take (* 1024 1024) (range)))) 185 ms 167 ms 176 ms
(transduce (take (* 1024 1024)) + (range)) 67 ms 35 ms 68 ms

Performance notes:

  • 1515-14 (Java) - significant improvements in virtually all use cases, both seq and reduce. The final two cases with (range) leverage optimizations from CLJ-1603.
  • 1515-11 (split) - a much smaller patch that gives no seq benefits and smaller reduction benefits. Only the head of the range will receive the perf benefits (see (reduce + 0 (rest (range (* 2048 1024))))).

Questions
(range) and supports auto-promotion towards infinity in this patch, which seems to be implied by the doc string but was not actually implemented or tested correctly afaict.



 Comments   
Comment by Alex Miller [ 29/Aug/14 3:19 PM ]

1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?

Comment by Timothy Baldridge [ 29/Aug/14 3:40 PM ]

1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Comment by Alex Miller [ 29/Aug/14 3:49 PM ]

6) might be useful if you're walking through it with nth, which hits count everytime, but doubt that's common
7) yep, reasonable

Comment by Andy Fingerhut [ 18/Sep/14 6:52 AM ]

I have already pointed out to Edipo in personal email the guidelines on what labels to use for Clojure JIRA tickets here: http://dev.clojure.org/display/community/Creating+Tickets

Comment by Timothy Baldridge [ 19/Sep/14 10:02 AM ]

New patch with IReduce directly on Range instead of relying on iterators

Comment by Alex Miller [ 01/Oct/14 2:00 PM ]

The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is.

Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.

Comment by Timothy Baldridge [ 03/Oct/14 10:00 AM ]

Looks like chunked seqs do make lazy seq code about 5x faster in these tests.

Comment by Ghadi Shayban [ 03/Oct/14 10:22 AM ]

I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.

Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

Comment by Timothy Baldridge [ 03/Oct/14 10:38 AM ]

I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

Comment by Alex Miller [ 03/Oct/14 10:46 AM ]

I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.

On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Comment by Ghadi Shayban [ 03/Oct/14 11:00 AM ]

Tim, I'm not suggesting unboxed math, but the singular fast-path of all-Longs that you and Alex describe. I mistakenly lower-cased Long/Float.

Comment by Timothy Baldridge [ 31/Oct/14 11:30 AM ]

Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.

Comment by Nicola Mometto [ 14/Nov/14 12:51 PM ]

As discussed with Tim in #clojure, the current patch should not change ArrayChunk's reduce impl, that's an error.

Comment by Alex Miller [ 09/Dec/14 2:40 AM ]

Still a work in progress...

Comment by Nicola Mometto [ 09/Dec/14 8:44 AM ]

Alex, while this is still a work in progress, I see that the change on ArrayChunk#reduce from previous WIP patches not only has not been reverted but has been extended. I don't think the current approach makes sense as ArrayChunk#reduce is not part of the IReduce/IReduceInit contract but of the IChunk contract and changing the behaviour to be IReduce-like in its handling of reduced introduces the burden of having to use preserve-reduced on the reducing function to no apparent benefit.

Given that the preserve-reduced is done on the clojure side, it seems to me like directly invoking .reduce rather than routing through internal-reduce should be broken but I haven't tested it.

Comment by Alex Miller [ 09/Dec/14 9:49 AM ]

That's the work in progress part - I haven't looked at yet. I have not extended or done any work re ArrayChunk, just carried through what was on the prior patch. I'll be working on it again tomorrow.

Comment by Ghadi Shayban [ 10/Dec/14 11:14 PM ]

I am impressed and have learned a ton through this exercise.

quick review of clj-1515-2
1) withMeta gives the newly formed object the wrong meta.
2) LongRange/create() is the new 0-arity constructor for range, which sets the 'end' to Double/POSITIVE_INFINITY cast as a long. Current core uses Double/POSITIVE_INFINITY directly. Not sure how many programs rely upon iterating that far, or how they would break.
3) Relatedly, depending on the previous point: Because only all-long arguments receive chunking, the very common case of (range) with no args would be unchunked. Doesn't seem like too much of a stretch to add chunking to the other impl.
4) Though the commented invariants say that Range is never empty, the implementation uses a magic value of _count == 0 to mean not cached, which is surprising to me. hashcodes have the magic value of -1
5) s/instanceof Reduced/RT.isReduced
6) is the overflow behavior of "int count()" correct?

Comment by Alex Miller [ 11/Dec/14 12:06 AM ]

1) agreed!
2) Good point. I am definitely changing behavior on this (max of 9223372036854775807). I will look at whether this can be handled without affecting perf. Really, handling an infinite end point is not compatible with several things in LongRange.
3) I actually did implement chunking for the general Range and found it was slower (the original Clojure chunking is faster). LongRange is making up for that difference with improved primitive numerics.
4) Since empty is invalid, 0 and -1 are equally invalid. But I agree -1 conveys the intent better.
5) agreed
6) probably not. ties into 2/3.

Thanks for this, will address.

Comment by Alex Miller [ 11/Dec/14 12:11 AM ]

Added -4 patch that addresses 1,4,5 but not the (range) stuff.

Comment by Alex Miller [ 11/Dec/14 12:51 PM ]

Latest -7 patch addresses all feedback and perf #s updated.

Comment by Ghadi Shayban [ 22/Dec/14 10:59 PM ]

See CLJ-1572, I believe CollReduce needs to be extended directly to clojure.lang.{LongRange,Range} inside protocols.clj

Comment by Ghadi Shayban [ 29/Dec/14 12:29 PM ]

Seems like a missing benchmark is (range 0 31) without the doall. Current patch (-9) allocates the chunk buffer at range's construction time. Maybe this can be delayed?

Comment by Alex Miller [ 02/Jan/15 11:24 AM ]

Ghadi - you are right. I reworked the patch (new -10 version) so that the chunk is only created on demand. Basically, the chunking is only used when traversing via chunked seqs and in normal seq iteration or reduce, no explicit chunks are created. That improved several timings. I also added the bench for (range 0 31) which was greatly improved by lazily creating the chunk, so good point on that.

Comment by Michael Blume [ 03/Jan/15 1:38 PM ]

I'm looking at the implementation of equiv() and wondering if it's worth checking whether the other object is also a reified range and comparing the private parameters rather than iterating through the sequences.

Comment by Ghadi Shayban [ 18/Jan/15 7:26 PM ]

Attaching a simplified implementation as a deftype. I benchmarked a billion variants and dumped bytecode. I'm attaching the best performing combination, and it beats out the Java one in most cases.

Approach:
Two deftypes, one specialized to primitive longs, the other generic math.
Implement: Seqable/Counted,IReduce,IReduceInit,Iterable
Also, marker interfaces Sequential and Serialized.

The implementations are very straightforward, but Seqable deserves mention.
'seq' delegates its implementation to the existing lazy-seq based range, which has been stripped of the strange corner cases and now only handles strictly ascending/descending ranges. It has been renamed range*, rather than range1 because all the arguments to range are already numbers. core references to range have been accordingly renamed.

The new range constructor is loaded after reduce & protocols load. It checks types, and delegates to either long-range or generic-range, which handle the wacky argument cases that are not strictly ascending or descending. I'm annoyed with the structural duplication of those conditionals, not sure how to solve it.

Inside the LongRange implementation of IReduce/IReduceInit, boxing is carefully controlled, and was verified through bytecode dumping.

I elaborated the benchmarks for comparison, and also included benchmarking without type specialization.

You discover strange things working on this stuff. Turns out having the comparator close over the 'end' field is beneficial: #(< % end).

Alex, I figured out why the Java versions had a nearly exactly 2x regression on (doall (range 0 31)), and I attached a change to 'dorun'. Instead of calling (seq coll) then (next coll), it effectively calls (next (seq coll)). I think the implementation assumes that calling 'seq' is evaluating the thunk inside LazySeq. This should also help other Seqable impls like Cycle/Iterate/Repeat CLJ-1603.

Results (criterium full runs):

code 1.7.0-alpha5 clj-1515-10.patch (Java) deftype specialized (attached) deftype unspecialized
(count (filter odd? (take (* 1024 1024) (range)))) 187.30 ms 194.92 ms 182.53 ms 184.13 ms
(transduce (take (* 1024 1024)) + (range)) 58.27 ms 89.37 ms 84.69 ms 84.72 ms
(count (range (* 1024 1024))) 62.34 ms 27.11 ns not run not run
(reduce + (map inc (range (* 1024 1024)))) 57.63 ms 46.14 ms 46.17 ms 52.94 ms
(reduce + (map inc (map inc (range (* 1024 1024))))) 82.83 ms 68.66 ms 64.36 ms 71.76 ms
(count (keep odd? (range (* 1024 1024)))) 76.09 ms 57.59 ms not run not run
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 52.17 ms 39.71 ms 28.83 ms 42.23 ms
(reduce + 0 (range (* 2048 1024))) 85.93 ms 38.03 ms 26.49 ms 42.43 ms
(doall (range 0 31)) 1.33 µs 2.89 µs 1.03 µs 1.08 µs
(doall (range 0 32)) 1.35 µs 2.97 µs 1.07 µs 1.10 µs
(doall (range 0 128)) 5.27 µs 11.93 µs 4.19 µs 4.29 µs
(doall (range 0 512)) 21.66 µs 47.33 µs 16.95 µs 17.66 µs
(doall (range 0 4096)) 171.30 µs 378.52 µs 135.45 µs 140.27 µs
(into [] (map inc (range 31))) 1.97 µs 1.57 µs 1.87 µs 1.95 µs
(into [] (map inc) (range 31)) 1.66 µs 824.67 ns 891.90 ns 1.11 µs
(into [] (range 128)) 5.11 µs 2.21 µs 2.43 µs 3.36 µs
(doall (range 1/2 1000 1/3)) 1.53 ms 1.67 ms 1.59 ms 1.52 ms
(doall (range 0.5 1000 0.33)) 164.83 µs 382.38 µs 149.38 µs 141.21 µs
(into [] (range 1/2 1000 1/3)) 1.53 ms 1.40 ms 1.43 ms 1.50 ms
(into [] (range 0.5 1000 0.33)) 157.44 µs 108.27 µs 104.18 µs 127.20 µs

Open Questions for screeners of the deftype patch:

1) What to do about the autopromotion at the Long/MAX_VALUE boundary? I've preserved the current behavior of Clojure 1.6.
2) Alex, I did not pull forward the filter chunk tweak you discovered
3) Is the structural duplication of the conditionals in generic-range long-range awful?
4) Are there any other missing interfaces? IMeta comes to mind. Not sure about IHashEq either.

Comment by Ghadi Shayban [ 18/Jan/15 7:30 PM ]

note with the deftype patch, the transduce over infinite (range) case is the only one where 'master' currently performs best. This is because the implementation was changed to (iterate inc' 0). When CLJ-1603 is applied that situation should improve better.

Comment by Ghadi Shayban [ 18/Jan/15 8:27 PM ]

fixup patch CLJ-1515-deftype

Comment by Ghadi Shayban [ 18/Jan/15 11:05 PM ]

new patch CLJ-1515-deftype-nostructural-dup.patch with the silly conditional duplication removed. Updated benchmarks including 'count' impls. These benchmarks run on a different machine with the same hardness. Blue ribbon.

code 1.7.0-alpha5 1.7.0-rangejava 1.7.0-rangespecial rangespecial / alpah5
(count (filter odd? (take (* 1024 1024) (range)))) 297.14 ms 333.93 ms 328.62 ms 1.10
(transduce (take (* 1024 1024)) + (range)) 105.55 ms 145.44 ms 164.70 ms 1.56
(count (range (* 1024 1024))) 108.92 ms 61.09 ns 26.61 ns 0
(reduce + (map inc (range (* 1024 1024)))) 97.67 ms 95.41 ms 84.62 ms 0.86
(reduce + (map inc (map inc (range (* 1024 1024))))) 140.21 ms 135.59 ms 116.38 ms 0.83
(count (keep odd? (range (* 1024 1024)))) 121.18 ms 104.63 ms 111.46 ms 0.92
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 100.40 ms 86.28 ms 67.17 ms 0.67
(reduce + 0 (range (* 2048 1024))) 131.77 ms 80.43 ms 63.24 ms 0.48
(doall (range 0 31)) 2.53 µs 4.36 µs 2.24 µs 0.89
(doall (range 0 32)) 2.37 µs 4.00 µs 1.99 µs 0.84
(doall (range 0 128)) 9.20 µs 14.98 µs 8.01 µs 0.87
(doall (range 0 512)) 37.28 µs 59.13 µs 35.16 µs 0.94
(doall (range 0 4096)) 331.28 µs 471.57 µs 291.76 µs 0.88
(into [] (map inc (range 31))) 2.83 µs 2.79 µs 2.67 µs 0.94
(into [] (map inc) (range 31)) 2.21 µs 1.39 µs 1.26 µs 0.57
(into [] (range 128)) 6.72 µs 3.25 µs 3.09 µs 0.46
(doall (range 1/2 1000 1/3)) 3.41 ms 4.04 ms 3.14 ms 0.92
(doall (range 0.5 1000 0.33)) 281.04 µs 530.92 µs 244.14 µs 0.87
(into [] (range 1/2 1000 1/3)) 3.32 ms 3.71 ms 2.99 ms 0.90
(into [] (range 0.5 1000 0.33)) 215.53 µs 165.93 µs 138.86 µs 0.64
Comment by Alex Miller [ 19/Jan/15 8:32 AM ]

This is looking very good and I think we should move forward with it as the preferred approach. Feel free to update the description appropriately. I'll file a separate ticket with the filter tweak. Some comments on the patch:

1) GenericRange/count - this math is broken if you start to mix infinity in there. I think just (count (seq this)) is safer.
2) GenericRange/iterable - I think if we are IReduceInit and Seqable, we can omit this. I had it in the Java one because I inherited Iterable via ASeq but that's not an issue in this impl.
3) LongRange/reduce - why Long/valueOf? Isn't start a long field?
4) LongRange/iterable - ditto #2
5) print-method - anytime I see print-method, there should probably be print-dup too. For example, this is broken: (binding [*print-dup* true] (println (range 0 10)))
6) Is serialization broken by this patch? Can you justify the test changes?

Comment by Ghadi Shayban [ 19/Jan/15 12:51 PM ]

regarding Serialization tests currently:

(defn roundtrip
  [v]
  (let [rt (-> v serialize deserialize)
        rt-seq (-> v seq serialize deserialize)]        ;; this
    (and (= v rt)            ;; this fails because the test ^ calls seq first
      (= (seq v) (seq rt))   ;; this passes
      (= (seq v) rt-seq))))  ;; this passes

These new types are merely seqable, so (not= (LongRange. 0 10 1) (seq (LongRange. 0 10 1)))

Not sure how to handle this 100%. Nothing precludes the LongRange itself from roundtripping, but just that calling seq on it returns a separate object.

Comment by Alex Miller [ 23/Jan/15 9:57 AM ]

More comments...

1) Instead of extending IReduce and IReduceInit, just extend IReduce and implement both arities (IReduce extends IReduceInit).
2) I'm slightly troubled by the .invokePrim now. Did you look at a macro that does prim type-hinting maybe?
3) On the serialization thing, is the problem really with serialization or with equality? The test that's failing is (= v rt). Because these are not IPersistentCollections, pcequiv won't be used and it's just calling LongRange.equals(), which is not implemented and falls back to identity, right? Probably need to implement the hashCode and equals stuff. Might need IHashEq too.

user=> (= (range 5) (range 5))
false
Comment by Ghadi Shayban [ 23/Jan/15 12:13 PM ]

Took care of 1) and 3). Punting on hiding the invokePrim behind a macro. It may be shameful but it works really fast.

Another case found:
(assoc {'(0 1 2 3 4) :foo} (range 5) :bar) needs to properly overwrite keys in the map.

Cause: Might be irrelevant in the face of CLJ-1375. Util/equiv for maps doesn't use IPC/equiv unless the collection is also java.util.Collection or Map. If it is then it checks for IPersistentCollection. I added a check for IPersistentCollection first.

https://github.com/ghadishayban/clojure/commits/for-screening

To get hasheq working, I added back the iterators for use by Murmur3/hashOrdered.

Comment by Ghadi Shayban [ 23/Jan/15 3:20 PM ]

Handle hash and equality not through IPersistentCollection. w/ test-cases too.

Comment by Alex Miller [ 20/Feb/15 11:39 AM ]

Ghadi, we need to have range implement IObj too so with-meta works.

Comment by Ghadi Shayban [ 20/Feb/15 12:09 PM ]

Will add pronto but maybe someone can clarify something: Adding a simple clojure.lang.IObj/withMeta to the deftype results in an AbstractMethodError when trying to print a range. This is because the impl of vary-meta used in the default printer assumes that if all IObjs are also IMetas. Seems like a problem with either vary-meta's impl or the interface separation.

I can fix by:
1) adding a _meta field to Ranges and handling IMeta as well as IObj.

;; vary-meta:
(with-meta obj (apply f (meta obj) args)))

;; default printer
(defmethod print-method :default [o, ^Writer w]
  (if (instance? clojure.lang.IObj o)
    (print-method (vary-meta o #(dissoc % :type)) w)
    (print-simple o w)))
Comment by Ghadi Shayban [ 20/Feb/15 12:25 PM ]

Ugh never mind that last bit I fell out of the hammock prematurely. IMeta << IObj.

Alex, CLJ-1603 needs IMeta/meta too.

Comment by Alex Miller [ 24/Feb/15 11:30 AM ]

current direction is pending results of where CLJ-1603 goes

Comment by Fogus [ 27/Mar/15 1:57 PM ]

Screened. Though the amount of duplicated code saddens me, I don't hold it against the implementer. It's a fairly straight-forward counting Impl with a chunk cache.

Comment by Alex Miller [ 27/Mar/15 3:44 PM ]

New patch -13 removes the array allocation and performs better, still need to update timings and consider the non-long case.

Comment by Alex Miller [ 03/Apr/15 10:29 AM ]

The -14 patch just switches to leverage Repeat now that 1603 has been applied.

Comment by Fogus [ 10/Apr/15 12:14 PM ]

The latest patch (-14) handles the extraneous allocations from -12. All tests run and timings verified.





[CLJ-1512] Create volatile box for managing state Created: 25/Aug/14  Updated: 07/Oct/14  Resolved: 03/Sep/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Rich Hickey Assignee: Rich Hickey
Resolution: Completed Votes: 0
Labels: transducers

Attachments: File volatile2.diff     File volatile3.diff    
Patch: Code and Test
Approval: Ok

 Description   

Motivation:

Clojure needs a faster variant of Atom for managing state inside transducers. That is, Atoms do the job, but they provide a little too much capability for the purposes of transducers. Specifically the compare and swap semantics of Atoms add too much overhead. Therefore, it was determined that a simple volatile ref type would work to ensure basic propagation of its value to other threads and reads of the latest write from any other thread. While updates are subject to race conditions, access is controlled by JVM guarantees.

Solution overview: Create a concrete type in Java, akin to clojure.lang.Box, but volatile inside supports IDeref, but not watches etc.

API:

(volatile! x) ;;ctor
(vreset! vol newval) ;;like reset
(vswap! vol f args) ;;same shape as swap!, but MACRO over vreset!

Patch: volatile3.diff

Screened by: fogus



 Comments   
Comment by Alex Miller [ 25/Aug/14 9:11 AM ]

Dumb benchmark before/after...

java -cp target/classes -Xmx512m -server clojure.main
(def t (take 1000000))
(def v (doall (range 1000000)))
(defn bench [t v]
  (time (into [] t v)))
(dotimes [_ 30] (bench t v))

before - 29-32 ms after warmup
after - 22-23 ms after warmup

Comment by Alex Miller [ 25/Aug/14 9:12 AM ]

From Stu H elsewhere:

Three questions:
1) Should we keep volatile? in the public API?
2) Should we work in terms of IVolatile interface (guessing no)
3) Do we need a CLJS version of these APIs?

Comment by Alex Miller [ 25/Aug/14 9:13 AM ]

1. We have many tickets requesting predicates over types that are "internal" and generally I find these to be helpful. They also can help in making core more portable to cljs (maybe those fns would fall back to atoms in cljs?).
2. We have tickets requesting the equivalent of this for IAtom (CLJ-803) etc. I don't think an interface adds any value to us here though. There seems to be some requests for this kind of passthrough interface from tooling as a decoupling point. Not putting my finger on those discussions but I know I've heard this, maybe on the mailing list.
3. I think yes if that allows us to be more efficient than whatever is being done now. Not obvious to me.

Comment by Nicola Mometto [ 25/Aug/14 9:40 AM ]

Why is vswap! a macro?

Comment by Ambrose Bonnaire-Sergeant [ 26/Aug/14 8:04 AM ]

An IAtom conversation: https://groups.google.com/forum/#!searchin/clojure-dev/iatom/clojure-dev/y5QoMqd44Lc/y4YmW09blk0J

Comment by Max Penet [ 26/Aug/14 10:28 AM ]

the vswap! macro is probably for performance reasons (the main motivation of this code to begin with), to avoid using apply or unrolling tons of arities

Comment by Nicola Mometto [ 26/Aug/14 1:07 PM ]

If that is the only reason, why can't it be a regular fn + :inline metadata?

Comment by Jozef Wagner [ 27/Aug/14 3:50 AM ]

why the bang in the name of volatile! function? If the reason is to warn users that this is an 'expert only' stuff, I suggest to use a verbose name instead, e.g. volatile-reference. (This will also be consistent with approach chosen in the names of volatile-mutable and unsynchronized-mutable hints.)

Comment by Rich Hickey [ 27/Aug/14 6:37 AM ]

Can you please lift the with-meta stuff out of the syntax-quote?
Actually, if volatile! ctor returned a type-hinted value that extra hinting might not even be needed. Let's do both for now.

Also the type hint on the volatile? arg makes no sense - it's a predicate asking if something is a volatile.

Comment by Alex Miller [ 28/Aug/14 9:05 AM ]

Made changes as requested.

Comment by Fogus [ 29/Aug/14 11:01 AM ]

I downloaded the patch and applied to latest master. I ran the isolated tests and the full test suite and also ensured that the patch didn't add any reflection warnings. I then modified the ticket description to add a little more context and motivation (for future readers). The code is straight-forward and clean.

Comment by Alex Miller [ 29/Aug/14 4:31 PM ]

Updated to volatile3.diff to address offline comment from Rich.





[CLJ-1511] stack overflow when comparing sequence results Created: 24/Aug/14  Updated: 07/Oct/14  Resolved: 27/Aug/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Chhi'mèd Künzang Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: transducers
Environment:

OS X 10.9.4


Attachments: Text File 0001-provide-working-implementations-for-LazyTransform-eq.patch    
Patch: Code and Test
Approval: Ok

 Description   

Comparing sequences created with sequence causes a stack overflow when used as first argument to =.

Consider this transducer:

user=> (def map-inc (map inc))
#'user/map-inc

When creating a sequence and comparing with expected results, it works fine as the second argument to the comparison:

user=> (= (range 1 11) (sequence map-inc (range 10)))
true

But a stack overflow occurs when the order of arguments is reversed:

user=> (= (sequence map-inc (range 10)) (range 1 11))

StackOverflowError   clojure.lang.LazyTransformer.equals (LazyTransformer.java:202)
user=> (clojure.stacktrace/print-stack-trace *e 10)
java.lang.StackOverflowError: null
 at clojure.lang.LazyTransformer.equiv (LazyTransformer.java:185)
    clojure.lang.LazyTransformer.equals (LazyTransformer.java:202)
    clojure.lang.LazyTransformer.equiv (LazyTransformer.java:185)
    clojure.lang.LazyTransformer.equals (LazyTransformer.java:202)
    clojure.lang.LazyTransformer.equiv (LazyTransformer.java:185)
    clojure.lang.LazyTransformer.equals (LazyTransformer.java:202)
    clojure.lang.LazyTransformer.equiv (LazyTransformer.java:185)
    clojure.lang.LazyTransformer.equals (LazyTransformer.java:202)
    clojure.lang.LazyTransformer.equiv (LazyTransformer.java:185)
    clojure.lang.LazyTransformer.equals (LazyTransformer.java:202)
nil

The error persists, even if the sequence is forced with doall:

user=> (= (doall (sequence map-inc (range 10))) (doall (range 1 11)))

StackOverflowError   clojure.lang.LazyTransformer.equiv (LazyTransformer.java:185)

It does work as expected, however, if the sequence is converted to a vector:

user=> (= (vec (sequence map-inc (range 10))) (range 1 11))
true


 Comments   
Comment by Nicola Mometto [ 25/Aug/14 4:31 AM ]

Patch provides equiv/equals implementations for LazyTransform based on ASeq equiv/equals





[CLJ-1501] LazySeq switches to equiv when using equals Created: 11/Aug/14  Updated: 14/Nov/14  Resolved: 14/Nov/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Jozef Wagner Assignee: Jozef Wagner
Resolution: Completed Votes: 0
Labels: collections, ft, interop, seq

Attachments: File clj-1501.diff    
Patch: Code and Test
Approval: Ok

 Description   

When comparing lazy seqs with java equality operator .equals, the implementation switches to the Clojures .equiv comparison. This switch is not present in any other Seq or ordered collection type.

user> (.equals '(3) '(3N))
false
user> (.equals [3] [3N])
false
user> (.equals (seq [3]) (seq [3N]))
false
user> (.equals (lazy-seq [3]) (lazy-seq [3N]))
true

Screened by: Alex Miller



 Comments   
Comment by Jozef Wagner [ 11/Aug/14 9:32 AM ]

Patch clj-1501.diff with tests added

Comment by Andy Fingerhut [ 08/Oct/14 5:50 PM ]

This ticket has no Fix Version/s, but is Screened, so at least in some code I have it is 'off the JIRA workflow state diagram'. Not sure if it shows up that way in your filters, Alex.

Comment by Alex Miller [ 08/Oct/14 7:45 PM ]

Yup thanks.





[CLJ-1499] Replace seq-based iterators with direct iterators for all non-seq collections that use SeqIterator Created: 08/Aug/14  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Rich Hickey Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1499-all.diff     Text File clj-1499-v10.patch     Text File clj-1499-v11.patch     Text File clj-1499-v12.patch     File clj-1499-v2.diff     File clj-1499-v3.diff     File clj-1499-v6.diff     File clj-1499-v7.diff     File clj-1499-v8.diff     Text File clj-1499-v9.patch     Text File defrecord-iterator.patch     File defrecord-iterator-v2.diff    
Patch: Code and Test
Approval: Ok

 Description   

Motivation:

Currently, the process of reducing over a map or set involves converting them to seqs. This action thus causes more intermediate objects than we'd like. Therefore, we'd like for reduce to be more efficient about the way that it reduces over these types. One way is for reduction to use the types' core collections while another way is to instead use a smart iterator type that abstracts the direct operation.

Solution

Add support for direct iterators instead of seq-based iterators for non-seq collections that use SeqIterator.

Patch adds support for direct iterators on the following (removing use of SeqIterator):

  • PersistentHashMap - new internal iterator (~20% faster)
  • APersistentSet - use internal map impl iterator (~5% faster)
  • PersistentQueue
  • PersistentStructMap
  • records (in core_deftype.clj)

Patch does not change use of SeqIterator in:

  • LazyTransformer$MultiStepper (not sure if this could be changed)
  • ASeq
  • LazySeq

In preparation for use in CLJ-1602, there is also a new IMapIterable interface that allows a map to publish direct Iterators of keys and values (without creating and pulling apart MapEntry objs). PersistentHashMap and PersistentArrayMap implement it right now and PersistentHashSet will check for and use the direct key iterator if possible.

Performance:

I re-ran the set of tests from CLJ-1546 for vec improvements which use these new iterators for PersistentArrayMap, PersistentHashMap, and PersistentHashSet. These test the time to run vec on a PHS or PHM of the given size. All timings in nanoseconds.

coll size 1.7.0-alpha4 1.7.0-alpha5 alpha5 + patch alpha5 + patch / alpha5
PHS 4 236 406 204 0.86
PHS 16 857 896 382 0.45
PHS 64 3637 3832 1723 0.47
PHS 256 14235 14672 6525 0.46
PHS 1024 67743 68857 44654 0.66
PHM 4 112 232 232 2.07
PHM 16 550 832 442 0.80
PHM 64 3014 3155 2014 0.67
PHM 256 11661 12082 7522 0.65
PHM 1024 57388 59787 44240 0.77

I also ran a set of tests just on reduce with reduce-kv for comparison (reduce-kv uses a different unaffected reduction path so it's a good baseline for comparison).

expr a-set / a-map size 1.6.0 1.7.0-alpha5 (ns) 1.7.0-alpha5 + patch (ns) patch / alpha5 patch / 1.6
(reduce + 0 a-set) 4 171 241 114 0.47 0.67
(reduce + 0 a-set) 16 663 826 395 0.48 0.60
(reduce + 0 a-set) 64 3637 4456 2361 0.53 0.65
(reduce + 0 a-set) 256 14772 16158 8872 0.55 0.60
(reduce + 0 a-set) 1024 72548 79010 49626 0.63 0.68
(reduce + 0 a-set) 16384 1197962 1234286 729335 0.59 0.61
(reduce #(+ %1 (val %2)) 0 a-map) 4 69 103 104 1.01 1.51
(reduce #(+ %1 (val %2)) 0 a-map) 16 640 752 506 0.67 0.79
(reduce #(+ %1 (val %2)) 0 a-map) 64 2861 3238 2388 0.74 0.83
(reduce #(+ %1 (val %2)) 0 a-map) 256 11998 12848 9526 0.74 0.79
(reduce #(+ %1 (val %2)) 0 a-map) 1024 59650 61913 54717 0.88 0.92
(reduce #(+ %1 (val %2)) 0 a-map) 16384 982317 969168 837842 0.86 0.85
(reducekv #(%1 %3) 0 a-map) 4 58 62 60 0.97 1.03
(reducekv #(%1 %3) 0 a-map) 16 233 250 248 0.99 1.06
(reducekv #(%1 %3) 0 a-map) 64 1477 1525 1442 0.95 0.98
(reducekv #(%1 %3) 0 a-map) 256 5470 5258 5201 0.99 0.95
(reducekv #(%1 %3) 0 a-map) 1024 27344 27442 27410 1.00 1.00
(reducekv #(%1 %3) 0 a-map) 16384 399591 397090 396267 1.00 0.99

Patch: clj-1499-v12.patch

Screened by:



 Comments   
Comment by Alex Miller [ 13/Aug/14 1:57 PM ]

The list of non-seqs that uses SeqIterator are:

  • records (in core_deftype.clj)
  • APersistentSet - fallback, maybe is ok?
  • PersistentHashMap
  • PersistentQueue
  • PersistentStructMap

Seqs (that do not need to be changed) are:

  • ASeq
  • LazySeq.java

LazyTransformer$MultiStepper - not sure

Comment by Ghadi Shayban [ 27/Sep/14 2:16 PM ]

attached iterator impl for defrecords. ready to leverage iteration for extmap when PHM iteration lands.

Comment by Alex Miller [ 29/Sep/14 12:52 PM ]

PHM patch

Comment by Alex Miller [ 29/Sep/14 10:26 PM ]

New patch that fixes bugs with PHMs with null keys (and added tests to expose those issues), added support for PHS.

Comment by Ghadi Shayban [ 29/Sep/14 10:45 PM ]

Alex, the defrecord patch already uses the iterator for extmap. It's just made better by the PHM patch.

Comment by Alex Miller [ 29/Sep/14 10:47 PM ]

Comment by Ghadi Shayban [ 30/Sep/14 4:17 PM ]

Heh. Skate to where the puck is going to be – Gretzky

Re: defrecord iterator: As is, it propagates exceptions from reaching the end of the ExtMap's iterator. As noted in CLJ-1453, PersistentArrayMap's iterator improperly returns an ArrayIndexOutOfBoundsException, rather than NoSuchElementException.

Comment by Alex Miller [ 30/Sep/14 6:41 PM ]

Hey Ghadi, rather than rebuilding the case map to pass to the RecordIterator, why don't you just pass the fields in iteration order to it and leverage the case map via .valAt like everything else?

Comment by Ghadi Shayban [ 30/Sep/14 7:30 PM ]

defrecord-iterator-v2.diff reuses valAt and minimizes macrology.

Comment by Alex Miller [ 07/Oct/14 2:04 PM ]

Comments from Stu (found under the couch):

"1. some of the impls (e.g. queue manually concatenate two iters. Would implementing iter-cat and calling that be simpler and more robust?
2. I found this tweak to the generative testing more useful in reporting failure, non-dependent on clojure.test, and capable of expecting failures. Waddya think?

(defn seq-iter-match
  [seqable iterable]
  (let [i (.iterator iterable)]
    (loop [s (seq seqable)
           n 0]
      (if (seq s)
        (do
          (when-not (.hasNext i)
            (throw (ex-info "Iterator exhausted before seq"
                            {:pos n :seqable seqable :iterable iterable})))
          (when-not (= (.next i) (first s))
            (throw (ex-info "Iterator and seq did not match"
                            {:pos n :seqable seqable :iterable iterable})))
          (recur (rest s) (inc n)))
        (when (.hasNext i)
          (throw (ex-info "Seq exhausted before iterator"
                          {:pos n :seqable seqable :iterable iterable})))))))

		(defspec seq-and-iter-match-for-maps
  identity
  [^{:tag clojure.test-clojure.data-structures/gen-map} m]
  (seq-iter-match m m))

3. similar generative approach would be good for the other types (looks like we just do maps)"

Comment by Alex Miller [ 02/Dec/14 3:03 PM ]

Latest patch (clj-1499-v6.diff) makes Stu's suggested change #2 above and adds tests recommended in #3. I looked at #1 but decided in the end that it wasn't going to make anything easier.

Comment by Alex Miller [ 09/Jan/15 11:11 AM ]

v7 patch just updates to current master

Comment by Alex Miller [ 14/Jan/15 4:08 PM ]

found a bug in last impl in maps with null values...fixing

Comment by Stuart Halloway [ 16/Jan/15 1:40 PM ]

v11 updates v10 to apply cleanly on master

Comment by Fogus [ 30/Jan/15 1:36 PM ]

Added a motivation section to the ticket.

Comment by Rich Hickey [ 20/Feb/15 7:54 AM ]

Any reason why MAKE_ENTRY/KEY/VAL are not final?

Comment by Alex Miller [ 20/Feb/15 8:00 AM ]

Nope, just an oversight. Will fix.

Comment by Alex Miller [ 20/Feb/15 9:46 AM ]

New -v12 patch is identical except makes MAKE_ENTRY, MAKE_KEY, MAKE_VAL final.

Comment by Fogus [ 27/Feb/15 8:45 AM ]

Screened latest v12 (with added finals) and would be happy to work in this code if the need arose. Additionally, with the added "motivation" the ticket and its solution makes more sense.





[CLJ-1498] Remove birth-thread check from transients Created: 08/Aug/14  Updated: 29/Aug/14  Resolved: 29/Aug/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Rich Hickey Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: collections, transient

Attachments: File clj-1498-2.diff     File clj-1498-3.diff     File clj-1498.diff    
Patch: Code
Approval: Ok

 Description   

Transients protect themselves from use by any thread other than the one that creates them. This is good for safety, however it eliminates certain valid usages of transients. For example, usage in a go-block might occur in subsequent invocations across multiple OS threads (but only one logical thread of control).

Current simple test:

user> (def v (transient []))
#'user/v
user> (persistent! @(future (conj! v 1)))
IllegalAccessError Transient used by non-owner thread  clojure.lang.PersistentVector$TransientVector.ensureEditable (PersistentVector.java:464)

Proposal: Remove the owner check from transient collections. (Leave the edit after persistent check as is.) The test above should succeed.

After:

user=> (def v (transient []))
#'user/v
user=> (persistent! @(future (conj! v 1)))
[1]

The clj-1498-3.diff version of the patch also replaces the AtomicReference<Thread> with AtomicBoolean as we can now track just ownership, not who owns it.

Doc update: Various pieces of documentation will need to be updated with this change, namely http://clojure.org/transients

Patch: clj-1498-3.diff

Alternative: Another idea would be to make this check optional with some kind of option on the transient call (transient coll :check-owner true). Not sure whether what the default would be for that.



 Comments   
Comment by Jozef Wagner [ 09/Aug/14 7:08 AM ]

I suggest to add a functionality to pass ownership of a transient to the different thread, or to release the ownership by passing nil.

user=> (def v (pass! (transient []) nil))
#'user/v
user=> (persistent! @(future (conj! v 1)))
[1]

pass! has to be called by current owner thread, or by any thread if the transient is currently released.

Comment by Alex Miller [ 13/Aug/14 1:42 PM ]

New patch that replaces AtomicReference<Thread> with AtomicBoolean.

Comment by Stuart Halloway [ 19/Aug/14 11:05 AM ]

Alex, can you please expand the example test you provided to a generative test that covers the following combinations:

  1. different collection sizes (above and below the ArrayMap size boundary)
  2. different shapes (vector vs. map)
  3. successful use across threads (positive use case this ticket enables)

data_structures.clj has helpers for generating transient interactions that you can build on.

Comment by Alex Miller [ 20/Aug/14 8:59 AM ]

Enhanced existing generative tests to test random actions against sets, vectors, and both PHM and PAM. Added additional actions to do transient modification actions in other threads as well as originating thread.





[CLJ-1497] sequence with transducers realizes n+2 elements Created: 08/Aug/14  Updated: 08/Aug/14  Resolved: 08/Aug/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1497.diff     File clj-1497v2.diff    
Approval: Ok

 Description   

The first element is realized at creation time:

user=> (def a (sequence (take 2) (map #(do (println (str "~" %)) %) (iterate inc 1))))
~1
#'user/a

Fully realizing the sequence realizes the other n-1 elements, and 2 more:

user=> a
(~2
~3
1 ~4
2)

Compare with version using seq operations:

user=> (def a (sequence (take 2 (map #(do (println (str "~" %)) %) (iterate inc 1)))))
#'user/a
user=> a
(~1
~2
1 2)

Transduce also doesn't seem to exhibit this issue:

user=> (def a (transduce (take 2) conj [] (map #(do (println (str "~" %)) %) (iterate inc 1))))
~1
~2
#'user/a
user=> a
[1 2]


 Comments   
Comment by Alex Miller [ 08/Aug/14 10:02 AM ]

Patch attached that improves the issue - will now only realize n+1 elements.

Comment by Nicola Mometto [ 08/Aug/14 10:16 AM ]

Nice, I added a commit on top of yours to delay the realization of the first element of the lazyseq to the first .next call instead of on SeqIteration creation

Comment by Alex Miller [ 08/Aug/14 11:12 AM ]

Fixed by Rich directly, not by patch.





[CLJ-1494] remove flatmap in favor of mapcat Created: 07/Aug/14  Updated: 03/Sep/14  Resolved: 03/Sep/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: None

Attachments: Text File 0001-remove-flatmap-use-1-arity-mapcat-instead.patch    
Patch: Code

 Description   

While all the transducers functions are implemented as an arity in the matching clojure core sequence, for mapcat a new function has been added: flatmap.
The reason for this is, as Rich said in a HN comment, "because mapcat's signature was not amenable to the additional arity".
This patch changes the mapcat signature to take at least one collection so that it's possible to add the 1-arity for the transducer function, eliminating the need for a different function, flatmap.

There has been no loss by removing the 1-arity version of mapcat as a sequence function since trying to use (mapcat f) as currently defined (not as redefined with this patch) would fail before transducers, and after transducers:
Before transducers (mapcat f) would result in a call to (map f) which would fail with an ArityException
After transducers that (map f) call would return a function, which then would be used as an argument to (apply concat the-f), resulting in a IllegalArgumentException since apply expects a sequence but it's been given a fn.



 Comments   
Comment by Alex Miller [ 03/Sep/14 11:02 AM ]

Done as a result of https://github.com/clojure/clojure/commit/7d84a9f6f35a503cddf98487b6544d18937c669e





[CLJ-1481] Typo in type-reflect's docstring Created: 27/Jul/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Defect Priority: Trivial
Reporter: Bozhidar Batsov Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: docstring, ft

Attachments: Text File 0001-Fix-a-typo.patch    
Patch: Code
Approval: Ok

 Description   

membrer -> member

Screened by: Alex Miller






[CLJ-1480] Incorrect param name reference in defmulti's docstring Created: 27/Jul/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Defect Priority: Trivial
Reporter: Bozhidar Batsov Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: docstring, ft

Attachments: Text File 0001-Fix-param-name-reference-in-defmulti-s-docstring.patch    
Patch: Code
Approval: Ok

 Description   

attribute-map should actually be attr-map

Screened by: Alex Miller






[CLJ-1479] Typo in filterv example Created: 27/Jul/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Trivial
Reporter: Bozhidar Batsov Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: documentation, ft

Attachments: Text File 0001-Fix-a-typo.patch    
Patch: Code
Approval: Ok

 Description   

filter -> filterv in changes.md

Screened by: Alex Miller






[CLJ-1478] Doc typo Created: 27/Jul/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Trivial
Reporter: Bozhidar Batsov Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: docstring, ft

Attachments: Text File 0001-Fix-a-typo.patch    
Patch: Code
Approval: Ok

 Description   

Another small typo fix:
from from -> from

Screened by: Alex Miller






[CLJ-1477] Fixed a typo Created: 27/Jul/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Trivial
Reporter: Bozhidar Batsov Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: documentation, ft

Attachments: Text File 0001-Fix-a-typo.patch    
Patch: Code
Approval: Ok

 Description   

Just a simple typo fix - "directy" -> "directly".

Screened by: Alex Miller






[CLJ-1466] clojure.core/bean should implement Iterable Created: 16/Jul/14  Updated: 07/Oct/14  Resolved: 07/Oct/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Ambrose Bonnaire-Sergeant Assignee: Unassigned
Resolution: Completed Votes: 2
Labels: ft, interop

Attachments: File iterable-bean-v2.diff    
Patch: Code and Test
Approval: Ok

 Description   

The changes in Clojure 1.6 hashing revealed that `bean` does not return a map that implements Iterable:

user=> (hash (bean (java.util.Date.)))

AbstractMethodError clojure.lang.APersistentMap.iterator()Ljava/util/Iterator;  clojure.core.proxy$clojure.lang.APersistentMap$ff19274a.iterator (:-1)

Patch adds `iterator` method to clojure.core/bean.

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 16/Jul/14 10:22 AM ]

One workaround:

(hash (apply hash-map (bean (java.util.Date.))))

Interestingly, into does not help b/c into uses reduce, which internally uses the iterator too.

Comment by Alex Miller [ 16/Jul/14 11:01 AM ]

APersistentMap implements Iterable and expects subclasses to fulfill that contract. The bean proxy does not. Instead of changing APersistentMap, why not add:

(iterator [] (.iterator pmap)

to the bean proxy definition?

Comment by Ambrose Bonnaire-Sergeant [ 16/Jul/14 11:19 AM ]

It seemed like an oversight that APersistentMap lacked a default iterator method.

That said, I haven't used OO inheritance for 4 years. Should I change the patch?

Comment by Ambrose Bonnaire-Sergeant [ 16/Jul/14 11:47 AM ]

Added new patch that just adds iterator to bean.





[CLJ-1439] Reduce keyword cache lookup cost Created: 05/Jun/14  Updated: 07/Oct/14  Resolved: 01/Aug/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Kyle Kingsbury Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: keywords, performance

Attachments: Text File 0001-Improve-Keyword.intern-performance.patch    
Patch: Code
Approval: Ok

 Description   

Background: Symbol is composed of name and namespace strings. Symbol construction interns both of these strings - this reduces memory usage and allows for string == checks inside Symbol. Keywords wrap a Symbol and have an additional cache to reuse Keyword instances.

Problem: Certain applications make heavy use of keywords (in particular the case of parsing or transforming JSON, XML, or other data into Clojure maps with keyword keys). Constructing the same keyword from a string over and over again will cause the string to be interned, a symbol constructed, and the lookup to occur in the keyword cache. In the case where the keyword already exists, this is more work than is necessary, making this path slower than it can be.

Reproduce: The following test simulates rounds of creating many keywords - the unique? flag indicates whether to use new or the same set of keywords each rep. unique?=false should be more similar to parsing a similar JSON record format over and over.

(set! *unchecked-math* true)

(defn kw-new [n unique?]
  (let [base (if unique? (str (rand)) "abcdef")]
    (loop [i 0
           kws (transient [])]
      (if (< i n)
        (recur (inc i) (conj! kws (keyword (str base i))))
        (persistent! kws)))))

(defn bench-kw [reps n unique?]
  (dotimes [_ reps]
    (let [begin (System/nanoTime)]
        (kw-new n unique?)
        (let [end (System/nanoTime)
              elapsed (/ (- end begin) 1000000.0)]
          (println elapsed "ms")))))

(bench-kw 50 10000 false)  ;; expected similar to JSON use case
(bench-kw 50 10000 true)   ;; for comparison

On 1.6, we see about 5.5 ms for repeated and 134 ms for unique after warmup.
With the patch, we see about 2.2 ms for repeated and 120 ms for unique after warmup.

Cause: Keyword construction based on a string involves:

  • Interning string(s) in new kw
  • Constructing Symbol with interned strings
  • Clearing Keywords from the Keyword cache if GC has reclaimed them
  • Constructing a new Keyword
  • Wrapping the Keyword in a WeakReference
  • CHM putIfAbsent on the cache
  • If new, return. If exists, get the old one and return.
  • In the event the Keyword is reclaimed by GC between the last 2 steps, retry.

This process involves a fair amount of speculative interning and object creation if the keyword already exist.

Proposal: Streamline the keyword construction process by reworking the cache implementation and the Keyword.intern() process. The patch changes the cache to key by string name instead of symbol, deferring interning and symbol creation on lookup to when we know the keyword construction is needed. The various Keyword.intern() methods are also reworked to take advantage if called with an existing Symbol to avoid re-creating it.

Patch: 0001-Improve-Keyword.intern-performance.patch

Related: CLJ-1415



 Comments   
Comment by Alex Miller [ 01/Aug/14 11:48 AM ]

Alternate changes were committed today to improve both symbol and keyword creation times.

https://github.com/clojure/clojure/commit/c9e70649d2652baf13b498c4c3ebb070118c4573

Comment by Alex Miller [ 07/Oct/14 4:42 PM ]

related patch was applied





[CLJ-1430] Improve performance of partial Created: 23/May/14  Updated: 05/Sep/14  Resolved: 29/Aug/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Timothy Baldridge Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: performance

Attachments: File partial-perf.diff    
Patch: Code
Approval: Ok

 Description   

This patch improves performance of partial by only using apply when needed. The code structure follows that of juxt.

Performance benchmark:

(ns partial-test.core
  (:require [criterium.core :refer [bench]])
  (:gen-class))

(defn -main []
  (let [f (partial + 1 1)]
    (println "Starting")
    (bench (f 1 1))
    (println "Done")))

Results for 1.6.0:

Evaluation count : 228751140 in 60 samples of 3812519 calls.
             Execution time mean : 266.700063 ns
    Execution time std-deviation : 2.966851 ns
   Execution time lower quantile : 262.641023 ns ( 2.5%)
   Execution time upper quantile : 274.207916 ns (97.5%)
                   Overhead used : 1.610513 ns

Found 3 outliers in 60 samples (5.0000 %)
	low-severe	 3 (5.0000 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

Results for 1.7.0 with this patch:

 Evaluation count : 348208140 in 60 samples of 5803469 calls.
              Execution time mean : 171.210533 ns
     Execution time std-deviation : 2.011660 ns
    Execution time lower quantile : 168.819526 ns ( 2.5%)
    Execution time upper quantile : 176.015584 ns (97.5%)
                    Overhead used : 2.644128 ns

 Found 3 outliers in 60 samples (5.0000 %)
 	low-severe	 3 (5.0000 %)
  Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

Benchmarks performed via lein uberjar + running via the commandline.

Patch: partial-perf.diff

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 23/May/14 10:46 AM ]

Screened, looks as expected.

Comment by Andy Fingerhut [ 02/Jun/14 10:50 AM ]

Timothy, just a nit that I would not have noticed except for my program that checks for name and email address of patch authors, to see if they are on my contributor's list, but do you really have both of the email addresses tbaldridge@gmail.com and tbaldidge@gmail.com (note the spelling difference)? The latter is the one on this patch.

Comment by Timothy Baldridge [ 02/Jun/14 11:04 AM ]

fixed email

Comment by Timothy Baldridge [ 02/Jun/14 11:05 AM ]

nice catch! it was a typeo in my .gitconfig defaults. I've fixed the patch.

Comment by Alex Miller [ 02/Jun/14 11:19 AM ]

Tim (and anyone really) - please let someone know if you need to change a screened patch! Looks fine here, but screener should be notified so they can re-screen.

Comment by Alex Baranosky [ 05/Sep/14 9:11 PM ]

Very nice patch. I've gotten into the habit of not using partial anymore for performance sensitive code. Perhaps this change means I need to rethink that.





[CLJ-1429] Cache unknown multimethod value default dispatch Created: 22/May/14  Updated: 29/Aug/14  Resolved: 29/Aug/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 2
Labels: performance

Attachments: Text File clj-1429.patch    
Patch: Code
Approval: Ok

 Description   

Multimethods maintain a cache from dispatch value (result of the dispatch function) to dispatch method. If the dispatch value does not find a match in the available methods, it falls through to a lookup using the default dispatch value and returns that method. This default dispatch case is NOT recorded in the cache. This means that every case that falls through to the default case incurs a scan of the methodTable (and the class inheritance checks that involves).

Perf test:

(defmulti mm class)
(defmethod mm String [s] s)
(defmethod mm Long [l] l)
(defmethod mm :default [v] v)

(defn perf [reps size]
  (let [data (take size (cycle ["abc" 5 :k]))]
    (dotimes [_ reps]
      (time (doall (map mm data))))))

And results:

;; Without patch:
user=> (perf 5 100000)
"Elapsed time: 1301.262 msecs"
"Elapsed time: 928.888 msecs"
"Elapsed time: 942.905 msecs"
"Elapsed time: 858.513 msecs"
"Elapsed time: 832.314 msecs"

;; With patch:
user=> (perf 5 100000)
"Elapsed time: 134.169 msecs"
"Elapsed time: 28.859 msecs"
"Elapsed time: 45.452 msecs"
"Elapsed time: 13.189 msecs"
"Elapsed time: 13.42 msecs"

Attached patch caches the mapping of unknown value -> default dispatch method and significantly improves the performance for this case.

Patch: clj-1429.patch
Screened by: Stu






[CLJ-1424] Reader conditionals Created: 15/May/14  Updated: 02/Apr/15  Resolved: 02/Apr/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Ghadi Shayban Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: reader

Attachments: File clj-1424-10.diff     Text File clj-1424-11.patch     Text File clj-1424-12.patch     File CLJ-1424-2.diff     File clj-1424-3.diff     File clj-1424-4.diff     File clj-1424-5.diff     File clj-1424-6.diff     File clj-1424-7.diff     File clj-1424-8.diff     File clj-1424-9.diff     File clojure-feature-expressions.diff    
Patch: Code and Test
Approval: Ok

 Description   

See http://dev.clojure.org/display/design/Reader+Conditionals for design.

The implementation details contain several major parts:

1) LispReader changes to implement new conditional read syntax and features.

In RT:

  • suppress-read - new dynamic var
  • readString() - new arity that takes opts

In LispReader:

  • read(...) entry point - added new variants that take opts. The opts-only form will decode :eof into the correct values for eofIsError and eofValue. Internal forms take both opts and pendingForms (list).
  • All reader invoke dispatch methods now take both opts and pendingForms which get drug around the call stack.
  • Defined all option keys and well-known allowed values - OPT_EOF, OPT_FEATURES, OPT_READ_COND, EOFTHROW, PLATFORM_KEY, PLATFORM_FEATURES, COND_ALLOW, COND_PRESERVE.
  • read(...) entry point will ensure installation of {:features #{:clj}}, appropriately merging others or none. There is no path that will pass :features inside Clojure but users can call LispReader with it directly.
  • readDelimitedList() was refactored so it can share logic with readCondDelimited().
  • Luke can provide more detail on the guts of reader conditional reading if needed.

New ReaderConditional and TaggedLiteral types.

In clojure.core:

  • tagged-literal?, tagged-literal, reader-conditional?, reader-conditional - new functions to support the preserved data form of these
  • new print-method impls for TaggedLiteral and ReaderConditional
  • read - added new arity that takes opts and explain opts in docstring
  • read-string - added new arity that takes opts (placement matches edn/read-string and is designed for partial application)

2) cljc file changes.
In Compiler:

  • OPTS_COND_ALLOWED - keep around a map {:read-cond :allow} to invoke reader when reading .cljc files.
  • readerOpts() - new private function to determine reader options to use based on file name (.clj vs .cljc)
  • load() - use readerOpts() and invoke LispReader with them
  • compile() - use readerOpts() and invoke LispReader with them

In RT:

  • load(...) - check first for .clj file, then for .cljc file when looking by ns

In clojure.main - update code to allow for .clj and .cljc files
In clojure.repl - update code to allow for .clj and .cljc files

3) Tests + infrastructure updates.

  • Move test/clojure/test_clojure/reader.clj to test/clojure/test_clojure/reader.cljc so that we can test reader conditionals.
  • Bump version of test.generative to get newer version that depends on newer tools.namespace which is able to find .cljc files as source files. The test script at src/script/run_test.clj uses this to find Clojure test files and without the bump, it will not find the new reader.cljc test ns. Note: if you use the ant build, you need to re-run antsetup.sh for this to take effect.
  • src/script/run_test.clj - as part of the version bump, move from deprecated namespace to new namespace
  • Added lots of tests in reader.cljc

Patch: clj-1424-12.patch

Screened by:

Related: CLJS-27, TRDR-14



 Comments   
Comment by Jozef Wagner [ 16/May/14 2:19 AM ]

Has there been a decision that CL syntax is going to be used? Related discussion can be found at design page, google groups discussion and another discussion.

Comment by Alex Miller [ 16/May/14 8:34 AM ]

No, no decisions on anything yet.

Comment by Ghadi Shayban [ 19/May/14 7:25 PM ]

Just to echo a comment from TRDR-14:

This is WIP and just one approach for feature expressions. There seem to be at least two couple diverging approaches emerging from the various discussion (Brandon Bloom's idea of read-time splicing being the other.)

In any case having all Clojure platforms be ready for the change is probably essential. Also backwards compatibility of feature expr code to Clojure 1.6 and below is also not trivial.

Comment by Kevin Downey [ 04/Aug/14 1:39 PM ]

if you have ever tried to do tooling for a language where the "parser" tossed out information or did some partial evaluation, it is a pain. this is basically what the #+cljs style feature expressions and bbloom's read time splicing both do with clojure's reader. I think resolving this at read time instead of having the compiler do it before macro expansion is a huge mistake and makes the reader much less useful for reading code.

Comment by Ghadi Shayban [ 04/Aug/14 2:00 PM ]

Kevin, what kind of tooling use case are you alluding to?

Comment by Kevin Downey [ 04/Aug/14 3:24 PM ]

any use case that involves reading code and not immediately handing it off to the compiler. if I wanted to write a little snippet to read in a function, add an unused argument to every arity then pprint it back, reader resolved feature expressions would not round trip.

if I want to write snippet of code to generate all the methods for a deftype (not a macro, just at the repl write a `for` expression) I can generate a clojure data structure, call pprint on it, then paste it in as code, reader feature expressions don't have a representation as data so I cannot do that, I would have to generate strings directly.

Comment by Alex Miller [ 22/Aug/14 9:10 AM ]

Changing Patch setting so this is not in Screenable yet (as it's still a wip).

Comment by Alex Miller [ 07/Nov/14 4:39 PM ]

Latest patch brings up to par with related patches in CLJS-27 and TRDR-14 and importantly adds support for loading .cljc files as Clojure files.

Comment by Andy Fingerhut [ 07/Nov/14 5:55 PM ]

Maybe undesirable behavior demonstrated below with latest Clojure master plus patch clj-1424-3.diff, due to the #+cljs skipping the comment, but not the (dec a). I thought it could be fixed simply by moving RT.suppressRead() check after (ret == r) check in read(), but that isn't correct.

user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs (dec a))")
(defn foo [a] (inc a))
user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs ; foo\n (dec a))")
(defn foo [a] (inc a) (dec a))
Comment by Alex Miller [ 21/Jan/15 4:28 PM ]

Added new clj-1424-4.diff which makes a couple of modifications:

  • removed support for and/or/not (#+ and #- remain)
  • *features* has been removed
  • if you wish to have a custom feature set while reading, there is a new option map that can be passed to read (this all parallels similar changes previously made to the edn reader)

Example of adding a "custom" feature to the feature set (which will always contain "clj" feature):

(read 
  {:features #{:custom}} 
  (java.io.PushbackReader. (java.io.StringReader. "[#+custom :x]")))
Comment by Andy Fingerhut [ 21/Jan/15 5:01 PM ]

Latest patch clj-1424-4.diff also exhibits maybe-undesirable behavior in which #+cljs can suppress an immediately following comment, rather than the form following it. See 07/Nov/14 comment with example above.

Comment by Alex Miller [ 21/Jan/15 6:16 PM ]

Thanks Andy, I'm aware. Haven't looked at it yet.

Comment by Luke VanderHart [ 25/Jan/15 9:26 PM ]

Patch clj-1424-5.diff modifies the code to use "read-conditionals", as outlined by Rich at: https://groups.google.com/d/msg/clojure-dev/LW0ocQ1RcYI/IBPPyfCpM3kJ

Comment by Alex Miller [ 26/Jan/15 12:33 PM ]

Some feedback:

1) Because pendingForms is an internal thing, I would make the read() that takes it non-public.
2) In readDelimitedList, I don't see the point of constructing a new LinkedList then checking if it's empty there. Should just make the add conditional on whether it's null or not.
3) You could treat pendingForms as a Deque (which LinkedList implements) and then use pop() instead of remove(0). The addAll(0, ...) is more painful to replicate though if you're sticking to Deque. I think I'd be tempted to just commit explicitly to LinkedList for pendingForms since we fully control the construction and use of it within the reader.
4) Might be nice to update the commented-out readers to support pendingForms as I did with opts. Or remove the updates for opts. Should either do all the mods or none on the commented-out code.
5) s/read-cond-splicing/read-cond-splice/ ? Seems like where it's used it should be a verb.
6) Should just use :default and make :else and :none throw exceptions. I think Rich mentioned :except or :exception too? or maybe I misheard that.
7) Should have some more tests to tweak the error cases - bad feature, uneven forms, default out of allowed position, bad contents for splice, etc.

Comment by Alex Miller [ 26/Jan/15 2:01 PM ]

From Chouser on the mailing list: "is it intentional that reading (clojure.core/read-cond ...) does not behave the same as (#? ...)? That is, (#? ...) can be read as c.c/read-cond depending on read options, but having been read, if it is printed again it doesn't round-trip back to #?. This is different, for example, from how #(...) is read as (fn* [] (...)), which then retains its meaning."

In shouldReadConditionally(), it looks like the == check vs READ_COND will not work. Instead of:
return (first == READ_COND || first == READ_COND_SPLICING);
do
return (READ_COND.equals(first) || READ_COND_SPLICING.equals(first));

For example, this test doesn't seem to give the right answer:

user=> (read-str-opts {:preserve-read-cond false} "(clojure.core/read-cond :clj :x :default :y)")
(clojure.core/read-cond :clj :x :default :y)    ;; should be :x
Comment by Michael Blume [ 26/Jan/15 3:27 PM ]

With this patch applied to master, lein check fails on instaparse:

Compiling namespace instaparse.abnf
Exception in thread "main" clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader, compiling:(abnf.clj:186:28)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3605)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3599)
	at clojure.lang.Compiler$DefExpr.eval(Compiler.java:436)
	at clojure.lang.Compiler.eval(Compiler.java:6772)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.RT.loadResourceScript(RT.java:384)
	at clojure.lang.RT.loadResourceScript(RT.java:375)
	at clojure.lang.RT.load(RT.java:459)
	at clojure.lang.RT.load(RT.java:425)
	at clojure.core$load$fn__5424.invoke(core.clj:5850)
	at clojure.core$load.doInvoke(core.clj:5849)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at user$eval52$fn__63.invoke(form-init5310597017138984927.clj:1)
	at user$eval52.invoke(form-init5310597017138984927.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6767)
	at clojure.lang.Compiler.eval(Compiler.java:6757)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.Compiler.loadFile(Compiler.java:7150)
	at clojure.main$load_script.invoke(main.clj:275)
	at clojure.main$init_opt.invoke(main.clj:280)
	at clojure.main$initialize.invoke(main.clj:308)
	at clojure.main$null_opt.invoke(main.clj:343)
	at clojure.main$main.doInvoke(main.clj:421)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader
	at clojure.lang.AFn.throwArity(AFn.java:429)
	at clojure.lang.AFn.invoke(AFn.java:36)
	at instaparse.cfg$eval800$safe_read_string__801.invoke(cfg.clj:163)
	at instaparse.cfg$process_string.invoke(cfg.clj:180)
	at instaparse.cfg$build_rule.invoke(cfg.clj:217)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:214)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:207)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30)
	at clojure.core.protocols$fn__6436.invoke(protocols.clj:59)
	at clojure.core.protocols$fn__6389$G__6384__6402.invoke(protocols.clj:13)
	at clojure.core$reduce.invoke(core.clj:6501)
	at clojure.core$into.invoke(core.clj:6582)
	at instaparse.cfg$ebnf.invoke(cfg.clj:277)
	at clojure.lang.AFn.applyToHelper(AFn.java:154)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3600)
	... 27 more
Failed.
Comment by Michael Blume [ 26/Jan/15 3:29 PM ]

Aha, of course, Instaparse is calling into the LispReader$StringReader directly.

Is it worth providing versions of these methods with the old arities? Or should instaparse just not be using Clojure internals this way?

Comment by Michael Blume [ 26/Jan/15 3:33 PM ]

https://github.com/Engelberg/instaparse/blob/v1.3.5/src/instaparse/cfg.clj#L159

Comment by Alex Miller [ 26/Jan/15 3:33 PM ]

Instaparse is reaching pretty deep inside implementation details here, so I'd say this should expect to break. We could back-fill the old arities here but I'd really prefer not to if possible.

Comment by Luke VanderHart [ 27/Jan/15 11:23 AM ]

clj-1424-6.diff addresses all the issues mentioned above. Per a comment from Rich, it also adds tests to ensure that nested splices work properly (they do).

There were two things from your list I didn't do, Alex:

3) I kept pendingForms as a List. Because we aren't confining ourselves to a Deque interface, I don't see the benefit of calling pop() over remove(0) (with identical semantics) as justification for over-specifying the concrete type.

5) I kept "read-cond-splicing" since it parallels the form of "unquote-splicing". Seems that those should be consistent.

Comment by Luke VanderHart [ 30/Jan/15 9:03 PM ]

clj-1424-7.diff contains Rich's "reader-conditionals" proposal.

Comment by Alex Miller [ 06/Feb/15 3:45 PM ]

ReaderConditional / TaggedLiteral
1) when patch applied I see some whitespace errors in here, also line endings seem different, might want to check it
2) a common pattern in other Java classes is private constructor and public static create() method
3) could use Util.hash() to clean up the "null->0" logic in hashCode()

LispReader
4) adds unused import: java.util.Iterator
5) it looks like returnOn flag could just be collapsed into checking if returnOnChar is non-null?
6) in readCondDelimited, EOF and FINISHED are never used and can be removed presumably

Comment by Luke VanderHart [ 07/Feb/15 1:49 PM ]

I have attached clj-1424-8.diff, which addresses your most recent comments, Alex. I formatted it using `git format patch` instead of `git diff` so it should have the email address added correctly.

Your comments are all addressed, with the exception of returnOn. I don't think that can be collapsed. You really need two values: one to say what character should cause a return, and one to say what value should be returned in that scenario. You could use a convention on the return value, I suppose, (e.g, null means a completed read) but there's already precedent for passing in the value to be returned (namely, eofValue).

Comment by Alex Miller [ 09/Feb/15 9:49 AM ]

Looks good. I think I actually mis-read what returnOn was doing, so np on that. I still see the whitespace issues and the CR/LF in those two files. Were you going to change those?

Comment by Alex Miller [ 09/Feb/15 4:24 PM ]

Added new -9 patch that squashes the last patch but is otherwise identical. The older patches in that diff were the source of still seeing whitespace errors on apply.

Comment by Rich Hickey [ 20/Feb/15 8:09 AM ]

I think in the first iteration we should allow reader conditionals only in .cljc files, and support only standard features :clj, :cljs and :clr.

Comment by Andy Fingerhut [ 27/Feb/15 10:35 AM ]

Comment on -10 patch:

The doc string additions for clojure.core/read are a bit cryptic for those not already steeped in the details. It would be good to at least mention 'reader conditionals' so people have something to Google for. Current additions:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow, or :preserve to keep all branches as data
+    :features - persistent set of feature keywords
+    :eof - on eof, return value unless :eofthrow, then throw

Alternate suggestion:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow to process reader conditionals, or :preserve to keep all branches as data.  Throw if reader conditional found and this key not present.
+    :features - persistent set of feature keywords for processing reader conditionals.
+    :eof - on eof, return value unless :eofthrow, then throw
Comment by Alex Miller [ 27/Feb/15 10:39 AM ]

Thanks Andy, will consider. Indeed, this will have lengthier docs on http://clojure.org/reader so I was trying to just hit the surface but that's a reasonable suggestion.

Comment by Fogus [ 06/Mar/15 11:10