[CLJ-1620] Constants are leaked in case of a reentrant eval Created: 18/Dec/14  Updated: 19/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Christophe Grand Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: None

Attachments: Text File 0001-CLJ-1620-avoid-constants-leak-in-static-initalizer.patch     Text File 0001-CLJ-1620-avoid-constants-leak-in-static-initalizer-v2.patch     Text File eval-bindings.patch    
Patch: Code

 Description   

Compiling a function that references a class that is not yet loaded (or not yet initialized) triggers that class's static initializer. When the static initializer loads Clojure code, some constants (source code, I think) are leaked into the constants pool of the function under compilation.

It prevented CCW from working in some environments (Rational) because the static init of the resulting function was over 64K.

Steps to reproduce:

Load the leak.main ns and run the code in the comment form: the first function has 15 extra fields despite being identical to the second one.

(ns leak.main)

(defn first-to-load []
  leak.Klass/foo)

(defn second-to-load []
  leak.Klass/foo)

(comment
=> (map (comp count #(.getFields %) class) [first-to-load second-to-load])
(16 1)
)

package leak;
 
import clojure.lang.IFn;
import clojure.lang.RT;
import clojure.lang.Symbol;
 
public class Klass {
  static {
    RT.var("clojure.core", "require").invoke(Symbol.intern("leak.leaky"));
  }
  public static IFn foo = RT.var("leak.leaky", "foo");
}

(ns leak.leaky)

(defn foo
  "Some doc"
  []
  "hello")

(def unrelated 42)

https://gist.github.com/cgrand/5dcb6fe5b269aecc6a5b#file-main-clj-L10
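
To see which fields account for the difference, one could list the field names instead of only counting them. This is a minimal sketch: public-field-names is a hypothetical helper, not part of the ticket or the attached patches, and the exact names and counts depend on the Clojure version and on what the function references.

```clojure
;; Hypothetical helper: sorted names of the public fields on the class
;; that the compiler generated for the function f.
(defn public-field-names
  [f]
  (->> (.getFields (class f))
       (map (fn [^java.lang.reflect.Field fld] (.getName fld)))
       sort
       vec))

;; Works on any compiled fn. With the repro namespaces loaded, one could
;; compare (public-field-names first-to-load) with
;; (public-field-names second-to-load).
(public-field-names (fn [] (inc 1)))
```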



 Comments   
Comment by Christophe Grand [ 18/Dec/14 3:56 PM ]

Patch from Nicola Mometto

Comment by Nicola Mometto [ 18/Dec/14 4:01 PM ]

Attached the same patch with a more informative commit message

Comment by Laurent Petit [ 18/Dec/14 4:03 PM ]

I'd like to thank Christophe and Alex for their invaluable help in understanding what was happening, formulating the right hypothesis and then finding a fix.

I would also mention that even though non-IBM Rational environments were not affected by the bug to the point where CCW would not work, they were still affected. For instance, the class for a one-liner function wrapping an interop call weighs 700 bytes once the patch is applied, whereas it weighed 90 kbytes with current 1.6 or 1.7.

Comment by Laurent Petit [ 18/Dec/14 5:07 PM ]

In CCW for the initial problematic function, the -v2 patch produces exactly the same bytecode as if the referenced class does not load any namespace in its static initializers.
That is, the patch is valid. I will test it live in the IBM Rational environment ASAP.

Comment by Laurent Petit [ 19/Dec/14 12:10 AM ]

I confirm the patch fixes the issue detected initially in the IBM Rational environment





[CLJ-1619] PersistentVector implements IReduce but the no init arity throws Created: 17/Dec/14  Updated: 18/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-Implement-no-init-arity-of-reduce-for-PersistentVect.patch    
Patch: Code
Approval: Screened

 Description   

The no-init reduce arity of IReduce in PersistentVector is implemented as: "throw new UnsupportedOperationException()".

After the CLJ-1572 patch is applied the following code will throw:

(reduce + [1 2])

Approach taken: Implement reduce(f) in PersistentVector.

Alternative: An alternative would be to change PersistentVector from IReduce to IReduceInit and remove the reduce without init function. In this case, reducing a vector would fall back to seqs.
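
The semantics that reduce(f) needs to honor can be sketched in Clojure. This is only an illustration, not the attached Java patch; reduce-no-init is a hypothetical name. With no elements it calls (f), otherwise it seeds the accumulator with the first element and honors reduced values.

```clojure
;; Hypothetical sketch of the no-init reduce contract over a vector v.
(defn reduce-no-init
  [f v]
  (if (zero? (count v))
    ;; Empty collection: the result is f called with no arguments.
    (f)
    ;; Otherwise the first element seeds the accumulator.
    (loop [acc (nth v 0)
           i   1]
      (if (< i (count v))
        (let [ret (f acc (nth v i))]
          (if (reduced? ret)
            @ret                      ; early termination
            (recur ret (inc i))))
        acc))))

(reduce-no-init + [1 2]) ;; => 3
(reduce-no-init + [])    ;; => 0, because (+) returns 0
```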

Patch: 0001-Implement-no-init-arity-of-reduce-for-PersistentVect.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 18/Dec/14 10:59 AM ]

Is that return null there right? In the case of no elements, you should invoke f with no args right?

Comment by Nicola Mometto [ 18/Dec/14 11:04 AM ]

you're right, I didn't know that detail about the behaviour of reduce. Updated the patch to invoke (f) rather than returning nil when the coll is empty





[CLJ-1572] into does not work with IReduceInit Created: 24/Oct/14  Updated: 18/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: transducers

Attachments: Text File clj-1572-2.patch     Text File clj-1572-3.patch     Text File clj-1572-4.patch     Text File CLJ-1572-alternative-POC.patch     Text File clj-1572.patch    
Patch: Code and Test
Approval: Vetted

 Description   

This should work:

(into []
  (reify clojure.lang.IReduceInit
    (reduce [_ f start]
      (reduce f start (range 10)))))
IllegalArgumentException Don't know how to create ISeq from: user$eval5$reify__6
	clojure.lang.RT.seqFrom (RT.java:506)
	clojure.lang.RT.seq (RT.java:487)
	clojure.core/seq--seq--4091 (core.clj:135)
	clojure.core.protocols/seq-reduce (protocols.clj:30)
	clojure.core.protocols/fn--6422 (protocols.clj:42)
	clojure.core.protocols/fn--6369/f--6255--auto----G--6364--6382 (protocols.clj:13)
	clojure.core/reduce (core.clj:6469)
	clojure.core/into (core.clj:6550)

Cause: CollReduce only supports IReduce, not IReduceInit so when reduce calls into it, it falls back to trying to obtain a seq representation which fails.

Proposed: Extend CollReduce to IReduceInit and in the non-init arity, cast to IReduce. Also, now that CollReduce supports both IReduceInit and Iterable, a coll that implements both makes the path through CollReduce nondeterministic. transduce does an explicit check that prefers IReduceInit - the patch copies that approach to reduce as well.

Another consequence of this change is that since PersistentVector implements IReduce but throws on the non-init path, there are some test breakages. To address this, CLJ-1619 (which implements the non-init reduce) must be applied first.

Patch: clj-1572-4.patch
Depends on: CLJ-1619 being applied first
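
The explicit preference for IReduceInit described above can be sketched as follows. This assumes Clojure 1.7 or later, where clojure.lang.IReduceInit exists; reduce-preferring-init, path and both are hypothetical names for illustration and are not taken from the patch.

```clojure
;; Hypothetical helper: check for IReduceInit first, and only otherwise
;; dispatch through the CollReduce protocol.
(defn reduce-preferring-init
  [f init coll]
  (if (instance? clojure.lang.IReduceInit coll)
    (.reduce ^clojure.lang.IReduceInit coll f init)
    (clojure.core.protocols/coll-reduce coll f init)))

;; Records which implementation was actually used.
(def path (atom nil))

;; A coll that is both IReduceInit and Iterable, the ambiguous case.
(def both
  (reify
    clojure.lang.IReduceInit
    (reduce [_ f init]
      (reset! path :reduce-init)
      (reduce f init [1 2 3]))
    Iterable
    (iterator [_]
      (reset! path :iterator)
      (.iterator ^Iterable (list 1 2 3)))))

(reduce-preferring-init + 0 both) ;; => 6
@path                             ;; => :reduce-init
```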



 Comments   
Comment by Alex Miller [ 24/Oct/14 10:40 AM ]

into calls reduce which calls into CollReduce. CollReduce extends to IReduce, but not to IReduceInit. If CollReduce were extended to IReduceInit for the arity it can support, into would work as expected in the given example. Patch clj-1572.patch does this.

Comment by Ghadi Shayban [ 08/Nov/14 4:34 PM ]

It is also possible that core/reduce needs the same special casing of IReduceInit that transduce has to allow for a deterministic dispatch when transduce is called with (mapcat f), as mapcat calls reduce.

Comment by Stuart Halloway [ 10/Nov/14 11:02 AM ]

Can someone please expand on Ghadi's comment with an example of the problem?

Comment by Ghadi Shayban [ 10/Nov/14 11:14 AM ]

Example of something that is Iterable & IReduceInit:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L122-L128

Let's call that r/range in this example:
(transduce (mapcat r/range) + 0 [5 5 5 5 5])

When the mapcat transducer encounters r/range, the inner reduce call will dispatch through CollReduce upon Iterable, rather than IReduceInit.

the inner call to reduce within cat:
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7243

Comment by Alex Miller [ 12/Nov/14 12:55 PM ]

To restate the issue from Ghadi for my own sake:

The CollReduce protocol extends to IReduce, IReduceInit and Iterable. Because these are all interfaces, it's possible for a custom coll to implement two or more of them. In that case, Clojure will arbitrarily pick which protocol impl is called - this can result in the Iterable version being called instead of IReduce/IReduceInit (which should be preferred).

transduce avoids this by explicitly checking for IReduceInit and preferring it over CollReduce.

Ghadi is suggesting that reduce should also make this preference (currently it does not).

Comment by Nicola Mometto [ 17/Nov/14 3:06 PM ]

If CollReduce could be directly backed by the IReduce interface, this would remove the need for explicit IReduceInit checking at the callsite.

It's already possible to (defprotocol CollReduce :on-interface clojure.lang.IReduce ..), I'm proposing adding the ability to map the "reduce" method to the coll-reduce protocol-fn as well and go with this solution

Comment by Alex Miller [ 17/Nov/14 3:21 PM ]

CollReduce extends to two interfaces (IReduceInit and Iterable) and for some impls this is ambiguous under the CollReduce protocol. The check in reduce and transduce is to force the choice of IReduceInit so it is not ambiguous. I think your suggestion re-introduces that issue? Or maybe I'm just not understanding what you mean.

Comment by Nicola Mometto [ 17/Nov/14 3:46 PM ]

Turns out defprotocol already has that capability via :on metadata field.

The attached patch is a proof of concept of my proposal, if there's interest in this approach I can fix the deftype/record/reify method parser to automatically pick the var name rather than having to specify the method name.

Comment by Nicola Mometto [ 17/Nov/14 3:52 PM ]

Ah, I see now the issue. Disregard my patch then.

Comment by Ghadi Shayban [ 14/Dec/14 11:58 AM ]

Note that unless this patch is applied, a plain reduce over an Eduction goes through the seq/iterator path of CollReduce, and not eduction's native IReduceInit path.

Comment by Ghadi Shayban [ 17/Dec/14 5:03 PM ]

with this patch + CLJ-1546

(reduce + [1 2 3]) doesn't work anymore, breaking a few tests.

Comment by Ghadi Shayban [ 17/Dec/14 5:16 PM ]

Should have left a bit more detail.
https://github.com/clojure/clojure/commit/ad7d9c46992cac0e812ce3dd47584c9bb2fda11f

This might not have anything to do with CLJ-1546, just happened to have them both applied. Seems like vectors are both IReduce+IReduceInit, but throw on the IReduce impl.

Vectors were made IReduce before IReduce was split into IReduceInit.

Comment by Nicola Mometto [ 17/Dec/14 5:19 PM ]

I've opened CLJ-1619 with a patch implementing the no-init arity of reduce for PersistentVector

Comment by Nicola Mometto [ 17/Dec/14 5:20 PM ]

An alternative fix would be to just make PersistentVectors IReduceInit rather than IReduce but I don't see the point in doing that since the implementation is trivial.





[CLJ-1451] Add take-until Created: 20/Jun/14  Updated: 18/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Alexander Taggart Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: None

Attachments: Text File 0001-CLJ-1451-add-take-until.patch     Text File 0001-CLJ-1451-add-take-until.patch     Text File 0002-CLJ-1451-add-drop-until.patch     Text File 0003-let-take-until-and-drop-until-return-transducers.patch     Text File CLJ-1451-drop-until.patch     Text File CLJ-1451-take-until.patch    
Patch: Code and Test
Approval: Triaged

 Description   

Discussion: https://groups.google.com/d/topic/clojure-dev/NaAuBz6SpkY/discussion

It comes up when I would otherwise use (take-while pred coll), but I need to include the first item for which (pred item) is false.

(take-while pos? [1 2 0 3]) => (1 2)
(take-until zero? [1 2 0 3]) => (1 2 0)

Impl:

(defn take-until
  "Returns a lazy sequence of successive items from coll until
  (pred item) returns true, including that item. pred must be
  free of side-effects."
  [pred coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (if (pred (first s))
        (cons (first s) nil)
        (cons (first s) (take-until pred (rest s)))))))

List of other suggested names: take-upto, take-to, take-through. It is not easy to find something in English that is short and unambiguously means "up to and including". That is one of the dictionary definitions for "through".



 Comments   
Comment by Alex Miller [ 20/Jun/14 10:21 AM ]

Patch welcome (w/tests).

Comment by Alexander Taggart [ 20/Jun/14 2:00 PM ]

Impl and tests for take-until and drop-until, one patch for each.

Comment by Jozef Wagner [ 20/Jun/14 3:01 PM ]

Please change :added metadata to "1.7".

Comment by Alexander Taggart [ 20/Jun/14 3:12 PM ]

Updated to :added "1.7"

Comment by John Mastro [ 21/Jun/14 6:26 PM ]

I'd like to propose take-through and drop-through as alternative names. I think "through" communicates more clearly how these differ from take-while and drop-while.

Comment by Andy Fingerhut [ 06/Aug/14 2:27 PM ]

Both patches CLJ-1451-drop-until.patch and CLJ-1451-take-until.patch dated Jun 20 2014 no longer apply cleanly to latest Clojure master due to some changes committed earlier today. I haven't checked whether they are straightforward to update, but would guess that they merely require updating a few lines of diff context.

See the section "Updating stale patches" at http://dev.clojure.org/display/community/Developing+Patches for suggestions on how to update patches.

Comment by Ghadi Shayban [ 13/Nov/14 11:19 PM ]

Would be nice to cover the transducer case too.

Comment by Michael Blume [ 13/Nov/14 11:54 PM ]

rerolled patches

Comment by Michael Blume [ 14/Nov/14 12:11 AM ]

Covered transducer case =)

Comment by Michael Blume [ 14/Nov/14 12:12 AM ]

Actually I like take/drop-through as well

Comment by Ghadi Shayban [ 16/Nov/14 12:41 PM ]

Michael, no volatile/state is necessary in the transducer, like take-while. Just wrap in 'reduced to terminate
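
A stateless transducer along these lines might look like the sketch below. It is not the attached patch; take-until-xf is a hypothetical name, and it assumes Clojure 1.7 or later for ensure-reduced and the transducer arity of into.

```clojure
;; Hypothetical sketch: pass every item through, and once (pred item)
;; is true, include that item and terminate by wrapping in reduced.
(defn take-until-xf
  [pred]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input]
       (if (pred input)
         (ensure-reduced (rf result input))
         (rf result input))))))

(into [] (take-until-xf zero?) [1 2 0 3]) ;; => [1 2 0]
```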

Comment by Michael Blume [ 17/Dec/14 6:47 PM ]

a) you're clearly right about take-until

b) seriously I don't know what I was thinking with my take-until implementation, I'm going to claim lack of sleep.

c) I'm confused about how to make drop-until work without a volatile

Comment by Michael Blume [ 18/Dec/14 1:52 AM ]

Ghadi and I discussed this and couldn't think of a use case for drop-until. Are there any?

Here's a new take-until patch, generative tests included.

Open questions:

Is take-until a good name? My biggest concern is that take-until makes it sound like a slight modification of take, but this function reverses the sense of the predicate relative to take.





[CLJ-1580] Transient collections should guarantee thread visibility Created: 05/Nov/14  Updated: 17/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: transient

Attachments: Text File clj-1580.patch    
Patch: Code
Approval: Vetted

 Description   

With changes from CLJ-1498, transients are still thread isolated but may move between threads during their lifetime which introduces new concurrency concerns, namely visibility of changes across threads.

Approach: Make all transient collection fields either final or volatile to ensure visibility across threads.
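
The patch itself changes Java fields, but the same idea can be illustrated in Clojure with a deftype whose mutable field is declared volatile. This is a sketch only; Settable, set-value!, VolatileBox and box are hypothetical names, not part of clj-1580.patch.

```clojure
(defprotocol Settable
  (set-value! [this v]))

;; ^:volatile-mutable makes the field a Java volatile, so a write made on
;; one thread is visible to later reads on other threads.
(deftype VolatileBox [^:volatile-mutable value]
  Settable
  (set-value! [this v]
    (set! value v)
    this)
  clojure.lang.IDeref
  (deref [_] value))

(def box (VolatileBox. 0))

;; Write on another thread, then read on this one. Thread.join already
;; gives a happens-before edge here; the volatile field matters when the
;; hand-off between threads has no such synchronization.
(doto (Thread. #(set-value! box 42))
  .start
  .join)

@box ;; => 42
```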

Patch: clj-1580.patch

Screened by:






[CLJ-700] contains? broken for transient collections Created: 01/Jan/11  Updated: 17/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2
Fix Version/s: Release 1.8

Type: Defect Priority: Critical
Reporter: Herwig Hochleitner Assignee: Unassigned
Resolution: Unresolved Votes: 14
Labels: transient

Attachments: Java Source File 0001-Refactor-of-some-of-the-clojure-.java-code-to-fix-CL.patch     File clj-700-7.diff     File clj-700-8.diff     File clj-700.diff     Text File clj-700-patch4.txt     Text File clj-700-patch6.txt     Text File clj-700-rt.patch    
Patch: Code and Test
Approval: Vetted

 Description   

Behavior with Clojure 1.6.0:

user=> (contains? (transient {:x "fine"}) :x)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentArrayMap$TransientArrayMap  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (contains? (transient (hash-map :x "fine")) :x)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentHashMap$TransientHashMap  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (contains? (transient [1 2 3]) 0)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentVector$TransientVector  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (contains? (transient #{:x}) :x)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentHashSet$TransientHashSet  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (:x (transient #{:x}))
nil
;; expected: :x

user=> (get (transient #{:x}) :x)
nil
;; expected: :x

Behavior with latest Clojure master as of Jun 27 2014 (same as Clojure 1.6.0) plus patch clj-700-7.diff. In all cases it matches the expected results shown in comments above:

user=> (contains? (transient {:x "fine"}) :x)
true
user=> (contains? (transient (hash-map :x "fine")) :x)
true
user=> (contains? (transient [1 2 3]) 0)
true
user=> (contains? (transient #{:x}) :x)
true
user=> (:x (transient #{:x}))
:x
user=> (get (transient #{:x}) :x) 
:x

Analysis by Alexander Redington: This is caused by expectations in clojure.lang.RT regarding the type of collections for some methods, e.g. contains() and getFrom(). Checking for contains looks to see if the instance passed in is Associative (a subinterface of IPersistentCollection), or IPersistentSet.

This patch refactors several of the Clojure interfaces so that logic abstract from the issue of immutability is pulled out to a general interface (e.g. ISet, IAssociative), but preserves the contract specified (e.g. Associatives only return Associatives when calling assoc()).

With more general interfaces in place the contains() and getFrom() methods were then altered to conditionally use the general interfaces which are agnostic of persistence vs. transience. Includes tests in transients.clj to verify the changes fix this problem.

Questions on this approach from Stuart Halloway to Rich Hickey:

1. this represents working back from the defect to rethinking abstractions (good!). Does it go far enough?

2. what are good names for the interfaces introduced here?

Alex Miller: Should also keep an eye on CLJ-787 as it may have some collisions with this one.

Patch: clj-700-8.diff

One 'trailing whitespace' warning is perfectly normal when applying this patch to latest Clojure master as of Sep 1 2014, as shown below. This is simply because of carriage returns at the end of lines in file Associative.java. I know of no way to avoid such a warning without removing CRs from all Clojure source files (e.g. CLJ-1026):

% git am -s --keep-cr --ignore-whitespace < ~/clj/patches/clj-700-8.diff
Applying: Refactor of some of the clojure .java code to fix CLJ-700.
/Users/andy/clj/latest-clj/clojure/.git/rebase-apply/patch:29: trailing whitespace.
public interface Associative extends IPersistentCollection, IAssociative{
warning: 1 line adds whitespace errors.
Applying: more CLJ-700: refresh to use hasheq

------

Adding an addendum here for now. Needs more discussion and clean up before screening. I added clj-700-rt.patch, which takes a completely different, less invasive approach to solving this issue. - Alex M



 Comments   
Comment by Herwig Hochleitner [ 01/Jan/11 8:01 PM ]

the same is also true for TransientVectors

(contains? (transient [1 2 3]) 0)

false

Comment by Herwig Hochleitner [ 01/Jan/11 8:25 PM ]

As expected, TransientSets have the same issue; plus an additional, probably related one.

(:x (transient #{:x}))

nil

(get (transient #{:x}) :x)

nil

Comment by Alexander Redington [ 07/Jan/11 2:07 PM ]

This is caused by expectations in clojure.lang.RT regarding the type of collections for some methods, e.g. contains() and getFrom(). Checking for contains looks to see if the instance passed in is Associative (a subinterface of IPersistentCollection), or IPersistentSet.

This patch refactors several of the Clojure interfaces so that logic abstract from the issue of immutability is pulled out to a general interface (e.g. ISet, IAssociative), but preserves the contract specified (e.g. Associatives only return Associatives when calling assoc()).

With more general interfaces in place the contains() and getFrom() methods were then altered to conditionally use the general interfaces which are agnostic of persistence vs. transience. Includes tests in transients.clj to verify the changes fix this problem.

Comment by Stuart Halloway [ 28/Jan/11 10:35 AM ]

Rich: Patch doesn't currently apply, but I would like to get your take on approach here. In particular:

  1. this represents working back from the defect to rethinking abstractions (good!). Does it go far enough?
  2. what are good names for the interfaces introduced here?

Comment by Alexander Redington [ 25/Mar/11 7:44 AM ]

Rebased the patch off the latest pull of master as of 3/25/2011, it should apply cleanly now.

Comment by Stuart Sierra [ 17/Feb/12 2:59 PM ]

Latest patch does not apply as of f5bcf647

Comment by Andy Fingerhut [ 17/Feb/12 5:59 PM ]

clj-700-patch2.txt does patch cleanly to latest Clojure head as of a few mins ago. No changes to patch except in context around changed lines.

Comment by Andy Fingerhut [ 07/Mar/12 3:23 AM ]

Sigh. Git patches applied via 'git am' are fragile beasts indeed. Look at them the wrong way and they fail to apply.

clj-700-patch3.txt applies cleanly to latest master as of Mar 7, 2012, but not if you use this command:

git am -s < clj-700-patch3.txt

I am pretty sure this is because of DOS CR/LF line endings in the file src/jvm/clojure/lang/Associative.java. The patch does apply cleanly if you use this command:

git am --keep-cr -s < clj-700-patch3.txt

Comment by Andy Fingerhut [ 23/Mar/12 6:34 PM ]

This ticket was changed to Incomplete and waiting on Rich when Stuart Halloway asked for feedback on the approach on 28/Jan/2011. Stuart Sierra changed it to not waiting on Rich on 17/Feb/2012 when he noted the patch didn't apply cleanly. Latest patch clj-700-patch3.txt does apply cleanly, but doesn't change the approach used since the time Stuart Halloway's concern was raised. Should it be marked as waiting on Rich again? Something else?

Comment by Stuart Halloway [ 08/Jun/12 12:44 PM ]

Patch 4 incorporates patch 3, and brings it up to date on hashing (i.e. uses hasheq).

Comment by Andy Fingerhut [ 08/Jun/12 12:52 PM ]

Removed clj-700-patch3.txt in favor of Stuart Halloway's improved clj-700-patch4.txt dated June 8, 2012.

Comment by Andy Fingerhut [ 18/Jun/12 3:06 PM ]

clj-700-patch5.txt dated June 18, 2012 is the same as Stuart Halloway's clj-700-patch4.txt, except for context lines that have changed in Clojure master since Stuart's patch was created. clj-700-patch4.txt no longer applies cleanly.

Comment by Andy Fingerhut [ 19/Aug/12 4:47 AM ]

Adding clj-700-patch6.txt, which is identical to Stuart Halloway's clj-700-patch4.txt, except that it applies cleanly to latest master as of Aug 19, 2012. Note that as described above, you must use the --keep-cr option to 'git am' when applying this patch for it to succeed. Removing clj-700-patch5.txt, since it no longer applies cleanly.

Comment by Stuart Sierra [ 24/Aug/12 1:08 PM ]

Patch fails as of commit 1c8eb16a14ce5daefef1df68d2f6b1f143003140

Comment by Andy Fingerhut [ 24/Aug/12 1:53 PM ]

Which patch did you try, and what command did you use? I tried applying clj-700-patch6.txt to the same commit, using the following command, and it applied, albeit with the warning messages shown:

% git am --keep-cr -s < clj-700-patch6.txt
Applying: Refactor of some of the clojure .java code to fix CLJ-700.
/Users/jafinger/clj/latest-clj/clojure/.git/rebase-apply/patch:29: trailing whitespace.
public interface Associative extends IPersistentCollection, IAssociative{
warning: 1 line adds whitespace errors.
Applying: more CLJ-700: refresh to use hasheq

Note the --keep-cr option, which is necessary for this patch to succeed. It is recommended in the "Screening Tickets" section of the JIRA workflow wiki page here: http://dev.clojure.org/display/design/JIRA+workflow

Comment by Andy Fingerhut [ 28/Aug/12 5:48 PM ]

Presumptuously changing Approval from Incomplete back to None, since the latest patch does apply cleanly if the --keep-cr option is used. It was in Screened state recently, but I'm not so presumptuous as to change it to Screened

Comment by Alex Miller [ 19/Aug/13 12:26 PM ]

I think through a series of different hands on this ticket it got knocked way back in the list. Re-marking vetted as it's previously been all the way up through screening. Should also keep an eye on CLJ-787 as it may have some collisions with this one.

Comment by Andy Fingerhut [ 08/Nov/13 10:14 AM ]

clj-700-7.diff is identical to clj-700-patch6.txt, except it applies cleanly to latest master. Only some lines of context in a test file have changed.

When I say "applies cleanly", I mean that there is one warning when using the proper "git am" command from the dev wiki page. This is because one line replaced in Associative.java has a CR/LF at the end of the line, because all lines in that file do.

Comment by Herwig Hochleitner [ 17/Feb/14 9:54 AM ]

Since clojure 1.5, contains? throws an IllegalArgumentException on transients.
In 1.6.0-beta1, transients are no longer marked as alpha.

Does this mean that we won't be able to distinguish between a nil value and no value on a transient?
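
On the nil-versus-absent question, one possible workaround is to pass a unique sentinel as the not-found argument to get. This is a sketch under the assumption that the transient is a map or a vector; per the description, get on a transient set returns nil on 1.6, so sets would not be covered on affected versions. contains-via-get? is a hypothetical helper name.

```clojure
;; Hypothetical helper: a key is present if get does not hand back
;; the freshly created sentinel object.
(defn contains-via-get?
  [coll k]
  (let [sentinel (Object.)]
    (not (identical? sentinel (get coll k sentinel)))))

(contains-via-get? (transient {:x nil}) :x) ;; => true, key present with nil value
(contains-via-get? (transient {:x nil}) :y) ;; => false, key absent
```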

Comment by Stuart Halloway [ 27/Jun/14 10:20 AM ]

Request for someone to (1) update patch to apply cleanly, and (2) summarize approach so I don't have to read through the comment history.

Comment by Andy Fingerhut [ 27/Jun/14 11:02 AM ]

The latest patch is clj-700-7.diff dated Nov 8, 2013. I believe it is impossible to create a patch that applies any more cleanly using git for source files that have carriage returns in them, which at least one modified source file does. Here is the command I used on latest Clojure master as of today (Jun 27 2014), which is the same as that of March 25 2014:

% git am -s --keep-cr --ignore-whitespace < ~/clj/patches/clj-700-7.diff 
Applying: Refactor of some of the clojure .java code to fix CLJ-700.
/Users/admin/clj/latest-clj/clojure/.git/rebase-apply/patch:29: trailing whitespace.
public interface Associative extends IPersistentCollection, IAssociative{
warning: 1 line adds whitespace errors.
Applying: more CLJ-700: refresh to use hasheq

If you want a patch that doesn't have the 'trailing whitespace' warning in it, I think someone would have to commit a change that removed the carriage returns from file Associative.java. If you want such a patch, let me know and we can remove all of them from every source file and be done with this annoyance.

Comment by Andy Fingerhut [ 27/Jun/14 11:19 AM ]

Updated description to contain a copy of only those comments that seemed 'interesting'. Most comments have simply been "attached an updated patch that applies cleanly", or "changed the state of this ticket for reason X".

Comment by Alex Miller [ 27/Jun/14 1:19 PM ]

Looks like Andy did as requested, moving back to Screenable.

Comment by Andy Fingerhut [ 29/Aug/14 4:27 PM ]

Patch clj-700-7.diff dated Nov 8 2013 no longer applied cleanly to latest master after some commits were made to Clojure on Aug 29, 2014. It did apply cleanly before that day.

I have not checked how easy or difficult it might be to update this patch.

Comment by Andy Fingerhut [ 01/Sep/14 3:59 AM ]

Patch clj-700-8.diff dated Sep 1 2014 is identical to clj-700-7.diff, except that it applies "cleanly" to latest master, by which I mean it applies as cleanly as I think it is possible to apply for a git patch to a file with carriage return/line feed line endings, as one of the modified files still does.

Comment by Alex Miller [ 17/Dec/14 3:12 PM ]

Added new patch with alternate approach that just makes RT know about transients instead of refactoring the class hierarchy.

clj-700-rt.patch

In some ways I think the class hierarchy refactoring is due, but I'm not totally on board with all the changes in those patches and it has impacts on collections outside Clojure itself that are hard to reason about.





[CLJ-1601] transducer arities for map-indexed, distinct, and interpose Created: 25/Nov/14  Updated: 17/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: transducers

Attachments: Text File clj-1601-2.patch     Text File clj-1601-3.patch     Text File clj-1601.patch    
Patch: Code and Test
Approval: Vetted

 Description   
  • with generative tests
  • with examples demonstrating performance

Performance: Details in comments, summary:

(def v (vec (concat (range 1000) (range 1000))))
(into [] (distinct v))            ;; 821.3 µs
(into [] (distinct) v)            ;; 388.2 µs
(into [] (interpose nil v))       ;; 316.0 µs
(into [] (interpose nil) v)       ;; 35.5 µs
(into [] (map-indexed vector v))  ;; 76.8 µs
(into [] (map-indexed vector) v)  ;; 49.4 µs
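
As a sanity check that the two forms in each pair compute the same thing, here is a small sketch on a shorter input. It assumes a Clojure version where the transducer arities exist (1.7 or later); small is a name introduced only for this example.

```clojure
(def small (vec (concat (range 5) (range 5))))

;; Sequence arity: builds a lazy seq first, then pours it into a vector.
(into [] (distinct small))   ;; => [0 1 2 3 4]
;; Transducer arity: same result without the intermediate seq.
(into [] (distinct) small)   ;; => [0 1 2 3 4]

(into [] (interpose :sep [1 2 3]))   ;; => [1 :sep 2 :sep 3]
(into [] (interpose :sep) [1 2 3])   ;; => [1 :sep 2 :sep 3]

(into [] (map-indexed vector [:a :b]))   ;; => [[0 :a] [1 :b]]
(into [] (map-indexed vector) [:a :b])   ;; => [[0 :a] [1 :b]]
```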

Patch: clj-1601-3.patch

Screened by:



 Comments   
Comment by Alex Miller [ 25/Nov/14 11:54 AM ]

working on this

Comment by Alex Miller [ 25/Nov/14 4:22 PM ]

Initial patch with impls. Tests and perf still to do.

Comment by Alex Miller [ 27/Nov/14 7:09 AM ]

Perf tests, summarized in description:

user=> (use 'criterium.core)
nil
user=> (def v (vec (concat (range 1000) (range 1000))))
#'user/v
user=> (quick-bench (into [] (distinct v)))
WARNING: Final GC required 10.433088780213309 % of runtime
Evaluation count : 744 in 6 samples of 124 calls.
             Execution time mean : 821.339608 µs
    Execution time std-deviation : 11.351053 µs
   Execution time lower quantile : 811.901435 µs ( 2.5%)
   Execution time upper quantile : 837.972000 µs (97.5%)
                   Overhead used : 1.794010 ns
nil
user=> (quick-bench (into [] (distinct) v))
WARNING: Final GC required 10.78492057474076 % of runtime
Evaluation count : 14028 in 6 samples of 2338 calls.
             Execution time mean : 43.630656 µs
    Execution time std-deviation : 170.185825 ns
   Execution time lower quantile : 43.433193 µs ( 2.5%)
   Execution time upper quantile : 43.853959 µs (97.5%)
                   Overhead used : 1.794010 ns
				   
user=> (quick-bench (into [] (interpose nil v)))
WARNING: Final GC required 10.79555726490133 % of runtime
Evaluation count : 1914 in 6 samples of 319 calls.
             Execution time mean : 316.024853 µs
    Execution time std-deviation : 9.077484 µs
   Execution time lower quantile : 310.139273 µs ( 2.5%)
   Execution time upper quantile : 330.917486 µs (97.5%)
                   Overhead used : 1.794010 ns

Found 1 outliers in 6 samples (16.6667 %)
	low-severe	 1 (16.6667 %)
 Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
nil
user=> (quick-bench (into [] (interpose nil) v))
WARNING: Final GC required 10.70401297525592 % of runtime
Evaluation count : 17022 in 6 samples of 2837 calls.
             Execution time mean : 35.592672 µs
    Execution time std-deviation : 560.066138 ns
   Execution time lower quantile : 35.252348 µs ( 2.5%)
   Execution time upper quantile : 36.553414 µs (97.5%)
                   Overhead used : 1.794010 ns

Found 1 outliers in 6 samples (16.6667 %)
	low-severe	 1 (16.6667 %)
 Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
nil

user=> (quick-bench (into [] (map-indexed vector v)))
WARNING: Final GC required 12.45755646853723 % of runtime
Evaluation count : 7338 in 6 samples of 1223 calls.
             Execution time mean : 76.807691 µs
    Execution time std-deviation : 381.019170 ns
   Execution time lower quantile : 76.433202 µs ( 2.5%)
   Execution time upper quantile : 77.170733 µs (97.5%)
                   Overhead used : 1.794010 ns
nil
user=> (quick-bench (into [] (map-indexed vector) v))
WARNING: Final GC required 11.38700971837483 % of runtime
Evaluation count : 12474 in 6 samples of 2079 calls.
             Execution time mean : 49.458043 µs
    Execution time std-deviation : 620.716737 ns
   Execution time lower quantile : 48.995801 µs ( 2.5%)
   Execution time upper quantile : 50.229507 µs (97.5%)
                   Overhead used : 1.794010 ns
Comment by Alex Miller [ 17/Dec/14 1:50 PM ]

Updated based on comment from Christophe Grand that java.util.HashSet used in distinct impl had different hash/equality semantics than the set used with sequences.





[CLJ-1544] AOT bug involving namespaces loaded before AOT compilation started Created: 01/Oct/14  Updated: 17/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Allen Rohner Assignee: Unassigned
Resolution: Unresolved Votes: 4
Labels: aot

Attachments: Text File 0001-CLJ-1544-force-reloading-of-namespaces-during-AOT-co.patch     Text File 0001-CLJ-1544-force-reloading-of-namespaces-during-AOT-co-v2.patch    
Patch: Code
Approval: Vetted

 Description   

If namespace "a" that is being AOT compiled requires a namespace "b" that has been loaded but not AOT compiled, the classfile for "b" will never be emitted on disk, causing errors when compiling uberjars or in other cases.

A minimal reproducible case is described in the following comment: http://dev.clojure.org/jira/browse/CLJ-1544?focusedCommentId=36734&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-36734

Other examples of the bug:
https://github.com/arohner/clj-aot-repro
https://github.com/methylene/class-not-found

A real issue triggered by this bug: https://github.com/cemerick/austin/issues/23

Approach: The approach taken by the attached patch is to force reloading of namespaces during AOT compilation if no matching classfile is found in the compile-path or in the classpath



 Comments   
Comment by Alex Miller [ 04/Dec/14 12:45 PM ]

Possibly related: CLJ-1457

Comment by Nicola Mometto [ 05/Dec/14 4:51 AM ]

Has anyone been able to reproduce this bug from a bare clojure repl? I have been trying to take lein out of the equation for an hour but I don't seem to be able to reproduce it – this makes me think that it's possible that this is a lein/classlojure/nrepl issue rather than a compiler/classloader bug

Comment by Nicola Mometto [ 06/Dec/14 4:20 PM ]

I was actually able to reproduce and understand this bug thanks to a minimal example reduced from a testcase for CLJ-1413.

>cat error.sh
#!/bin/sh

rm -rf target && mkdir target

java -cp src:clojure.jar clojure.main - <<EOF
(require 'myrecord)
(set! *compile-path* "target")
(compile 'core)
EOF

java -cp target:clojure.jar clojure.main -e "(use 'core)"

> cat src/core.clj
(in-ns 'core)
(clojure.core/require 'myrecord)
(clojure.core/import myrecord.somerecord)

>cat src/myrecord.clj
(in-ns 'myrecord)
(clojure.core/defrecord somerecord [])

> ./error.sh
Exception in thread "main" java.lang.ExceptionInInitializerError
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:344)
	at clojure.lang.RT.classForName(RT.java:2113)
	at clojure.lang.RT.classForName(RT.java:2122)
	at clojure.lang.RT.loadClassForName(RT.java:2141)
	at clojure.lang.RT.load(RT.java:430)
	at clojure.lang.RT.load(RT.java:411)
	at clojure.core$load$fn__5403.invoke(core.clj:5808)
	at clojure.core$load.doInvoke(core.clj:5807)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at clojure.core$load_one.invoke(core.clj:5613)
	at clojure.core$load_lib$fn__5352.invoke(core.clj:5653)
	at clojure.core$load_lib.doInvoke(core.clj:5652)
	at clojure.lang.RestFn.applyTo(RestFn.java:142)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$load_libs.doInvoke(core.clj:5691)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.core$apply.invoke(core.clj:630)
	at clojure.core$use.doInvoke(core.clj:5785)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at user$eval212.invoke(NO_SOURCE_FILE:1)
	at clojure.lang.Compiler.eval(Compiler.java:6767)
	at clojure.lang.Compiler.eval(Compiler.java:6730)
	at clojure.core$eval.invoke(core.clj:3076)
	at clojure.main$eval_opt.invoke(main.clj:288)
	at clojure.main$initialize.invoke(main.clj:307)
	at clojure.main$null_opt.invoke(main.clj:342)
	at clojure.main$main.doInvoke(main.clj:420)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: java.io.FileNotFoundException: Could not locate myrecord__init.class or myrecord.clj on classpath.
	at clojure.lang.RT.load(RT.java:443)
	at clojure.lang.RT.load(RT.java:411)
	at clojure.core$load$fn__5403.invoke(core.clj:5808)
	at clojure.core$load.doInvoke(core.clj:5807)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at clojure.core$load_one.invoke(core.clj:5613)
	at clojure.core$load_lib$fn__5352.invoke(core.clj:5653)
	at clojure.core$load_lib.doInvoke(core.clj:5652)
	at clojure.lang.RestFn.applyTo(RestFn.java:142)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$load_libs.doInvoke(core.clj:5691)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.core$apply.invoke(core.clj:628)
	at clojure.core$require.doInvoke(core.clj:5774)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at core__init.load(Unknown Source)
	at core__init.<clinit>(Unknown Source)
	... 33 more

This bug has also affected Austin: https://github.com/cemerick/austin/issues/23

Essentially this bug manifests itself when a namespace defining a protocol or a type/record has been JIT loaded and a namespace that needs the protocol/type/record class is being AOT compiled later. Since the namespace defining the class has already been loaded the class is never emitted on disk.

Comment by Nicola Mometto [ 06/Dec/14 6:51 PM ]

I've attached a tentative patch fixing the issue in the only way I found reasonable: forcing the reloading of namespaces during AOT compilation if the compiled classfile is not found in the compile-path or in the classpath

Comment by Nicola Mometto [ 06/Dec/14 7:30 PM ]

Updated patch forces reloading of the namespace even if a classfile exists in the compile-path but the source file is newer, mimicking the logic of clojure.lang.RT/load

Comment by Nicola Mometto [ 06/Dec/14 7:39 PM ]

Further testing demonstrated that this bug is not only scoped to deftypes/defprotocols but can manifest itself in the general case of a namespace "a" requiring a namespace "b" already loaded, and AOT compiling the namespace "a"

Comment by Tassilo Horn [ 08/Dec/14 4:46 AM ]

I'm also affected by this bug. Is there some workaround I can apply in the meantime, e.g., by dictating the order in which namespaces are going to be loaded/compiled in project.clj?

Comment by Nicola Mometto [ 15/Dec/14 10:58 AM ]

Tassilo, if you don't have control over whether or not a namespace that an AOT namespace depends on has already been loaded before compilation starts, requiring those namespaces with :reload-all should be enough to work around this issue

Comment by Tassilo Horn [ 15/Dec/14 11:36 AM ]

Nicola, thanks! But in the meantime I've switched to using clojure.java.api and omit AOT-compilation. That works just fine, too.

Comment by Michael Blume [ 15/Dec/14 5:05 PM ]

Tassilo, that's often a good solution. Another is to use a shim Clojure class:

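(ns myproject.main-shim (:gen-class))

(defn -main [& args]
  ;; Load the real entry namespace at runtime so it is not AOT-compiled.
  (require 'myproject.main)
  ;; Assumes myproject.main defines -main; resolve needs a var symbol, not a namespace symbol.
  (apply (resolve 'myproject.main/-main) args))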

then your shim namespace is AOT-compiled but nothing else in your project is.

Comment by Tassilo Horn [ 16/Dec/14 1:07 AM ]

Thanks Michael, that's a very good suggestion. In fact, I've always used AOT only as a means to export some functions to Java-land. Basically, I did as you suggest but required the to-be-exported fn's namespace in the ns-form which then causes AOT-compilation of that namespace and its own deps recursively. So your approach seems to be as convenient from the Java side (no need to clojure.java.require `require` in order to require the namespace with the fn I wanna call ) while still omitting AOT. Awesome!





[CLJ-979] Clojure resolves to wrong deftype classes when AOT compiling or reloading Created: 03/May/12  Updated: 17/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3, Release 1.4, Release 1.5, Release 1.6, Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Edmund Jackson Assignee: Unassigned
Resolution: Unresolved Votes: 13
Labels: aot, classloader, compiler

Attachments: Text File CLJ-979.patch     Text File clj-979-symptoms.patch     Text File CLJ-979-v2.patch     Text File CLJ-979-v3.patch     Text File CLJ-979-v4.patch     Text File CLJ-979-v5.patch     Text File CLJ-979-v6.patch     Text File CLJ-979-v7.patch    
Patch: Code and Test
Approval: Vetted

 Description   

Compiling a class via `deftype` during AOT compilation gives different results for the different constructors. These hashes should be identical.

user=> (binding [*compile-files* true] (eval '(deftype Abc [])))
user.Abc
user=> (hash Abc)
16446700
user=> (hash (class (->Abc)))
31966239 ;; should be 16446700

This also means that whenever there's a stale AOT compiled deftype class in the classpath, that class will be used rather than the JIT compiled one, breaking repl interaction.

Another demonstration of this classloader issue (from CLJ-1495) when reloading deftypes (no AOT) :

user> (defrecord Foo [bar])
user.Foo
user> (= (->Foo 42) #user.Foo{:bar 42}) ;;expect this to evaluate to true
true
user> (defrecord Foo [bar])
user.Foo
user> (= (->Foo 42) #user.Foo{:bar 42}) ;;expect this to evaluate to true also -- but it doesn't!
false
user>

This bug also affects AOT compilation of multimethods that dispatch on a class. This affected core.match for years; see http://dev.clojure.org/jira/browse/MATCH-86 and http://dev.clojure.org/jira/browse/MATCH-98. David had to work around this issue by using a bunch of protocols instead of multimethods.

Cause of the bug: currently Clojure uses Class.forName to resolve a class from a class name, which ignores the class cache of DynamicClassLoader. As a result, reloading deftypes or mixing AOT compilation at the repl with deftypes breaks, resolving to the wrong class.

Approach: the current patch (CLJ-979-v7.patch) addresses this issue in multiple ways:

  • it makes RT.classForName/classForNameNonLoading look in the class cache before delegating to Class/forName if the current classloader is not a DynamicClassLoader (this incidentally addresses also CLJ-1457)
  • it makes clojure use RT.classForName/classForNameNonLoading instead of Class/forName
  • it overrides Classloader/loadClass so that it's class cache aware – this method is used by the jvm to load classes
  • it changes gen-interface to always emit an in-memory interface alongside the [optional] on-disk interface so that the in-memory class is always updated.
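
The lookup order behind the third bullet can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from CLJ-979-v7.patch: CachingLoader, StaleDiskLoader, the shared CACHE map and the hand-assembled empty class P are hypothetical names invented for this example. StaleDiskLoader stands in for a stale classfile on disk, and the cacheFirst flag switches between the default parent-first delegation and a class-cache-aware loadClass.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheFirstDemo {

    /** Shared in-memory class cache (hypothetical stand-in for the DynamicClassLoader class cache). */
    static final Map<String, Class<?>> CACHE = new ConcurrentHashMap<>();

    /** Assembles the class file of an empty public class (default package, no members). */
    static byte[] emptyClassBytes(String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (Java 8)
            out.writeShort(5);                                   // constant pool count (4 entries + 1)
            out.writeByte(7); out.writeShort(2);                 // #1 Class, name at #2
            out.writeByte(1); out.writeUTF(name);                // #2 Utf8, this class name
            out.writeByte(7); out.writeShort(4);                 // #3 Class, name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8, superclass name
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class = #1
            out.writeShort(3);                                   // super_class = #3
            out.writeShort(0);                                   // no interfaces
            out.writeShort(0);                                   // no fields
            out.writeShort(0);                                   // no methods
            out.writeShort(0);                                   // no attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for a stale classfile on disk: a parent loader that already defines the class. */
    static class StaleDiskLoader extends ClassLoader {
        StaleDiskLoader(String name, byte[] bytes) {
            super((ClassLoader) null);
            defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Loader that shares CACHE with its siblings and optionally consults it before the parent. */
    static class CachingLoader extends ClassLoader {
        private final boolean cacheFirst;

        CachingLoader(ClassLoader parent, boolean cacheFirst) {
            super(parent);
            this.cacheFirst = cacheFirst;
        }

        Class<?> defineInMemory(String name, byte[] bytes) {
            Class<?> c = defineClass(name, bytes, 0, bytes.length);
            CACHE.put(name, c);
            return c;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Only reached after the parent has failed, so a parent-visible class always wins here.
            Class<?> c = CACHE.get(name);
            if (c == null) throw new ClassNotFoundException(name);
            return c;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (cacheFirst) {
                Class<?> c = CACHE.get(name);
                if (c != null) return c;          // in-memory class shadows the parent's copy
            }
            return super.loadClass(name, resolve); // default parent-first delegation
        }
    }

    /** True when a lookup of "P" through a fresh sibling loader yields the in-memory class. */
    public static boolean seesInMemoryClass(boolean cacheFirst) {
        byte[] bytes = emptyClassBytes("P");
        ClassLoader disk = new StaleDiskLoader("P", bytes);
        Class<?> inMemory = new CachingLoader(disk, cacheFirst).defineInMemory("P", bytes);
        try {
            return new CachingLoader(disk, cacheFirst).loadClass("P") == inMemory;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("parent-first sees in-memory class: " + seesInMemoryClass(false));
        System.out.println("cache-first sees in-memory class: " + seesInMemoryClass(true));
    }
}
```

With parent-first delegation the sibling loader is handed the parent's copy of P, so the first line should report false; with the cache consulted first it gets the class that was defined in memory, so the second should report true.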


 Comments   
Comment by Scott Lowe [ 12/May/12 9:05 PM ]

I can't reproduce this under Clojure 1.3 or 1.4, and Leiningen 1.7.1 on either Java 1.7.0-jdk7u4-b21 OpenJDK 64-Bit or Java 1.6.0_31 Java HotSpot 64-Bit. OS is Mac OS X 10.7.

Edmund, how are you running this AOT code? I wrapped your code in a main function and built an uberjar from it.

Comment by Edmund Jackson [ 13/May/12 2:20 AM ]

Hi Scott,

Interesting.

I have two use cases
1. AOT compile and call from repl.
My steps: git clone, lein compile, lein repl, (use 'aots.death), (in-ns 'aots.death), (= (class (Dontwork. nil)) (class (map->Dontwork {:a 1}))) => false

2. My original use case, which I've minimised here, is an AOT ns producing a genclass that is instantiated from other Java code (no main). This produces the same error. I will produce an example of this and post it too.

Comment by Edmund Jackson [ 13/May/12 4:23 AM ]

Hi Scott,

Here is an example of it failing in the interop case: https://github.com/ejackson/aotquestion2
The steps I'm following to compile this all up are

git clone git@github.com:ejackson/aotquestion2.git
cd aotquestion2/cljside/
lein uberjar
lein install
cd ../javaside/
mvn package
java -jar ./target/aotquestion-1.0-SNAPSHOT.jar

and it dies with this:

Exception in thread "main" java.lang.ClassCastException: cljside.core.Dontwork cannot be cast to cljside.core.Dontwork
at cljside.MyClass.makeDontwork(Unknown Source)
at aotquestion.App.main(App.java:8)

The error message is really confusing (to me, anyway), but I think it's the same root problem as for the REPL case.

What do you see when you run the above?

Comment by Scott Lowe [ 13/May/12 8:41 AM ]

Ah, yes, looks like my initial attempt to reproduce was too simplistic. I used your second git repo, and can now confirm that it's failing for me with the same error.

Comment by Scott Lowe [ 13/May/12 10:35 PM ]

I looked into this a little further and the AOT generated code looks correct, in the sense that both code paths appear to be returning the same type.

However, I wonder if this is really a ClassLoader issue, whereby two definitions of the same class are being loaded at different times, because that would cause the x.y.Class cannot be cast to x.y.Class exception that we're seeing here.
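
The hypothesis above can be illustrated without Clojure. The following is a minimal sketch that assumes only the JDK: OneShotLoader and defineTwice are hypothetical names, and the class bytes are assembled by hand for an empty class so that the example needs no compiler. It defines the same class name in two loaders and shows that the JVM treats the results as unrelated types, which is the situation in which "x.y.Class cannot be cast to x.y.Class" is reported.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SameNameDifferentClass {

    /** Loader that exposes defineClass; every instance has its own class namespace. */
    static class OneShotLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Assembles the class file of an empty public class (default package, no members). */
    static byte[] emptyClassBytes(String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (Java 8)
            out.writeShort(5);                                   // constant pool count (4 entries + 1)
            out.writeByte(7); out.writeShort(2);                 // #1 Class, name at #2
            out.writeByte(1); out.writeUTF(name);                // #2 Utf8, this class name
            out.writeByte(7); out.writeShort(4);                 // #3 Class, name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8, superclass name
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class = #1
            out.writeShort(3);                                   // super_class = #3
            out.writeShort(0);                                   // no interfaces
            out.writeShort(0);                                   // no fields
            out.writeShort(0);                                   // no methods
            out.writeShort(0);                                   // no attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines the same class name from the same bytes in two independent loaders. */
    public static Class<?>[] defineTwice(String name) {
        byte[] bytes = emptyClassBytes(name);
        return new Class<?>[] {
            new OneShotLoader().define(name, bytes),
            new OneShotLoader().define(name, bytes)
        };
    }

    public static void main(String[] args) {
        Class<?>[] pair = defineTwice("Dontwork");
        System.out.println("same name:       " + pair[0].getName().equals(pair[1].getName()));
        System.out.println("same class:      " + (pair[0] == pair[1]));
        System.out.println("cast would work: " + pair[1].isAssignableFrom(pair[0]));
    }
}
```

The two Class objects share a name but differ in identity, and neither is assignable to the other, so an instance of one cannot be cast to the other.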

Comment by Steve Miner [ 03/Sep/13 9:54 AM ]

This could be related to CLJ-1157 which deals with a ClassLoader issue with AOT compiled code.

Comment by Ambrose Bonnaire-Sergeant [ 29/Mar/14 1:11 PM ]

I've tried the patch attached to CLJ-1157 and it did not solve this issue.

Comment by Ambrose Bonnaire-Sergeant [ 29/Mar/14 2:27 PM ]

This bug seems to be rooted in different behaviour for do/let under compilation. Attached a patch showing these symptoms in the hope it helps people find the cause.

Comment by Peter Taoussanis [ 22/Sep/14 3:12 AM ]

Just a quick note to confirm that this still seems to be around as of Clojure 1.7.0-alpha2. Don't have any useful input on possible solutions, sorry.

Comment by Alex Miller [ 04/Dec/14 1:12 PM ]

Duplicates - CLJ-1495, CLJ-1132

Comment by Nicola Mometto [ 04/Dec/14 1:50 PM ]

The attached patch fixes the classloader issues by routing RT.classForName & variants through the DynamicClassLoader class cache before invoking Class.forName

Comment by Nicola Mometto [ 04/Dec/14 1:59 PM ]

Re-adding triaged status added by Alex Miller that got accidentally nuked by a race-condition between my edits to the ticket description and Alex's ones

Comment by Nicola Mometto [ 04/Dec/14 2:30 PM ]

0001-CLJ-979-make-clojure-resolve-to-the-correct-Class-in-v2.patch is the same as 0001-CLJ-979-make-clojure-resolve-to-the-correct-Class-in.patch except it unconditionally looks for classes in the class cache of DynamicClassLoader, even if baseLoader() is not a DynamicClassLoader.
This fixes the bug of CLJ-1457 but might just be a workaround

Comment by Michael Blume [ 11/Dec/14 3:29 PM ]

Current patch blows up my Clojure build

https://gist.github.com/MichaelBlume/aa26fc715cbbdf711290

Comment by Nicola Mometto [ 11/Dec/14 3:45 PM ]

Michael: the current patch builds clojure fine for me, I'll try to reproduce. Which jvm version are you using?

Comment by Michael Blume [ 11/Dec/14 4:26 PM ]

[14:24][michael.blume@tcc-michael-4:~/workspace/clojure((0fc43db...))]$ java -version
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
[14:24][michael.blume@tcc-michael-4:~/workspace/clojure((0fc43db...))]$ mvn -version
Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 2014-08-11T13:58:10-07:00)
Maven home: /usr/local/Cellar/maven/3.2.3/libexec
Java version: 1.8.0_25, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.10.1", arch: "x86_64", family: "mac"

build was after I applied the patch to the current master branch of the clojure github repo

Comment by Andy Fingerhut [ 11/Dec/14 5:34 PM ]

I am seeing a similar compilation error as Michael Blume, with both JDK 1.7 and 1.8 on Mac OS X 10.9.5.

By accident I found that if I take latest Clojure master and do 'mvn package', then apply the patch CLJ-979.patch dated Dec 11 2014, then do 'mvn package' again without 'mvn clean', it compiles with no errors. If I do 'mvn clean' then 'mvn package' in a patched tree, I get the error every time I've tried.

Comment by Nicola Mometto [ 12/Dec/14 5:50 AM ]

The updated patch fixes the LinkageError Andy and Michael were getting.

Andy, Michael, can you confirm?

Comment by Nicola Mometto [ 12/Dec/14 9:38 AM ]

Added more testcases to new patch

Comment by Nicola Mometto [ 12/Dec/14 10:09 AM ]

Cleaned up the patch from whitespace changes

Comment by Andy Fingerhut [ 12/Dec/14 12:32 PM ]

I tried latest Clojure master plus patch CLJ-979-v4.patch, dated 12 Dec 2014, with Mac OS X 10.9.5 + JDK7, and Ubuntu Linux 14.04 with JDKs 6 through 9, and 'mvn clean' followed by 'mvn package' built and passed tests successfully with all of them.

I did notice that some files were created in the test directory that were not cleaned up by the end of the test, which you can use 'git status .' to see. Not sure if that is considered a bad thing for Clojure tests.

Comment by Nicola Mometto [ 12/Dec/14 1:07 PM ]

Thanks Andy, I've updated the patch and it now should remove all temporary classes created by the test.
It's probably not the best way to do it but I couldn't figure out how to do it another way.

Comment by Michael Blume [ 12/Dec/14 2:34 PM ]

Yep, looks good to me =)

Comment by Alex Miller [ 15/Dec/14 4:01 PM ]

Thanks first to Nicola for all his work so far on this!

Some feedback:
1) While the ticket itself isn't bad, I would really like to focus the title and description on a crisp statement of the real problem now that we understand it more. I'd like help on making sure we capture that correctly - how is this for a title: "Uses of Class.forName() miss classes in DynamicClassLoader cache?" ?

Similarly, the description should focus on the problem and show some examples. The defrecord one is good. The first example works for me before the patch and fails after?

2) The crux of this whole thing is the change in loading order in DCL.loadClass() - changing this is a big deal. We really need broader testing with things likely to be affected - off the top of my head: Immutant, Pomegranate, Leiningen, or anything else that monkeys with classloader stuff. Maybe something with Eclipse/OSGi if there is something testable out there.

3) DynamicClassLoader comments:
a) loadClass(String name) - I believe this is identical to the super impl, so can be removed.
b) findClass(String name) - now that we are hijacking loadClass(), I'm not sure it's even necessary to implement this or to call super.findClass() - if you get to the super.findClass(), I would expect that to always throw CNFE. Potentially, this method could even be removed (but this might do bad things if there are subclasses of DCL out there in the wild).
c) loadClass(String name, ...) - instead of calling findClass() and using the CNFE for control flow, you could just directly call findInMemoryClass(), then use a returned null as the decision factor. Again, this is possibly bad if there are DCL subclasses, so I'm on the fence about it.

4) Is the change in gen-interface something that should be a separate ticket? Seems like it could be separable.

5) I don't like the test changes with respect to set up and cleanup. The build already supports compiling a subset of test ns'es (like clojure.test-clojure.genclass.examples). I'd prefer to use that existing mechanism if at all possible. Check the build.xml for the hard-coded ns list.

6) What are the performance implications? I'm not expecting they are significant but we just made a bunch of changes for compilation performance and I'd hate to undo them all. Could findInMemoryClass be smarter about avoiding checks that can't succeed (avoiding "java.*", for example)?

Comment by Nicola Mometto [ 15/Dec/14 5:43 PM ]

1) It's not really about Class.forName() specifically, it's about DynamicClassLoader not being class cache aware in the loadClass method. The JVM uses the classloader's loadClass method for resolving all kinds of class usages, including but not limited to Class.forName() (e.g. when loading some bytecode containing a "new" instruction, that class reference will be resolved via a call to loadClass).
I'll try to make the documentation a bit more clear. The first example is an exhibit of the bugged behaviour: the two calls should output the same hash.

2,4) So, there are 3 approaches to how DynamicClassLoader could go at it:

  • Prefer on-disk classes over in-memory classes. This is roughly the current approach (sometimes it will pick the in-memory class over the on-disk one, causing weird bugs like Foo. and ->Foo constructing different classes). It has the negative effect of breaking interaction between AOT compilation and JIT loading, which has created all sorts of trouble when redefining deftypes/defprotocols in repls while having stale classfiles on disk.
  • Always pick the most recently updated class. This has the advantage of always being correct, but has several disadvantages that make it impracticable in my opinion: we'd have to keep track of the timestamp at which a dynamic class is defined, and make the loadClass implementation compare the timestamps and select the most recent one whenever a class is both in memory and on disk. This would complicate the implementation a lot and we'd likely have to pay a substantial performance hit.
  • Prefer in-memory classes over on-disk classes, the approach proposed in the current patches. It has the advantage of being almost always correct, it makes repl interaction & jit/aot mixing work fine, and the implementation is mostly straightforward. The downside is that in cases like gen-class, where an AOT class can actually be the most updated version, the in-memory version will be used. In Clojure all the forms that do bytecode emission either only do AOT compilation or do AOT compilation on demand and always load the class in memory, except gen-interface, which doesn't load the class in memory if it's being AOT compiled. Changing its semantics to behave like the other jit/aot compiling forms (deftype/defrecord/reify) is the only way to make this approach work, so I don't think this should go in another ticket.

5) I don't like the previous testing strategy either but couldn't figure out a better way. Thanks for the pointer on the already in-place infrastructure, I'll check it out and update the patch

In the meantime I've uploaded a new patch addressing 3 and 6. Specifically:
3) I removed the unnecessary loadClass(String) arity, and I've made loadClass(String, boolean) use findInMemoryClass(String) directly rather than relying on findClass(String), since nowhere does the documentation guarantee that findClass will be used by loadClass. However, I've left the findClass(String) implementation in place in case there's code out there that relies on it.
6) I haven't done any serious testing, but I haven't noticed any significant difference in compile times for any of my tools.* contrib libraries with the current patch. Filtering "java.*" class names before the inMemory check didn't seem to produce any difference, so it's not included in the updated patch. However, I'll probably include an alternative patch with that filtering to do more performance testing and see if it can actually help a bit.

All this said, I'm afraid that I won't have time to personally do an in-depth benchmarking & cross project testing of this patch. I've been spending almost all the free time I had in the past weeks working through a bunch of tickets (mostly this one) but now because of school and other commitments I can't promise I will be able to do anything more than maintaining the current patch & answering any questions about the bug. Any help in moving this ticket further would be appreciated, in particular to address points 2 and 6.

Comment by Alex Miller [ 16/Dec/14 8:33 AM ]

Thanks Nicola. I'll certainly take over shepherding the bug and appeal to the greater community for help in broad testing when I think we're ready for that.

Comment by Nicola Mometto [ 16/Dec/14 12:50 PM ]

Updated patch with better tests, addressing Alex Miller's comments.





[CLJ-1618] Widen set to take Iterable/IReduceInit Created: 17/Dec/14  Updated: 17/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File clj-1618.patch    
Patch: Code
Approval: Vetted

 Description   

Similar to CLJ-1546 (same thing on vec), set should work on IReduceInit or Iterable. Currently eduction will work via Iterable but through SeqIterator. set on an IReduceInit will throw an error.

user=> (set (eduction (map inc) (range 100)))  ;; works, but slower path
user=> (set (reify clojure.lang.IReduceInit  
       (reduce [_ f start]
         (reduce f start (range 10)))))
IllegalArgumentException Don't know how to create ISeq from: user$eval1198$reify__1199  clojure.lang.RT.seqFrom (RT.java:506)

Approach: Check for and use the IReduceInit path if available, otherwise fall back to seq. Additionally, the patch adds a modification to return a set without its meta (same approach as CLJ-1546) if a set is passed, which is fast constant time with no change in effective behavior.

Performance: (using Criterium quick-bench)

Timings done with either (count (set coll)) or (count (into #{} coll)):

coll                                        | 1.6.0 into | 1.6.0 set | 1.7.0-alpha4 set | 1.7.0-alpha4+patch set
(set (range 100))                           | 15.4 µs    | 17.0 µs   | 11.4 µs          | 0.0 µs
(vec (range 1000000))                       | 360.7 ms   | 702.5 ms  | 391.1 ms         | 358.6 ms
(doall (range 1000000))                     | 363.6 ms   | 736.9 ms  | 387.5 ms         | 371.0 ms
(doall (range 5))                           | 404.9 ns   | 612.3 ns  | 481.9 ns         | 445.9 ns
(eduction (map identity) (vec (range 100))) | n/a        | n/a       | 11.3 µs          | 8.7 µs

See also: CLJ-1546, CLJ-1384

Patch: clj-1618.patch

Screened by:






[CLJ-1552] Consider kv support for transducers (similar to reducers fold) Created: 07/Oct/14  Updated: 16/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.8

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: transducers

Approval: Vetted

 Description   

In reducers, fold over a map has special support for kv. Consider whether/how to add this for transducers.



 Comments   
Comment by Marshall T. Vandegrift [ 16/Dec/14 11:13 AM ]

We don't have a JIRA "unvote" feature, but I'd like to register my vote against this proposed enhancement. As a heavy user of clojure.core.reducers, I consider the switch to k-v semantics when reducing a map to be a significant mis-feature. As only an initial transformation function applied directly to a map is able to receive the k-v semantics (a limitation I can’t see transducers avoiding), this behavior crops up most frequently when re-ordering operations and discovering that an intermediate map has now caused an arity error somewhere in the middle of a chain of threaded transformations. I’ve never found cause to invoke it intentionally.





[CLJ-1546] Widen vec to take Iterable/IReduce Created: 02/Oct/14  Updated: 16/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: PNG File benchmark.png     Text File clj-1546-2.patch     Text File clj-1546-3.patch     Text File clj-1546-4.patch     Text File clj-1546-5.patch     Text File clj-1546.patch    
Patch: Code and Test
Approval: Vetted

 Description   

These examples should work but do not:

Something Iterable but not IReduce:

user> (def i (eduction (map inc) (range 100)))
#'user/i
user> (instance? java.util.Collection i)
false
user> (instance? Iterable i)
true
user> (vec i)
RuntimeException Unable to convert: class clojure.core.Iteration to Object[]

Something IReduceInit but not Iterable:

user=> (vec
  (reify clojure.lang.IReduceInit
    (reduce [_ f start]
      (reduce f start (range 10)))))
RuntimeException Unable to convert: class user$reify__15 to Object[]

Proposal: Add PersistentVector.create(Iterable) and PersistentVector.create(IReduceInit) to efficiently create PVs from those.

For performance, vec has several cases:
1) (vec) if vector?: return new vector w/o meta - this matches prior behavior but has a constant cost of a few ns, rather than linear cost. If not a vector, spill to LazilyPersistentVector.create(Object).

2) (LPV) instanceof IReduceInit: Anything reducible can reduce itself fastest. Right now this has a big benefit for PersistentList. On 1.7.0-alpha4 with a list of size 1024, into=28 seconds, vec=18 seconds. After patch, vec=7 seconds. If maps, sets, and range were IReduce later they would also use this path and see noticeable boosts. This is also the branch that will handle the Eduction and IReduceInit cases added in the patch.

3) (LPV) instanceof ISeq: If the coll is a sequence already, best to walk it rather than build an iterator or array from it. This calls into PersistentVector.create(ISeq). That implementation now contains an optimization to build into an array and construct the PersistentVector directly from the array for sequences <= 32 elements (which is most common). Once that threshold is reached, it switches to building with transients. The benchmark shows that the patch makes vec substantially faster for all seqs and even faster than into in some cases.

4) (LPV) instanceof Iterable: For all non-Clojure collections (ArrayList) and current non-IReduce Clojure collections (PHM, PHS), this is the fastest path. Iterators are preferred to seqs as they do not cache or hold onto the values as they go by. The PV.create() for Iterable uses transients. Due to slightly more overhead, small maps and sets are slightly slower but this would be fixed by CLJ-1499 and/or making PHM/PHS IReduceInit.

5) (LPV) otherwise RT.toArray(): catches Map, String, Object[], primitive array, etc. The important ones here are the arrays - they are slightly slower on small arrays due to overhead of checking more cases above, but big arrays are significantly faster than they were.
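
The branch order in cases 2, 4 and 5 can be sketched in plain Java. This is a rough illustration only and not the code in LazilyPersistentVector: ReduceInit is a hypothetical stand-in for clojure.lang.IReduceInit, a java.util.List stands in for the resulting vector, and the array branch handles only Object[] rather than everything RT.toArray() accepts. The vector check (case 1) and the ISeq branch (case 3) are left out because they have no JDK analogue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

public class VecDispatchSketch {

    /** Hypothetical stand-in for clojure.lang.IReduceInit. */
    public interface ReduceInit {
        Object reduce(BiFunction<Object, Object, Object> f, Object init);
    }

    /** Names the branch that would handle coll, testing in the order of the cases above. */
    public static String branchFor(Object coll) {
        if (coll instanceof ReduceInit) return "reduce";    // case 2: the collection drives the loop
        if (coll instanceof Iterable)   return "iterator";  // case 4: walk an iterator
        if (coll instanceof Object[])   return "array";     // case 5: copy from an array
        return "unsupported";
    }

    /** Builds a list from coll using the branch selected by branchFor. */
    public static List<Object> build(Object coll) {
        List<Object> out = new ArrayList<>();
        switch (branchFor(coll)) {
            case "reduce":
                // The reducing function appends each element and hands the accumulator back.
                ((ReduceInit) coll).reduce((acc, x) -> { out.add(x); return acc; }, out);
                break;
            case "iterator":
                for (Object x : (Iterable<?>) coll) out.add(x);
                break;
            case "array":
                out.addAll(Arrays.asList((Object[]) coll));
                break;
            default:
                throw new IllegalArgumentException("Unable to convert: " + coll);
        }
        return out;
    }

    public static void main(String[] args) {
        // A reducible source that is not Iterable, similar to the reify example in the description.
        ReduceInit firstThree = (f, init) -> {
            Object acc = init;
            for (int i = 0; i < 3; i++) acc = f.apply(acc, i);
            return acc;
        };
        System.out.println(branchFor(firstThree) + " " + build(firstThree));
        System.out.println(branchFor(Arrays.asList("a", "b")) + " " + build(Arrays.asList("a", "b")));
        System.out.println(branchFor(new Object[] {1, 2}) + " " + build(new Object[] {1, 2}));
    }
}
```

Testing the reducible case before Iterable matters for a source that implements both: it is then reduced directly instead of being walked through an iterator.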

In addition, there was one hard-coded path in the Compiler into PersistentVector.create() and I re-routed that through LazilyPersistentVector instead as that code is now the place to choose the fastest path logic.

Patch: clj-1546-5.patch

Screened by:



 Comments   
Comment by Timothy Baldridge [ 02/Oct/14 9:44 AM ]

Is there a reason the final case for (vec something) can't just be a call to (into [] coll)? It seems a bit odd to do (to-array) on anything that's not a java collection or Iterable, when we have IReduce.

Comment by Rich Hickey [ 02/Oct/14 10:02 AM ]

re: Tim - yes, this needs to support IReduce (and thereby educe) as well

Comment by Alex Miller [ 14/Oct/14 9:56 AM ]

Added new patch that handles Iterable and IReduceInit in vec. It also makes calling with a vector much faster due to the first check. into is still faster for chunked seqs (due to special InternalReduce handling of chunking).

It would be possible to move more of the variant checking into LazilyPersistentVector or PersistentVector so it could be used in more contexts. I'm not sure how much to do with that.

It would also be possible to instead lean on reduce more from the Java side if there was a Java version of reduce (as defined in mikera's branch for http://dev.clojure.org/jira/browse/CLJ-1192 at https://github.com/mikera/clojure/compare/clj-1192-vec-performance). Something like that is the only way I can see of leveraging that same InternalReduce logic that makes into faster than vec.

Comment by Alex Miller [ 13/Nov/14 4:14 PM ]

Prior comments from Stu removed from description: "Open Question: Which branch should come first, Collection or IReduceInit? Collection reaches the fast path for small collections through LazilyPersistentVector, but IReduceInit should be faster for larger things. Related: Shouldn't the item count in LazilyPersistentVector be a bounded count?"

I have attached a new patch that simplifies the impl by doing the work in LazilyPersistentVector instead of in vec. This was easier because "and" is not yet available at the point where vec is defined, so the length check could not be written there.

I have also done a considerable amount of analysis on the matrix of incoming collections and best path to follow and also collected some data on what collections are commonly passed into vec. The current patch reflects those findings. Some highlights:

  • vec is called with PersistentVector in all projects I tested. The instanceof check takes that case from typically 100s of nanos to ~5 ns. So I do think it is worth doing.
  • vec is overwhelmingly called with small collections - in most cases the incoming collection is <10 elements. In cases where the collection is not a sequence, the path of creating the Vector with an owning array is the fastest option, beating even IReduce and transient building (as that path has some checks involved).
  • PersistentList is the only IReduce likely to be encountered by vec right now and adding that branch is a significant performance boost from prior impl and vs into. If maps and sets were IReduce, they would gain this as well.
  • chunked seqs will be significantly faster with into than vec as into goes through CollReduce and can leverage many optimizations on reducing through chunks that are not available to vec.
  • seqs in general though are now faster with vec than they were due to leveraging transients.
  • eduction results support IReduce and are also faster with vec than into.
  • range is currently slower with vec, but when range is IReduce, it will probably be faster with vec

In summary, some new conventional wisdom (after this patch) on (into []) vs vec:

  • vec is faster if passed a vector, an IReduce, or an array
  • into is faster when working with seqs, but even vec is better than it used to be and may even be faster for things like range in the future
Comment by Michael Blume [ 13/Nov/14 7:24 PM ]

Latest patch won't build for me when applied to master

compile-clojure:
     [java] Exception in thread "main" java.lang.ExceptionInInitializerError
     [java] 	at clojure.lang.Compile.<clinit>(Compile.java:29)
     [java] Caused by: java.lang.NoSuchMethodError: clojure.lang.LazilyPersistentVector.create(Ljava/util/Collection;)Lclojure/lang/IPersistentVector;, compiling:(clojure/core.clj:14:23)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7206)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:370)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:361)
     [java] 	at clojure.lang.RT.load(RT.java:440)
     [java] 	at clojure.lang.RT.load(RT.java:411)
     [java] 	at clojure.lang.RT.doInit(RT.java:448)
     [java] 	at clojure.lang.RT.<clinit>(RT.java:329)
     [java] 	... 1 more
     [java] Caused by: java.lang.NoSuchMethodError: clojure.lang.LazilyPersistentVector.create(Ljava/util/Collection;)Lclojure/lang/IPersistentVector;
     [java] 	at clojure.lang.LispReader$VectorReader.invoke(LispReader.java:1073)
     [java] 	at clojure.lang.LispReader.readDelimitedList(LispReader.java:1138)
     [java] 	at clojure.lang.LispReader$ListReader.invoke(LispReader.java:972)
     [java] 	at clojure.lang.LispReader.read(LispReader.java:183)
     [java] 	at clojure.lang.LispReader$WrappingReader.invoke(LispReader.java:535)
     [java] 	at clojure.lang.LispReader.readDelimitedList(LispReader.java:1138)
     [java] 	at clojure.lang.LispReader$MapReader.invoke(LispReader.java:1081)
     [java] 	at clojure.lang.LispReader.read(LispReader.java:183)
     [java] 	at clojure.lang.LispReader$MetaReader.invoke(LispReader.java:716)
     [java] 	at clojure.lang.LispReader.readDelimitedList(LispReader.java:1138)
     [java] 	at clojure.lang.LispReader$ListReader.invoke(LispReader.java:972)
     [java] 	at clojure.lang.LispReader.read(LispReader.java:183)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7190)
     [java] 	... 7 more
Comment by Alex Miller [ 13/Nov/14 7:28 PM ]

Did you clean first? I replaced that static method call there with a wider version but if you are cleaning fresh it should be fine.

Comment by Michael Blume [ 13/Nov/14 7:31 PM ]

Apologies, maven just wasn't doing a good job of tracking changes, running mvn clean fixes the build.

Comment by Alex Miller [ 25/Nov/14 9:58 AM ]

Added benchmark.png showing times (in ns), tested with criterium, for into and vec on different types and sizes on 1.7.0-alpha4 and then vec again after the patch.





[CLJ-1589] Cleanup internal-reduce implementation Created: 14/Nov/14  Updated: 15/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-cleanup-internal-reduce-impl.patch    
Patch: Code
Approval: Screened

 Description   

Currently internal-reduce provides an implementation for ArraySeq and the ArraySeq_* prim classes.
Since those classes implement IReduce, the current patch makes instances of those classes fall back on coll-reduce's IReduce impl (that simply invokes .reduce).

This change is desirable because it removes unnecessary duplicated code, reducing the implementation surface and making it easier to follow reduce's code path. In addition to ArraySeq there will be (based on other tickets) more seq impls that are also IReduce, so it would be good to re-route back through coll-reduce when we get combinations of potentially reducible sub-seqs.

Patch: 0001-cleanup-internal-reduce-impl.patch

  • This patch depends on the patch for CLJ-1590 since the current IReduce impl for those ArraySeq classes doesn't properly handle Reduced

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 14/Nov/14 10:28 PM ]

I'm not sure whether this should be in 1.7 or not, but I'm adding it there so we can have a discussion on it regardless.





[CLJ-1606] Transducing an eduction finishes twice Created: 27/Nov/14  Updated: 15/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Herwig Hochleitner Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: transducers
Environment:

1.7.0-alpha4


Attachments: Text File CLJ-1606-2.patch     Text File CLJ-1606-2.patch     Text File CLJ-1606-3.patch     Text File CLJ-1606-4.patch     Text File CLJ-1606.patch    
Patch: Code and Test
Approval: Screened

 Description   
> (transduce (map identity)
             (fn
               ([s] (println "Finishing") s)
               ([s i] s))
             nil
             (eduction (map identity) []))
Finishing
Finishing
nil

Cause: transduce passes (xf f) into .reduce of Eduction, which itself calls transduce, so the completion arity of (xf f) is called more than once.

Proposed: Eduction reduce should use (completing f) instead of f to isolate completion of inner xf from outer xf.
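
The cause and the proposed fix can be modelled in a few lines of plain Java. This is a simplified sketch, not Clojure's implementation: Rf, Reducible, completing, transduce and eduction are hypothetical stand-ins for the Clojure concepts of the same name, and a pass-through transform stands in for (map identity).

```java
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

// Simplified Java model of reducing functions; not Clojure's actual classes.
final class EductionCompletionSketch {

    /** A reducing function with a step arity and a completion arity. */
    interface Rf {
        Object step(Object acc, Object x);
        Object complete(Object acc);
    }

    /** Anything that can reduce itself with a reducing function. */
    interface Reducible {
        Object reduce(Rf f, Object init);
    }

    /** Models (completing f): keeps f's step, makes completion the identity. */
    static Rf completing(final Rf f) {
        return new Rf() {
            public Object step(Object acc, Object x) { return f.step(acc, x); }
            public Object complete(Object acc) { return acc; }
        };
    }

    /** Models transduce: reduce with (xform f), then call completion once. */
    static Object transduce(UnaryOperator<Rf> xform, Rf f, Object init, Reducible coll) {
        Rf xf = xform.apply(f);
        return xf.complete(coll.reduce(xf, init));
    }

    static Reducible fromList(final List<?> items) {
        return (f, init) -> {
            Object acc = init;
            for (Object x : items) acc = f.step(acc, x);
            return acc;
        };
    }

    /** Models Eduction.reduce; 'isolate' switches the proposed fix on or off. */
    static Reducible eduction(UnaryOperator<Rf> xform, Reducible source, boolean isolate) {
        return (f, init) -> transduce(xform, isolate ? completing(f) : f, init, source);
    }

    /** Counts how often the outer function's completion arity runs. */
    static int countCompletions(boolean isolate) {
        final int[] completions = {0};
        Rf counting = new Rf() {
            public Object step(Object acc, Object x) { return acc; }
            public Object complete(Object acc) { completions[0]++; return acc; }
        };
        UnaryOperator<Rf> passThrough = rf -> rf;  // stands in for (map identity)
        Reducible ed = eduction(passThrough, fromList(Collections.emptyList()), isolate);
        transduce(passThrough, counting, null, ed);
        return completions[0];
    }

    public static void main(String[] args) {
        System.out.println(countCompletions(false)); // prints 2: completion runs twice
        System.out.println(countCompletions(true));  // prints 1: inner completion isolated
    }
}
```

In this model the inner transduce still completes its own transform, but wrapping f keeps that completion from reaching the caller's function.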

Patch: CLJ-1606-4.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 27/Nov/14 11:01 PM ]

identity is not a valid xf - changed to (map identity)

Comment by Ghadi Shayban [ 27/Nov/14 11:34 PM ]

identity is a valid though nonsensical transducer. fix & test added.

Comment by Ghadi Shayban [ 28/Nov/14 12:06 AM ]

Simple reproduction similar to into:

(transduce (map dec)
           (completing conj! persistent!)
           (transient [])
           (eduction (map inc) (range 6)))

;; ClassCastException clojure.lang.PersistentVector cannot be cast to clojure.lang.ITransientCollection

into doesn't use completing, and conj! has an arity that hides the problem.

Comment by Alex Miller [ 28/Nov/14 8:54 AM ]

I removed trailing whitespace in the patch so it applies cleanly.

Comment by Ghadi Shayban [ 14/Dec/14 11:16 PM ]

This patch is a little more subtle than I thought. Completion of the eduction's rfn needs to be handled separately from the "outer" transduce's xform. Patch coming.

Comment by Ghadi Shayban [ 14/Dec/14 11:32 PM ]

New patch with tests that completes the inner xform without completing the passed in rfn

Comment by Ghadi Shayban [ 15/Dec/14 1:19 AM ]

both -3 and -2 are equivalent. -3 is probably better stylistically.

Comment by Alex Miller [ 15/Dec/14 8:37 AM ]

Added CLJ-1606-4.patch - identical to -3, just fixed whitespace error.





[CLJ-1616] Frequencies incompatible with eduction Created: 14/Dec/14  Updated: 14/Dec/14  Resolved: 14/Dec/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Ghadi Shayban Assignee: Unassigned
Resolution: Not Reproducible Votes: 0
Labels: None


 Description   

Reproduction:
This needs the CLJ-1606 patch to apply, so that eduction works.

(frequencies (eduction (take 5) (range 50)))
;; ArityException Wrong number of args (1) passed to: core/frequencies/fn--6730

Cause: The reduce function that 'frequencies' calls is lacking the completing arity.

Simplest fix is to add the completing arity. Could be useful to allow frequencies to take a transducer stack.

mapv/filterv are similarly affected but seem less useful than using into with transducers.



 Comments   
Comment by Alex Miller [ 14/Dec/14 8:14 PM ]

Doesn't this work with CLJ-1572 + CLJ-1606 patches?

Comment by Ghadi Shayban [ 14/Dec/14 9:11 PM ]

No, not when there is something like 'take' in the picture. Transducers imply a reducing function with two different arities [1]. When 'frequencies' reduces over the collection (the eduction), a transducer inside the eduction might terminate early and cause the arity-1 rfn to be called, which will eventually bottom out here and throw the missing arity. [2]

CLJ-1572 helps dispatch properly
CLJ-1606 helps eduction actually work

[1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L6520-L6521
[2] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L6859

Comment by Alex Miller [ 14/Dec/14 9:49 PM ]

The example given works for me when I have CLJ-1572 + CLJ-1606 - what am I missing?

Comment by Ghadi Shayban [ 14/Dec/14 10:42 PM ]

Sigh you're not missing anything. I have an active repl that I can reproduce this on...

But with a bare build with CLJ-1572 and CLJ-1606 applied it does not happen. Give me a little bit to track this down. Intuitively it seems correct that something trying to complete frequencies's rfn:

(fn [counts x]
             (assoc! counts x (inc (get counts x 0))))

would fail.

Comment by Ghadi Shayban [ 14/Dec/14 10:49 PM ]

I'll reopen if I can figure out what happened





[CLJ-1615] transient set "keys" and "values" wind up with different metadata Created: 12/Dec/14  Updated: 13/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: collections, meta, transient

Attachments: Text File 0001-CLJ-1615-ensure-transient-set-keys-and-values-have-c.patch     Text File 0001-demonstrate-CLJ-1615.patch     Text File CLJ-1615-entryAt.patch    
Patch: Code and Test

 Description   
(let [s (-> #{} 
          transient 
          (conj! (clojure.core/with-meta [-7] {:mynum 0}))
          (conj! (clojure.core/with-meta [-7] {:mynum -1})) 
          persistent!)]
  [(meta (s [-7])) (meta (first s))])
=> [{:mynum -1} {:mynum 0}]

basically it looks like the "key" (the value we get by seqing on the set) retains the metadata from the first conj! but the "value" (what we get by calling invoke with the "key") carries the metadata from the second conj!. This does not match the behavior if we don't use transients:

(let [s (-> #{} 
          (conj (clojure.core/with-meta [-7] {:mynum 0}))
          (conj (clojure.core/with-meta [-7] {:mynum -1})))]
  [(meta (s [-7])) (meta (first s))])
=> [{:mynum 0} {:mynum 0}]

(found playing with zach tellman's collection-check)



 Comments   
Comment by Michael Blume [ 12/Dec/14 5:07 PM ]

Attached patch demonstrating problem (not a fix)

Comment by Michael Blume [ 12/Dec/14 5:40 PM ]

More investigation:

The difference between "keys" and "vals" arises from the fact that clojure sets use maps under the covers.

The difference between persistent and transient seems to be because PersistentHashSet.cons short-circuits on contains (https://github.com/clojure/clojure/blob/clojure-1.6.0/src/jvm/clojure/lang/PersistentHashSet.java#L97) and ATransientSet.conj does not (https://github.com/clojure/clojure/blob/clojure-1.6.0/src/jvm/clojure/lang/ATransientSet.java#L27)

Adding a contains check to ATransientSet.conj makes the behavior consistent and passes the attached test, but I imagine this could cause a performance hit. Thoughts?
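
The key/value split can be reproduced outside Clojure. The following is an analogy in plain Java, under the assumption that java.util.HashMap is a fair stand-in for the backing map: on put with an equal key it keeps the original key object and replaces the value. Tagged, conjAlways and conjIfAbsent are hypothetical names, and the tag field plays the role of metadata.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java analogy for a set stored as a key -> key map; names are hypothetical.
final class SetOverMapSketch {

    /** A value whose equality ignores its tag, as Clojure equality ignores metadata. */
    static final class Tagged {
        final int value;
        final String tag;
        Tagged(int value, String tag) { this.value = value; this.tag = tag; }
        @Override public boolean equals(Object o) {
            return o instanceof Tagged && ((Tagged) o).value == value;
        }
        @Override public int hashCode() { return Integer.hashCode(value); }
    }

    /** Unconditional put, analogous to a conj without a contains check. */
    static void conjAlways(Map<Tagged, Tagged> impl, Tagged x) {
        impl.put(x, x);  // existing key object is kept, value is replaced
    }

    /** Short-circuits when an equal element is already present. */
    static void conjIfAbsent(Map<Tagged, Tagged> impl, Tagged x) {
        if (!impl.containsKey(x)) impl.put(x, x);
    }

    /** Returns {tag of stored key, tag of stored value} after adding two equal elements. */
    static String[] tagsAfterTwoAdds(boolean checkContains) {
        Map<Tagged, Tagged> impl = new HashMap<>();
        Tagged[] elements = { new Tagged(-7, "first"), new Tagged(-7, "second") };
        for (Tagged t : elements) {
            if (checkContains) conjIfAbsent(impl, t); else conjAlways(impl, t);
        }
        Map.Entry<Tagged, Tagged> e = impl.entrySet().iterator().next();
        return new String[] { e.getKey().tag, e.getValue().tag };
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", tagsAfterTwoAdds(false))); // prints first,second
        System.out.println(String.join(",", tagsAfterTwoAdds(true)));  // prints first,first
    }
}
```

The contains check costs one extra lookup per add, which is the performance concern raised above.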

Comment by Michael Blume [ 12/Dec/14 5:43 PM ]

Attached proposed fix – note that this may cause a performance hit for transient sets.

Comment by Michael Blume [ 13/Dec/14 2:40 PM ]

Attaching an alternative fix – instead of doing a contains check on every transient conj, back set.get with entryAt. More invasive but possibly faster.





[CLJ-1515] Reify the result of range and add IReduceInit Created: 29/Aug/14  Updated: 12/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Timothy Baldridge Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None

Attachments: Text File clj-1515-2.patch     Text File clj-1515-3.patch     Text File clj-1515-4.patch     Text File clj-1515-5.patch     Text File clj-1515-6.patch     Text File clj-1515-7.patch     Text File clj-1515-8.patch     Text File clj-1515-9.patch     Text File clj-1515.patch     File patch.diff     File range-patch3.diff     File reified-range4.diff    
Patch: Code and Test
Approval: Vetted

 Description   

Currently range returns a lazy chunked seq. If the return value of range were reified into a type we could optimize common cases and add IReduce support.

Approach: this patch revives the unused (but previously existing) clojure.lang.Range class. This class acts as a lazy seq and implements several other appropriate interfaces such as Counted and Indexed. This type is implemented in Java since range is needed fairly early on in core.clj before deftype is defined. The attached patch provides two Range impls sharing some common code in AbstractRange. Range uses Numbers.* methods for all math due to the input types to range being unknown. LongRange handles the specific (but very common) case of a long start/end/step for higher performance. The special case of (range) is just handled with (iterate inc' 0) (which is further optimized for reduce in CLJ-1603).

Note: The patch also includes a tiny tweak in filter that has nothing to do with this patch other than being found while testing. It is a perf boost for all filter operations by avoiding calling .nth twice for every element in every chunk. Notice the filter seq example below gets an extra improvement in perf. If desired, this change could be split out.

Performance:
timings done via criterium quick-bench

expr                                                           | 1.6.0   | 1.7.0-alpha4 | +patch
(count (filter odd? (take (* 1024 1024) (range))))             | 183 ms  | 173 ms       | 170 ms
(transduce (take (* 1024 1024)) + (range))                     | n/a     | 67 ms        | 81 ms (w/CLJ-1603: 41 ms)
(count (range (* 1024 1024)))                                  | 75 ms   | 69 ms        | 0 ms
(reduce + (map inc (range (* 1024 1024))))                     | 71 ms   | 68 ms        | 46 ms
(reduce + (map inc (map inc (range (* 1024 1024)))))           | 89 ms   | 91 ms        | 69 ms
(count (filter odd? (range (* 1024 1024))))                    | 69 ms   | 65 ms        | 43 ms
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) | n/a     | 67 ms        | 36 ms
(doall (range 0 31))                                           | 1.41 µs | 1.51 µs      | 3.02 µs
(into [] (map inc (range 31)))                                 | 1.76 µs | 1.77 µs      | 1.43 µs
(into [] (map inc) (range 31))                                 | n/a     | 1.60 µs      | 0.63 µs
(doall (range 1/2 1000 1/3))                                   | 1.58 ms | 1.53 ms      | 1.66 ms
(into [] (range 1/2 1000 1/3))                                 | 1.52 ms | 1.51 ms      | 1.38 ms
(doall (range 0.5 1000 0.33))                                  | 0.15 ms | 0.14 ms      | 0.35 ms
(into [] (range 0.5 1000 0.33))                                | 0.13 ms | 0.12 ms      | 0.08 ms

These results are a bit mixed but in general I think they make the most common and important things faster while some less important things are slightly slower. In general the "doall" examples are slower as this is kind of the worst case wrt overhead and values are retrieved via seq/next looping (the slowest option). Stacked sequence ops happen via the chunked seq impl (which is a little faster), and the transduce/into will use the reduce impl (which is much faster).

Patch: clj-1515-9.patch

Screened by:

Screener question: (range) and range on non-longs both support auto-promotion towards infinity in this patch, which seems to be implied by the doc string but was not actually implemented or tested correctly afaict.



 Comments   
Comment by Alex Miller [ 29/Aug/14 3:19 PM ]

1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?
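
Point 2 refers to the java.util.Iterator contract, under which next() must throw NoSuchElementException once the iteration is exhausted. A minimal sketch of that behaviour, assuming a long-only range with a positive step (LongRangeIteratorSketch is a hypothetical name, not the patch's RangeIterator):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the patch's RangeIterator; positive steps only.
final class LongRangeIteratorSketch implements Iterator<Long> {
    private long cursor;
    private final long end;
    private final long step;

    LongRangeIteratorSketch(long start, long end, long step) {
        if (step <= 0) throw new IllegalArgumentException("sketch handles positive steps only");
        this.cursor = start;
        this.end = end;
        this.step = step;
    }

    @Override public boolean hasNext() {
        return cursor < end;
    }

    @Override public Long next() {
        if (!hasNext()) {
            throw new NoSuchElementException();  // required when walking off the end
        }
        long current = cursor;
        cursor += step;  // overflow near Long.MAX_VALUE is not handled in this sketch
        return current;
    }

    public static void main(String[] args) {
        Iterator<Long> it = new LongRangeIteratorSketch(0, 5, 2);
        while (it.hasNext()) System.out.println(it.next());  // prints 0, 2, 4
    }
}
```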

Comment by Timothy Baldridge [ 29/Aug/14 3:40 PM ]

1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Comment by Alex Miller [ 29/Aug/14 3:49 PM ]

6) might be useful if you're walking through it with nth, which hits count every time, but doubt that's common
7) yep, reasonable

Comment by Andy Fingerhut [ 18/Sep/14 6:52 AM ]

I have already pointed out to Edipo in personal email the guidelines on what labels to use for Clojure JIRA tickets here: http://dev.clojure.org/display/community/Creating+Tickets

Comment by Timothy Baldridge [ 19/Sep/14 10:02 AM ]

New patch with IReduce directly on Range instead of relying on iterators

Comment by Alex Miller [ 01/Oct/14 2:00 PM ]

The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is.

Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.

Comment by Timothy Baldridge [ 03/Oct/14 10:00 AM ]

Looks like chunked seqs do make lazy seq code about 5x faster in these tests.

Comment by Ghadi Shayban [ 03/Oct/14 10:22 AM ]

I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.

Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

Comment by Timothy Baldridge [ 03/Oct/14 10:38 AM ]

I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

Comment by Alex Miller [ 03/Oct/14 10:46 AM ]

I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.

On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Comment by Ghadi Shayban [ 03/Oct/14 11:00 AM ]

Tim, I'm not suggesting unboxed math, but the singular fast-path of all-Longs that you and Alex describe. I mistakenly lower-cased Long/Float.

Comment by Timothy Baldridge [ 31/Oct/14 11:30 AM ]

Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.

Comment by Nicola Mometto [ 14/Nov/14 12:51 PM ]

As discussed with Tim in #clojure, the current patch should not change ArrayChunk's reduce impl, that's an error.

Comment by Alex Miller [ 09/Dec/14 2:40 AM ]

Still a work in progress...

Comment by Nicola Mometto [ 09/Dec/14 8:44 AM ]

Alex, while this is still a work in progress, I see that the change on ArrayChunk#reduce from previous WIP patches not only has not been reverted but has been extended. I don't think the current approach makes sense as ArrayChunk#reduce is not part of the IReduce/IReduceInit contract but of the IChunk contract and changing the behaviour to be IReduce-like in its handling of reduced introduces the burden of having to use preserve-reduced on the reducing function to no apparent benefit.

Given that the preserve-reduced is done on the clojure side, it seems to me like directly invoking .reduce rather than routing through internal-reduce should be broken but I haven't tested it.

Comment by Alex Miller [ 09/Dec/14 9:49 AM ]

That's the work in progress part - I haven't looked at it yet. I have not extended or done any work re ArrayChunk, just carried through what was on the prior patch. I'll be working on it again tomorrow.

Comment by Ghadi Shayban [ 10/Dec/14 11:14 PM ]

I am impressed and have learned a ton through this exercise.

quick review of clj-1515-2
1) withMeta gives the newly formed object the wrong meta.
2) LongRange/create() is the new 0-arity constructor for range, which sets the 'end' to Double/POSITIVE_INFINITY cast as a long. Current core uses Double/POSITIVE_INFINITY directly. Not sure how many programs rely upon iterating that far, or how they would break.
3) Relatedly, depending on the previous point: Because only all-long arguments receive chunking, the very common case of (range) with no args would be unchunked. Doesn't seem like too much of a stretch to add chunking to the other impl.
4) Though the commented invariants say that Range is never empty, the implementation uses a magic value of _count == 0 to mean not cached, which is surprising to me. hashcodes have the magic value of -1
5) s/instanceof Reduced/RT.isReduced
6) is the overflow behavior of "int count()" correct?
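
To make question 6 concrete: the element count of a long range can exceed what an int holds, and a plain cast wraps silently ((int) Long.MAX_VALUE is -1). The sketch below shows one possible way to compute the count exactly and fail loudly instead. It is an illustration only, not what the patch does; RangeCountSketch and its methods are hypothetical and handle positive steps only.

```java
import java.math.BigInteger;

// Hypothetical helper illustrating exact counting for a long range.
final class RangeCountSketch {

    /** Number of elements in start, start+step, ... strictly below end, for step > 0. */
    static BigInteger exactCount(long start, long end, long step) {
        if (step <= 0) throw new IllegalArgumentException("sketch handles positive steps only");
        if (start >= end) return BigInteger.ZERO;
        // end - start can overflow a long, so do the subtraction in BigInteger.
        BigInteger span = BigInteger.valueOf(end).subtract(BigInteger.valueOf(start));
        BigInteger s = BigInteger.valueOf(step);
        // ceil(span / step) for positive operands
        return span.add(s).subtract(BigInteger.ONE).divide(s);
    }

    /** An int count that throws ArithmeticException instead of wrapping. */
    static int countOrThrow(long start, long end, long step) {
        return exactCount(start, end, step).intValueExact();
    }

    public static void main(String[] args) {
        System.out.println(countOrThrow(0, 10, 3));       // prints 4 (0, 3, 6, 9)
        System.out.println((int) Long.MAX_VALUE);         // prints -1: a plain cast wraps
    }
}
```

Whether to throw, clamp, or wrap is a design decision for the patch; the sketch only shows that the exact value is cheap to obtain.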

Comment by Alex Miller [ 11/Dec/14 12:06 AM ]

1) agreed!
2) Good point. I am definitely changing behavior on this (max of 9223372036854775807). I will look at whether this can be handled without affecting perf. Really, handling an infinite end point is not compatible with several things in LongRange.
3) I actually did implement chunking for the general Range and found it was slower (the original Clojure chunking is faster). LongRange is making up for that difference with improved primitive numerics.
4) Since empty is invalid, 0 and -1 are equally invalid. But I agree -1 conveys the intent better.
5) agreed
6) probably not. ties into 2/3.

Thanks for this, will address.

Comment by Alex Miller [ 11/Dec/14 12:11 AM ]

Added -4 patch that addresses 1,4,5 but not the (range) stuff.

Comment by Alex Miller [ 11/Dec/14 12:51 PM ]

Latest -7 patch addresses all feedback and perf #s updated.





[CLJ-1614] Clojure does not start: ClassCastException Created: 12/Dec/14  Updated: 12/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Vladimir Tsichevski Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: compiler
Environment:

Eclipse RCP



 Description   

The static initializer of the clojure.lang.Compiler class throws a ClassCastException when reading compiler options from System properties (Compiler.java, line 260 in the git master release). When running Clojure from an Eclipse RCP application, the System properties may have non-string values.

Checking if the value is a String and ignoring non-strings fixes this problem.
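
A minimal sketch of such a check, written against java.util.Properties only. It is not the code in Compiler.java: CompilerOptionsSketch and readOptions are hypothetical names, and the "clojure.compiler." prefix is assumed here for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of a defensive option reader; not the code in Compiler.java.
final class CompilerOptionsSketch {

    // Assumed property prefix, used here for illustration only.
    static final String PREFIX = "clojure.compiler.";

    /** Collects prefixed options, skipping entries whose key or value is not a String. */
    static Map<String, String> readOptions(Properties props) {
        Map<String, String> options = new HashMap<>();
        for (Map.Entry<Object, Object> e : props.entrySet()) {
            Object k = e.getKey();
            Object v = e.getValue();
            if (!(k instanceof String) || !(v instanceof String)) {
                continue;  // Properties is a Hashtable<Object,Object>, so anything can be in it
            }
            String name = (String) k;
            if (name.startsWith(PREFIX)) {
                options.put(name.substring(PREFIX.length()), (String) v);
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PREFIX + "elide-meta", "[:doc]");
        props.put(PREFIX + "non-string", Integer.valueOf(1));  // would break an unchecked cast
        System.out.println(readOptions(props));
    }
}
```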






[CLJ-1613] :or defaults should refer to enclosing scope in map destructuring Created: 12/Dec/14  Updated: 12/Dec/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Michał Marczyk Assignee: Michał Marczyk
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-CLJ-1613-evaluate-or-defaults-in-enclosing-scope-in-.patch    

 Description   

Michael Blume noticed that :or defaults can depend on the values of other keys, see https://groups.google.com/d/msg/clojure/6kOhpPOpHWM/ITjWwQFS_VQJ

Michael's Gist https://gist.github.com/MichaelBlume/4891dafdd31f0dcbc727 displays a case where an associative form involving :keys and :or compiles or not depending on the order of symbols in :keys. By tweaking that case one can arrive at expressions which always compile, but produce different values depending on :keys:

(let [foo 1
       bar 2
       {:keys [bar foo]
        :or {foo 3 bar (inc foo)}} {}]
  {:foo foo :bar bar})
;= {:foo 3, :bar 4}

(let [foo 1
      bar 2
      {:keys [foo bar]
       :or {foo 3 bar (inc foo)}} {}]
  {:foo foo :bar bar})
;= {:foo 3, :bar 2}

I believe that the most natural solution is to demand that :or defaults be evaluated in an enclosing scope where none of the destructuring-introduced locals are present. This approach is taken by the 0001 patch.



 Comments   
Comment by Michael Blume [ 12/Dec/14 2:27 AM ]

I suspect that this is the right thing to do but I think it's important to note that this will break existing code https://github.com/ngrunwald/ring-middleware-format/blob/master/src/ring/middleware/format_params.clj#L214





Generated at Fri Dec 19 02:52:24 CST 2014 using JIRA 4.4#649-r158309.