<< Back to previous view

[CLJ-1330] Class name clash between top-level functions and defn'ed ones Created: 22/Jan/14  Updated: 30/Jun/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 7
Labels: aot, compiler

Attachments: Text File 0001-Fix-CLJ-1330-make-top-level-named-functions-classnam.patch     File demo1.clj    
Patch: Code
Approval: Vetted

 Description   

Named anonymous fn's are not guaranteed to have unique class names when AOT-compiled.

For example:

(defn g [])
(def xx (fn g []))

When AOT-compiled both functions will emit user$g.class, the latter overwriting the former.

Demonstration script: demo1.clj

Patch: 0001-Fix-CLJ-1330-make-top-level-named-functions-classnam.patch

Approach: Generate unique class names for named fn's the same way as for unnamed anonymous fn's.

See also: This patch also fixes the issue reported in CLJ-1227.



 Comments   
Comment by Ambrose Bonnaire-Sergeant [ 22/Jan/14 11:12 AM ]

This seems like the reason why jvm.tools.analyzer cannot analyze clojure.core. On analyzing a definline, there is an "attempted duplicate class definition" error.

This doesn't really matter, but I thought it may or may not be useful information to someone.

Comment by Nicola Mometto [ 22/Jan/14 11:35 AM ]

Attached a fix.

This also fixes AOT compiling of code like:

(def x (fn foo []))
(fn foo [])
Comment by Nicola Mometto [ 22/Jan/14 11:39 AM ]

Cleaned up patch

Comment by Alex Miller [ 22/Jan/14 12:43 PM ]

It looks like the patch changes indentation of some of the code - can you fix that?

Comment by Nicola Mometto [ 22/Jan/14 3:57 PM ]

Updated patch without whitespace changes

Comment by Alex Miller [ 22/Jan/14 4:15 PM ]

Thanks, that's helpful.

Comment by Alex Miller [ 24/Jan/14 10:03 AM ]

There is consensus that this is a problem, however this is an area of the code with broad impacts as it deals with how classes are named. To that end, there is some work that needs to be done in understanding the impacts before we can consider it.

Some questions we would like to answer:

1) According to Rich, naming of (fn x []) function classes used to work in the manner of this patch - with generated names. Some code archaeology needs to be done on why that was changed and whether the change to the current behavior was addressing problems that we are likely to run into.

2) Are there issues with recursive functions? Are there impacts either in AOT or non-AOT use cases? Need some tests.

3) Are there issues with dynamic redefinition of functions? With the static naming scheme, redefinition causes a new class of the same name which can be picked up by reload of classes compiled to the old definition. With the dynamic naming scheme, redefinition will create a differently named class so old classes can never pick up a redefinition. Is this a problem? What are the impacts with and without AOT? Need some tests.

Comment by Nicola Mometto [ 24/Jan/14 11:39 AM ]

Looks like the current behaviour has been such since https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28

My guess is that Rich simply forgot to consider the (def y (fn x [] ..)) case.

Regarding 2 and 3, the dynamic naming scheme is no different than what happens for anonymous functions so I don't see how this could cause any issue.

Recursion on the fn arg is simply a call to .invoke on "this", it's classname unaware.

I can add some tests to test that

(def y (fn x [] 1))
and
(fn x [] 2)
compile to different classnames but other than that I don't see what should be tested.

Comment by Stuart Halloway [ 27/Jun/14 2:17 PM ]

incomplete pending the answers to Alex Miller's questions in the comments

Comment by Nicola Mometto [ 27/Jun/14 3:20 PM ]

I believe I already answered his questions, I'll try to be a bit more explicit:
I tracked the relevant commit from Rich which added the dynamic naming behaviour https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28#diff-f17f860d14163523f1e1308ece478ddbL3081 which clearly shows that this bug was present since then so.

Regarding redefinitions or recursive functions, both of those operations never take in account the generated fn name so they are unaffected.





[CLJ-1250] Reducer (and folder) instances hold onto the head of seqs Created: 03/Sep/13  Updated: 27/Jun/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Christophe Grand Assignee: Unassigned
Resolution: Unresolved Votes: 9
Labels: memory, reducers

Attachments: Text File after-change.txt     Text File before-change.txt     Text File CLJ-1250-20131211.patch     Text File CLJ-1250-AllInvokeSites-20140204.patch     Text File CLJ-1250-AllInvokeSites-20140320.patch    
Patch: Code
Approval: Vetted

 Description   

Problem Statement
A shared function #'clojure.core.reducers/reducer holds on to the head of a reducible collection, causing it to blow up when the collection is a lazy sequence.

Cause: #'reducer closes over a collection when in order to reify CollReduce, and the closed-over is never cleared. When code attempts to reduce over this anonymous transformed collection, it will realize the tail while the head is stored in the closed-over.

Reproduction steps:
Compare the following calls:

(time (reduce + 0 (map identity (range 1e8))))
(time (reduce + 0 (r/map identity (range 1e8))))

The second call should fail on a normal or small heap.

(If reducers are faster than seqs, increase the range.)

Patch
CLJ-1250-AllInvokeSites-20140320.patch takes Approach #2

Approaches:

1) Reimplement the #'reducer (and #'folder) transformation fns similar to the manner that Christophe proposes here:

(defrecord Reducer [coll xf])

(extend-protocol 
  clojure.core.protocols/CollReduce
  Reducer
      (coll-reduce [r f1]
                   (clojure.core.protocols/coll-reduce r f1 (f1)))
      (coll-reduce [r f1 init]
                   (clojure.core.protocols/coll-reduce (:coll r) ((:xf r) f1) init)))

(def rreducer ->Reducer) 

(defn rmap [f coll]
  (rreducer coll (fn [g] 
                   (fn [acc x]
                     (g acc (f x))))))

Advantages: Relatively non-invasive change.
Disadvantages: Not evident code. Additional protocol dispatch, though only incurred once

2) Clear the reference to 'this' on the stack just before a tail call occurs

When a callee is in return position, clear the local variable reference to 'this' in the caller's stack frame, which will make the caller and all its closed-overs eligible for reclamation.

Patch 1211 takes this approach for InvokeExpr call sites in return position. Patch 1214 takes the same approach for InvokeExprs and also static and instance interop calls.

Here is the code that performs the clearing excerpted from the 1214 patch:

void emitClearThis(GeneratorAdapter gen) {
		gen.visitInsn(Opcodes.ACONST_NULL);
		gen.visitVarInsn(Opcodes.ASTORE, 0);
	}

Tail calls wrapped inside a try/catch/finally clause cannot have 'this' cleared, because closed-overs/locals may need to be emitted for exception handling blocks. Both patches consider and handle this edge case.

Advantages: Fixes this case with no user code changes. Enables GC to do reclaim closed-overs references earlier.
Disadvantages: A compiler change.

3) Alternate approach

from Christophe Grand:
Another way would be to enhance the local clearing mechanism to also clear "this" but it's complex since it may be needed to keep a reference to "this" for caches long after obvious references to "this" are needed.

Advantages: Fine-grained
Disadvantages: Complex, invasive, and the compiler is hard to hack on.

Mitigations
Avoid reducing on lazy-seqs and instead operate on vectors / maps, or custom reifiers of CollReduce or CollFold. This could be easier with some implementations of common collection functions being available (like iterate and partition).

See https://groups.google.com/d/msg/clojure-dev/t6NhGnYNH1A/2lXghJS5HywJ for previous discussion.



 Comments   
Comment by Gary Fredericks [ 03/Sep/13 8:53 AM ]

Fixed indentation in description.

Comment by Ghadi Shayban [ 11/Dec/13 11:08 PM ]

Adding a patch that clears "this" before tail calls. Verified that Christophe's repro case is fixed.

Will upload a diff of the bytecode soon.

Any reason this juicy bug was taken off 1.6?

Comment by Ghadi Shayban [ 11/Dec/13 11:17 PM ]

Here's the bytecode for the clojure.core.reducers/reducer reify before and after the change... Of course a straight diff isn't useful because all the line numbers changed. Kudos to Gary Trakhman for the no.disassemble lein plugin.

Comment by Christophe Grand [ 12/Dec/13 6:58 AM ]

Ghadi, I'm a bit surprised by this part of the patch: was the local clearing always a no-op here?

-		if(context == C.RETURN)
+		if(shouldClear)
 			{
-			ObjMethod method = (ObjMethod) METHOD.deref();
-			method.emitClearLocals(gen);
+                            gen.visitInsn(Opcodes.ACONST_NULL);
+                            gen.visitVarInsn(Opcodes.ASTORE, 0);
 			}

The problem with this approach (clear this on tail call) is that it adds yet another special case. To me the complexity stem from having to keep this around even if the user code doesn't refer to it.

Comment by Ghadi Shayban [ 12/Dec/13 7:19 AM ]

Thank you - I failed to mention this in the commit message: it appears that emitClearLocals() belonging to both ObjMethod and FnMethod (its child) are empty no-ops. I believe the actual local clearing is on line 4855.

I agree re: another special case in the compiler.

Comment by Alex Miller [ 12/Dec/13 8:56 AM ]

Ghadi re 1.6 - this ticket was never in the 1.6 list, it has not yet been vetted by Rich but is ready to do so when we open up again after 1.6.

Comment by Ghadi Shayban [ 12/Dec/13 8:59 AM ]

Sorry I confused the critical list with the Rel1.6 list.

Comment by Ghadi Shayban [ 14/Dec/13 11:16 AM ]

New patch 20131214 that handles all tail invoke sites (InvokeExpr + StaticMethodExpr + InstanceMethodExpr). 'StaticInvokeExpr' seems like an old remnant that had no active code path, so that was left as-is.

The approach taken is still the same as the original small patch that addressed only InvokeExpr, except that it is now using a couple small helpers. The commit message has more details.

Also a 'try' block with no catch or finally clause now becomes a BodyExpr. Arguably a user error, historically accepted, and still accepted, but now they are a regular BodyExpr, instead of being wrapped by a the no-op try/catch mechanism. This second commit can be optionally discarded.

With this patch on my machine (4/8 core/thread Ivy Bridge) running on bare clojure.main:
Christophe's test cases both run i 3060ms on a artificially constrained 100M max heap, indicating a dominant GC overhead. (But they now both work!)

When max heap is at a comfortable 2G the reducers version outpaces the lazyseq at 2100ms vs 2600ms!

Comment by Ghadi Shayban [ 13/Jan/14 10:48 AM ]

Updating stale patch after latest changes to master. Latest is CLJ-1250-AllInvokeSites-20140113

Comment by Ghadi Shayban [ 04/Feb/14 3:50 PM ]

Updating patch after murmur changes

Comment by Tassilo Horn [ 13/Feb/14 4:52 AM ]

Ghadi, I suffer from the problem of this issue. Therefore, I've applied your patch CLJ-1250-AllInvokeSites-20140204.patch to the current git master. However, then I get lots of "java.lang.NoSuchFieldError: array" errors when the clojure tests are run:

     [java] clojure.test-clojure.clojure-set
     [java] 
     [java] java.lang.NoSuchFieldError: array
     [java] 	at clojure.core.protocols$fn__6026.invoke(protocols.clj:123)
     [java] 	at clojure.core.protocols$fn__5994$G__5989__6003.invoke(protocols.clj:19)
     [java] 	at clojure.core.protocols$fn__6023.invoke(protocols.clj:147)
     [java] 	at clojure.core.protocols$fn__5994$G__5989__6003.invoke(protocols.clj:19)
     [java] 	at clojure.core.protocols$seq_reduce.invoke(protocols.clj:31)
     [java] 	at clojure.core.protocols$fn__6017.invoke(protocols.clj:48)
     [java] 	at clojure.core.protocols$fn__5968$G__5963__5981.invoke(protocols.clj:13)
     [java] 	at clojure.core$reduce.invoke(core.clj:6213)
     [java] 	at clojure.set$difference.doInvoke(set.clj:61)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:442)
     [java] 	at clojure.test_clojure.clojure_set$fn__1050$fn__1083.invoke(clojure_set.clj:109)
     [java] 	at clojure.test_clojure.clojure_set$fn__1050.invoke(clojure_set.clj:109)
     [java] 	at clojure.test$test_var$fn__7123.invoke(test.clj:704)
     [java] 	at clojure.test$test_var.invoke(test.clj:704)
     [java] 	at clojure.test$test_vars$fn__7145$fn__7150.invoke(test.clj:721)
     [java] 	at clojure.test$default_fixture.invoke(test.clj:674)
     [java] 	at clojure.test$test_vars$fn__7145.invoke(test.clj:721)
     [java] 	at clojure.test$default_fixture.invoke(test.clj:674)
     [java] 	at clojure.test$test_vars.invoke(test.clj:718)
     [java] 	at clojure.test$test_all_vars.invoke(test.clj:727)
     [java] 	at clojure.test$test_ns.invoke(test.clj:746)
     [java] 	at clojure.core$map$fn__2665.invoke(core.clj:2515)
     [java] 	at clojure.lang.LazySeq.sval(LazySeq.java:40)
     [java] 	at clojure.lang.LazySeq.seq(LazySeq.java:49)
     [java] 	at clojure.lang.Cons.next(Cons.java:39)
     [java] 	at clojure.lang.RT.boundedLength(RT.java:1655)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:130)
     [java] 	at clojure.core$apply.invoke(core.clj:619)
     [java] 	at clojure.test$run_tests.doInvoke(test.clj:761)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:137)
     [java] 	at clojure.core$apply.invoke(core.clj:617)
     [java] 	at clojure.test.generative.runner$run_all_tests$fn__527.invoke(runner.clj:255)
     [java] 	at clojure.test.generative.runner$run_all_tests$run_with_counts__519$fn__523.invoke(runner.clj:251)
     [java] 	at clojure.test.generative.runner$run_all_tests$run_with_counts__519.invoke(runner.clj:251)
     [java] 	at clojure.test.generative.runner$run_all_tests.invoke(runner.clj:253)
     [java] 	at clojure.test.generative.runner$test_dirs.doInvoke(runner.clj:304)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:137)
     [java] 	at clojure.core$apply.invoke(core.clj:617)
     [java] 	at clojure.test.generative.runner$_main.doInvoke(runner.clj:312)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:408)
     [java] 	at user$eval564.invoke(run_tests.clj:3)
     [java] 	at clojure.lang.Compiler.eval(Compiler.java:6657)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7084)
     [java] 	at clojure.lang.Compiler.loadFile(Compiler.java:7040)
     [java] 	at clojure.main$load_script.invoke(main.clj:274)
     [java] 	at clojure.main$script_opt.invoke(main.clj:336)
     [java] 	at clojure.main$main.doInvoke(main.clj:420)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:408)
     [java] 	at clojure.lang.Var.invoke(Var.java:379)
     [java] 	at clojure.lang.AFn.applyToHelper(AFn.java:154)
     [java] 	at clojure.lang.Var.applyTo(Var.java:700)
     [java] 	at clojure.main.main(main.java:37)
Comment by Ghadi Shayban [ 13/Feb/14 8:23 AM ]

Can you give some details about your JVM/environment that can help reproduce? I'm not encountering this error.

Comment by Tassilo Horn [ 13/Feb/14 9:41 AM ]

Sure. It's a 64bit ThinkPad running GNU/Linux.

% java -version
java version "1.7.0_51"
OpenJDK Runtime Environment (IcedTea 2.4.5) (ArchLinux build 7.u51_2.4.5-1-x86_64)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)
Comment by Ghadi Shayban [ 13/Feb/14 10:19 AM ]

Strange, that is exactly my mail env, OpenJDK7 on Arch, 64-bit. I have also tested on JDK 6/7/8 on OSX mavericks. Are you certain that the git tree is clean besides the patch? (Arch users unite!)

Comment by Tassilo Horn [ 14/Feb/14 1:13 AM ]

Yes, the tree is clean. But now I see that I get the same error also after resetting to origin/master, so it's not caused by your patch at all. Oh, now the error vanished after doing a `mvn clean`! So problem solved.

Comment by Nicola Mometto [ 19/Feb/14 12:32 PM ]

Ghandi, FnExpr.parse should bind IN_TRY_BLOCK to false before analyzing the fn body, consider the case

(try (do something (fn a [] (heap-consuming-op a))) (catch Exception e ..))

Here in the a function the this local will never be cleared even though it's perfectly safe to.
Admittedly this is an edge case but we should cover this possibility too.

Comment by Ghadi Shayban [ 19/Feb/14 2:06 PM ]

You may have auto-corrected my name to Ghandi instead of Ghadi. I wish I were that wise =)

I will update the patch for FnExpr (that seems reasonable), but maybe after 1.6 winds down and the next batch of tickets get scrutiny. It would be nice to get input on a preferred approach from Rich or core after it gets vetted – or quite possibly not vetted.

Comment by Nicola Mometto [ 19/Feb/14 6:11 PM ]

hah, sorry for the typo on the name

Seems reasonable to me, in the meantime I just pushed to tools.analyzer/tools.emitter complete support for "this" clearing, I'll test this a bit in the next few days to make sure it doesn't cause unexpected problems.

Comment by Andy Fingerhut [ 24/Feb/14 12:13 PM ]

Patch CLJ-1250-AllInvokeSites-20140204.patch no longer applies cleanly to latest master as of Feb 23, 2014. It did on Feb 14, 2014. Most likely some of its context lines are changed by the commit to Clojure master on Feb 23, 2014 – I haven't checked in detail.

Comment by Ghadi Shayban [ 20/Mar/14 4:39 PM ]

Added a patch that 1) applies cleanly, 2) binds the IN_TRY_EXPR to false initially when analyzing FnExpr and 3) uses RT.booleanCast





[CLJ-1152] PermGen leak in multimethods and protocol fns when evaled Created: 30/Jan/13  Updated: 08/Aug/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Chouser Assignee: Unassigned
Resolution: Unresolved Votes: 6
Labels: memory, protocols

Attachments: File multifn_weak_method_cache.diff     File naive-lru-for-multimethods-and-protocols.diff     File naive-lru-method-cache-for-multimethods.diff    
Patch: Code
Approval: Vetted

 Description   

There is a PermGen memory leak that we have tracked down to protocol methods and multimethods called inside an eval, because of the caches these methods use. The problem only arises when the value being cached is an instance of a class (such as a function or reify) that was defined inside the eval. Thus extending IFn or dispatching a multimethod on an IFn are likely triggers.

My fellow LonoClouder, Jeff Dik describes how to reproduce and work around the problem:

The easiest way that I have found to test this is to set "-XX:MaxPermSize" to a reasonable value so you don't have to wait too long for the PermGen spaaaaace to fill up, and to use "-XX:+TraceClassLoading" and "-XX:+TraceClassUnloading" to see the classes being loaded and unloaded.

leiningen project.clj
(defproject permgen-scratch "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.0-RC1"]]
  :jvm-opts ["-XX:MaxPermSize=32M"
             "-XX:+TraceClassLoading"
             "-XX:+TraceClassUnloading"])

You can use lein swank 45678 and connect with slime in emacs via M-x slime-connect.

To monitor the PermGen usage, you can find the Java process to watch with "jps -lmvV" and then run "jstat -gcold <PROCESS_ID> 1s". According to the jstat docs, the first column (PC) is the "Current permanent spaaaaace capacity (KB)" and the second column (PU) is the "Permanent spaaaaace utilization (KB)". VisualVM is also a nice tool for monitoring this.

Multimethod leak

Evaluating the following code will run a loop that eval's (take* (fn foo [])).

multimethod leak
(defmulti take* (fn [a] (type a)))

(defmethod take* clojure.lang.Fn
  [a]
  '())

(def stop (atom false))
(def sleep-duration (atom 1000))

(defn run-loop []
  (when-not @stop
    (eval '(take* (fn foo [])))
    (Thread/sleep @sleep-duration)
    (recur)))

(future (run-loop))

(reset! sleep-duration 0)

In the lein swank session, you will see many lines like below listing the classes being created and loaded.

[Loaded user$eval15802$foo__15803 from __JVM_DefineClass__]
[Loaded user$eval15802 from __JVM_DefineClass__]

These lines will stop once the PermGen spaaaaace fills up.

In the jstat monitoring, you'll see the amount of used PermGen spaaaaace (PU) increase to the max and stay there.

-    PC       PU        OC          OU       YGC    FGC    FGCT     GCT
 31616.0  31552.7    365952.0         0.0      4     0    0.000    0.129
 32000.0  31914.0    365952.0         0.0      4     0    0.000    0.129
 32768.0  32635.5    365952.0         0.0      4     0    0.000    0.129
 32768.0  32767.6    365952.0      1872.0      5     1    0.000    0.177
 32768.0  32108.2    291008.0     23681.8      6     2    0.827    1.006
 32768.0  32470.4    291008.0     23681.8      6     2    0.827    1.006
 32768.0  32767.2    698880.0     24013.8      8     4    1.073    1.258
 32768.0  32767.2    698880.0     24013.8      8     4    1.073    1.258
 32768.0  32767.2    698880.0     24013.8      8     4    1.073    1.258

A workaround is to run prefer-method before the PermGen spaaaaace is all used up, e.g.

(prefer-method take* clojure.lang.Fn java.lang.Object)

Then, when the used PermGen spaaaaace is close to the max, in the lein swank session, you will see the classes created by the eval'ing being unloaded.

[Unloading class user$eval5950$foo__5951]
[Unloading class user$eval3814]
[Unloading class user$eval2902$foo__2903]
[Unloading class user$eval13414]

In the jstat monitoring, there will be a long pause when used PermGen spaaaaace stays close to the max, and then it will drop down, and start increasing again when more eval'ing occurs.

-    PC       PU        OC          OU       YGC    FGC    FGCT     GCT
 32768.0  32767.9    159680.0     24573.4      6     2    0.167    0.391
 32768.0  32767.9    159680.0     24573.4      6     2    0.167    0.391
 32768.0  17891.3    283776.0     17243.9      6     2   50.589   50.813
 32768.0  18254.2    283776.0     17243.9      6     2   50.589   50.813

The defmulti defines a cache that uses the dispatch values as keys. Each eval call in the loop defines a new foo class which is then added to the cache when take* is called, preventing the class from ever being GCed.

The prefer-method workaround works because it calls clojure.lang.MultiFn.preferMethod, which calls the private MultiFn.resetCache method, which completely empties the cache.

Protocol leak

The leak with protocol methods similarly involves a cache. You see essentially the same behavior as the multimethod leak if you run the following code using protocols.

protocol leak
(defprotocol ITake (take* [a]))

(extend-type clojure.lang.Fn
  ITake
  (take* [this] '()))

(def stop (atom false))
(def sleep-duration (atom 1000))

(defn run-loop []
  (when-not @stop
    (eval '(take* (fn foo [])))
    (Thread/sleep @sleep-duration)
    (recur)))

(future (run-loop))

(reset! sleep-duration 0)

Again, the cache is in the take* method itself, using each new foo class as a key.

A workaround is to run -reset-methods on the protocol before the PermGen spaaaaace is all used up, e.g.

(-reset-methods ITake)

This works because -reset-methods replaces the cache with an empty MethodImplCache.



 Comments   
Comment by Chouser [ 30/Jan/13 9:10 AM ]

I think the most obvious solution would be to constrain the size of the cache. Adding an item to the cache is already not the fastest path, so a bit more work could be done to prevent the cache from growing indefinitely large.

That does raise the question of what criteria to use. Keep the first n entries? Keep the n most recently used (which would require bookkeeping in the fast cache-hit path)? Keep the n most recently added?

Comment by Jamie Stephens [ 18/Oct/13 9:35 AM ]

At a minimum, perhaps a switch to disable the caches – with obvious performance impact caveats.

Seems like expensive LRU logic is probably the way to go, but maybe don't have it kick in fully until some threshold is crossed.

Comment by Alex Miller [ 18/Oct/13 4:28 PM ]

A report seeing this in production from mailing list:
https://groups.google.com/forum/#!topic/clojure/_n3HipchjCc

Comment by Adrian Medina [ 10/Dec/13 11:43 AM ]

So this is why we've been running into PermGen space exceptions! This is a fairly critical bug for us - I'm making extensive use of multimethods in our codebase and this exception will creep in at runtime randomly.

Comment by Kevin Downey [ 17/Apr/14 9:52 PM ]

it might be better to split this in to two issues, because at a very abstract level the two issues are the "same", but concretely they are distinct (protocols don't really share code paths with multimethods), keeping them together in one issue seems like a recipe for a large hard to read patch

Comment by Kevin Downey [ 26/Jul/14 5:49 PM ]

naive-lru-method-cache-for-multimethods.diff replaces the methodCache in multimethods with a very naive lru cache built on PersistentHashMap and PersistentQueue

Comment by Kevin Downey [ 28/Jul/14 7:09 PM ]

naive-lru-for-multimethods-and-protocols.diff creates a new class clojure.lang.LRUCache that provides an lru cache built using PHashMap and PQueue behind an IPMap interface.

changes MultiFn to use an LRUCache for its method cache.

changes expand-method-impl-cache to use an LRUCache for MethodImplCache's map case

Comment by Kevin Downey [ 30/Jul/14 3:10 PM ]

I suspect my patch naive-lru-for-multimethods-and-protocols.diff is just wrong, unless MethodImplCache really is being used as a cache we can't just toss out entries when it gets full.

looking at the deftype code again, it does look like MethidImplCache is being used as a cache, so maybe the patch is fine

if I am sure of anything it is that I am unsure so hopefully someone who is sure can chime in

Comment by Nicola Mometto [ 31/Jul/14 11:02 AM ]

I haven't looked at your patch, but I can confirm that the MethodImplCache in the protocol function is just being used as a cache

Comment by dennis zhuang [ 08/Aug/14 6:21 AM ]

I developed a new patch that convert the methodCache in MultiFn to use WeakReference for dispatch value,and clear the cache if necessary.

I've test it with the code in ticket,and it looks fine.The classes will be unloaded when perm gen is almost all used up.





[CLJ-701] Compiler loses 'loop's return type in some cases Created: 03/Jan/11  Updated: 26/Jul/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Backlog
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Chouser Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None
Environment:

Clojure commit 9052ca1854b7b6202dba21fe2a45183a4534c501, version 1.3.0-master-SNAPSHOT


Attachments: File hoistedmethod-pass-1.diff     File hoistedmethod-pass-2.diff     File hoistedmethod-pass-3.diff     File hoistedmethod-pass-4.diff     File hoistedmethod-pass-5.diff    
Patch: Code
Approval: Vetted

 Description   
(set! *warn-on-reflection* true)
(fn [] (loop [b 0] (recur (loop [a 1] a))))

Generates the following warnings:

recur arg for primitive local: b is not matching primitive, had: Object, needed: long
Auto-boxing loop arg: b

This is interesting for several reasons. For one, if the arg to recur is a let form, there is no warning:

(fn [] (loop [b 0] (recur (let [a 1] a))))

Also, the compiler appears to understand the return type of loop forms just fine:

(use '[clojure.contrib.repl-utils :only [expression-info]])
(expression-info '(loop [a 1] a))
;=> {:class long, :primitive? true}

The problem can of course be worked around using an explicit cast on the loop form:

(fn [] (loop [b 0] (recur (long (loop [a 1] a)))))

Reported by leafw in IRC: http://clojure-log.n01se.net/date/2011-01-03.html#10:31



 Comments   
Comment by a_strange_guy [ 03/Jan/11 4:36 PM ]

The problem is that a 'loop form gets converted into an anonymous fn that gets called immediately, when the loop is in a expression context (eg. its return value is needed, but not as the return value of a method/fn).

so

(fn [] (loop [b 0] (recur (loop [a 1] a))))

gets converted into

(fn [] (loop [b 0] (recur ((fn [] (loop [a 1] a))))))

see the code in the compiler:
http://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L5572

this conversion already bites you if you have mutable fields in a deftype and want to 'set! them in a loop

http://dev.clojure.org/jira/browse/CLJ-274

Comment by Christophe Grand [ 23/Nov/12 2:28 AM ]

loops in expression context are lifted into fns because else Hotspot doesn't optimize them.
This causes several problems:

  • type inference doesn't propagate outside of the loop[1]
  • the return value is never a primitive
  • mutable fields are inaccessible
  • surprise allocation of one closure objects each time the loop is entered.

Adressing all those problems isn't easy.
One can compute the type of the loop and emit a type hint but it works only with reference types. To make it works with primitive, primitie fns aren't enough since they return only long/double: you have to add explicit casts.
So solving the first two points can be done in a rather lccal way.
The two other points require more impacting changes, the goal would be to emit a method rather than a fn. So it means at the very least changing ObjExpr and adding a new subclassof ObjMethod.

[1] beware of CLJ-1111 when testing.

Comment by Alex Miller [ 21/Oct/13 10:28 PM ]

I don't think this is going to make it into 1.6, so removing the 1.6 tag.

Comment by Kevin Downey [ 21/Jul/14 7:14 PM ]

an immediate solution to this might be to hoist loops out in to distinct non-ifn types generated by the compiler with an invoke method that is typed to return the getJavaClass() type of the expression, that would give us the simplifying benefits of hoisting the code out and free use from the Object semantics of ifn

Comment by Kevin Downey [ 22/Jul/14 8:39 PM ]

I have attached a 3 part patch as hoistedmethod-pass-1.diff

3ed6fed8 adds a new ObjMethod type to represent expressions hoisted out in to their own methods on the enclosing class

9c39cac1 uses HoistedMethod to compile loops not in the return context

901e4505 hoists out try expressions and makes it possible for try to return a primitive expression (this might belong on http://dev.clojure.org/jira/browse/CLJ-1422)

Comment by Kevin Downey [ 22/Jul/14 8:54 PM ]

with hoistedmethod-pass-1.diff the example code generates bytecode like this

user=> (println (no.disassemble/disassemble (fn [] (loop [b 0] (recur (loop [a 1] a))))))
// Compiled from form-init1272682692522767658.clj (version 1.5 : 49.0, super bit)
public final class user$eval1675$fn__1676 extends clojure.lang.AFunction {
  
  // Field descriptor #7 Ljava/lang/Object;
  public static final java.lang.Object const__0;
  
  // Field descriptor #7 Ljava/lang/Object;
  public static final java.lang.Object const__1;
  
  // Method descriptor #10 ()V
  // Stack: 2, Locals: 0
  public static {};
     0  lconst_0
     1  invokestatic java.lang.Long.valueOf(long) : java.lang.Long [16]
     4  putstatic user$eval1675$fn__1676.const__0 : java.lang.Object [18]
     7  lconst_1
     8  invokestatic java.lang.Long.valueOf(long) : java.lang.Long [16]
    11  putstatic user$eval1675$fn__1676.const__1 : java.lang.Object [20]
    14  return
      Line numbers:
        [pc: 0, line: 1]

 // Method descriptor #10 ()V
  // Stack: 1, Locals: 1
  public user$eval1675$fn__1676();
    0  aload_0 [this]
    1  invokespecial clojure.lang.AFunction() [23]
    4  return
      Line numbers:
        [pc: 0, line: 1]
  
  // Method descriptor #25 ()Ljava/lang/Object;
  // Stack: 3, Locals: 3
  public java.lang.Object invoke();
     0  lconst_0
     1  lstore_1 [b]
     2  aload_0 [this]
     3  lload_1 [b]
     4  invokevirtual user$eval1675$fn__1676.__hoisted1677(long) : long [29]
     7  lstore_1 [b]
     8  goto 2
    11  areturn
      Line numbers:
        [pc: 0, line: 1]
      Local variable table:
        [pc: 2, pc: 11] local: b index: 1 type: long
        [pc: 0, pc: 11] local: this index: 0 type: java.lang.Object

 // Method descriptor #27 (J)J
  // Stack: 2, Locals: 5
  public long __hoisted1677(long b);
    0  lconst_1
    1  lstore_3 [a]
    2  lload_3
    3  lreturn
      Line numbers:
        [pc: 0, line: 1]
      Local variable table:
        [pc: 2, pc: 3] local: a index: 3 type: long
        [pc: 0, pc: 3] local: this index: 0 type: java.lang.Object
        [pc: 0, pc: 3] local: b index: 1 type: java.lang.Object

}
nil
user=> 
  

the body of the method __hoisted1677 is the inner loop

for reference the part of the bytecode from the same function compiled with 1.6.0 is pasted here https://gist.github.com/hiredman/f178a690718bde773ba0 the inner loop body is missing because it is implemented as its own IFn class that is instantiated and immediately executed. it closes over a boxed version of the numbers and returns an boxed version

Comment by Kevin Downey [ 23/Jul/14 12:43 AM ]

hoistedmethod-pass-2.diff replaces 901e4505 with f0a405e3 which fixes the implementation of MaybePrimitiveExpr for TryExpr

with hoistedmethod-pass-2.diff the largest clojure project I have quick access to (53kloc) compiles and all the tests pass

Comment by Alex Miller [ 23/Jul/14 12:03 PM ]

Thanks for the work on this!

Comment by Kevin Downey [ 23/Jul/14 2:05 PM ]

I have been working through running the tests for all the contribs projects with hoistedmethod-pass-2.diff, there are some bytecode verification errors compiling data.json and other errors elsewhere, so there is still work to do

Comment by Kevin Downey [ 25/Jul/14 7:08 PM ]

hoistedmethod-pass-3.diff

49782161 * add HoistedMethod to the compiler for hoisting expresssions out well typed methods
e60e6907 * hoist out loops if required
547ba069 * make TryExpr MaybePrimitive and hoist tries out as required

all contribs whose tests pass with master pass with this patch.

the change from hoistedmethod-pass-2.diff in this patch is the addition of some bookkeeping for arguments that take up more than one slot

Comment by Nicola Mometto [ 26/Jul/14 1:37 AM ]

Kevin there's still a bug regarding long/doubles handling:
On commit 49782161, line 101 of the diff, you're emitting gen.pop() if the expression is in STATEMENT position, you need to emit gen.pop2() instead if e.getReturnType is long.class or double.class

Test case:

user=> (fn [] (try 1 (finally)) 2)
VerifyError (class: user$eval1$fn__2, method: invoke signature: ()Ljava/lang/Object;) Attempt to split long or double on the stack  user/eval1 (NO_SOURCE_FILE:1)
Comment by Kevin Downey [ 26/Jul/14 1:46 AM ]

bah, all that work to figure out the thing I couldn't get right and of course I overlooked the thing I knew at the beginning. I want to get rid of some of the code duplication between emit and emitUnboxed for TryExpr, so when I get that done I'll fix the pop too

Comment by Kevin Downey [ 26/Jul/14 12:52 PM ]

hoistedmethod-pass-4.diff logically has the same three commits, but fixes the pop vs pop2 issue and rewrites emit and emitUnboxed for TryExpr to share most of their code

Comment by Kevin Downey [ 26/Jul/14 12:58 PM ]

hoistedmethod-pass-5.diff fixes a stupid mistake in the tests in hoistedmethod-pass-4.diff





[CLJ-700] contains? broken for transient collections Created: 01/Jan/11  Updated: 27/Jun/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Herwig Hochleitner Assignee: Unassigned
Resolution: Unresolved Votes: 8
Labels: None

Attachments: Java Source File 0001-Refactor-of-some-of-the-clojure-.java-code-to-fix-CL.patch     File clj-700-7.diff     File clj-700.diff     Text File clj-700-patch4.txt     Text File clj-700-patch6.txt    
Patch: Code and Test
Approval: Vetted

 Description   

Behavior with Clojure 1.6.0:

user=> (contains? (transient {:x "fine"}) :x)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentArrayMap$TransientArrayMap  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (contains? (transient (hash-map :x "fine")) :x)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentHashMap$TransientHashMap  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (contains? (transient [1 2 3]) 0)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentVector$TransientVector  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (contains? (transient #{:x}) :x)
IllegalArgumentException contains? not supported on type: clojure.lang.PersistentHashSet$TransientHashSet  clojure.lang.RT.contains (RT.java:724)
;; expected: true

user=> (:x (transient #{:x}))
nil
;; expected: :x

user=> (get (transient #{:x}) :x)
nil
;; expected: :x

Behavior with latest Clojure master as of Jun 27 2014 (same as Clojure 1.6.0) plus patch clj-700-7.diff. In all cases it matches the expected results shown in comments above:

user=> (contains? (transient {:x "fine"}) :x)
true
user=> (contains? (transient (hash-map :x "fine")) :x)
true
user=> (contains? (transient [1 2 3]) 0)
true
user=> (contains? (transient #{:x}) :x)
true
user=> (:x (transient #{:x}))
:x
user=> (get (transient #{:x}) :x) 
:x

Analysis by Alexander Redington: This is caused by expectations in clojure.lang.RT regarding the type of collections for some methods, e.g. contains() and getFrom(). Checking for contains looks to see if the instance passed in is Associative (a subinterface of PersistentCollection), or IPersistentSet.

This patch refactors several of the Clojure interfaces so that logic abstract from the issue of immutability is pulled out to a general interface (e.g. ISet, IAssociative), but preserves the contract specified (e.g. Associatives only return Associatives when calling assoc()).

With more general interfaces in place the contains() and getFrom() methods were then altered to conditionally use the general interfaces which are agnostic of persistence vs. transience. Includes tests in transients.clj to verify the changes fix this problem.

Questions on this approach from Stuart Halloway to Rich Hickey:

1. this represents working back from the defect to rethinking abstractions (good!). Does it go far enough?

2. what are good names for the interfaces introduced here?

Alex Miller: Should also keep an eye on CLJ-787 as it may have some collisions with this one.

Patch: clj-700-7.diff

One 'trailing whitespace' warning is perfectly normal when applying this patch to latest Clojure master as of Jun 27 2014, as shown below. This is simply because of carriage returns at the end of lines in file Associative.java. I know of no way to avoid such a warning without removing CRs from all Clojure source files (e.g. CLJ-1026):

% git am -s --keep-cr --ignore-whitespace < ~/clj/patches/clj-700-7.diff 
Applying: Refactor of some of the clojure .java code to fix CLJ-700.
/Users/admin/clj/latest-clj/clojure/.git/rebase-apply/patch:29: trailing whitespace.
public interface Associative extends IPersistentCollection, IAssociative{
warning: 1 line adds whitespace errors.
Applying: more CLJ-700: refresh to use hasheq


 Comments   
Comment by Herwig Hochleitner [ 01/Jan/11 8:01 PM ]

the same is also true for TransientVectors

{{(contains? (transient [1 2 3]) 0)}}

false

Comment by Herwig Hochleitner [ 01/Jan/11 8:25 PM ]

As expected, TransientSets have the same issue; plus an additional, probably related one.

(:x (transient #{:x}))

nil

(get (transient #{:x}) :x)

nil

Comment by Alexander Redington [ 07/Jan/11 2:07 PM ]

This is caused by expectations in clojure.lang.RT regarding the type of collections for some methods, e.g. contains() and getFrom(). Checking for contains looks to see if the instance passed in is Associative (a subinterface of PersistentCollection), or IPersistentSet.

This patch refactors several of the Clojure interfaces so that logic abstract from the issue of immutability is pulled out to a general interface (e.g. ISet, IAssociative), but preserves the contract specified (e.g. Associatives only return Associatives when calling assoc()).

With more general interfaces in place the contains() and getFrom() methods were then altered to conditionally use the general interfaces which are agnostic of persistence vs. transience. Includes tests in transients.clj to verify the changes fix this problem.

Comment by Stuart Halloway [ 28/Jan/11 10:35 AM ]

Rich: Patch doesn't currently apply, but I would like to get your take on approach here. In particular:

  1. this represents working back from the defect to rethinking abstractions (good!). Does it go far enough?
  2. what are good names for the interfaces introduced here?
Comment by Alexander Redington [ 25/Mar/11 7:44 AM ]

Rebased the patch off the latest pull of master as of 3/25/2011, it should apply cleanly now.

Comment by Stuart Sierra [ 17/Feb/12 2:59 PM ]

Latest patch does not apply as of f5bcf647

Comment by Andy Fingerhut [ 17/Feb/12 5:59 PM ]

clj-700-patch2.txt does patch cleanly to latest Clojure head as of a few mins ago. No changes to patch except in context around changed lines.

Comment by Andy Fingerhut [ 07/Mar/12 3:23 AM ]

Sigh. Git patches applied via 'git am' are fragile beasts indeed. Look at them the wrong way and they fail to apply.

clj-700-patch3.txt applies cleanly to latest master as of Mar 7, 2012, but not if you use this command:

git am -s < clj-700-patch3.txt

I am pretty sure this is because of DOS CR/LF line endings in the file src/jvm/clojure/lang/Associative.java. The patch does apply cleanly if you use this command:

git am --keep-cr -s < clj-700-patch3.txt

Comment by Andy Fingerhut [ 23/Mar/12 6:34 PM ]

This ticket was changed to Incomplete and waiting on Rich when Stuart Halloway asked for feedback on the approach on 28/Jan/2011. Stuart Sierra changed it to not waiting on Rich on 17/Feb/2012 when he noted the patch didn't apply cleanly. Latest patch clj-700-patch3.txt does apply cleanly, but doesn't change the approach used since the time Stuart Halloway's concern was raised. Should it be marked as waiting on Rich again? Something else?

Comment by Stuart Halloway [ 08/Jun/12 12:44 PM ]

Patch 4 incorporates patch 3, and brings it up to date on hashing (i.e. uses hasheq).

Comment by Andy Fingerhut [ 08/Jun/12 12:52 PM ]

Removed clj-700-patch3.txt in favor of Stuart Halloway's improved clj-700-patch4.txt dated June 8, 2012.

Comment by Andy Fingerhut [ 18/Jun/12 3:06 PM ]

clj-700-patch5.txt dated June 18, 2012 is the same as Stuart Halloway's clj-700-patch4.txt, except for context lines that have changed in Clojure master since Stuart's patch was created. clj-700-patch4.txt no longer applies cleanly.

Comment by Andy Fingerhut [ 19/Aug/12 4:47 AM ]

Adding clj-700-patch6.txt, which is identical to Stuart Halloway's clj-700-patch4.txt, except that it applies cleanly to latest master as of Aug 19, 2012. Note that as described above, you must use the --keep-cr option to 'git am' when applying this patch for it to succeed. Removing clj-700-patch5.txt, since it no longer applies cleanly.

Comment by Stuart Sierra [ 24/Aug/12 1:08 PM ]

Patch fails as of commit 1c8eb16a14ce5daefef1df68d2f6b1f143003140

Comment by Andy Fingerhut [ 24/Aug/12 1:53 PM ]

Which patch did you try, and what command did you use? I tried applying clj-700-patch6.txt to the same commit, using the following command, and it applied, albeit with the warning messages shown:

% git am --keep-cr -s < clj-700-patch6.txt
Applying: Refactor of some of the clojure .java code to fix CLJ-700.
/Users/jafinger/clj/latest-clj/clojure/.git/rebase-apply/patch:29: trailing whitespace.
public interface Associative extends IPersistentCollection, IAssociative{
warning: 1 line adds whitespace errors.
Applying: more CLJ-700: refresh to use hasheq

Note the --keep-cr option, which is necessary for this patch to succeed. It is recommended in the "Screening Tickets" section of the JIRA workflow wiki page here: http://dev.clojure.org/display/design/JIRA+workflow

Comment by Andy Fingerhut [ 28/Aug/12 5:48 PM ]

Presumptuously changing Approval from Incomplete back to None, since the latest patch does apply cleanly if the --keep-cr option is used. It was in Screened state recently, but I'm not so presumptuous as to change it to Screened

Comment by Alex Miller [ 19/Aug/13 12:26 PM ]

I think through a series of different hands on this ticket it got knocked way back in the list. Re-marking vetted as it's previously been all the way up through screening. Should also keep an eye on CLJ-787 as it may have some collisions with this one.

Comment by Andy Fingerhut [ 08/Nov/13 10:14 AM ]

clj-700-7.diff is identical to clj-700-patch6.txt, except it applies cleanly to latest master. Only some lines of context in a test file have changed.

When I say "applies cleanly", I mean that there is one warning when using the proper "git am" command from the dev wiki page. This is because one line replaced in Associative.java has a CR/LF at the end of the line, because all lines in that file do.

Comment by Herwig Hochleitner [ 17/Feb/14 9:54 AM ]

Since clojure 1.5, contains? throws an IllegalArgumentException on transients.
In 1.6.0-beta1, transients are no longer marked as alpha.

Does this mean, that we won't be able to distinguish between a nil value and no value on a transient?

Comment by Stuart Halloway [ 27/Jun/14 10:20 AM ]

Request for someone to (1) update patch to apply cleanly, and (2) summarize approach so I don't have to read through the comment history.

Comment by Andy Fingerhut [ 27/Jun/14 11:02 AM ]

The latest patch is clj-700-7.diff dated Nov 8, 2013. I believe it is impossible to create a patch that applies any more cleanly using git for source files that have carriage returns in them, which at least one modified source file does. Here is the command I used on latest Clojure master as of today (Jun 27 2014), which is the same as that of March 25 2014:

% git am -s --keep-cr --ignore-whitespace < ~/clj/patches/clj-700-7.diff 
Applying: Refactor of some of the clojure .java code to fix CLJ-700.
/Users/admin/clj/latest-clj/clojure/.git/rebase-apply/patch:29: trailing whitespace.
public interface Associative extends IPersistentCollection, IAssociative{
warning: 1 line adds whitespace errors.
Applying: more CLJ-700: refresh to use hasheq

If you want a patch that doesn't have the 'trailing whitespace' warning in it, I think someone would have to commit a change that removed the carriage returns from file Associative.java. If you want such a patch, let me know and we can remove all of them from every source file and be done with this annoyance.

Comment by Andy Fingerhut [ 27/Jun/14 11:19 AM ]

Updated description to contain a copy of only those comments that seemed 'interesting'. Most comments have simply been "attached an updated patch that applies cleanly", or "changed the state of this ticket for reason X".

Comment by Alex Miller [ 27/Jun/14 1:19 PM ]

Looks like Andy did as requested, moving back to Screenable.





[CLJ-1157] Classes generated by gen-class aren't loadable from remote codebase for mis-implementation of static-initializer Created: 04/Feb/13  Updated: 18/Apr/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5, Release 1.6
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Tsutomu Yano Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: gen-class
Environment:

Tested on Mac OS X 10.9.1 and Oracle JVM 1.7.0_51 with Clojure 1.6 master SNAPSHOT


Attachments: File 20140121_fix_classloader.diff    
Patch: Code
Approval: Vetted

 Description   

When a genclass'ed object is serialized and sent to a remote system, the remote system throws an exception loading the object:

Exception in thread "main" java.lang.ExceptionInInitializerError
        at java.io.ObjectStreamClass.hasStaticInitializer(Native Method)
        at java.io.ObjectStreamClass.computeDefaultSUID(ObjectStreamClass.java:1723)
        at java.io.ObjectStreamClass.access$100(ObjectStreamClass.java:69)
        at java.io.ObjectStreamClass$1.run(ObjectStreamClass.java:247)
        at java.io.ObjectStreamClass$1.run(ObjectStreamClass.java:245)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.io.ObjectStreamClass.getSerialVersionUID(ObjectStreamClass.java:244)
        at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:600)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1601)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at sun.rmi.server.UnicastRef.unmarshalValue(UnicastRef.java:324)
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:173)
        at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:194)
        at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:148)
        at $Proxy0.getResult(Unknown Source)
        at client.SampleClient$_main.doInvoke(SampleClient.clj:12)
        at clojure.lang.RestFn.invoke(RestFn.java:397)
        at clojure.lang.AFn.applyToHelper(AFn.java:159)
        at clojure.lang.RestFn.applyTo(RestFn.java:132)
        at client.SampleClient.main(Unknown Source)
 Caused by: java.io.FileNotFoundException: Could not locate remoteserver/SampleInterfaceImpl__init.class or remoteserver/SampleInterfaceImpl.clj on classpath: 
        at clojure.lang.RT.load(RT.java:434)
        at clojure.lang.RT.load(RT.java:402)
        at clojure.core$load$fn__5039.invoke(core.clj:5520)
        at clojure.core$load.doInvoke(core.clj:5519)
        at clojure.lang.RestFn.invoke(RestFn.java:408)
        at clojure.lang.Var.invoke(Var.java:415)
        at remoteserver.SampleInterfaceImpl.<clinit>(Unknown Source)
        ... 23 more

Reproduce:

// build
git clone git://github.com/tyano/clojure_genclass_fix.git
cd clojure_genclass_fix
sh build.sh
// start rmiregistry
rmiregistry -J-Djava.rmi.server.useCodebaseOnly=false &
// start server
cd remoteserver
sh start.sh
// Start client
cd ../client
sh start.sh

Cause:

A gen-classed class (in this case, SampleInterfaceImpl.class) uses a static-initializer for loading SampleInterfaceImpl__init.class like:

static
  {
    RT.var("clojure.core", "load").invoke("/remoteserver/SampleInterfaceImpl");
  }

RT.load in default uses a context-classloader for loading __init.class but all classes depending on a gen-classed class must be loaded from the same classloader. In this case, RT.load must use a remote URLClassLoader which loads the main class.

Proposed:

Instead produce the equivalent of this in the static initializer:

static
  {
    Var.pushThreadBindings(RT.map(new Object[] { Compiler.LOADER, SampleInterfaceImpl.class.getClassLoader() }));
    try {
        RT.var("clojure.core", "load").invoke("/remoteserver/SampleInterfaceImpl");
    }
    finally 
    {
        Var.popThreadBindings();
    }
  }

With this code, RT.load will uses a same classloader which load SampleInterfaceImpl.class.

Patch: 20140121_fix_classloader.diff

Screened by:



 Comments   
Comment by Stuart Halloway [ 01/Mar/13 10:20 AM ]

This sounds reasonable, but anything touching classloaders must be considered very carefully.

Comment by Stuart Halloway [ 01/Mar/13 12:12 PM ]

It seems overly complex to have the patch do so much code generation. Why not implement a method that does this job, and have the generated code call that?

Comment by Andy Fingerhut [ 11/Jan/14 2:47 PM ]

Patch 20130204_fix_classloader.diff dated Feb 3, 2013 no longer applies cleanly as of the latest commits to Clojure master on Jan 11, 2014. The only conflict in applying the patch appears to be in the file src/jvm/clojure/asm/commons/GeneratorAdapter.java. This is probably due to the commit for ticket CLJ-713 that was committed today, updating the ASM library.

Comment by Tsutomu Yano [ 21/Jan/14 3:01 AM ]

I put a new patch applicable on the latest master branch.
This new patch is simpler and robust because the code-generation becomes very simple. Now It just call a method implemented with Java.

And I fixed my sample program and the 'HOW TO REPRODUCT THIS ISSUE' section of this ticket, because old description is not runnable on newest JVM. It is because the specification of remote method call of the newest JVM was changed from the old one. In the newest JVM, we need a 'java.rmi.server.useCodebaseOnly=false' option for making the behavior of remote call same as old one.
pull the newest sample.





[CLJ-823] Piping seque into seque can deadlock Created: 03/Aug/11  Updated: 21/Aug/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Greg Chapman Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None
Environment:

Windows 7; JVM 1.6; Clojure 1.3 beta 1


Attachments: Text File clj-823-v1.patch    
Patch: Code and Test
Approval: Vetted

 Description   

I'm not sure if this is a supported scenario, but the following deadlocks in Clojure 1.3:

(let [xs (seque (range 150000))
ys (seque (filter odd? xs))]
(apply + ys))

As I understand it, the problem is that ys' fill takes place on an agent thread, so when it calls xs' drain, the (send-off agt fill) does not immediately trigger xs' fill, but is instead put on the nested list to be performed when ys' agent returns. Unfortunately, ys' fill will eventually block trying to take from xs, and so it never returns and the pending send-offs are never sent. Wrapping the send-off in drain to:

(future (send-off agt fill))

is a simple (probably not optimal) way to fix the deadlock.



 Comments   
Comment by Peter Monks [ 07/Jan/13 3:43 PM ]

Reproduced on 1.4.0 and 1.5.0-RC1 as well, albeit with this example:

(seque 3 (seque 3 (range 10)))

Comment by Stuart Halloway [ 30/Mar/13 9:16 AM ]

release-pending-sends?

Comment by Andy Fingerhut [ 21/Aug/14 6:13 PM ]

clj-823-v1.patch uses (release-pending-sends) in seque's drain function in an attempt to avoid the deadlock when a seque is fed into another seque, as suggested by Stuart Halloway. It adds Peter Monks's small quick test case demonstrating the deadlock, which fails (i.e. hangs until killed) without the change and passes with it.





[CLJ-1424] Feature Expressions Created: 15/May/14  Updated: 06/Aug/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Ghadi Shayban Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: reader

Attachments: File CLJ-1424-2.diff     File clojure-feature-expressions.diff    
Patch: Code and Test
Approval: Vetted

 Description   

Feature expressions based directly on Common Lisp. See Clojure design docs, which includes discussion and links to Common Lisp documentation for feature expressions here: http://dev.clojure.org/display/design/Feature+Expressions

#+ #- and or not
are supported. Unreadable tagged literals are suppressed through the *suppress-read* dynamic var. For example, with *features* being #{:clj}, which is the default, the following should read :foo

#+cljs #js {:one :two} :foo

The initial *features* set can be augmented (clj will always be included) with the clojure.features System property:

-Dclojure.features=production,embedded

Patch: CLJ-1424-2.diff

Questions: Should *suppress-read* override *read-eval*?

Related: CLJS-27, TRDR-14



 Comments   
Comment by Jozef Wagner [ 16/May/14 2:19 AM ]

Has there been a decision that CL syntax is going to be used? Related discussion can be found at design page, google groups discussion and another discussion.

Comment by Alex Miller [ 16/May/14 8:34 AM ]

No, no decisions on anything yet.

Comment by Ghadi Shayban [ 19/May/14 7:25 PM ]

Just to echo a comment from TRDR-14:

This is WIP and just one approach for feature expressions. There seem to be at least two couple diverging approaches emerging from the various discussion (Brandon Bloom's idea of read-time splicing being the other.)

In any case having all Clojure platforms be ready for the change is probably essential. Also backwards compatibility of feature expr code to Clojure 1.6 and below is also not trivial.

Comment by Kevin Downey [ 04/Aug/14 1:39 PM ]

if you have ever tried to do tooling for a language where the "parser" tossed out information or did some partial evaluation, it is a pain. this is basically what the #+cljs style feature expressions and bbloom's read time splicing both do with clojure's reader. I think resolving this at read time instead of having the compiler do it before macro expansion is a huge mistake and makes the reader much less useful for reading code.

Comment by Ghadi Shayban [ 04/Aug/14 2:00 PM ]

Kevin, what kind of tooling use case are you alluding to?

Comment by Kevin Downey [ 04/Aug/14 3:24 PM ]

any use case that involves reading code and not immediately handing it off to the compiler. if I wanted to write a little snippet to read in a function, add an unused argument to every arity then pprint it back, reader resolved feature expressions would not round trip.

if I want to write snippet of code to generate all the methods for a deftype (not a macro, just at the repl write a `for` expression) I can generate a clojure data structure, call pprint on it, then paste it in as code, reader feature expressions don't have a representation as data so I cannot do that, I would have to generate strings directly.





[CLJ-1224] Records do not cache hash like normal maps Created: 24/Jun/13  Updated: 27/Jun/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Brandon Bloom Assignee: Unassigned
Resolution: Unresolved Votes: 7
Labels: defrecord, performance

Attachments: Text File 0001-cache-hasheq-and-hashCode-for-records.patch     Text File 0001-CLJ-1224-cache-hasheq-and-hashCode-for-records.patch    
Patch: Code
Approval: Vetted

 Description   

Records do not cache their hash codes like normal Clojure maps, which affects their performance. This problem has been fixed in CLJS, but still affects JVM CLJ.

Approach: Cache hash values in record definitions, similar to maps.

Patch: 0001-CLJ-1224-cache-hasheq-and-hashCode-for-records.patch

Also see: http://dev.clojure.org/jira/browse/CLJS-281



 Comments   
Comment by Nicola Mometto [ 14/Feb/14 5:46 PM ]

I want to point out that my patch breaks ABI compatibility.
A possible approach to avoid this would be to have 3 constructors instead of 2, I can write the patch to support this if desired.

Comment by Stuart Halloway [ 27/Jun/14 11:09 AM ]

The patch 0001-CLJ-1224-cache-hasheq-and-hashCode-for-records.patch is broken in at least two ways:

  • The fields __hash and __hasheq are adopted by new records created by .assoc and .without, which will cause those records to have incorrect (and likely colliding) hash values
  • The addition of the new fields breaks the promise of defrecord, which includes an N+2 constructor taking meta and extmap. With the patch, defrecords get an N+4 constructor letting callers pick hash codes.

I found these problems via the following reasoning:

  • Code has been touched near __extmap
  • Grep for all uses of __extmap and see what breaks
Comment by Nicola Mometto [ 27/Jun/14 2:53 PM ]

Patch 0001-cache-hasheq-and-hashCode-for-records.patch fixes both those issues, reintroducing the N+2 arity constructor

Comment by Alex Miller [ 27/Jun/14 4:08 PM ]

Questions addressed, back to Vetted.





[CLJ-1400] Error "Can't refer to qualified var that doesn't exist" should name the bad symbol Created: 09/Apr/14  Updated: 29/Jul/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Howard Lewis Ship Assignee: Scott Bale
Resolution: Unresolved Votes: 0
Labels: Compiler, errormsgs
Environment:

OS X


Attachments: File clj-1400-1.diff    
Patch: Code and Test
Approval: Vetted

 Description   

Def of var with a ns that doesn't exist will yield this error:

user> (def foo/bar 1)
CompilerException java.lang.RuntimeException: Can't refer to qualified var that doesn't exist, compiling:(NO_SOURCE_PATH:1:1)

Cause: Compiler.lookupVar() returns null if the ns in a qualified var does not exist yet.

Proposed: The error message would be improved by naming the symbol and throwing a CompilerException with file/line/col info. It's not obvious, but this may be the only case where this error occurs. If so, the error message could be more specific that the ns is the part that doesn't exist.

Patch:

Screened by:



 Comments   
Comment by Scott Bale [ 25/Jun/14 9:58 AM ]

This looks to me like relatively low hanging fruit unless I'm missing something; assigning to myself.

Comment by Scott Bale [ 26/Jun/14 11:23 PM ]

Patch clj-1400-1.diff to Compiler.java.

With this patch the example would now look like:

user> (def foo/bar 1)
CompilerException java.lang.RuntimeException: Qualified symbol foo/bar refers to nonexistent namespace: foo, compiling:(NO_SOURCE_PATH:1:1)

I'm not sure the if(namesStaticMember(sym)) [see below], and the 2nd branch, is even necessary. Just by inspection I suspect it is not.

[footnote]

public static boolean namesStaticMember(Symbol sym){
	return sym.ns != null && namespaceFor(sym) == null;
}
Comment by Scott Bale [ 26/Jun/14 11:24 PM ]

patch: code and test

Comment by Scott Bale [ 26/Jun/14 11:27 PM ]

I tested on an actual source file, and the exception message included the file/line/col info as desired:

user=> CompilerException java.lang.RuntimeException: Qualified symbol goo/bar refers to nonexistent namespace: goo, compiling:(/home/scott/dev/foo.clj:3:1)




Generated at Thu Aug 21 23:05:20 CDT 2014 using JIRA 4.4#649-r158309.