<< Back to previous view

[DFRS-2] Make writing footer checksums less expensive or optional Created: 17/Dec/13  Updated: 18/Dec/13

Status: Open
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Ghadi Shayban Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None

Approval: Incomplete

 Description   

Problem:
JVM profiler indicates checksums as implemented are a significant bottleneck.

Cause:
impl.RawOutput wraps the provided OutputStream with a CheckedOutputStream. Every time a rawInt is written, CheckedOutputStream calls on its checksum to update itself.

Adler32's update method happens to be native, which may not be germane to the problem.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/zip/Adler32.java#91

The read side of data.fressian already exposes a knob for checksums to be ignored in RawInput. No such knob exists on the write side.

Checksums are used in the footer methods. They may be extremely useful for data at rest, but may be redundant with other out-of-band mechanisms.

Possible solutions
Buffering so that checksums don't recalculate frequently.
Exposing a knob to control whether write checksums are enabled. This would potentially involve changes with the footer.



 Comments   
Comment by Stuart Halloway [ 18/Dec/13 8:33 AM ]

It is definitely possible that the checksum calculation dings perf. (And if so, another possible solution is just removing checksums entirely from Fressian.)

That said, I don't want to trust a profiler. To move this forward, would like to see a benchmark of a real-world use case without the profiler in play.





[CLJ-1408] Add transient keyword to cached toString() value in _str Created: 19/Apr/14  Updated: 01/Aug/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Tomasz Nurkiewicz Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: interop

Attachments: Text File CLJ-1408-2.patch     Text File CLJ-1408.patch    
Patch: Code
Approval: Incomplete

 Description   

_str field in Keyword and Symbol classes lazily caches result of toString(). Because this field is not transient, serializing (using Java serialization) any keyword before and after calling toString() for the first time yields different results:

(import (java.io ByteArrayOutputStream ObjectOutputStream
                 ByteArrayInputStream  ObjectInputStream))

(defn- serialize [obj]
  (with-open [bos (ByteArrayOutputStream.)
              stream (ObjectOutputStream. bos)]
    (.writeObject stream obj)
    (-> bos .toByteArray seq)))

(def s1 (serialize :k))
;(println :k)
(def s2 (serialize :k))
(println (= s1 s2))

Uncomment (println :k) and suddenly s1 and s2 are no longer equal.

This issue came up when I was trying to use keywords as key in [Hazelcast](https://github.com/hazelcast/hazelcast) map. Hazelcast uses serialized keys in various scenarios, thus if I first put something to map under key :k and then print :k, I can no longer find such key.

Patch: CLJ-1408-2.patch

Screened by:



 Comments   
Comment by Alex Miller [ 19/Apr/14 7:28 AM ]

Hi Tomasz, it would be good to fix this, can you sign the CLA?

Comment by Tomasz Nurkiewicz [ 20/Apr/14 7:26 AM ]

Thanks, I'll sign and send CLA ASAP.

Comment by Tomasz Nurkiewicz [ 08/May/14 4:10 PM ]

My contributor greement arrived, please merge this patch whenever you find suitable.

Comment by Alex Miller [ 08/May/14 10:16 PM ]

Hi Tomasz, I noticed you added the private keyword - please remove that and update the patch.

Comment by Tomasz Nurkiewicz [ 09/May/14 3:55 PM ]

Removed `private` keyword

Comment by Alex Miller [ 20/Jun/14 9:22 AM ]

On second thought, it looks like we have most of the infrastructure for serialization testing anyways, so would appreciate an updated patch with the example turned into a serialization test. Please see test/clojure/test_clojure/serialization.clj for a place to put this (using existing roundtrip function).

Comment by Andy Fingerhut [ 01/Aug/14 9:29 PM ]

Tomasz, in addition to Alex's previous comment, it appears that a commit made to Clojure master earlier today causes your patches to no longer apply cleanly. I haven't looked to see whether updating the patches would be easy, but likely it is.





[CLJ-1250] Reducer (and folder) instances hold onto the head of seqs Created: 03/Sep/13  Updated: 22/Aug/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Christophe Grand Assignee: Unassigned
Resolution: Unresolved Votes: 9
Labels: memory, reducers

Attachments: Text File after-change.txt     Text File before-change.txt     Text File CLJ-1250-20131211.patch     Text File CLJ-1250-AllInvokeSites-20140204.patch     Text File CLJ-1250-AllInvokeSites-20140320.patch    
Patch: Code
Approval: Incomplete

 Description   

Problem Statement
A shared function #'clojure.core.reducers/reducer holds on to the head of a reducible collection, causing it to blow up when the collection is a lazy sequence.

Cause: #'reducer closes over a collection when in order to reify CollReduce, and the closed-over is never cleared. When code attempts to reduce over this anonymous transformed collection, it will realize the tail while the head is stored in the closed-over.

Reproduction steps:
Compare the following calls:

(time (reduce + 0 (map identity (range 1e8))))
(time (reduce + 0 (r/map identity (range 1e8))))

The second call should fail on a normal or small heap.

(If reducers are faster than seqs, increase the range.)

Patch
CLJ-1250-AllInvokeSites-20140320.patch takes Approach #2

Approaches:

1) Reimplement the #'reducer (and #'folder) transformation fns similar to the manner that Christophe proposes here:

(defrecord Reducer [coll xf])

(extend-protocol 
  clojure.core.protocols/CollReduce
  Reducer
      (coll-reduce [r f1]
                   (clojure.core.protocols/coll-reduce r f1 (f1)))
      (coll-reduce [r f1 init]
                   (clojure.core.protocols/coll-reduce (:coll r) ((:xf r) f1) init)))

(def rreducer ->Reducer) 

(defn rmap [f coll]
  (rreducer coll (fn [g] 
                   (fn [acc x]
                     (g acc (f x))))))

Advantages: Relatively non-invasive change.
Disadvantages: Not evident code. Additional protocol dispatch, though only incurred once

2) Clear the reference to 'this' on the stack just before a tail call occurs

When a callee is in return position, clear the local variable reference to 'this' in the caller's stack frame, which will make the caller and all its closed-overs eligible for reclamation.

Patch 1211 takes this approach for InvokeExpr call sites in return position. Patch 1214 takes the same approach for InvokeExprs and also static and instance interop calls.

Here is the code that performs the clearing excerpted from the 1214 patch:

void emitClearThis(GeneratorAdapter gen) {
		gen.visitInsn(Opcodes.ACONST_NULL);
		gen.visitVarInsn(Opcodes.ASTORE, 0);
	}

Tail calls wrapped inside a try/catch/finally clause cannot have 'this' cleared, because closed-overs/locals may need to be emitted for exception handling blocks. Both patches consider and handle this edge case.

Advantages: Fixes this case with no user code changes. Enables GC to do reclaim closed-overs references earlier.
Disadvantages: A compiler change.

3) Alternate approach

from Christophe Grand:
Another way would be to enhance the local clearing mechanism to also clear "this" but it's complex since it may be needed to keep a reference to "this" for caches long after obvious references to "this" are needed.

Advantages: Fine-grained
Disadvantages: Complex, invasive, and the compiler is hard to hack on.

Mitigations
Avoid reducing on lazy-seqs and instead operate on vectors / maps, or custom reifiers of CollReduce or CollFold. This could be easier with some implementations of common collection functions being available (like iterate and partition).

See https://groups.google.com/d/msg/clojure-dev/t6NhGnYNH1A/2lXghJS5HywJ for previous discussion.



 Comments   
Comment by Gary Fredericks [ 03/Sep/13 8:53 AM ]

Fixed indentation in description.

Comment by Ghadi Shayban [ 11/Dec/13 11:08 PM ]

Adding a patch that clears "this" before tail calls. Verified that Christophe's repro case is fixed.

Will upload a diff of the bytecode soon.

Any reason this juicy bug was taken off 1.6?

Comment by Ghadi Shayban [ 11/Dec/13 11:17 PM ]

Here's the bytecode for the clojure.core.reducers/reducer reify before and after the change... Of course a straight diff isn't useful because all the line numbers changed. Kudos to Gary Trakhman for the no.disassemble lein plugin.

Comment by Christophe Grand [ 12/Dec/13 6:58 AM ]

Ghadi, I'm a bit surprised by this part of the patch: was the local clearing always a no-op here?

-		if(context == C.RETURN)
+		if(shouldClear)
 			{
-			ObjMethod method = (ObjMethod) METHOD.deref();
-			method.emitClearLocals(gen);
+                            gen.visitInsn(Opcodes.ACONST_NULL);
+                            gen.visitVarInsn(Opcodes.ASTORE, 0);
 			}

The problem with this approach (clear this on tail call) is that it adds yet another special case. To me the complexity stem from having to keep this around even if the user code doesn't refer to it.

Comment by Ghadi Shayban [ 12/Dec/13 7:19 AM ]

Thank you - I failed to mention this in the commit message: it appears that emitClearLocals() belonging to both ObjMethod and FnMethod (its child) are empty no-ops. I believe the actual local clearing is on line 4855.

I agree re: another special case in the compiler.

Comment by Alex Miller [ 12/Dec/13 8:56 AM ]

Ghadi re 1.6 - this ticket was never in the 1.6 list, it has not yet been vetted by Rich but is ready to do so when we open up again after 1.6.

Comment by Ghadi Shayban [ 12/Dec/13 8:59 AM ]

Sorry I confused the critical list with the Rel1.6 list.

Comment by Ghadi Shayban [ 14/Dec/13 11:16 AM ]

New patch 20131214 that handles all tail invoke sites (InvokeExpr + StaticMethodExpr + InstanceMethodExpr). 'StaticInvokeExpr' seems like an old remnant that had no active code path, so that was left as-is.

The approach taken is still the same as the original small patch that addressed only InvokeExpr, except that it is now using a couple small helpers. The commit message has more details.

Also a 'try' block with no catch or finally clause now becomes a BodyExpr. Arguably a user error, historically accepted, and still accepted, but now they are a regular BodyExpr, instead of being wrapped by a the no-op try/catch mechanism. This second commit can be optionally discarded.

With this patch on my machine (4/8 core/thread Ivy Bridge) running on bare clojure.main:
Christophe's test cases both run i 3060ms on a artificially constrained 100M max heap, indicating a dominant GC overhead. (But they now both work!)

When max heap is at a comfortable 2G the reducers version outpaces the lazyseq at 2100ms vs 2600ms!

Comment by Ghadi Shayban [ 13/Jan/14 10:48 AM ]

Updating stale patch after latest changes to master. Latest is CLJ-1250-AllInvokeSites-20140113

Comment by Ghadi Shayban [ 04/Feb/14 3:50 PM ]

Updating patch after murmur changes

Comment by Tassilo Horn [ 13/Feb/14 4:52 AM ]

Ghadi, I suffer from the problem of this issue. Therefore, I've applied your patch CLJ-1250-AllInvokeSites-20140204.patch to the current git master. However, then I get lots of "java.lang.NoSuchFieldError: array" errors when the clojure tests are run:

     [java] clojure.test-clojure.clojure-set
     [java] 
     [java] java.lang.NoSuchFieldError: array
     [java] 	at clojure.core.protocols$fn__6026.invoke(protocols.clj:123)
     [java] 	at clojure.core.protocols$fn__5994$G__5989__6003.invoke(protocols.clj:19)
     [java] 	at clojure.core.protocols$fn__6023.invoke(protocols.clj:147)
     [java] 	at clojure.core.protocols$fn__5994$G__5989__6003.invoke(protocols.clj:19)
     [java] 	at clojure.core.protocols$seq_reduce.invoke(protocols.clj:31)
     [java] 	at clojure.core.protocols$fn__6017.invoke(protocols.clj:48)
     [java] 	at clojure.core.protocols$fn__5968$G__5963__5981.invoke(protocols.clj:13)
     [java] 	at clojure.core$reduce.invoke(core.clj:6213)
     [java] 	at clojure.set$difference.doInvoke(set.clj:61)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:442)
     [java] 	at clojure.test_clojure.clojure_set$fn__1050$fn__1083.invoke(clojure_set.clj:109)
     [java] 	at clojure.test_clojure.clojure_set$fn__1050.invoke(clojure_set.clj:109)
     [java] 	at clojure.test$test_var$fn__7123.invoke(test.clj:704)
     [java] 	at clojure.test$test_var.invoke(test.clj:704)
     [java] 	at clojure.test$test_vars$fn__7145$fn__7150.invoke(test.clj:721)
     [java] 	at clojure.test$default_fixture.invoke(test.clj:674)
     [java] 	at clojure.test$test_vars$fn__7145.invoke(test.clj:721)
     [java] 	at clojure.test$default_fixture.invoke(test.clj:674)
     [java] 	at clojure.test$test_vars.invoke(test.clj:718)
     [java] 	at clojure.test$test_all_vars.invoke(test.clj:727)
     [java] 	at clojure.test$test_ns.invoke(test.clj:746)
     [java] 	at clojure.core$map$fn__2665.invoke(core.clj:2515)
     [java] 	at clojure.lang.LazySeq.sval(LazySeq.java:40)
     [java] 	at clojure.lang.LazySeq.seq(LazySeq.java:49)
     [java] 	at clojure.lang.Cons.next(Cons.java:39)
     [java] 	at clojure.lang.RT.boundedLength(RT.java:1655)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:130)
     [java] 	at clojure.core$apply.invoke(core.clj:619)
     [java] 	at clojure.test$run_tests.doInvoke(test.clj:761)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:137)
     [java] 	at clojure.core$apply.invoke(core.clj:617)
     [java] 	at clojure.test.generative.runner$run_all_tests$fn__527.invoke(runner.clj:255)
     [java] 	at clojure.test.generative.runner$run_all_tests$run_with_counts__519$fn__523.invoke(runner.clj:251)
     [java] 	at clojure.test.generative.runner$run_all_tests$run_with_counts__519.invoke(runner.clj:251)
     [java] 	at clojure.test.generative.runner$run_all_tests.invoke(runner.clj:253)
     [java] 	at clojure.test.generative.runner$test_dirs.doInvoke(runner.clj:304)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:137)
     [java] 	at clojure.core$apply.invoke(core.clj:617)
     [java] 	at clojure.test.generative.runner$_main.doInvoke(runner.clj:312)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:408)
     [java] 	at user$eval564.invoke(run_tests.clj:3)
     [java] 	at clojure.lang.Compiler.eval(Compiler.java:6657)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7084)
     [java] 	at clojure.lang.Compiler.loadFile(Compiler.java:7040)
     [java] 	at clojure.main$load_script.invoke(main.clj:274)
     [java] 	at clojure.main$script_opt.invoke(main.clj:336)
     [java] 	at clojure.main$main.doInvoke(main.clj:420)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:408)
     [java] 	at clojure.lang.Var.invoke(Var.java:379)
     [java] 	at clojure.lang.AFn.applyToHelper(AFn.java:154)
     [java] 	at clojure.lang.Var.applyTo(Var.java:700)
     [java] 	at clojure.main.main(main.java:37)
Comment by Ghadi Shayban [ 13/Feb/14 8:23 AM ]

Can you give some details about your JVM/environment that can help reproduce? I'm not encountering this error.

Comment by Tassilo Horn [ 13/Feb/14 9:41 AM ]

Sure. It's a 64bit ThinkPad running GNU/Linux.

% java -version
java version "1.7.0_51"
OpenJDK Runtime Environment (IcedTea 2.4.5) (ArchLinux build 7.u51_2.4.5-1-x86_64)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)
Comment by Ghadi Shayban [ 13/Feb/14 10:19 AM ]

Strange, that is exactly my mail env, OpenJDK7 on Arch, 64-bit. I have also tested on JDK 6/7/8 on OSX mavericks. Are you certain that the git tree is clean besides the patch? (Arch users unite!)

Comment by Tassilo Horn [ 14/Feb/14 1:13 AM ]

Yes, the tree is clean. But now I see that I get the same error also after resetting to origin/master, so it's not caused by your patch at all. Oh, now the error vanished after doing a `mvn clean`! So problem solved.

Comment by Nicola Mometto [ 19/Feb/14 12:32 PM ]

Ghandi, FnExpr.parse should bind IN_TRY_BLOCK to false before analyzing the fn body, consider the case

(try (do something (fn a [] (heap-consuming-op a))) (catch Exception e ..))

Here in the a function the this local will never be cleared even though it's perfectly safe to.
Admittedly this is an edge case but we should cover this possibility too.

Comment by Ghadi Shayban [ 19/Feb/14 2:06 PM ]

You may have auto-corrected my name to Ghandi instead of Ghadi. I wish I were that wise =)

I will update the patch for FnExpr (that seems reasonable), but maybe after 1.6 winds down and the next batch of tickets get scrutiny. It would be nice to get input on a preferred approach from Rich or core after it gets vetted – or quite possibly not vetted.

Comment by Nicola Mometto [ 19/Feb/14 6:11 PM ]

hah, sorry for the typo on the name

Seems reasonable to me, in the meantime I just pushed to tools.analyzer/tools.emitter complete support for "this" clearing, I'll test this a bit in the next few days to make sure it doesn't cause unexpected problems.

Comment by Andy Fingerhut [ 24/Feb/14 12:13 PM ]

Patch CLJ-1250-AllInvokeSites-20140204.patch no longer applies cleanly to latest master as of Feb 23, 2014. It did on Feb 14, 2014. Most likely some of its context lines are changed by the commit to Clojure master on Feb 23, 2014 – I haven't checked in detail.

Comment by Ghadi Shayban [ 20/Mar/14 4:39 PM ]

Added a patch that 1) applies cleanly, 2) binds the IN_TRY_EXPR to false initially when analyzing FnExpr and 3) uses RT.booleanCast

Comment by Alex Miller [ 22/Aug/14 9:31 AM ]

Can you squash the patch and add tests to cover all this stuff?

Comment by Ghadi Shayban [ 22/Aug/14 9:47 AM ]

Sure. Have any ideas for how to test proper behavior of reference clearing? Know of some prior art in the test suite?

Comment by Alex Miller [ 22/Aug/14 10:25 AM ]

Something like the test in the summary would be a place to start. I don't know of any test that actually inspects bytecode or anything but that's probably not wise anyways. Need to make that kind of a test but get coverage on the different kinds of scenarios you're covering - try/catch, etc.





[CLJ-1161] sources jar has bad versions.properties resource Created: 11/Feb/13  Updated: 25/Apr/14

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: Release 1.6, Release 1.7

Type: Defect Priority: Minor
Reporter: Steve Miner Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-CLJ-1161-Remove-version.properties-from-sources-JAR.patch    
Patch: Code
Approval: Incomplete

 Description   

The "sources" jar (at least since Clojure 1.4 and including 1.5 RC) has a bad version.properties file in it. The resource clojure/version.properties is literally:

version=${version}

The regular Clojure jar has the correct version string in that resource.

I came across a problem when I was experimenting with the sources jar (as used by IDEs). I naively added the sources jar to my classpath, and Clojure died on start up. The bad clojure/versions.properties file was found first, which led to a parse error as the clojure version was being set.

Solution: patch leaves version.properties file out of sources JAR, where it causes problems for tools.



 Comments   
Comment by Steve Miner [ 11/Feb/13 10:04 AM ]

Notes from the dev mailing list:

The "sources" JAR is generated by another Maven plugin, configured here:
https://github.com/clojure/clojure/blob/clojure-1.5.0-RC15/pom.xml#L169-L181

The simplest solution might be to just exclude the file from the sources jar. It looks like maven-source-plugin has an excludes option which would do the trick:

http://maven.apache.org/plugins/maven-source-plugin/jar-mojo.html#excludes

Comment by Jeff Valk [ 21/Apr/14 8:20 AM ]

This issue is marked closed, but I'm still seeing it: the clojure-1.6.0-sources.jar, clojure-1.5.1-sources.jar, etc on Maven Central still contain the bad version.properties files. More specifically, it looks like the fix has been applied to builds in the SNAPSHOTS repository, but not to RELEASES.

Fix applied: https://oss.sonatype.org/content/repositories/snapshots/org/clojure/clojure/
Not fixed: https://oss.sonatype.org/content/repositories/releases/org/clojure/clojure/

Comment by Alex Miller [ 24/Apr/14 4:15 PM ]

Not sure what's needed here, but marking incomplete.

Comment by Jeff Valk [ 25/Apr/14 11:13 AM ]

Would a fix for this update existing sources jars (1.5.1, 1.6.0, etc) on Central? Or would any fix have to wait on the next Clojure release?

Comment by Alex Miller [ 25/Apr/14 12:37 PM ]

For all the same reasons that mutable state is undesirable, changing an existing release jar in the central Maven repository is also undesirable. While it's not technically impossible, we will not update existing releases and this will need to wait for the next. I've looked at this problem a little and I do not yet know enough to know how to fix it or why it even varies between snapshot and release. Help welcome!

In which tool do you see a resulting problem from this?

Comment by Jeff Valk [ 25/Apr/14 11:56 PM ]

Despite the way I phrased the question, I'd hoped that would be the answer. It's the right policy.

Unfortunately, this issue leaves the released sources jars essentially unusable from a tools standpoint. CIDER now has source code navigation from stacktraces – you can jump into both Clojure and Java function definitions from the error/trace. For the latter, the sources jar (for Clojure or any other Java library) needs to be on the classpath as a dev dependency. There's more host interop support in the works for CIDER too ("embrace the host platform"), but not being able to add a dependency on a stable Clojure sources jar presents a wrinkle.

Are the official Clojure releases built by Hudson? The Hudson build right before the 1.6.0 release (#532) and the one right after (#534) both show the exclusion fix, as does the git clojure-1.6.0 tag, which when I check out and build from source, is fine. The Hudson builds with release tags (e.g. 1.6 = #533, 1.6-RC1 = #512, etc), though, don't show any artifacts other than a pom.xml. This would seem to make it rather hard to audit builds...am I missing something?





[CLJ-1093] Empty PersistentCollections get incorrectly evaluated as their generic clojure counterpart Created: 24/Oct/12  Updated: 06/Jul/14

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 4
Labels: collections, compiler

Attachments: Text File 0001-CLJ-1093-fix-compilation-of-empty-PersistentCollecti.patch     Text File clj-1093-fix-empty-record-literal-patch-v2.txt    
Patch: Code and Test
Approval: Incomplete

 Description   
user> (defrecord x [])
user.x
user> #user.x[]   ;; expect: #user.x{}
{}
user> #user.x{}   ;; expect: #user.x{}
{}
user> #clojure.lang.PersistentTreeMap[]
{}
user> (class *1)  ;; expect: clojure.lang.PersistentTreeMap
clojure.lang.PersistentArrayMap

Cause: Compiler's ConstantExpr parser returns an EmptyExpr for all empty persistent collections, even if they are of types other than the core collections (for example: records, sorted collections, custom collections). EmptyExpr reports its java class as one the classes - IPersistentList/IPersistentVector/IPersistentMap/IPersistentSet rather than the original type.

Proposed: If one of the Persistent* classes, then create EmptyExpr as before, otherwise retain the ConstantExpression of the original collection.
Since EmptyExpr is a compiler optimization that applies only to some concrete clojure collections, making EmptyExpr dispatch on concrete types rather than on generic interfaces makes the compiler behave as expected

Patch: 0001-CLJ-1093-fix-compilation-of-empty-PersistentCollecti.patch

Screened by:



 Comments   
Comment by Timothy Baldridge [ 27/Nov/12 11:41 AM ]

Unable to reproduce this bug on latest version of master. Most likely fixed by some of the recent changes to data literal readers.

Marking Not-Approved.

Comment by Timothy Baldridge [ 27/Nov/12 11:41 AM ]

Could not reproduce in master.

Comment by Nicola Mometto [ 01/Mar/13 1:23 PM ]

I just checked, and the problem still exists for records with no arguments:

Clojure 1.6.0-master-SNAPSHOT
user=> (defrecord a [])
user.a
user=> #user.a[]
{}

Admittedly it's an edge case and I see little usage for no-arguments records, but I think it should be addressed aswell since the current behaviour is not what one would expect

Comment by Herwig Hochleitner [ 02/Mar/13 8:14 AM ]

Got the following REPL interaction:

% java -jar ~/.m2/repository/org/clojure/clojure/1.5.0/clojure-1.5.0.jar
user=> (defrecord a [])
user.a
user=> (a.)
#user.a{}
user=> #user.a{}
{}
#user.a[]
{}

This should be reopened or declined for another reason than reproducability.

Comment by Nicola Mometto [ 10/Mar/13 2:18 PM ]

I'm reopening this since the bug is still there.

Comment by Andy Fingerhut [ 13/Mar/13 2:04 PM ]

Patch clj-1093-fix-empty-record-literal-patch-v2.txt dated Mar 13, 2013 is identical to Bronsa's patch 001-fix-empty-record-literal.patch dated Oct 24, 2012, except that it applies cleanly to latest master. I'm not sure why the older patch doesn't but git doesn't like something about it.

Comment by Nicola Mometto [ 26/Jun/13 8:06 PM ]

Patch 0001-CLJ-1093-fix-empty-records-literal-v2.patch solves more issues than the previous patch that was not evident to me at the time.

Only collections that are either PersistentList or PersistentVector or PersistentHash[Map|Set] or PersistentArrayMap can now be EmptyExpr.
This is because we don't want every IPersistentCollection to be emitted as a generic one eg.

user=> (class #clojure.lang.PersistentTreeMap[])
clojure.lang.PersistentArrayMap

Incidentally, this patch also solves CLJ-1187
This patch should be preferred over the one on CLJ-1187 since it's more general

Comment by Jozef Wagner [ 09/Aug/13 2:08 AM ]

Maybe this is related:

user=> (def x `(quote ~(list 1 (clojure.lang.PersistentTreeMap/create (seq [1 2 3 4])))))
#'user/x
user=> x
(quote (1 {1 2, 3 4}))
user=> (class (second (second x)))
clojure.lang.PersistentTreeMap
user=> (eval x)
(1 {1 2, 3 4})
user=> (class (second (eval x)))
clojure.lang.PersistentArrayMap

Even if the collection is not evaluated, it is still converted to the generic clojure counterpart.

Comment by Alex Miller [ 24/Apr/14 4:44 PM ]

In the change for ObjectExpr.emitValue() where you've added PersistentArrayMap to the PersistentHashMap case, should the IPersistentVector case below that be PersistentVector instead, otherwise it would snare a custom IPersistentVector that's not a PersistentVector, right?

This line: "else if(form instanceof ISeq)" at the end of the Compiler diff has different leading tabs which makes the diff slightly more confusing than it could be.

Would be nice to add a test for the sorted map case in the description.

Marking incomplete to address some of these.

Comment by Alex Miller [ 13/May/14 10:43 PM ]

bump

Comment by Nicola Mometto [ 14/May/14 4:24 AM ]

Attached patch 0001-CLJ-1093-fix-empty-collection-literal-evaluation.patch which implements your suggestions.

replacing IPersistentVector with PersistentVector in ObjectExpr.emitValue() exposes a print-dup limitation: it expects every IPersistentCollection to have a static "create" method.

This required special casing for MapEntry and APersistentVector$SubVector

Comment by Nicola Mometto [ 16/May/14 3:57 PM ]

I updated the patch adding print-dups for APersistentVector$SubVec and other IPersistentVectors rather than special casing them in the compiler

Comment by Alex Miller [ 23/May/14 4:21 PM ]

All of the checks on concrete classes in the Compiler parts of this patch don't sit well with me. I understand how you got to this point and I don't have an alternate recommendation (yet) but all of that just feels like the wrong direction.

We want to be built on abstractions such that internal collections are not special; they should conform to the same expectations as an external collection and both should follow the same paths in the compiler - needing to check for those types is a flag for me that something is amiss.

I am marking Incomplete for now based on these thoughts.

Comment by Nicola Mometto [ 06/Jul/14 10:01 AM ]

I've been thinking for a while about this issue and I've come to the conclusion that in my later patches I've been trying to incorporate fixes for 3 different albeit related issues:

1- Clojure transforms all empty IPersistentCollections in their generic Clojure counterpart

user> (defrecord x [])
user.x
user> #user.x[]   ;; expected: #user.x{}
{}
user> #user.x{}   ;; expected: #user.x{}
{}
user> #clojure.lang.PersistentTreeMap[]
{}
user> (class *1)  ;; expected: clojure.lang.PersistentTreeMap
clojure.lang.PersistentArrayMap

2- Clojure transforms all the literals of collections implementing the Clojure interfaces (IPersistentList, IPersistentVector ..) that are NOT defined with deftype or defrecord, to their
generic Clojure counterpart

user=> (class (eval (sorted-map 1 1)))
clojure.lang.PersistentArrayMap ;; expected: clojure.lang.PersistentTreeMap

3- print-dup is broken for some Clojure persistent collections

user=> (print-dup (subvec [1] 0) *out*)
#=(clojure.lang.APersistentVector$SubVector/create [1])
user=> #=(clojure.lang.APersistentVector$SubVector/create [1])
IllegalArgumentException No matching method found: create  clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:53)

I'll keep this ticket regarding issue #1 and open two other tickets for issue #2 and #3

Comment by Nicola Mometto [ 06/Jul/14 10:15 AM ]

I've attached a new patch fixing only this issue, the approach is explained in the description





[CLJ-971] Jar within a jar throws a runtime error Created: 10/Apr/12  Updated: 06/Jul/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2, Release 1.3
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Ron Romero Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: None
Environment:

Maven using the one-jar plugin


Attachments: Text File clj-971-1.patch     Text File clj-971-2.patch    
Patch: Code
Approval: Incomplete

 Description   

The code in RT.load() will load a class by name in several situations. One of these is when a .class resource exists, a .clj resource exists and the class is newer than the clj file. To determine the age of the class and clj, a call is made into RT.lastModified with the resource URL. If the URL is a jar protocol, then RT.lastModified() will try to open a connection to that jar, find the entry in the jar, and return its time.

Sometimes deployment occurs using nested jar files (for example: http://one-jar.sourceforge.net/) and custom classloaders. In this usage, the jar file can be opened, but the entry will not be found and an NPE will result during load:

Caused by: java.lang.NullPointerException
    at clojure.lang.RT.lastModified(RT.java:374)
    at clojure.lang.RT.load(RT.java:408)
    at clojure.lang.RT.load(RT.java:398)
    at clojure.lang.RT.doInit(RT.java:434)
    at clojure.lang.RT.<clinit>(RT.java:316)
... 7 more

Proposal: Make lastModified() more tolerant of finding a null entry and falling back to return the last modified date of the jar file itself.

Patch: clj-971-2.patch

Screened by:



 Comments   
Comment by Anders Sveen [ 11/Nov/13 3:29 AM ]

Would be awesome if you could get this working. Wanting to use som Clojure libs in Java so Leiningen uberjar is not an option right now.

Comment by Paavo Parkkinen [ 24/Nov/13 7:11 AM ]

Took the code change in the description and rolled it into a patch to hopefully push this forward a little bit.

Comment by Alex Miller [ 24/Nov/13 10:06 AM ]

Thanks Paavo - on a quick scan, it looks like in the case you're interested in the updated code would now call url.openConnection() twice - perhaps that could be factored out?

Comment by Paavo Parkkinen [ 25/Nov/13 6:19 AM ]

New patch with duplicate calls to url.openConnection() factored out.

Comment by Gleb Kanterov [ 13/Dec/13 4:30 AM ]

I had the same issue, adding following line to manifest file worked for me

One-Jar-URL-Factory: com.simontuffs.onejar.JarClassLoader$OneJarURLFactory
Comment by Stuart Halloway [ 31/Jan/14 1:04 PM ]

Can we get a test showing the normal path and the can't-read-inside path?

Comment by Sean Shubin [ 17/Apr/14 9:34 PM ]

I notice this solution falls back on the last modified date for the entire jar, while it is still possible to get the date of the individual file, albeit less efficiently.
I posted a way to get the individual file date in CLJ-1405.
Perhaps you don't need the date of the individual file, but I am having a hard time groking the calling code so I am not sure what its intent is.
I did notice in the calling code that the if statement
(classURL != null && (cljURL == null || lastModified(classURL, classfile) > lastModified(cljURL, cljfile))) || classURL == null
Is logically equivalent to the much simpler
classURL == null || cljURL == null || lastModified(classURL, classfile) > lastModified(cljURL, cljfile)

Comment by Alex Miller [ 23/May/14 10:56 AM ]

Why is the solution that Gleb proposed not sufficient without changing Clojure? It seems fairly trivial to add a manifest line in the one-jar packaging step:

One-Jar-URL-Factory: com.simontuffs.onejar.JarClassLoader$OneJarURLFactory

Stu - I doubt the effort of setup is worth a test in the clojure tests but it might be useful to have documented steps to repro manually.

Sean - I think the effort on CLJ-1405 of scanning the full jar is going too far and not needed if we decide to go forward with this.

Comment by Sean Shubin [ 06/Jul/14 12:32 AM ]

Alex,

Regarding scanning the jar, I was worried about the behavior not matching the intent of the code. As I have not had time to get familiar with the code, I figured this route was safer. If scanning the jar turns out to be overkill, so be it, we could just have the lastModified return 0 or something. Either way the code should be under test coverage, I will try to block out some time for that next weekend. Also, as I mentioned in CLJ-1405 (sorry about the duplicate issue), I did document the steps to reproduce the issue here: https://github.com/SeanShubin/clojure-one-jar.

Regarding Gleb's solution, that seems like a decent workaround, but I figured it would be better to fix the bug where it is rather than rely on a workaround. Shouldn't Clojure be changed if it is legitimately a bug in Clojure?

Comment by Alex Miller [ 06/Jul/14 6:06 PM ]

Onejar creates a new kind of classpath container (which happens to have the same type as a .jar) but works differently. As such, resource urls that start with the jar protocol no longer work in certain ways (in this case returning null for lastModified). Java provides a mechanism to customize this behavior with a new url protocol, which is what the One-Jar-URL-Factory installs and uses. It thus seems to me that onejar introduces the issue and also provides the solution. As far as I see it, this is not a bug in Clojure itself.

If there were situations (without custom jar types or protocols) where lastModified() really does throw an NPE, then I would be more convinced that this was a scenario where Clojure should change behavior.

To modify the greeter example given in CLJ-1405:
1) bump the onejar-maven-plugin version to at least 1.4.5
2) add the following to the executions/execution/configuration for the plugin:

<manifestEntries>
      <One-Jar-URL-Factory>com.simontuffs.onejar.JarClassLoader$OneJarURLFactory</One-Jar-URL-Factory>
    </manifestEntries>




[CLJ-787] transient blows up when passed a vector created by subvec Created: 03/May/11  Updated: 23/May/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alexander Redington Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None

Attachments: Text File CLJ-787-p1.patch    
Patch: Code and Test
Approval: Incomplete

 Description   

Subvectors created with subvec from a PersistentVector cannot be made transient:

user=> (transient (subvec [1 2 3 4] 2))
ClassCastException clojure.lang.APersistentVector$SubVector cannot be cast to clojure.lang.IEditableCollection  clojure.core/transient (core.clj:2864)

Cause: APersistentVector$SubVector does not implement IEditableCollection

Patch: CLJ-787-p1.patch

Approach: Create a TransientSubVector based on an underlying TransientVector.

Two assumptions:

  • It's okay for TransientSubVector to delegate the ensureEditable functionality to the underlying TransientVector (sometimes explicitly, sometimes implicitly) - calling ensureEditable explicitly also requires that the field for the underlying vector be the concrete TransientVector type rather than the ITransientVector interface.
  • When an operation that would throw an exception on a PersistentVector happens from the wrong thread (or after persistent!), we throw that exception rather than the IllegalAccessError that transients throw when accessed inappropriately.


 Comments   
Comment by Stuart Sierra [ 31/May/11 9:28 AM ]

Confirmed. APersistentVector$SubVector does not implement IEditableCollection.

The current implementation of TransientVector depends on implementation details of PersistentVector, so it is not a trivial fix. The simplest fix might be to implement IEditableCollection.asTransient in SubVector by creating a new PersistentVector, but I do not know the performance implications.

Comment by Gary Fredericks [ 25/May/13 8:11 PM ]

We could get the same performance characteristics as SubVector by creating a TransientSubVector based on an underlying TransientVector, right?

Preparing a patch to that effect.

Comment by Gary Fredericks [ 25/May/13 10:58 PM ]

Text from the commit msg:

Made two assumptions:

  • It's okay for TransientSubVector to delegate the ensureEditable
    functionality to the underlying TransientVector (sometimes
    explicitely, sometimes implicitely) – calling ensureEditable
    explicitely also requires that the field for the underlying vector
    be the concrete TransientVector type rather than the
    ITransientVector interface.
  • When an operation that would throw an exception on a
    PersistentVector happens from the wrong thread (or after
    persistent!), we throw that exception rather than the
    IllegalAccessError that transients throw when accessed
    inappropriately.
Comment by Alex Miller [ 11/Oct/13 4:17 PM ]

I think there are some assumptions being made in this patch about the class structure here that do not hold. The structure is, admittedly, quite twisty.

A counter-example that highlights one of a few subtypes of APersistentVector that are not PersistentVector (like MapEntry):

user=> (transient (subvec (first {:a 1}) 0 1))
ClassCastException clojure.lang.MapEntry cannot be cast to clojure.lang.IEditableCollection  clojure.lang.APersistentVector$TransientSubVector.<init> (APersistentVector.java:592)

PersistentVector.SubVector expects to work on anything that implements IPersistentVector. Note that this includes concrete types such as MapEntry and LazilyPersistentVector, but could also be any user-implemented type IPersistentVector type too. TransientSubVector is making the assumption that the IPersistentVector in a SubVector question is also an IEditableCollection (that can be converted to be transient). Note that while PersistentVector implements TransientVector (and IEditableCollection), APersistentVector does not. To really implement this in tandem with SubVector, I think you would need to guarantee that IPersistentVector extended IEditableCollection and I don't think that's something we want to do.

I don't see an easy solution. Any time I see all these modifiers (Transient, Sub, etc) being created in different combinations, it is a clear sign that independent kinds of functionality are being remixed into single inheritance OO trees. You can see the same thing in most collection libraries (even Java's - need a ConcurrentIdentitySortedMap? too bad!).

Needs more thought.

Comment by Andy Fingerhut [ 08/Nov/13 10:17 AM ]

Patch CLJ-787-p1.patch no longer applies cleanly to latest master, but it is only because of some new tests added to the transients.clj file since the patch was created, so it is trivial to update in that sense. Not updating it now due to other more significant issues with the patch described above.

Comment by Alex Miller [ 17/Jan/14 10:19 AM ]

No good solution to consider right now, removing from 1.6.





Generated at Fri Aug 22 10:56:57 CDT 2014 using JIRA 4.4#649-r158309.