<< Back to previous view

[DFRS-2] Make writing footer checksums less expensive or optional Created: 17/Dec/13  Updated: 18/Dec/13

Status: Open
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Ghadi Shayban Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None

Approval: Incomplete

 Description   

Problem:
JVM profiler indicates checksums as implemented are a significant bottleneck.

Cause:
impl.RawOutput wraps the provided OutputStream with a CheckedOutputStream. Every time a rawInt is written, CheckedOutputStream calls on its checksum to update itself.

Adler32's update method happens to be native, which may not be germane to the problem.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/zip/Adler32.java#91

The read side of data.fressian already exposes a knob for checksums to be ignored in RawInput. No such knob exists on the write side.

Checksums are used in the footer methods. They may be extremely useful for data at rest, but may be redundant with other out-of-band mechanisms.

Possible solutions
Buffering so that checksums don't recalculate frequently.
Exposing a knob to control whether write checksums are enabled. This would potentially involve changes with the footer.



 Comments   
Comment by Stuart Halloway [ 18/Dec/13 8:33 AM ]

It is definitely possible that the checksum calculation dings perf. (And if so, another possible solution is just removing checksums entirely from Fressian.)

That said, I don't want to trust a profiler. To move this forward, would like to see a benchmark of a real-world use case without the profiler in play.





[CLJ-1591] Symbol not being bound in namespace when name clashes with clojure.core Created: 14/Nov/14  Updated: 25/Nov/14

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Mike Anderson Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None

Approval: Incomplete

 Description   

The following code fails (both in 1.6 and latest 1.7-alpha4):

user=> (ns foo)
nil
foo=>  (def inc inc)
WARNING: inc already refers to: #'clojure.core/inc in namespace: foo, being replaced by: #'foo/inc
#'foo/inc

;; Note inc is unbound at this point, which causes the exception below
foo=> inc
#<Unbound Unbound: #'foo/inc>
foo=> (ns bar)
nil
bar=> (require ['foo :refer ['inc]])
WARNING: inc already refers to: #'clojure.core/inc in namespace: bar, being replaced by: #'foo/inc
nil
bar=> (inc 8)

IllegalStateException Attempting to call unbound fn: #'foo/inc  clojure.lang.Var$Unbound.throwArity (Var.java:43)

Further investigation shows that foo/inc is unbound:

foo/inc
=> #<Unbound Unbound: #'foo/inc>

Further investigation also shows that replacing the (def inc inc) with almost anything else, e.g. (def inc dec), (def inc clojure.core/inc), or (def inc (fn [n] (+ n 1))), causes no exception (but the warnings remain).

I would expect:
a) foo/inc should be bound and have the same value as clojure.core/inc
b) No error when requiring foo/inc
c) bar/inc should be bound to foo/inc



 Comments   
Comment by Nicola Mometto [ 14/Nov/14 10:04 PM ]

The second error should be expected, the right syntax should be (require ['foo :refer ['inc]]) (note the leading quote before inc)

Comment by Mike Anderson [ 14/Nov/14 10:20 PM ]

Thanks for the catch Nicola - I've edited the description. Still get the same error however (just with a slightly different message)

Comment by Alex Miller [ 14/Nov/14 10:22 PM ]

See comment...

Comment by Mike Anderson [ 14/Nov/14 10:24 PM ]

@Alex what comment? Note that the error still occurs even with the right syntax....

Comment by Mike Anderson [ 14/Nov/14 10:26 PM ]

Appears to have been closed prematurely

Comment by Alex Miller [ 14/Nov/14 10:39 PM ]

I can't reproduce with the correct syntax:

Clojure 1.7.0-master-SNAPSHOT
user=> (ns foo)
nil
foo=> (def inc inc)
WARNING: inc already refers to: #'clojure.core/inc in namespace: foo, being replaced by: #'foo/inc
#'foo/inc
foo=> (ns bar)
nil
bar=> (require ['foo :refer ['inc]])
WARNING: inc already refers to: #'clojure.core/inc in namespace: bar, being replaced by: #'foo/inc
nil
Comment by Mike Anderson [ 14/Nov/14 10:55 PM ]

The problem is that the var is still unbound and causes e.g. the following error:

=> (foo/inc 8)
IllegalStateException Attempting to call unbound fn: #'foo/inc clojure.lang.Var$Unbound.throwArity (Var.java:43)

I don't think that should be expected - or am I missing something?

Comment by Alex Miller [ 14/Nov/14 10:57 PM ]

Ah, will take a look. But not right now.

Comment by Andy Fingerhut [ 15/Nov/14 1:09 PM ]

Updated the description with a few more details. The exception goes away if you do (def inc (fn [n] (+ n 1))) instead of (def inc inc), for example. The warnings remain.

Comment by Tom Crayford [ 20/Nov/14 11:07 AM ]

Unsure if this is the same issue (I think it might be?), but I reproduced the exact same error message with AOT compilation involved:

reproduced in this git repository: https://github.com/yeller/compiler_update_not_referenced_bug

clone it, run `lein do clean, uberjar, test`, and that error message will show up every time for me

Comment by Andy Fingerhut [ 20/Nov/14 5:43 PM ]

Mike, I think replacing (def inc inc) in your example with (def inc clojure.core/inc) should be considered as a reasonable workaround for this issue, unless you have some use case where you need to def inc to something that is not in clojure.core (and if so, why?)

The reason (def inc inc) behaves this way is, if not absolutely necessary, at least commonly used in Clojure programs to define recursive functions, e.g. (defn fib [n] (if (<= n 1) 1 (+ (fib (dec n)) (fib (- n 2))))), so that the occurrences of fib in the body are resolved to the fib being defined.

Comment by Alex Miller [ 22/Nov/14 9:05 AM ]

Moving to 1.7 until I can look at this more deeply.

Comment by Mike Anderson [ 23/Nov/14 6:08 PM ]

Andy - yes the workaround is fine for me right now.

I don't think this is an urgent issue but it may be exposing a subtle complexity regarding assumptions about the state of the namespace at different times. Perhaps the semantics should be something like:

  • The def statement itself should be run before the var is interned. e.g. (def inc (inc 5)) should result in (def inc 6)
  • Anything complied / deferred to run after completion of the def statement should use the new var (i.e. the new var should be referenced by fns, lazy sequences etc.)
Comment by Andy Fingerhut [ 23/Nov/14 6:36 PM ]

I'm not sure what your proposal means in a case like this:

(def inc (fn [x] (inc x)))

Is the second inc to be interpreted/resolved before or after the new inc is created? Because it is (fn ...) it should be the after-behavior? What else besides fn should cause the after-behavior, rather than the before-behavior?

Even more fun (not saying that people often write code like this, but the compiler can handle it today):

(def inc (if (> (inc y) 5)
           (fn [x] (inc x))
           (fn [x] (dec x))))

I think the current compiler behavior of 'in the body of a def, the def'd symbol always refers to the new var, not any earlier def'd vars' is fairly straightforward to explain.

Comment by Tom Crayford [ 23/Nov/14 9:15 PM ]

Should I file the AOT issue reproduced in that thing as a new issue?

Comment by Andy Fingerhut [ 24/Nov/14 5:16 PM ]

Tom: Alex Miller or another screener would be best to say whether the AOT issue should be a separate ticket, but my best guess would be "go for it". I tried to look at the link you gave but it seems not to point to anything. Could you double-check that link?

Comment by Tom Crayford [ 24/Nov/14 6:48 PM ]

Andy,

Great. I'll write one up tomorrow sometime. I accidentally left that repo as private, should be visible now.

Comment by Andy Fingerhut [ 24/Nov/14 8:11 PM ]

Tom, looked at your project. Thanks for that. It appears not to have anything like (def inc inc) in it. It throws exception during test step of 'lein do clean, uberjar, test' consistently for me, too, but compiles with only warnings and passes tests with 'lein do clean, test'. I have more test results showing in which Clojure versions these results change. To summarize, the changes to Clojure that appear to make the biggest difference in the results are below (these should be added to the new ticket you create – you are welcome to do so):

Clojure 1.6.0, 1.7.0-alpha1, and later changes up through the commit with description "CLJ-1378: Allows FnExpr to override its reported class with a type hint": No errors or warnings for either lein command above.

Next commit with description "Add clojure.core/update, like update-in but takes a single key" that adds clojure.core/update: 'lein do clean, test' is fine, but 'lein do clean, uberjar' throws exception during compilation, probably due to CLJ-1241.

Next commit with description "fix CLJ-1241": 'lein do clean, test' and 'lein do clean, uberjar' give warnings about clojure.core/update, but no errors or exceptions. 'lein do clean, uberjar, test' throws exception during test step that is same as the one I see with Clojure 1.7.0-alpha4. Debug prints of values of clojure.core/update and int-map/update (in data.int-map and in Tom's namespace compiler-update-not-referenced-bug.core) show things look fine when printed inside data.int-map, and in Tom's namespace when not doing the uberjar, but when doing the uberjar, test, int-map/update is unbound in Tom's namespace.

In case it makes a difference, my testing was done with Mac OS X 10.9.5, Leiningen 2.5.0 on Java 1.7.0_45 Java HotSpot(TM) 64-Bit Server VM

Comment by Nicola Mometto [ 25/Nov/14 3:44 PM ]

Tom, I've opened a ticket with a patch fixing the AOT issue: http://dev.clojure.org/jira/browse/CLJ-1604





[CLJ-1580] Transient collections should guarantee thread visibility Created: 05/Nov/14  Updated: 25/Nov/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: transient

Approval: Incomplete

 Description   

With changes from CLJ-1498, transients are still thread isolated but may move between threads during their lifetime which introduces new concurrency concerns, namely visibility of changes across threads.






[CLJ-1515] Reify the result of range Created: 29/Aug/14  Updated: 14/Nov/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Timothy Baldridge Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None

Attachments: File patch.diff     File range-patch3.diff     File reified-range4.diff    
Patch: Code and Test
Approval: Incomplete

 Description   

Currently range simply returns a lazy seq. If the return value of range were reified into a type (as it is in ClojureScript) we could optimize many functions on that resulting type. Some operations such as count and nth become O(1) in this case, while others such as reduce could receive a performance boost do to the reduced number of allocations.

Approach: this patch revives the unused (but previously existing) clojure.lang.Range class. This class acts as a lazy seq and implements several other appropriate interfaces such as Counted and Indexed. This type is implemented in Java since range is needed fairly on in core.clj before deftype is defined. The attached patch uses Numbers.* methods for all math due to the input types to range being unknown. The class also supplies a .iterator() method which allows for allocation free reducing over range.

Note: this code keeps backwards compatibility with the existing range code. This means some parts of the class (mostly relating to a step size of 0) are a bit more complex than desired, but these bits were needed to get all the tests to pass.

Note: this code does not preserve the chunked-seq nature of the original range. The fact that range used to return chunked seqs was not published in the doc strings and so it was removed to allow for simpler code in Range.java.

Performance:
(timings done at the repl run via java -jar)

(dotimes [x 100] (time (dotimes [x 1] (count (range (* 1024 1024))))))
master => 80-110ms
patch => 0.014ms


(dotimes [x 100] (time (dotimes [x 1] (reduce + (map inc (range (* 1024 1024)))))))
master => 76-87ms
patch => 340-360ms


(dotimes [x 100] (time (dotimes [x 1] (reduce + (map inc (map inc (range (* 1024 1024))))))))
master => 97-123ms
patch=> 490-577ms



(dotimes [x 100] (time (dotimes [x 1] (count (filter odd? (range (* 1024 1024)))))))
master => 87-104ms
patch => 370-330ms


(dotimes [x 100] (time (dotimes [x 1] (transduce (comp (map inc) (map inc)) + (range (* 1024 1024))))))
master=>76-116ms
patch => 44ms-59ms

Patch: reified-range4.diff (some tests fail)



 Comments   
Comment by Alex Miller [ 29/Aug/14 3:19 PM ]

1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?

Comment by Timothy Baldridge [ 29/Aug/14 3:40 PM ]

1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Comment by Alex Miller [ 29/Aug/14 3:49 PM ]

6) might be useful if you're walking through it with nth, which hits count everytime, but doubt that's common
7) yep, reasonable

Comment by Andy Fingerhut [ 18/Sep/14 6:52 AM ]

I have already pointed out to Edipo in personal email the guidelines on what labels to use for Clojure JIRA tickets here: http://dev.clojure.org/display/community/Creating+Tickets

Comment by Timothy Baldridge [ 19/Sep/14 10:02 AM ]

New patch with IReduce directly on Range instead of relying on iterators

Comment by Alex Miller [ 01/Oct/14 2:00 PM ]

The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is.

Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.

Comment by Timothy Baldridge [ 03/Oct/14 10:00 AM ]

Looks like chunked seqs do make lazy seq code about 5x faster in these tests.

Comment by Ghadi Shayban [ 03/Oct/14 10:22 AM ]

I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.

Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

Comment by Timothy Baldridge [ 03/Oct/14 10:38 AM ]

I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

Comment by Alex Miller [ 03/Oct/14 10:46 AM ]

I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.

On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Comment by Ghadi Shayban [ 03/Oct/14 11:00 AM ]

Tim, I'm not suggesting unboxed math, but the singular fast-path of all-Longs that you and Alex describe. I mistakenly lower-cased Long/Float.

Comment by Timothy Baldridge [ 31/Oct/14 11:30 AM ]

Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.

Comment by Nicola Mometto [ 14/Nov/14 12:51 PM ]

As discussed with Tim in #clojure, the current patch should not change ArrayChunk's reduce impl, that's an error.





[CLJ-1424] Feature Expressions Created: 15/May/14  Updated: 07/Nov/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Ghadi Shayban Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: reader

Attachments: File CLJ-1424-2.diff     File clj-1424-3.diff     File clojure-feature-expressions.diff    
Patch: Code
Approval: Incomplete

 Description   

Feature expressions based directly on Common Lisp. See Clojure design docs, which includes discussion and links to Common Lisp documentation for feature expressions here: http://dev.clojure.org/display/design/Feature+Expressions

#+ #- and or not
are supported. Unreadable tagged literals are suppressed through the *suppress-read* dynamic var. For example, with *features* being #{:clj}, which is the default, the following should read :foo

#+cljs #js {:one :two} :foo

The initial *features* set can be augmented (clj will always be included) with the clojure.features System property:

-Dclojure.features=production,embedded

Patch: clj-1424-3.diff

Questions: Should *suppress-read* override *read-eval*?

Related: CLJS-27, TRDR-14



 Comments   
Comment by Jozef Wagner [ 16/May/14 2:19 AM ]

Has there been a decision that CL syntax is going to be used? Related discussion can be found at design page, google groups discussion and another discussion.

Comment by Alex Miller [ 16/May/14 8:34 AM ]

No, no decisions on anything yet.

Comment by Ghadi Shayban [ 19/May/14 7:25 PM ]

Just to echo a comment from TRDR-14:

This is WIP and just one approach for feature expressions. There seem to be at least two couple diverging approaches emerging from the various discussion (Brandon Bloom's idea of read-time splicing being the other.)

In any case having all Clojure platforms be ready for the change is probably essential. Also backwards compatibility of feature expr code to Clojure 1.6 and below is also not trivial.

Comment by Kevin Downey [ 04/Aug/14 1:39 PM ]

if you have ever tried to do tooling for a language where the "parser" tossed out information or did some partial evaluation, it is a pain. this is basically what the #+cljs style feature expressions and bbloom's read time splicing both do with clojure's reader. I think resolving this at read time instead of having the compiler do it before macro expansion is a huge mistake and makes the reader much less useful for reading code.

Comment by Ghadi Shayban [ 04/Aug/14 2:00 PM ]

Kevin, what kind of tooling use case are you alluding to?

Comment by Kevin Downey [ 04/Aug/14 3:24 PM ]

any use case that involves reading code and not immediately handing it off to the compiler. if I wanted to write a little snippet to read in a function, add an unused argument to every arity then pprint it back, reader resolved feature expressions would not round trip.

if I want to write snippet of code to generate all the methods for a deftype (not a macro, just at the repl write a `for` expression) I can generate a clojure data structure, call pprint on it, then paste it in as code, reader feature expressions don't have a representation as data so I cannot do that, I would have to generate strings directly.

Comment by Alex Miller [ 22/Aug/14 9:10 AM ]

Changing Patch setting so this is not in Screenable yet (as it's still a wip).

Comment by Alex Miller [ 07/Nov/14 4:39 PM ]

Latest patch brings up to par with related patches in CLJS-27 and TRDR-14 and importantly adds support for loading .cljc files as Clojure files.

Comment by Andy Fingerhut [ 07/Nov/14 5:55 PM ]

Maybe undesirable behavior demonstrated below with latest Clojure master plus patch clj-1424-3.diff, due to the #+cljs skipping the comment, but not the (dec a). I thought it could be fixed simply by moving RT.suppressRead() check after (ret == r) check in read(), but that isn't correct.

user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs ; foo\n (dec a))")
(defn foo [a] (inc a) (dec a))




[CLJ-1400] Error "Can't refer to qualified var that doesn't exist" should name the bad symbol Created: 09/Apr/14  Updated: 07/Oct/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.8

Type: Enhancement Priority: Minor
Reporter: Howard Lewis Ship Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: Compiler, errormsgs
Environment:

OS X


Attachments: File clj-1400-2.diff     File clj-1400-3.diff     File clj-1400-4.diff    
Patch: Code and Test
Approval: Incomplete

 Description   

Def of var with a ns that doesn't exist will yield this error:

user> (def foo/bar 1)
CompilerException java.lang.RuntimeException: Can't refer to qualified var that doesn't exist, compiling:(NO_SOURCE_PATH:1:1)

Cause: Compiler.lookupVar() returns null if the ns in a qualified var does not exist yet.

Proposed: The error message would be improved by naming the symbol and throwing a CompilerException with file/line/col info. It's not obvious, but this may be the only case where this error occurs. If so, the error message could be more specific that the ns is the part that doesn't exist.

Patch: clj-1400-4.diff

Screened by: Alex Miller



 Comments   
Comment by Scott Bale [ 25/Jun/14 9:58 AM ]

This looks to me like relatively low hanging fruit unless I'm missing something; assigning to myself.

Comment by Scott Bale [ 26/Jun/14 11:23 PM ]

Patch clj-1400-1.diff to Compiler.java.

With this patch the example would now look like:

user> (def foo/bar 1)
CompilerException java.lang.RuntimeException: Qualified symbol foo/bar refers to nonexistent namespace: foo, compiling:(NO_SOURCE_PATH:1:1)

I'm not sure the if(namesStaticMember(sym)) [see below], and the 2nd branch, is even necessary. Just by inspection I suspect it is not.

[footnote]

public static boolean namesStaticMember(Symbol sym){
	return sym.ns != null && namespaceFor(sym) == null;
}
Comment by Scott Bale [ 26/Jun/14 11:24 PM ]

patch: code and test

Comment by Scott Bale [ 26/Jun/14 11:27 PM ]

I tested on an actual source file, and the exception message included the file/line/col info as desired:

user=> CompilerException java.lang.RuntimeException: Qualified symbol goo/bar refers to nonexistent namespace: goo, compiling:(/home/scott/dev/foo.clj:3:1)
Comment by Andy Fingerhut [ 29/Aug/14 4:46 PM ]

Patch clj-1400-1.diff dated Jun 26 2014 no longer applied cleanly to latest master after some commits were made to Clojure on Aug 29, 2014. It did apply cleanly before that day.

I have not checked how easy or difficult it might be to update this patch. See section "Updating Stale Patches" on this wiki page for some tips on updating patches: http://dev.clojure.org/display/community/Developing+Patches

Comment by Scott Bale [ 31/Aug/14 3:53 PM ]

Attached is an updated patch: "clj-1400-2.diff". I removed the stale patch.

Comment by Alex Miller [ 09/Sep/14 9:29 AM ]

Few comments to address:

  • Compiler diff was using spaces, not tabs, which makes it harder to diff. I attached a -3.diff that fixes this.
  • the call to namesStaticMember seems weird. The name of that method is confusing for this use. Beyond that, I think it's doing more than you need. That method is going to attempt resolve the qualified name in terms of the current ns, but I think you don't even want to do that. Rather you just want to know if the sym has a ns (sym.ns != null) - isn't that enough?
  • In what case will the other error "Var doesn't exist" occur? In other words, in what case will lookupVar not succeed in creating a new var here? If there is no such case, then remove this case. If there is such a case, then add a test.
Comment by Scott Bale [ 11/Sep/14 11:19 PM ]

Agree with all three of your bullets. Attached is an updated patch, clj-1400-4.diff.

  • I used tabs in Compiler.java
  • After close inspection of call to lookupVar(...), I believe null is returned only in the case of exactly this ticket (the symbol having a non-null namespace which has not been loaded yet). So I've taken out the conditional and the 2nd branch.
  • (Test is unchanged)
Comment by Scott Bale [ 11/Sep/14 11:22 PM ]

(properly named patch)

Comment by Alex Miller [ 11/Sep/14 11:37 PM ]

You could throw a CompilerException with the location of the problem instead (as the ticket description suggests).

Comment by Scott Bale [ 19/Sep/14 2:37 PM ]

Sorry, I should've mentioned because this wasn't obvious to me either (and in fact I forgot until just now): the RuntimeException is already caught and wrapped in a CompilerException.

I'm not sure which try-catch block within Compiler.java this is happening in, there are multiple. But you can see in the output that the exception is a CompilerException and the file|line|col info is there:

In the Repl...

user> (def foo/bar 1)
CompilerException java.lang.RuntimeException: Qualified symbol foo/bar refers to nonexistent namespace: foo, compiling:(NO_SOURCE_PATH:1:1)

...or in a source file

user=> CompilerException java.lang.RuntimeException: Qualified symbol goo/bar refers to nonexistent namespace: goo, compiling:(/home/scott/dev/foo.clj:3:1)

Also, at the point at which the RuntimeException of this patch is being thrown, the source line and col params to CompilerException are not available, or at least not afaict.

Comment by Alex Miller [ 07/Oct/14 12:34 PM ]

I'll follow up on this patch later - Rich thought it was making too many assumptions. I think we validated many of those but need to double-check those.





[CLJ-1322] doseq with several bindings causes "ClassFormatError: Invalid Method Code length" Created: 10/Jan/14  Updated: 06/Oct/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Miikka Koskinen Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: None
Environment:

Clojure 1.5.1, java 1.7.0_25, OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1ubuntu0.12.04.2)


Attachments: Text File doseq-bench.txt     Text File doseq.patch     File script.clj    
Patch: Code
Approval: Incomplete

 Description   

Important Perf Note the new impl is faster for collections that are custom-reducible but not chunked, and is also faster for large numbers of bindings. The original implementation is hand tuned for chunked collections, and wins for larger chunked coll/smaller binding count scenarios, presumably due to the fn call/return tracking overhead of reduce. Details are in the comments.
Screened By
Patch doseq.patch

user=> (def a1 (range 10))
#'user/a1
user=> (doseq [x1 a1 x2 a1 x3 a1 x4 a1 x5 a1 x6 a1 x7 a1 x8 a1] (do))
CompilerException java.lang.ClassFormatError: Invalid method Code length 69883 in class file user$eval1032, compiling:(NO_SOURCE_PATH:2:1)

While this example is silly, it's a problem we've hit a couple of times. It's pretty surprising when you have just a couple of lines of code and suddenly you get the code length error.



 Comments   
Comment by Kevin Downey [ 18/Apr/14 12:20 AM ]

reproduces with jdk 1.8.0 and clojure 1.6

Comment by Nicola Mometto [ 22/Apr/14 5:35 PM ]

A potential fix for this is to make doseq generate intermediate fns like `for` does instead of expanding all the code directly.

Comment by Ghadi Shayban [ 25/Jun/14 8:39 PM ]

Existing doseq handles chunked-traversal internally, deciding the
mechanics of traversal for a seq. In addition to possibly conflating
concerns, this is causing a code explosion blowup when more bindings are
added, approx 240 bytes of bytecode per binding (without modifiers).

This approach redefs doseq later in core.clj, after protocol-based
reduce (and other modern conveniences like destructuring.)

It supports the existing :let, :while, and :when modifiers.

New is a stronger assertion that modifiers cannot come before binding
expressions. (Same semantics as let, i.e. left to right)

valid: [x coll :when (foo x)]
invalid: [:when (foo x) x coll]

This implementation does not suffer from the code explosion problem.
About 25 bytes of bytecode + 1 fn per binding.

Implementing this without destructuring was not a party, luckily reduce
is defined later in core.

Comment by Andy Fingerhut [ 26/Jun/14 12:25 AM ]

For anyone reviewing this patch, note that there are already many tests for correct functionality of doseq in file test/clojure/test_clojure/for.clj. It may not be immediately obvious, but every test for 'for' defined with deftest-both is a test for 'for' and also for 'doseq'.

Regarding the current implementation of doseq: it in't simply that it is too many bytes per binding, it is that the code size doubles with each additional binding. See these results, which measures size of the macroexpanded form rather than byte code size, but those two things should be fairly linearly related to each other here:

(defn formsize [form]
  (count (with-out-str (print (macroexpand form)))))

user=> (formsize '(doseq [x (range 10)] (print x)))
652
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
1960
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
4584
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
9947
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))
20997

Here are results for the same expressions after Ghadi's patch doseq.patch dated June 25 2014:

user=> (formsize '(doseq [x (range 10)] (print x)))
93
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
170
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
247
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
324
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))
401

It would be good to see some performance results with and without this patch, too.

Comment by Stuart Halloway [ 28/Jun/14 2:21 PM ]

In the tests below, the new impl is called "doseq2", vs. the original impl "doseq"

(def hund (into [] (range 100)))
(def ten (into [] (range 10)))
(def arr (int-array 100))
(def s "superduper")

;; big seq, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a (range 100000000)])))
;; 1.2 sec

(dotimes [_ 5]
  (time (doseq2 [a (range 100000000)])))
;; 1.8 sec

;; small unchunked reducible, few bindings: doseq2 wins
(dotimes [_ 5]
  (time (doseq [a s b s c s])))
;; 0.5 sec

(dotimes [_ 5]
  (time (doseq2 [a s b s c s])))
;; 0.2 sec

(dotimes [_ 5]
  (time (doseq [a arr b arr c arr])))
;; 40 msec

(dotimes [_ 5]
  (time (doseq2 [a arr b arr c arr])))
;; 8 msec

;; small chunked reducible, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a hund b hund c hund])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a hund b hund c hund])))
;; 8 msec

;; more bindings: doseq2 wins bigger and bigger
(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten ])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten ])))
;; 0.4 msec

(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten e ten])))
;; 18 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten e ten])))
;; 1 msec
Comment by Ghadi Shayban [ 28/Jun/14 6:23 PM ]

Hmm, I cannot reproduce your results.

I'm not sure whether you are testing with lein, on what platform, what jvm opts.

Can we test using this little harness instead directly against clojure.jar? I've attached a the harness and two runs of results (one w/ default heap, the other 3GB w/ G1GC)

I added a medium and small (range) too.

Anecdotally, I see doseq2 outperform in all cases except the small range. Using criterium shows a wider performance gap favoring doseq2.

I pasted the results side by side for easier viewing.

core/doseq                          doseq2
"Elapsed time: 1610.865146 msecs"   "Elapsed time: 2315.427573 msecs"
"Elapsed time: 2561.079069 msecs"   "Elapsed time: 2232.479584 msecs"
"Elapsed time: 2446.674237 msecs"   "Elapsed time: 2234.556301 msecs"
"Elapsed time: 2443.129809 msecs"   "Elapsed time: 2224.302855 msecs"
"Elapsed time: 2456.406103 msecs"   "Elapsed time: 2210.383112 msecs"

;; med range, few bindings:
core/doseq                          doseq2
"Elapsed time: 28.383197 msecs"     "Elapsed time: 31.676448 msecs"
"Elapsed time: 13.908323 msecs"     "Elapsed time: 11.136818 msecs"
"Elapsed time: 18.956345 msecs"     "Elapsed time: 11.137122 msecs"
"Elapsed time: 12.367901 msecs"     "Elapsed time: 11.049121 msecs"
"Elapsed time: 13.449006 msecs"     "Elapsed time: 11.141385 msecs"

;; small range, few bindings:
core/doseq                          doseq2
"Elapsed time: 0.386334 msecs"      "Elapsed time: 0.372388 msecs"
"Elapsed time: 0.10521 msecs"       "Elapsed time: 0.203328 msecs"
"Elapsed time: 0.083378 msecs"      "Elapsed time: 0.179116 msecs"
"Elapsed time: 0.097281 msecs"      "Elapsed time: 0.150563 msecs"
"Elapsed time: 0.095649 msecs"      "Elapsed time: 0.167609 msecs"

;; small unchunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 2.351466 msecs"      "Elapsed time: 2.749858 msecs"
"Elapsed time: 0.755616 msecs"      "Elapsed time: 0.80578 msecs"
"Elapsed time: 0.664072 msecs"      "Elapsed time: 0.661074 msecs"
"Elapsed time: 0.549186 msecs"      "Elapsed time: 0.712239 msecs"
"Elapsed time: 0.551442 msecs"      "Elapsed time: 0.518207 msecs"

core/doseq                          doseq2
"Elapsed time: 95.237101 msecs"     "Elapsed time: 55.3067 msecs"
"Elapsed time: 41.030972 msecs"     "Elapsed time: 30.817747 msecs"
"Elapsed time: 42.107288 msecs"     "Elapsed time: 19.535747 msecs"
"Elapsed time: 41.088291 msecs"     "Elapsed time: 4.099174 msecs"
"Elapsed time: 41.03616 msecs"      "Elapsed time: 4.084832 msecs"

;; small chunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 31.793603 msecs"     "Elapsed time: 40.082492 msecs"
"Elapsed time: 17.302798 msecs"     "Elapsed time: 28.286991 msecs"
"Elapsed time: 17.212189 msecs"     "Elapsed time: 14.897374 msecs"
"Elapsed time: 17.266534 msecs"     "Elapsed time: 10.248547 msecs"
"Elapsed time: 17.227381 msecs"     "Elapsed time: 10.022326 msecs"

;; more bindings:
core/doseq                          doseq2
"Elapsed time: 4.418727 msecs"      "Elapsed time: 2.685198 msecs"
"Elapsed time: 2.421063 msecs"      "Elapsed time: 2.384134 msecs"
"Elapsed time: 2.210393 msecs"      "Elapsed time: 2.341696 msecs"
"Elapsed time: 2.450744 msecs"      "Elapsed time: 2.339638 msecs"
"Elapsed time: 2.223919 msecs"      "Elapsed time: 2.372942 msecs"

core/doseq                          doseq2
"Elapsed time: 28.869393 msecs"     "Elapsed time: 2.997713 msecs"
"Elapsed time: 22.414038 msecs"     "Elapsed time: 1.807955 msecs"
"Elapsed time: 21.913959 msecs"     "Elapsed time: 1.870567 msecs"
"Elapsed time: 22.357315 msecs"     "Elapsed time: 1.904163 msecs"
"Elapsed time: 21.138915 msecs"     "Elapsed time: 1.694175 msecs"
Comment by Ghadi Shayban [ 28/Jun/14 6:47 PM ]

It's good that the benchmarks contain empty doseq bodies in order to isolate the overhead of traversal. However, that represents 0% of actual real-world code.

At least for the first benchmark (large chunked seq), adding in some tiny amount of work did not change results signifantly. Neither for (map str [a])

(range 10000000) =>  (map str [a])
core/doseq
"Elapsed time: 586.822389 msecs"
"Elapsed time: 563.640203 msecs"
"Elapsed time: 369.922975 msecs"
"Elapsed time: 366.164601 msecs"
"Elapsed time: 373.27327 msecs"
doseq2
"Elapsed time: 419.704021 msecs"
"Elapsed time: 371.065783 msecs"
"Elapsed time: 358.779231 msecs"
"Elapsed time: 363.874448 msecs"
"Elapsed time: 368.059586 msecs"

nor for intrisics like (inc a)

(range 10000000)
core/doseq
"Elapsed time: 317.091849 msecs"
"Elapsed time: 272.360988 msecs"
"Elapsed time: 215.501737 msecs"
"Elapsed time: 206.639181 msecs"
"Elapsed time: 206.883343 msecs"
doseq2
"Elapsed time: 241.475974 msecs"
"Elapsed time: 193.154832 msecs"
"Elapsed time: 198.757873 msecs"
"Elapsed time: 197.803042 msecs"
"Elapsed time: 200.603786 msecs"

I still see reduce-based doseq ahead of the original, except for small seqs

Comment by Ghadi Shayban [ 04/Aug/14 2:55 PM ]

A form like the following will not work with this patch:

(go (doseq [c chs] (>! c :foo)))

as the go macro doesn't traverse fn boundaries. The only such code I know is core.async/mapcat*, a private fn supporting a fn that is marked deprecated.

Comment by Ghadi Shayban [ 07/Aug/14 2:09 PM ]

I see #'clojure.core/run! was just added, which has a similar limitation

Comment by Rich Hickey [ 29/Aug/14 8:19 AM ]

Please consider Ghadi's feedback, esp re: closures.

Comment by Ghadi Shayban [ 22/Sep/14 4:36 PM ]

The current expansion of a doseq [1] under a go form is less than ideal due to the amount of control flow. 14 states in the state machine vs. 7 with loop/recur

[1] Comparison of macroexpansion of (go ... doseq) vs (go ... loop/recur)
https://gist.github.com/ghadishayban/639009900ce1933256a1





[CLJ-1161] sources jar has bad versions.properties resource Created: 11/Feb/13  Updated: 06/Oct/14

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: Release 1.8

Type: Defect Priority: Minor
Reporter: Steve Miner Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-CLJ-1161-Remove-version.properties-from-sources-JAR.patch    
Patch: Code
Approval: Incomplete

 Description   

The "sources" jar (at least since Clojure 1.4 and including 1.5 RC) has a bad version.properties file in it. The resource clojure/version.properties is literally:

version=${version}

The regular Clojure jar has the correct version string in that resource.

I came across a problem when I was experimenting with the sources jar (as used by IDEs). I naively added the sources jar to my classpath, and Clojure died on start up. The bad clojure/versions.properties file was found first, which led to a parse error as the clojure version was being set.

Solution: patch leaves version.properties file out of sources JAR, where it causes problems for tools.



 Comments   
Comment by Steve Miner [ 11/Feb/13 10:04 AM ]

Notes from the dev mailing list:

The "sources" JAR is generated by another Maven plugin, configured here:
https://github.com/clojure/clojure/blob/clojure-1.5.0-RC15/pom.xml#L169-L181

The simplest solution might be to just exclude the file from the sources jar. It looks like maven-source-plugin has an excludes option which would do the trick:

http://maven.apache.org/plugins/maven-source-plugin/jar-mojo.html#excludes

Comment by Jeff Valk [ 21/Apr/14 8:20 AM ]

This issue is marked closed, but I'm still seeing it: the clojure-1.6.0-sources.jar, clojure-1.5.1-sources.jar, etc on Maven Central still contain the bad version.properties files. More specifically, it looks like the fix has been applied to builds in the SNAPSHOTS repository, but not to RELEASES.

Fix applied: https://oss.sonatype.org/content/repositories/snapshots/org/clojure/clojure/
Not fixed: https://oss.sonatype.org/content/repositories/releases/org/clojure/clojure/

Comment by Alex Miller [ 24/Apr/14 4:15 PM ]

Not sure what's needed here, but marking incomplete.

Comment by Jeff Valk [ 25/Apr/14 11:13 AM ]

Would a fix for this update existing sources jars (1.5.1, 1.6.0, etc) on Central? Or would any fix have to wait on the next Clojure release?

Comment by Alex Miller [ 25/Apr/14 12:37 PM ]

For all the same reasons that mutable state is undesirable, changing an existing release jar in the central Maven repository is also undesirable. While it's not technically impossible, we will not update existing releases and this will need to wait for the next. I've looked at this problem a little and I do not yet know enough to know how to fix it or why it even varies between snapshot and release. Help welcome!

In which tool do you see a resulting problem from this?

Comment by Jeff Valk [ 25/Apr/14 11:56 PM ]

Despite the way I phrased the question, I'd hoped that would be the answer. It's the right policy.

Unfortunately, this issue leaves the released sources jars essentially unusable from a tools standpoint. CIDER now has source code navigation from stacktraces – you can jump into both Clojure and Java function definitions from the error/trace. For the latter, the sources jar (for Clojure or any other Java library) needs to be on the classpath as a dev dependency. There's more host interop support in the works for CIDER too ("embrace the host platform"), but not being able to add a dependency on a stable Clojure sources jar presents a wrinkle.

Are the official Clojure releases built by Hudson? The Hudson build right before the 1.6.0 release (#532) and the one right after (#534) both show the exclusion fix, as does the git clojure-1.6.0 tag, which when I check out and build from source, is fine. The Hudson builds with release tags (e.g. 1.6 = #533, 1.6-RC1 = #512, etc), though, don't show any artifacts other than a pom.xml. This would seem to make it rather hard to audit builds...am I missing something?





[CLJ-1152] PermGen leak in multimethods and protocol fns when evaled Created: 30/Jan/13  Updated: 06/Oct/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: Release 1.8

Type: Defect Priority: Critical
Reporter: Chouser Assignee: Unassigned
Resolution: Unresolved Votes: 9
Labels: memory, protocols

Attachments: File naive-lru-for-multimethods-and-protocols.diff     File protocol_multifn_weak_ref_cache.diff    
Patch: Code
Approval: Incomplete

 Description   

There is a PermGen memory leak that we have tracked down to protocol methods and multimethods called inside an eval, because of the caches these methods use. The problem only arises when the value being cached is an instance of a class (such as a function or reify) that was defined inside the eval. Thus extending IFn or dispatching a multimethod on an IFn are likely triggers.

Reproducing: The easiest way that I have found to test this is to set "-XX:MaxPermSize" to a reasonable value so you don't have to wait too long for the PermGen spaaaaace to fill up, and to use "-XX:+TraceClassLoading" and "-XX:+TraceClassUnloading" to see the classes being loaded and unloaded.

leiningen project.clj
(defproject permgen-scratch "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.0-RC1"]]
  :jvm-opts ["-XX:MaxPermSize=32M"
             "-XX:+TraceClassLoading"
             "-XX:+TraceClassUnloading"])

You can use lein swank 45678 and connect with slime in emacs via M-x slime-connect.

To monitor the PermGen usage, you can find the Java process to watch with "jps -lmvV" and then run "jstat -gcold <PROCESS_ID> 1s". According to the jstat docs, the first column (PC) is the "Current permanent space capacity (KB)" and the second column (PU) is the "Permanent space utilization (KB)". VisualVM is also a nice tool for monitoring this.

Multimethod leak

Evaluating the following code will run a loop that eval's (take* (fn foo [])).

multimethod leak
(defmulti take* (fn [a] (type a)))

(defmethod take* clojure.lang.Fn
  [a]
  '())

(def stop (atom false))
(def sleep-duration (atom 1000))

(defn run-loop []
  (when-not @stop
    (eval '(take* (fn foo [])))
    (Thread/sleep @sleep-duration)
    (recur)))

(future (run-loop))

(reset! sleep-duration 0)

In the lein swank session, you will see many lines like below listing the classes being created and loaded.

[Loaded user$eval15802$foo__15803 from __JVM_DefineClass__]
[Loaded user$eval15802 from __JVM_DefineClass__]

These lines will stop once the PermGen space fills up.

In the jstat monitoring, you'll see the amount of used PermGen space (PU) increase to the max and stay there.

-    PC       PU        OC          OU       YGC    FGC    FGCT     GCT
 31616.0  31552.7    365952.0         0.0      4     0    0.000    0.129
 32000.0  31914.0    365952.0         0.0      4     0    0.000    0.129
 32768.0  32635.5    365952.0         0.0      4     0    0.000    0.129
 32768.0  32767.6    365952.0      1872.0      5     1    0.000    0.177
 32768.0  32108.2    291008.0     23681.8      6     2    0.827    1.006
 32768.0  32470.4    291008.0     23681.8      6     2    0.827    1.006
 32768.0  32767.2    698880.0     24013.8      8     4    1.073    1.258
 32768.0  32767.2    698880.0     24013.8      8     4    1.073    1.258
 32768.0  32767.2    698880.0     24013.8      8     4    1.073    1.258

A workaround is to run prefer-method before the PermGen space is all used up, e.g.

(prefer-method take* clojure.lang.Fn java.lang.Object)

Then, when the used PermGen space is close to the max, in the lein swank session, you will see the classes created by the eval'ing being unloaded.

[Unloading class user$eval5950$foo__5951]
[Unloading class user$eval3814]
[Unloading class user$eval2902$foo__2903]
[Unloading class user$eval13414]

In the jstat monitoring, there will be a long pause when used PermGen space stays close to the max, and then it will drop down, and start increasing again when more eval'ing occurs.

-    PC       PU        OC          OU       YGC    FGC    FGCT     GCT
 32768.0  32767.9    159680.0     24573.4      6     2    0.167    0.391
 32768.0  32767.9    159680.0     24573.4      6     2    0.167    0.391
 32768.0  17891.3    283776.0     17243.9      6     2   50.589   50.813
 32768.0  18254.2    283776.0     17243.9      6     2   50.589   50.813

The defmulti defines a cache that uses the dispatch values as keys. Each eval call in the loop defines a new foo class which is then added to the cache when take* is called, preventing the class from ever being GCed.

The prefer-method workaround works because it calls clojure.lang.MultiFn.preferMethod, which calls the private MultiFn.resetCache method, which completely empties the cache.

Protocol leak

The leak with protocol methods similarly involves a cache. You see essentially the same behavior as the multimethod leak if you run the following code using protocols.

protocol leak
(defprotocol ITake (take* [a]))

(extend-type clojure.lang.Fn
  ITake
  (take* [this] '()))

(def stop (atom false))
(def sleep-duration (atom 1000))

(defn run-loop []
  (when-not @stop
    (eval '(take* (fn foo [])))
    (Thread/sleep @sleep-duration)
    (recur)))

(future (run-loop))

(reset! sleep-duration 0)

Again, the cache is in the take* method itself, using each new foo class as a key.

Workaround: A workaround is to run -reset-methods on the protocol before the PermGen space is all used up, e.g.

(-reset-methods ITake)

This works because -reset-methods replaces the cache with an empty MethodImplCache.

Patch: protocol_multifn_weak_ref_cache.diff

Screened by:



 Comments   
Comment by Chouser [ 30/Jan/13 9:10 AM ]

I think the most obvious solution would be to constrain the size of the cache. Adding an item to the cache is already not the fastest path, so a bit more work could be done to prevent the cache from growing indefinitely large.

That does raise the question of what criteria to use. Keep the first n entries? Keep the n most recently used (which would require bookkeeping in the fast cache-hit path)? Keep the n most recently added?

Comment by Jamie Stephens [ 18/Oct/13 9:35 AM ]

At a minimum, perhaps a switch to disable the caches – with obvious performance impact caveats.

Seems like expensive LRU logic is probably the way to go, but maybe don't have it kick in fully until some threshold is crossed.

Comment by Alex Miller [ 18/Oct/13 4:28 PM ]

A report seeing this in production from mailing list:
https://groups.google.com/forum/#!topic/clojure/_n3HipchjCc

Comment by Adrian Medina [ 10/Dec/13 11:43 AM ]

So this is why we've been running into PermGen space exceptions! This is a fairly critical bug for us - I'm making extensive use of multimethods in our codebase and this exception will creep in at runtime randomly.

Comment by Kevin Downey [ 17/Apr/14 9:52 PM ]

it might be better to split this in to two issues, because at a very abstract level the two issues are the "same", but concretely they are distinct (protocols don't really share code paths with multimethods), keeping them together in one issue seems like a recipe for a large hard to read patch

Comment by Kevin Downey [ 26/Jul/14 5:49 PM ]

naive-lru-method-cache-for-multimethods.diff replaces the methodCache in multimethods with a very naive lru cache built on PersistentHashMap and PersistentQueue

Comment by Kevin Downey [ 28/Jul/14 7:09 PM ]

naive-lru-for-multimethods-and-protocols.diff creates a new class clojure.lang.LRUCache that provides an lru cache built using PHashMap and PQueue behind an IPMap interface.

changes MultiFn to use an LRUCache for its method cache.

changes expand-method-impl-cache to use an LRUCache for MethodImplCache's map case

Comment by Kevin Downey [ 30/Jul/14 3:10 PM ]

I suspect my patch naive-lru-for-multimethods-and-protocols.diff is just wrong, unless MethodImplCache really is being used as a cache we can't just toss out entries when it gets full.

looking at the deftype code again, it does look like MethidImplCache is being used as a cache, so maybe the patch is fine

if I am sure of anything it is that I am unsure so hopefully someone who is sure can chime in

Comment by Nicola Mometto [ 31/Jul/14 11:02 AM ]

I haven't looked at your patch, but I can confirm that the MethodImplCache in the protocol function is just being used as a cache

Comment by dennis zhuang [ 08/Aug/14 6:21 AM ]

I developed a new patch that convert the methodCache in MultiFn to use WeakReference for dispatch value,and clear the cache if necessary.

I've test it with the code in ticket,and it looks fine.The classes will be unloaded when perm gen is almost all used up.

Comment by Alex Miller [ 22/Aug/14 4:55 PM ]

I don't know which to evaluate here. Does multifn_weak_method_cache.diff supersede naive-lru-for-multimethods-and-protocols.diff or are these alternate approaches both under consideration?

Comment by Kevin Downey [ 22/Aug/14 8:26 PM ]

the most straight forward thing, I think, is to consider them as alternatives, I am not a huge fan of weakrefs, but of course not using weakrefs we have to pick some bounding size for the cache, and the cache has a strong reference that could prevent a gc, so there are trade offs. My reasons to stay away from weak refs in general are using them ties the behavior of whatever you are building to the behavior of the gc pretty strongly. that may be considered a matter of personal taste

Comment by Andy Fingerhut [ 29/Aug/14 4:31 PM ]

All patches dated Aug 8 2014 and earlier no longer applied cleanly to latest master after some commits were made to Clojure on Aug 29, 2014. They did apply cleanly before that day.

I have not checked how easy or difficult it might be to update the patches.

Comment by Kevin Downey [ 29/Aug/14 7:00 PM ]

I've updated naive-lru-for-multimethods-and-protocols.diff to apply to the current master

Comment by Andy Fingerhut [ 29/Aug/14 7:34 PM ]

Thanks, Kevin. While JIRA allows multiple attachments to a ticket with the same filename but different contents, that can be confusing for people looking for a particular patch, and for a program I have that evaluates patches for things like whether they apply and build cleanly. Would you mind removing the older one, or in some other way making all the names unique?

Comment by Kevin Downey [ 29/Aug/14 8:43 PM ]

I deleted all of my attachments accept for my latest and greatest

Comment by dennis zhuang [ 30/Aug/14 9:51 AM ]

I updated multifn_weak_method_cache2.diff patch too.

I think using weak reference cache is better,because we have to keep one cache per multifn.When you have many multi-functions, there will be many LRU caches in memory,and they will consume too much memory and CPU for evictions. You can't choose a proper threshold for LRU cache in every environment.
But i don't have any benchmark data to support my opinion.

Comment by Alex Miller [ 10/Sep/14 2:38 PM ]

I'm going to set the LRU cache patch aside. I don't think it's possible to find a "correct" size for it and it seems weird to me to extend APersistentMap to build such a thing anyways.

I think it makes more sense to follow the same strategy used for other caches (such as the Keyword cache) - a combination ConcurrentHashMap with WeakReferences and a ReferenceQueue for clean-up. I don't see any compelling reason not to take the same path as other internal caches.

Comment by Alex Miller [ 10/Sep/14 3:44 PM ]

Stepping back a little to think about the problem.... our requirements are:
1) cache map of dispatch value (could be any Object) to multimethod function (IFn)
2) do we want keys to be compared based on equality or identity? identity-based opens up more reference-based caching options and is fine for most common dispatch types (Class, Keyword), but reduces (often eliminates?) cache hits for all other types where values are likely to be equiv but not identical (vector of strings for example)
3) concurrent access to cache
4) cache cannot grow without bound
5) cache cannot retain strong references to dispatch values (the cache keys) because the keys might be instances of classes that were loaded in another classloader which will prevent GC in permgen

multifn_weak_method_cache.diff uses a ConcurrentHashMap (#3) that maps RefWrapper around keys to IFn (#1). The patch uses Util.equals() (#2) for (Java) equality-based comparisons. The RefWrapper wraps them in WeakReferences to avoid #5. Cache clearing based on the ReferenceQueue is used to prevent #4.

A few things definitely need to be fixed:

  • Util.equals() should be Util.equiv()
  • methodCache and rq should be final
  • Why does RefWrapper have obj and expect rq to possibly be null?
  • RefWrapper fields should all be final
  • Whitespace errors in patch

Another idea entirely - instead of caching dispatch value, cache based on hasheq of dispatch value then equality check on value. Could then use WeakHashMap and no RefWrapper.

This patch does not cover the protocol cache. Is that just waiting for the multimethod case to look good?

Comment by dennis zhuang [ 10/Sep/14 7:18 PM ]

Hi, alex, thanks for your review.But the latest patch is multifn_weak_method_cache2.diff. I will update the patch soon by your review, but i have a few questions to be explained.

1) I will use Util.equiv() instead of Util.equals().But what's the difference of them?
2) When the RefWrapper is retained as key in ConcurrentHashMap, it wraps the obj in WeakReference.But when trying to find it in ConcurrentHashMap, it uses obj directly as strong reference, and create it with passing null ReferenceQueue.Please look at the multifn_weak_method_cache2.diff line number 112. It short, the patch stores the dispatch value as weak reference in cache,but uses strong reference for cache getting.

3) If caching dispatch value based on hasheq , can we avoid hasheq value conflicts? If two different dispatch value have a same hasheq( or why it doesn't happen?), we would be in trouble.

Sorry, the patch doesn't cover the protocol cache, i will add it ASAP.

Comment by dennis zhuang [ 11/Sep/14 2:02 AM ]

The new patch 'protocol_multifn_weak_ref_cache.diff' is uploaded.

1) Using Util.equiv() instead of Util.equals()
2) Moved the RefWrapper and it's associated methods to Util.java, and refactor the code based on alex's review.
3) Fixed whitespace errors.
4) Fixed PermGen leak in protocol fns.

Comment by Alex Miller [ 03/Oct/14 10:35 AM ]

I screened this ticket again with Brenton Ashworth and had the following comments:

1) We need to have a performance test to verify that we have not negatively impacted performance of multimethods or protocol invocation.
2) Because there are special cases around null keys in the multimethod cache, please verify that there are existing example tests using null dispatch values in the existing test coverage.
3) In Util$RefWrapper.getObj() - why does this return this.ref at the end? It was not clear to me that the comment was correct or that this was useful in any way.
4) In Util$RefWrapper.clearRefWrapCache() - can k == null in that if check? If not, can we omit that? Also, if you explicitly create the Iterator from the entry set, you can call .remove() on it more efficiently than calling .remove() on the cache itself.
5) In core_deftype / MethodImplCache, it appears that you are modifying a now-mutable field rather than the prior version that was going to great lengths to stay immutable. It's not clear to me what the implications of this change are and that concerns me. Can it use a different collection or code to stay immutable?
6) Please update the description of this ticket to include an approach section that describes the changes we are making.

Thanks!





[CLJ-787] transient blows up when passed a vector created by subvec Created: 03/May/11  Updated: 06/Oct/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Alexander Redington Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: None

Attachments: Text File CLJ-787-p1.patch    
Patch: Code and Test
Approval: Incomplete

 Description   

Subvectors created with subvec from a PersistentVector cannot be made transient:

user=> (transient (subvec [1 2 3 4] 2))
ClassCastException clojure.lang.APersistentVector$SubVector cannot be cast to clojure.lang.IEditableCollection  clojure.core/transient (core.clj:2864)

Cause: APersistentVector$SubVector does not implement IEditableCollection

Patch: CLJ-787-p1.patch

Approach: Create a TransientSubVector based on an underlying TransientVector.

Two assumptions:

  • It's okay for TransientSubVector to delegate the ensureEditable functionality to the underlying TransientVector (sometimes explicitly, sometimes implicitly) - calling ensureEditable explicitly also requires that the field for the underlying vector be the concrete TransientVector type rather than the ITransientVector interface.
  • When an operation that would throw an exception on a PersistentVector happens from the wrong thread (or after persistent!), we throw that exception rather than the IllegalAccessError that transients throw when accessed inappropriately.


 Comments   
Comment by Stuart Sierra [ 31/May/11 9:28 AM ]

Confirmed. APersistentVector$SubVector does not implement IEditableCollection.

The current implementation of TransientVector depends on implementation details of PersistentVector, so it is not a trivial fix. The simplest fix might be to implement IEditableCollection.asTransient in SubVector by creating a new PersistentVector, but I do not know the performance implications.

Comment by Gary Fredericks [ 25/May/13 8:11 PM ]

We could get the same performance characteristics as SubVector by creating a TransientSubVector based on an underlying TransientVector, right?

Preparing a patch to that effect.

Comment by Gary Fredericks [ 25/May/13 10:58 PM ]

Text from the commit msg:

Made two assumptions:

  • It's okay for TransientSubVector to delegate the ensureEditable
    functionality to the underlying TransientVector (sometimes
    explicitely, sometimes implicitely) – calling ensureEditable
    explicitely also requires that the field for the underlying vector
    be the concrete TransientVector type rather than the
    ITransientVector interface.
  • When an operation that would throw an exception on a
    PersistentVector happens from the wrong thread (or after
    persistent!), we throw that exception rather than the
    IllegalAccessError that transients throw when accessed
    inappropriately.
Comment by Alex Miller [ 11/Oct/13 4:17 PM ]

I think there are some assumptions being made in this patch about the class structure here that do not hold. The structure is, admittedly, quite twisty.

A counter-example that highlights one of a few subtypes of APersistentVector that are not PersistentVector (like MapEntry):

user=> (transient (subvec (first {:a 1}) 0 1))
ClassCastException clojure.lang.MapEntry cannot be cast to clojure.lang.IEditableCollection  clojure.lang.APersistentVector$TransientSubVector.<init> (APersistentVector.java:592)

PersistentVector.SubVector expects to work on anything that implements IPersistentVector. Note that this includes concrete types such as MapEntry and LazilyPersistentVector, but could also be any user-implemented type IPersistentVector type too. TransientSubVector is making the assumption that the IPersistentVector in a SubVector question is also an IEditableCollection (that can be converted to be transient). Note that while PersistentVector implements TransientVector (and IEditableCollection), APersistentVector does not. To really implement this in tandem with SubVector, I think you would need to guarantee that IPersistentVector extended IEditableCollection and I don't think that's something we want to do.

I don't see an easy solution. Any time I see all these modifiers (Transient, Sub, etc) being created in different combinations, it is a clear sign that independent kinds of functionality are being remixed into single inheritance OO trees. You can see the same thing in most collection libraries (even Java's - need a ConcurrentIdentitySortedMap? too bad!).

Needs more thought.

Comment by Andy Fingerhut [ 08/Nov/13 10:17 AM ]

Patch CLJ-787-p1.patch no longer applies cleanly to latest master, but it is only because of some new tests added to the transients.clj file since the patch was created, so it is trivial to update in that sense. Not updating it now due to other more significant issues with the patch described above.

Comment by Alex Miller [ 17/Jan/14 10:19 AM ]

No good solution to consider right now, removing from 1.6.





[ASYNC-102] implement IDeref, IBlockingDeref for channels Created: 05/Nov/14  Updated: 10/Nov/14

Status: Open
Project: core.async
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File async-102.patch    
Patch: Code and Test
Approval: Incomplete

 Description   
  • each deref will take a value from the channel
  • deref from a closed channel returns nil
  • deref with timeout must either timeout or take value from the channel (not both!). Might just want to call some variant of alt!! through the var
  • use internals efficiently
  • do NOT implement realized, neither interpretation is that great


 Comments   
Comment by Stuart Halloway [ 09/Nov/14 8:24 AM ]

The async-102.patch does not match the description "a channel becomes realized when it is closed" – instead, a channel becomes realized when a value is currently available. Both of these interpretations seem problematic to me, and I think the definition of realized? needs clarification to support this new case. The other uses of realized? in Clojure all guarantee that subsequent derefs will be filled immediately, but that will not be true for channels if another consumer takes the value.

Once that is solved, I need better doc strings for the core.async implementation protocols before I can screen this. What guarantees are made about use of threads? Where does the work enqueue, and what happens if the queue is full?

Comment by Alex Miller [ 09/Nov/14 10:32 PM ]

My implementation of realized? is based on Rich's response to your question (from internal chat): "open+empty = false, else true" and I believe that is what's implemented and reflected in the tests. This also makes more sense to me than only being realized on close.

Your comment vs other cases of realized? seems accurate, so I agree that's a question.

Can you be more specific on which implementation protocol? I'm guessing you specifically mean Channel and Buffer. Any thread could be calling into the Channel. The M2MC protects its internal state with a mutex and will forward calls down to the Buffer. All calls into the buffer are protected by the channel mutex. M2MC enqueues pending puts and takes in the puts and takes lists. Those are bounded by the fixed limit clojure.core.async.impl.protocols/MAX-QUEUE-SIZE (1024), at which point an exception is thrown.





Generated at Wed Nov 26 05:21:06 CST 2014 using JIRA 4.4#649-r158309.