<< Back to previous view

[CLJ-1184] Evaling #{do ...} or [do ...] is treated as the do special form Created: 16/Mar/13  Updated: 12/Apr/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.6

Type: Defect Priority: Trivial
Reporter: Jiří Maršík Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: Compiler, bug

Approval: Vetted

 Description   

Evaluating a persistent collection for which the function 'first' returns the symbol 'do' leads to that collection being treated as the do special form, even though it may be a vector or even a set. IMHO, the expected result would be to report that 'do' cannot be resolved. For the cause, see the if condition on line 6604 of Compiler.java in clojure-1.5.1.

E.g.:
[do 1 2]
;=> 2

#{"hello" "goodbye" do}
;=> "hello"
; Wat?






[CLJ-1182] Regexp are never equal Created: 12/Mar/13  Updated: 13/Apr/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Christian Fortin Assignee: Omer Iqbal
Resolution: Unresolved Votes: 0
Labels: bug

Attachments: File fix-CLJ-1182.diff    
Patch: Code
Approval: Triaged

 Description   

The following (= #"asd" #"asd") will return false in CLJ, even if they are the same.

Consequently, everything with a regexp in it (lists, vectors, maps...) will never be equal.

It seems to have been fixed for CLJS, but not for CLJ.
https://github.com/clojure/clojurescript/commit/e35c3a57472fa62ae41591418a73794dc8ac6dde



 Comments   
Comment by Omer Iqbal [ 12/Mar/13 4:08 PM ]

added an implementation for the equiv method if both args are java.util.regex.Pattern

Comment by Andy Fingerhut [ 12/Mar/13 5:54 PM ]

Omer, could you also include in your patch a few test cases? At least one that validates that two regex's that should be equal, are equal, and another that two regex's that should be different, are non-equal. Preferably the first of those tests should fail with the existing Clojure code, and pass with your changes.

Comment by Omer Iqbal [ 13/Mar/13 5:39 AM ]

I updated the patch with some tests. Hope I added them in the correct file. I also changed the implementation a bit, by instead of adding another implementation of equiv with a different signature, checking the type of the Object in the equiv method with signature (Object k1, Object k2).
For the sake of consistency I also added an EquivPred implementation, though I'm not entirely sure when its used.

Comment by Andy Fingerhut [ 13/Mar/13 1:04 PM ]

All comments here refer to the patch named fix-CLJ-1182.diff dated Mar 13, 2013.

The location for the new tests looks reasonable. However, note that your new patch has your old changes plus some new ones, not just the new ones. In particular, the new signature for equiv is still in your latest patch. You should start from a clean pull of the latest Clojure master and make only the changes you want when creating a patch, not build on top of previous changes you have made.

Also, there are several whitespace-only changes in your patch that should not be included.

Comment by Omer Iqbal [ 13/Mar/13 1:39 PM ]

I uploaded a clean patch, removing the whitespace diff and only adding the latest changes. Thanks for clarifying the workflow!
Just to clarify, this refers to the patch named fix-CLJ-1182.diff dated Mar 13 1:34 PM

Comment by Stuart Halloway [ 29/Mar/13 5:46 AM ]

I am recategorizing this as an enhancement, because if this is a bug it is a bug in Java – the Java Patterns class documents being immutable, but apparently does not implement .equals.

Other recent "make Clojure more complicated to work around problems in Java" patches have been rejected, and I suspect this one will be too, because it might impact the performance of equality everywhere.

Comment by Stuart Halloway [ 12/Apr/13 9:04 AM ]

At first pass, Rich and I both believe that, as regex equality is undecidable, that Java made the right choice in using identity for equality, that this ticket should be declined, and the ClojureScript should revert to match.

But leaving this ticket open for now so that ClojureScript dev can weigh in.

Comment by Michael Drogalis [ 12/Apr/13 9:32 AM ]

What do you mean when you say "undecidable"?

Comment by Alexander Redington [ 12/Apr/13 10:03 AM ]

If Regex instances were interned by the reader, but still used identity for equality, then code like

(= #"abc" #"abc")

would return true, but it wouldn't diverge in the definition of equality for regular expressions between Java and Clojure.

Comment by Fogus [ 12/Apr/13 10:13 AM ]

Undecidable means that for any given regular expression, there is no single way to write it. For example #"[a]" #"a" both match the same strings, but are they the same? Maybe. But how can we decide if /any/ given regular expression matches all of the same strings as any other? The answer is, you can't. Java does provide a Pattern#pattern method that returns the string that was used to build it, but I'm not sure if that could/should be used as a basis for equality given the potential perf hit.

Comment by Herwig Hochleitner [ 12/Apr/13 10:31 AM ]

I posted in Stu's thread: https://groups.google.com/d/topic/clojure-dev/OTPNJQbPtds/discussion
TL;DR: Disagree with undecidability, agree with reverting to identity based equality

Comment by Michael Drogalis [ 12/Apr/13 10:32 AM ]

That makes sense to me. Thanks Fogus.

Comment by Herwig Hochleitner [ 12/Apr/13 9:42 PM ]

From my post to the ml thread, it might not be entirely clear, why I don't think we should implement equality for host regexes:

It would involve parsing and would leave a lot of room for errors and platform-idiosycracies to leak. And it would be different for every platform.

As Alexander said, I also think this ticket could be resolved by interning regex literals, akin to keywords. That, however, would warrant a design page first, because there are tradeoffs to be made about how far the interning should go.

Comment by Rich Hickey [ 13/Apr/13 8:51 AM ]

Why are we spending time on this? Where is the problem statement? Does someone actually need this for a real world purpose not solved by using regex strings as keys?

Comment by Michael Drogalis [ 13/Apr/13 9:13 PM ]

It was prompted as a matter of consistency, which I think is valid. I can't think of a good reason to use regex's as keys though.





[CLJ-1179] distinct? does not accept zero arguments Created: 09/Mar/13  Updated: 12/Apr/13  Resolved: 12/Apr/13

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Jean Niklas L'orange Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: bug

Attachments: Text File clj-1179-distinct-zero-arguments.txt    
Patch: Code
Approval: Not Approved

 Description   

distinct? cannot be invoked with zero arguments. When using the pattern (apply distinct? x), this is bothersome as you have to check whether x is empty or not. It is also logical that distinct? should return true if no arguments are given, since there are no duplicates.

What (small set of) steps will reproduce the problem?

user=> (apply distinct? [])
ArityException Wrong number of args (0) passed to: core$distinct-QMARK-  clojure.lang.AFn.throwArity (AFn.java:437)

What is the expected output? What do you see instead?

I would expect distinct? to return true whenever given zero arguments.

What version are you using?

This was tested under Clojure 1.4 and Clojure 1.5.



 Comments   
Comment by Jean Niklas L'orange [ 09/Mar/13 6:08 AM ]

Attached the straightforward patch which solves this issue.





[CLJ-1177] clojure.java.io/resource and non-ASCII characters Created: 07/Mar/13  Updated: 10/Mar/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Trevor Wennblom Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: bug, enhancement

Attachments: Text File clj-1177-patch-v1.txt    
Patch: Code

 Description   

clojure.java.io/resource corrupts path containing UTF-8 characters without issuing warning.

user=> (System/getProperty "java.runtime.version")
"1.8.0-ea-b79"
user=> (clojure-version)
"1.5.0"
user=> (System/getProperty "user.dir")
"/dir/déf"
user=> (clojure.java.io/resource "myfile.txt")
#<URL file:/dir/d%c3%a9f/resources/myfile.txt>
user=> (slurp (clojure.java.io/resource "myfile.txt") :encoding "UTF-8")
FileNotFoundException /dir/déf/resources/myfile.txt (No such file or directory)  java.io.FileInputStream.open (FileInputStream.java:-2)


 Comments   
Comment by Andy Fingerhut [ 08/Mar/13 12:10 AM ]

Below is a workaround, at least. I don't know, but perhaps the as-file method for URLs in io.clj of Clojure, the part that converts %hh sequences to a character with code point in the range 0 through 255, is at least partly at fault here. I don't know right now if it is possible to modify that code to handle the general case of whatever character encoding munging is going on here to when .getResource creates the URL object.

clojure.java.io/resource is documented to return a Java object of type java.net.URL, which seems like it does %hh escaping of many characters. Reference [1] to a Java bug from 2001 where a Java user was surprised by the then-recent change in behavior of the getResource method [2].

Doing a little searching I found this StackOverflow question [3], which has what might be a workaround. I tried it on my Mac OS X 10.6 system running JDK 1.6 and it seemed to work:

(slurp (.getContent (clojure.java.io/resource "abcíd/foo.txt")))

That getContent is a method for class java.net.URL [4]

[1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4466485
[2] http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Class.html#getResource%28java.lang.String%29
[3] http://stackoverflow.com/questions/13013629/best-international-alternative-to-javas-getclass-getresource
[4] http://docs.oracle.com/javase/1.5.0/docs/api/java/net/URL.html#getContent%28%29

Comment by Trevor Wennblom [ 08/Mar/13 9:56 AM ]

Hi Andy,

Thanks for the background and suggestions, that's very helpful.

I'm gradually learning Clojure with no Java experience. In this case I was searching for the preferred Clojure way to access items in directories declared under :resource-paths in a Leiningen project.clj file. Perhaps clojure.java.io/resource isn't the best way to do this as it's possibly too tied to the expectation for a URI instead of a more general IRI.

You're suggested workaround did work for my use case:

(slurp (.getContent (clojure.java.io/resource "abcíd/foo.txt")))

but hopefully there would be more native/direct Clojure way to accomplish the same eventually.

I don't know if java.net.IDN would be useful internally as a fix in clojure.java.io/resource — I'm assuming not since it wasn't added until Java 6.[1]

user=> (import 'java.net.IDN)
java.net.IDN
user=> (java.net.IDN/toASCII "/dir/déf")
"xn--/dir/df-gya"
user=> (java.net.IDN/toUnicode "xn--/dir/df-gya")
"/dir/déf"

[1]: http://docs.oracle.com/javase/6/docs/api/java/net/IDN.html

Comment by Andy Fingerhut [ 08/Mar/13 1:30 PM ]

Patch clj-1177-patch-v1.txt dated Mar 8 2013 is an attempt to solve this issue, in what I think may be a correct way. As specified in RFC 3986, when taking a Unicode string and making a URL of it, it should be encoded in UTF-8 and then each individual byte is subject to the %HH hex encoding. This patch reverses that to turn URLs into file names.

Tested on Mac OS X 10.6 with a command line like this (it doesn't work without the -Dfile.encoding=UTF-8 option on my Mac, probably because the default encoding is MacRoman):

% java -cp clojure.jar:path/to/resource -Dfile.encoding=UTF-8 clojure.main
user=> (require '[clojure.java.io :as io])
nil
user=> (io/resource "abcíd/foo.txt")
#<URL file:/Users/jafinger/clj/clj-ns-browser/resource/abc%c3%add/foo.txt>
user=> (slurp (io/resource "abcíd/foo.txt"))
"The quick brown fox jumped over the lázy dög!\n"





[CLJ-1136] Type hinting for array classes does not work in binding forms Created: 20/Dec/12  Updated: 21/Dec/12

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Luke VanderHart Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: Compiler, bug
Environment:

replicated on OpenJDK 7u9 on Ubuntu 12.04, and Hotspot 1.6.0_37 on OSX Lion



 Description   

Type hints don't work as expected in binding forms.

The following form results in a reflection warning:

(let [^{:tag (Class/forName "[Ljava.lang.Object;")} a (make-array Object 2)]
(aget a 0))

However, hinting does appear to work correctly on vars:

(def ^{:tag (Class/forName "[Ljava.lang.Object;")} a (make-array Object 2))
(aget a 0) ;; no reflection warning



 Comments   
Comment by Ghadi Shayban [ 20/Dec/12 10:51 PM ]

It's a little more insidious than type hinting: the compiler doesn't evaluate metadata in the binding vec.

This doesn't throw the necessary exception...

(let [^{:foo (Class/forName "not real")} bar 42]
bar)

neither this...

(let [^{gyorgy ligeti} a 42]
a)

Gyorgy Ligeti never resolves.

These two equivalent examples don't reflect:
(let [^objects a (make-array Object 2)]
(aget a 0))

(let [a ^objects (make-array Object 2)]
(aget a 0))

Comment by Ghadi Shayban [ 21/Dec/12 11:09 AM ]

On only the left-hand side of a local binding, metadata on a symbol is not analyzed or evaluated.





[CLJ-1134] star-directive in clojure.pprint/cl-format with an at-prefix ("~n@*") do not obey its specifications Created: 18/Dec/12  Updated: 26/Dec/12

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Jean Niklas L'orange Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: bug, pprint

Attachments: Text File clj-1134-star-directive-in-cl-format.txt    
Patch: Code and Test

 Description   

The star-directive in clojure.pprint/cl-format with an at-prefix (~n@*) does not obey its specifications according to Common Lisp the Language, 2nd Edition. There are two bugs within ~n@* as of right now:

  1. When ~n@* is supposed to jump forward over more than one argument, it jumps one step backward as if it had seen ~:*. For instance, (cl-format nil "~D ~3@*~D" 0 1 2 3) will return "0 0" and not "0 3" as expected.
  2. When ~@* is seen, the formatter is supposed to jump to the first argument (as n defaults to 0, see specification linked above). However, whenever a ~@*-directive is seen, the formatter jumps to the second argument instead.

What (small set of) steps will reproduce the problem?

Inside a clean Clojure repl, perform these steps:

user=> (require '[clojure.pprint :refer [cl-format]])
nil
user=> (cl-format nil "~D ~3@*~D" 0 1 2 3)
"0 0"                                           ;; Expected: "0 3"
user=> (cl-format nil "~D~D~D~D ~@*~D" 0 1 2 3)
"0123 1"                                        ;; Expected: "0123 0"

What is the expected output? What do you see instead?

The expected output is "0 3" and "0123 0", but is "0 0" and "0123 1" as shown above.

What version are you using?

Tested on both 1.4.0 and 1.5.0-beta2, both have the defect described.

Please provide any additional information below.

The format strings which reproduces the problem has been compared with the format function from the Common Lisp implementations SBCL, CLisp and Clozure. All of them print the expected output.



 Comments   
Comment by Jean Niklas L'orange [ 18/Dec/12 9:28 PM ]

Patch attached.

It may be easier to read the changes the patch does from within JIRA instead from the commit message, so I've added it here:

This solves two issues as specified by #CLJ-1134. Issue #1 is solved by doing a
relative jump forward within absolute-reposition in cl_format.clj, line 114 by
switching (- (:pos navigator) position) with (- position (:pos navigator)).

Issue #2 is handled by changing the default n-parameter to * depending on
whether the @-prefix is placed or not. If it is placed, then n defaults to
0, otherwise it defaults to 1.

In addition, new tests have been appended to test_cl_format.clj to ensure the
correctness of this patch. The tests have been tested on the Common Lisp
implementation GNU CLISP 2.49, which presumably handle the ~n@*
correctly. This patch and GNU CLISP returns the same output for each format
call, sans case for printed symbols; Common Lisp has case-insensitive symbols,
whereas Clojure has not.





[CLJ-1087] clojure.data/diff uses set union on key seqs Created: 15/Oct/12  Updated: 15/Oct/12

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Tom Jack Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: bug, performance

Attachments: Text File clj-1087-diff-perf-enhance-patch-v1.txt    
Patch: Code

 Description   

clojure.data/diff, on line 118, defines:

java.util.Map
(diff-similar [a b]
  (diff-associative a b (set/union (keys a) (keys b))))

Since keys returns a key seq, this seems like an error. clojure.set/union has strange and inconsistent behavior with regard to non-sets, and in this case the two key seqs are concatenated. Based on a cursory benchmark, it seems that this bug is a slight performance gain when the maps have no common keys, and a significant performance loss when the maps have the same keys. The results are still correct because of the merging reduce in diff-associative.

The patch is easy (just call set on each key seq).



 Comments   
Comment by Andy Fingerhut [ 15/Oct/12 2:52 PM ]

clj-1087-diff-perf-enhance-patch-v1.txt dated Oct 15 2012 implements Tom's suggested performance enhancement, although not exactly in the way he suggested. It does calculate the union of the two key sequences.





[CLJ-1030] Misleading ClassCastException when coercing a String to int Created: 25/Jul/12  Updated: 06/Jan/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Philipp Meier Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: bug

Attachments: File improved-int-char-casting-error-messages.diff    
Patch: Code and Test

 Description   

Observed behaviour

(int "0") =>
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Character

Expected behaviour
(int "0") =>
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
or
IllegalArgumentException



 Comments   
Comment by Andy Fingerhut [ 24/Sep/12 11:20 AM ]

If someone wants to improve the behavior of int in this case, they should also consider similar improvements to the error messages for unchecked-int, char, and unchecked-char.

Comment by Michael Drogalis [ 06/Jan/13 8:45 PM ]

Patch improved-int-char-casting-error-messages.diff on January 6, 2013.





[CLJ-1028] (compile) crashes with NullPointerException if public function 'load' is defined Created: 20/Jul/12  Updated: 20/Jul/12  Resolved: 20/Jul/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Ben Kelly Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: Compiler, bug
Environment:

Linux, OpenJDK 1.6.0 64bit


Attachments: File stack-trace    

 Description   

When performing AOT compilation, if the namespace being compiled or one of the namespaces :required by it defines a public function named 'load', the compiler will crash with a NullPointerException.

The following code demonstrates this:

(ns ca.ancilla.kessler.core (:gen-class)) (defn load [x] x) (defn -main [& args] 0)

When run directly, with 'clojure -m ca.ancilla.kessler.core' or 'clojure ca/ancilla/kessler/core.clj', it runs as expected. When loaded with 'clojure -i' and (compile)d, however, or when automatically compiled by 'lein run', it results in a NullPointerException (the complete stack trace is attached).

This occurs whether or not 'load' or actually called. It does not, however, occur if 'load' is private.



 Comments   
Comment by Ben Kelly [ 20/Jul/12 12:43 PM ]

If you add (:refer-clojure :exclude [load]) to the (ns), it works fine:

(ns ca.ancilla.kessler.core (:refer-clojure :exclude [load]) (:gen-class))
(defn load [x] x)
(defn -main [& args] 0)

Thanks to llasram on IRC for discovering this.

Comment by Stuart Halloway [ 20/Jul/12 4:35 PM ]

You should not replace functions in clojure.core. This is left legal (with a loud CONSOLE warning) for compatibility, but programs that do it are in error.

Comment by Ben Kelly [ 20/Jul/12 10:06 PM ]

So, just to make sure that I have this right, then...

If I want to create a namespace with a public function that shares a name with a function in clojure.core, the only supported way of doing this is to (:refer-clojure :exclude [list of all such functions])?

If so, it would be nice if the warning were replaced with an error, rather than having the compiler emit an error and then crash.





[CLJ-1022] gen-class destroys method annotations Created: 03/Jul/12  Updated: 03/Jul/12

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Maris Orbidans Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: bug


 Description   

When extending a class gen-class doesn't preserve method annotations.

If class com.bar.Foo has annotated methods then in MyClass all annotations are gone.

(gen-class
:name com.my.MyClass
:extends com.bar.Foo
:implements [com.google.common.base.Supplier]
:prefix demo-
:post-init post-init)

(defn demo-post-init [this]
(info "initialized")
(swank.swank/start-server :port 68478))

(defn demo-get [_]
(get-msg))

Class<?> aClass = Class.forName("com.my.MyClass");
Method[] methods = aClass.getMethods();

for (Method m : methods) {
Annotation[] annotations = m.getAnnotations();
System.out.println(m.getName()+" "+annotations.length);
for (Annotation a : annotations) { System.out.println(a.annotationType().getClass().getName()); }
}






[CLJ-1021] (let [i 5] (defmacro m [v] v) m) interprets m as a function, not a macro Created: 02/Jul/12  Updated: 13/Sep/12

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Zii Prime Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: bug

Attachments: File 001-propagate-on-macro-meta.diff    
Patch: Code

 Description   

(let [i 5] (defmacro m [& args] args) (def x (m (inc 5) (inc 6) (inc 7))) x m (meta #'m))
is
[(8) #<user$eval522$m_523 user$eval522$m_523@11a74355> {:macro true, :ns #<Namespace user>, :name m, :arglists ([& args]), :line 1, :file "NO_SOURCE_PATH"}]

It appears to be interpreting m as a function despite the metadata. This behavior only shows up inside the (let ...).



 Comments   
Comment by Nicola Mometto [ 18/Aug/12 11:43 AM ]

The problem is the same as the one for http://dev.clojure.org/jira/browse/CLJ-918

The patch linked there was mine but i had no CA signed at that time, and no account on jira.

Here is the same patch in the correct format.

Bug #918 should be closed as this is a duplicate





[CLJ-1006] Quotient on bigdec may produce wrong result Created: 01/Jun/12  Updated: 01/Mar/13  Resolved: 09/Nov/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Major
Reporter: laurent joigny Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: bug
Environment:

Linux 3.2.0-24-generic #39-Ubuntu SMP i686 GNU/Linux


Attachments: Java Source File TestBigDecimalQuotient.java    

 Description   

Hi,

As discussed on the mailing list in the message "When arithmetic on a computer bite back" (01/jun)

There may be bug in the way quotient is implemented for bigdec.

user> (quot 1.4411518807585587E17 2) ;; correct with doubles
7.2057594037927936E16
user> (quot 1.4411518807585587E+17M 2) ;; wrong with BigDecs
72057594037927935M


Laurent



 Comments   
Comment by laurent joigny [ 01/Jun/12 5:48 PM ]

I can reproduce the bug when using BigDecimal constructor on String.
See attached file for a test class.

More infos :
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.1) (6b24-1.11.1-4ubuntu3)
OpenJDK Client VM (build 20.0-b12, mixed mode, sharing)

Comment by laurent joigny [ 01/Jun/12 5:49 PM ]

A simple test file, that you can drop in Clojure sources and execute to reproduce the bug on BigDecimal constructor using String as argument.

Comment by Tassilo Horn [ 03/Jun/12 4:30 AM ]

Seems to be a general precision problem. Note that in

user> (quot 1.4411518807585587E17 2) ;; correct with doubles
7.2057594037927936E16
user> (quot 1.4411518807585587E+17M 2) ;; wrong with BigDecs
72057594037927935M

the double result is actually wrong and the bigdec one is correct. The problem which lead to the wrong conclusion is that in your calculation the input number is already wrong.

So the moral is: don't use any floating points (neither doubles nor bigdecs) for computations involving divisibility tests.

For bigdecs, you can set the math context for making computations throw exceptions if they lose precision, though:

user> (binding [*math-context* (java.math.MathContext. 1 java.math.RoundingMode/UNNECESSARY)]
	       (quot (bigdec (Math/pow 2 58)) 2))
;Division impossible
;  [Thrown class java.lang.ArithmeticException]
Comment by Stuart Sierra [ 09/Nov/12 8:49 AM ]

Not a bug. Just floating-point arithmetic.





[CLJ-988] the locking in MultiFn.java (synchronized methods) can cause lots of contention in multithreaded programs Created: 08/May/12  Updated: 21/Sep/12  Resolved: 21/Sep/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.5

Type: Enhancement Priority: Major
Reporter: Kevin Downey Assignee: Stuart Sierra
Resolution: Completed Votes: 10
Labels: bug, performance

Attachments: Text File clj-988-tests-only-patch-v1.txt     File issue988-lockless-multifn+tests-120817.diff     File issue988-lockless-multifn+tests.diff    
Patch: Code and Test
Approval: Ok

 Description   

if you call a single multimethod a lot in multithreaded code you get lots of contention for the lock on the multimethod. this contention slows things down a lot.

this is due to getMethod being synchronized. it would be great if there was some fast non-locking path through the multimethod.



 Comments   
Comment by Kevin Downey [ 08/May/12 11:30 AM ]

http://groups.google.com/group/clojure-dev/browse_thread/thread/6a8219ae3d4cd0ae?hl=en

Comment by David Santiago [ 11/May/12 6:38 AM ]

Here's a stab in the dark attempt at rewriting MultiFn to use atoms to swap between immutable copies of its otherwise mutable state.

The four pieces of the MultiFn's state that are mutable and protected by locks are now contained in the MultiFn.State class, which is immutable and contains convenience functions for creating a new one with one field changed. An instance of this class is now held in an atom in the MultiFn called state. Changes to any of these four members are now done with an atomic swap of these State objects.

The getMethod/findAndCacheBestMethod complex was rewritten to better enable the atomic logic. findAndCacheBestMethod was replaced with findBestMethod, which merely looks up the method; the caching logic was moved to getMethod so that it can be retried easily as part of the work that method does.

As it was findAndCacheBestMethod seemingly had the potential to cause a stack overflow in a perfect storm of heavy concurrent modification, since it calls itself recursively if it finds that the hierarchy has changed while it has done its work. This logic is now done in the CAS looping of getMethod, so hopefully that is not even an unlikely possibility anymore.

There is still seemingly a small race here, since the check is done of a regular variable against the value in a ref. Now as before, the ref could be updated just after you do the check, but before the MultiFn's state is updated. Of course, only the method lookup part of a MultiFn call was synchronized before; it could already change after the lookup but before the method itself executed, having a stale method finish seemingly after the method had been changed. Things are no different now in general, with the atom-based approach, so perhaps this race is not a big deal, as a stale value can't persist for long.

The patch passes the tests and Clojure and multimethods seems to work.

Comment by Kevin Downey [ 12/May/12 8:59 PM ]

this patch gets rid of the ugly lock contention issues. I have not been able to benchmark it vs. the old multimethod implementation, but looking at it, I would not be surprised if it is faster when the system is in a steady state.

Comment by Stuart Halloway [ 08/Jun/12 12:11 PM ]

This looks straightforward, except for getMethod. Can somebody with context add more discussion of how method caching works?

Also, it would be great to have tests with this one.

Comment by David Santiago [ 15/Jun/12 4:44 AM ]

Obviously I didn't write the original code, so I'm not the ideal
person to explain this stuff. But I did work with it a bit recently,
so in the hopes that I can be helpful, I'm writing down my
understanding of the code as I worked with it. Since I came to the
code and sort of reverse engineered my way to this understanding,
hopefully laying this all out will make any mistakes or
misunderstandings I may have made easier to catch and correct. To
ensure some stability, I'll talk about the original MultiFn code as it
stands at this commit:
https://github.com/clojure/clojure/blob/8fda34e4c77cac079b711da59d5fe49b74605553/src/jvm/clojure/lang/MultiFn.java

There are four pieces of state that change as the multimethod is either
populated with methods or called.

  • methodTable: A persistent map from a dispatch value (Object) to
    a function (IFn). This is the obvious thing you think it is,
    determining which dispatch values call which function.
  • preferTable: A persistent map from a dispatch value (Object) to
    another value (Object), where the key "is preferred" to the value.
  • methodCache: A persistent map from a dispatch value (Object) to
    function (IFn). By default, the methodCache is assigned the same
    map value as the methodTable. If values are calculated out of the
    hierarchy during a dispatch, the originating value and the
    ultimate method it resolves to are inserted as additional items in
    the methodCache so that subsequent calls can jump right to the
    method without recursing down the hierarchy and preference table.
  • cachedHierarchy: An Object that refers to the hierarchy that is
    reflected in the latest cached values. It is used to check if the
    hierarchy has been updated since we last updated the cache. If it
    has been updated, then the cache is flushed.

I think reset(), addMethod(), removeMethod(), preferMethod(),
prefers(), isA(), dominates(), and resetCache() are extremely
straightforward in both the original code and the patch. In the
original code, the first four of those are synchronized, and the other
four are only called from methods that are synchronized (or from
methods only called from methods that are synchronized).

Invoking a multimethod through its invoke() function will call
getFn(). getFn() will call the getMethod() function, which is
synchronized. This means any call of the multimethod will wait for and
take a lock as part of method invocation. The goal of the patch in
this issue is to remove this lock on calls into the multimethod. It in
fact removes the locks on all operations, and instead keeps its
internal mutable state by atomically swapping a private immutable
State object held in an Atom called state.

The biggest change in the patch is to the
getFn()>getMethod()>findAndCacheBestMethod() complex from the
original code. I'll describe how that code works first.

In the original code, getFn() does nothing but bounce through
getMethod(). getMethod() tries three times to find a method to call,
after checking that the cache is up to date and flushing it if it
isn't:

1. It checks if there's a method for the dispatch value in the
methodCache.

2. If not, it calls findAndCacheBestMethod() on the
dispatch value. findAndCacheBestMethod() does the following:

1. It iterates through every map entry in the method table,
keeping at each step the best entry it has found so far
(according to the hierarchy and preference tables).

2. If it did not find a suitable entry, it returns null.

3. Otherwise, it checks if the hierarchy has been changed since the cache
was last updated. If it has not changed, it inserts the method
into the cache and returns it. If it has been changed, it
resets the cache and calls itself recursively to repeat the process.

3. Failing all else, it will return the method for the default
dispatch value.

Again, remember everything in the list above happens in a call to a
synchronized function. Also note that as it is currently written,
findAndCacheBestMethod() uses recursion for iteration in a way that
grows the stack. This seems unlikely to cause a stack overflow unless
the multimethod is getting its hierarchy updated very rapidly for a
sustained period while someone else tries to call it. Nonetheless, the
hierarchy is held in an IRef that is updated independently of the
locking of the MultiFn. Finally, note that the multimethod is locked
only while the method is being found. Once it is found, the lock is
released and the method actually gets called afterwards without any
synchronization, meaning that by the time the method actually
executes, the multimethod may have already been modified in a way that
suggests a different method should have been called. Presumably this
is intended, understood, and not a big deal.

Moving on now to the patch in this issue. As mentioned, the main
change is updating this entire apparatus to work with a single atomic
swap to control concurrency. This means that all updates to the
multimethod's state have to happen at one instant in time. Where the
original code could make multiple changes to the state at different
times, knowing it was safely protected by an exclusive lock, rewriting
for atom swaps requires us to reorganize the code so that all updates
to the state happen at the same time with a single CAS.

To implement this change, I pulled the implicit looping logic from
findAndCacheBestMethod() up into getMethod() itself, and broke the
"findBestMethod" part into its own function, findBestMethod(), which
makes no update to any state while implementing the same
logic. getMethod() now has an explicit loop to avoid stack-consuming
recursion on retries. This infinite for loop covers all of the logic
in getMethod() and retries until a method is successfully found and a
CAS operation succeeds, or we determine that the method cannot be
found and we return the default dispatch value's implementation.

I'll now describe the operation of the logic in the for loop. The
first two steps in the loop correspond to things getMethod() does
"before" its looping construct in the original code, but we have to do
in the loop to get the latest values.

1. First we dereference our state, and assign this value to both
oldState and newState. We also set a variable called needWrite to
false; this is so we can avoid doing a CAS (they're not free) when
we have not actually updated the state.

2. Check if the cache is stale, and flush it if so. If the cache
gets flushed, set needWrite to true, as the state has changed.

3. Check if the methodCache has an entry for this dispatch
value. If so, we are "finished" in the sense that we found the
value we wanted. However, we may need to update the state. So,

  • If needWrite is false, we can return without a CAS, so just
    break out of the loop and return the method.
  • Otherwise, we need to update the state object with a CAS. If
    the CAS is successful, break out of the loop and return the
    target function. Otherwise, continue on the next iteration
    of the loop, skipping any other attempts to fetch the method
    later in the loop (useless work, at this point).

4. The value was not in the methodCache, so call the function
findBestMethod() to attempt to find a suitable method based on the
hierarchy and preferences. If it does find us a suitable method,
we now need to cache it ourselves. We create a new state object
with the new method cache and attempt to update the state atom
with a CAS (we definitely need a write here, so no need to check
needWrite at this point).

The one thing that is possibly questionable is the check at this
point to make sure the hierarchy has not been updated since the
beginning of this method. I inserted this here to match the
identical check at the corresponding point in
findAndCacheBestMethod() in the original code. That is also a
second check, since the cache is originally checked for freshness
at the very beginning of getMethod() in the original code. That
initial check happens at the beginning of the loop in the
patch. Given that there is no synchronization with the method
hierarchy, it is not clear to me that this second check is needed,
since we are already proceeding with a snapshot from the beginning
of the loop. Nonetheless, it can't hurt as far as I can tell, it
is how the original code worked, and I assume there was some
reason for that, so I kept the second check.

5. Finally, if findBestMethod() failed to find us a method for the
dispatch value, find the method for the default dispatch value and
return that by breaking out of the loop.

So the organization of getMethod() in the patch is complicated by two
factors: (1) making the retry looping explicit and stackless, (2)
skipping the CAS when we don't need to update state, and (3) skipping
needless work later in the retry loop if we find a value but are
unable to succeed in our attempt to CAS. Invoking a multimethod that
has a stable hierarchy and a populated cache should not even have a
CAS operation (or memory allocation) on this code path, just a cache
lookup after the dispatch value is calculated.

Comment by David Santiago [ 15/Jun/12 4:45 AM ]

I've updated this patch (removing the old version, which is entirely superseded by this one). The actual changes to MultiFn.java are identical (modulo any thing that came through in the rebase), but this newer patch has tests of basic multimethod usage, including defmulti, defmethod, remove-method, prefer-method and usage of these in a hierarchy that updates in a way interleaved with calls.

Comment by David Santiago [ 15/Jun/12 6:38 AM ]

I created a really, really simple benchmark to make sure this was an improvement. The following tests were on a quad-core hyper-threaded 2012 MBP.

With two threads contending for a simple multimethod:
The lock-based multifns run at an average of 606ms, with about 12% user, 15% system CPU at around 150%.
The lockless multifns run at an average of 159ms, with about 25% user, 3% system CPU at around 195%.

With four threads contending for a simple multimethod:
The lock-based multifns run at an average of 1.2s, with about 12% user, 15% system, CPU at around 150%.
The lockless multifns run at an average of 219ms, with about 50% user, 4% system, CPU at around 330%.

You can get the code at https://github.com/davidsantiago/multifn-perf

Comment by David Santiago [ 14/Aug/12 10:02 PM ]

It's been a couple of months, and so I just wanted to check in and see if there was anything else needed to move this along.

Also, Alan Malloy pointed out to me that my benchmarks above did not mention single-threaded performance. I actually wrote this into the tests above, but I neglected to report them at the time. Here are the results on the same machine as above (multithreaded versions are basically the same as the previous comment).

With a single thread performing the same work:
The lock-based multifns run at an average of 142ms.
The lockless multifns run at an average of 115ms.

So the lockless multimethods are still faster even in a single-threaded case, although the speedup is more modest compared to the speedups in the multithreaded cases above. This is not surprising, but it is good to know.

Comment by Stuart Sierra [ 17/Aug/12 2:58 PM ]

Screened. The approach is sound.

I can confirm similar performance measurements using David Santiago's benchmark, compared with Clojure 1.5 master as of commit f5f4faf.

Mean runtime (ms) of a multimethod when called repeatedly from N threads:

|            | N=1 | N=2 | N=4 |
|------------+-----+-----+-----|
| 1.5 master |  80 | 302 | 765 |
| lockless   |  63 |  88 | 125 |

My only concern is that the extra allocations of the State class will create more garbage, but this is probably not significant if we are already using persistent maps. It would be interesting to compare this implementation with one using thread-safe mutable data structures (e.g. ConcurrentHashMap) for the cache.

Comment by David Santiago [ 17/Aug/12 7:05 PM ]

I think your assessment that it's not significant compared to the current implementation using persistent maps is correct. Regarding the extra garbage, note that the new State is only created when the hierarchy has changed or there's a cache miss (related, obviously)... situations where you're already screwed. Then it won't have to happen again for the same method (until another change to the multimethod). So for most code, it won't happen very often.

ConcurrentHashMap might be faster, it'd be interesting to see. My instinct is to keep it as close to being "made out of Clojure" as possible. In fact, it's hard to see why this couldn't be rewritten in Clojure itself some day, as I believe Chas Emerick has suggested. Also, I would point out that two of the three maps are used from the Clojure side in core.clj. I assume they would be happier if they were persistent maps.

Funny story: I was going to point out the parts of the code that were called from the clojure side just now, and alarmingly cannot find two of the functions. I think I must have misplaced them while rewriting the state into an immutable object. Going to attach a new patch with the fix and some tests for it in a few minutes.

Comment by David Santiago [ 17/Aug/12 7:44 PM ]

Latest patch for this issue. Supersedes issue988-lockless-multifn+tests.diff as of 08/17/2012.

Comment by David Santiago [ 17/Aug/12 7:49 PM ]

As promised, I reimplemented those two functions. I also added more multimethod tests to the test suite. The new tests should definitely prevent a similar mistake. While I was at it, I went through core.clj and found all the multimethod API functions I could and ensured that there were at least some basic functionality tests for all of them. The ones I found were: defmulti, defmethod, remove-all-methods, remove-method, prefer-method, methods, get-method, prefers (Some of those already had tests from the earlier version of the patch).

Really sorry about catching this right after you vetted the patch. 12771 test assertions were apparently not affected by prefers and methods ceasing to function, but now there are 12780 to help prevent a similar error. Since you just went through it, I'm leaving the older version of the patch up so you can easily see the difference to what I've added.

Comment by Rich Hickey [ 15/Sep/12 9:05 AM ]

https://github.com/clojure/clojure/commit/83ebf814d5d6663c49c1b2d0d076b57638bff673 should fix these issues. The patch here was too much of a change to properly vet.

If you could though, I'd appreciate a patch with just the multimethod tests.

Comment by Andy Fingerhut [ 15/Sep/12 10:59 AM ]

Patch clj-988-tests-only-patch-v1.txt dated Sep 15 2012 is a subset of David Santiago's
patch issue988-lockless-multifn+tests-120817.diff dated Aug 17 2012. It includes only the tests from that patch. Applies cleanly and passes tests with latest master after Rich's read/write lock change for multimethods was committed.

Comment by Rich Hickey [ 17/Sep/12 9:20 AM ]

tests-only patch ok





[CLJ-983] proxy-super does not restore original binding if call throws exception Created: 05/May/12  Updated: 05/May/12

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.1, Release 1.2, Release 1.3
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Valentin Mahrwald Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: bug

Attachments: Text File proxy_super.patch    

 Description   

The code for proxy-call-with-super internally used by proxy-super does not handle exceptions from the call method:

(defn proxy-call-with-super [call this meth]
(let [m (proxy-mappings this)]
(update-proxy this (assoc m meth nil))
(let [ret (call)]
(update-proxy this m)
ret)))

As a result the following sequence behaves unexpectedly:
(def obj (proxy [java.lang.ClassLoader] []
(loadClass [cl] (println "enter")
(proxy-super loadClass cl))))
(.loadClass obj "nonexistent")
;; prints enter, then throws ClassNotFoundException
(.loadClass obj "nonexistent")
;; does not print anything before throwing cnfe

A try-finally block around the invocation would solve this particular issue.



 Comments   
Comment by Valentin Mahrwald [ 05/May/12 9:04 AM ]

Test & fix patch on latest Clojure github branch.





[CLJ-971] Jar within a jar throws a runtime error Created: 10/Apr/12  Updated: 10/Apr/12

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2, Release 1.3
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Ron Romero Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: bug
Environment:

Maven using the one-jar plugin



 Description   

I've created two jar files in my multi-project Maven setup. The first jar is the "engine", and it includes the clojure jar in it. The other jar is the "application". It includes the engine and then packages itself into a one-jar jar file. This means we have a jar within a jar: The "onejar" contains the engine jar, which in turn contains that clojure jar.

I then get an error in the runtime:

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at com.simontuffs.onejar.Boot.run(Boot.java:340)
at com.simontuffs.onejar.Boot.main(Boot.java:166)
Caused by: java.lang.ExceptionInInitializerError
at com.ziroby.clojure.App.main(App.java:14)
... 6 more
Caused by: java.lang.NullPointerException
at clojure.lang.RT.lastModified(RT.java:374)
at clojure.lang.RT.load(RT.java:408)
at clojure.lang.RT.load(RT.java:398)
at clojure.lang.RT.doInit(RT.java:434)
at clojure.lang.RT.<clinit>(RT.java:316)
... 7 more

See also my Stack Overflow question on this at http://stackoverflow.com/questions/7763480/making-an-executable-jar-that-evals-clojure-strings

In researching it, I've found the problem lies in RT.lastModified, where it tries to determine last modified time by looking at the modified time on the jar file for Clojure. But there's not actually a jar file, since it's embedded in another.

I've found that adding a null check solves the problem. My lastModified looks like this now:

static public long lastModified(URL url, String libfile) throws Exception{
if(url.getProtocol().equals("jar")) { ZipEntry entry = ((JarURLConnection) url.openConnection()).getJarFile().getEntry(libfile); if (entry != null) return entry.getTime(); }

return url.openConnection().getLastModified();
}

This runs successfully.

If you'd prefer, I can submit a patch, or commit directly.






Generated at Wed May 22 01:46:54 CDT 2013 using JIRA 4.4#649-r158309.