<< Back to previous view

[CLJ-1330] Class name clash between top-level functions and defn'ed ones Created: 22/Jan/14  Updated: 13/Sep/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Nicola Mometto Assignee: Alex Miller
Resolution: Unresolved Votes: 7
Labels: aot, compiler

Attachments: Text File 0001-CLJ-1093-v2.patch     Text File 0001-Fix-CLJ-1330-make-top-level-named-functions-classnam.patch     Text File 0001-Fix-CLJ-1330-refactored.patch     File demo1.clj    
Patch: Code
Approval: Vetted


Named anonymous fn's are not guaranteed to have unique class names when AOT-compiled.

For example:

(defn g [])
(def xx (fn g []))

When AOT-compiled both functions will emit user$g.class, the latter overwriting the former.

Demonstration script: demo1.clj

Patch: 0001-CLJ-1093-v2.patch

Approach: Generate unique class names for named fn's the same way as for unnamed anonymous fn's.

See also: This patch also fixes the issue reported in CLJ-1227.

Comment by Ambrose Bonnaire-Sergeant [ 22/Jan/14 11:12 AM ]

This seems like the reason why jvm.tools.analyzer cannot analyze clojure.core. On analyzing a definline, there is an "attempted duplicate class definition" error.

This doesn't really matter, but I thought it may or may not be useful information to someone.

Comment by Nicola Mometto [ 22/Jan/14 11:35 AM ]

Attached a fix.

This also fixes AOT compiling of code like:

(def x (fn foo []))
(fn foo [])
Comment by Nicola Mometto [ 22/Jan/14 11:39 AM ]

Cleaned up patch

Comment by Alex Miller [ 22/Jan/14 12:43 PM ]

It looks like the patch changes indentation of some of the code - can you fix that?

Comment by Nicola Mometto [ 22/Jan/14 3:57 PM ]

Updated patch without whitespace changes

Comment by Alex Miller [ 22/Jan/14 4:15 PM ]

Thanks, that's helpful.

Comment by Alex Miller [ 24/Jan/14 10:03 AM ]

There is consensus that this is a problem, however this is an area of the code with broad impacts as it deals with how classes are named. To that end, there is some work that needs to be done in understanding the impacts before we can consider it.

Some questions we would like to answer:

1) According to Rich, naming of (fn x []) function classes used to work in the manner of this patch - with generated names. Some code archaeology needs to be done on why that was changed and whether the change to the current behavior was addressing problems that we are likely to run into.

2) Are there issues with recursive functions? Are there impacts either in AOT or non-AOT use cases? Need some tests.

3) Are there issues with dynamic redefinition of functions? With the static naming scheme, redefinition causes a new class of the same name which can be picked up by reload of classes compiled to the old definition. With the dynamic naming scheme, redefinition will create a differently named class so old classes can never pick up a redefinition. Is this a problem? What are the impacts with and without AOT? Need some tests.

Comment by Nicola Mometto [ 24/Jan/14 11:39 AM ]

Looks like the current behaviour has been such since https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28

My guess is that Rich simply forgot to consider the (def y (fn x [] ..)) case.

Regarding 2 and 3, the dynamic naming scheme is no different than what happens for anonymous functions so I don't see how this could cause any issue.

Recursion on the fn arg is simply a call to .invoke on "this", it's classname unaware.

I can add some tests to test that

(def y (fn x [] 1))
(fn x [] 2)
compile to different classnames but other than that I don't see what should be tested.

Comment by Stuart Halloway [ 27/Jun/14 2:17 PM ]

incomplete pending the answers to Alex Miller's questions in the comments

Comment by Nicola Mometto [ 27/Jun/14 3:20 PM ]

I believe I already answered his questions, I'll try to be a bit more explicit:
I tracked the relevant commit from Rich which added the dynamic naming behaviour https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28#diff-f17f860d14163523f1e1308ece478ddbL3081 which clearly shows that this bug was present since then so.

Regarding redefinitions or recursive functions, both of those operations never take in account the generated fn name so they are unaffected.

Comment by Alex Miller [ 12/Sep/14 4:32 PM ]

Summarizing some cases here from before/after the patch:

1) top-level fn (always has name)
	1.6 - namespace$name
	patch - namespace$name
2) non-top-level fn with name
	1.6 - namespace$name (collides with prior case)
	patch - namespace$topname__x__name  	<-- CHANGED
3) anonymous fn (no name)
	1.6 - namespace$name$fn__x
	patch - namespace$name$fn__x
4) top-level anonymous fn (no name, not at all useful :)
	1.6 - namespace$fn__x
	patch - namespace$fn__x

The key problem is that the first 2 cases produce the identical class name on 1.6. The patch alters the non-top-level named fn so there is no conflict.

Prior to the referenced old commit, I believe cases 1 and 2 would both produce namespace$name__x (where x is unique) so they would not collide. The change was made to prevent the top-level name from changing ("don't append numbers on top-level fn class names"). While the similar change was made on non-top-level fn names, I do not think it needed to be.

I've thought through (and tried) a bunch of the implications of this with the help of Nicola's comments above and I do not see an issue with any of the things I've considered. From a binary compatibility point of view with existing AOT code, old code compiled together should be self-consistent and continue to work. New compiled code will also be consistent. I can't think of a way that new code would expect to know the old name of a non-top-level function such that there could be an issue.

One question - why change the code such that the new class name is namespace$name$topname__x__name instead of namespace$name$topname_name__x (or something else?). And relatedly, while the diff is small, could we refactor a couple more lines to make the intent and cases clearer?

I am 90% ok with this patch but want a little thought into that question before I mark screened.

Comment by Nicola Mometto [ 12/Sep/14 4:47 PM ]

Alex, the attached patch munges into ns$topname__name__x, not into ns$topname__x__name.

Comment by Nicola Mometto [ 12/Sep/14 5:22 PM ]

The attached patch 0001-Fix-CLJ-1330refactored.patch contains the same fix from 0001-FixCLJ-1330-make-top-level-named-functions-classnam.patch but also refactors the code that deals with fn name munging

Comment by Alex Miller [ 12/Sep/14 6:22 PM ]

Hmmm.. I will double-check. That's not why I recall seeing when I did AOT.

Comment by Nicola Mometto [ 12/Sep/14 7:26 PM ]

New patch 0001-CLJ-1093-v2.patch improves the fn naming scheme a lot.
I've threw together a number of test cases that show the improvement + bug fixes:

pre patch:

Clojure 1.7.0-master-SNAPSHOT
user=> (fn [])
#<user$eval1$fn__2 user$eval1$fn__2@4e13aa4e>
user=> (fn a [])
#<user$eval5$a__6 user$eval5$a__6@6946a317>
user=> (let [a (fn [])] a)
#<user$eval9$a__10 user$eval9$a__10@15fdf894>
user=> (let [a (fn a [])] a)
#<user$eval13$a__14 user$eval13$a__14@1d87a604>
user=> (let [a (fn x [])] a)
#<user$eval17$x__18 user$eval17$x__18@7f0cd67f>
user=> (def a (fn [])) a
#<user$a user$a@33e1ccbc>
user=> (def a (fn x [])) a
#<user$x user$x@59a04a1b> ;; <== BUG!!
user=> (def ^{:foo (fn [])} a) (-> (meta #'a) :foo)
#<user$fn__23 user$fn__23@d9c21c6>
user=> (def ^{:foo (fn a [])} a) (-> (meta #'a) :foo)
#<user$a user$a@420dd874> ;; <== BUG!!
user=> (def a (fn [] (fn []))) (a)
#<user$a$fn__30 user$a$fn__30@6f57be76>
user=> (def a (fn [] (fn x []))) (a)
#<user$a$x__35 user$a$x__35@79930089>
user=> (let [x (fn [])] (def a (fn [] x))) a (a)
#<user$eval40$a__43 user$eval40$a__43@6db1694e>
#<user$eval40$x__41 user$eval40$x__41@20bd16bb>
user=> (let [x (fn a [])] (def a (fn [] x))) (a)
#<user$eval48$a__49 user$eval48$a__49@75d6d1d4> ;; <== the local binding name doesn't appear in the class name
user=> (let [x (fn x [])] (def a (fn [] x))) (a)
#<user$eval56$x__57 user$eval56$x__57@16d81c91>

post patch:

Clojure 1.7.0-master-SNAPSHOT
user=> (fn [])
#<user$eval1$fn__3 user$eval1$fn__3@3c92218c>
user=> (fn a [])
#<user$eval6$a__8 user$eval6$a__8@6f85c59c>
user=> (let [a (fn [])] a)
#<user$eval11$a__13 user$eval11$a__13@4d051922>
user=> (let [a (fn a [])] a)
#<user$eval16$a__a__18 user$eval16$a__a__18@138a92e7>
user=> (let [a (fn x [])] a)
#<user$eval21$a__x__23 user$eval21$a__x__23@528ef256>
user=> (def a (fn [])) a
#<user$a user$a@6bef63f9>
user=> (def a (fn x [])) a
#<user$a__x__28 user$a__x__28@5f0bebef>
user=> (def ^{:foo (fn [])} a) (-> (meta #'a) :foo)
#<user$fn__30 user$fn__30@4cf0f2eb>
user=> (def ^{:foo (fn a [])} a) (-> (meta #'a) :foo)
#<user$a__35 user$a__35@37ff95a9>
user=> (def a (fn [] (fn []))) (a)
#<user$a$fn__41 user$a$fn__41@fd34eac>
user=> (def a (fn [] (fn x []))) (a)
#<user$a$x__48 user$a$x__48@6fc334de>
user=> (let [x (fn [])] (def a (fn [] x))) a (a)
#<user$eval54$a__58 user$eval54$a__58@7c721de>
#<user$eval54$x__56 user$eval54$x__56@43f7b41b>
user=> (let [x (fn a [])] (def a (fn [] x))) (a)
#<user$eval64$x__a__66 user$eval64$x__a__66@460d4> ;; <== the local binding name is included in the class name
user=> (let [x (fn x [])] (def a (fn [] x))) (a)
#<user$eval74$x__x__76 user$eval74$x__x__76@77cab87f>

As you can see, this last patch not only fixes the two bugs, but also improves fn naming in let contexts by preserving the name of the local binding in the class name, this I believe will be a great improvement in the understandability of stacktraces.

Generated at Fri Sep 19 14:53:24 CDT 2014 using JIRA 4.4#649-r158309.