Clojure

Class name clash between top-level functions and defn'ed ones

Details

  • Type: Defect Defect
  • Status: Open Open
  • Priority: Critical Critical
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: Release 1.7
  • Component/s: None
  • Labels:
  • Patch:
    Code
  • Approval:
    Vetted

Description

Named anonymous fn's are not guaranteed to have unique class names when AOT-compiled.

For example:

(defn g [])
(def xx (fn g []))

When AOT-compiled both functions will emit user$g.class, the latter overwriting the former.

Demonstration script: demo1.clj

Patch: 0001-CLJ-1093-v2.patch

Approach: Generate unique class names for named fn's the same way as for unnamed anonymous fn's.

See also: This patch also fixes the issue reported in CLJ-1227.

  1. 0001-CLJ-1093-v2.patch
    12/Sep/14 7:26 PM
    2 kB
    Nicola Mometto
  2. 0001-Fix-CLJ-1330-make-top-level-named-functions-classnam.patch
    22/Jan/14 3:57 PM
    1 kB
    Nicola Mometto
  3. 0001-Fix-CLJ-1330-refactored.patch
    12/Sep/14 5:27 PM
    2 kB
    Nicola Mometto
  4. demo1.clj
    23/Jan/14 1:10 PM
    0.6 kB
    Stuart Sierra

Activity

Hide
Ambrose Bonnaire-Sergeant added a comment -

This seems like the reason why jvm.tools.analyzer cannot analyze clojure.core. On analyzing a definline, there is an "attempted duplicate class definition" error.

This doesn't really matter, but I thought it may or may not be useful information to someone.

Show
Ambrose Bonnaire-Sergeant added a comment - This seems like the reason why jvm.tools.analyzer cannot analyze clojure.core. On analyzing a definline, there is an "attempted duplicate class definition" error. This doesn't really matter, but I thought it may or may not be useful information to someone.
Hide
Nicola Mometto added a comment -

Attached a fix.

This also fixes AOT compiling of code like:

(def x (fn foo []))
(fn foo [])
Show
Nicola Mometto added a comment - Attached a fix. This also fixes AOT compiling of code like:
(def x (fn foo []))
(fn foo [])
Hide
Nicola Mometto added a comment -

Cleaned up patch

Show
Nicola Mometto added a comment - Cleaned up patch
Hide
Alex Miller added a comment -

It looks like the patch changes indentation of some of the code - can you fix that?

Show
Alex Miller added a comment - It looks like the patch changes indentation of some of the code - can you fix that?
Hide
Nicola Mometto added a comment -

Updated patch without whitespace changes

Show
Nicola Mometto added a comment - Updated patch without whitespace changes
Hide
Alex Miller added a comment -

Thanks, that's helpful.

Show
Alex Miller added a comment - Thanks, that's helpful.
Hide
Alex Miller added a comment -

There is consensus that this is a problem, however this is an area of the code with broad impacts as it deals with how classes are named. To that end, there is some work that needs to be done in understanding the impacts before we can consider it.

Some questions we would like to answer:

1) According to Rich, naming of (fn x []) function classes used to work in the manner of this patch - with generated names. Some code archaeology needs to be done on why that was changed and whether the change to the current behavior was addressing problems that we are likely to run into.

2) Are there issues with recursive functions? Are there impacts either in AOT or non-AOT use cases? Need some tests.

3) Are there issues with dynamic redefinition of functions? With the static naming scheme, redefinition causes a new class of the same name which can be picked up by reload of classes compiled to the old definition. With the dynamic naming scheme, redefinition will create a differently named class so old classes can never pick up a redefinition. Is this a problem? What are the impacts with and without AOT? Need some tests.

Show
Alex Miller added a comment - There is consensus that this is a problem, however this is an area of the code with broad impacts as it deals with how classes are named. To that end, there is some work that needs to be done in understanding the impacts before we can consider it. Some questions we would like to answer: 1) According to Rich, naming of (fn x []) function classes used to work in the manner of this patch - with generated names. Some code archaeology needs to be done on why that was changed and whether the change to the current behavior was addressing problems that we are likely to run into. 2) Are there issues with recursive functions? Are there impacts either in AOT or non-AOT use cases? Need some tests. 3) Are there issues with dynamic redefinition of functions? With the static naming scheme, redefinition causes a new class of the same name which can be picked up by reload of classes compiled to the old definition. With the dynamic naming scheme, redefinition will create a differently named class so old classes can never pick up a redefinition. Is this a problem? What are the impacts with and without AOT? Need some tests.
Hide
Nicola Mometto added a comment -

Looks like the current behaviour has been such since https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28

My guess is that Rich simply forgot to consider the (def y (fn x [] ..)) case.

Regarding 2 and 3, the dynamic naming scheme is no different than what happens for anonymous functions so I don't see how this could cause any issue.

Recursion on the fn arg is simply a call to .invoke on "this", it's classname unaware.

I can add some tests to test that

(def y (fn x [] 1))
and
(fn x [] 2)
compile to different classnames but other than that I don't see what should be tested.

Show
Nicola Mometto added a comment - Looks like the current behaviour has been such since https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28 My guess is that Rich simply forgot to consider the (def y (fn x [] ..)) case. Regarding 2 and 3, the dynamic naming scheme is no different than what happens for anonymous functions so I don't see how this could cause any issue. Recursion on the fn arg is simply a call to .invoke on "this", it's classname unaware. I can add some tests to test that
(def y (fn x [] 1))
and
(fn x [] 2)
compile to different classnames but other than that I don't see what should be tested.
Hide
Stuart Halloway added a comment -

incomplete pending the answers to Alex Miller's questions in the comments

Show
Stuart Halloway added a comment - incomplete pending the answers to Alex Miller's questions in the comments
Hide
Nicola Mometto added a comment -

I believe I already answered his questions, I'll try to be a bit more explicit:
I tracked the relevant commit from Rich which added the dynamic naming behaviour https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28#diff-f17f860d14163523f1e1308ece478ddbL3081 which clearly shows that this bug was present since then so.

Regarding redefinitions or recursive functions, both of those operations never take in account the generated fn name so they are unaffected.

Show
Nicola Mometto added a comment - I believe I already answered his questions, I'll try to be a bit more explicit: I tracked the relevant commit from Rich which added the dynamic naming behaviour https://github.com/clojure/clojure/commit/4651e60808bb459355a3a5d0d649c4697c672e28#diff-f17f860d14163523f1e1308ece478ddbL3081 which clearly shows that this bug was present since then so. Regarding redefinitions or recursive functions, both of those operations never take in account the generated fn name so they are unaffected.
Hide
Alex Miller added a comment - - edited

Summarizing some cases here from before/after the patch:

1) top-level fn (always has name)
	1.6 - namespace$name
	patch - namespace$name
2) non-top-level fn with name
	1.6 - namespace$name (collides with prior case)
	patch - namespace$topname__x__name  	<-- CHANGED
3) anonymous fn (no name)
	1.6 - namespace$name$fn__x
	patch - namespace$name$fn__x
4) top-level anonymous fn (no name, not at all useful :)
	1.6 - namespace$fn__x
	patch - namespace$fn__x

The key problem is that the first 2 cases produce the identical class name on 1.6. The patch alters the non-top-level named fn so there is no conflict.

Prior to the referenced old commit, I believe cases 1 and 2 would both produce namespace$name__x (where x is unique) so they would not collide. The change was made to prevent the top-level name from changing ("don't append numbers on top-level fn class names"). While the similar change was made on non-top-level fn names, I do not think it needed to be.

I've thought through (and tried) a bunch of the implications of this with the help of Nicola's comments above and I do not see an issue with any of the things I've considered. From a binary compatibility point of view with existing AOT code, old code compiled together should be self-consistent and continue to work. New compiled code will also be consistent. I can't think of a way that new code would expect to know the old name of a non-top-level function such that there could be an issue.

One question - why change the code such that the new class name is namespace$name$topname__x__name instead of namespace$name$topname_name__x (or something else?). And relatedly, while the diff is small, could we refactor a couple more lines to make the intent and cases clearer?

I am 90% ok with this patch but want a little thought into that question before I mark screened.

Show
Alex Miller added a comment - - edited Summarizing some cases here from before/after the patch:
1) top-level fn (always has name)
	1.6 - namespace$name
	patch - namespace$name
2) non-top-level fn with name
	1.6 - namespace$name (collides with prior case)
	patch - namespace$topname__x__name  	<-- CHANGED
3) anonymous fn (no name)
	1.6 - namespace$name$fn__x
	patch - namespace$name$fn__x
4) top-level anonymous fn (no name, not at all useful :)
	1.6 - namespace$fn__x
	patch - namespace$fn__x
The key problem is that the first 2 cases produce the identical class name on 1.6. The patch alters the non-top-level named fn so there is no conflict. Prior to the referenced old commit, I believe cases 1 and 2 would both produce namespace$name__x (where x is unique) so they would not collide. The change was made to prevent the top-level name from changing ("don't append numbers on top-level fn class names"). While the similar change was made on non-top-level fn names, I do not think it needed to be. I've thought through (and tried) a bunch of the implications of this with the help of Nicola's comments above and I do not see an issue with any of the things I've considered. From a binary compatibility point of view with existing AOT code, old code compiled together should be self-consistent and continue to work. New compiled code will also be consistent. I can't think of a way that new code would expect to know the old name of a non-top-level function such that there could be an issue. One question - why change the code such that the new class name is namespace$name$topname__x__name instead of namespace$name$topname_name__x (or something else?). And relatedly, while the diff is small, could we refactor a couple more lines to make the intent and cases clearer? I am 90% ok with this patch but want a little thought into that question before I mark screened.
Hide
Nicola Mometto added a comment - - edited

Alex, the attached patch munges into ns$topname__name__x, not into ns$topname__x__name.

Show
Nicola Mometto added a comment - - edited Alex, the attached patch munges into ns$topname__name__x, not into ns$topname__x__name.
Hide
Nicola Mometto added a comment -

The attached patch 0001-Fix-CLJ-1330refactored.patch contains the same fix from 0001-FixCLJ-1330-make-top-level-named-functions-classnam.patch but also refactors the code that deals with fn name munging

Show
Nicola Mometto added a comment - The attached patch 0001-Fix-CLJ-1330refactored.patch contains the same fix from 0001-FixCLJ-1330-make-top-level-named-functions-classnam.patch but also refactors the code that deals with fn name munging
Hide
Alex Miller added a comment -

Hmmm.. I will double-check. That's not why I recall seeing when I did AOT.

Show
Alex Miller added a comment - Hmmm.. I will double-check. That's not why I recall seeing when I did AOT.
Hide
Nicola Mometto added a comment - - edited

New patch 0001-CLJ-1093-v2.patch improves the fn naming scheme a lot.
I've threw together a number of test cases that show the improvement + bug fixes:

pre patch:

Clojure 1.7.0-master-SNAPSHOT
user=> (fn [])
#<user$eval1$fn__2 user$eval1$fn__2@4e13aa4e>
user=> (fn a [])
#<user$eval5$a__6 user$eval5$a__6@6946a317>
user=> (let [a (fn [])] a)
#<user$eval9$a__10 user$eval9$a__10@15fdf894>
user=> (let [a (fn a [])] a)
#<user$eval13$a__14 user$eval13$a__14@1d87a604>
user=> (let [a (fn x [])] a)
#<user$eval17$x__18 user$eval17$x__18@7f0cd67f>
user=> (def a (fn [])) a
#'user/a
#<user$a user$a@33e1ccbc>
user=> (def a (fn x [])) a
#'user/a
#<user$x user$x@59a04a1b> ;; <== BUG!!
user=> (def ^{:foo (fn [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$fn__23 user$fn__23@d9c21c6>
user=> (def ^{:foo (fn a [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$a user$a@420dd874> ;; <== BUG!!
user=> (def a (fn [] (fn []))) (a)
#'user/a
#<user$a$fn__30 user$a$fn__30@6f57be76>
user=> (def a (fn [] (fn x []))) (a)
#'user/a
#<user$a$x__35 user$a$x__35@79930089>
user=> (let [x (fn [])] (def a (fn [] x))) a (a)
#'user/a
#<user$eval40$a__43 user$eval40$a__43@6db1694e>
#<user$eval40$x__41 user$eval40$x__41@20bd16bb>
user=> (let [x (fn a [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval48$a__49 user$eval48$a__49@75d6d1d4> ;; <== the local binding name doesn't appear in the class name
user=> (let [x (fn x [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval56$x__57 user$eval56$x__57@16d81c91>

post patch:

Clojure 1.7.0-master-SNAPSHOT
user=> (fn [])
#<user$eval1$fn__3 user$eval1$fn__3@3c92218c>
user=> (fn a [])
#<user$eval6$a__8 user$eval6$a__8@6f85c59c>
user=> (let [a (fn [])] a)
#<user$eval11$a__13 user$eval11$a__13@4d051922>
user=> (let [a (fn a [])] a)
#<user$eval16$a__a__18 user$eval16$a__a__18@138a92e7>
user=> (let [a (fn x [])] a)
#<user$eval21$a__x__23 user$eval21$a__x__23@528ef256>
user=> (def a (fn [])) a
#'user/a
#<user$a user$a@6bef63f9>
user=> (def a (fn x [])) a
#'user/a
#<user$a__x__28 user$a__x__28@5f0bebef>
user=> (def ^{:foo (fn [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$fn__30 user$fn__30@4cf0f2eb>
user=> (def ^{:foo (fn a [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$a__35 user$a__35@37ff95a9>
user=> (def a (fn [] (fn []))) (a)
#'user/a
#<user$a$fn__41 user$a$fn__41@fd34eac>
user=> (def a (fn [] (fn x []))) (a)
#'user/a
#<user$a$x__48 user$a$x__48@6fc334de>
user=> (let [x (fn [])] (def a (fn [] x))) a (a)
#'user/a
#<user$eval54$a__58 user$eval54$a__58@7c721de>
#<user$eval54$x__56 user$eval54$x__56@43f7b41b>
user=> (let [x (fn a [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval64$x__a__66 user$eval64$x__a__66@460d4> ;; <== the local binding name is included in the class name
user=> (let [x (fn x [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval74$x__x__76 user$eval74$x__x__76@77cab87f>
user=>

As you can see, this last patch not only fixes the two bugs, but also improves fn naming in let contexts by preserving the name of the local binding in the class name, this I believe will be a great improvement in the understandability of stacktraces.

Show
Nicola Mometto added a comment - - edited New patch 0001-CLJ-1093-v2.patch improves the fn naming scheme a lot. I've threw together a number of test cases that show the improvement + bug fixes: pre patch:
Clojure 1.7.0-master-SNAPSHOT
user=> (fn [])
#<user$eval1$fn__2 user$eval1$fn__2@4e13aa4e>
user=> (fn a [])
#<user$eval5$a__6 user$eval5$a__6@6946a317>
user=> (let [a (fn [])] a)
#<user$eval9$a__10 user$eval9$a__10@15fdf894>
user=> (let [a (fn a [])] a)
#<user$eval13$a__14 user$eval13$a__14@1d87a604>
user=> (let [a (fn x [])] a)
#<user$eval17$x__18 user$eval17$x__18@7f0cd67f>
user=> (def a (fn [])) a
#'user/a
#<user$a user$a@33e1ccbc>
user=> (def a (fn x [])) a
#'user/a
#<user$x user$x@59a04a1b> ;; <== BUG!!
user=> (def ^{:foo (fn [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$fn__23 user$fn__23@d9c21c6>
user=> (def ^{:foo (fn a [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$a user$a@420dd874> ;; <== BUG!!
user=> (def a (fn [] (fn []))) (a)
#'user/a
#<user$a$fn__30 user$a$fn__30@6f57be76>
user=> (def a (fn [] (fn x []))) (a)
#'user/a
#<user$a$x__35 user$a$x__35@79930089>
user=> (let [x (fn [])] (def a (fn [] x))) a (a)
#'user/a
#<user$eval40$a__43 user$eval40$a__43@6db1694e>
#<user$eval40$x__41 user$eval40$x__41@20bd16bb>
user=> (let [x (fn a [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval48$a__49 user$eval48$a__49@75d6d1d4> ;; <== the local binding name doesn't appear in the class name
user=> (let [x (fn x [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval56$x__57 user$eval56$x__57@16d81c91>
post patch:
Clojure 1.7.0-master-SNAPSHOT
user=> (fn [])
#<user$eval1$fn__3 user$eval1$fn__3@3c92218c>
user=> (fn a [])
#<user$eval6$a__8 user$eval6$a__8@6f85c59c>
user=> (let [a (fn [])] a)
#<user$eval11$a__13 user$eval11$a__13@4d051922>
user=> (let [a (fn a [])] a)
#<user$eval16$a__a__18 user$eval16$a__a__18@138a92e7>
user=> (let [a (fn x [])] a)
#<user$eval21$a__x__23 user$eval21$a__x__23@528ef256>
user=> (def a (fn [])) a
#'user/a
#<user$a user$a@6bef63f9>
user=> (def a (fn x [])) a
#'user/a
#<user$a__x__28 user$a__x__28@5f0bebef>
user=> (def ^{:foo (fn [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$fn__30 user$fn__30@4cf0f2eb>
user=> (def ^{:foo (fn a [])} a) (-> (meta #'a) :foo)
#'user/a
#<user$a__35 user$a__35@37ff95a9>
user=> (def a (fn [] (fn []))) (a)
#'user/a
#<user$a$fn__41 user$a$fn__41@fd34eac>
user=> (def a (fn [] (fn x []))) (a)
#'user/a
#<user$a$x__48 user$a$x__48@6fc334de>
user=> (let [x (fn [])] (def a (fn [] x))) a (a)
#'user/a
#<user$eval54$a__58 user$eval54$a__58@7c721de>
#<user$eval54$x__56 user$eval54$x__56@43f7b41b>
user=> (let [x (fn a [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval64$x__a__66 user$eval64$x__a__66@460d4> ;; <== the local binding name is included in the class name
user=> (let [x (fn x [])] (def a (fn [] x))) (a)
#'user/a
#<user$eval74$x__x__76 user$eval74$x__x__76@77cab87f>
user=>
As you can see, this last patch not only fixes the two bugs, but also improves fn naming in let contexts by preserving the name of the local binding in the class name, this I believe will be a great improvement in the understandability of stacktraces.

People

Vote (7)
Watch (11)

Dates

  • Created:
    Updated: