Compiler sometimes gives constant collections types which mismatch with their runtime values
Description
Environment
Clojure 1.3 om Mac OS 10.7, Clojure 1.5.0 alpha1 on Linux x86_64 (OpenJDK 1.7.0 b147)
Attachments
Activity

Alex Miller March 6, 2014 at 1:53 PM
Reopened just to fix the ticket fields.

Alex Miller March 5, 2014 at 5:44 PM
Remove from 1.6 release. Still needs more analysis.

Nicola Mometto February 22, 2014 at 8:41 PM
Immediatelly after posting the previous response it occurred to me that approaches b and c have an inherent problem:
with that approach it's no longer possible to access public fields/methods in instances of PAMs or PHMs that do not belong in IPersistentMap without incurring in reflection.
This means that for example calls to java.lang.Map instance methods would require type-hinting the map literal while now it can be resolved at compile-time without the need of type-hinting.
Here's an example using a first patch that implements the fix as described by approaches -b or -c

Nicola Mometto February 22, 2014 at 8:25 PM
Alex, I'm sorry here's the correct link: http://dev.clojure.org/jira/browse/CLJ-944?focusedCommentId=30684&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-30684
Your analysis of the emitted bytecode is correct, however when the code is loaded JIT, calls to `def` will be interpreted https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L409-L434 as a result, if the initexpr is a ConstantExpr, the in-memory value of that ConstantExpr will be used and if it's the case that the expression is a map, that will always be a PHM.
I understand that switching between PAMs and PHMs is merely an optimization but having code behave differently if it's AOT compiled rather than JIT loaded is undesiderable IMHO.

Alex Miller February 22, 2014 at 7:53 PM
Nicola, I'm not getting your point here (the linked comment doesn't seem to match up to the comments about (def a {:a 1}) so that's confusing me). In the case of (def a {:a 1}), my reading of the bytecode is that RT.map() (which returns IPersistentMap) will be invoked at runtime to generate the map. There is more than one potential concrete class for a (depending on size), but nothing is expecting it to be a particular concrete type here. I do not expect literal maps to retain order or be ArrayMaps (you should call array-map if you want that).
I'm hoping to get some consensus from Rich about the newly updated problem, cause, and approach before I move further. I'd currently classify the problem as the failed .containsKey call, the cause as the compiler replacing the MapExpr with an expression that specifies its type more narrowly than it should, and B and C as some candidate solutions.
If we want the compiler to generate concrete maps that match RT.map(), I think that's ok too, but I don't see it as the cause right now.
Details
Assignee
UnassignedUnassignedReporter
Alf Kristian StøyleAlf Kristian StøyleLabels
Approval
OkPatch
Code and TestPriority
MajorAffects versions
Fix versions
Details
Details
Assignee
Reporter

Labels
Approval
Patch
Priority

The map is a clojure.lang.PersistentArrayMap, which obviously has a containsKey method (https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentArrayMap.java#L95).
Casting it works fine though:
The problem is not present in Clojure 1.2.1.
Cause: In the first example, the Compiler will parse the MapExpr (which reports a Java class of IPersistentMap) into a ConstantExpr whose value is a PersistentHashMap (and reports type as same). The .containsKey() call is a HostExpr which parses to an InstanceMethodExpr.
When emitted to bytecode, the ConstantExpr becomes a call to RT.map() to generate a map. RT.map() produces either a PersistentArrayMap or PersistentHashMap depending on size. The InstanceMethodExpr becomes a call to PersistentHashMap.containsKey() (based on the reported type of it's expression). In the case where RT.map() will produce a PersistentArrayMap at runtime (small maps), this causes a mismatch.
Note also that this same scenario occurs for constant sets, vectors, etc - all of those cases produce ConstantExpr's with the type of the concrete data structure, instead of the more general IPersistentMap, IPersistentSet, etc. All of those scenarios should be addressed in the same way. But, those interfaces do not capture the full capabilities of the returned object. The compiler and the marshaller need to agree on the types to produce.
Approach: One approach (A) explored previously was to have MapExpr always produce a ConstantExpr containing a value of the same type that will be produced at runtime by RT.map(). While this works, it is also fragile, and sidesteps the real issue that MapExpr creates a ConstantExpr with a type that is too specific. Rich said this is the wrong approach.
(B) Another approach is to create an anonymous subclass of ConstantExpr in MapExpr that overrides getJavaClass() to report a more general type. This isn't done anywhere else and it may have negative ramifications (but I have not found any).
(C) Another approach is to follow the path of example #2 above. That example wraps the map in a MetaExpr which reports getJavaClass() returning the hinted type. In the case above, it hints the concrete type that happens to match the return of RT.map(). Instead, we could wrap the ConstantExpr in a MetaExpr hinted on the abstract type IPersistentMap (the return type of RT.map()).
RH - a more general (interface) type will not give full capabilities. Perhaps APersistent*
Patch: