<< Back to previous view

[CLJ-322] Enhance AOT compilation process to emit classfiles only for explicitly-specified namespaces Created: 29/Apr/10  Updated: 07/Sep/16

Status: In Progress
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Chas Emerick Assignee: Unassigned
Resolution: Unresolved Votes: 27
Labels: aot

Attachments: Text File 0322-limit-aot-resolved.patch     File CLJ-322.diff     Text File compile-interop-1.patch     GZip Archive write-classes-1.diff.gz    
Patch: Code and Test
Approval: Vetted
Waiting On: Chas Emerick


Summary: still needs decision on implementation approach.

This was originally/erroneously reported by Howard Lewis Ship in the clojure-contrib assembla:

My build file specifies the namespaces to AOT compile but if I include another namespace
(even from a JAR dependency) that is not AOT compiled, the other namespace will be compiled as well.

In my case, I was using clojure-contrib's clojure.contrib.str-utils2 namespace, and I got a bunch of
clojure/contrib/str_utils2 classes in my output directory.

I think that the AOT compiler should NOT precompile any namespaces that are transitively reached,
only namespaces in the set specified by the command line are appropriate.

As currently coded, you will frequently find unwanted third-party dependencies in your output JARs;
further, if multiple parties depend on the same JARs, this could cause bloating and duplication in the
eventual runtime classpath.

Having the option of shipping either all AOT-compiled classfiles or mixed source/AOT depending upon one's distribution requirements would make that phase of work with a clojure codebase significantly easier and less error-prone. The only question in my mind is what the default should be. We're all used to the current behaviour, but I'd guess that any nontrivial project where the form of the distributable matters (i.e. the source/AOT mix), providing as much control as possible by default makes the most sense. Given the tooling that most people are using, it's trivial (and common practice, IIUC) to provide a comprehensive list of namespaces one wishes to compile, so making that the default shouldn't be a hurdle to anyone. If an escape hatch is desired, a --transitive switch to clojure.lang.Compile could be added.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/322
aot-transitivity-option-compat-322.diff - https://www.assembla.com/spaces/clojure/documents/aI7Eu-HeGr35ImeJe5cbLA/download/aI7Eu-HeGr35ImeJe5cbLA
aot-transitivity-option-322.diff - https://www.assembla.com/spaces/clojure/documents/aIWFiWHeGr35ImeJe5cbLA/download/aIWFiWHeGr35ImeJe5cbLA

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

hlship said: I'd like to reinforce this. I've been doing research on Clojure build tools for an upcoming talk and all of them (Maven, Leiningen, Gradle) have the same problem: the AOT compile extends from the desired namespaces (such as one containing a :gen-class) to every reached namespace. This is going to cause a real ugliness when application A uses libraries B and C that both depend on library D (such as clojure-contrib) and B and C are thus both bloated with duplicate, unwanted AOT compiled classes from the library D.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: This behaviour is an implementation detail of Clojure's AOT compilation process, and is orthogonal to any particular build tooling.

I am working on a patch that would provide a mechanism for such tooling to disable this default behaviour.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: A first cut of a change to address this issue is here (caution, work in progress!):


This makes available a new recognized system property, clojure.compiler.transitive, which defaults to true. When set/bound to false (i.e. -Dclojure.compiler.transitive=false when using clojure.lang.Compile), only the first loaded file (either the ns named in the call to compile or each of the namespaces named as arguments to clojure.lang.Compile) will have classfiles written to disk.

This means that this compilation invocation:

java -cp <your classpath> -Dclojure.compiler.transitive=false clojure.lang.Compile com.bar com.baz

will generate classfiles only for com.bar and com.baz, but not for any of the namespaces or other files they load, require, or use.

The only shortcoming of this WIP patch is that classfiles are still generated for proxy and gen-class classes defined outside of the explicitly-named namespaces. What I thought was a solution for this ended up breaking the loading of generated interfaces (as produced by defprotocol, etc).

I'll take a second look at this before the end of the week, but wanted to get this out there so as to get any comments people might have.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

technomancy said: Looks good, but I'm having trouble getting it to work. I tried compiling from master of Chas's fork on github, but I still got the all the .class files generated with -Dclojure.compiler.transitive=false. It could be a quirk of the way I'm using ant to fork off processes though. Is it possible to set it using System/setProperty, or must it be given as a property on the command-line?

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: Bah, that's just bad documentation. :-/

The system property is only provided by clojure.lang.Compile; the value of it drives the binding of clojure.core/transitive-compile, which has a root binding of true.

You should be able to configure the transitivity the same way you configure compile-path (system prop to clojure.lang.Compile or a direct binding when at the REPL, etc).

If not, ping me in irc or elsewhere.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

meikelbrandmeyer said: I think, excluding parts 'load'ed is a little strong. I have some namespaces which load several parts from different files, but which belong to the same namespace. The most prominent example of such a case is clojure.core itself. I'm find with stopping require and use, but load is a bit too much, I'd say.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

technomancy said: Chas: Thanks; will give that a go.

Meikel: Do people actually use load outside of clojure.core? I thought it was only used there because clojure.core is a "special" namespace where you want more vars to be available than can reasonably fit in a single file. Splitting up a namespace into several files is quite unadvisable otherwise.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

technomancy said: I can confirm that this works for me modulo the proxy/gen-class issue that Chas mentioned. I would love to see this in Clojure 1.2; it would really clean up a lot of build-related issues.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

meikelbrandmeyer said: I used it several times and this is the first time, I hear that it is unadvisable to do so. Even with a lower number of Vars in the namespace (c.c is here certainly exceptional) and might be of use to split several "sections" of code which belong to the same namespace but have different functionality. Whether to use a big comment in the source to indicate the section or split things into subfiles is a matter of taste. But it's a perfectly reasonable thing todo.

Another use case, where I use this (and c.c.lazy-xml, IIRC) is to conditionally load code depending on whether a dependency is available or not. Eg. vimclojure uses c.c.pprint and c.c.stacktrace/clj-stacktrace transparently depending on their availability.

There are perfectly legal uses of load. I don't see any "unadvisable" here.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: Thanks, Meikel; I had forgotten about that use case, as I don't use load directly myself at all. I probably wouldn't say it's inadvisable, just mostly unnecessary. In any case, that's a good catch. It complicates things a bit, but we'll see what happens. I'm going to take another whack at resolving the proxy/gen-class case and narrowing the impact of nontransitivity to use and require later tonight.

I agree wholeheartedly that this should be in 1.2, assuming the technical bits work out. This has been an irritant for quite a long time. I actually believe that nontransitivity should be the default – no one wants or expects to have classfiles show up for dependencies show up when a project is AOT-compiled. I think the only negative impact would be whoever still fiddles with compilation at the REPL, and doesn't use maven or lein – and even then, it's just a matter of binding another var.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

meikelbrandmeyer said: Then the var should be added to the default bindings in the clojure.main repl. Then it's set!-able like the other vars ��� warn-on-reflection and friends.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: This is looking pretty good (still WIP):


Thank you again for mentioning load, Meikel: it was very helpful in resolving the proxy/gen-class issue as well.

Just a single data point: the jar produced by the medium-sized project I've been using for testing the changes has shrunk from 1.8MB to less than 1MB. That's not the only reason this is a good change, but it's certainly a nice side-effect.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: [file:aIWFiWHeGr35ImeJe5cbLA]

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: [file:aI7Eu-HeGr35ImeJe5cbLA]

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: Patched attached. The compat one retains the current default behaviour [*transitive-compile* true], the other changes the default so that transitivity is a non-default option. At least of those I've spoken to about this, the latter is preferred.

The user impact of changing the default would be:

  1. The result of compiling from the REPL will change. Getting back current behaviour would require adding a [*transitive-compile* true] binding to the existing bindings one must set when compiling from the REPL.
  2. The same as #1 goes for those scripting AOT compilation via clojure.lang.Compile as well (whether by shell scripts, ant, etc).
  3. Those using lein, clojure-maven-plugin, gradle, and others will likely have a new option provided by those tools, and perhaps a different default than the language's. I suspect those using such tools would much prefer a change from the default behaviour in any case.
Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

hlship said: Just had a brain-storm:

How about an option to support transitive compilation, but only if the Clojure source file being compiled as a file: URL (i.e., its a local file on the file system, not a file stored in a JAR). That would make it easier to use compilation on the local project without transitively compiling imported libraries, such as clojure-contrib.

So transitive-compile should be a keyword, not a boolean, with values :all (for 1.1 behavior), :none (to compile only the exact specified namespaces) or :local (to compile transitively, but only for local files, not source files from JARs).

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

cemerick said: (Crossposted to the clojure-dev list)

I thought about this some, and I don't think that's a good idea, at least for now. I'm uncomfortable with semantics changing depending upon where code is being loaded from – which, depending upon a tool's implementation, might be undefined. E.g. if the com.foo.bar ns is available in source form in one directory, but as classes from a jar, and classpaths aren't being constructed in a stable fashion, then the results of compilation will change.

If we decide that special treatment depending upon the source of code is warranted in the future, that's a fairly straightforward thing to do w.r.t. the API – we could have :all and :local as you suggest, with nil representing :none.

Comment by Assembla Importer [ 28/Sep/10 12:18 AM ]

stu said: Rich is not comfortable enough with the implementation complexity of this patch (e.g. the guard clause for proxies and gen-class) to slide this in as a minor fix under the wire for 1.2.

Better to live with the pain we know a little longer than ship something we don't have enough experience with to be confident.

Comment by Chas Emerick [ 19/Nov/10 9:28 PM ]

Updated patch to cleanly apply to HEAD and address issues raised by screening done by Cosmin Stejerean. Also includes proper tests.

Note: this patch's tests require the fix for CLJ-432!

Comment by Stuart Halloway [ 29/Nov/10 7:18 AM ]

the "-resolved" patch resolves a conflict in main.clj

Comment by Stuart Halloway [ 29/Nov/10 7:25 AM ]

Several questions:

  1. I am getting an ant build error: "/Users/stuart/repos/clojure/build.xml:137: java.io.FileNotFoundException: Could not locate clojure/test_clojure/aot/nontransitive__init.class or clojure/test_clojure/aot/nontransitive.clj on classpath:"
  2. It feels icky to have a method named writeClassFile that, under some circumstances, does not write a class file, but instead loads it via a dynamic loader. Maybe this is just a naming issue.
  3. Are there any other ways to accomplish the goals of load-level? Or, taking the other side, if we are going to have a load-level, are there other possible consumers who might have different needs?
  4. (Minor) Why the use :only idiom instead of just require?
Comment by Stuart Sierra [ 10/Dec/10 3:34 PM ]

An alternative approach: patch write-classes-1.diff.gz

From my forked branch

What this patch does:

  • Keeps 'compile' and 'compile-files' exactly the same
  • Adds 'compile-write-classes' to write .class files for specifically named classes
  • Minor compiler changes to support this

This approach was prompted by the following observations:

  • Java interop is the dominant reason for needing .class files
  • Things other than namespaces can generate classes for Java interop:
    • deftype/defrecord
    • defprotocol
    • gen-class/gen-interface
  • For library releases, we want to control which .class files are emitted on a per-class basis, not per-namespace
  • Some legitimate uses of AOT compilation will want transitive compilation
    • Pre-compiling an entire application before release
Comment by Chas Emerick [ 10/Dec/10 4:04 PM ]

S. Halloway: My apologies, I didn't know you had commented. I thought that, having assigned this issue to myself, I'd be notified of updates.

FWIW, I aim to review your comments and SS' approach over the weekend.

Comment by Chas Emerick [ 16/Dec/10 7:36 AM ]

S. Halloway:

1. Certainly shouldn't happen. AFAIK, others have screened the patch, presumably with a successful build.
2. Agreed; given the approach, I think it's just a bad name.
3. Yes, I think S. Sierra's is one. See my next comment.
4. Because the :use form was already there. I've actually been using that form of :use more and more; I've found that easier than occasionally having to shuffle around specs between :use and :require. I think I'm aping Chris Houser in that regard.

Comment by Chas Emerick [ 16/Dec/10 9:00 AM ]

I think S. Sierra's approach is fundamentally superior what I offered. I have two suggestions: one slight perspective change (which implies no change in the actual implementation), and an idea for an even simpler approach (at least from a user perspective), in that order.

While interop is the driving requirement behind AOT, I absolutely do not want to have to keep an updated enumeration of all of the classes resulting from each and every defrecord et al. usages in my pom.xml/project.clj (and I wouldn't wish the task of ferreting those usages and their resulting classnames on any build tool author).

Right now, *compile-write-classes* is documented to be a set of classname strings, but could just as easily be any other function. *compile-write-classes* should be documented to accept any predicate function (renamed to e.g. *compile-write-class?*?). There's no reason why it shouldn't be bound to, e.g. #(re-matches #"foo\.bar\.[\w_]+$" %) if I know that all my records are defined in the foo.bar namespace.

To go along with that, I think some package/classname-globbing utilities along with corresponding options to clojure.lang.Compile would be most welcome. Classname munging rules are not exactly obvious, and it'd be good to make things a little easier for users in this regard.

Another alternative

If there's a closed set of forms that generate classes that one might reasonably be interested in having in a build result (outside of use cases for pervasive AOT), then why not have a simple option that only those forms utilize? gen-class and gen-interface already do this, but reusing the all-or-nothing *compile-files* binding; if they keyed off of a binding that implied a diminished scope (e.g. *compile-interop-forms* – which would be true if *compile-files* were true), then they'd do exactly what we wanted. Extending this approach to deftype (and therefore defrecord) should be straightforward.

An implementation of this would probably be somewhat more complicated than S. Sierra's patch, though not as complex as my original stab at the problem (i.e. no *load-level*). On the plus side:

1. No additional configuration for users or implementation work for build tool authors, aside from the addition of the boolean diminished-scope AOT option
2. Class file generation would remain opaque from a build process standpoint
3. Future/other class-generating forms (there are a few people futzing with ASM independently, etc) can make local decisions about whether or not to participate in interop-centric classfile generation. This might be particularly helpful if a given form emits multiple classes, making the determination of a classname-based filter fn less straightforward.

I can see wanting to further restrict AOT to specific classnames in certain circumstances, in which case the above and S. Sierra's patch might be complimentary.

Comment by Stuart Sierra [ 16/Dec/10 11:49 AM ]

I like the idea of *compile-interop-forms*. But is it always possible to determine what an "interop form" is? I think it is, I'm just not sure.

Comment by Allen Rohner [ 09/Oct/11 12:50 PM ]

I'm also in favor of compile-interop-forms. As far as determining, how about sticking metadata on the var?

(defmacro ^{:interop-form true} deftype ...)

Comment by Stuart Sierra [ 21/Oct/11 8:38 AM ]

Summary and design discussion on wiki at http://dev.clojure.org/display/design/Transitive+AOT+Compilation

Comment by Stuart Sierra [ 29/Nov/11 6:54 PM ]

New attachment compile-interop-1.patch has new approach: Add a third possible value for *compile-files*. True and false keep their original meanings, but :interop causes only interop-related forms to be written out as .class files. "Interop forms" are gen-class, gen-interface, deftype, defrecord, defprotocol, and definterface.


  • doesn't change existing behavior
  • handles common case for non-transitive AOT (interop)
  • minimal changes to the compiler


  • not flexible
Comment by Stuart Sierra [ 02/Dec/11 8:12 AM ]

Just realized my patch doesn't solve the transitive compilation problem. If library A loads library B, then compiling interop forms in A will also emit interop .class files in B.

Comment by Paudi Moriarty [ 01/Jan/13 3:55 AM ]

It's disappointing to see an important issue like this still unresolved after 2.5 years. This is a real pain for us. We have a large closed source project where shipping source is not an option. This forces us to manage the AOT'ing of dependencies due to the hard dependency on protocol interfaces introduced by transitive AOT compilation (see https://groups.google.com/forum/?fromgroups=#!topic/clojure-dev/r3A1JOIiwVU).

Comment by Andy Fingerhut [ 01/Jan/13 4:27 PM ]

Paul, do you have a suggestion for which of the approaches described in comments here, or on the wiki page http://dev.clojure.org/display/design/Transitive+AOT+Compilation would be preferable solution for you? Or perhaps even a patch that implements your preferred approach?

Comment by Howard Lewis Ship [ 04/Jan/13 4:18 PM ]


I'm now consulting with Paudi's organization, so I think I can speak for him (I'm now the default buildmeister).

I like Stuart's :interop idea, but that is somewhat orthogonal to our needs.

I return to what I would like; compilation would compile specific namespaces; dependencies of those namespaces would not be compiled.

To be honest, I'm still a little hung up on the interop forms: especially defprotocol and friends; from a cursory glance, it appears that todays AOT compilation will compile the protocol into a Java class, then compile the namespace that references the protocol with the assumption that the protocol's Java class is available. When we use build rules to only package our namespace's class files into the output JAR, the code fails with a NoClassDefFoundError because the protocol really needs to be recompiled, at runtime compilation, into an in-memory Java class.

Obviously, supporting this correctly will be a challenge; the compiled bytecode for our namespace would ideally:
1) check to see if the Java class already exists and use it if so
2) load the necessary namespaces so as to force the creation of the Java class

I can imagine any number of ways to juggle things to make this work, so I won't suggest a specific implementation.

In the meantime, our workaround is to create a "stub" module as part of our build; it simply requires in the necessary namespaces (for example, org.clojure:core.cache); this forces an AOT compile of the dependencies and we have a special rule to package such dependencies in the stub module's output JAR. This may not be a scalable problem, and it is expensive to identify what 3rd party dependencies require this treatment.

Comment by Stuart Halloway [ 24/May/13 1:25 PM ]

I am marking this incomplete because there does not yet seem to be plurality, much less consensus or unanimity, on approach.

Personally I am in favor of a solution based on a predicate that gets handed the class name and compiled bits, and then can choose whether to write the class. Pretty close to Stuart Sierra's compile-write-classes. Might be possible to flow more information than classname to the predicate.

Comment by Alex Miller [ 21/Oct/13 2:35 PM ]

Removed the 1.6 release from this and added to Release.Next list to make this a priority for the next release.

Comment by Kevin Downey [ 21/Oct/13 3:42 PM ]


I don't exactly understand your write up, I am reading the compiler, the emit-protocol macro, and emit-method-builder to try and understand it.

You might check to see if you have a situation similar to the following:

(ns a.b)

(defprotocol P1
  (pm [a]))

then either

(ns a.c
  (:import (a.b P1))

(defrecord R []
  (pm [x] x))


(ns a.c)

(defrecord R []
  (pm [x] x))

in both examples defrecord is actually getting the class behind the protocol instead of the protocol, the correct thing to do is

(ns a.c
  (:require [a.b :refer [P1]]))

(defrecord R []
  (pm [x] x))

This is an extremely common mistake people make when using protocols, unfortunately the flexibility of using interfaces directly in defrecord forms, and protocols being backed by interfaces means it is very easy to unwittingly make such a mistake. Both of the mistake examples could result in missing classes/namespace problems.

Comment by Michael Blume [ 08/Jan/15 4:26 PM ]

The write-classes patch still applies cleanly, I just tried it out, made a one-line change to lein-ring and got a war-file for my app with two generated classes instead of hundreds. This seems like a good fix to me. What do we need to do to get it into release?

Comment by Alex Miller [ 08/Jan/15 4:30 PM ]

As Stu said above, we do not have consensus on whether this is the right way to go with this. I am planning to look at all outstanding AOT tickets (including this one) early in 1.8.

Comment by Stuart Halloway [ 07/Sep/16 3:15 PM ]

I believe that we should close this ticket. AOT ought to make all the classfiles as it does today to avoid classloader inversion problems (see e.g. http://dev.clojure.org/jira/browse/CLJ-1544), and tools can us e.g. boot's sift to remove classes not part of the current project from delivered jars.

We might want to make some tools that identify jars that erroneously includes compiled classes from other namespaces, i.e. resulting from not using sift.

Comment by Kevin Downey [ 07/Sep/16 3:44 PM ]

something like boot's sift is not a complete solution.

lein his included that functionality for sometime, and back in 2010 it defaulted to removing classes transitively generated. that turned out to break things so it was disabled. the functionality still exists but defaults to off.

the setting if someone wants to try it: https://github.com/technomancy/leiningen/blob/master/sample.project.clj#L330
the issue that lead to it being turned off by default: https://github.com/technomancy/leiningen/issues/141

Generated at Mon Sep 26 19:18:24 CDT 2016 using JIRA 4.4#649-r158309.