Clojure

Significantly improve compile time by reducing calls to Class.forName

Details

  • Type: Enhancement Enhancement
  • Status: Closed Closed
  • Priority: Critical Critical
  • Resolution: Completed
  • Affects Version/s: Release 1.7
  • Fix Version/s: Release 1.7
  • Component/s: None
  • Patch:
    Code
  • Approval:
    Ok

Description

Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

Proposed: Avoid calling Class.forName() on non-namespaced symbols that do not contain "." or start with "[", don't map to a Class in the ns, and are names in the current local env. Also, adjust the ordering of checks in tagToClass to check for hints before checking for class.

[Note that the latest variant of the patch moves the check from the original patch farther down in the logic to avoid changing the semantics. This still yields virtually all of the performance gains. See comments for details.]

Patch: clj-1529-no-cache-2.diff

Screened by: Stu Halloway. Note that for this change the patch ended up being so small it is easier follow the code than the prose description.

[1] https://github.com/ztellman/aleph

  1. class-for-name.diff
    21/Sep/14 2:08 PM
    2 kB
    Zach Tellman
  2. clj-1529-no-cache.diff
    27/Oct/14 4:30 PM
    1 kB
    Alex Miller
  3. clj-1529-no-cache-2.diff
    05/Nov/14 11:37 AM
    1 kB
    Alex Miller
  4. clj-1529-with-cache.diff
    27/Oct/14 4:30 PM
    5 kB
    Alex Miller
  5. maybe-class-cache.patch
    07/Oct/14 4:06 PM
    5 kB
    Alex Miller
  6. maybe-class-cache-2.patch
    24/Oct/14 3:01 PM
    4 kB
    Alex Miller
  1. clj-1529.png
    40 kB
    08/Oct/14 2:13 PM

Activity

Zach Tellman made changes -
Field Original Value New Value
Description Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

This patch improves compilation time from 18 seconds to 7 seconds. The gain is exaggerated by the number of macros I use, but I would expect at least 50% improvements across a wide variety of codebases.

This patch does introduce a slight semantic change by privileging lexical scope over classnames. Consider this code:

(let [String "foo"]
  (. String substring 0 1))

Previously, this would be treated as a static call to 'java.lang.String', but with the patch would be treated as a call to the lexical variable 'String'. Since the new semantic is what I (and I think everyone else) would have expected in the first place, it's probably very likely that no one is shadowing classes with their variable names, since someone would have complained about this. If anyone feels this is at all risky, however, I'm happy to discuss it further.


[1] https://github.com/ztellman/aleph
Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

This patch improves compilation time from 18 seconds to 7 seconds. The gain is exaggerated by the number of macros I use, but I would expect at least 50% improvements across a wide variety of codebases.

This patch does introduce a slight semantic change by privileging lexical scope over classnames. Consider this code:

    (let [String "foo"]
      (. String substring 0 1))

Previously, this would be treated as a static call to 'java.lang.String', but with the patch would be treated as a call to the lexical variable 'String'. Since the new semantic is what I (and I think everyone else) would have expected in the first place, it's probably very likely that no one is shadowing classes with their variable names, since someone would have complained about this. If anyone feels this is at all risky, however, I'm happy to discuss it further.


[1] https://github.com/ztellman/aleph
Hide
Ghadi Shayban added a comment -

One of our larger projects (not macro-laden) just went from 36 seconds to 23 seconds to start with this patch.

Show
Ghadi Shayban added a comment - One of our larger projects (not macro-laden) just went from 36 seconds to 23 seconds to start with this patch.
Alex Miller made changes -
Approval Triaged [ 10120 ]
Alex Miller made changes -
Priority Major [ 3 ] Critical [ 2 ]
Hide
Ramsey Nasser added a comment -

I ported this patch to Clojure-CLR for the Unity integration project and we have seen significant speedups as well. I too agree that this is the behavior I expect as a user.

Show
Ramsey Nasser added a comment - I ported this patch to Clojure-CLR for the Unity integration project and we have seen significant speedups as well. I too agree that this is the behavior I expect as a user.
Hide
Alex Miller added a comment -

I ran this on a variety of open-source projects. I didn't find that it produced any unexpected behavior or test errors. Most projects were about 10% faster to run equivalent of "lein test" with a few as high as 35% faster.

Show
Alex Miller added a comment - I ran this on a variety of open-source projects. I didn't find that it produced any unexpected behavior or test errors. Most projects were about 10% faster to run equivalent of "lein test" with a few as high as 35% faster.
Rich Hickey made changes -
Fix Version/s Release 1.7 [ 10250 ]
Approval Triaged [ 10120 ] Vetted [ 10003 ]
Rich Hickey made changes -
Approval Vetted [ 10003 ] Incomplete [ 10006 ]
Hide
Alex Miller added a comment -

We're interested in comparing this and the class caching in fastload branch to get something in for 1.7. Next step is to extract a patch of the stuff in fastload so we can compare them better.

Show
Alex Miller added a comment - We're interested in comparing this and the class caching in fastload branch to get something in for 1.7. Next step is to extract a patch of the stuff in fastload so we can compare them better.
Hide
Alex Miller added a comment -

Add maybe class cache patch from fastload branch

Show
Alex Miller added a comment - Add maybe class cache patch from fastload branch
Alex Miller made changes -
Attachment maybe-class-cache.patch [ 13400 ]
Hide
Alex Miller added a comment - - edited

Times below to run "time lein test" on a variety of projects with columns:

  • master = current 1.7.0 master
  • maybe-cache = maybe-class-cache.patch extracted from Rich's fastload branch
  • class-for-name = class-for-name.diff from Zach
  • % maybe-cache = % improvement for maybe-cache over master
  • % class-for-name = % improvement for class-for-name patch over master (sorted desc)

project,master,maybe-cache,class-for-name,% maybe-cache,% class-for-name
aleph,25.605,16.572,14.460,35.278,43.527
riemann,40.550,27.656,24.734,31.798,39.004
lamina,37.247,30.072,29.045,19.263,22.021
trapperkeeper,11.979,11.158,10.3,6.854,14.016
plumbing,73.777,68.388,66.922,7.304,9.292
cheshire,5.583,5.089,5.086,8.848,8.902
tools.analyzer,5.411,5.289,5.023,2.255,7.171
core.async,19.161,18.090,17.942,5.589,6.362
tools.reader,4.686,4.435,4.401,5.356,6.082
clara-rules,43.964,42.140,41.542,4.149,5.509
core.typed,158.885,154.954,151.445,2.474,4.683
instaparse,9.286,8.922,8.859,3.920,4.598
schema,45.3,43.914,43.498,3.060,3.978
mandoline,76.295,74.831,74.425,1.919,2.451

The summary is that both patches improve times on all projects. In most cases, the improvement from either is <10% but the first few projects have greater improvements. The class-for-name patch has a bigger improvement in all projects than the maybe-cache patch (but maybe-cache has no change in semantics).

Show
Alex Miller added a comment - - edited Times below to run "time lein test" on a variety of projects with columns:
  • master = current 1.7.0 master
  • maybe-cache = maybe-class-cache.patch extracted from Rich's fastload branch
  • class-for-name = class-for-name.diff from Zach
  • % maybe-cache = % improvement for maybe-cache over master
  • % class-for-name = % improvement for class-for-name patch over master (sorted desc)
project,master,maybe-cache,class-for-name,% maybe-cache,% class-for-name aleph,25.605,16.572,14.460,35.278,43.527 riemann,40.550,27.656,24.734,31.798,39.004 lamina,37.247,30.072,29.045,19.263,22.021 trapperkeeper,11.979,11.158,10.3,6.854,14.016 plumbing,73.777,68.388,66.922,7.304,9.292 cheshire,5.583,5.089,5.086,8.848,8.902 tools.analyzer,5.411,5.289,5.023,2.255,7.171 core.async,19.161,18.090,17.942,5.589,6.362 tools.reader,4.686,4.435,4.401,5.356,6.082 clara-rules,43.964,42.140,41.542,4.149,5.509 core.typed,158.885,154.954,151.445,2.474,4.683 instaparse,9.286,8.922,8.859,3.920,4.598 schema,45.3,43.914,43.498,3.060,3.978 mandoline,76.295,74.831,74.425,1.919,2.451 The summary is that both patches improve times on all projects. In most cases, the improvement from either is <10% but the first few projects have greater improvements. The class-for-name patch has a bigger improvement in all projects than the maybe-cache patch (but maybe-cache has no change in semantics).
Hide
Nicola Mometto added a comment -

Are the two patches mutually exclusive?

Show
Nicola Mometto added a comment - Are the two patches mutually exclusive?
Hide
Alex Miller added a comment -

They are non-over-lapping. I have not considered whether they could both be applied or whether that makes any sense.

Show
Alex Miller added a comment - They are non-over-lapping. I have not considered whether they could both be applied or whether that makes any sense.
Hide
Alex Miller added a comment -

The two patches both essentially cut off the same hot code path, just at different points (class-for-name is earlier), so applying them both effectively should give you about the performance of class-for-name.

Show
Alex Miller added a comment - The two patches both essentially cut off the same hot code path, just at different points (class-for-name is earlier), so applying them both effectively should give you about the performance of class-for-name.
Alex Miller made changes -
Attachment clj-1529.png [ 13403 ]
Hide
Alex Miller added a comment -

Added a picture of the data for easier consumption.

Show
Alex Miller added a comment - Added a picture of the data for easier consumption.
Hide
Deepak Giridharagopal added a comment -

One of our bigger projects saw a reduction of startup time of 16% with class-for-name, 14% with maybe-cache, and a whopping 23% with both patches applied. This was actually starting up the program, as opposed to running "lein test", FWIW.

Maybe it's worth re-running the benchmarks with a "both-patches" variant?

Show
Deepak Giridharagopal added a comment - One of our bigger projects saw a reduction of startup time of 16% with class-for-name, 14% with maybe-cache, and a whopping 23% with both patches applied. This was actually starting up the program, as opposed to running "lein test", FWIW. Maybe it's worth re-running the benchmarks with a "both-patches" variant?
Hide
Alex Miller added a comment -

Hey Deepak, I did actually run some of them with both patches and saw times similar to class-for-name.

Were your times consistent across restarts? The times in the data above are the best of 3 trials for every data point (although they were relatively consistent).

Show
Alex Miller added a comment - Hey Deepak, I did actually run some of them with both patches and saw times similar to class-for-name. Were your times consistent across restarts? The times in the data above are the best of 3 trials for every data point (although they were relatively consistent).
Hide
Deepak Giridharagopal added a comment -

Hi Alex, the tests I ran did 20-iteration loops, and I took the mean (though it was pretty consistent between restarts). I can redo stuff and upload the raw data for you if that will help.

Show
Deepak Giridharagopal added a comment - Hi Alex, the tests I ran did 20-iteration loops, and I took the mean (though it was pretty consistent between restarts). I can redo stuff and upload the raw data for you if that will help.
Hide
Deepak Giridharagopal added a comment -

So repeating the experiment several times does in fact behave as you suspected...apologies for my previous LOLDATA.

Show
Deepak Giridharagopal added a comment - So repeating the experiment several times does in fact behave as you suspected...apologies for my previous LOLDATA.
Hide
Alex Miller added a comment -

maybe-class-cache-2.patch removes some debugging stuff

Show
Alex Miller added a comment - maybe-class-cache-2.patch removes some debugging stuff
Alex Miller made changes -
Attachment maybe-class-cache-2.patch [ 13441 ]
Alex Miller made changes -
Attachment clj-1529-no-cache.diff [ 13449 ]
Attachment clj-1529-with-cache.diff [ 13450 ]
Hide
Alex Miller added a comment -

I've done more testing and made mods to both patches and moved them closer together.

On the maybe-class-cache patch (new version = clj-1529-with-cache.diff):
1) I found that adding a final else branch that returned null was an improvement - this avoids caching things that will never hit in the future (Cons, PersistentList, Symbols with namespaces, etc). That's both a perf improvement (avoids hashing those things) and a memory improvement.
2) The tagToClass improvement from Zach's patch is orthogonal and also valid here so I added it.
3) I added Zach's check, but moved the placement lower so that it doesn't alter semantics. It helps avoid caching locals that aren't classes.

On the class-for-name patch (new version = clj-1529-no-cache.diff):
1) Same change as #3 above - moved check lower to avoid semantics change.

With these changes, both patches have tagToClass and local checks, neither patch changes semantics, and the only difference is whether to keep or remove the cache.

aleph timings (for "lein test"):

  • 1.7 master = 25.415 s
  • 1.7 + clj-1529-with-cache.diff = 14.329 s
  • 1.7 + clj-1529-no-cache.diff = 14.808 s

lamina timings (for "lein test"):

  • 1.7 master = 37.340 s
  • 1.7 + clj-1529-with-cache.diff = 28.680 s
  • 1.7 + clj-1529-no-cache.diff = 28.759 s

The cache helps slightly in both cases, but it does not seem worth adding the new dynamic var and unbounded memory use inherent in the cache.

Show
Alex Miller added a comment - I've done more testing and made mods to both patches and moved them closer together. On the maybe-class-cache patch (new version = clj-1529-with-cache.diff): 1) I found that adding a final else branch that returned null was an improvement - this avoids caching things that will never hit in the future (Cons, PersistentList, Symbols with namespaces, etc). That's both a perf improvement (avoids hashing those things) and a memory improvement. 2) The tagToClass improvement from Zach's patch is orthogonal and also valid here so I added it. 3) I added Zach's check, but moved the placement lower so that it doesn't alter semantics. It helps avoid caching locals that aren't classes. On the class-for-name patch (new version = clj-1529-no-cache.diff): 1) Same change as #3 above - moved check lower to avoid semantics change. With these changes, both patches have tagToClass and local checks, neither patch changes semantics, and the only difference is whether to keep or remove the cache. aleph timings (for "lein test"):
  • 1.7 master = 25.415 s
  • 1.7 + clj-1529-with-cache.diff = 14.329 s
  • 1.7 + clj-1529-no-cache.diff = 14.808 s
lamina timings (for "lein test"):
  • 1.7 master = 37.340 s
  • 1.7 + clj-1529-with-cache.diff = 28.680 s
  • 1.7 + clj-1529-no-cache.diff = 28.759 s
The cache helps slightly in both cases, but it does not seem worth adding the new dynamic var and unbounded memory use inherent in the cache.
Alex Miller made changes -
Approval Incomplete [ 10006 ] Vetted [ 10003 ]
Alex Miller made changes -
Attachment clj-1529-no-cache-2.diff [ 13464 ]
Description Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

This patch improves compilation time from 18 seconds to 7 seconds. The gain is exaggerated by the number of macros I use, but I would expect at least 50% improvements across a wide variety of codebases.

This patch does introduce a slight semantic change by privileging lexical scope over classnames. Consider this code:

    (let [String "foo"]
      (. String substring 0 1))

Previously, this would be treated as a static call to 'java.lang.String', but with the patch would be treated as a call to the lexical variable 'String'. Since the new semantic is what I (and I think everyone else) would have expected in the first place, it's probably very likely that no one is shadowing classes with their variable names, since someone would have complained about this. If anyone feels this is at all risky, however, I'm happy to discuss it further.


[1] https://github.com/ztellman/aleph
Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

*Proposed:* Avoid calling Class.forName() on non-namespaced symbols that do not contain "." or start with "[", don't map to a Class in the ns, and *are* names in the current local env. Also, adjust the ordering of checks in tagToClass to check for hints before checking for class.

*Patch:* clj-1529-no-cache-2.diff

*Screened by:* Alex Miller [Note that the latest variant of the patch moves the check from the original patch farther down in the logic to avoid changing the semantics. This still yields virtually all of the performance gains. See comments for details.]

[1] https://github.com/ztellman/aleph
Approval Vetted [ 10003 ] Screened [ 10004 ]
Alex Miller made changes -
Description Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

*Proposed:* Avoid calling Class.forName() on non-namespaced symbols that do not contain "." or start with "[", don't map to a Class in the ns, and *are* names in the current local env. Also, adjust the ordering of checks in tagToClass to check for hints before checking for class.

*Patch:* clj-1529-no-cache-2.diff

*Screened by:* Alex Miller [Note that the latest variant of the patch moves the check from the original patch farther down in the logic to avoid changing the semantics. This still yields virtually all of the performance gains. See comments for details.]

[1] https://github.com/ztellman/aleph
Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

*Proposed:* Avoid calling Class.forName() on non-namespaced symbols that do not contain "." or start with "[", don't map to a Class in the ns, and *are* names in the current local env. Also, adjust the ordering of checks in tagToClass to check for hints before checking for class.

[Note that the latest variant of the patch moves the check from the original patch farther down in the logic to avoid changing the semantics. This still yields virtually all of the performance gains. See comments for details.]

*Patch:* clj-1529-no-cache-2.diff

*Screened by:*

[1] https://github.com/ztellman/aleph
Approval Screened [ 10004 ] Vetted [ 10003 ]
Hide
Alex Miller added a comment -

Talked to Rich, focusing on no-cache patch. Added new version that fixes tabbing and restores Zach's name to the patch, which seems appropriate.

Show
Alex Miller added a comment - Talked to Rich, focusing on no-cache patch. Added new version that fixes tabbing and restores Zach's name to the patch, which seems appropriate.
Stuart Halloway made changes -
Description Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

*Proposed:* Avoid calling Class.forName() on non-namespaced symbols that do not contain "." or start with "[", don't map to a Class in the ns, and *are* names in the current local env. Also, adjust the ordering of checks in tagToClass to check for hints before checking for class.

[Note that the latest variant of the patch moves the check from the original patch farther down in the logic to avoid changing the semantics. This still yields virtually all of the performance gains. See comments for details.]

*Patch:* clj-1529-no-cache-2.diff

*Screened by:*

[1] https://github.com/ztellman/aleph
Compilation speed has been a real problem for a number of my projects, especially Aleph [1], which in 1.6 takes 18 seconds to load. Recently I realized that Class.forName is being called repeatedly on symbols which are lexically bound. Hits on Class.forName are cached, but misses are allowed to go through each time, which translates into tens of thousands of calls after calling `(use 'aleph.http)`.

*Proposed:* Avoid calling Class.forName() on non-namespaced symbols that do not contain "." or start with "[", don't map to a Class in the ns, and *are* names in the current local env. Also, adjust the ordering of checks in tagToClass to check for hints before checking for class.

[Note that the latest variant of the patch moves the check from the original patch farther down in the logic to avoid changing the semantics. This still yields virtually all of the performance gains. See comments for details.]

*Patch:* clj-1529-no-cache-2.diff

*Screened by:* Stu Halloway. Note that for this change the patch ended up being so small it is easier follow the code than the prose description.

[1] https://github.com/ztellman/aleph
Approval Vetted [ 10003 ] Screened [ 10004 ]
Rich Hickey made changes -
Approval Screened [ 10004 ] Ok [ 10007 ]
Stuart Halloway made changes -
Status Open [ 1 ] Closed [ 6 ]
Resolution Completed [ 1 ]

People

Vote (28)
Watch (9)

Dates

  • Created:
    Updated:
    Resolved: