Improve `refer` performance
Description
Environment
Attachments
Activity
Alexander Yakushev October 12, 2023 at 3:14 PMEdited
If this issue ever becomes relevant again, here’s my modification to the provided patch. Here’s what I have changed and why:
At least on my machine, building an intermediate map even with transients is more expensive than CASing in every individual var with
Namespace.refer
. So, I reverted the accumulation/batch merge logic.reduce1
doesn’t know how to iterate a PersistentMap efficiently, so I rewrote walking the map to an iterator.
The results are the following (MacBook M1 2020):
Test | 1.12.0-alpha4 | 1.12.0-alpha4+patch |
---|---|---|
(clojure.core/refer 'clojure.core) | 228.20 us Alloc: 553 KB | 59.48 us Alloc: 50 KB |
clj -M -e ‘(System/exit 0)’ (without a project) | 840-880 ms | 815-845 ms |
To measure time, I ran the commands in a loop. On one hand, it’s not entirely fair because for refer
the immediate performance is more important than the optimized post-JIT performance. On the other hand, (refer ‘clojure.core)
is invoked for every namespace, so we care about more than just the first invocation.
The overall improvement of running an empty Clojure program end-to-end roughly corresponds to the 3% contributed by refer
that I see when I profile Clojure startup (attached).

Ghadi Shayban August 2, 2018 at 4:30 AM
I suspect that this will not matter much for peak performance. Given the attached picture of clojure.instant loading, this patch may shave a bit off startup time.
(picture made with Bytestacks, profile recorded a JDK11 debug build from today)

Alex Miller September 7, 2017 at 4:52 PM
Patch updated to address first screening comment. I didn't actually find any case of the second one? Give me a pointer.
Details
Assignee
UnassignedUnassignedReporter
Alex MillerAlex MillerLabels
Approval
VettedPatch
CodePriority
MajorAffects versions
Details
Details
Assignee
Reporter

Labels
Approval
Patch
Priority

refer
underliesrequire
,use
, andrefer-clojure
use cases and is not particularly efficient at its primary job of copying symbol/var mapping from one namespace to another.Approach: Some improvements that can be made:
Go directly to the namespace mappings and avoid creating filtered intermediate maps (ns-publics)
Use transients to build map of references to refer
Instead of cas'ing each new reference individually, build map of all changes, then cas
For (:require :only ...) case - instead of walking all referred vars and looking for matches, walk only the included vars and look up each one
There are undoubtedly more dramatic changes (like immutable namespaces) in how all this works that could further improve performance but I tried to make the scope small-ish for this change.
While individual refer timings are greatly reduced (~50% reduction for (refer clojure.core), ~90% reduction for :only use), refer is only a small component of broader require load times so the improvements in practice are modest.
Performance:
expr in a new repl
1.7.0-beta3
1.7.0-beta3+patch
(in-ns 'foo) (clojure.core/refer 'clojure.core)
2.65 ms
0.994 ms
(in-ns 'bar) (clojure.core/refer 'clojure.core :only '[inc dec])
1.04 ms
0.113 ms
(use 'criterium.core)
0.877 ms
0.762 ms
(require '[clojure.core.async :refer (>!! <!! chan close!)])
3408 ms
3302 ms
Patch: clj-1730-2.patch
Screening Notes
Patch appears correct but we should consider:
non-idiomatic use of
if
instead ofwhen
makes branches hard to readnon-idiomatic indentation of
if
(both branches on one line) hard to readI don't think not found should be an
IllegalAccessError
, but the original code did this already, so ...the optimistic concurrency loop around the swap will never give up, is this ok?