Improve `refer` performance

Description

refer underlies require, use, and refer-clojure use cases and is not particularly efficient at its primary job of copying symbol/var mapping from one namespace to another.

Approach: Some improvements that can be made:

  • Go directly to the namespace mappings and avoid creating filtered intermediate maps (ns-publics)

  • Use transients to build map of references to refer

  • Instead of cas'ing each new reference individually, build map of all changes, then cas

  • For (:require :only ...) case - instead of walking all referred vars and looking for matches, walk only the included vars and look up each one

There are undoubtedly more dramatic changes (like immutable namespaces) in how all this works that could further improve performance but I tried to make the scope small-ish for this change.

While individual refer timings are greatly reduced (~50% reduction for (refer clojure.core), ~90% reduction for :only use), refer is only a small component of broader require load times so the improvements in practice are modest.

Performance:

expr in a new repl

1.7.0-beta3

1.7.0-beta3+patch

(in-ns 'foo) (clojure.core/refer 'clojure.core)

2.65 ms

0.994 ms

(in-ns 'bar) (clojure.core/refer 'clojure.core :only '[inc dec])

1.04 ms

0.113 ms

(use 'criterium.core)

0.877 ms

0.762 ms

(require '[clojure.core.async :refer (>!! <!! chan close!)])

3408 ms

3302 ms

Patch: clj-1730-2.patch

Screening Notes

Patch appears correct but we should consider:

  • non-idiomatic use of if instead of when makes branches hard to read

  • non-idiomatic indentation of if (both branches on one line) hard to read

  • I don't think not found should be an IllegalAccessError, but the original code did this already, so ...

  • the optimistic concurrency loop around the swap will never give up, is this ok?

Environment

None

Attachments

5

Activity

Show:

Alexander Yakushev October 12, 2023 at 3:14 PM
Edited

If this issue ever becomes relevant again, here’s my modification to the provided patch. Here’s what I have changed and why:

  • At least on my machine, building an intermediate map even with transients is more expensive than CASing in every individual var with Namespace.refer. So, I reverted the accumulation/batch merge logic.

  • reduce1 doesn’t know how to iterate a PersistentMap efficiently, so I rewrote walking the map to an iterator.

The results are the following (MacBook M1 2020):

Test

1.12.0-alpha4

1.12.0-alpha4+patch

(clojure.core/refer 'clojure.core)

228.20 us

Alloc: 553 KB

59.48 us

Alloc: 50 KB

clj -M -e ‘(System/exit 0)’

(without a project)

840-880 ms

815-845 ms

To measure time, I ran the commands in a loop. On one hand, it’s not entirely fair because for refer the immediate performance is more important than the optimized post-JIT performance. On the other hand, (refer ‘clojure.core) is invoked for every namespace, so we care about more than just the first invocation.

The overall improvement of running an empty Clojure program end-to-end roughly corresponds to the 3% contributed by refer that I see when I profile Clojure startup (attached).

Ghadi Shayban August 2, 2018 at 4:30 AM

I suspect that this will not matter much for peak performance. Given the attached picture of clojure.instant loading, this patch may shave a bit off startup time.

(picture made with Bytestacks, profile recorded a JDK11 debug build from today)

Alex Miller September 7, 2017 at 4:52 PM

Patch updated to address first screening comment. I didn't actually find any case of the second one? Give me a pointer.

Details

Assignee

Reporter

Approval

Vetted

Patch

Code

Priority

Affects versions

Created May 13, 2015 at 5:52 PM
Updated May 21, 2024 at 3:35 PM