Clojure

Use transients with select-keys if possible

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Minor Minor
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
  • Patch:
    Code

Description

Problem: Currently select-keys uses conj to add entries. If the map is editable, conj! could be used instead to improve select-keys performance.

Additionally keyseq is traversed as a seq but could be traversed via reduce instead, which might be faster.

Approach 1: Use a transient map and conj!, keeping loop/recur
Approach 2: Reimplement select-keys to use reduce instead of loop/recur
Approach 3: Combine approach one and two

selected key size loop/recur transient reduce transient + reduce
1 243 ns 256 ns 161 ns 188 ns
7 1.1 ms
885 ns 454 ns

From these numbers, approach 3 was chosen.

Note: In order to implement select-keys in terms of reduce, select-keys needed to be moved until after the definition of reduce. This forced a (declare select-keys) since it's used before the definition of reduce

Patch: 0001-CLJ-1789-Use-transients-and-reduce-with-select-keys.patch

Activity

Hide
Erik Assum added a comment - - edited

Standard Clojure select-keys:

(bench (clojure.core/select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 246382440 in 60 samples of 4106374 calls.
             Execution time mean : 243.245536 ns
    Execution time std-deviation : 2.714803 ns
   Execution time lower quantile : 238.473675 ns ( 2.5%)
   Execution time upper quantile : 248.544255 ns (97.5%)
                   Overhead used : 1.845047 ns

With transients:

(bench (select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 232727220 in 60 samples of 3878787 calls.
             Execution time mean : 256.937568 ns
    Execution time std-deviation : 10.025123 ns
   Execution time lower quantile : 249.951872 ns ( 2.5%)
   Execution time upper quantile : 276.251590 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 5 outliers in 60 samples (8.3333 %)
	low-severe	 3 (5.0000 %)
	low-mild	 2 (3.3333 %)
 Variance from outliers : 25.4503 % Variance is moderately inflated by outliers

With reduce

(bench (select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 364807860 in 60 samples of 6080131 calls.
             Execution time mean : 161.582833 ns
    Execution time std-deviation : 2.212659 ns
   Execution time lower quantile : 158.027524 ns ( 2.5%)
   Execution time upper quantile : 167.673682 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 3 outliers in 60 samples (5.0000 %)
	low-severe	 3 (5.0000 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

Reduce + transient

(bench (select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 318075720 in 60 samples of 5301262 calls.
             Execution time mean : 188.656164 ns
    Execution time std-deviation : 3.024952 ns
   Execution time lower quantile : 183.867285 ns ( 2.5%)
   Execution time upper quantile : 195.466784 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 4 outliers in 60 samples (6.6667 %)
	low-severe	 4 (6.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

On bigger map/selection

(bench (clojure.core/select-keys {:a "b" :c "d" :b "b" :d "d" :e "e" :f "f" :g "g"} [:a :c :b :d :e :f :g]))
Evaluation count : 56147160 in 60 samples of 935786 calls.
             Execution time mean : 1.104653 µs
    Execution time std-deviation : 36.366516 ns
   Execution time lower quantile : 1.048257 µs ( 2.5%)
   Execution time upper quantile : 1.142031 µs (97.5%)
                   Overhead used : 1.845047 ns

Found 5 outliers in 60 samples (8.3333 %)
	low-severe	 4 (6.6667 %)
	low-mild	 1 (1.6667 %)
 Variance from outliers : 19.0389 % Variance is moderately inflated by outliers

reduce

(bench (select-keys {:a "b" :c "d" :b "b" :d "d" :e "e" :f "f" :g "g"} [:a :c :b :d :e :f :g]))
Evaluation count : 67723500 in 60 samples of 1128725 calls.
             Execution time mean : 885.840664 ns
    Execution time std-deviation : 11.503115 ns
   Execution time lower quantile : 864.403495 ns ( 2.5%)
   Execution time upper quantile : 905.721942 ns (97.5%)
                   Overhead used : 1.845047 ns

Transient + reduce

(bench (select-keys {:a "b" :c "d" :b "b" :d "d" :e "e" :f "f" :g "g"} [:a :c :b :d :e :f :g]))
Evaluation count : 134119380 in 60 samples of 2235323 calls.
             Execution time mean : 454.587795 ns
    Execution time std-deviation : 15.681611 ns
   Execution time lower quantile : 439.822498 ns ( 2.5%)
   Execution time upper quantile : 485.797378 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 3 outliers in 60 samples (5.0000 %)
	low-severe	 3 (5.0000 %)
 Variance from outliers : 20.6393 % Variance is moderately inflated by outliers

The attached patch is using both transients and reduce

Show
Erik Assum added a comment - - edited Standard Clojure select-keys:
(bench (clojure.core/select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 246382440 in 60 samples of 4106374 calls.
             Execution time mean : 243.245536 ns
    Execution time std-deviation : 2.714803 ns
   Execution time lower quantile : 238.473675 ns ( 2.5%)
   Execution time upper quantile : 248.544255 ns (97.5%)
                   Overhead used : 1.845047 ns
With transients:
(bench (select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 232727220 in 60 samples of 3878787 calls.
             Execution time mean : 256.937568 ns
    Execution time std-deviation : 10.025123 ns
   Execution time lower quantile : 249.951872 ns ( 2.5%)
   Execution time upper quantile : 276.251590 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 5 outliers in 60 samples (8.3333 %)
	low-severe	 3 (5.0000 %)
	low-mild	 2 (3.3333 %)
 Variance from outliers : 25.4503 % Variance is moderately inflated by outliers
With reduce
(bench (select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 364807860 in 60 samples of 6080131 calls.
             Execution time mean : 161.582833 ns
    Execution time std-deviation : 2.212659 ns
   Execution time lower quantile : 158.027524 ns ( 2.5%)
   Execution time upper quantile : 167.673682 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 3 outliers in 60 samples (5.0000 %)
	low-severe	 3 (5.0000 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Reduce + transient
(bench (select-keys {:a "b" :c "d"} [:a]))
Evaluation count : 318075720 in 60 samples of 5301262 calls.
             Execution time mean : 188.656164 ns
    Execution time std-deviation : 3.024952 ns
   Execution time lower quantile : 183.867285 ns ( 2.5%)
   Execution time upper quantile : 195.466784 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 4 outliers in 60 samples (6.6667 %)
	low-severe	 4 (6.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
On bigger map/selection
(bench (clojure.core/select-keys {:a "b" :c "d" :b "b" :d "d" :e "e" :f "f" :g "g"} [:a :c :b :d :e :f :g]))
Evaluation count : 56147160 in 60 samples of 935786 calls.
             Execution time mean : 1.104653 µs
    Execution time std-deviation : 36.366516 ns
   Execution time lower quantile : 1.048257 µs ( 2.5%)
   Execution time upper quantile : 1.142031 µs (97.5%)
                   Overhead used : 1.845047 ns

Found 5 outliers in 60 samples (8.3333 %)
	low-severe	 4 (6.6667 %)
	low-mild	 1 (1.6667 %)
 Variance from outliers : 19.0389 % Variance is moderately inflated by outliers
reduce
(bench (select-keys {:a "b" :c "d" :b "b" :d "d" :e "e" :f "f" :g "g"} [:a :c :b :d :e :f :g]))
Evaluation count : 67723500 in 60 samples of 1128725 calls.
             Execution time mean : 885.840664 ns
    Execution time std-deviation : 11.503115 ns
   Execution time lower quantile : 864.403495 ns ( 2.5%)
   Execution time upper quantile : 905.721942 ns (97.5%)
                   Overhead used : 1.845047 ns
Transient + reduce
(bench (select-keys {:a "b" :c "d" :b "b" :d "d" :e "e" :f "f" :g "g"} [:a :c :b :d :e :f :g]))
Evaluation count : 134119380 in 60 samples of 2235323 calls.
             Execution time mean : 454.587795 ns
    Execution time std-deviation : 15.681611 ns
   Execution time lower quantile : 439.822498 ns ( 2.5%)
   Execution time upper quantile : 485.797378 ns (97.5%)
                   Overhead used : 1.845047 ns

Found 3 outliers in 60 samples (5.0000 %)
	low-severe	 3 (5.0000 %)
 Variance from outliers : 20.6393 % Variance is moderately inflated by outliers
The attached patch is using both transients and reduce
Hide
Alex Miller added a comment -

The proposed approach seems good to me. The description needs to reflect what was considered and chosen better.

Show
Alex Miller added a comment - The proposed approach seems good to me. The description needs to reflect what was considered and chosen better.

People

Vote (1)
Watch (2)

Dates

  • Created:
    Updated: