Clojure

Use transient map in zipmap

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: Release 1.5
  • Fix Version/s: None
  • Component/s: None
  • Labels:
  • Patch:
    Code
  • Approval:
    Triaged

Description

The attached patch changes zipmap to use a transient map internally. The definition is also moved so that it resides below that of #'transient. The original definition is commented out (like that of #'into).

Activity

Rich Hickey made changes -
Field Original Value New Value
Approval Vetted [ 10003 ]
Fix Version/s Release 1.5 [ 10150 ]
Hide
Aaron Bedra added a comment -

Why is the old implementation left and commented out? If we are going to move to a new implementation, the old one should be removed.

Show
Aaron Bedra added a comment - Why is the old implementation left and commented out? If we are going to move to a new implementation, the old one should be removed.
Aaron Bedra made changes -
Approval Vetted [ 10003 ] Incomplete [ 10006 ]
Hide
Michał Marczyk added a comment -

As mentioned in the ticket description, the previously attached patch follows the pattern of into whose non-transient-enabled definition is left in core.clj with a #_ in front – I wasn't sure if that's something desirable in all cases.

Here's a new patch with the old impl removed.

Show
Michał Marczyk added a comment - As mentioned in the ticket description, the previously attached patch follows the pattern of into whose non-transient-enabled definition is left in core.clj with a #_ in front – I wasn't sure if that's something desirable in all cases. Here's a new patch with the old impl removed.
Michał Marczyk made changes -
Attachment 0001-Use-transient-map-in-zipmap.patch [ 11428 ]
Hide
Andy Fingerhut added a comment -

Thanks for the updated patch, Michal. Sorry to raise such a minor issue, but would you mind using a different name for the updated patch? I know JIRA can handle multiple attached files with the same name, but my prescreening code isn't quite that talented yet, and it can lead to confusion when discussing patches.

Show
Andy Fingerhut added a comment - Thanks for the updated patch, Michal. Sorry to raise such a minor issue, but would you mind using a different name for the updated patch? I know JIRA can handle multiple attached files with the same name, but my prescreening code isn't quite that talented yet, and it can lead to confusion when discussing patches.
Michał Marczyk made changes -
Attachment 0001-Use-transient-map-in-zipmap.patch [ 11428 ]
Michał Marczyk made changes -
Hide
Michał Marczyk added a comment -

Thanks for the heads-up, Andy! I've reattached the new patch under a new name.

Show
Michał Marczyk added a comment - Thanks for the heads-up, Andy! I've reattached the new patch under a new name.
Hide
Andy Fingerhut added a comment -

Presumptuously changing Approval from Incomplete back to None after the Michal's updated patch was added, addressing the reason the ticket was marked incomplete.

Show
Andy Fingerhut added a comment - Presumptuously changing Approval from Incomplete back to None after the Michal's updated patch was added, addressing the reason the ticket was marked incomplete.
Andy Fingerhut made changes -
Approval Incomplete [ 10006 ]
Stuart Halloway made changes -
Fix Version/s Release 1.5 [ 10150 ]
Fix Version/s Release 1.6 [ 10157 ]
Aaron Bedra made changes -
Comment [ Is there a reason why the function moved inside of core.clj? ]
Aaron Bedra made changes -
Comment [ Never mind. I just took a look and realized that transient wasn't available yet. Sorry. ]
Hide
Aaron Bedra added a comment -

The patch looks good and applies cleanly. Are there additional tests that we should run to verify that this is providing the improvement we think it is. Also, is there a discussion somewhere that started this ticket? There isn't a lot of context here.

Show
Aaron Bedra added a comment - The patch looks good and applies cleanly. Are there additional tests that we should run to verify that this is providing the improvement we think it is. Also, is there a discussion somewhere that started this ticket? There isn't a lot of context here.
Aaron Bedra made changes -
Assignee Michał Marczyk [ michalmarczyk ] Aaron Bedra [ aaron ]
Aaron Bedra made changes -
Affects Version/s Release 1.6 [ 10157 ]
Hide
Michał Marczyk added a comment -

Hi Aaron,

Thanks for looking into this!

From what I've been able to observe, this change hugely improves zipmap times for large maps. For small maps, there is a small improvement. Here are two basic Criterium benchmarks (transient-zipmap defined at the REPL as in the patch):

;;; large map
user=> (def xs (range 16384))
#'user/xs
user=> (last xs)
16383
user=> (c/bench (zipmap xs xs))
Evaluation count : 13920 in 60 samples of 232 calls.
             Execution time mean : 4.329635 ms
    Execution time std-deviation : 77.791989 us
   Execution time lower quantile : 4.215050 ms ( 2.5%)
   Execution time upper quantile : 4.494120 ms (97.5%)
nil
user=> (c/bench (transient-zipmap xs xs))
Evaluation count : 21180 in 60 samples of 353 calls.
             Execution time mean : 2.818339 ms
    Execution time std-deviation : 110.751493 us
   Execution time lower quantile : 2.618971 ms ( 2.5%)
   Execution time upper quantile : 3.025812 ms (97.5%)

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 25.4675 % Variance is moderately inflated by outliers
nil

;;; small map
user=> (def ys (range 16))
#'user/ys
user=> (last ys)
15
user=> (c/bench (zipmap ys ys))
Evaluation count : 16639020 in 60 samples of 277317 calls.
             Execution time mean : 3.803683 us
    Execution time std-deviation : 88.431220 ns
   Execution time lower quantile : 3.638146 us ( 2.5%)
   Execution time upper quantile : 3.935160 us (97.5%)
nil
user=> (c/bench (transient-zipmap ys ys))
Evaluation count : 18536880 in 60 samples of 308948 calls.
             Execution time mean : 3.412992 us
    Execution time std-deviation : 81.338284 ns
   Execution time lower quantile : 3.303888 us ( 2.5%)
   Execution time upper quantile : 3.545549 us (97.5%)
nil

Clearly the semantics are preserved provided transients satisfy their contract.

I think I might not have started a ggroup thread for this, sorry.

Show
Michał Marczyk added a comment - Hi Aaron, Thanks for looking into this! From what I've been able to observe, this change hugely improves zipmap times for large maps. For small maps, there is a small improvement. Here are two basic Criterium benchmarks (transient-zipmap defined at the REPL as in the patch):
;;; large map
user=> (def xs (range 16384))
#'user/xs
user=> (last xs)
16383
user=> (c/bench (zipmap xs xs))
Evaluation count : 13920 in 60 samples of 232 calls.
             Execution time mean : 4.329635 ms
    Execution time std-deviation : 77.791989 us
   Execution time lower quantile : 4.215050 ms ( 2.5%)
   Execution time upper quantile : 4.494120 ms (97.5%)
nil
user=> (c/bench (transient-zipmap xs xs))
Evaluation count : 21180 in 60 samples of 353 calls.
             Execution time mean : 2.818339 ms
    Execution time std-deviation : 110.751493 us
   Execution time lower quantile : 2.618971 ms ( 2.5%)
   Execution time upper quantile : 3.025812 ms (97.5%)

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 25.4675 % Variance is moderately inflated by outliers
nil

;;; small map
user=> (def ys (range 16))
#'user/ys
user=> (last ys)
15
user=> (c/bench (zipmap ys ys))
Evaluation count : 16639020 in 60 samples of 277317 calls.
             Execution time mean : 3.803683 us
    Execution time std-deviation : 88.431220 ns
   Execution time lower quantile : 3.638146 us ( 2.5%)
   Execution time upper quantile : 3.935160 us (97.5%)
nil
user=> (c/bench (transient-zipmap ys ys))
Evaluation count : 18536880 in 60 samples of 308948 calls.
             Execution time mean : 3.412992 us
    Execution time std-deviation : 81.338284 ns
   Execution time lower quantile : 3.303888 us ( 2.5%)
   Execution time upper quantile : 3.545549 us (97.5%)
nil
Clearly the semantics are preserved provided transients satisfy their contract. I think I might not have started a ggroup thread for this, sorry.
Alex Miller made changes -
Approval Triaged [ 10120 ]
Affects Version/s Release 1.6 [ 10157 ]
Affects Version/s Release 1.5 [ 10150 ]
Fix Version/s Release 1.6 [ 10157 ]
Alex Miller made changes -
Labels performance

People

Vote (3)
Watch (4)

Dates

  • Created:
    Updated: