ClojureScript

Re-precating ObjMap

Details

  • Type: Enhancement Enhancement
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Completed
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None
  • Patch:
    Code

Description

The attached patch (to be applied on top of the CLJS-189 patch – otherwise tests don't pass, and for good reason) removes the deprecation notice on ObjMap and reintroduces the code for emitting ObjMaps in the compiler as per yesterday's discussion on #clojure (see http://clojure-log.n01se.net/date/2012-04-20.html#18:58a).

With this patch applied, ObjMaps will be automagically converted to PersistentHashMaps once a they grow beyond 16 entries. The compiler will also only omit ObjMap creation code for small maps (up to 16 entries).

ObjMap is meant to provide native-JS-object-level access speed in ClojureScript, particularly for literal maps with literal simple keys. The behaviour of the latter kind of maps was not tested at all in the jsPerf test suite on the PHM patch; links to new tests with literals forthcoming.

Activity

Hide
Michał Marczyk added a comment -

New patch against current master.

Here's a jsPerf test:

http://jsperf.com/cljs-literal-maps-4

ObjMap does indeed win on access time, but incredibly the ObjMap created from a literal seems slower than the one created by associng the keys onto cljs.core.ObjMap/EMPTY. I'll think about possible sources of systematic error.

The prep code was generated using https://github.com/michalmarczyk/cljs-persistent-hash-map-test (current commit sha: 0812c5cd76d78060605829e3e78eb3d841fa3238).

Show
Michał Marczyk added a comment - New patch against current master. Here's a jsPerf test: http://jsperf.com/cljs-literal-maps-4 ObjMap does indeed win on access time, but incredibly the ObjMap created from a literal seems slower than the one created by associng the keys onto cljs.core.ObjMap/EMPTY. I'll think about possible sources of systematic error. The prep code was generated using https://github.com/michalmarczyk/cljs-persistent-hash-map-test (current commit sha: 0812c5cd76d78060605829e3e78eb3d841fa3238).
Hide
Michał Marczyk added a comment -

NB. the last patch effectively breaks transient support (and fails to take advantage of it for OM -> PHM conversion), next patch will be more usable. Tests on larger maps as well as conversion time benchmarks forthcoming.

Show
Michał Marczyk added a comment - NB. the last patch effectively breaks transient support (and fails to take advantage of it for OM -> PHM conversion), next patch will be more usable. Tests on larger maps as well as conversion time benchmarks forthcoming.
Hide
David Nolen added a comment -

I think as soon as this ticket is done we should cut a new CLJS release.

Show
David Nolen added a comment - I think as soon as this ticket is done we should cut a new CLJS release.
Hide
Michał Marczyk added a comment -

Applies cleanly on top of the CLJS-213 patch for ease of perf testing; if CLJS-213 turns out to be unnecessary, I'll prepare a version w/o PAM.

Branch @ https://github.com/michalmarczyk/clojurescript/tree/reprecating-objmap-4

Show
Michał Marczyk added a comment - Applies cleanly on top of the CLJS-213 patch for ease of perf testing; if CLJS-213 turns out to be unnecessary, I'll prepare a version w/o PAM. Branch @ https://github.com/michalmarczyk/clojurescript/tree/reprecating-objmap-4
Hide
Michał Marczyk added a comment -

OM vs. PAM vs. PHM on four-entry maps:

http://jsperf.com/cljs-maps-access-4

As usual, prep code built using

https://github.com/michalmarczyk/cljs-persistent-hash-map-test

Show
Michał Marczyk added a comment - OM vs. PAM vs. PHM on four-entry maps: http://jsperf.com/cljs-maps-access-4 As usual, prep code built using https://github.com/michalmarczyk/cljs-persistent-hash-map-test
Hide
Michał Marczyk added a comment -

PAM vs. PHM on four-entry maps with numeric keys:

http://jsperf.com/cljs-complex-maps-access-4

Will post a larger batch of tests next.

Show
Michał Marczyk added a comment - PAM vs. PHM on four-entry maps with numeric keys: http://jsperf.com/cljs-complex-maps-access-4 Will post a larger batch of tests next.
Hide
Michał Marczyk added a comment -

Oh, as a quick summary, PAM is over 5x as fast as PHM on that last test in Chrome. OM is by far the fastest for string keys, but we knew that already. Conversion time tests to be posted.

Show
Michał Marczyk added a comment - Oh, as a quick summary, PAM is over 5x as fast as PHM on that last test in Chrome. OM is by far the fastest for string keys, but we knew that already. Conversion time tests to be posted.
Hide
Michał Marczyk added a comment - - edited

See this conversation with yoklov on #clojure:

http://clojure-log.n01se.net/date/2012-04-28.html#23:14a

Summary: unscientifically, seems like a speed improvement, possibly comparable to records w/ get. (This is re: CLJS-190 + CLJS-213.)

Show
Michał Marczyk added a comment - - edited See this conversation with yoklov on #clojure: http://clojure-log.n01se.net/date/2012-04-28.html#23:14a Summary: unscientifically, seems like a speed improvement, possibly comparable to records w/ get. (This is re: CLJS-190 + CLJS-213.)
Hide
Michał Marczyk added a comment -

Three new jsPerf tests with summaries of initial results in Chrome:

http://jsperf.com/cljs-maps-access-16

^- ObjMap still fastest by far, PAM less than twice as fast as PHM.

http://jsperf.com/cljs-maps-conversion-16

^- Identical time for assoc at conversion threshold. I'm just surprised to see it comes out as exactly equal, I wouldn't expect a particularly significant difference at this size. (I'll note that lein trampoline cljsbuild repl-rhino assures me that (aget phm.samples.strpams 16) is a PAM etc. though.)

http://jsperf.com/cljs-maps-assoc-12

^- Apparently it's impossible to beat PHM for assoc on V8. I still can't quite believe it.

Show
Michał Marczyk added a comment - Three new jsPerf tests with summaries of initial results in Chrome: http://jsperf.com/cljs-maps-access-16 ^- ObjMap still fastest by far, PAM less than twice as fast as PHM. http://jsperf.com/cljs-maps-conversion-16 ^- Identical time for assoc at conversion threshold. I'm just surprised to see it comes out as exactly equal, I wouldn't expect a particularly significant difference at this size. (I'll note that lein trampoline cljsbuild repl-rhino assures me that (aget phm.samples.strpams 16) is a PAM etc. though.) http://jsperf.com/cljs-maps-assoc-12 ^- Apparently it's impossible to beat PHM for assoc on V8. I still can't quite believe it.
Hide
Michał Marczyk added a comment -

Some more jsPerf tests w/ summaries:

http://jsperf.com/cljs-maps-access-64
http://jsperf.com/cljs-maps-access-128
http://jsperf.com/cljs-maps-access-256
http://jsperf.com/cljs-maps-access-512

^- Here all gets are for keys actually present in the map. Contrary to earlier results, ObjMap wins in all cases. I'm not sure what wrong with the earlier test @ http://jsperf.com/cljs-persistent-hash-map-access – or perhaps with tests in this batch – I'll need to look into that... Any ideas?

http://jsperf.com/cljs-maps-access-128-with-default

^- Here half of the get attempts are misses and PHM wins (though by a significantly smaller margin than OM in the "hit" case).

http://jsperf.com/cljs-maps-assoc-8
http://jsperf.com/cljs-maps-assoc-16
http://jsperf.com/cljs-maps-assoc-32

^- Here, on the other hand, the results are as usual – PHMs run circles around PAMs and OMs (with PAMs noticeably faster than OMs at these sizes).

In light of the above, picking PAM -> PHM conversion threshold between 8 and 16 seems reasonable, however I'm less certain about the correct one for OM -> PHM. In fact, I'm wondering if it might make sense to make this a knob in the compiler? (I'll probably make it knob-like in a cleaned-up patch anyway, so this is mostly a question about advertising this to users.) On a related note: should user code be able to request a particular map type (in a way preventing automatic conversion at that particular time? for any maps "descended from" that one?). This could be implemented e.g. with a flag stored on the map instance.

Show
Michał Marczyk added a comment - Some more jsPerf tests w/ summaries: http://jsperf.com/cljs-maps-access-64 http://jsperf.com/cljs-maps-access-128 http://jsperf.com/cljs-maps-access-256 http://jsperf.com/cljs-maps-access-512 ^- Here all gets are for keys actually present in the map. Contrary to earlier results, ObjMap wins in all cases. I'm not sure what wrong with the earlier test @ http://jsperf.com/cljs-persistent-hash-map-access – or perhaps with tests in this batch – I'll need to look into that... Any ideas? http://jsperf.com/cljs-maps-access-128-with-default ^- Here half of the get attempts are misses and PHM wins (though by a significantly smaller margin than OM in the "hit" case). http://jsperf.com/cljs-maps-assoc-8 http://jsperf.com/cljs-maps-assoc-16 http://jsperf.com/cljs-maps-assoc-32 ^- Here, on the other hand, the results are as usual – PHMs run circles around PAMs and OMs (with PAMs noticeably faster than OMs at these sizes). In light of the above, picking PAM -> PHM conversion threshold between 8 and 16 seems reasonable, however I'm less certain about the correct one for OM -> PHM. In fact, I'm wondering if it might make sense to make this a knob in the compiler? (I'll probably make it knob-like in a cleaned-up patch anyway, so this is mostly a question about advertising this to users.) On a related note: should user code be able to request a particular map type (in a way preventing automatic conversion at that particular time? for any maps "descended from" that one?). This could be implemented e.g. with a flag stored on the map instance.
Hide
Michał Marczyk added a comment -

Some huge map lookup tests:

http://jsperf.com/cljs-maps-access-10000
http://jsperf.com/cljs-maps-access-100000
http://jsperf.com/cljs-maps-access-1000000

ObjMap still ahead, though by a much smaller margin in the latter two cases. Still not sure what's the problem with the original lookup test. :-/

Show
Michał Marczyk added a comment - Some huge map lookup tests: http://jsperf.com/cljs-maps-access-10000 http://jsperf.com/cljs-maps-access-100000 http://jsperf.com/cljs-maps-access-1000000 ObjMap still ahead, though by a much smaller margin in the latter two cases. Still not sure what's the problem with the original lookup test. :-/
Hide
David Nolen added a comment -

Can you create a new version of the old PHM tests with these patches applied to double check that we're not seeing a perf regression?

Show
David Nolen added a comment - Can you create a new version of the old PHM tests with these patches applied to double check that we're not seeing a perf regression?
Hide
David Nolen added a comment -

I'm not sure about exposing the knob. It's also not clear to me that PAM is much of a win. Given that ObjMap may very well trump PHM for access times even at relatively large sizes perhaps we should consider ObjMap's type changing based on the number of updates and not size? 32?

Related, perhaps we should have an obj-map constructor fn which creates an ObjMap which will never upgrade to PHM. TransientObjMap + obj-map ctor could be a performance win for certain kinds of programs?

Show
David Nolen added a comment - I'm not sure about exposing the knob. It's also not clear to me that PAM is much of a win. Given that ObjMap may very well trump PHM for access times even at relatively large sizes perhaps we should consider ObjMap's type changing based on the number of updates and not size? 32? Related, perhaps we should have an obj-map constructor fn which creates an ObjMap which will never upgrade to PHM. TransientObjMap + obj-map ctor could be a performance win for certain kinds of programs?
Hide
Michał Marczyk added a comment -

Well, I think I've recreated all the original tests already, except I've left out the COW HashMap. The cljs-maps-assoc-* series indicate PHMs trounce other map types on assoc as soundly as they appeared to in the original tests. Or have I missed something?

The lookup performance issue seems weird to me, since I really don't see any obvious methodological errors in either the old or the new tests. Clearly that's something still in need of attention.

As for ObjMap changing type based on the number of updates – that's an interesting idea, though at large sizes it could result in a sudden pretty significant road bump at conversion time; perhaps too much of a surprise for regular code? Also, this would basically result in switching at some size in range [updates-threshold, compiler-obj-map-emit-threshold + updates-threshold] unless the compiler always emits ObjMaps for the simple keys case with no size thresholding, which seems problematic. Of course the "switch in size range" behaviour might be a good idea – I wonder how to benchmark that though...

Conversion-preventing constructor – that's what I meant by requesting a particular map type. We could use just the one constructor by adding an extra field with a flag (to be checked only when a conversion would occur with the current arrangements) – clearly that's an extra minor hit at every assoc to a implicit-conversion-free map, but those should probably not be updated anyway! (Except as transients, see below.)

Transient support – I've got TransientObjMap already implemented (with no implicit conversions), I'll tidy it up a bit for possible inclusion.

NB. folks who need top lookup perf for strings can already use objects directly or call cljs.core.ObjMap/fromObject (which always constructs an OM – no risk of implicit conversion).

About PAMs – they seem to offer a significant lookup speed boost at small sizes and no particular gains (even the opposite) at larger sizes. That's about the same effect as that on the JVM, so maybe it makes sense to keep them around for very small sizes only (~10)? I've no particularly strong opinion on this one for now, to be honest; I'd love to know if yoklov's speed gains had anything to do with PAMs or was it just the ObjMaps at small sizes (note the report was about the patches with both conversion thresholds set to 16).

Sorry for the chaotic writing, I need to think about this some more...

Show
Michał Marczyk added a comment - Well, I think I've recreated all the original tests already, except I've left out the COW HashMap. The cljs-maps-assoc-* series indicate PHMs trounce other map types on assoc as soundly as they appeared to in the original tests. Or have I missed something? The lookup performance issue seems weird to me, since I really don't see any obvious methodological errors in either the old or the new tests. Clearly that's something still in need of attention. As for ObjMap changing type based on the number of updates – that's an interesting idea, though at large sizes it could result in a sudden pretty significant road bump at conversion time; perhaps too much of a surprise for regular code? Also, this would basically result in switching at some size in range [updates-threshold, compiler-obj-map-emit-threshold + updates-threshold] unless the compiler always emits ObjMaps for the simple keys case with no size thresholding, which seems problematic. Of course the "switch in size range" behaviour might be a good idea – I wonder how to benchmark that though... Conversion-preventing constructor – that's what I meant by requesting a particular map type. We could use just the one constructor by adding an extra field with a flag (to be checked only when a conversion would occur with the current arrangements) – clearly that's an extra minor hit at every assoc to a implicit-conversion-free map, but those should probably not be updated anyway! (Except as transients, see below.) Transient support – I've got TransientObjMap already implemented (with no implicit conversions), I'll tidy it up a bit for possible inclusion. NB. folks who need top lookup perf for strings can already use objects directly or call cljs.core.ObjMap/fromObject (which always constructs an OM – no risk of implicit conversion). About PAMs – they seem to offer a significant lookup speed boost at small sizes and no particular gains (even the opposite) at larger sizes. That's about the same effect as that on the JVM, so maybe it makes sense to keep them around for very small sizes only (~10)? I've no particularly strong opinion on this one for now, to be honest; I'd love to know if yoklov's speed gains had anything to do with PAMs or was it just the ObjMaps at small sizes (note the report was about the patches with both conversion thresholds set to 16). Sorry for the chaotic writing, I need to think about this some more...
Hide
Michał Marczyk added a comment -

Patch against CLJS-213. Uses the update count approach as discussed on IRC, with a threshold of 32.

Show
Michał Marczyk added a comment - Patch against CLJS-213. Uses the update count approach as discussed on IRC, with a threshold of 32.
Hide
Michał Marczyk added a comment -

New patch w/ improved commit message (mentions implicit conversion to PHM in (transient obj-map) calls).

Show
Michał Marczyk added a comment - New patch w/ improved commit message (mentions implicit conversion to PHM in (transient obj-map) calls).

People

Vote (0)
Watch (0)

Dates

  • Created:
    Updated:
    Resolved: