ClojureScript

optimized case

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None

Description

With the advent of asm.js many engines will like compile switch statements over integers into jump tables. We should provide a real `case*` ast node that compiles to JS `switch` when possible - i.e. numbers, strings, keywords etc.

Activity

David Nolen made changes -
Field Original Value New Value
Description With the advent of asm.js many engine will like compile switch statements over integer into jump tables. We should provide a real `case*` ast node that compiles to JS `switch` when possible - i.e. numbers, strings, keywords etc. With the advent of asm.js many engines will like compile switch statements over integers into jump tables. We should provide a real `case*` ast node that compiles to JS `switch` when possible - i.e. numbers, strings, keywords etc.
David Nolen made changes -
Priority Major [ 3 ] Minor [ 4 ]
Michał Marczyk made changes -
Assignee Michał Marczyk [ michalmarczyk ]
Michał Marczyk made changes -
Status Open [ 1 ] In Progress [ 3 ]
Michał Marczyk made changes -
Status In Progress [ 3 ] Open [ 1 ]
Hide
Michał Marczyk added a comment -

First cut impl also available here:

https://github.com/michalmarczyk/clojurescript/tree/713-compile-case-to-switch

With this patch applied, case expressions are compiled to switch + some extra bits when all tests are numbers or strings, otherwise old logic is used.

For example, {{(fn [] (let [x 1] (case x 1 :foo (2 3) :bar :quux)))}} gets compiled to

function () {
    var x = 1;
    var G__6469 = x;
    var caseval__6470;
    switch (G__6469) {
      case 1:
        caseval__6470 = new cljs.core.Keyword(null, "foo", "foo", 1014005816);
        break;
      case 2:
      case 3:
        caseval__6470 = new cljs.core.Keyword(null, "bar", "bar", 1014001541);
        break;
      default:
        caseval__6470 = new cljs.core.Keyword(null, "quux", "quux", 1017386809);
    }
    return caseval__6470;
}

The existing test suite passes, but I suppose it wouldn't hurt to add some tests with case in all possible contexts.

Show
Michał Marczyk added a comment - First cut impl also available here: https://github.com/michalmarczyk/clojurescript/tree/713-compile-case-to-switch With this patch applied, case expressions are compiled to switch + some extra bits when all tests are numbers or strings, otherwise old logic is used. For example, {{(fn [] (let [x 1] (case x 1 :foo (2 3) :bar :quux)))}} gets compiled to
function () {
    var x = 1;
    var G__6469 = x;
    var caseval__6470;
    switch (G__6469) {
      case 1:
        caseval__6470 = new cljs.core.Keyword(null, "foo", "foo", 1014005816);
        break;
      case 2:
      case 3:
        caseval__6470 = new cljs.core.Keyword(null, "bar", "bar", 1014001541);
        break;
      default:
        caseval__6470 = new cljs.core.Keyword(null, "quux", "quux", 1017386809);
    }
    return caseval__6470;
}
The existing test suite passes, but I suppose it wouldn't hurt to add some tests with case in all possible contexts.
Michał Marczyk made changes -
Hide
Michał Marczyk added a comment -

As a next step, I was planning to arrange things so that numbers/strings are fished out from among the tests and compiled to switch always, with any leftover tests put in an old-style nested-ifs-based case under default:. Does this sound good?

It seems to me that to deal with symbols and keywords in a similar manner we'd have to do one of two things:

1. check symbol? and keyword? at the top, then compile separate switches (the one for keywords would extract the name from the given keyword and use strings in the switch);

2. use hashes for dispatch.

Which one sounds better? Or is there a third way?

Show
Michał Marczyk added a comment - As a next step, I was planning to arrange things so that numbers/strings are fished out from among the tests and compiled to switch always, with any leftover tests put in an old-style nested-ifs-based case under default:. Does this sound good? It seems to me that to deal with symbols and keywords in a similar manner we'd have to do one of two things: 1. check symbol? and keyword? at the top, then compile separate switches (the one for keywords would extract the name from the given keyword and use strings in the switch); 2. use hashes for dispatch. Which one sounds better? Or is there a third way?
Hide
Michał Marczyk added a comment -

Of course we'd need to compute hashes statically to go with 2. I'd kind of like it if it were impossible (randomized seed / universal hashing), but currently it isn't.

Show
Michał Marczyk added a comment - Of course we'd need to compute hashes statically to go with 2. I'd kind of like it if it were impossible (randomized seed / universal hashing), but currently it isn't.
Hide
Francis Avila added a comment - - edited

At least on v8, there are surprisingly few cases where a switch statement will be optimized to a jump table. Basically the type of the switched-over value must always (across calls) match the type of every case, and there must be fewer than 128 cases, and integer cases must be 31-bit ints (v8's smi type). So mixing string and number cases in the same switch guarantees the statement will never be compiled. In many cases an equivalent if-else will end up being significantly faster on v8 just because the optimizing jit recognizes them better. There's an oldish bug filed against v8 switch performance. Looking at the many jsperfs of switch statements, it doesn't seem that v8 has improved. Relevant jsperf

Firefox is much better at optimizing switch statements (maybe because of their asm.js/emscripten work) but I don't know what conditions trigger (de)optimization.

I suspect the best approach is probably going to be your option one: if-else dispatch on type if any case is not a number, and then a switch statement covering the values for each of the keyword/string/symbol types present (no nested switch statements, and outlining the nested switches might be necessary). Even with a good hash, to guarantee v8 optimizing-compilation you would need to truncate the hashes into an smi (signed-left-shift once?) inside the case*.

Show
Francis Avila added a comment - - edited At least on v8, there are surprisingly few cases where a switch statement will be optimized to a jump table. Basically the type of the switched-over value must always (across calls) match the type of every case, and there must be fewer than 128 cases, and integer cases must be 31-bit ints (v8's smi type). So mixing string and number cases in the same switch guarantees the statement will never be compiled. In many cases an equivalent if-else will end up being significantly faster on v8 just because the optimizing jit recognizes them better. There's an oldish bug filed against v8 switch performance. Looking at the many jsperfs of switch statements, it doesn't seem that v8 has improved. Relevant jsperf Firefox is much better at optimizing switch statements (maybe because of their asm.js/emscripten work) but I don't know what conditions trigger (de)optimization. I suspect the best approach is probably going to be your option one: if-else dispatch on type if any case is not a number, and then a switch statement covering the values for each of the keyword/string/symbol types present (no nested switch statements, and outlining the nested switches might be necessary). Even with a good hash, to guarantee v8 optimizing-compilation you would need to truncate the hashes into an smi (signed-left-shift once?) inside the case*.
Hide
David Nolen added a comment -

There's no need for invention here. We should follow the strategy that Clojure adopts - compile time hash calculation.

Show
David Nolen added a comment - There's no need for invention here. We should follow the strategy that Clojure adopts - compile time hash calculation.
Hide
Francis Avila added a comment -

The problem, as Michal alluded to, is that the hash functions in cljs's runtime environment are not available at compile-time (unlike in Clojure). This might be a good opportunity to clean up that situation or even use identical hash values across Clojure and Clojurescript (i.e. CLJS-754), but that's a much bigger project. Especially considering it will probably not bring much of a speedup over an if-else-if implementation except in very narrow circumstances.

Show
Francis Avila added a comment - The problem, as Michal alluded to, is that the hash functions in cljs's runtime environment are not available at compile-time (unlike in Clojure). This might be a good opportunity to clean up that situation or even use identical hash values across Clojure and Clojurescript (i.e. CLJS-754), but that's a much bigger project. Especially considering it will probably not bring much of a speedup over an if-else-if implementation except in very narrow circumstances.
Hide
David Nolen added a comment - - edited

Francis Avila I would make no such assumptions about performance without benchmarks. One of the critical uses for case is over keywords. Keyword hashes are computed at compile time, so that's one function call and a jump on some JavaScript engines. This is particularly useful for the performance of records where you want to lookup a field via keyword before checking the extension map.

This ticket should probably wait for CLJS-754 before proceeding.

Show
David Nolen added a comment - - edited Francis Avila I would make no such assumptions about performance without benchmarks. One of the critical uses for case is over keywords. Keyword hashes are computed at compile time, so that's one function call and a jump on some JavaScript engines. This is particularly useful for the performance of records where you want to lookup a field via keyword before checking the extension map. This ticket should probably wait for CLJS-754 before proceeding.
Hide
Francis Avila added a comment -

Record field lookup is a good narrow use case to test. I put together a jsperf to compare if-else (current) vs switch with string cases vs switch with int cases (i.e., hash-compares, assuming perfect hashing).

Show
Francis Avila added a comment - Record field lookup is a good narrow use case to test. I put together a jsperf to compare if-else (current) vs switch with string cases vs switch with int cases (i.e., hash-compares, assuming perfect hashing).
David Nolen made changes -
Priority Minor [ 4 ] Major [ 3 ]
Assignee Michał Marczyk [ michalmarczyk ] David Nolen [ dnolen ]
Hide
David Nolen added a comment -

I've merged the case* analyzer and emitter bits by hand into master.

Show
David Nolen added a comment - I've merged the case* analyzer and emitter bits by hand into master.
Hide
David Nolen added a comment -

I'v merged the rest of the patch into master. If any further optimizations are done it will be around dispatching on hash code a la Clojure.

Show
David Nolen added a comment - I'v merged the rest of the patch into master. If any further optimizations are done it will be around dispatching on hash code a la Clojure.
Hide
Francis Avila added a comment - - edited

Your keyword-test optimization has a bug: https://github.com/clojure/clojurescript/commit/9872788b3caa86f639633ff14dc0db49f16d3e2a

Test case:

(let [x "a"] (case x :a 1 "a"))
;=> 1
;;; Should be "a"

Github comment suggests two possible fixes.

Show
Francis Avila added a comment - - edited Your keyword-test optimization has a bug: https://github.com/clojure/clojurescript/commit/9872788b3caa86f639633ff14dc0db49f16d3e2a Test case:
(let [x "a"] (case x :a 1 "a"))
;=> 1
;;; Should be "a"
Github comment suggests two possible fixes.
Hide
David Nolen added a comment -

Thanks fix in master.

Show
David Nolen added a comment - Thanks fix in master.
Hide
Christoffer Sawicki added a comment -

case over "chars" is currently not being optimized to switch (in other words: (case c (\a) :a :other) uses if instead of switch).

Given that ClojureScript chars are just strings of length 1, could this perhaps simply be fixed by tweaking https://github.com/clojure/clojurescript/blob/master/src/clj/cljs/core.clj#L1187 ?

Show
Christoffer Sawicki added a comment - case over "chars" is currently not being optimized to switch (in other words: (case c (\a) :a :other) uses if instead of switch). Given that ClojureScript chars are just strings of length 1, could this perhaps simply be fixed by tweaking https://github.com/clojure/clojurescript/blob/master/src/clj/cljs/core.clj#L1187 ?
Hide
Christoffer Sawicki added a comment - - edited

OK, I couldn't resist trying and it seems to be that easy. Would be great if somebody more knowledgeable could look at it and say if it has any side-effects. (See patch with name 0001-CLJS-713-Allow-test-expressions-for-case-to-be-chars.patch.)

Show
Christoffer Sawicki added a comment - - edited OK, I couldn't resist trying and it seems to be that easy. Would be great if somebody more knowledgeable could look at it and say if it has any side-effects. (See patch with name 0001-CLJS-713-Allow-test-expressions-for-case-to-be-chars.patch.)
Christoffer Sawicki made changes -
Hide
David Nolen added a comment -

The patch looks good I would have applied it if I hadn't already gone and done it master myself just now

Show
David Nolen added a comment - The patch looks good I would have applied it if I hadn't already gone and done it master myself just now
Hide
Christoffer Sawicki added a comment -

Hehe. Thanks! Don't forget to update the "case* tests must be numbers or strings" message on line 496 too.

Show
Christoffer Sawicki added a comment - Hehe. Thanks! Don't forget to update the "case* tests must be numbers or strings" message on line 496 too.
Hide
David Nolen added a comment -

The existing docstring is inaccurate - case supports all compile time literals.

Show
David Nolen added a comment - The existing docstring is inaccurate - case supports all compile time literals.

People

Vote (0)
Watch (3)

Dates

  • Created:
    Updated: