<< Back to previous view

[CLJS-374] satisfies? produces strange code when the protocol is not in the fast-path list Created: 06/Sep/12  Updated: 19/Nov/13

Status: Open
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: David Nolen Assignee: David Nolen
Resolution: Unresolved Votes: 0
Labels: None





[CLJS-713] optimized case Created: 04/Dec/13  Updated: 23/Jun/14

Status: Open
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: David Nolen Assignee: David Nolen
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-CLJS-713-Allow-test-expressions-for-case-to-be-chars.patch     Text File 0001-CLJS-713-first-cut-at-compiling-case-to-switch.patch    

 Description   

With the advent of asm.js many engines will like compile switch statements over integers into jump tables. We should provide a real `case*` ast node that compiles to JS `switch` when possible - i.e. numbers, strings, keywords etc.



 Comments   
Comment by Michał Marczyk [ 18/Feb/14 5:56 PM ]

First cut impl also available here:

https://github.com/michalmarczyk/clojurescript/tree/713-compile-case-to-switch

With this patch applied, case expressions are compiled to switch + some extra bits when all tests are numbers or strings, otherwise old logic is used.

For example, {{(fn [] (let [x 1] (case x 1 :foo (2 3) :bar :quux)))}} gets compiled to

function () {
    var x = 1;
    var G__6469 = x;
    var caseval__6470;
    switch (G__6469) {
      case 1:
        caseval__6470 = new cljs.core.Keyword(null, "foo", "foo", 1014005816);
        break;
      case 2:
      case 3:
        caseval__6470 = new cljs.core.Keyword(null, "bar", "bar", 1014001541);
        break;
      default:
        caseval__6470 = new cljs.core.Keyword(null, "quux", "quux", 1017386809);
    }
    return caseval__6470;
}

The existing test suite passes, but I suppose it wouldn't hurt to add some tests with case in all possible contexts.

Comment by Michał Marczyk [ 18/Feb/14 6:05 PM ]

As a next step, I was planning to arrange things so that numbers/strings are fished out from among the tests and compiled to switch always, with any leftover tests put in an old-style nested-ifs-based case under default:. Does this sound good?

It seems to me that to deal with symbols and keywords in a similar manner we'd have to do one of two things:

1. check symbol? and keyword? at the top, then compile separate switches (the one for keywords would extract the name from the given keyword and use strings in the switch);

2. use hashes for dispatch.

Which one sounds better? Or is there a third way?

Comment by Michał Marczyk [ 18/Feb/14 6:11 PM ]

Of course we'd need to compute hashes statically to go with 2. I'd kind of like it if it were impossible (randomized seed / universal hashing), but currently it isn't.

Comment by Francis Avila [ 19/Feb/14 12:22 AM ]

At least on v8, there are surprisingly few cases where a switch statement will be optimized to a jump table. Basically the type of the switched-over value must always (across calls) match the type of every case, and there must be fewer than 128 cases, and integer cases must be 31-bit ints (v8's smi type). So mixing string and number cases in the same switch guarantees the statement will never be compiled. In many cases an equivalent if-else will end up being significantly faster on v8 just because the optimizing jit recognizes them better. There's an oldish bug filed against v8 switch performance. Looking at the many jsperfs of switch statements, it doesn't seem that v8 has improved. Relevant jsperf

Firefox is much better at optimizing switch statements (maybe because of their asm.js/emscripten work) but I don't know what conditions trigger (de)optimization.

I suspect the best approach is probably going to be your option one: if-else dispatch on type if any case is not a number, and then a switch statement covering the values for each of the keyword/string/symbol types present (no nested switch statements, and outlining the nested switches might be necessary). Even with a good hash, to guarantee v8 optimizing-compilation you would need to truncate the hashes into an smi (signed-left-shift once?) inside the case*.

Comment by David Nolen [ 19/Feb/14 12:50 AM ]

There's no need for invention here. We should follow the strategy that Clojure adopts - compile time hash calculation.

Comment by Francis Avila [ 19/Feb/14 3:09 PM ]

The problem, as Michal alluded to, is that the hash functions in cljs's runtime environment are not available at compile-time (unlike in Clojure). This might be a good opportunity to clean up that situation or even use identical hash values across Clojure and Clojurescript (i.e. CLJS-754), but that's a much bigger project. Especially considering it will probably not bring much of a speedup over an if-else-if implementation except in very narrow circumstances.

Comment by David Nolen [ 19/Feb/14 4:38 PM ]

Francis Avila I would make no such assumptions about performance without benchmarks. One of the critical uses for case is over keywords. Keyword hashes are computed at compile time, so that's one function call and a jump on some JavaScript engines. This is particularly useful for the performance of records where you want to lookup a field via keyword before checking the extension map.

This ticket should probably wait for CLJS-754 before proceeding.

Comment by Francis Avila [ 22/Feb/14 4:44 AM ]

Record field lookup is a good narrow use case to test. I put together a jsperf to compare if-else (current) vs switch with string cases vs switch with int cases (i.e., hash-compares, assuming perfect hashing).

Comment by David Nolen [ 10/May/14 3:59 PM ]

I've merged the case* analyzer and emitter bits by hand into master.

Comment by David Nolen [ 10/May/14 4:42 PM ]

I'v merged the rest of the patch into master. If any further optimizations are done it will be around dispatching on hash code a la Clojure.

Comment by Francis Avila [ 11/May/14 12:53 AM ]

Your keyword-test optimization has a bug: https://github.com/clojure/clojurescript/commit/9872788b3caa86f639633ff14dc0db49f16d3e2a

Test case:

(let [x "a"] (case x :a 1 "a"))
;=> 1
;;; Should be "a"

Github comment suggests two possible fixes.

Comment by David Nolen [ 11/May/14 10:50 AM ]

Thanks fix in master.

Comment by Christoffer Sawicki [ 23/Jun/14 3:41 PM ]

case over "chars" is currently not being optimized to switch (in other words: (case c (\a) :a :other) uses if instead of switch).

Given that ClojureScript chars are just strings of length 1, could this perhaps simply be fixed by tweaking https://github.com/clojure/clojurescript/blob/master/src/clj/cljs/core.clj#L1187 ?

Comment by Christoffer Sawicki [ 23/Jun/14 4:11 PM ]

OK, I couldn't resist trying and it seems to be that easy. Would be great if somebody more knowledgeable could look at it and say if it has any side-effects. (See patch with name 0001-CLJS-713-Allow-test-expressions-for-case-to-be-chars.patch.)

Comment by David Nolen [ 23/Jun/14 4:15 PM ]

The patch looks good I would have applied it if I hadn't already gone and done it master myself just now

Comment by Christoffer Sawicki [ 23/Jun/14 4:22 PM ]

Hehe. Thanks! Don't forget to update the "case* tests must be numbers or strings" message on line 496 too.

Comment by David Nolen [ 23/Jun/14 4:48 PM ]

The existing docstring is inaccurate - case supports all compile time literals.





[CLJS-857] change deftype*/defrecord* special forms to include their inline methods decls Created: 15/Sep/14  Updated: 16/Sep/14

Status: Open
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Nicola Mometto Assignee: David Nolen
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-change-deftype-defrecord-special-forms-to-include-th.patch    
Patch: Code

 Description   

Currently the deftype* and defrecord* special forms don't include their inline method declarations, this means that code like:

(deftype foo [x] Object (toString [_] x))

is macroexpanded into something that looks like:

(deftype* foo [x] pmask) (extend-type foo Object (toString [_] x)) ...

The issue with this approach is that we want to treat "x" in the extend-type expression as if it was inside the lexical scope of the deftype*, to do so the analyzer attaches a metadata flag to the extend-type call with the deftype* local fields.

This is not ideal and complicates needlessly the analyzer, I propose to simply move the extend-type call as a parameter to the deftype* special form, this way no special handling for local fields will be needed:

(deftype foo [x] pmask (extend-type foo Object (toString [_] x)))

since the extend-type code is effectively inside the lexical scope of the deftype.

Everything said above applies to defrecord* aswell, this patch implements what I proposed and refactors a bit the code in the analyzer to remove the code duplication in the defrecord* and deftype* parse implementations.

In addition, the current approach was unfeasable to implement for tools.analyzer.js so taking this patch would mean making cljs.analyzer and tools.analyzer.js special-form compatible.






[CLJS-527] Support dynamic runtime extension of protocols to types Created: 20/Jun/13  Updated: 11/Dec/13

Status: Open
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Chas Emerick Assignee: David Nolen
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File CLJS-527.diff    
Patch: Code and Test

 Description   

Here is a transliteration of a semi-common pattern used with Clojure protocols to dynamically extend protocols to concrete types implementing other protocols (or interfaces, on the JVM):

(defprotocol P (m [this]))

(extend-protocol P
  object
  (m [this]
    (if (seq? this)
      (do
        (extend-type (type this) P
          (m [this] (count this)))
        (m this))
      (throw (ex-info "Cannot extend m to type" {:type (type this)})))))

(I think dnolen was the first to talk about this outside of irc.) Unfortunately, this does not work in ClojureScript; extend-type currently requires that the type be specified as a symbol:

clojure.lang.ExceptionInfo: clojure.lang.PersistentList cannot be cast to clojure.lang.Named at line 4  {:tag :cljs/analysis-error, :file nil, :line 4, :column 5}

I can (hackily?) make this work by simply not attempting to resolve tsym here. However, that leaves lists in as values for :tag metadata (which might be used by the analyzer and/or other tools that depend upon it?), which I presume is not OK.

If someone can provide guidance on a sane path from here, I'll do what I can to produce a plausible patch.



 Comments   
Comment by Chas Emerick [ 21/Jun/13 12:08 PM ]

Looks like jvm.tools.analyzer emits a :tag of nil for some corresponding Clojure code; this can be seen by running this:

(require '[clojure.tools.analyzer :refer (ast)])
#_= nil
(defprotocol P (m [this]))
#_= P
(ast (fn [x]
       (extend-type (type x)
         P
         (m [this] (count this)))))
#_= ...

(The output is verbose enough that I'm not bothering to paste it here.) So, that's easy enough to do, and makes the original example work in ClojureScript.

However, simply suspending the lookup of what is currently assumed to be a symbol naming the type being extended isn't enough. With only that, dynamic usage of extend-type will affect js native prototypes, e.g.:

ClojureScript:cljs.user> (defprotocol P (m [this]))
nil
ClojureScript:cljs.user> (defn naive-dynamic-extend [x]
  (extend-type (type x)
    P
    (m [this] "hi")))
...
ClojureScript:cljs.user> (naive-dynamic-extend true)
...
ClojureScript:cljs.user> js/Boolean.prototype.cljs$user$P$m$arity$1
#<
function (this$) {
    return "hi";
}
>

So the bits in extend-type that handle base types (boolean, string, function, array, etc) need to be brought over to runtime. Looking into this now.

Comment by Chas Emerick [ 24/Jun/13 8:22 AM ]

Patch attached. All previously-allowed usage of extend-type continues to emit exactly the same code. Extensions without a statically-named type include both possible code paths:

1. When the type is a JavaScript native, the extension is made on the prototype's fns using the same base type names as are used for static extensions to e.g. string, object, etc
2. When the type is some other prototype, the extension is made on it directly.

This yields code like:

ClojureScript:cljs.user> (defprotocol P (m [this]))
nil
ClojureScript:cljs.user> #(extend-type (type %) P (m [this] "hi"))
#<
function (p1__4810_SHARP_) {
    var G__4813 = cljs.core.type.call(null, p1__4810_SHARP_);
    var temp__4090__auto__ = (cljs.core.base_type[G__4813]);
    if (cljs.core.truth_(temp__4090__auto__)) {
        var G__4814 = temp__4090__auto__;
        (cljs.user.P[G__4814] = true);
        return (cljs.user.m[G__4814] = ((function (G__4814, temp__4090__auto__, G__4813) {
            return (function (this$) {
                return "hi";
            });
        })(G__4814, temp__4090__auto__, G__4813)));
    } else {
        G__4813.prototype.cljs$user$P$ = true;
        return G__4813.prototype.cljs$user$P$m$arity$1 = ((function (temp__4090__auto__, G__4813) {
            return (function (this$) {
                return "hi";
            });
        })(temp__4090__auto__, G__4813));
    }
}
>

The duplication of the prototype method implementation bodies is unfortunate, a side effect of keeping the extend-type macro and supporting emit-* fns relatively simple. (Note that advanced compilation doesn't lift and merge those fns.) I'm inclined to say that it's a reasonable tradeoff, at least for now, as it only affects the dynamic type extension case; a reasonable TODO later, perhaps.

Comment by Brandon Bloom [ 03/Jul/13 3:44 PM ]

At Chas' request, I took a look at the patch. Tests pass locally & my few small toy projects run fine. I haven't benchmarked.

My only real concern is pretty minor: I'm terrified of JavaScript's semantics around typeof, toString, etc. The existing code paths leverage goog.typeOf, which has some pretty hairy internals. Meanwhile, Chas is just implicitly toString-ing on some type objects with an array set. The code of goog.typeOf also discusses oddities of Object.prototype.toString in firefox, but presumably that won't matter via the implicit conversion present in the array set. So if this works in all the major browsers, the patch LGTM.

Comment by Chas Emerick [ 03/Jul/13 6:29 PM ]

Just a point of documentation w.r.t. the stringifying of js-native prototypes: given the initial example above, if (type x) (or, whatever expression the user is providing that will return a "type" to extend) returns a js-native prototype, we need some way to map that at runtime to the strings that ClojureScript uses for those types when performing protocol dispatch. Using a js object containing as literal a representation of that mapping as possible seemed like a reasonable option. Providing a fn that cond's through the various options would be equivalent AFAICT.

A separate larger issue is, what is a type in ClojureScript? As far as protocols are concerned, the type of types is approximately the union of all non-native js prototypes, and symbols identifying those natives. However, type (and, really, any user of ClojureScript writing expressions provided to extend-type) doesn't know about the latter or the carve-out w.r.t. prototypes, thus some implicit runtime conversion is needed. Alternatively, one could say that any expression provided to extend-type must respect that contract, but then (a) users would need to explicitly handle js native types, and (b) Clojure/ClojureScript portability would be further complicated in this department.

Comment by David Nolen [ 03/Jul/13 8:02 PM ]

Reviewing the patch, thanks all.

Comment by David Nolen [ 03/Jul/13 8:09 PM ]

Ok what is the base-type js-obj for? Why aren't we using goog.typeOf?

Comment by Chas Emerick [ 03/Jul/13 9:06 PM ]

We can't use goog.typeOf because extend-type works with a type (i.e. the return of (type x)), not a value the type of which should be extended to the given protocol(s). (goog.typeOf will always return "function" for prototypes, js-native or not.)

The ClojureScript cljs.core/base-type js-obj is simply a runtime-accessible analogue of the (Clojure) cljs.core/base-type map, except it maps js-native prototypes to the goog.typeOf strings that are used for protocol dispatch.

Comment by David Nolen [ 16/Jul/13 6:40 AM ]

Ok I looked at the patch some more, I don't really like the string coercion aspect around base-type. Let's switch this to an array-map.

Comment by Chas Emerick [ 16/Jul/13 6:48 AM ]

Sure, I can do that. FWIW, that will rope in PAM and whatever other persistent data structure and printing bits it depends upon by default…is that considered acceptable?

Comment by David Nolen [ 16/Jul/13 10:31 AM ]

Hrm, that's actually a good point. Perhaps better to do a array + scan. I thought about this patch some more and it really needs more work. One thing this doesn't handle is objects from foreign contexts. ClojureScript can currently handle this by combining default cases with goog.typeOf.

I think extend-type should probably work with strings and/or symbols that represent the base types so that objects from other contexts can also be handled. I think automating this will be unweildy but at least it gives users the flexibility to handle these cases themselves.

Comment by Chas Emerick [ 16/Jul/13 7:14 PM ]

What do you mean by "foreign contexts"? I did a bit of searching on the term, and didn't turn up anything promising in connection with either ClojureScript or JavaScript. I assume you're not referring to e.g. types loaded via :foreign-libs, but who knows…

Re "strings and/or symbols", are you suggesting that dynamic usage of extend-type should not perform any translation of js-native prototypes to their string names, i.e. an expression being evaluated to determine the type to extend would need to return "string" (or 'string) rather than js/String?

Comment by David Nolen [ 16/Jul/13 9:00 PM ]

JavaScript objects from other JS execution contexts, IFrames are the most common source of these. This is why goog.typeOf implementation is so complex, it handles these cases.

I'm saying that extend-type should do run time extension to JS natives if the user specifies the extension at runtime via a string or symbol for the native cases because an Array from another JS Execution context is not equal to the Array in the current one.

Comment by Brandon Bloom [ 23/Jul/13 2:04 PM ]

It seems silly to argue about all the edge cases here, considering how many edge cases pertaining to "types" are already broken in ClojureScript.

For example, currently (= (type :foo) (type "foo"))

This is because cljs.core/type simply calls accesses the constructor field, and keywords are strings at runtime. Meanwhile, the (type (type x)) is always a function, since there is no Type type.

There are three problems:

1) Type equality

2) Getting an object's type

3) Runtime protocol extension

This patch delegates #2 to cljs.core/type and properly addresses #3.

#1 is a bit trickier, since there are three valid approaches I can think of:

A) Nominal equality - Enhance cljs.core/type to return sensible symbols, by implementing the crux of the goog/typeOf behavior plus some extra behavior for extracting type names out of function string representations.

B) Constructor equality - Simply compare .constructor; This is basically what happens now, but has 2 problems: B1) Doesn't provide for types at compile time B2) might not work correctly with IFrame execution environments

C) Hybrid/Heuristic - (defprotocol IType ...) and implement some Type objects with equality sensible operators; lazily stuff those type objects into a reflection map of some sort.

Personally, I think that B (the current state of the world) is hopelessly broken. Despite my initial reservations regarding the toString coercion, I think this patch does a reasonable job of eschewing B for a stop-gap A (with compile time interop). Given this analysis, I think the string coercion for natives actually does a better job than one could do with a PAM of constructors: ie the coercion covers the remote execution state. Unless this is provably broken for some key scenarios with IFrames, I think the patch is good as is, but we need to think about a follow on patch for fixing up runtime types in general.

Comment by Brandon Bloom [ 23/Jul/13 2:23 PM ]

I should also point out: Unlike JavaScript, Java has a unified nominal type system. Name equality is type equality (ignoring custom class loaders). However, JavaScript with Google Closure has a stratified type system: The dynamic type system utilizes object identity for equality. The GClosure static type system is (mostly) nominal with some fudge factor for the mismatch with the runtime type system (mostly around inheritance/mixins/array-like/etc). I think that ClojureScript should strive for a runtime reification of the Google Closure type system, since that would be most compatible with the Clojure/JVM type system.

Comment by David Nolen [ 23/Jul/13 3:06 PM ]

We are not going to follow goog.typeOf.

Comment by Brandon Bloom [ 23/Jul/13 3:22 PM ]

Follow it where?

Comment by David Nolen [ 23/Jul/13 3:29 PM ]

We're not going to use it nor follow its example for determining types unless we are trying to detect natives.

Comment by Brandon Bloom [ 23/Jul/13 3:34 PM ]

Getting back on topic: Getting some type-like-thing from an object is not this patch.

This patch is about extend-type, which I think it implements reasonably well given our current failings at runtime type reification.

Chas has this working with user defined types as well as with natives. Are there any particular scenarios that are provably broken? Either in general or on a particular browser/runtime?

Comment by David Nolen [ 23/Jul/13 3:38 PM ]

Chas's patch can't catch natives from IFrame contexts, I'd rather this patch move forward with at least the ability for a user to handle that situation themselves which I said above.

Comment by Brandon Bloom [ 23/Jul/13 3:59 PM ]

I think this does handle natives from iframe contexts, since extend-type takes a "type" not an object. Getting the type from an object does not need to happen here. The patch coerces types to a string via toString, which is precisely how goog.typeOf works internally on natives. Search for Object.prototype.toString.call in http://docs.closure-library.googlecode.com/git/closure_goog_base.js.source.html

Are you speculating that the patch doesn't work, or have you tried it?

If the former, Chas: Can you provide a test project that demonstrates extension of the cross product of these two sets:

1) local type
2) request remote object, coerce to type locally
3) request remote type object

A) native objects
B) deftype-ed objects

Comment by Chas Emerick [ 23/Jul/13 7:42 PM ]

Whatever the semantics and dark corners of JavaScript "types" — or, what they should be, at least w.r.t. ClojureScript — extend-type has very little latitude to operate.

The runtime-dynamic variant of the code it generates will be expecting something typeish coming out of whatever expression the user provides to it.
AFAICT, the only sane possibilities for "typeish" in this context are strings naming javascript natives (e.g. "string", or perhaps 'string if we want to be generous), or a constructor fn (cljs.core/PV, or js/String, or anything else returned by type). The current patch only accepts the latter, done to preserve as much as possible the existing patterns of extend-type usage in Clojure, and hopefully avoid foisting conversion of js/String to "string" at runtime onto users. String coercion is used to normalize the former into the latter; since the code determining the typeish value is entirely in the hands of the user (we don't have access to an object that exemplifies the type to which the user is extending, so we can't wedge in anything particularly clever), I believe it (or something similar) is all we can do.

From here, the only other option I see would be to expand the patch to eliminate this coercion, accepting strings or symbols naming js natives ("string", "boolean", and so on), and allow extensions to js natives at runtime without restriction. This may be a feature for some (perhaps if someone wants to extend a protocol to a js native only withing a particular iframe context?); on the other hand, we should probably document heavily that runtime usage of extend-type should take care to perform the sort of coercion the current patch does (and maybe provide some kind of helper function?), insofar as extension to natives directly is considered harmful in general (e.g. http://dev.clojure.org/jira/browse/CLJS-528, which was viewed favorably in irc some weeks ago?).

I'm happy to produce further tests (up to the suite that Brandon suggested above) if that would be helpful.

Comment by Michał Marczyk [ 26/Jul/13 7:04 PM ]

Just wanted to note that I've run into a situation where runtime extension of protocols to types would AFAICT be the next best thing to "extending protocol to protocol". Here's a link to the relevant ticket in fipp's issue tracker: https://github.com/brandonbloom/fipp/issues/6 (relevant part starts in the 8th comment).





Generated at Thu Sep 18 16:49:03 CDT 2014 using JIRA 4.4#649-r158309.