Details
Description
From my email discussion with dnolon before making this patch:
The current analyzer does some trickery to prepare for emitting to JavaScript. Among this trickery is gensyms for locals (including "this"), the new $ prefix on some namespaces, uniqify on parameters, and more. This must be mildly annoying for people writing alternate compiler backends, but for the most part non-blocking because fewer symbol collisions should never be an additional problem for a target language with different symbol resolution rules.
[snip lots more text]
Consider what an ANF transform would look like for the :let form:
(defmethod anf :let
[{:keys [env bindings statements ret form] :as ast}]
(let [bindings (mapcat (fn [{:keys [sym init] :as binding}]
[name (anf init)])
bindings)
body-env (-> bindings last :env)]
(ana/analyze env `(~(first form) [~@(map :form bindings)]
~@(anf-block (assoc ast :env body-env))))))
Simple enough, right? This walks each binding, ANF transforms the init expression, gets the environment in which to analyze the body, and then analyzes a new let (or loop) statement with the modified bindings.
Unfortunately, this doesn't work. When the ana/analyze call happens, body-env contains gensymed locals. The result is that body-env is invalidated by the outer analyze call, which is re-generating the symbols for the local variables. If you take the gensyms out of analyze-let, then analyze becomes pure (modulo def forms) and this code magically becomes correct. I've run into this same problem anywhere gensyms are used in analyze.
Commit message on the patch:
AST Changes
* Anywhere a binding was introduced for a local used to be a symbol,
now it is a map with a :name key and potentially a :shadow key.
* Bindings vectors are no longer alternating symbols, then init maps.
Instead, the are a vector of maps of the shape described for locals
plus an :init key.
* The :gthis key for functions has been replaced with :type, which
is the symbol describing the type name of the enclosing deftype form.
* recur frames now expose :params as binding maps, instead of :names
Benefits:
* Shadowed variables are now visible to downstream AST transforms.
* :tag, :mutable, and other metadata are now uniform across ops
* Eliminates usages of gensym inside the analyzer, which was a source
of state that made the analyzer impossible to use for some
transformations of let, letfn, etc which require re-analyzing forms.
* Removes JavaScript shadowing semantics from the analyze phase.
Attachments
Activity
| Field | Original Value | New Value |
|---|---|---|
| Description |
From my email discussion with dnolon before making this patch:
The current analyzer does some trickery to prepare for emitting to JavaScript. Among this trickery is gensyms for locals (including "this"), the new $ prefix on some namespaces, uniqify on parameters, and more. This must be mildly annoying for people writing alternate compiler backends, but for the most part non-blocking because *fewer* symbol collisions should never be an additional problem for a target language with different symbol resolution rules. [snip lots more text] Consider what an ANF transform would look like for the :let form: {code} (defmethod anf :let [{:keys [env bindings statements ret form] :as ast}] (let [bindings (mapcat (fn [{:keys [sym init] :as binding}] [name (anf init)]) bindings) body-env (-> bindings last :env)] (ana/analyze env `(~(first form) [~@(map :form bindings)] ~@(anf-block (assoc ast :env body-env)))))) {/code} Simple enough, right? This walks each binding, ANF transforms the init expression, gets the environment in which to analyze the body, and then analyzes a new let (or loop) statement with the modified bindings. Unfortunately, this doesn't work. When the ana/analyze call happens, body-env contains gensymed locals. The result is that body-env is invalidated by the outer analyze call, which is re-generating the symbols for the local variables. If you take the gensyms out of analyze-let, then analyze becomes pure (modulo def forms) and this code magically becomes correct. I've run into this same problem anywhere gensyms are used in analyze. Commit message on the patch: {code} AST Changes * Anywhere a binding was introduced for a local used to be a symbol, now it is a map with a :name key and potentially a :shadow key. * Bindings vectors are no longer alternating symbols, then init maps. Instead, the are a vector of maps of the shape described for locals plus an :init key. * The :gthis key for functions has been replaced with :type, which is the symbol describing the type name of the enclosing deftype form. * recur frames now expose :params as binding maps, instead of :names Benefits: * Shadowed variables are now visible to downstream AST transforms. * :tag, :mutable, and other metadata are now uniform across ops * Eliminates usages of gensym inside the analyzer, which was a source of state that made the analyzer impossible to use for some transformations of let, letfn, etc which require re-analyzing forms. * Removes JavaScript shadowing semantics from the analyze phase. {/code} |
From my email discussion with dnolon before making this patch:
The current analyzer does some trickery to prepare for emitting to JavaScript. Among this trickery is gensyms for locals (including "this"), the new $ prefix on some namespaces, uniqify on parameters, and more. This must be mildly annoying for people writing alternate compiler backends, but for the most part non-blocking because *fewer* symbol collisions should never be an additional problem for a target language with different symbol resolution rules. [snip lots more text] Consider what an ANF transform would look like for the :let form: {code} (defmethod anf :let [{:keys [env bindings statements ret form] :as ast}] (let [bindings (mapcat (fn [{:keys [sym init] :as binding}] [name (anf init)]) bindings) body-env (-> bindings last :env)] (ana/analyze env `(~(first form) [~@(map :form bindings)] ~@(anf-block (assoc ast :env body-env)))))) {code} Simple enough, right? This walks each binding, ANF transforms the init expression, gets the environment in which to analyze the body, and then analyzes a new let (or loop) statement with the modified bindings. Unfortunately, this doesn't work. When the ana/analyze call happens, body-env contains gensymed locals. The result is that body-env is invalidated by the outer analyze call, which is re-generating the symbols for the local variables. If you take the gensyms out of analyze-let, then analyze becomes pure (modulo def forms) and this code magically becomes correct. I've run into this same problem anywhere gensyms are used in analyze. Commit message on the patch: {code} AST Changes * Anywhere a binding was introduced for a local used to be a symbol, now it is a map with a :name key and potentially a :shadow key. * Bindings vectors are no longer alternating symbols, then init maps. Instead, the are a vector of maps of the shape described for locals plus an :init key. * The :gthis key for functions has been replaced with :type, which is the symbol describing the type name of the enclosing deftype form. * recur frames now expose :params as binding maps, instead of :names Benefits: * Shadowed variables are now visible to downstream AST transforms. * :tag, :mutable, and other metadata are now uniform across ops * Eliminates usages of gensym inside the analyzer, which was a source of state that made the analyzer impossible to use for some transformations of let, letfn, etc which require re-analyzing forms. * Removes JavaScript shadowing semantics from the analyze phase. {code} |
| Attachment | shadowing.patch [ 11469 ] |
| Attachment | shadowing.patch [ 11468 ] |
| Attachment | CLJS-369-v2.patch [ 11471 ] |
| Summary | analyze shouldn't use gensyms | gensyms break re-analyze; prevents variable shadowing analysis |
| Resolution | Completed [ 1 ] | |
| Status | Open [ 1 ] | Resolved [ 5 ] |
Can we please get the ticket number in the first line of the patch? If you look at the ClojureScript commit history you can see the convention that's been adopted. Thanks!