ClojureScript

gensyms break re-analyze; prevents variable shadowing analysis

Details

  • Type: Defect Defect
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Completed
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
  • Patch:
    Code

Description

From my email discussion with dnolon before making this patch:

The current analyzer does some trickery to prepare for emitting to JavaScript. Among this trickery is gensyms for locals (including "this"), the new $ prefix on some namespaces, uniqify on parameters, and more. This must be mildly annoying for people writing alternate compiler backends, but for the most part non-blocking because fewer symbol collisions should never be an additional problem for a target language with different symbol resolution rules.

[snip lots more text]

Consider what an ANF transform would look like for the :let form:

(defmethod anf :let
  [{:keys [env bindings statements ret form] :as ast}]
  (let [bindings (mapcat (fn [{:keys [sym init] :as binding}]
                           [name (anf init)])
                         bindings)
        body-env (-> bindings last :env)]
    (ana/analyze env `(~(first form) [~@(map :form bindings)]
                          ~@(anf-block (assoc ast :env body-env))))))

Simple enough, right? This walks each binding, ANF transforms the init expression, gets the environment in which to analyze the body, and then analyzes a new let (or loop) statement with the modified bindings.

Unfortunately, this doesn't work. When the ana/analyze call happens, body-env contains gensymed locals. The result is that body-env is invalidated by the outer analyze call, which is re-generating the symbols for the local variables. If you take the gensyms out of analyze-let, then analyze becomes pure (modulo def forms) and this code magically becomes correct. I've run into this same problem anywhere gensyms are used in analyze.

Commit message on the patch:

AST Changes
    
    * Anywhere a binding was introduced for a local used to be a symbol,
      now it is a map with a :name key and potentially a :shadow key.
    
    * Bindings vectors are no longer alternating symbols, then init maps.
      Instead, the are a vector of maps of the shape described for locals
      plus an :init key.
    
    * The :gthis key for functions has been replaced with :type, which
      is the symbol describing the type name of the enclosing deftype form.
    
    * recur frames now expose :params as binding maps, instead of :names
    
    Benefits:
    
    * Shadowed variables are now visible to downstream AST transforms.
    
    * :tag, :mutable, and other metadata are now uniform across ops
    
    * Eliminates usages of gensym inside the analyzer, which was a source
      of state that made the analyzer impossible to use for some
      transformations of let, letfn, etc which require re-analyzing forms.
    
    * Removes JavaScript shadowing semantics from the analyze phase.
  1. CLJS-369-v2.patch
    03/Sep/12 2:02 PM
    19 kB
    Brandon Bloom
  2. shadowing.patch
    01/Sep/12 7:40 PM
    19 kB
    Brandon Bloom

Activity

Brandon Bloom made changes -
Field Original Value New Value
Description From my email discussion with dnolon before making this patch:

The current analyzer does some trickery to prepare for emitting to JavaScript. Among this trickery is gensyms for locals (including "this"), the new $ prefix on some namespaces, uniqify on parameters, and more. This must be mildly annoying for people writing alternate compiler backends, but for the most part non-blocking because *fewer* symbol collisions should never be an additional problem for a target language with different symbol resolution rules.

[snip lots more text]

Consider what an ANF transform would look like for the :let form:

{code}
(defmethod anf :let
  [{:keys [env bindings statements ret form] :as ast}]
  (let [bindings (mapcat (fn [{:keys [sym init] :as binding}]
                           [name (anf init)])
                         bindings)
        body-env (-> bindings last :env)]
    (ana/analyze env `(~(first form) [~@(map :form bindings)]
                          ~@(anf-block (assoc ast :env body-env))))))
{/code}

Simple enough, right? This walks each binding, ANF transforms the init expression, gets the environment in which to analyze the body, and then analyzes a new let (or loop) statement with the modified bindings.

Unfortunately, this doesn't work. When the ana/analyze call happens, body-env contains gensymed locals. The result is that body-env is invalidated by the outer analyze call, which is re-generating the symbols for the local variables. If you take the gensyms out of analyze-let, then analyze becomes pure (modulo def forms) and this code magically becomes correct. I've run into this same problem anywhere gensyms are used in analyze.



Commit message on the patch:

{code}
    AST Changes
    
    * Anywhere a binding was introduced for a local used to be a symbol,
      now it is a map with a :name key and potentially a :shadow key.
    
    * Bindings vectors are no longer alternating symbols, then init maps.
      Instead, the are a vector of maps of the shape described for locals
      plus an :init key.
    
    * The :gthis key for functions has been replaced with :type, which
      is the symbol describing the type name of the enclosing deftype form.
    
    * recur frames now expose :params as binding maps, instead of :names
    
    Benefits:
    
    * Shadowed variables are now visible to downstream AST transforms.
    
    * :tag, :mutable, and other metadata are now uniform across ops
    
    * Eliminates usages of gensym inside the analyzer, which was a source
      of state that made the analyzer impossible to use for some
      transformations of let, letfn, etc which require re-analyzing forms.
    
    * Removes JavaScript shadowing semantics from the analyze phase.
{/code}
From my email discussion with dnolon before making this patch:

The current analyzer does some trickery to prepare for emitting to JavaScript. Among this trickery is gensyms for locals (including "this"), the new $ prefix on some namespaces, uniqify on parameters, and more. This must be mildly annoying for people writing alternate compiler backends, but for the most part non-blocking because *fewer* symbol collisions should never be an additional problem for a target language with different symbol resolution rules.

[snip lots more text]

Consider what an ANF transform would look like for the :let form:

{code}
(defmethod anf :let
  [{:keys [env bindings statements ret form] :as ast}]
  (let [bindings (mapcat (fn [{:keys [sym init] :as binding}]
                           [name (anf init)])
                         bindings)
        body-env (-> bindings last :env)]
    (ana/analyze env `(~(first form) [~@(map :form bindings)]
                          ~@(anf-block (assoc ast :env body-env))))))
{code}

Simple enough, right? This walks each binding, ANF transforms the init expression, gets the environment in which to analyze the body, and then analyzes a new let (or loop) statement with the modified bindings.

Unfortunately, this doesn't work. When the ana/analyze call happens, body-env contains gensymed locals. The result is that body-env is invalidated by the outer analyze call, which is re-generating the symbols for the local variables. If you take the gensyms out of analyze-let, then analyze becomes pure (modulo def forms) and this code magically becomes correct. I've run into this same problem anywhere gensyms are used in analyze.



Commit message on the patch:

{code}
    AST Changes
    
    * Anywhere a binding was introduced for a local used to be a symbol,
      now it is a map with a :name key and potentially a :shadow key.
    
    * Bindings vectors are no longer alternating symbols, then init maps.
      Instead, the are a vector of maps of the shape described for locals
      plus an :init key.
    
    * The :gthis key for functions has been replaced with :type, which
      is the symbol describing the type name of the enclosing deftype form.
    
    * recur frames now expose :params as binding maps, instead of :names
    
    Benefits:
    
    * Shadowed variables are now visible to downstream AST transforms.
    
    * :tag, :mutable, and other metadata are now uniform across ops
    
    * Eliminates usages of gensym inside the analyzer, which was a source
      of state that made the analyzer impossible to use for some
      transformations of let, letfn, etc which require re-analyzing forms.
    
    * Removes JavaScript shadowing semantics from the analyze phase.
{code}
Brandon Bloom made changes -
Attachment shadowing.patch [ 11469 ]
Brandon Bloom made changes -
Attachment shadowing.patch [ 11468 ]
Brandon Bloom made changes -
Attachment CLJS-369-v2.patch [ 11471 ]
Brandon Bloom made changes -
Summary analyze shouldn't use gensyms gensyms break re-analyze; prevents variable shadowing analysis
David Nolen made changes -
Resolution Completed [ 1 ]
Status Open [ 1 ] Resolved [ 5 ]
David Nolen made changes -
Status Resolved [ 5 ] Closed [ 6 ]

People

Vote (0)
Watch (1)

Dates

  • Created:
    Updated:
    Resolved: