ClojureScript

Cull source maps

Details

  • Type: Enhancement
  • Status: Open
  • Priority: Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels: None

Description

When generating source maps, we emit a lot of extra data that is not needed for mapping stacktraces.

Planck culls much of this information while still mapping stacktraces successfully, with a big performance gain: Planck loads the smaller source map files instantly rather than taking seconds. This has been in place for a year now with no reports of issues.

I'm thinking that we may be able to do the same in ClojureScript proper when writing out source maps. It's worth trying an experiment where we optionally strip out this information using the function that was developed for Planck, included here for reference:

(defn- strip-source-map
  "Strips a source map down to the minimal representation needed for mapping
  stacktraces. This means we only need the :line and :col fields, we only need
  the last element in each vector of such maps, and we can eliminate
  duplicates, taking the smallest col number for each unique value."
  [sm]
  (into {}
    (map (fn [[row cols]]
           [row (->> cols
                  (map (fn [[col frames]]
                         [col [(select-keys (peek frames) [:line :col])]]))
                  (sort-by first)
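                  ;; distinct-by is not in cljs.core; assumed to be a helper
                  ;; that keeps the first item for each distinct (second item).
                  (distinct-by second)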
                  (into {}))]))
    sm))
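
To make the effect of the function concrete, here is a minimal, self-contained sketch. It rests on two assumptions that are not stated in this ticket: that `distinct-by` (which is not part of clojure.core or cljs.core) keeps the first element seen for each distinct value of `(f x)`, and that the decoded source map has the shape `{gen-line {gen-col [{:line .. :col .. :name ..} ...]}}`. The sample data is made up for illustration.

```clojure
;; Assumed helper: keeps the first element for each distinct (f x).
;; This is a simple eager stand-in, not necessarily Planck's implementation.
(defn distinct-by
  [f coll]
  (first
    (reduce (fn [[acc seen] x]
              (let [k (f x)]
                (if (contains? seen k)
                  [acc seen]
                  [(conj acc x) (conj seen k)])))
            [[] #{}]
            coll)))

;; Copy of the function from the description, so the sketch runs on its own.
(defn strip-source-map
  [sm]
  (into {}
    (map (fn [[row cols]]
           [row (->> cols
                  (map (fn [[col frames]]
                         [col [(select-keys (peek frames) [:line :col])]]))
                  (sort-by first)
                  (distinct-by second)
                  (into {}))]))
    sm))

;; Made-up input: generated columns 0 and 4 both map to source line 1, col 0.
(def sample-source-map
  {0 {0  [{:line 1 :col 0 :name "foo"}]
      4  [{:line 1 :col 0 :name "foo"} {:line 1 :col 0 :name nil}]
      10 [{:line 2 :col 3 :name "bar"}]}})

(strip-source-map sample-source-map)
;; => {0 {0 [{:line 1, :col 0}], 10 [{:line 2, :col 3}]}}
```

The entry for generated column 4 is dropped because it maps to the same source position as column 0, and the `:name` keys are removed by `select-keys`.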

Activity

Mike Fikes added a comment -

We are currently emitting source map line/col information for every AST node, regardless of op type. While this guarantees comprehensive source mapping information, it may be a much larger set than we need for many common uses.

Some op types can clearly be omitted, like :no-op for example. With a little experimentation, you can see that source mapping for the limited purpose of mapping stack traces is possible with just two or three op types. Dirac DevTools makes more extensive use of source mapping information in order to properly identify locals, binding forms, etc. within the source code.

The attached patch limits source map line/col emission to those tags which are needed for stack trace mapping, and for some simple uses with Dirac (which is a superset of those needed for stack trace mapping). The fundamental problem with this strategy is identifying the right subset of ops for which we emit line/col information.

Still, this might be worth it if we can successfully identify a minimal set that meets general needs. With the attached patch, we get a 12% performance boost relative to current master when compiling Coal Mine in non-parallel mode.

With this patch, the source maps written to disk are also smaller: for cljs.core, the file is 432567 bytes instead of 640411 bytes (about 32% smaller).

Attaching this patch for feedback. If we can find a suitable subset, this might work out. If this proves too difficult, perhaps a compiler option could be introduced to control whether we emit a tiny subset sufficient for stack trace mapping, a slightly larger subset for debugging (Dirac), or all ops.
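
As a rough illustration of what such a compiler option could look like, here is a minimal sketch. Everything in it is hypothetical: the `:source-map-detail` option name, its values, and the op sets are placeholders chosen for illustration, not the op types used by the attached patch and not an existing ClojureScript compiler option.

```clojure
;; Illustrative only: the option values and op sets below are hypothetical
;; placeholders, not the subsets identified by the attached patch.
(def ops-by-detail
  {:stacktrace #{:invoke :fn}            ; placeholder "tiny" subset
   :debug      #{:invoke :fn :let :var}  ; placeholder superset for debugging
   :all        nil})                     ; nil means "no filtering"

(defn emit-source-map-entry?
  "Returns true when line/col information should be emitted for an AST node,
  given a hypothetical :source-map-detail compiler option. A missing or
  unrecognized option value falls back to emitting for every op."
  [opts ast]
  (let [detail  (get opts :source-map-detail :all)
        allowed (get ops-by-detail detail)]
    (or (nil? allowed)
        (contains? allowed (:op ast)))))

(emit-source-map-entry? {} {:op :no-op})
;; => true
(emit-source-map-entry? {:source-map-detail :stacktrace} {:op :no-op})
;; => false
```

Keeping the subsets in a single data structure would make it easy to adjust them as experiments show which ops stack trace mapping and debugging tools actually need.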

Antonin Hildebrand added a comment -

Dirac relies on source maps in two ways:
1) indirectly: standard Chrome DevTools (or V8) code uses source maps in various places (e.g. mapping console message line/col info back to original sources, or showing proper lines in the debugger, mapping names of local variables in the debugger, etc.)
2) directly: Dirac uses source map info from DevTools to provide code completion in its REPL prompt (it relies on the "names" list in the associated source maps, as defined in the source map spec [1])

I have pretty good test coverage for #2 so I would spot any regressions with code completion. But it is unclear to me how pruned source maps could affect DevTools/V8 itself.
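
One way to check whether culling affects the names information is to compare the set of names referenced before and after pruning. The following is only a sketch: it assumes a decoded source map of the shape `{gen-line {gen-col [{:line .. :col .. :name ..} ...]}}`, and both the `referenced-names` helper and the sample maps are made up for illustration.

```clojure
(defn referenced-names
  "Collects the set of non-nil :name values found in a decoded source map of
  the assumed shape {gen-line {gen-col [frame ...]}}."
  [sm]
  (set (for [[_ cols]   sm
             [_ frames] cols
             frame      frames
             :let       [n (:name frame)]
             :when      n]
         n)))

;; Made-up data: the culled map keeps only :line and :col.
(def full-map
  {0 {0  [{:line 1 :col 0 :name "foo"}]
      10 [{:line 2 :col 3 :name "bar"}]}})

(def culled-map
  {0 {0  [{:line 1 :col 0}]
      10 [{:line 2 :col 3}]}})

(referenced-names full-map)
;; => #{"foo" "bar"}
(referenced-names culled-map)
;; => #{}
```

A culling strategy that keeps only `:line` and `:col` would therefore drop every name, which is the kind of regression the code completion tests should catch.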

Also, please be aware that there are known source map issues in DevTools related to ClojureScript (and other transpiled languages). Dirac tried to patch them, but there might still be some outstanding bugs.
https://github.com/binaryage/dirac/issues/53

Are you aware of https://github.com/sokra/source-map-visualization? This tool proved to be helpful when I was debugging source map issues with Dirac. I think this could help to determine how to generate minimal source maps with relevant info.

[1] https://sourcemaps.info/spec.html

Antonin Hildebrand added a comment -

Just for the record: CLJS-2993 is related. Mike, if you were doing your Dirac experiments with ClojureScript master, you were likely affected by it.

