
[CLJS-500] Code Size Issue - Constructors Created: 27/Apr/13  Updated: 27/Jul/13  Resolved: 04/May/13

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: David Nolen Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None


 Description   

When compiling something like spectral norm http://github.com/swannodette/cljs-stl/blob/master/src/cljs_stl/spectral/demo.cljs where no ClojureScript data structures are used, we still see many constructors in the advanced compiled output. The pretty-printed advanced compiled output using master totals ~3800 lines.

Quick testing shows that the following ctors are present in spectral norm:

PersistentTreeMap
RedNode
BlackNode
PersistentTreeMapSeq
TransientHashMap
PersistentHashMap
ArrayNodeSeq
NodeSeq
HashCollisionNode
ArrayNode
BitmapIndexedNode
TransientArrayMap
PersistentArrayMap
ObjMap
NeverEquiv
TransientVector
ChunkedSeq
PersistentVector
VectorNode
ChunkedCons
ArrayChunk
ChunkBuffer
LazySeq
Keyword
Cons
EmptyList
List
IndexedSeq

When we removed the runtime function dependency on cljs.core.PersistentTreeSet/EMPTY, all ctors related to PTSs disappeared.

Removing the runtime constructions of the other EMPTY collections seems to have no further effect on code size, probably because we have avoided dependencies elsewhere.



 Comments   
Comment by David Nolen [ 27/Apr/13 8:41 PM ]

Another issue looks like recursive relationships between constructors; this occurs with hash map upgrading as well as with transients. Getting rid of the IEditableCollection implementations removes all the transient-related constructors from the output. With this change on master we are at ~3580 lines.

Coupled with the pr code removed we're down to ~2700 lines. Many constructors magically disappear if we remove the printing code.

After some more investigation, it's only necessary to remove pr-str, and that puts us at ~2700 lines.

It appears my speculation about the variadic functions is way off. Recursive relationships between the core types and functions seem to be the real source of the issue.

Comment by David Nolen [ 27/Apr/13 9:26 PM ]

There are many complex dependencies between functions and types currently. For example when writing a type it's tempting to rely on the standard library, but this has repercussions. For example calling map from a deftype implementation means you need to pull in the chunked types!

Another example is how PersistentArrayMap and ObjMap both call reduce. In this case reduce doesn't pull in dependencies, but it shows a kind of thinking that should probably be avoided in the collections at the heart of ClojureScript. The reduce cases should probably be replaced with loop/recur.
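
As a rough illustration of the kind of rewrite meant here (not the actual PersistentArrayMap or ObjMap code), the sketch below sums the elements of an array twice: once through reduce and once with loop/recur using only alength and aget. The function names sum-with-reduce and sum-with-loop are hypothetical and exist only for this example.

```clojure
;; Hypothetical helpers for illustration only; not taken from cljs.core.

;; Convenience version: relies on the standard library's reduce.
(defn sum-with-reduce [arr]
  (reduce + 0 arr))

;; Primitive version: walks the array by index with loop/recur,
;; so it depends only on alength, aget and arithmetic.
(defn sum-with-loop [arr]
  (let [len (alength arr)]
    (loop [i 0 acc 0]
      (if (< i len)
        (recur (inc i) (+ acc (aget arr i)))
        acc))))
```

Both versions return the same result for the same array; the second simply avoids reaching for a library function from inside a core collection.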

Comment by David Nolen [ 27/Apr/13 9:51 PM ]

Instead of guessing, it might be useful to examine the call graph of the AST. There are some instructions on how to do so here: http://stackoverflow.com/questions/1385335/how-to-generate-function-call-graphs-for-javascript/12220307#12220307

Comment by David Nolen [ 28/Apr/13 11:43 AM ]

One way to control the amount of mutual reference is to move that functionality out into a separate function. This way the usage of a specific function triggers inclusion of data structures, instead of it being implicit in using a particular data structure. A specific example is switching data structures based on size, i.e. PAM -> PHM.

Comment by David Nolen [ 28/Apr/13 12:54 PM ]

After attempting to limit the inclusion of transient collections in ClojureScript, it's clear that Closure is indeed very accurate. The real problem is the reliance on convenience fns within deftypes; this includes map/reduce/into/etc. For example, map will bring in the chunked types and into will bring in all the transient collections. deftypes should really only be written with the language primitives, which is better from a performance perspective anyhow. We should review the deftypes and eliminate the various conveniences where possible.

Comment by David Nolen [ 28/Apr/13 1:32 PM ]

Other offenders are the top level extend-type/protocol forms on JS natives. extend-type nil calls hashcode, IPrintWithWriter includes all the printing machinery, and string & array load primseq and arrayseq.

We should probably adopt the Clojure JVM approach and dispatch on the types where we can in the library fns themselves, not at the protocol level.
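
A minimal sketch of what dispatching in the library fn could look like, under stated assumptions: the protocol ISized, the type Pair and the function size are hypothetical names invented for this example and are not part of cljs.core. Instead of a top level (extend-type nil ISized ...) or (extend-type string ISized ...), the native cases are checked inside the function and only user-defined types go through the protocol.

```clojure
;; Hypothetical protocol and type, for illustration only.
(defprotocol ISized
  (-size [x]))

(deftype Pair [a b]
  ISized
  (-size [_] 2))

;; Natives are handled by explicit checks in the fn itself, so no
;; top level extend-type on nil or string is needed; everything
;; else falls through to the protocol method.
(defn size [x]
  (cond
    (nil? x)    0
    (string? x) (count x)
    :else       (-size x)))
```

With this shape, the code for the native cases is only reachable through size, rather than being installed unconditionally at load time by a top level extend-type.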

Comment by David Nolen [ 04/May/13 12:01 PM ]

This has been addressed in master in a number of commits. (.log js/console "Hello world!") is now ~100 lines of code. Further enhancements, like removing the various transient data structures if unused, need more investigation.

Generated at Thu Jul 24 15:06:26 CDT 2014 using JIRA 4.4#649-r158309.