[CLJS-340] improve pr-str performance Created: 24/Jul/12  Updated: 27/Jul/13  Resolved: 04/Sep/12

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: David Nolen Assignee: Unassigned
Resolution: Completed Votes: 2
Labels: None

Attachments: Text File 0001-Optimize-pr-str-by-making-more-use-of-StringBuffer.patch    


Currently pr-str performance is terrible. We should implement printing in the same style as core_print.clj Clojure JVM.

Comment by Evan Mezeske [ 25/Jul/12 10:43 PM ]

I've rewritten the pr-str stuff, passing a StringBuffer down and appending to it instead of doing a massive amount of concats with temporary lists (some of which were lazy). The results are very promising. On V8, serialization of big data structures is 2-3 times faster, and in the IE8 JS engine, it's 10 times faster.

I still need to clean it up and get the tests working, but I should be able to submit a patch pretty soon.

Comment by Evan Mezeske [ 26/Jul/12 12:11 AM ]

I've attached my patch to optimize pr-str. I used the StringBuffer approach, with which I see very substantial performance improvements of 2-3X in V8 and 10-100X in IE8.

I also added a suite of tests for pr-str, based loosely on the reader tests, to make sure that I didn't break anything. All tests pass.

Comment by David Nolen [ 26/Jul/12 8:53 AM ]

The approach looks good but this is a pretty big breaking change. After a year of ClojureScript I think there are quite a few code bases that rely on the contract of -pr-seq. I would prefer a patch that created a new protocol, IPrintMethod (or some better name) with support for this new interface for all current IPrintable types. Yes this means a lot of code duplication for now but this allows us to provide a fast path, it allows people to adopt it, and it gives us time to deprecate IPrintable.

Comment by Evan Mezeske [ 26/Jul/12 12:26 PM ]

I was afraid that you might say that! I was hoping that the leading dash on -pr-seq meant that nobody used it, but c'est la vie.

It shouldn't be hard to make things cohabitate in the way you described. Since I already renamed the -pr-seq function, I can just add the old stuff back in, and as long as the protocol names differ it should be easy.

I'll see if I can get to that this evening, although it might have to wait for the weekend. Either way, I'll get to it very soon.

Comment by David Nolen [ 26/Jul/12 12:30 PM ]

Yeah anybody working with deftype or defrecord will probably have provided their own IPrintable implementation. You should be able to dispatch to the fast path with a satisfies? test.


Comment by Evan Mezeske [ 01/Aug/12 10:36 PM ]

I updated my patch with support for backwards compatibility with old types that implement -pr-seq. If an object does not satisfy the new protocol (IPrintWith), pr-str will fall back on IPrintable recursively. I marked various IPrintable-related features as ^:deprecated, and included comments that explain why.

I also added some new tests, for backwards compatibility and printing records.

Comment by Evan Mezeske [ 02/Aug/12 3:33 PM ]

For reference, I added a simple pr-str benchmark to cljs-bench. The test data structure that is printed is not very large, but I'm thinking that it's nested deeply enough to show the performance gains of my patch once it's applied. If not, I can update the benchmark to dynamically build up a larger data structure to really tax pr-str.


Comment by David Nolen [ 10/Aug/12 11:03 AM ]

Could we get a compiler warning for IPrintable usage? We can then remove it from the next release.

Comment by Evan Mezeske [ 11/Aug/12 10:23 PM ]

I will add the warning, hopefully this weekend sometime. I need to think about how to do it, though. The easy way would be to just generate a warning whenever pr-seq is called. However, for a deeply nested structure, that would produce a whole lot of warnings for just one use of pr-seq. I'll try to come up with a solution that will only generate one warning per invocation of pr(-str).

Comment by David Nolen [ 11/Aug/12 10:45 PM ]

Would be better to warn only implementation rather than usage. Ideally the patch just warns on files that are not core.cljs. That's still bound to create some noise internal to CLJS, but I think that'll encourage folks to fix their own code.

Comment by Evan Mezeske [ 11/Aug/12 11:06 PM ]

Ah, yeah, that would be much better. I forgot we're working on the compiler here, that sort of thing should be easy!

Comment by David Nolen [ 29/Aug/12 8:27 PM ]

I wonder if an IWriter protocol isn't warranted here instead of making printer be a function. While it seems like overkill for clientside JS perhaps for serverside scripts it would be useful. Granted those users could easily construct a function but there something protocol like here that might benefit from being made explicit.

Perhaps worth pondering http://docs.oracle.com/javase/6/docs/api/java/io/Writer.html and http://nodejs.org/api/fs.html just a little bit.

Comment by Evan Mezeske [ 30/Aug/12 2:28 AM ]

I added an IWriter protocol, changed IPrintWith's name to IPrintWriter, and changed its implementation to use an IWriter. I added two types that implement IWriter: StringBufferWriter and StandardOutputWriter, and then implemented the pr-* functions in terms of those.

Comment by Evan Mezeske [ 31/Aug/12 12:52 AM ]

I changed IPrintWriter to IPrintWithWriter, as per a conversation I had with David Nolen on IRC.

Comment by David Nolen [ 31/Aug/12 8:18 AM ]

This patch no longer applies to master. Can we get a new patch? Thanks!

Comment by Evan Mezeske [ 01/Sep/12 4:05 PM ]

Rebased the patch on top of master just now.

Comment by David Nolen [ 03/Sep/12 1:12 PM ]

Great it applies! However tests now fail for SpiderMonkey and JavaScriptCore -

out/core-advanced-test.js:371: ReferenceError: write is not defined"


Exception: ReferenceError: Can't find variable: write
global code@out/core-advanced-test.js:371

Also can we please remove the old patches to prevent future confusion?


Comment by Evan Mezeske [ 03/Sep/12 3:04 PM ]

Sadly, JavaScriptCore does not have any way to print to standard output without a newline, so I had to remove the more efficient StandardOutputWriter and output strings by building them up fully in memory and then dumping them in one big print.

Eventually we could add the option to supply a write-fn, which is just like print-fn except guaranteed not to output a trailing newline. Then we could check whether a write-fn was set and if so avoid the intermediate in-memory copy of the string.

The tests are now passing under all three implementations.

Comment by David Nolen [ 04/Sep/12 6:07 PM ]

I'm confused as to why you needed to remove StandardOutputWriter. Is there any reason to not make the implementation adaptable to the available capabilities? I think that most people that use ClojureScript serverside will be using V8 or Rhino anyhow.

Comment by David Nolen [ 04/Sep/12 6:07 PM ]

I'm confused as to why you needed to remove StandardOutputWriter. Is there any reason to not make the implementation adaptable to the available capabilities? I think that most people that use ClojureScript serverside will be using V8 or Rhino anyhow.

Comment by Evan Mezeske [ 04/Sep/12 6:17 PM ]

I removed StandardOutputWriter because it doesn't work with *print-fn*, which is assumed to append a newline to the text it outputs.

I agree that the implementation should be adaptable to the available capabilities, but I was thinking that such behavior could be added in a future patch.

The approach I had in mind was to add a settable *write-fn*, and have the implementation use a buffering approach if it *write-fn* is nil, otherwise use the more efficient StandardOutputWriter. If you agree with this approach, I'd be happy to add it to this patch.

Comment by David Nolen [ 04/Sep/12 6:25 PM ]

But why do people need to deal with *print-fn* or *write-fn* at all? Can't this logic be hidden inside StandardOutputWriter?

(set! *out* (new StandardOutputWriter.))

Granted code exists that depends on *print-fn* so I'm not talking about removing them. But being able to set! *out* also opens the door for rebinding *out* just like we do in Clojure.

Comment by Evan Mezeske [ 04/Sep/12 6:50 PM ]

Is there an official list of what JS platforms are supported by ClojureScript? It appears that every platform has a unique API for printing to standard output, so StandardOutputWriter would have to inspect the environment to find write() (V8), putstr() (SpiderMonkey), no support (JavaScriptCore, Rhino), process.stdout.write() (node.js), or god knows what other functions in the more obscure JS engines.

I assume that *print-fn* exists to allow for compatibility with any JS platform (even those not explicitly tested against). If StandardOutputWriter is going to have platform-specific knowledge baked into it (e.g. a list of functions to try), I think it should also have a back door to allow people to override things for other JS platforms. I guess *out* could be used for that purpose, though...

Comment by David Nolen [ 04/Sep/12 7:03 PM ]

I agree that this is a bit of scope creep for this ticket, we should hash it out a bit on design page first. Do we need *write-fn* for this patch to work? It would be preferable to do w/o it so we can design something proper w/o yet something else people will inadvertently depend on.

Comment by Evan Mezeske [ 04/Sep/12 7:34 PM ]

The patch, in its current form (e.g. without a StandardOutputWriter), assumes that only *print-fn* is available, and doesn't require a *write-fn*.

Even without the StandardOutputWriter, it's generally much faster (and certainly no worse) than the existing implementation.

Comment by David Nolen [ 04/Sep/12 7:40 PM ]

Great! Will go with this then.

Comment by David Nolen [ 04/Sep/12 7:41 PM ]

Can we rebase the patch on master again, this won't apply cleanly because of another commit thank snuck in involving pr-str. Thanks!

Comment by Evan Mezeske [ 04/Sep/12 8:11 PM ]

Huh, that patch was based on the latest master commit – I'm very confused about why it won't apply cleanly. I must have done something weird when I created it. I tested the application of the new patch I'm attaching, so it should be good to go.

Comment by David Nolen [ 04/Sep/12 9:39 PM ]

fixed, http://github.com/clojure/clojurescript/commit/c8bc05ca8e3c7e30a85148dd67398aa4d0e6b468

Comment by Brandon Bloom [ 04/Sep/12 11:01 PM ]

Sweet! Good stuff, Evan, thanks.

You wouldn't be interested in writing a new patch for CLJS-321 would you?

Comment by David Nolen [ 06/Sep/12 8:03 PM ]

This ticket is great but it did affect printing behavior - it would be nice to resolve http://dev.clojure.org/jira/browse/CLJS-372 before cutting a new release.

Comment by Evan Mezeske [ 07/Sep/12 1:41 AM ]

Could you add some more detail to CLJS-372? I'll be happy to fix any regressions caused by this patch. In my testing, I didn't notice anything broken, though, so I'll need a bit more detail so that I know where to start looking.

Comment by David Nolen [ 07/Sep/12 11:11 AM ]

I added some more detail to CLJS-372.

Generated at Thu Apr 25 18:55:11 CDT 2019 using JIRA 4.4#649-r158309.