ClojureScript compiler with optimizations off outputs UTF-8, which is likely to be misrecognised by the browser
Completed
Description
By default, the Google Closure compiler outputs all JavaScript as US-ASCII, using embedded \u1234-style escapes for non-ASCII Unicode characters. So when the ClojureScript compiler is run with optimizations enabled, the Google compilation step ensures that all output is US-ASCII.
However, when ClojureScript is compiled with optimizations off, the ClojureScript compiler emits UTF-8 directly.
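A quick way to see the difference is to check whether a compiled output file contains any non-ASCII characters. The sketch below is illustrative only; the path "out/main.js" is a hypothetical example, not a fixed location used by the compiler:

    ;; Sketch only: true when the compiled file is pure ASCII.
    (defn ascii-only? [path]
      (every? #(< (int %) 128)
              (slurp path :encoding "UTF-8")))

    ;; (ascii-only? "out/main.js")   ; hypothetical output path
    ;; => true for optimized builds, false for unoptimized builds that
    ;;    contain raw UTF-8 characters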
When the JavaScript is served from a web server, it will typically be served with a text/javascript MIME type, which by default in HTTP is assumed to be in the ISO-8859-1 encoding.
When the JavaScript is served from a file:// URL, I'm not sure what character encoding the browser would assume the JavaScript to be in.
Setting the charset attribute on the script tag is not guaranteed to override any encoding set by the server.
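For reference (this is a workaround, not the fix proposed in this ticket), a development server can remove the ambiguity by declaring the charset explicitly. The sketch below assumes a Ring-based handler, which is an assumption about the serving setup rather than anything ClojureScript provides:

    (defn wrap-js-charset
      "Sketch: force an explicit charset on JavaScript responses so the
      browser does not fall back to ISO-8859-1."
      [handler]
      (fn [request]
        (let [response (handler request)]
          (if (.endsWith ^String (:uri request) ".js")
            (assoc-in response [:headers "Content-Type"]
                      "text/javascript; charset=utf-8")
            response))))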
Typically browsers will not assume JavaScript source code to be in UTF-8, so uncompiled programs will be misinterpreted. As ClojureScript uses Unicode characters internally to tag keywords and symbols, this causes simple examples to fail to run in a development environment.
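The failure mode can be reproduced at the JVM level. The sketch below assumes the early ClojureScript representation of keywords as strings tagged with U+FDD0 (the exact tag character is an assumption for illustration):

    ;; Sketch: a tagged keyword string written as UTF-8 but decoded as
    ;; ISO-8859-1 no longer matches the value the runtime expects.
    (let [tagged    "\ufdd0'foo"                  ; assumed internal form of :foo
          bytes     (.getBytes tagged "UTF-8")    ; bytes the compiler writes
          as-latin1 (String. bytes "ISO-8859-1")] ; what the browser decodes
      (= tagged as-latin1))
    ;; => false - the single U+FDD0 character becomes three Latin-1 characters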
See the following threads where this has caused a problem:
http://groups.google.com/group/clojure-dev/browse_thread/thread/e4dc61383455294
http://groups.google.com/group/clojure/browse_thread/thread/6613fcf1a9129c3a#
It would improve robustness for the unoptimized output to use a character encoding consistent with the optimized output, i.e. US-ASCII. US-ASCII ensures that the compiled output will work when served from almost any web server or file:// environment, as it is a subset of both UTF-8 and ISO-8859-1 - the two most likely interpretations.
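A minimal sketch of the kind of escaping that would make the unoptimized output US-ASCII (the function name is illustrative, not an actual compiler function):

    (defn escape-non-ascii
      "Sketch: replace every character above 0x7F with a \\uXXXX escape,
      roughly what the Closure compiler does for optimized output."
      [s]
      (apply str
             (map (fn [c]
                    (if (> (int c) 127)
                      (format "\\u%04x" (int c))
                      c))
                  s)))

    ;; (escape-non-ascii "\ufdd0'foo")
    ;; => "\\ufdd0'foo"   ; pure ASCII, safe under any of the likely charsets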