[CLJS-133] reader/read-string produces malformed keywords in IE9 Created: 20/Jan/12 Updated: 25/Feb/12 Resolved: 25/Feb/12 |
|
| Status: | Resolved |
| Project: | ClojureScript |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Defect | Priority: | Minor |
| Reporter: | g. christensen | Assignee: | Unassigned |
| Resolution: | Completed | Votes: | 0 |
| Labels: | reader | ||
| Environment: |
Windows 7 x86, MSIE 9, Jetty |
||
| Description |
|
The following call: (reader/read-string "{:status :ok}") produces {"\uFFFD'status" "\uFFFD'ok"}, which differs from the expected {:status :ok}. The problem disappears if the raw Unicode special characters are manually replaced with their escaped equivalents ("\uFDD0") in the cljs.core.keyword function in the compiled core.js file. Currently I have no way to reproduce the problem on another system, so I'm not certain about all aspects of it. |
| Comments |
| Comment by David Nolen [ 24/Jan/12 1:07 PM ] |
|
Keywords in ClojureScript are just JavaScript strings. If you mean that you're seeing this on the client, that is expected. Are you saying that you're seeing this in the ClojureScript REPL? |
| Comment by g. christensen [ 26/Jan/12 10:46 AM ] |
|
Yes, I know about the internal keyword representation. The result {"\uFFFD'status" "\uFFFD'ok"} is taken from (pr-str (reader/read-string "{:status :ok}")) passed to an `alert' call. In other browsers it returns {:status :ok}, but in IE it returns the string above. Comparing such keywords with hardcoded keywords returns nil, so most likely they are not interpreted as keywords. As you may notice, the special character code in the malformed keyword differs from the character hardcoded in the ClojureScript code of the `keyword' function (\uFDD0 vs \uFFFD). |
| Comment by g. christensen [ 27/Jan/12 1:43 AM ] |
|
I have just read some of the Unicode specification and found: "U+FFFD � replacement character used to replace an unknown or unprintable character". So it is probably necessary to find the point where the noncharacter is replaced with this character. Or maybe the raw, unescaped noncharacter is replaced internally by \uFFFD, so that in IE there is no distinction between keywords and other symbols obtained through read-string (IE may process files correctly but replace noncharacters in constructed strings). |
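The mismatch described above can be illustrated with a minimal JavaScript sketch. It assumes, based only on the output reported in this ticket, that a keyword is stored as a string made of the U+FDD0 noncharacter, an apostrophe, and the name; `keywordString` and `mangled` are hypothetical names used for illustration and are not part of cljs.core.

```javascript
// Assumption: keyword strings start with U+FDD0 followed by an apostrophe,
// as suggested by the "\uFFFD'status" output reported in this ticket.
const KEYWORD_PREFIX = "\uFDD0'";

// Hypothetical helper: builds the internal string for a keyword name.
function keywordString(name) {
  return KEYWORD_PREFIX + name;
}

// What a correctly loaded script would produce for :status.
const expected = keywordString("status");

// What IE9 reportedly produced: the noncharacter was swapped for U+FFFD.
const mangled = "\uFFFD'" + "status";

// The two strings differ in their first code unit, so they never compare equal.
console.log(expected === mangled);
console.log(expected.charCodeAt(0).toString(16)); // fdd0
console.log(mangled.charCodeAt(0).toString(16));  // fffd
```

Because the strings differ only in the first code unit, any equality check against a keyword built from the intended prefix fails, which matches the failed comparison reported in the previous comment.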
| Comment by David Nolen [ 03/Feb/12 7:17 PM ] |
|
Having people looking into the IE issues is fantastic - this is similar to another IE9 reader issue, do you have an approach that you think will solve the problem? Thanks. |
| Comment by g. christensen [ 04/Feb/12 10:14 AM ] |
|
The only thing I can think of is to emit escaped \uFDD0 and \uFDD1 literals instead of the raw characters in the compiled JavaScript output, or some compiler hack that places the escaped literals in the `keyword' and `symbol' construction functions. |
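As a rough sketch of the escaping idea in this comment, the JavaScript below rewrites every non-ASCII UTF-16 code unit as a \uXXXX escape before the text is placed in a string literal. The function name `escapeNonAscii` is hypothetical, and this is not the patch that was eventually applied to the compiler; it only shows why an escaped literal survives an encoding mix-up while a raw character may not.

```javascript
// Hypothetical helper: replaces each code unit above 127 with a \uXXXX escape,
// so the emitted source file contains only ASCII characters.
// It does not escape quotes or backslashes; a real emitter would need to.
function escapeNonAscii(s) {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const code = s.charCodeAt(i);
    if (code > 127) {
      // Four uppercase hex digits, zero-padded, e.g. \uFDD0.
      out += "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
    } else {
      out += s[i];
    }
  }
  return out;
}

// The raw keyword prefix becomes six ASCII characters in the emitted source.
const emitted = '"' + escapeNonAscii("\uFDD0'status") + '"';
console.log(emitted);

// Parsing the escaped literal yields the original string again.
console.log(JSON.parse(emitted) === "\uFDD0'status");
```

The emitted file then contains no bytes outside ASCII, so a wrong charset guess by the browser has no raw noncharacter to replace.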
| Comment by David Nolen [ 05/Feb/12 12:18 PM ] |
|
And you're sure that you're setting the utf-8 meta tag in your HTML document? |
| Comment by David Nolen [ 20/Feb/12 10:50 AM ] |
|
Same as |
| Comment by David Nolen [ 22/Feb/12 8:53 AM ] |
|
This ticket is different from |
| Comment by Thomas Scheiblauer [ 22/Feb/12 11:09 AM ] |
|
Applying http://dev.clojure.org/jira/secure/attachment/10939/cljs-133_fix.patch to the current HEAD makes read-string work as expected. This is because David's patch for CLJS-139 (http://dev.clojure.org/jira/secure/attachment/10913/139_fix_unicode_emit.patch) does not address the "emit-constant" multimethod for String (only Character, clojure.lang.Keyword and clojure.lang.Symbol). We will have to do the same replacement for String (for each character) as David did for Character (maybe by using clojure.string/replace) to make the 2 functions I patched in core.cljs work in their previous unpatched state. (I hope someone can understand my gibberish.) Edit: I deleted the referenced patch because it is now obsolete. |
| Comment by Thomas Scheiblauer [ 23/Feb/12 8:33 AM ] |
|
I have attached a patch to CLJS-139 which fixes this related issue. |
| Comment by Thomas Scheiblauer [ 23/Feb/12 12:52 PM ] |
|
I have just attached a general non-ascii escape patch to CLJS-139 which obsoletes my previous one! |
| Comment by David Nolen [ 25/Feb/12 10:25 AM ] |
|
Fixed, https://github.com/clojure/clojurescript/commit/965dc505229652558adcb526ecb5a9f91ce31ce2 |