<< Back to previous view

[CLJS-819] cljs.reader cannot handle character classes beginning with slashes in regex literals Created: 20/Jun/14  Updated: 01/Jul/14  Resolved: 01/Jul/14

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Critical
Reporter: Ziyang Hu Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: bug, cljs, reader

Attachments: Text File cljs-819.patch    

 Description   

cljs.user> (cljs.reader/read-string "#\"\\s\"")
Compilation error: Error: Unexpected unicode escape \s



 Comments   
Comment by Ziyang Hu [ 20/Jun/14 10:03 AM ]

This in particular means that (cljs.reader/read-string (str [#"\s"])) won't work

Comment by Francis Avila [ 25/Jun/14 11:46 AM ]

Patch and test.

Comment by David Nolen [ 01/Jul/14 9:25 PM ]

fixed https://github.com/clojure/clojurescript/commit/32259c5ff3f86ea086ae3949403df80c2f518c7e





[CLJS-775] cljs.reader parses radix form of int literals (e.g. 2r101) incorrectly Created: 27/Feb/14  Updated: 10/May/14  Resolved: 10/May/14

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Francis Avila Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader

Attachments: Text File cljs-775-initial.patch     Text File cljs-775.patch    
Patch: Code and Test

 Description   
ClojureScript:cljs.user> (cljs.reader/read-string "2r10")
"Error evaluating:" (cljs.reader/read-string "2r10") :as "cljs.reader.read_string.call(null,\"2r10\")"
org.mozilla.javascript.JavaScriptException: Error: Invalid number format [2r10] (file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js#107)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:107 (anonymous)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:112 (anonymous)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:374 (read_number)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:650 (read)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:677 (read_string)
	at <cljs repl>:1 (anonymous)
	at <cljs repl>:1


 Comments   
Comment by Francis Avila [ 28/Feb/14 1:42 AM ]

Turns out other integer literals were broken besides radix due to a re-match problem (CLJS-776). Patch includes fix and tests for all the different integer literal forms.

Floats, ratios, and symbols/keywords might also have parsed incorrectly in certain cases, but I did not produce failing tests to confirm.

Comment by David Nolen [ 08/May/14 6:42 PM ]

Can we get a new patch rebased on master? Thanks!

Comment by Francis Avila [ 10/May/14 11:36 AM ]

Rebased patch.

Comment by David Nolen [ 10/May/14 2:10 PM ]

fixed https://github.com/clojure/clojurescript/commit/56ea020fd9b15df220ba247f73b68873f041d8ef





[CLJS-540] Switch to tools.reader for cljs.analyzer/forms-seq Created: 15/Jul/13  Updated: 29/Jul/13  Resolved: 29/Jul/13

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Sean Grove Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: analyzer, cljs, reader, sourcemap
Environment:

CLJS master


Attachments: Text File tools_reader_with_passing_tests.patch    
Patch: Code

 Description   

Switches cljs.analyzer to use tools.reader, so we can get more accurate column location for symbols, prepping for more fleshed-out source maps.



 Comments   
Comment by David Nolen [ 15/Jul/13 10:43 AM ]

Excellent, we need two more things in this patch, can you update bootstrap.sh and the POM file? Thanks.

Comment by Sean Grove [ 15/Jul/13 11:00 AM ]

Updated with POM information and bootstrap update

Comment by Sean Grove [ 22/Jul/13 2:20 PM ]

CA's been processed and I'm listed on the contributing page, so shouldn't be blocked by that any longer.

Comment by David Nolen [ 27/Jul/13 2:09 PM ]

I tried applying this patch and rerunning the bootstrap script. This works but when I try to run script/test I get an error about the tools.reader not being on the classpath. Even trying to require tools.reader at via the repl doesn't work for me.

Comment by David Nolen [ 29/Jul/13 11:14 AM ]

fixed http://github.com/clojure/clojurescript/commit/9f010ff5d4a122b0f1dc93905647f309cc45c699





[CLJS-454] Instance Reader to Support Micro/Nanoseconds Created: 04/Jan/13  Updated: 03/Aug/13  Resolved: 03/Aug/13

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Anatoly Polinsky Assignee: Unassigned
Resolution: Completed Votes: 4
Labels: reader, timestamp
Environment:

N/A


Attachments: File CLJS-454.diff     File CLJS-454-patch2.diff     Text File cljs-reader-nanosecond-workarround-corrected.patch    

 Description   

Any timestamps with a greater than millisecond precision cannot be handled by the ClouseScript reader:

> cljs.reader.read_date("2012-12-30T23:20:05.066980000-00:00")
> Error: Assert failed: timestamp millisecond field must be in range 0..999 Failed:  0<=66980000<=999 (<= low n high)

Here "2012-12-30T23:20:05.066980000-00:00" is an example of an ordinary timestamp that is returned from Postgres.

ClojureScript reader interprets the nanosecond portion "066980000" as milliseconds and the check here fails:

def parse-and-validate-timestamp
...
   (check 0 ms 999 "timestamp millisecond field must be in range 0..999")


 Comments   
Comment by David Nolen [ 04/Jan/13 4:55 PM ]

Is this behavior supported in Clojure?

Comment by Jozef Wagner [ 05/Jan/13 6:48 AM ]

It seems it is

Comment by Anatoly Polinsky [ 06/Jan/13 2:18 AM ]

@David,

Yep, it is supported. Enhancing Jozef's response:

user=> (use 'clojure.instant)
nil
user=> (read-instant-date "2012-12-30T23:20:05.066980000-00:00")
#inst "2012-12-30T23:20:05.066-00:00"

/Anatoly

Comment by David Nolen [ 09/Jan/13 10:59 PM ]

Ok, do you all have a good idea about what a patch would look like?

Comment by Thomas Heller [ 05/Feb/13 3:28 PM ]

Since JavaScript has no support for nanoseconds in Date, I'd vote for dropping the nanoseconds. Currently the reader just blows up when reading a String with java.sql.Timestamp printed in CLJ, since that is printed in nanosecond precision.

While probably not the best solution, I attached a patch for cljs.reader/parse-and-validate-timestamp fn that just drops the nanoseconds from the millisecond portion. Would be easier to just strip the ms string to 3 digits but that would cause "2012-12-30T23:20:05.066980000012312312123121-00:00" to validate.

Comment by Thomas Heller [ 05/Feb/13 3:35 PM ]

Well, Math. Just realized that this is not really correct, since 1000 is closer to 999 than it is to 100.

Comment by Thomas Heller [ 05/Feb/13 3:37 PM ]

Sorry, for that brainfart.

Comment by David Nolen [ 06/Feb/13 11:39 AM ]

Or we could return a custom ClojureScript Date type that provides the same api as js/Date but adds support for nanoseconds.

Comment by Thomas Heller [ 20/Feb/13 5:20 PM ]

One could extend the js/Date prototype with a setNanos/getNanos method and call them accordingly. I'd offer to implement that but the cljs.reader/parse-and-validate-timestamp function scares me, any objections to rewriting that?

As for the js/Date extension, thats easy enough:

(set! (.. js/Date -prototype -setNanos) (fn [ns] (this-as self (set! (.-nanos self) ns))))
(set! (.. js/Date -prototype -getNanos) (fn [] (this-as self (or (.-nanos self) 0))))

Not sure if its a good idea though, messing with otherwise native code might not be "stable".

Not convinced that keeping the nanos around is "required" since javascript cannot construct Dates with nanos, but its probably better not to lose the nanos in case of round tripping from clj -> cljs -> clj.

Comment by David Nolen [ 21/Feb/13 12:25 AM ]

We definitely don't want to mutate Date's prototype without namespacing. I'm not sure we want to mutate Date's prototype at all. That's why I suggested a ClojureScript type with the same interface as Date just as Google has done in Closure.

Comment by Jonas Enlund [ 31/Jul/13 4:21 AM ]

This patch (CLJS-454.diff) takes the "truncate to millisecs" approach which I think is correct considering that the clojure reader does the same thing (but they truncate to nanoseconds instead).

The patch does some refactoring of parse-and-validate-timestamp and in the process fixes CLJS-564 as well as a subtle bug where the interpretation of the fraction part differed between clojurescript and clojure (see commit msg for details)

I also added tests.

Comment by Jonas Enlund [ 31/Jul/13 4:54 AM ]

There are duplicate tests in core-tests[1] and reader-tests[2]. This wouldn't be a big deal but each of them generates 7000+ assertions which is a bit excessive. Also note that the assertions in reader-tests doesn't do any padding so instant literals of the form #inst "2010-1-1..." are generated (which are not valid but goes unnoticed in the current implementation). This is why the CLJS-454.diff patch fails to pass the tests.

Should I update the patch where I remove some of the tests (either in core-tests or reader-tests)?

[1] https://github.com/clojure/clojurescript/blob/master/test/cljs/cljs/core_test.cljs#L1770
[2] https://github.com/clojure/clojurescript/blob/master/test/cljs/cljs/reader_test.cljs#L57

Comment by David Nolen [ 31/Jul/13 4:59 AM ]

I note that Clojure includes these tests so probably not a good idea to remove them. Also they actually test different things right? I'd rather see the tests fixed if they need to be adjusted for the new behavior.

Comment by Jonas Enlund [ 31/Jul/13 5:58 AM ]

I can't find the tests you're referring to. Link?

Comment by David Nolen [ 31/Jul/13 6:06 AM ]

https://github.com/clojure/clojure/blob/master/test/clojure/test_clojure/reader.clj#L482

Comment by Jonas Enlund [ 31/Jul/13 6:18 AM ]

format take care of the padding

(format "#inst \"2010-%02d-%02dT%02d:14:15.666-06:00\"" month day hour)

so those tests won't generate strings of the form "2010-1-1T1:14:15.666-06:00".

Also, the clojure reader can't read such literals and requires zero-padding:

user=> #inst "2010-1-1T1:14:15.666-06:00"
RuntimeException Unrecognized date/time syntax: 2010-1-1T1:14:15.666-06:00  clojure.instant/fn--6183/fn--6184 (instant.clj:118)
user=> #inst "2010-01-01T01:14:15.666-06:00"
#inst "2010-01-01T07:14:15.666-00:00"

The clojurescript reader returns bogus dates:

ClojureScript:cljs.user> (reader/read-string "#inst \"2010-1-1T1:14:15.666-06:00\"")
#inst "1970-01-01T00:00:00.000-00:00"
ClojureScript:cljs.user> (reader/read-string "#inst \"2010-01-01T01:14:15.666-06:00\"")
#inst "2010-01-01T07:14:15.666-00:00"

which is probably the reason the tests from https://github.com/clojure/clojurescript/blob/master/test/cljs/cljs/reader_test.cljs#L57 still passes without assertion errors.

Comment by David Nolen [ 03/Aug/13 5:05 PM ]

fixed, http://github.com/clojure/clojurescript/commit/a749ab6c04baa6bd4c890e860adf43b3d702932b





[CLJS-356] `read-string` exception message should be the same as in Clojure Created: 16/Aug/12  Updated: 27/Jul/13  Resolved: 17/Aug/12

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Shantanu Kumar Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader
Environment:

ClojureScript (Rhino REPL, browser, PhantomJS)



 Description   

When I run (read-string ""), I get the exception message "EOF while reading" on Clojure (clojure.core) but on ClojureScript (cljs.reader) the message is just "EOF".

ClojureScript's `read-string` should be fixed to match the error message "EOF while reading".

Reference: https://groups.google.com/forum/?fromgroups#!topic/clojure/k-ZX5MaXoiQ%5B1-25%5D



 Comments   
Comment by David Nolen [ 17/Aug/12 12:40 AM ]

Fixed, http://github.com/clojure/clojurescript/commit/df6f316acdc3375ea6de29b7700dac1a5c8e9dbe





[CLJS-133] reader/read-string produces malformed keywords in IE9 Created: 20/Jan/12  Updated: 27/Jul/13  Resolved: 25/Feb/12

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: g. christensen Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader
Environment:

Windows 7 x86, MSIE 9, Jetty



 Description   

the following call: (reader/read-string "{:status :ok}") produces {"\uFFFD'status" "\uFFFD'ok"} which differs from expected {:status :ok}
the server inserts proper content-type (utf-8) header for all javascript files

the problem disappears if unicode special characters are manually replaced with their escaped equivalents ("\uFDD0") in cljs.core.keyword function in the compiled core.js file
it doesn't disappear when call to the str_STAR_ function is replaced to the concatenation operators, which suggest that the function works correctly and adds some mystery to the problem

currently I have no possibility to reproduce the problem on other system, so I'm not certain in all of the aspects



 Comments   
Comment by David Nolen [ 24/Jan/12 1:07 PM ]

Keywords in ClojureScript are just JavaScript strings. If you mean that you're seeing this on the client, that is expected, are you saying that you're seeing this in the ClojureScript REPL?

Comment by g. christensen [ 26/Jan/12 10:46 AM ]

Yes, I know about the internal keyword representation, the result {"\uFFFD'status" "\uFFFD'ok"} is taken from (pr-str (reader/read-string "{:status :ok}")) put in the `alert' call, in other browsers it returns {:status :ok}, but in IE it returns the string above. Comparison of such keywords with hardcoded keywords returns nil, so most likely they are not interpreted as keywords (as you may notice, the special character code in malformed keyword differs from the character hardcoded in the clojurescript code of the `keyword' function (\uFDD0 vs \uFFFD).
It's strange because it works perfectly in other browsers, and it's like that it's a some sort of endianness problem or something, but I don't know so much about IE internals and can't judje on this matter.
I have tried to reproduce the problem on other machine with IE 9.0.8112.16421 64bit update 9.0.3 and it's still there.

Comment by g. christensen [ 27/Jan/12 1:43 AM ]

I just have read some of unicode specifications and found: "U+FFFD � replacement character used to replace an unknown or unprintable character", so it probably necessary to find point where the noncharacter replaced with this character, or may be the raw nonescaped noncharacter is replaced internally by \uFFFD and there is no distinction between keywords and other symbols in IE, obtained through read-string (it may process files correctly but replace noncharacters in constructed strings).

Comment by David Nolen [ 03/Feb/12 7:17 PM ]

Having people looking into the IE issues is fantastic - this is similar to another IE9 reader issue, do you have an approach that you think will solve the problem? Thanks.

Comment by g. christensen [ 04/Feb/12 10:14 AM ]

The only thing I can think up is to place \uFDD0 and \uFDD1 escaped literals instead of raw characters in compiled JavaScript output or some compiler hack which will place the escaped literals in `keyword' and `symbol' construction functions.

Comment by David Nolen [ 05/Feb/12 12:18 PM ]

And you're sure that you're setting the utf-8 meta tag in your HTML document?

Comment by David Nolen [ 20/Feb/12 10:50 AM ]

Same as CLJS-139

Comment by David Nolen [ 22/Feb/12 8:53 AM ]

This ticket is different from CLJS-139, this is only about the reader.

Comment by Thomas Scheiblauer [ 22/Feb/12 11:09 AM ]

applying http://dev.clojure.org/jira/secure/attachment/10939/cljs-133_fix.patch to the current HEAD makes read-string work as expected. This is because David's patch for cljs-139 (http://dev.clojure.org/jira/secure/attachment/10913/139_fix_unicode_emit.patch) does not address the "emit-constant" multimethod for String (only Character, clojure.lang.Keyword and clojure.lang.Symbol). Will will have to do the same replacement for String (each character) as David did for Character (maybe by utilizing clojure.string.replace) to make the 2 functions I patched in core.cljs work in the previous unpatched state (I hope someone can understand my gibberish

!!! deleted referenced patch because it is now obsolete !!!

Comment by Thomas Scheiblauer [ 23/Feb/12 8:33 AM ]

I have attached a patch to CLJS-139 which fixes this related issue.

Comment by Thomas Scheiblauer [ 23/Feb/12 12:52 PM ]

I have just attached a general non-ascii escape patch to CLJS-139 which obsoletes my previous one!

Comment by David Nolen [ 25/Feb/12 10:25 AM ]

Fixed, https://github.com/clojure/clojurescript/commit/965dc505229652558adcb526ecb5a9f91ce31ce2





Generated at Wed Oct 22 23:00:45 CDT 2014 using JIRA 4.4#649-r158309.