<< Back to previous view

[CLJS-819] cljs.reader cannot handle character classes beginning with slashes in regex literals Created: 20/Jun/14  Updated: 01/Jul/14  Resolved: 01/Jul/14

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Critical
Reporter: Ziyang Hu Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: bug, cljs, reader

Attachments: Text File cljs-819.patch    

 Description   

cljs.user> (cljs.reader/read-string "#\"\\s\"")
Compilation error: Error: Unexpected unicode escape \s



 Comments   
Comment by Ziyang Hu [ 20/Jun/14 10:03 AM ]

This in particular means that (cljs.reader/read-string (str [#"\s"])) won't work

Comment by Francis Avila [ 25/Jun/14 11:46 AM ]

Patch and test.

Comment by David Nolen [ 01/Jul/14 9:25 PM ]

fixed https://github.com/clojure/clojurescript/commit/32259c5ff3f86ea086ae3949403df80c2f518c7e





[CLJS-775] cljs.reader parses radix form of int literals (e.g. 2r101) incorrectly Created: 27/Feb/14  Updated: 10/May/14  Resolved: 10/May/14

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Francis Avila Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader

Attachments: Text File cljs-775-initial.patch     Text File cljs-775.patch    
Patch: Code and Test

 Description   
ClojureScript:cljs.user> (cljs.reader/read-string "2r10")
"Error evaluating:" (cljs.reader/read-string "2r10") :as "cljs.reader.read_string.call(null,\"2r10\")"
org.mozilla.javascript.JavaScriptException: Error: Invalid number format [2r10] (file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js#107)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:107 (anonymous)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:112 (anonymous)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:374 (read_number)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:650 (read)
	at file:/Users/favila/wcs/clojurescript/.repl/cljs/reader.js:677 (read_string)
	at <cljs repl>:1 (anonymous)
	at <cljs repl>:1


 Comments   
Comment by Francis Avila [ 28/Feb/14 1:42 AM ]

Turns out other integer literals were broken besides radix due to a re-match problem (CLJS-776). Patch includes fix and tests for all the different integer literal forms.

Floats, ratios, and symbols/keywords might also have parsed incorrectly in certain cases, but I did not produce failing tests to confirm.

Comment by David Nolen [ 08/May/14 6:42 PM ]

Can we get a new patch rebased on master? Thanks!

Comment by Francis Avila [ 10/May/14 11:36 AM ]

Rebased patch.

Comment by David Nolen [ 10/May/14 2:10 PM ]

fixed https://github.com/clojure/clojurescript/commit/56ea020fd9b15df220ba247f73b68873f041d8ef





[CLJS-540] Switch to tools.reader for cljs.analyzer/forms-seq Created: 15/Jul/13  Updated: 29/Jul/13  Resolved: 29/Jul/13

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Sean Grove Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: analyzer, cljs, reader, sourcemap
Environment:

CLJS master


Attachments: Text File tools_reader_with_passing_tests.patch    
Patch: Code

 Description   

Switches cljs.analyzer to use tools.reader, so we can get more accurate column location for symbols, prepping for more fleshed-out source maps.



 Comments   
Comment by David Nolen [ 15/Jul/13 10:43 AM ]

Excellent, we need two more things in this patch, can you update bootstrap.sh and the POM file? Thanks.

Comment by Sean Grove [ 15/Jul/13 11:00 AM ]

Updated with POM information and bootstrap update

Comment by Sean Grove [ 22/Jul/13 2:20 PM ]

CA's been processed and I'm listed on the contributing page, so shouldn't be blocked by that any longer.

Comment by David Nolen [ 27/Jul/13 2:09 PM ]

I tried applying this patch and rerunning the bootstrap script. This works but when I try to run script/test I get an error about the tools.reader not being on the classpath. Even trying to require tools.reader at via the repl doesn't work for me.

Comment by David Nolen [ 29/Jul/13 11:14 AM ]

fixed http://github.com/clojure/clojurescript/commit/9f010ff5d4a122b0f1dc93905647f309cc45c699





[CLJS-454] Instance Reader to Support Micro/Nanoseconds Created: 04/Jan/13  Updated: 03/Aug/13  Resolved: 03/Aug/13

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Anatoly Polinsky Assignee: Unassigned
Resolution: Completed Votes: 4
Labels: reader, timestamp
Environment:

N/A


Attachments: File CLJS-454.diff     File CLJS-454-patch2.diff     Text File cljs-reader-nanosecond-workarround-corrected.patch    

 Description   

Any timestamps with a greater than millisecond precision cannot be handled by the ClouseScript reader:

> cljs.reader.read_date("2012-12-30T23:20:05.066980000-00:00")
> Error: Assert failed: timestamp millisecond field must be in range 0..999 Failed:  0<=66980000<=999 (<= low n high)

Here "2012-12-30T23:20:05.066980000-00:00" is an example of an ordinary timestamp that is returned from Postgres.

ClojureScript reader interprets the nanosecond portion "066980000" as milliseconds and the check here fails:

def parse-and-validate-timestamp
...
   (check 0 ms 999 "timestamp millisecond field must be in range 0..999")


 Comments   
Comment by David Nolen [ 04/Jan/13 4:55 PM ]

Is this behavior supported in Clojure?

Comment by Jozef Wagner [ 05/Jan/13 6:48 AM ]

It seems it is

Comment by Anatoly Polinsky [ 06/Jan/13 2:18 AM ]

@David,

Yep, it is supported. Enhancing Jozef's response:

user=> (use 'clojure.instant)
nil
user=> (read-instant-date "2012-12-30T23:20:05.066980000-00:00")
#inst "2012-12-30T23:20:05.066-00:00"

/Anatoly

Comment by David Nolen [ 09/Jan/13 10:59 PM ]

Ok, do you all have a good idea about what a patch would look like?

Comment by Thomas Heller [ 05/Feb/13 3:28 PM ]

Since JavaScript has no support for nanoseconds in Date, I'd vote for dropping the nanoseconds. Currently the reader just blows up when reading a String with java.sql.Timestamp printed in CLJ, since that is printed in nanosecond precision.

While probably not the best solution, I attached a patch for cljs.reader/parse-and-validate-timestamp fn that just drops the nanoseconds from the millisecond portion. Would be easier to just strip the ms string to 3 digits but that would cause "2012-12-30T23:20:05.066980000012312312123121-00:00" to validate.

Comment by Thomas Heller [ 05/Feb/13 3:35 PM ]

Well, Math. Just realized that this is not really correct, since 1000 is closer to 999 than it is to 100.

Comment by Thomas Heller [ 05/Feb/13 3:37 PM ]

Sorry, for that brainfart.

Comment by David Nolen [ 06/Feb/13 11:39 AM ]

Or we could return a custom ClojureScript Date type that provides the same api as js/Date but adds support for nanoseconds.

Comment by Thomas Heller [ 20/Feb/13 5:20 PM ]

One could extend the js/Date prototype with a setNanos/getNanos method and call them accordingly. I'd offer to implement that but the cljs.reader/parse-and-validate-timestamp function scares me, any objections to rewriting that?

As for the js/Date extension, thats easy enough:

(set! (.. js/Date -prototype -setNanos) (fn [ns] (this-as self (set! (.-nanos self) ns))))
(set! (.. js/Date -prototype -getNanos) (fn [] (this-as self (or (.-nanos self) 0))))

Not sure if its a good idea though, messing with otherwise native code might not be "stable".

Not convinced that keeping the nanos around is "required" since javascript cannot construct Dates with nanos, but its probably better not to lose the nanos in case of round tripping from clj -> cljs -> clj.

Comment by David Nolen [ 21/Feb/13 12:25 AM ]

We definitely don't want to mutate Date's prototype without namespacing. I'm not sure we want to mutate Date's prototype at all. That's why I suggested a ClojureScript type with the same interface as Date just as Google has done in Closure.

Comment by Jonas Enlund [ 31/Jul/13 4:21 AM ]

This patch (CLJS-454.diff) takes the "truncate to millisecs" approach which I think is correct considering that the clojure reader does the same thing (but they truncate to nanoseconds instead).

The patch does some refactoring of parse-and-validate-timestamp and in the process fixes CLJS-564 as well as a subtle bug where the interpretation of the fraction part differed between clojurescript and clojure (see commit msg for details)

I also added tests.

Comment by Jonas Enlund [ 31/Jul/13 4:54 AM ]

There are duplicate tests in core-tests[1] and reader-tests[2]. This wouldn't be a big deal but each of them generates 7000+ assertions which is a bit excessive. Also note that the assertions in reader-tests doesn't do any padding so instant literals of the form #inst "2010-1-1..." are generated (which are not valid but goes unnoticed in the current implementation). This is why the CLJS-454.diff patch fails to pass the tests.

Should I update the patch where I remove some of the tests (either in core-tests or reader-tests)?

[1] https://github.com/clojure/clojurescript/blob/master/test/cljs/cljs/core_test.cljs#L1770
[2] https://github.com/clojure/clojurescript/blob/master/test/cljs/cljs/reader_test.cljs#L57

Comment by David Nolen [ 31/Jul/13 4:59 AM ]

I note that Clojure includes these tests so probably not a good idea to remove them. Also they actually test different things right? I'd rather see the tests fixed if they need to be adjusted for the new behavior.

Comment by Jonas Enlund [ 31/Jul/13 5:58 AM ]

I can't find the tests you're referring to. Link?

Comment by David Nolen [ 31/Jul/13 6:06 AM ]

https://github.com/clojure/clojure/blob/master/test/clojure/test_clojure/reader.clj#L482

Comment by Jonas Enlund [ 31/Jul/13 6:18 AM ]

format take care of the padding

(format "#inst \"2010-%02d-%02dT%02d:14:15.666-06:00\"" month day hour)

so those tests won't generate strings of the form "2010-1-1T1:14:15.666-06:00".

Also, the clojure reader can't read such literals and requires zero-padding:

user=> #inst "2010-1-1T1:14:15.666-06:00"
RuntimeException Unrecognized date/time syntax: 2010-1-1T1:14:15.666-06:00  clojure.instant/fn--6183/fn--6184 (instant.clj:118)
user=> #inst "2010-01-01T01:14:15.666-06:00"
#inst "2010-01-01T07:14:15.666-00:00"

The clojurescript reader returns bogus dates:

ClojureScript:cljs.user> (reader/read-string "#inst \"2010-1-1T1:14:15.666-06:00\"")
#inst "1970-01-01T00:00:00.000-00:00"
ClojureScript:cljs.user> (reader/read-string "#inst \"2010-01-01T01:14:15.666-06:00\"")
#inst "2010-01-01T07:14:15.666-00:00"

which is probably the reason the tests from https://github.com/clojure/clojurescript/blob/master/test/cljs/cljs/reader_test.cljs#L57 still passes without assertion errors.

Comment by David Nolen [ 03/Aug/13 5:05 PM ]

fixed, http://github.com/clojure/clojurescript/commit/a749ab6c04baa6bd4c890e860adf43b3d702932b





[CLJS-356] `read-string` exception message should be the same as in Clojure Created: 16/Aug/12  Updated: 27/Jul/13  Resolved: 17/Aug/12

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Shantanu Kumar Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader
Environment:

ClojureScript (Rhino REPL, browser, PhantomJS)



 Description   

When I run (read-string ""), I get the exception message "EOF while reading" on Clojure (clojure.core) but on ClojureScript (cljs.reader) the message is just "EOF".

ClojureScript's `read-string` should be fixed to match the error message "EOF while reading".

Reference: https://groups.google.com/forum/?fromgroups#!topic/clojure/k-ZX5MaXoiQ%5B1-25%5D



 Comments   
Comment by David Nolen [ 17/Aug/12 12:40 AM ]

Fixed, http://github.com/clojure/clojurescript/commit/df6f316acdc3375ea6de29b7700dac1a5c8e9dbe





[CLJS-133] reader/read-string produces malformed keywords in IE9 Created: 20/Jan/12  Updated: 27/Jul/13  Resolved: 25/Feb/12

Status: Closed
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: g. christensen Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader
Environment:

Windows 7 x86, MSIE 9, Jetty



 Description   

the following call: (reader/read-string "{:status :ok}") produces {"\uFFFD'status" "\uFFFD'ok"} which differs from expected {:status :ok}
the server inserts proper content-type (utf-8) header for all javascript files

the problem disappears if unicode special characters are manually replaced with their escaped equivalents ("\uFDD0") in cljs.core.keyword function in the compiled core.js file
it doesn't disappear when call to the str_STAR_ function is replaced to the concatenation operators, which suggest that the function works correctly and adds some mystery to the problem

currently I have no possibility to reproduce the problem on other system, so I'm not certain in all of the aspects



 Comments   
Comment by David Nolen [ 24/Jan/12 1:07 PM ]

Keywords in ClojureScript are just JavaScript strings. If you mean that you're seeing this on the client, that is expected, are you saying that you're seeing this in the ClojureScript REPL?

Comment by g. christensen [ 26/Jan/12 10:46 AM ]

Yes, I know about the internal keyword representation, the result {"\uFFFD'status" "\uFFFD'ok"} is taken from (pr-str (reader/read-string "{:status :ok}")) put in the `alert' call, in other browsers it returns {:status :ok}, but in IE it returns the string above. Comparison of such keywords with hardcoded keywords returns nil, so most likely they are not interpreted as keywords (as you may notice, the special character code in malformed keyword differs from the character hardcoded in the clojurescript code of the `keyword' function (\uFDD0 vs \uFFFD).
It's strange because it works perfectly in other browsers, and it's like that it's a some sort of endianness problem or something, but I don't know so much about IE internals and can't judje on this matter.
I have tried to reproduce the problem on other machine with IE 9.0.8112.16421 64bit update 9.0.3 and it's still there.

Comment by g. christensen [ 27/Jan/12 1:43 AM ]

I just have read some of unicode specifications and found: "U+FFFD � replacement character used to replace an unknown or unprintable character", so it probably necessary to find point where the noncharacter replaced with this character, or may be the raw nonescaped noncharacter is replaced internally by \uFFFD and there is no distinction between keywords and other symbols in IE, obtained through read-string (it may process files correctly but replace noncharacters in constructed strings).

Comment by David Nolen [ 03/Feb/12 7:17 PM ]

Having people looking into the IE issues is fantastic - this is similar to another IE9 reader issue, do you have an approach that you think will solve the problem? Thanks.

Comment by g. christensen [ 04/Feb/12 10:14 AM ]

The only thing I can think up is to place \uFDD0 and \uFDD1 escaped literals instead of raw characters in compiled JavaScript output or some compiler hack which will place the escaped literals in `keyword' and `symbol' construction functions.

Comment by David Nolen [ 05/Feb/12 12:18 PM ]

And you're sure that you're setting the utf-8 meta tag in your HTML document?

Comment by David Nolen [ 20/Feb/12 10:50 AM ]

Same as CLJS-139

Comment by David Nolen [ 22/Feb/12 8:53 AM ]

This ticket is different from CLJS-139, this is only about the reader.

Comment by Thomas Scheiblauer [ 22/Feb/12 11:09 AM ]

applying http://dev.clojure.org/jira/secure/attachment/10939/cljs-133_fix.patch to the current HEAD makes read-string work as expected. This is because David's patch for cljs-139 (http://dev.clojure.org/jira/secure/attachment/10913/139_fix_unicode_emit.patch) does not address the "emit-constant" multimethod for String (only Character, clojure.lang.Keyword and clojure.lang.Symbol). Will will have to do the same replacement for String (each character) as David did for Character (maybe by utilizing clojure.string.replace) to make the 2 functions I patched in core.cljs work in the previous unpatched state (I hope someone can understand my gibberish

!!! deleted referenced patch because it is now obsolete !!!

Comment by Thomas Scheiblauer [ 23/Feb/12 8:33 AM ]

I have attached a patch to CLJS-139 which fixes this related issue.

Comment by Thomas Scheiblauer [ 23/Feb/12 12:52 PM ]

I have just attached a general non-ascii escape patch to CLJS-139 which obsoletes my previous one!

Comment by David Nolen [ 25/Feb/12 10:25 AM ]

Fixed, https://github.com/clojure/clojurescript/commit/965dc505229652558adcb526ecb5a9f91ce31ce2





[CLJ-1444] Fix unquote splicing for empty seqs Created: 11/Jun/14  Updated: 11/Jun/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader, syntax-quote

Attachments: Text File 0001-Fix-unquote-splicing-for-empty-seqs.patch    
Patch: Code and Test

 Description   

Current behaviour:

user=> `(~@())
nil
user=> `[~@()]
[]

Expected behaviour:

user=> `(~@())
()
user=> `[~@()]
[]





[CLJ-1425] Defer literal map construction of syntax-quoted maps to allow for semantically valid unquote splicing Created: 16/May/14  Updated: 17/May/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Jon Distad Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader
Environment:

Any


Attachments: Text File 0001-Fix-map-unquote-splicing.patch    
Patch: Code and Test

 Description   

At present one cannot unquote-splice into a map literal unless the map contains an even number of literal forms, even if one of them is a null unquote (~@[]).

E.g.: `{~@[1 2]} ;=> RuntimeException Map literal must contain an even number of forms clojure.lang.Util.runtimeException (Util.java:219)

However, within the context of a syntax-quote, it is not essential that the map literal be represented internally as a map since the syntax-quote emits code to build the map and not the map itself. The syntaxQuote method on SyntaxQuoteReader does not even operate the map, but rather a flattened sequence of interleaved keys and values.

With the aid of metadata and a LispReader-global Var, we can track that a collection of elements within a syntax quote will become a map, and emit the proper code forms from the SyntaxQuoteReader. There is a small edge case in metadata literals, but an with additional piece of metadata containing the proto-map we can still generate the appropriate (with-meta ...) form at syntax-quote emission time.

Importantly, none of the hand-waving involved ever escapes the reader, and the eval/compile environment is none the wiser.

This allows the following:

`{~@[1 2]} ;=> after eval: {1 2}
`^{~@[:foo :bar]} sym ;=> metadata of 'sym after eval: {:foo :bar}

But not:
`~{1} ;=> RuntimeException ...

Or:{1} ;=> RuntimeException ...

And `{~@[1]} has the same semantics as the currently required `{~@[1] ~@[]}
;=> IllegalArgumentException No value supplied for key: 1 clojure.lang.PersistentHashMap.create (PersistentHashMap.java:77)

The changes in my patch pass all existing tests and include an additional test for the newly-supported map unquote-splicing form.



 Comments   
Comment by Jon Distad [ 16/May/14 5:57 PM ]

Modified from this morning- more tests, plus bugfix for the new cases caught.

Comment by Jon Distad [ 17/May/14 10:47 AM ]

Updated patch.

Now uses two distinct paths for adding metadata. Old version potentially stacked with-meta calls, which could result in lost keys.





[CLJ-1366] The empty map literal is read as a different map each time Created: 01/Mar/14  Updated: 02/Mar/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5, Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: OHTA Shogo Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: memory, reader

Attachments: Text File 0001-make-the-reader-return-the-same-empty-map-when-it-re.patch     Text File 0002-make-the-reader-return-the-same-empty-map-when-it-re.patch    
Patch: Code

 Description   

As reported here (https://groups.google.com/forum/?hl=en#!topic/clojure-dev/n83hlRFsfHg), the empty map literal is read as a different map each time.

user=> (identical? (read-string "{}") (read-string "{}"))
false

Making the reader return the same empty map when it reads an empty map is expected to improve some memory efficiency, and also lead to consistency with the way other collection literals are read in.

user=> (identical? (read-string "()") (read-string "()"))
true
user=> (identical? (read-string "[]") (read-string "[]"))
true
user=> (identical? (read-string "#{}") (read-string "#{}"))
true

Cause: LispReader calls RT.map() with an empty array when it reads an empty map, and RT.map() in turn makes a new map unless its argument given is null.

Approach: make RT.map() return the same empty map when the argument is an empty array as well, not only when null



 Comments   
Comment by OHTA Shogo [ 01/Mar/14 2:59 AM ]

Sorry, the patch 0001-make-the-reader-return-the-same-empty-map-when-it-re.patch didn't work.

The updated patch 0002-make-the-reader-return-the-same-empty-map-when-it-re.patch works, but I'm afraid it'd be beyond the scope of this ticket since it modifies RT.map() behavior a bit.





[CLJ-1324] Allow leading slashes in unqualified symbol names Created: 15/Jan/14  Updated: 10/Feb/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Mike Anderson Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader
Environment:

All


Attachments: Text File clj-1324-1.patch    
Patch: Code and Test

 Description   

The proposal is to allow the reader to accept symbol names with leading slashes.

Problem: Leading slashes are frequently useful, e.g. in mathematical operators like "/" or "/="

Currently, only "/" is allowed as a special case, and is used for the division operator in clojure.core

This could be extended to allow all symbols to have names starting with a leading slash.

There should be no ambiguity with namespace-qualified symbols:
1) In the case of a leading slash, a symbol should be interpreted as an unqualified symbol e.g. "/="
2) In the case of a slash anywhere except in leading position, it should considered as namespace qualified, e.g. "clojure.core/+"
3) In the case of multiple non-leading slashes, the first slash is the namespace separator, e.g. "clojure.core.matrix.operators//="

Optionally, it also would be possible to allow multiple slashes after the leading slash in a name. This would allow symbols such as "/src/main/clojure" to become valid.



 Comments   
Comment by Paavo Parkkinen [ 10/Feb/14 7:32 AM ]

Attached patch to allow leading slashes in symbol names.

The patch changes the regexp pattern used to match symbols to accept characters after a slash in symbol names.

Tests pass, and the patch also adds a couple of new special cases to the symbol tests.





[CLJ-1286] Fix reader spec and regex to match code for keywords starting with digits Created: 31/Oct/13  Updated: 02/Jul/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: reader


 Description   

The reader page at http://clojure.org/reader states that symbols (and keywords) cannot start with a number and the regex used in LispReader (and EdnReader) also has this intention. CLJ-1252 addressed this by fixing the broken reader regex to match the spec. However, that broke some existing code so we rolled back the change. There is still a disconnect here and this ticket serves to decide what to do instead.

I presume that we are effectively deciding that keywords like :5 are ok to read. If so, we should alter the regex to more accurately capture that intent - right now it allows these purely by accident due to backtracking. A secondary question is whether the Clojure and EDN reader spec should also explicitly allow these as valid. My preference would be to have the reader and the spec match, so I would lobby to loosen the reader spec.



 Comments   
Comment by Nicola Mometto [ 12/Nov/13 4:50 PM ]

what about keywords like :1/1 or :1/a? Clojure currently accepts the latter but not the former.

Comment by Francis Avila [ 02/Jul/14 3:13 PM ]

There's more discussion of this problem (and symbol/keyword parsing in general) in the context of cljs.reader at CLJS-677.

Comment by Andy Fingerhut [ 02/Jul/14 3:27 PM ]

Francis, can you double-check that ticket number? The one you mention (CLJS-667) doesn't seem to have any discussion of this problem.

Comment by Francis Avila [ 02/Jul/14 3:39 PM ]

Sincere apologies, it's CLJS-677. (Original post corrected too.)





[CLJ-1282] The quote special form should throw an exception if passed more than one form to quote Created: 23/Oct/13  Updated: 16/Feb/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Howard Lewis Ship Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: errormsgs, reader

Attachments: Text File CLJ-1282-p1.patch     Text File CLJ-1282-p2.patch    
Patch: Code and Test
Approval: Triaged

 Description   

Quote currently ignores all but the first argument. In the case of being called accidentally with multiple values, it should throw an exception specifying the error.

user> (quote 1 2 3)
1

------- Original: --------

Every once in a while, you can just go down the rabbit hole.

I had an errant expression in my code:

(-> message get-message-values 'DESTINATION_MERCHANT_ID)

One would think this would work; it certainly would if the key was a keyword and not a symbol.

One would expect this to expand to:

('DESTINATION_MERCHANT_ID (get-message-values message))

however, the reader is involved, so it is as if the source were:

(-> message get-message-values (quote DESTINATION_MERCHANT_ID))

which expands to:

(quote (-> message get-message-values) DESTINATION_MERCHANT_ID))

... hilarity ensues! Because quote currently ignores extra parameters, my code gets the quoted value '(clojure.core/-> message get-message-values) rather than the expected string from the map; this shifts us from the "there's a bug in my code" to "the nature of reality is broken".

The correct expression is:

(-> message get-message-values (get 'DESTINATION_MERCHANT_ID))

This took quite a while to track down; if the

special form checked that it was passed exactly one form to quote and threw an exception otherwise, I think I would have caught this much earlier. It could even identify the expression it is quoting, which would provide a lot better understanding of where I went wrong.



 Comments   
Comment by Howard Lewis Ship [ 23/Oct/13 2:02 PM ]

Sorry, can't edit the description now to correct the formatting errors.

Comment by Howard Lewis Ship [ 24/Oct/13 6:09 PM ]

I just wanted to point out that your description is shorter, but makes it appear that such a use is unlikely and therefore unimportant; the detail of my description is to point out a reasonable situation where something explicable, but completely counterintuitive and confusing, does occur.

Comment by Alex Miller [ 24/Oct/13 8:28 PM ]

That's why I left the original in there too.

Comment by Gary Fredericks [ 09/Dec/13 7:07 AM ]

(quote) currently returns nil. Do we have an opinion about that?

Comment by Gary Fredericks [ 09/Dec/13 9:32 AM ]

Attached p1, which throws an IllegalArgumentException (wrapped in a CompilerException of course) for anything but 1 arg, and includes the number of args that were passed.

I can't think of any reason why (quote) would be useful, so I decided to throw on that too. Very easy to change of course.

Also added a test that (eval '(quote 1 2 3)) throws.

Comment by Stuart Halloway [ 31/Jan/14 12:46 PM ]

I recommend the following changes:

  • throw an ex-info that includes the offending form in its map {:form ...}
  • check only for the map data, not exception type or message, in the tests
Comment by Andy Fingerhut [ 31/Jan/14 6:31 PM ]

Patch CLJ-1282-p1.patch no longer applies cleanly after commits made to Clojure master on Jan 31 2014, probably due to the patch committed for CLJ-1318, and probably only because some lines of context changed in the test file. That would be trivial to update, but Stu's comments above suggest that more significant changes need to be made.

Comment by Gary Fredericks [ 01/Feb/14 9:19 AM ]

Throwing an ex-info is easy enough. I don't know how to avoid at least incidentally checking for the exception type, since the ExceptionInfo is wrapped in a CompilerException. I'll make a patch that keeps the class name in the test but doesn't do any checks on the cause aside from the ex-data. Let me know if I should do anything different.

Comment by Gary Fredericks [ 01/Feb/14 9:58 AM ]

Attached CLJ-1282-p2.patch which is off of the current master and addresses Stu's points.

Comment by Alex Miller [ 04/Feb/14 11:23 PM ]

Moving back to Triaged for more looks.

Comment by Nicola Mometto [ 16/Feb/14 12:12 PM ]

Currently (quote) returns nil, is it intended that this patch makes that an error or was this by accident?

Comment by Gary Fredericks [ 16/Feb/14 12:23 PM ]

I consciously chose to make (quote) an error – I made a comment about that earlier and didn't get any feedback, so I unilaterally decided to make it an error due to the fact that I couldn't think of any possible use for (quote).

It's an easy switch if somebody thinks differently.

Comment by Nicola Mometto [ 16/Feb/14 1:13 PM ]

I'm sorry I did not notice your previois comment.
I'm asking because I need to know whether I should throw on (quote) for tools.analyzer, currently it is allowed but I too think that (quote) should be an error.





[CLJ-1252] Clojure reader (incorrectly) accepts keywords starting with a number Created: 04/Sep/13  Updated: 31/Oct/13  Resolved: 31/Oct/13

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.6

Type: Defect Priority: Minor
Reporter: Alex Miller Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: reader

Attachments: Text File numkeyword.patch    
Patch: Code and Test
Approval: Screened

 Description   

The reader page at http://clojure.org/reader states that symbols (and keywords) cannot start with a number and the regex used in LispReader (and EdnReader) also has this intention. But:

user> :5
:5
user> (class *1)
clojure.lang.Keyword

Cause: The pattern for keywords is: "[:]?([\D&&[^/]].*/)?(/|[\D&&[^/]][^/]*)". The intention of the [\D&&[^/]] is to accept all non-digits except /. However, if the : matches and 5 does not, the regex will backtrack and unmatch the :, instead matching it in the non-digit charset. The whole match is sent on in the code and the : is stripped later.

Solution: Prevent back-tracking for the first : match by using the possessive quantifier ?+ instead of ?. Once the first : is matched, it will not backtrack to it. (This is also faster.) The patch makes this change in LispReader and EdnReader, updates a couple tests that were using number keywords, and adds a new negative test to check that :5 isn't read.

Patch: numkeyword.patch

Screened by: Alex Miller



 Comments   
Comment by Andy Fingerhut [ 05/Sep/13 5:25 PM ]

Given how far this one is along now, perhaps CLJ-1003 should be marked as a duplicate of this one?

Comment by Alex Miller [ 31/Oct/13 9:39 PM ]

Because this broke existing code (several lib projects like java.jdbc), we have rolled this back. Will follow up with an alternate ticket to address this in some other way.





[CLJ-1238] Allow EdnReader to read foo// (CLJ-873 for EdnReader) Created: 28/Jul/13  Updated: 22/Nov/13  Resolved: 22/Nov/13

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.6

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: edn, reader

Attachments: Text File 0001-Fix-CLJ-873-for-EdnReader-too.patch     File clj-1238-2.diff    
Patch: Code
Approval: Ok

 Description   

The patch that has been applied in 1.6 for CLJ-873 predated the introduction of EdnReader, as such it only patched LispReader.

This patch makes the same change to allow foo// in EdnReader, and removes two constants in clojure.lang.RT that are not needed anymore after this patch.

To reproduce:

user=> (require 'clojure.edn)
user=> (defn / [& x] 42)
user=> (clojure.edn/read-string "(user// 4 2)")
RuntimeException Invalid token: user//  clojure.lang.Util.runtimeException (Util.java:219)

Approach: copy changes from LispReader in CLJ-873. After:

user=> (require 'clojure.edn)
nil
user=> (defn / [& x] 42)
WARNING: / already refers to: #'clojure.core// in namespace: user, being replaced by: #'user//
#'user//
user=> (clojure.edn/read-string "(user// 4 2)")
(user// 4 2)

Patch: 0001-Fix-CLJ-873-for-EdnReader-too.patch

Screened by: Alex Miller



 Comments   
Comment by Nicola Mometto [ 29/Jul/13 9:37 AM ]

Alex, this title might be misleading – the main purpose of this patch is to allow "foo//" to be read by clojure.edn/read

Comment by Alex Miller [ 30/Jul/13 3:12 PM ]

@Nicola - Please fix!

Comment by Nicola Mometto [ 30/Jul/13 6:28 PM ]

Done - I also changed the description to better describe what the patch actually does

Comment by Andy Fingerhut [ 25/Oct/13 12:27 PM ]

I've not looked into details yet, but screened 0001-Fix-CLJ-873-for-EdnReader-too.patch fails to apply cleanly as of Oct 25, 2013 latest Clojure master.

Comment by Alex Miller [ 25/Oct/13 2:05 PM ]

Updated patch to apply cleanly to master as of Oct 25, 2013.

Comment by Andy Fingerhut [ 31/Oct/13 4:37 PM ]

With the undoing of the change for CLJ-1252 earlier today, the patch 0001-Fix-CLJ-873-for-EdnReader-too.patch applies cleanly again, and clj-1238-2.diff no longer does.

Comment by Alex Miller [ 31/Oct/13 4:48 PM ]

Andy you are uncanny - I was just on my way to update this.

Comment by Alex Miller [ 31/Oct/13 4:51 PM ]

Switching back to prior patch.





[CLJ-1234] Accept whitespace in Record and Type reader forms Created: 24/Jul/13  Updated: 25/Oct/13  Resolved: 25/Oct/13

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.6

Type: Defect Priority: Minor
Reporter: Jozef Wagner Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader

Attachments: File clj-1234-v1.diff     Text File clj-1234-v1.txt    
Patch: Code and Test
Approval: Ok

 Description   

Whitespace should be allowed in the record reader form as with tagged literals. Following example shows the problem:

user=> *clojure-version*
{:major 1, :minor 5, :incremental 1, :qualifier nil}
user=> (defrecord B [x])
user.B
user=> (B. 2)
#user.B{:x 2}
user=> #user.B{:x 3}
#user.B{:x 3}
user=> #user.B {:x 3}
RuntimeException Unreadable constructor form starting with "#user.B "  clojure.lang.Util.runtimeException (Util.java:219)
user=> #inst "2013"
#inst "2013-01-01T00:00:00.000-00:00"
user=> #inst"2013"
#inst "2013-01-01T00:00:00.000-00:00"

Patch: clj-1234-v1.diff

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 25/Jul/13 2:40 PM ]

Stack trace in case it's useful:

user=> (pst *e)
ReaderException java.lang.RuntimeException: Unreadable constructor form starting with "#user.B "
clojure.lang.LispReader.read (LispReader.java:220)
clojure.core/read (core.clj:3407)
clojure.core/read (core.clj:3405)
clojure.tools.nrepl.middleware.interruptible-eval/evaluate/fn-589/fn-592 (interruptible_eval.clj:52)
clojure.main/repl/read-eval-print-6588/fn-6589 (main.clj:257)
clojure.main/repl/read-eval-print--6588 (main.clj:257)
clojure.main/repl/fn--6597 (main.clj:277)
clojure.main/repl (main.clj:277)
clojure.tools.nrepl.middleware.interruptible-eval/evaluate/fn--589 (interruptible_eval.clj:56)
clojure.core/apply (core.clj:617)
clojure.core/with-bindings* (core.clj:1788)
clojure.tools.nrepl.middleware.interruptible-eval/evaluate (interruptible_eval.clj:41)
Caused by:
RuntimeException Unreadable constructor form starting with "#user.B "
clojure.lang.Util.runtimeException (Util.java:219)
clojure.lang.LispReader$CtorReader.readRecord (LispReader.java:1224)
clojure.lang.LispReader$CtorReader.invoke (LispReader.java:1174)
clojure.lang.LispReader$DispatchReader.invoke (LispReader.java:619)
clojure.lang.LispReader.read (LispReader.java:185)
clojure.core/read (core.clj:3407)

Comment by Andy Fingerhut [ 08/Sep/13 3:40 AM ]

Patch clj-1234-v1.txt adds a test that fails without the patch, and passes with it, verifying that white space is allowed after #record.name but before {.

Very straightforward patch, especially since fogus was nice enough to write 2 commented-out lines that did the trick simply by uncommenting them.

Comment by Alex Miller [ 08/Sep/13 3:14 PM ]

Any LispReader fix likely also affects EdnReader.

Comment by Nicola Mometto [ 08/Sep/13 4:57 PM ]

EdnReader doesn't read record/type literals so in this case it's not affected

Comment by Alex Miller [ 09/Sep/13 1:26 AM ]

Thanks for checking!





[CLJ-1146] Symbol name starting with digits to defn throws "Unmatched delimiter )" Created: 13/Jan/13  Updated: 18/Apr/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Trivial
Reporter: Linus Ericsson Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: errormsgs, reader
Environment:

$java -jar clojure-1.5.0-RC2.jar

$java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06-434-10M3909)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01-434, mixed mode)
Mac OS X:
System Version: Mac OS X 10.6.8 (10K549)
Kernel Version: Darwin 10.8.0



 Description   

When trying to use an invalid symbol name when defining a function, the error message thrown is a confusing and wrong one. The error message is "RuntimeException Unmatched delimiter: ) clojure.lang.Util.runtimeException (Util.java:219)", which unfortunately is the only message seen in nrepled emacs.

$ java -jar clojure-1.5.0-RC2.jar
Clojure 1.5.0-RC2
user=> (defn 45fn [] nil)
NumberFormatException Invalid number: 45fn clojure.lang.LispReader.readNumber (LispReader.java:255)
[]
nil
RuntimeException Unmatched delimiter: ) clojure.lang.Util.runtimeException (Util.java:219)

Expected:
When trying to (defn or (def a thing with a non valid symbol name, the last thrown error message should be one stating that the given symbol name is not a valid one.



 Comments   
Comment by Kevin Downey [ 18/Apr/14 2:27 AM ]

this is an artifact of how streams and repls work.

when you type (defn 45fn [] nil) and hit enter, the inputstream flushes and "(defn 45fn [] nil)" is made available to the reader, the reader reads up to 45fn, throws an error back to the main repl loop, which prints out the error, then calls read, which still has the unread parts available to it "[] nil)"

changing this behavior would require significant changes to clojure's repl.

checkout https://github.com/trptcolin/reply instead





[CLJ-1138] data-reader returning nil causes exception Created: 22/Dec/12  Updated: 14/Feb/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Steve Miner Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader
Environment:

clojure 1.5 beta2, Mac OS X 10.8.2, java version "1.6.0_37"



 Description   

If a data-reader returns nil, the reader throws java.lang.RuntimeException: No dispatch macro... The error message implies that there is no dispatch macro for whatever the first character of the tag happens to be.

Here's a simple example:

(binding [*data-readers* {'f/ignore (constantly nil)}] (read-string "#f/ignore 42 10"))

RuntimeException No dispatch macro for: f clojure.lang.Util.runtimeException (Util.java:219)



 Comments   
Comment by Steve Miner [ 22/Dec/12 9:43 AM ]

clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch allows a data-reader to return nil instead of throwing. Does sanity check that possible tag or record isJavaIdentifierStart(). Gives better error message for special characters that might actually be dispatch macros (rather than assuming it's a tagged literal).

Comment by Steve Miner [ 22/Dec/12 10:06 AM ]

clj-1138-data-reader-return-nil-for-no-op.patch allows a data-reader returning nil to be treated as a no-op by the reader (like #_). nil is not normally a useful value (actually it causes an exception in Clojure 1.4 through 1.5 beta2) for a data-reader to return. With this patch, one could get something like a conditional feature reader using data-readers.

Comment by Steve Miner [ 22/Dec/12 10:26 AM ]

clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch is the first patch to consider. It merely allows nil as a value from a data-reader and returns nil as the final value. I think it does what was originally intended for dispatch macros, and gives a better error message in many cases (mostly typos).

The second patch, clj-1138-data-reader-return-nil-for-no-op.patch, depends on the other being applied first. It takes an extra step to treat a nil value returned from a data-reader as a no-op for the reader (like #_).

Comment by Steve Miner [ 23/Dec/12 11:52 AM ]

It turns out that you can work around the original problem by having your data-reader return '(quote nil) instead of plain nil. That expression conveniently evaluates to nil so you can get a nil if necessary. This also works after applying the patches so there's still a way to return nil if you really want it.

(binding [*data-readers* {'x/nil (constantly '(quote nil))}] (read-string "#x/nil 42"))
;=> (quote nil)

Comment by Andy Fingerhut [ 07/Feb/13 9:20 AM ]

Patch clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch dated Dec 22 2012 still applies cleanly to latest master if you use the following command:

% git am --keep-cr -s --ignore-whitespace < clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch

Without the --ignore-whitespace option, the patch fails only because some whitespace was changed in Clojure master recently.

Comment by Andy Fingerhut [ 13/Feb/13 11:24 AM ]

OK, now with latest master (1.5.0-RC15 at this time), patch clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch no longer applies cleanly, not even using --ignore-whitespace in the 'git am' command given above. Steve, if you could see what needs to be updated, that would be great. Using the patch command as suggested in the "Updating stale patches" section of http://dev.clojure.org/display/design/JIRA+workflow wasn't enough, so it should probably be carefully examined by hand to see what needs updating.

Comment by Steve Miner [ 14/Feb/13 12:21 PM ]

I removed my patches. Things have changes recently with the LispReader and new EdnReader.





[CLJ-1100] Reader literals cannot contain periods Created: 02/Nov/12  Updated: 13/May/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Kevin Lynagh Assignee: Steve Miner
Resolution: Unresolved Votes: 1
Labels: reader

Attachments: Text File CLJ-1100-reader-tags-with-periods.patch     Text File clj-1100-v2.patch    
Patch: Code and Test
Approval: Screened

 Description   

The reader tries to read a record instead of a literal if the tag contains periods.

user> (binding [*data-readers* {'foo/bar #'identity}] (read-string "#foo/bar 1"))
1
user> (binding [*data-readers* {'foo/bar.x #'identity}] (read-string "#foo/bar.x 1"))
ClassNotFoundException foo/bar.x  java.lang.Class.forName0 (Class.java:-2)

Summary of reader forms:

Kind Example Constraint Status
Record #user.Foo[1] record class name OK
Class #java.lang.String["abc"] class name OK
Clojure reader tag #uuid "c48d7d6e-f3bb-425a-abc5-44bd014a511d" not a class name, no "/" OK
Library reader tag #my/card "5H" not a class name, has "/" OK
  #my.ns/card "5H" not a class name, has "/" OK
  #my/playing.card "5H" not a class name, has "/" BROKEN - read as record

Note: reader tags should not be allowed to override the record reader.

Cause: In LispReader, CtorReader.invoke() decides between record and tagged literal based on whether the tag has a ".".

Proposed: Change the discriminator in CtorReader by doing more string inspection:

  • If name has a "/" -> readTagged (not a legal class name)
  • If name has no "/" or "." -> readTagged (records must have qualified names)
  • Else -> readRecord (also covers Java classes)

Tradeoffs: Clojure-defined data reader tags must not contain periods. Not possible to read a Java class with no package. Avoids unnecessary class loading/construction for all tags.

Patch: CLJ-1100-v2.patch

Screened by: Alex Miller

Alternatives considered:

Using class checks:

  • Attempt readRecord (also covers Java classes)
  • If failed, attempt readTagged

Tradeoffs: Clojure tags could not override Java/record constructors - not sure that's something we'd ever want to do, but this would cut that off. This alternative may attempt classloading when it would not have before.



 Comments   
Comment by Steve Miner [ 06/Nov/12 9:41 AM ]

The suggested patch (clj-1100-reader-literal-periods.patch) will break reading records when *default-data-reader-fn* is set. Try adding a test like this:

(deftest tags-containing-periods-with-default
      ;; we need a predefined record for this test so we (mis)use clojure.reflect.Field for convenience
      (let [v "#clojure.reflect.Field{:name \"fake\" :type :fake :declaring-class \"Fake\" :flags nil}"]
        (binding [*default-data-reader-fn* nil]
          (is (= (read-string v) #clojure.reflect.Field{:name "fake" :type :fake :declaring-class "Fake" :flags nil})))
        (binding [*default-data-reader-fn* (fn [tag val] (assoc val :meaning 42))]
          (is (= (read-string v) #clojure.reflect.Field{:name "fake" :type :fake :declaring-class "Fake" :flags nil})))))
Comment by Rich Hickey [ 29/Nov/12 9:36 AM ]

The problem assessment is ok, but the resolution approach may not be. What happens should be based not upon what is in data-readers but whether or not the name names a class.

Is the intent here to allow readers to circumvent records? I'm not in favor of that.

Comment by Steve Miner [ 29/Nov/12 4:01 PM ]

New patch following Rich's comments. The decision to read a record is now based on the symbol containing periods and not having a namespace. Otherwise, it is considered a data reader tag. User
defined tags are required to be qualified but they may now have periods in the name. Tests added to show that
data readers cannot override record classes. Note: Clojure-defined data reader tags may be unqualified, but they should not contain periods in order to avoid confusion with record classes.

Comment by Steve Miner [ 29/Nov/12 4:17 PM ]

I deleted my old patch and some comments referring to it to avoid confusion.

In Clojure 1.5 beta 1, # followed by a qualified symbol with a period in the name is considered a record and causes an exception for the missing record class. With the patch, only non-qualified symbols containing periods are considered records. That allows user-defined qualified symbols with periods in their names to be used as data reader tags.

Comment by Andy Fingerhut [ 07/Feb/13 9:05 AM ]

clj-1100-periods-in-data-reader-tags-patch-v2.txt dated Feb 7 2013 is identical to CLJ-1100-periods-in-data-reader-tags.patch dated Nov 29 2012, except it applies cleanly to latest master. The only change appears to be in some white space in the context lines.

Comment by Andy Fingerhut [ 07/Feb/13 12:53 PM ]

I've removed clj-1100-periods-in-data-reader-tags-patch-v2.txt mentioned in the previous comment, because I learned that CLJ-1100-periods-in-data-reader-tags.patch dated Nov 29 2012 applies cleanly to latest master and passes all tests if you use this command to apply it.

% git am --keep-cr -s --ignore-whitespace < CLJ-1100-periods-in-data-reader-tags.patch

I've already updated the JIRA workflow and screening patches wiki pages to mention this --ignore-whitespace option.

Comment by Andy Fingerhut [ 13/Feb/13 11:31 AM ]

Both of the current patches, CLJ-1100-periods-in-data-reader-tags.patch dated Nov 29 2012, and clj-1100-reader-literal-periods.patch dated Nov 6 2012, fail to apply cleanly to latest master (1.5.0-RC15) as of today, although they did last week. Given all of the changes around read / read-string and edn recently, they should probably be evaluated by their authors to see how they should be updated.

Comment by Steve Miner [ 14/Feb/13 12:23 PM ]

I deleted my patch: CLJ-1100-periods-in-data-reader-tags.patch. clj-1100-reader-literal-periods.patch is clearly wrong, but the original author or an administrator has to delete that.

Comment by Kevin Lynagh [ 14/Feb/13 1:28 PM ]

I cannot figure out how to remove my attachment (clj-1100-reader-literal-periods.patch) in JIRA.

Comment by Steve Miner [ 14/Feb/13 1:43 PM ]

Downarrow (popup) menu to the right of the "Attachments" section. Choose "manager attachments".

Comment by Kevin Lynagh [ 14/Feb/13 2:02 PM ]

Great, thanks Steve. Are you going to take another pass at this issue, or should I give it a go?

Comment by Steve Miner [ 14/Feb/13 3:04 PM ]

Kevin, I'm not planning to work on this right now as 1.5 is pretty much done. It might be worthwhile discussing the issue a bit on the dev mailing list before working on a patch, but that's up to you. I think my approach was correct, although now changes would have to be applied to both LispReader and EdnReader.

Comment by Alex Miller [ 09/Apr/14 10:29 AM ]

Updated description based on my understanding.

Comment by Steve Miner [ 22/Apr/14 3:30 PM ]

I will resurrect my old patch and update it for the changes since 1.5.

Comment by Steve Miner [ 28/Apr/14 8:21 AM ]

Added patch to allow reader tags to have periods, but only with a namespace. Added tests to confirm that it works, but does not allow overriding a record name with a data-reader.

Comment by Steve Miner [ 28/Apr/14 8:51 AM ]

The patch implements Alex's alternative 1. It's purely lexical. A tag symbol without a namespace and containing periods is handled as a record (Java class). Otherwise, it's a data-reader tag. Of course, unqualified symbols without periods are still data-reader tags.

IMHO, a Java class without a package is a pathological case which Clojure doesn't need to worry about. This patch follows the convention that Java classes are named by unqualified symbols containing dots.

I did try alternative 2, testing for an actual class, but the implementation was more complicated. Also, it would open the possibility of breaking working code by adding a record or Java class that accidentally collided with an unqualified dotted tag that had previously worked fine. It's better to follow a simple rule that unqualified dotted symbols always refer to classes. Maybe the class doesn't actually exist, but that doesn't mean the symbol might be a data-literal tag.

Comment by Alex Miller [ 13/May/14 4:49 PM ]

Added clj-1100-v2.patch - identical, just removes whitespace to simplify change.





[CLJ-1079] Don't squash explicit :line and :column metadata in the MetaReader Created: 29/Sep/12  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Chas Emerick Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: reader

Attachments: File CLJ-1079.diff    
Patch: Code and Test

 Description   

I have been experimenting with using cljx to produce Clojure and ClojureScript source from a single file. This has gone well so far, with the exception that, due to the way the source transformation works, all of the linebreaks and other formatting is gone from the output. There is an option to include the original :line metadata in the output though, like so:

;;This file autogenerated from 
;;
;;  src/cljx/com/foo/hosty.cljx
;;
^{:line 1} (ns com.foo.hosty)
^{:line 3} (defn ^{:clj true} system-hash [x] ^{:line 5} (System/identityHashCode x))

(Hopefully, such hackery won't be necessary in the future with sjacket or something like it...)

Unfortunately, when read in using a LineNumberingPushbackReader, code like this has its :line metadata squashed by the line numbers coming from that. A REPL-friendly example would be:

=> (meta (read (clojure.lang.LineNumberingPushbackReader.
                 (java.io.StringReader. "^{:line 66} ()"))))
{:line 1}
=> (meta (read (java.io.PushbackReader.
                 (java.io.StringReader. "^{:line 66} ()"))))
{:line 66}

The latter seems more correct to me (and is equivalent to read-string).



 Comments   
Comment by Chas Emerick [ 29/Sep/12 7:07 PM ]

{{CLJ-1097.diff}} contains a fix for this issue, as well as a separate commit that eliminates a series of casts in order to improve readability in the area.

Comment by Andy Fingerhut [ 05/Oct/12 8:23 AM ]

Chas, your patch doesn't apply cleanly to latest Clojure master as of Oct 5 2012. I'm not sure, but I think some recent commits to Clojure on Oct 4 2012 caused that. I also take it as evidence of your awesomeness that you can write patches for tickets that haven't been filed yet

Comment by Chas Emerick [ 05/Oct/12 9:24 AM ]

"patches for tickets that haven't been filed yet?"

Anyway, tweaking up this patch is a small price to pay for having column meta. New {{CLJ-1097.diff}} patch attached, applies clean on master as of now. Otherwise same contents as in the original patch, except:

  • the same dynamic is also applied to :column metadata, now that it's available
  • the changes have been rebased into a single commit (including the elimination of the casts in MetaReader, which were becoming so numerous that the code was less readable than most
Comment by Nicola Mometto [ 05/Oct/12 9:39 AM ]

"patches for tickets that haven't been filed yet?"

He was referring to the fact that you uploaded "CLJ-1097.diff" while the ticket is #1079

Comment by Chas Emerick [ 05/Oct/12 9:42 AM ]

Oh, hah! Twice now, even! One more data point recommending my having slight dyslexia or somesuch. :-P

I've replaced the attached patch with one that is named properly to avoid any later confusion.

Comment by Chas Emerick [ 07/Oct/12 3:57 PM ]

Refreshed patch to apply cleanly to master after the recent off by one patch for :column metadata.

Comment by Stuart Halloway [ 19/Oct/12 3:12 PM ]

This feels backwards to me. If a special purpose tool wants to convey information via metadata, why does it use names that collide with those used by LispReader?

Comment by Chas Emerick [ 19/Oct/12 7:36 PM ]

The information being conveyed is the same :line and :column metadata conveyed by LispReader — in fact, that's where it comes from in the first place.

Kibit (and cljx) is essentially an out-of-band source transformation tool. Given an input like this:

(ns com.foo.hosty)

(defn ^:clj system-hash
  [x]
 (System/identityHashCode x))

(defn ^:cljs system-hash
  [x]
  (goog/getUid x))

…it produces two files, a .clj for Clojure, and a .cljs for ClojureScript. (The first code listing in the ticket description is the former.)

However, because there's no way to transform Clojure code/data without losing formatting, anything that depends on line/column numbers (stack traces, stepping debuggers) is significantly degraded. If LispReader were to defer to :line and :column metadata already available on the loaded forms (there when the two generated files are spit out with *print-meta* on), this would not be the case.

Comment by Andy Fingerhut [ 07/Feb/13 8:47 AM ]

clj-1079-patch-v2.txt dated Feb 7 2013 is identical to Chas's CLJ-1079.diff dated Oct 7 2012, except it applies cleanly to latest master. I believe the only difference is that some white space in the context lines is updated.

Comment by Andy Fingerhut [ 07/Feb/13 12:35 PM ]

Sorry for the noise. I've removed clj-1079-patch-v2.txt mentioned in the previous comment, because I learned that CLJ-1079.diff dated Oct 7 2012 applies cleanly to latest master and passes all tests if you use this command to apply it.

% git am --keep-cr -s --ignore-whitespace < CLJ-1079.diff

I will update the JIRA workflow page instructions for applying patches to mention this, too, because there are multiple patches that fail without --ignore-whitespace, but apply cleanly with that option. That will eliminate the need to update patches merely for whitespace changes.





[CLJ-1074] Read/print round-trip for +/-Infinity and NaN Created: 21/Sep/12  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Colin Jones Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: print, reader

Attachments: Text File 0001-Read-Infinity-and-NaN.patch     Text File clj-1074-read-infinity-and-nan-patch-v2-plus-edn-reader.patch    
Patch: Code and Test

 Description   

A few float-related forms (namely, Double/POSITIVE_INFINITY, Double/NEGATIVE_INFINITY, Double/NaN) are not eval-able after a round-trip via

(read-string (binding [*print-dup* true] (pr-str f))

The two options I see are to provide print-method implementations for these and their Float cousins, or to make Infinity, -Infinity, +Infinity, and NaN readable values. Since it sounds like edn may want to provide a spec for these values (see https://groups.google.com/d/topic/clojure-dev/LeJpOhHxESs/discussion and https://github.com/edn-format/edn/issues/2), I think making these values directly readable as already printed is preferable. Something like Double/POSITIVE_INFINITY seems too low-level from edn's perspective, as it would refer to a Java class and constant.

I'm attaching a patch implementing reader support for Infinity, -Infinity, +Infinity, and NaN.



 Comments   
Comment by Timothy Baldridge [ 03/Dec/12 11:34 AM ]

Please bring this up on clojure-dev. We'll be able to vet this ticket after that.

Comment by Colin Jones [ 03/Dec/12 1:18 PM ]

Should I respond to my original clojure-dev post about this (linked in the issue description above), or start a new one?

Comment by Andy Fingerhut [ 24/May/13 1:11 PM ]

Patch clj-1074-read-infinity-and-nan-patch-v2.txt dated May 24 2013 is identical to 0001-Read-Infinity-and-NaN.patch dated Sep 21 2012, except it applies cleanly to latest master. The older patch conflicts with a recent commit made for CLJ-873.

Comment by Nicola Mometto [ 25/May/13 11:55 AM ]

clj-1074-read-infinity-and-nan-patch-v2-plus-edn-reader.patch is the same as clj-1074-read-infinity-and-nan-patch-v2.txt except it patches EdnReader too, but it must be applied after #CLJ-873 0001-Fix-CLJ-873-for-EdnReader-too.patch get merged





[CLJ-1033] pr-str and read-string don't handle @ symbols inside keywords properly Created: 26/Jul/12  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Steven Ruppert Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: print, reader
Environment:

Ubuntu 12.04 LTS; Java 1.7.0_05 Java HotSpot(TM) Client VM



 Description   
user=> (read-string (pr-str {(keyword "key@other") :stuff}))
RuntimeException Map literal must contain an even number of forms  clojure.lang.Util.runtimeException (Util.java:170)

pr-str emits "{:key@other :stuff}", which read-string fails to interpret correctly. Either pr-str needs to escape the @ symbol, or read-string needs to handle the symbol inside a keyword.

Background: I'm passing a map with email addresses as keys through Storm bolts, which require a thrift-serializable form. Using the pr-str/read-string combo fails on these keys, so I've fallen back to JSON serialization.



 Comments   
Comment by Stuart Halloway [ 10/Aug/12 12:51 PM ]

The '@' character is not a legal character for keywords or symbols (see http://clojure.org/reader). Recategorized as enhancement request.

Comment by Steven Ruppert [ 10/Aug/12 1:04 PM ]

Then why doesn't (keyword "keywith@") throw an exception? It seems inconsistent with your statement.

Comment by Andy Fingerhut [ 13/Sep/12 2:23 PM ]

It is a long standing property of Clojure that it does not throw exceptions for everything that is illegal.

Comment by Steven Ruppert [ 17/Sep/12 2:16 PM ]

Yeah, but read-string clearly does. Is there a good reason that the "keyword" function can't throw an exception? With the other special rules on namespaces within symbol names, the "keyword" function really should be doing validation.

Another solution would be to allow a ruby-like :"symbol with disallowed characters" literal, but that would also be confusing with how the namespace is handled.

https://groups.google.com/forum/?fromgroups=#!topic/clojure/Ct5v9w0yNAE has some older discussion on this topic.

Comment by Andy Fingerhut [ 17/Sep/12 7:43 PM ]

Disclaimer: I'm not a Clojure/core member, just an interested contributor who doesn't know all the design decisions that were made here.

Steven, I think perhaps a couple of concerns are: (1) doing such checks would be slower than not doing them, and (2) implementing such checks means having to update them if/when the rules for legal symbols, keywords, namespace names, etc. change.

Would you be interested in writing strict versions of functions like symbol and keyword for addition to a contrib library? And test suites that try to hit a significant number of corner cases in the rules for what is legal and what is not? I mean those as serious questions, not rhetorical ones. This would permit people that want to use the strict versions of these functions to do so, and at the same time make it easy to measure performance differences between the strict and loose versions.

Comment by Steven Ruppert [ 13/Jan/13 10:58 PM ]

Looking back at this, the root cause of the problem is that the {pr} function, although it by default "print[s] in a way that objects can be read by the reader" [0], doesn't always do so. Thus, the easiest "fix" is to change its docstring to warn that not all keywords can be read back.

The deeper problem is that symbol don't have a reader form that can represent all actually possible keywords (in this case, those with "@" in them). Restricting the actually-possible keywords to match the reader form, i.e. writing a strict "keyword" function actually seems like a worse solution overall to me. The better solution would be to somehow extend the keyword reader form to allow it to express all possible keywords, possibly ruby's :"keyword" syntax. Plus, that solution would avoid having to keep hypothetical strict keyword/symbol functions in sync with reader operation, and write test cases for that, and so on.

Thus, the resolution of this bug comes down to how far we're willing to go. Changing the docstring would be the easiest, but extending the keyword form would be the "best" resolution, IMO.

[0]: http://clojuredocs.org/clojure_core/clojure.core/pr

Comment by Andy Fingerhut [ 19/Aug/13 10:54 AM ]

I happened across ticket CLJ-17 yesterday. Its discussion thread shows that the topic of validating the contents of constructed keywords and symbols has arisen before. At the time, a patch was written that modified the functions "symbol" and "keyword" so that the symbol/keyword was constructed as it does now, but then it was double-checked for readability using the clojure.lang.RT/readString method on the string arg given. It threw an exception if the intern and readString methods returned non-equal symbols (or if readString threw an exception).

Rich was concerned that this run-time overhead would be too high, and asked if anyone knew a faster way of doing it. Chas Emerick proposed making all symbols readable using a syntax like Common Lisp's #|symbol with whitespace|, with some checks for the common case where the quoting would be unnecessary. Rich was open to the idea of quoting arbitrary symbols, but that it would be a different ticket than that one.

I am not aware of anyone creating a ticket to introduce quoting of arbitrary symbols since then, but I could have missed it. This ticket could become that ticket, but its description would need significant editing, and code changes would be needed in a variety of places in Clojure.





[CLJ-1003] :100 is a valid Clojure keyword. Is that intentional? Created: 27/May/12  Updated: 06/Sep/13  Resolved: 06/Sep/13

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Brian Taylor Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: reader


 Description   

In the latest clojure (eccde24c7),

These parse as keywords:

:100
:100/a

This is an error:

:a/100

Is this behavior intentional?

In LispReader.java, the pattern for a symbol (and keyword) is "[:]?([\\D&&[^/]].*/)?([
D&&[^/]][^/]*)". If I'm mentally parsing that correctly, it seems like none of my examples should be valid keywords. I don't know why the first two cases parse.

http://clojure.org/reader says that none of my examples should parse.

Clojurescript does not accept any of my examples, see CLJS-142



 Comments   
Comment by Andy Fingerhut [ 13/Sep/12 6:34 PM ]

The reason that :100 parses as a keyword is that [
D&[^/]] matches a colon character, too. So the first : of :100 is not matching the [:]? at the beginning of the regular expression, but by a later part of it. Regexps can be tricky.

Whether this is intentional or not, my guess is probably not. Given that the docs at http://clojure.org/reader also say "A symbol can contain one or more non-repeating ':'s", writing a regexp that matches only what is documented there looks fairly complex.

Clojure has a history of allowing things, without throwing exceptions or giving any other errors, even if they are documented as not legal. This may be one of those cases. Do not consider this last paragraph as any kind of authoritative answer. It is only my observation.

Comment by Andy Fingerhut [ 05/Sep/13 5:26 PM ]

Perhaps this should be marked as duplicate of CLJ-1252

Comment by Alex Miller [ 06/Sep/13 7:50 AM ]

Dupe of CLJ-1252





[CLJ-976] Implement reader literal and print support for PersistentQueue data structure Created: 27/Apr/12  Updated: 22/Aug/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Fogus Assignee: Fogus
Resolution: Unresolved Votes: 4
Labels: data-structures, queue, reader, tagged-literals

Attachments: File CLJ-976-queue-literal-eval-and-synquote.diff     Text File clj-976-queue-literal-eval-and-synquote-patch-v3.txt     File CLJ-976-queue-literal-eval.diff    
Patch: Code and Test
Approval: Triaged

 Description   

Clojure's PersistentQueue structure has been in the language for quite some time now and has found its way into a fair share of codebases. However, the creation of queues is a two step operation often of the form:

(conj clojure.lang.PersistentQueue/EMPTY :a :b :c)

;=> #<PersistentQueue clojure.lang.PersistentQueue@78d5f6bc>

A better experience might be the following:

#queue [:a :b :c]

;=> #queue [:a :b :c]

(pop #queue [:a :b :c])

;=> #queue [:b :c]

This syntax is proposed and discussed in the Clojure-dev group at https://groups.google.com/forum/?fromgroups#!topic/clojure-dev/GQqus5Wycno

Open question: Should the queue literal's arguments eval? The implications of this are illustrated below:

;; non-eval case
#queue [1 2 (+ 1 2)]

;=> #queue [1 2 (+ 1 2)]


;; eval case
#queue [1 2 (+ 1 2)]

;=> #queue [1 2 3]

The answer to this open question will determine the implementation.



 Comments   
Comment by Steve Miner [ 27/Apr/12 10:18 AM ]

I think the non-eval behavior would be consistent with the other reader literals in Clojure 1.4. It's definitely better for interop where some other language implementation could be expected to handle a few literal representations, but not the evaluation of Clojure expressions. Use a regular function if the args need evaluation.

Comment by Chas Emerick [ 27/Apr/12 10:19 AM ]

The precedent of records seems relevant:

=> (defrecord A [b])
user.A
=> #user.A[(+ 4 5)]
#user.A{:b (+ 4 5)}
=> #user.A{:b (+ 4 5)}
#user.A{:b (+ 4 5)}

This continues to make sense, as otherwise queues would need to print with an extra (quote …) form around lists — which records neatly avoid:

=> (A. '(+ 4 5))
#user.A{:b (+ 4 5)}

Does this mean that a queue fn (analogous to vector, maybe) will also make an appearance? It'd be handy for HOF usage.

Comment by Fogus [ 27/Apr/12 11:00 AM ]

Added a patch for the tagged literal support ONLY. This is only one part of the total solution. This provides the read-string and printing capability. I'd like more discussion around the eval side before I get dive into the compiler.

Comment by Paul Michael Bauer [ 27/Apr/12 6:45 PM ]

In addition to Chas' observations on consistency with record literals, would eval in queue literals open up the same security hole as #=, needing to respect *read-eval*?
When needing eval inside a queue literal, embedding a #= seems more apropos.

Comment by Fogus [ 04/May/12 1:14 PM ]

Evalable queue literal support.

Comment by Andy Fingerhut [ 10/May/12 5:54 PM ]

Neither of the patches CLJ-976-queue-literal-tagged-parse-support-only.diff dated Apr 27, 2012 nor CLJ-976-queue-literal-eval.diff dated May 4, 2012 apply cleanly to latest master as of May 10, 2012.

Comment by Fogus [ 11/May/12 10:15 AM ]

Updated patch file to merge with latest master.

Comment by Fogus [ 20/Jul/12 1:14 PM ]

New patch with support fixed for syntax-quote.

Comment by Stuart Sierra [ 17/Aug/12 12:41 PM ]

Patch does not apply as of commit f5f4faf95051f794c9bfa0315e4457b600c84cef

Comment by Fogus [ 17/Aug/12 3:06 PM ]

Weird. I was able to download the CLJ-976-queue-literal-eval-and-synquote.diff patch and apply it to HEAD as of just now (f5f4faf95051f794c9bfa0315e4457b600c84cef). There were whitespace warnings, but the patch applied, compiles and passes all tests.

Comment by Andy Fingerhut [ 17/Aug/12 7:29 PM ]

With latest head I was able to successfully apply patch CLJ-976-queue-literal-eval-and-synquote.diff with this command:

git am --keep-cr -s < CLJ-976-queue-literal-eval-and-synquote.diff

with some warnings, but successfully applied. If I try it without the --keep-cr option, the patch fails to apply. I believe this is often a sign that either one of the files being patched, or the patch itself, contains CR/LF line endings, which some of the Clojure source files definitely do.

The command above (with --keep-cr) is currently the one recommended for applying patches on page http://dev.clojure.org/display/design/JIRA+workflow in section "Screening Tickets". I added the suggested --keep-cr option after running across another patch that applied with the option, but not without it.

Comment by Andy Fingerhut [ 28/Aug/12 5:45 PM ]

Presumptuously changing Approval from Incomplete back to Test, since the latest patch does apply cleanly if --keep-cr option is used.

Comment by Rich Hickey [ 08/Sep/12 6:48 AM ]

this needs more time

Comment by Fogus [ 18/Sep/12 8:15 AM ]

Rich,

Do you mind providing a little more detail? I would be happy to make any changes if needed. However, if it's just a matter of its relationship to EDN and/or waiting until the next release then I am happy to wait. In either case, I'd like to complete this or push it to the back of my mind. Thanks.

Comment by Andy Fingerhut [ 05/Oct/12 7:49 AM ]

clj-976-queue-literal-eval-and-synquote-patch-v2.txt dated Oct 5 2012 is identical to Fogus's patch CLJ-976-queue-literal-eval-and-synquote.diff dated Jul 20 2012. It simply removes one line addition to clojure.iml that Rich has since added in a different commit, so that this patch now applies cleanly to latest master.

Comment by Andy Fingerhut [ 16/Oct/12 12:20 PM ]

clj-976-queue-literal-eval-and-synquote-patch-v3.txt dated oct 16 2012 is identical to Fogus's patch CLJ-976-queue-literal-eval-and-synquote.diff dated Jul 20 2012. It simply removes one line addition to clojure.iml that Rich has since added in a different commit, so that this patch now applies cleanly to latest master.

Comment by Andy Fingerhut [ 20/Oct/12 12:26 PM ]

Fogus, with the recent commit of a patch for CLJ-1070, my touched-up patch clj-976-queue-literal-eval-and-synquote-patch-v3.txt dated Oct 16 2012 doesn't apply cleanly. In this case it isn't simply a few lines of context that have changed, it is the interfaces that PersistentQueue implements have been changed. It might be best if you take a look at the latest code and the patch and consider how it should be updated.

Comment by Steve Miner [ 06/Apr/13 8:07 AM ]

Related to CLJ-1078.

Comment by Alex Miller [ 22/Aug/13 10:38 PM ]

Moving back to Triaged as Rich has not vetted.





[CLJ-955] java object reader constructor doesn't work Created: 18/Mar/12  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Brent Millare Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader


 Description   

Here is a transcript:

;user=> clojure-version
{:major 1, :minor 4, :incremental 0, :qualifier "beta5"}
;user=> (.getProtocol #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
java.lang.RuntimeException: Can't embed object in code, maybe print-dup not defined: file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs, compiling:(null:2)
;user=> (.getProtocol (java.net.URL. "file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"))
"file"

Another transcript from google groups https://groups.google.com/forum/?fromgroups&hl=en#!topic/clojure/vlsFgVaKcSQ

user=> (def x #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
#'user/x
user=> x
#<URL file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs>
user=> (.getProtocol x)
"file"

user=> (.getProtocol #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
CompilerException java.lang.RuntimeException: Can't embed object in code, maybe print-dup not defined: file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs, compiling:(NO_SOURCE_PATH:5)
user=> (defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w (str o)))
#<MultiFn clojure.lang.MultiFn@2e694f12>

user=> (.getProtocol #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
ClassCastException clojure.lang.Symbol cannot be cast to java.net.URL user/eval11 (NO_SOURCE_FILE:7)
user=>

user=> (def x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
#'user/x
user=> (printf "(class x)=%s x='%s'\n" (class x) x)
(class x)=class java.net.URL x='file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs'
nil
user=> (let [x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"]]
(printf "(class x)=%s x='%s'\n" (class x) x))
CompilerException java.lang.RuntimeException: Can't embed object in code, maybe print-dup not defined: file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs, compiling:(NO_SOURCE_PATH:4)

user=> (defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w (str o)))
#<MultiFn clojure.lang.MultiFn@3362a63>
user=> (let [x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"]]
(printf "(class x)=%s x='%s'\n" (class x) x))
(class x)=class clojure.lang.Symbol x='file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs'
nil



 Comments   
Comment by Andy Fingerhut [ 19/Mar/12 2:58 AM ]

I've confirmed this behavior with 1.4.0 beta5. I've only tracked it down as far as finding the "Can't embed object in code" message, which is easy to find in Compiler.java in method emitValue. That exception is thrown because printString in RT.java throws an exception.

It isn't clear to me what should be done instead, though.

Comment by Fogus [ 19/Mar/12 9:03 AM ]

What would it mean to construct an arbitrary Java object for the purpose of embedding it in code in a generic way? You could say that it's just a matter of calling its constructor with the right args but very often in Java that is not enough to make an object considered "initialize". Sometimes there are init methods or putters or whatever that are required for object construction. So right there I hope it's clear that even though #some.Klass["foo"] provides a way to call an arbitrary constructor, it's in no way a guarantee that an instance is properly constructed. The reason that (for example) defrecords are embeddable anywhere is because we know for certain that the constructor creates a fully initialized instance. If you need to embed specific instances in your own code then you have two options:

  • Implement a print-dup for the class that guarantees a fully initialized object is built.
  • Use Clojure 1.4's tagged literal feature to do the same.

Quick point of note:

Your code

(defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w (str o)))

Doesn't do what you think it does. It spits out exactly file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs that Clojure reads in as a symbol. Something like the following might be more appropriate:

(defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w "#java.net.URL") (.write w (str [ (str o) ])))

(let [x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"]]                             
  (printf "(class x)=%s x='%s'\n" (class x) x) x)                                                                          

; (class x)=class java.net.URL x='file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs'

;=> #<URL file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs>

(.getProtocol *1)                        
;=> "file"
Comment by Brent Millare [ 19/Mar/12 10:09 AM ]

Fogus, can you please elaborate on using clojure 1.4's tagged literal features. While I understand that you can define data-reader functions, for example

(binding [*data-readers* {'user/f (fn [x] (java.io.File. (first x)))}] (read-string "#user/f [\"hello\"]")) ;=> #<File hello>

however, I feel this is only half a fix, compared with the first mentioned solution (involving print-dup), since clojure's tagged literals only are important for reading, not for printing. Does using tagged literals, so that (read-string (pr-str (read-string ...) works, imply that you must also define print methods per class? If this is true, it seems problematic since if different code wants to define different print methods, this will conflict since defining print-dup methods is global. Is there a good solution for printing objects depending on the context? As an alternative solution, I propose making the default print-method of all objects that didn't already have a printed representation to be a tagged literal, this way, users can customize what it means to read it. (See https://groups.google.com/forum/?fromgroups&hl=en#!topic/clojure/GdT5cO6JoSQ )

Comment by Fogus [ 19/Mar/12 10:18 AM ]

"Fogus, can you please elaborate on using clojure 1.4's tagged literal features."

I can, but it falls outside of the scope of this particular ticket. It might be better to take the broader conversation to the design page at http://dev.clojure.org/display/design/Tagged+Literals and the clojure-dev list.

Has the original motivation for this ticket been addressed?

Comment by Brent Millare [ 19/Mar/12 10:26 AM ]

I think you've explained the underlying problem, so yes. One possible caveat might be that it may be a concern that the current error message is not telling that the underlying problem lies in an improperly initialized object. (or maybe it is, and that's the error message I should expect).





[CLJ-950] Function literals behavior differ from that of fns Created: 08/Mar/12  Updated: 01/Mar/13  Resolved: 08/Mar/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Víctor M. Valenzuela Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: reader


 Description   

((fn [] true)) ; true
(#(true)) ; classcast exception

((fn [])) ; nil
(#()) ; ()

(some (fn [_]_) [nil false 0 1]) ; 0
(some #(%) [nil false 0 1]) ; NPE



 Comments   
Comment by Tassilo Horn [ 08/Mar/12 12:27 PM ]

This is no defect. Function literals must have a function (or macro or special form) as first symbol.
So your examples should be written like so:

user> (#(do true))
true
user> (#(do))
nil
user> (some #(do %) [nil false 0 1])
0
Comment by Víctor M. Valenzuela [ 08/Mar/12 2:10 PM ]

It makes sense. However (and correct me if I'm wrong) there should be little problem in making them fully equivalent to fns, resulting in a more concise and consistent API.

Please consider re-opening the issue as a feature request.

Regards - Víctor.

Comment by Tassilo Horn [ 09/Mar/12 1:26 AM ]

The reader docs at http://www.clojure.org/reader say that #() is not a replacement for (fn [] ...). You can't make it more equivalent to fn without making it much harder to understand. Let me explain that with an example.

(defn get-time []
  (System/currentTimeMillis))

(#(get-time)) ;; What's the result?

What's the result of the funcall above? Clearly, right now, it is the current system time.

So if we decided to allow to write #(true) as an alternative to (constantly true) [which is a varargs fn] or #(do true) [which is a fn of zero args], then valid values of #(get-time) where both the current system time but also the function object for get-time. Functions are values, too.

Ok, one could say that in the case of a function, #(function) is always a call, but it would make it harder to reason about what the code does for not much benefit.

Comment by Víctor M. Valenzuela [ 09/Mar/12 7:56 AM ]

You're right - satisfying my request would require to change the average use of this feature to #((some_fn %1 %2)) rather than just #(some_fn %1 %2), if we wanted #(true) to be valid. Which indeed would be barely handy.

Thank you.





[CLJ-946] eval reader fail to recognize function object Created: 06/Mar/12  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Naitong Xiao Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader


 Description   

(defmacro stubbing [stub-forms & body]
(let [stub-pairs (partition 2 stub-forms)
returns (map last stub-pairs)
stub-fns (map constantly returns) ;;(map #(list 'constantly %) returns)
real-fns (map first stub-pairs)]
`(binding [~@(interleave real-fns stub-fns)]
~@body)))

This macro is from Clojure In Action , whith the commented line rewrited (I know that original is better)
I assumed that this macro would work as supposed , if the stub forms use compile time literal, for e.g

(def ^:dynamic $ (fn [x] (* x 10))

(stubbing [$ 1]
($ 1 1))
;; =>
IllegalArgumentException No matching ctor found for class clojure.core$constantly$fn__3656
clojure.lang.Reflector.invokeConstructor (Reflector.java:166)
clojure.lang.LispReader$EvalReader.invoke (LispReader.java:1031)
clojure.lang.LispReader$DispatchReader.invoke (LispReader.java:618)

;;macro can expanded
(macroexpand-all '(stubbing [$ 1]
($ 1 1)))
;; =>
(let* [] (clojure.core/push-thread-bindings (clojure.core/hash-map (var $) #<core$constantly$fn_3656 clojure.core$constantly$fn_3656@161f888>)) (try ($ 1 1) (finally (clojure.core/pop-thread-bindings))))

I thought there is something wrong with eval reader, but i can't figure it out

btw the above code can run on clojure-clr



 Comments   
Comment by Michael Klishin [ 06/Mar/12 11:24 PM ]

As far as I know, Clojure in Action examples are written for Clojure 1.2. What version are you using?

Comment by Naitong Xiao [ 07/Mar/12 1:24 AM ]

I use clojure 1.3

The example in Clojure In Action is Ok on clojure 1.3
I just found this peculiar thing when trying to answer a question from another one in a mail list.





[CLJ-928] instant literal for Date and Timestamp should print in UTC Created: 07/Feb/12  Updated: 01/Mar/13  Resolved: 24/Feb/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Steve Miner Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader
Environment:

1.4-beta1, Mac OS X 10.7.3, java version "1.6.0_30"


Attachments: Text File CLJ-928-instant-literals-for-Date-and-Timestamp-fogus.patch     Text File CLJ-928-instant-literals-for-Date-and-Timestamp.patch    
Patch: Code and Test
Approval: Ok

 Description   

The default #inst returns a java.util.Date. Date is always in UTC, and doesn't know about time zones, but the implementation of the print method always renders it in the default time zone. For example,

user=> #inst "2012Z"
#inst "2011-12-31T19:00:00.000-05:00"

RFC3339 says:

4.3. Unknown Local Offset Convention

If the time in UTC is known, but the offset to local time is unknown,
this can be represented with an offset of "-00:00". This differs
semantically from an offset of "Z" or "+00:00", which imply that UTC
is the preferred reference point for the specified time.

java.sql.Timestamp should also print in UTC since that class doesn't keep timezone information. The print-method for Timestamp seems broken.

user=> (binding [*data-readers* {'inst #'clojure.instant/read-instant-timestamp}] (read-string "#inst \"2012Z\""))
#inst "2011-12-31T19:000000000000000-05:00"

user=> (binding [data-readers {'inst #'clojure.instant/read-instant-timestamp}]
(read-string "#inst \"2012-01-01T01:23:45.678+00:00\""))
#inst "2011-12-31T20:267800000000000-05:00"

user=> (java.sql.Timestamp. 0)
#inst "1969-12-31T19:000000000000000-05:00"

The implementations of the print-methods for Data, GregorianCalendar and Timestamp do too much string manipulation. I suggest doing some refactoring. (Patch coming soon.)

Also, the documentation needs some updating. clojure.instant/read-instant-date, etc. mention instant-reader but I think that mechanism was generalized as data-readers.



 Comments   
Comment by Steve Miner [ 07/Feb/12 1:28 PM ]

#inst literals for Date and Timestamp should print in UTC. Fixed round-tripping problem with Timestamp nanos (and re-enabled test). Refactored print-methods to avoid some string manipulations, particularly regarding Calendar timezone offsets. Added tests including those from CLJ-926, which this patch also fixes. Fixed doc on read-instant-date, etc.

Comment by Fogus [ 07/Feb/12 3:10 PM ]

Thank you. I will run this patch through the paces ASAP.

Comment by Cosmin Stejerean [ 07/Feb/12 7:09 PM ]

"Date is always in UTC, and doesn't know about time zones"

Given that pretty much all of the methods on Date return results in the local timezone, I wouldn't quite say that Date is always in UTC. Granted, most of those methods are deprecated; toString() however isn't and it also displays the date in the local timezone.

We can certainly force printing of dates in UTC, by overriding the timezone setting on SimpleDateFormat, and there might be other valid reasons for always forcing UTC representation. Dates not being timezone aware however doesn't seem like one to me.

Comment by Steve Miner [ 08/Feb/12 8:54 AM ]

It's only a slight exaggeration to say that a java.util.Date is intrinsically a long (milliseconds since "the epoch"). For Clojure's purposes, an "instant" is a point in time in a universal sense (not situated in a timezone) so UTC makes sense as the canonical form. RFC 3339 section 4.3 says to use "-00:00" for the offset when timezone information is unknown. java.sql.Timestamp is like a Date with an extra field for nanos so it also should use UTC for the canonical form.

My argument for UTC as the canonical form is that it best matches the desired semantics of a Clojure instant. I think it's the most conservative way to go and offers the best approach to interoperability by following RFC 3339.

java.util.Calendar is by design cognizant of timezones so it arguably makes sense to preserve the timezone information. Note that comparisons of Calendars take into account the timezone so two Calendar objects might be the same instant in a sense but not be equal because of the timezones. That might be a bit tricky for Clojure users who want a clean abstraction of "instant" with multiple implementations. Presumably they know what they are doing if they decide to use read-instant-calendar. It might be a good idea to be more explicit about this in the documentation.

Comment by Fogus [ 20/Feb/12 3:22 PM ]

Modified priting to conform to the following:

  • Instant instances are stored in UTC format without offset information.
  • Instant literals with offset information will be parsed and stored with offset applied.
  • Instant literals without offset information will be parsed as UTC format without offset information.
  • Instants will print as time having local offset applied and without offset printed.
  • Instants without offset info will be print-dup'd as UTC format with RFC-3339 default offset information (-00:00)

This provides a canonical storage and read format of UTC and a print format relative to the local offset.

Comment by Stuart Sierra [ 24/Feb/12 2:11 PM ]

With Fogus' patch on Feb. 20, instant literals do not round-trip unless *print-dup* is bound true:

user=>  (let [i (read-string "#inst \"2012-02-24T14:41-02:00\"")
              j (read-string (pr-str i))]
          (prn i)
          (prn j)
          (prn (= i j)))
#inst "2012-02-24T11:41:00.000"
#inst "2012-02-24T06:41:00.000"
false
nil
user=>(binding [*print-dup* true]
        (let [i (read-string "#inst \"2012-02-24T14:41-02:00\"")
              j (read-string (pr-str i))]
          (prn i)
          (prn j)
          (prn (= i j))))
#inst "2012-02-24T16:41:00.000-00:00"
#inst "2012-02-24T16:41:00.000-00:00"
true
nil

The docstring of *print-dup* says nothing about print/read values being equal (by Clojure's definition of =) only that print/read values will be of the same type. "Type" is not a particularly meaningful concept when applied to data literals.

We (Stuart & Luke) believe that print/read values should always print in such a way that they can be read back as the same instant. Printing an instant should either:

  • print in local time, with local offset
  • always print in UTC, with zero offset

Instants should never print without any time zone offset, as this admits too much confusion.

Comment by Stuart Sierra [ 24/Feb/12 2:46 PM ]

Just spoke with Rich Hickey, who made the following assertions:

  • Instants should always represent an exact point in time, unambiguously. This representation may include time zone information.
  • Our specification for instant literals allows any part on the right to be omitted. Omitting the time zone offset means UTC is assumed. It is not important whether or not the default printer includes the UTC "00:00" offset or omits it.
  • If we are printing an instant literal based on a type that does not contain time zone information (e.g., java.util.Date) then we should not add time zone information but simply print in UTC.
  • If we are printing an instant literal based on a type that can contain time zone information (e.g., java.util.Calendar) then we may print with the time zone offset included. This is a nice-to-have feature, not a requirement. It is always permissible to print instants in UTC.
  • It is possible that round-trip print/read of an instant may lose time zone information, depending on the types used, but it this should not change the meaning of the instant in UTC.
Comment by Steve Miner [ 24/Feb/12 3:13 PM ]

I believe my original patch satisfies the requirements stated above. I'm happy to work on this if you want something changed.

Comment by Stuart Sierra [ 24/Feb/12 3:22 PM ]

Rejected Fogus' patch of Feb. 20.

Screened Steve Miner's patch of Feb. 7. Ready for Rich.

Comment by Stuart Sierra [ 24/Feb/12 3:29 PM ]

Patch applied.





[CLJ-927] default tagged literal reader Created: 06/Feb/12  Updated: 01/Mar/13  Resolved: 09/Nov/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.5

Type: Enhancement Priority: Major
Reporter: Fogus Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: clojure, reader

Attachments: Text File CLJ-927-default-data-reader-fn-for-unknown-tags.patch    
Patch: Code and Test
Approval: Ok

 Description   

With data reader support, it's impossible to write a program to read an arbitrary stream of Clojure forms. For example, the following code will fail with the current 1.4.0-beta1 tagged literal support:

#point [0 2]

It might be enough to require that the read side define a reader for point, but what if we do not wish to require that both sides of the conversation know about the #point tag beforehand? Using the identity function as a default allows the reader to handle the literal form as is even when it doesn't recognize a tag (or even care to translate it).

The change from the current behavior is that missing tagged literal parsers are no longer error conditions.



 Comments   
Comment by Steve Miner [ 12/Feb/12 10:30 AM ]

I'd like to preserve the unknown literal tag as well as the value. That would allow a program to recreate the printed representation. A map would work as the value. You'd get something like: {:unknown-literal point :value [0 2]}. If you needed to pass that on to some other process, you could easily write it in the original literal form. Perhaps the key for literal tag should be namespace qualified to avoid possible confusion with user data. Another benefit of returning a map for an unknown literal tag is that equality tests still seem reasonable: (not= #foo "abc" #bar "abc").

Comment by Steve Miner [ 18/Sep/12 11:01 AM ]

It would be convenient if I could handle unknown tags using some sort of catch-all key in *data-readers* (say 'default). The associated function should take two arguments: the tag and the literal value. If there is no 'default key in *data-readers*, then an error would be thrown (same as Clojure 1.4).

I think it's a simple way to allow the programmer to take control without having to add new API or data types. It's just one distinguished key ('default, :default something like that) and one additional line of doc.

I can provide a patch.

Comment by Rich Hickey [ 21/Sep/12 9:17 AM ]

This needs to be addressed. We should follow the advice given in the edn docs:

If a reader encounters a tag for which no handler is registered, the implementation can either report an error, call a designated 'unknown element' handler, or create a well-known generic representation that contains both the tag and the tagged element, as it sees fit. Note that the non-error strategies allow for readers which are capable of reading any and all edn, in spite of being unaware of the details of any extensions present.

We can get both of the latter by having a preinstalled default unknown element handler that returns the generic representation. "identity" is out since it loses the tag. Using a map with namespaced keys is somewhat subtle. An TaggedElement record type is also possible.

Issues are - what happens when more than one library tries to supply a default? If the system supplies one, perhaps it's best to only allow dynamic rebinding and not static installation. Or error on conflicting default handlers, or first/last one wins (but first/'last' might be difficult to predict).

Comment by Steve Miner [ 17/Oct/12 12:49 PM ]

Everyone agrees that identity is not an appropriate default so I changed the Summary field.

Comment by Steve Miner [ 18/Oct/12 8:54 PM ]

Minimal patch that adds support for a default data reader in *data-readers*. If an unknown tag is read, we look up the 'default reader in *data-readers and call it with two arguments: the tag and the value. By default, there is no default reader and you get an exception as in Clojure 1.4.

Comment by Steve Miner [ 18/Oct/12 8:57 PM ]

An alternative patch that establishes a default data reader to handle the case of an unknown tag. The default reader returns a map with metadata to define the :unknown-tag. This preserves the information for the unknown data literal, but keeps a simple and convenient format.

Comment by Stuart Halloway [ 19/Oct/12 5:27 AM ]

Steve,

I am guessing that you consider these two patches alternatives, not cumulative. I am marking as screened the "default-in-data-readers" patch, which allows users to specify a 'default handler.

The other patch "unknown-data-reader", which specifies a built-in Clojure handler for unknown tags, is not screened. It specifies a default handler that returns a metadata-annotated map. If there is to be a default handler, I think it would need to return something disjoint from data, e.g. a tagged interface of some kind (or at least a namespaced name in the map.)

It would be a great help to have a little discussion along with the patches.

Comment by Steve Miner [ 19/Oct/12 8:01 AM ]

Yes, the two patches are separate alternatives. The "default-in-data-readers" patch just adds the special treatment for the 'default key in *data-readers* without providing a system default. This allows the application programmer to provide a catch-all handler for unknown data literal tags. Libraries should be discouraged from setting a 'default handler. Conflicts with the 'default key in data_readers.clj are handled just like other keys so it would be a bad thing for multiple libraries to try to take over the 'default data reader. Libraries can instead provide implementations and let the application programmer do the actual setting of the 'default key.

The second patch ("unknown-data-reader") implements 'default similarly, but also provides for a 'default reader in default-data-readers. My unknown-data-reader returns a map. I found it safer and simpler to deal with keywords instead of symbols for unknown tag – Clojure doesn't like symbols for unknown packages and you never know what you might get in data literal tags. After experimenting with with my original idea of having a map with two keys (essentially :tag and :value), I decided that I preferred the single entry map with the key derived from the tag and the value preserved as read. Adding the metadata to define the :unknown-tag provides enough information for the application programmer to deal with the maps unambiguously. I think the single entry maps are easier to read.

The alternative that I did not pursue would be to use a new record type as the default data type. My second patch could be used as a basis for that approach. We just need to replace my unknown-data-reader function with one that creates a record (TaggedElement).

Comment by Rich Hickey [ 19/Oct/12 5:43 PM ]

I don't like the 'default entry, especially if it is usually wrong for a library to set it.

Having no bindable var default makes editing data_readers.clj an outstanding chore for everyone.

It is unlikely a single special override in data_readers.clj is suitable for every reading situation, even in the same app.

If there is a known TaggedElement type, then people need only be able to opt out of that.
So if there were a default-tagged-reader function of (tag read-element) that built a TaggedElement (or, alternatively, threw, if people prefer), people could just rebind that in a particular reading context.

There's no perfect default, but in most situations getting unknown data is an error. I personally would default to that, and provide a read-tagged-element fn people could bind in when trying to implement read-anything.

Comment by Steve Miner [ 20/Oct/12 10:03 AM ]

old patches deleted. This revised patch introduces a var *default-data-reader-fn* which can be used to handle unknown tags. By default it is nil and an exception is thrown for the unknown tag as in Clojure 1.4. If it's non-nil, the function is called with the tag and value. I chose the name so that it contained 'data-reader', which makes it search friendly. I wanted to commit this separately from any attempt to provide a built-in catch-all reader and new record type as that might be more contentious.

Comment by Steve Miner [ 02/Nov/12 9:42 AM ]

The patch for *default-data-reader-fn* was committed for Clojure 1.5 beta1. The programmer can now specify a catch-all reader, which solves the main issue. However, there is no built-in default reader so unknown tags still cause exceptions to be thrown as in Clojure 1.4.

I think this bug can be closed. Default reader implementations could be provided by a contrib library. Or someone can open a new bug if you want the language to provide a built-in default reader.

Comment by Nicola Mometto [ 02/Nov/12 1:24 PM ]

what about adding default-tagged-reader-fn to clojure.main/with-bindings to make it set!able?

Comment by Steve Miner [ 03/Nov/12 9:29 AM ]

Good point about making it set!-able. I filed that issue as a separate bug: CLJ-1101.

Comment by Stuart Sierra [ 09/Nov/12 8:26 AM ]

Resolved.





[CLJ-926] Instant literals do not round trip correctly Created: 05/Feb/12  Updated: 17/Feb/12  Resolved: 17/Feb/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Cosmin Stejerean Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: instant, reader, tagged-literals

Attachments: Text File CLJ-926-round-trip-date-instants-with-tests.patch    
Patch: Code and Test
Approval: Screened
Waiting On: Stuart Sierra

 Description   

When using java.util.Date for instant literals (which is the default) instants do not round-trip properly during daylight savings. Here is an example:

(read-string "#inst \"2010-11-12T13:14:15.666-06:00\"")
#inst "2010-11-13T06:14:15.666+10:00"

I'm currently in Melbourne, which is normally GMT+10. However, on November 12th daylight savings is in effect, so the proper GMT offset is +11. The date above is actually using the correct local time (6:14:15) but with the wrong offset. The problem is more obvious when you attempt to round-trip the instant that was read.

user> #inst "2010-11-13T06:14:15.666+10:00"
#inst "2010-11-13T07:14:15.666+10:00"

Notice that we read 6:14am but the output was 7:14 with the same offset. Upon digging deeper the real culprit seems to be the use of String.format (through clojure.core/format) when outputting java.util.Date. Notice the following inside caldate->rfc3339

(format "#inst \"%1$tFT%1$tT.%1$tL%1$tz\"" d))

Let's compare the timezone offset in the underlying date with what is printed by %1$tz

user> (def d #inst "2010-11-13T06:14:15.666+10:00")
#'clojure.instant/d                                                                                                                                                                                         
user> (.getHours d)
7                                                                                                                                                                                                           
user> (.getTimezoneOffset d)
-660

For reference, the definition of getTimezoneOffset is

-(Calendar.get(Calendar.ZONE_OFFSET) + Calendar.get(Calendar.DST_OFFSET)) / (60 * 1000)

So far it looks good. 6am in GMT+10 becomes 7am in GMT+11. Let's see how String.format handles it though.

                                                                                               
clojure.instant> (format "%1$tz" d)
"+1000"                                                                                                                                                                                                     
clojure.instant> (format "%1$tT" d)
"07:14:15"

String.format prints the correct hour, but with the wrong offset. The recommended way for formatting dates is to use a DateFormatter.

The String.format approach appears to work properly for Calendar, but not for Date. Therefore the attached patch keeps the current
implementation for java.util.Calendar but uses SimpleDateFormat to handle java.util.Date correctly. This fixes the roundtrip problem.



 Comments   
Comment by Cosmin Stejerean [ 05/Feb/12 9:29 PM ]

Not quite sure how to create a repeatable test for this since the issue depends on the local timezone.

Comment by Steve Miner [ 06/Feb/12 8:33 AM ]

java.util.TimeZone/setDefault could be used for testing in various timezones.

Comment by Steve Miner [ 06/Feb/12 8:37 AM ]

Regarding the patch: SimpleDateFormat is a relatively heavy-weight object, so it seems bad to allocate one every time you print a Date. Unfortunately, SimpleDateFormat is not thread-safe, so you can't just share one. The Java world seems to agree that you should use JodaTime instead, but if you're stuck with the JDK, you need to have a ThreadLocal SimpleDateFormat. I'm working on my own patch along those lines.

Comment by Fogus [ 06/Feb/12 7:38 PM ]

Excellent analysis. I'll keep track of this case and vet any patches that come along. Please do attach a regression test if possible.

Comment by Cosmin Stejerean [ 06/Feb/12 8:39 PM ]

I attached a new patch using a SimpleDateFormat per thread using a thread local. I'll try to add some tests next.

Comment by Cosmin Stejerean [ 06/Feb/12 10:42 PM ]

A combined patch that uses a thread local SimpleDateFormat, tests round-tripping at a known daylight savings point (by overriding the default timezone) and checks round tripping at multiple points throughout the year (every hour of the first 4 weeks of every month of the year).

Steve, thanks for the suggestions on using a thread local and TimeZone/setDefault.

Comment by Steve Miner [ 07/Feb/12 1:32 PM ]

I filed CLJ-928 with my patch for fixing the printing of Date and Timestamp using UTC. I copied the tests from the CLJ-926 patch to make sure we're compatible.

Comment by Fogus [ 07/Feb/12 3:11 PM ]

Thanks for the regression test also. I'll vett this patch ASAP.

Comment by Fogus [ 17/Feb/12 1:24 PM ]

Seems reasonable to me.

Comment by Stuart Sierra [ 17/Feb/12 1:46 PM ]

Vetted. Will apply later.

Comment by Stuart Sierra [ 17/Feb/12 1:56 PM ]

Not Vetted. Test. That thing.

Comment by Stuart Sierra [ 17/Feb/12 2:10 PM ]

No, it's "Screened," not "Test." Somebody save me.

Comment by Stuart Sierra [ 17/Feb/12 3:55 PM ]

Superseded by CLJ-928.





[CLJ-899] Accept and ignore colon between key and value in map literals Created: 18/Dec/11  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Unassigned
Resolution: Unresolved Votes: 4
Labels: reader


 Description   

Original title was 'treat colons as whitespace' which isn't a problem description but a (flawed) implementation approach

For JSON compatibility
known problems when no spaces - x:true and y:false



 Comments   
Comment by Tassilo Horn [ 23/Dec/11 3:22 AM ]

Discussed here: https://groups.google.com/d/msg/clojure/XvJUzaY1jec/l8xEwlFl8EUJ

Comment by Kevin Downey [ 11/Jan/12 2:23 PM ]

please no

Comment by Tavis Rudd [ 16/Jan/12 12:17 PM ]

Alan Malloy raises a good point in the google group discussion (https://groups.google.com/d/msg/clojure/XvJUzaY1jec/aVpWBicwGhsJ) about accidental confusion between trailing (or floating) and leading colons:
"It isn't even as simple as "letting them
be whitespace", because presumably you want (read-string "{a: b}") to
result in (hash-map 'a 'b), but (read-string "{a :b}") to result in
(hash-map 'a :b)."

This issue could be avoided by only treating a colon as whitespace when followed by a comma. As easy cut-paste of json seems the be the key motivation here, the commas are going to be there anyway: valid {"v":, 1234} vs syntax error {a-key: should-be-a-keyword}.

Comment by Alex Baranosky [ 16/Jan/12 5:23 PM ]

This would be visually confusing imo.

Comment by Laurent Petit [ 17/Jan/12 5:01 PM ]

Please, oh please, no.

Comment by Tavis Rudd [ 18/Jan/12 2:40 PM ]

Er, brain fart. I was typing faster than I was thinking and put the comma in the wrong place. In my head I meant the form following the colon would have to have a comma after it. Thus, {"a-json-key": 1234, ...} would be valid while {"a-json-key": was-supposed-to-be-a-keyword "another-json-key" foo} would complain about the colon being an Invalid Token. I don't see the need for it, however.

Comment by Joseph Smith [ 27/Feb/12 10:55 AM ]

Clojure already has reader syntax for a map. If we support JSON, do we also support ruby map literals? Seems like this addition would only add confusion, imo, given colons are used in keywords and keywords are frequently used in maps - e.g., when de-serializing from XML, or even JSON.

Comment by David Nolen [ 27/Feb/12 11:19 AM ]

Clojure is no longer a language hosted only on the JVM. Clojure is also hosted on the CLR, and JavaScript. In particular ClojureScript can't currently easily deal with JSON literals - an extremely common (though problematic) data format. By allowing colon whitespace in map literals - Clojure data structures can effectively become an extensible JSON superset - giving the succinctness of JSON and the expressiveness of XML.

+1 from me.

Comment by Tim McCormack [ 13/Nov/12 7:27 PM ]

Clojure is only hosted on the JVM; ClojureScript is hosted on JS VMs. If this is useful for CLJS, it should just be a CLJS feature.

Comment by Mike Anderson [ 10/Dec/12 11:51 PM ]

-1 for this whole idea: that way madness lies....

If we keep adding syntactical oddities like this then the language will become unmaintainably complex. It's the exact opposite of simple to have lots of special cases and ambiguities that you have to remember.

If people want to use JSON that is fine, but then the best approach use a specific JSON parser/writer, not just paste it into Clojure source and expect it to work.

Comment by Laszlo Török [ 11/Dec/12 4:54 AM ]

-1 for reasons mentioned by Allan Malloy and Mike Anderson





[CLJ-889] Specifically allow '.' inside keywords Created: 01/Dec/11  Updated: 27/Jul/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Howard Lewis Ship Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: keywords, reader


 Description   

The documentation for keywords (on page http://clojure.org/reader) specifically states that '.' is not allowed as part of a keyword name; however '.' is specifically useful. For example, several web frameworks for Clojure use keywords to represent HTML elements, using CSS selector syntax (i.e., :div.important is equivalent to <div class='important'>).

In any case, the use of '.' is not checked by the reader and it is generally useful.

I would like to see '.' officially allowed (in the documentation). Further, I'd like to see additional details about which punctuation characters are allowed (my own web framework uses '&', '?' and '>' inside keywords for various purposes ... again, current reader implementation does not forbid this, but if a future reader will reject it, I'd like to know now).



 Comments   
Comment by Howard Lewis Ship [ 08/Dec/11 3:37 PM ]

To clarify, Hiccup and Cascade both use keywords containing '#' and '.' Cascade goes further, using '&' (to represent HTML entities), '>', and (possibly in the future) '?'.

Comment by Devin Walters [ 20/Oct/12 6:46 PM ]

I think the EDN spec mitigates some of the concern, but as of yet the official clojure.org reader documentation does not reflect the language used in the description of EDN. Where does EDN stand right now? Can the description being used on the github page be pulled over to clojure.org?

References:

Comment by Howard Lewis Ship [ 15/Apr/13 5:56 AM ]

Unfortunately, the EDN specification does not mention '>'.





[CLJ-878] DispatchReader always calls CtorReader when no dispatch macro found Created: 15/Nov/11  Updated: 03/Sep/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Greg Chapman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader
Environment:

Windows 7, Java 1.7



 Description   

At the REPL, I accidentally typed:

user=> # "\w+"
RuntimeException Unsupported escape character: \w clojure.lang.Util.runtimeExce
ption (Util.java:156)
#<core$PLUS clojure.core$PLUS@6b7dc78>

You can see the confusing result (the REPL is also left in an unclosed string). After looking at LispReader.java, it seems to me DispatchReader ought to at least check for whitespace before calling CtorReader (perhaps better would be to check for a valid symbol character). Another example:

user=> (defrecord X [x])
user.X
user=> # "user.X"[5]
#user.X{:x 5}

I'm assuming the fact that the above successfully creates an X is accidental.






[CLJ-420] Undefined symbols raise exceptions with line/column number of enclosing expression Created: 08/Aug/10  Updated: 28/Oct/13  Resolved: 25/Oct/13

Status: Resolved
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.6

Type: Defect Priority: Minor
Reporter: Alexander Redington Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: errormsgs, reader

Attachments: Text File CLJ-420-2.patch     Text File CLJ-420-3.patch     Text File CLJ-420-4.patch     Text File CLJ-420.patch    
Patch: Code
Approval: Screened

 Description   

Certain kinds of errors in loaded source files are coming back tagged with the correct source file, but an incorrect line:column number. This seems to happen when unknown symbols occur by themselves, not called as a function.

The general pattern appears to be that an undefined symbol is reported with a line number of the beginning of its nearest enclosing expression. If the undefined symbol appears at the top level of a file, it is reported with line:column number 0:0, or line:column number of REPL input, if loaded from a REPL. The behavior is different in a Leiningen REPL. If the undefined symbol appears at the top level of a file, it is reported with line:column number 1:1.

$ cat test1.clj 

bla
$ cat test2.clj 

(bla)
$ java -cp ../../opt/clojure/clojure-1.5.1.jar:. clojure.main
Clojure 1.5.1
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:1:1) 
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:2:1) 
user=> (require 'test2)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test2.clj:2:1) 
user=> (require 'test2)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test2.clj:2:1) 
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:5:1)

Patch: CLJ-420-3.patch

Approach: Capture line and column metadata for symbols in the LispReader. A few tests were adjusted to ignore line and col metadata for protocol symbols which now have them.

Screened by: Alex Miller

Background: Clojure Google group thread when this issue was originally reported in 2010 against Clojure 1.2: http://groups.google.com/group/clojure/browse_frm/thread/beb36e7228eabd69/a7ef16dcc45834bc?hl=en#a7ef16dcc45834bc



 Comments   
Comment by Assembla Importer [ 28/Sep/10 9:59 PM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/420

Comment by Assembla Importer [ 28/Sep/10 9:59 PM ]

stu said: Updating tickets (#427, #426, #421, #420, #397)

Comment by Assembla Importer [ 28/Sep/10 9:59 PM ]

stu said: Updating tickets (#429, #437, #397, #420)

Comment by Alex Miller [ 10/Aug/13 12:20 AM ]

Based on the updated information (which is really a totally different issue), I have reduced priority from Major to Minor, removed the fix version and sent this back down to Triaged for Rich to take another look.

Comment by Paavo Parkkinen [ 28/Sep/13 6:58 AM ]

Just noticed that when I reproduce this with current code from Github, I get the same behaviour that was in the original report.

$ cat test1.clj 

bla
$ cat test2.clj 

(bla)
$ java -cp target/clojure-1.6.0-master-SNAPSHOT.jar:. clojure.main
Clojure 1.6.0-master-SNAPSHOT
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:1:1) 
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:2:1) 
user=> (require 'test2)    
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test2.clj:2:1) 
user=> (require 'test2)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test2.clj:2:1) 
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:5:1) 
user=> 
Comment by Paavo Parkkinen [ 28/Sep/13 7:23 AM ]

I also get the original behaviour with 1.5.1.

$ java -cp ../../opt/clojure/clojure-1.5.1.jar:. clojure.main
Clojure 1.5.1
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:1:1) 
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:2:1) 
user=> (require 'test2)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test2.clj:2:1) 
user=> (require 'test2)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test2.clj:2:1) 
user=> (require 'test1)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: bla in this context, compiling:(test1.clj:5:1) 
user=> 

Could be lein is mixing it up. I also get the behaviour in the description if I do it with lein.

Comment by Andy Fingerhut [ 28/Sep/13 11:22 PM ]

Paavo, I think you are correct that I was getting different behavior than the original description because I was using Leiningen, rather than straight Clojure, and that the original description's behavior is still true with the latest Clojure master when Leiningen is not used. My bad for changing the description unnecessarily. Would you be willing to correct it?

Comment by Paavo Parkkinen [ 29/Sep/13 7:21 AM ]

Corrected description.

Comment by Paavo Parkkinen [ 06/Oct/13 5:20 AM ]

I have a fix for This bug. I will attach it to the ticket, but it is not supposed to be considered for approval, as it is clearly not finished (i.e. cleaned up) yet. If there is a better way to solicit comments for a proposed fix, please let me know.

For the fix, I simply copied the line:column number tracking code from ListReader.invoke (also used in others) into LispReader.read for Symbols. Only issues I see is that some test cases get confused because some meta data contains line:column info where it didn't use to.

I also tried to fix the line:column numbering for files like test3 and test4 below, and the fix is in the patch file, but commented out, because the fix for that causes some compile problems. I have not yet tracked down what the cause of the problems is. I suppose the correct procedure is to leave it out for now, and possibly open a new ticket for it?

$ cat test3.clj 

(
bla)
$ cat test4.clj 

(print
bla)

If anyone has any comments or suggestions, I would be very happy to hear them. In the mean time I'll start checking out the failing test cases, and seeing if I can correct them to work with the fix. I'm not sure how I should go about testing the fix itself.

Also, if the comments here is not the correct place for this discussion, please let me know and I will move it elsewhere (clojure-dev).

P.s. I suppose for test2, the column number should be 2, not 1. Not sure if that's a big deal or not.

Comment by Alex Miller [ 17/Oct/13 11:04 PM ]

A few things here:

1) I think you should drop the commented Compiler changes and just focus on the reader changes.
2) This patch has a number of test failures. In particular, protocol symbols now have line and column metadata. That's potentially quite useful but I haven't fully contemplated the ramifications.

Moving to Incomplete for now.

Comment by Paavo Parkkinen [ 19/Oct/13 9:40 PM ]

I attached a new version of patch, which gets rid of the Compiler changes, and fixes the test cases by simply ignoring the additional line/column metadata. I'm not aware of any way to reliably know what the values are going to be so they could be checked.

Or course there might be code that relies on the metadata not including line/column info, and that code will break with the patch. I can't really think of any reason why anyone would do that, and can't estimate how many people might be affected. After the patch people will probably start using the new metadata, and after that it will be difficult to get rid of, if needed in the future.

There is, I think, also potential performance effects. I believe the metadata is generated at compile time, but will be available at runtime, which will mean at least a minor increase in footprint, and possibly CPU as well. Again, I'm not that familiar with how the runtime works that I could really estimate that.

Of course there may also be other effects that I can't imagine right now.

Comment by Alex Miller [ 20/Oct/13 5:05 PM ]

I updated with a new patch that fixes some whitespace errors and removes the CLJ-420 comments which don't seem very helpful to me. Marking screened.

Comment by Rich Hickey [ 25/Oct/13 7:51 AM ]

this breaks tests, and will also break a ton of code not expecting this new metadata

Comment by Paavo Parkkinen [ 28/Oct/13 7:44 AM ]

Attached a patch (CLJ-420-4.patch) which fixes this without introducing the new metadata into symbols. Instead of using metadata to pass the line/column info from the reader to the compiler, I just store it in fields in the Symbol class.

Comment by Andy Fingerhut [ 28/Oct/13 1:31 PM ]

Paavo, I am not a Clojure screener, and cannot say any of this with authority to back it up, but Rich chose to close this ticket as declined, meaning that it would take some convincing for him to consider a ticket for it again. I am not saying you can't attach all of the patches you want to this ticket, but they might not go anywhere.

Comment by Alex Miller [ 28/Oct/13 2:33 PM ]

+1 to Andy's comment.

Further, I don't think the replacement patch respects the immutability and immutability concerns of interned Symbols - setting the line and column is a big race condition in the patch as is. IF this path were going to be acceptable, line and column would need to be final fields set in the Symbol constructor. However, it seems to me that you could easily have two symbols sharing the same interned Symbol that perhaps were created in different locations.

An additional item to consider is the additional memory overhead of carrying two additional ints on every Symbol. In general, it does not seem to me that this path provides enough value per overhead/effort so I would think I would recommend letting it go OR filing a new ticket.

Comment by Paavo Parkkinen [ 28/Oct/13 7:10 PM ]

I didn't realize the ticket was closed. Thanks for pointing that out.

And thanks for taking the time to still take a look at and comment on my patch, Alex. It's good advice.





[ASYNC-70] documentation of thread macro should include behavior of nil (closes the channel) Created: 15/May/14  Updated: 23/Jun/14  Resolved: 23/Jun/14

Status: Closed
Project: core.async
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Howard Lewis Ship Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: reader
Environment:

0.1.278.0



 Description   

I needed the ability to invoke some code in a thread, and have the channel close if nil was returned.

Digging though the code, I discovered it already does this, but it is no documented in the docstring.

I'll supply a patch shortly.



 Comments   
Comment by Alex Miller [ 23/Jun/14 11:15 AM ]

The thread docstring says: "Returns a channel which will receive the result of the body when completed." The special case of a null return value is actually handled by ignoring it (because you are not allowed to explicitly put nils on a channel). The channel is closed on completion regardless. I'm not sure I get what needs to be added here and no patch, so closing. Reopen if there is a concrete suggestion to evaluate.





Generated at Tue Jul 22 10:37:04 CDT 2014 using JIRA 4.4#649-r158309.