<< Back to previous view

[CLJ-1033] pr-str and read-string don't handle @ symbols inside keywords properly Created: 26/Jul/12  Updated: 13/Jan/13

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Steven Ruppert Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader
Environment:

Ubuntu 12.04 LTS; Java 1.7.0_05 Java HotSpot(TM) Client VM


Approval: Vetted

 Description   
user=> (read-string (pr-str {(keyword "key@other") :stuff}))
RuntimeException Map literal must contain an even number of forms  clojure.lang.Util.runtimeException (Util.java:170)

pr-str emits "{:key@other :stuff}", which read-string fails to interpret correctly. Either pr-str needs to escape the @ symbol, or read-string needs to handle the symbol inside a keyword.

Background: I'm passing a map with email addresses as keys through Storm bolts, which require a thrift-serializable form. Using the pr-str/read-string combo fails on these keys, so I've fallen back to JSON serialization.



 Comments   
Comment by Stuart Halloway [ 10/Aug/12 12:51 PM ]

The '@' character is not a legal character for keywords or symbols (see http://clojure.org/reader). Recategorized as enhancement request.

Comment by Steven Ruppert [ 10/Aug/12 1:04 PM ]

Then why doesn't (keyword "keywith@") throw an exception? It seems inconsistent with your statement.

Comment by Andy Fingerhut [ 13/Sep/12 2:23 PM ]

It is a long standing property of Clojure that it does not throw exceptions for everything that is illegal.

Comment by Steven Ruppert [ 17/Sep/12 2:16 PM ]

Yeah, but read-string clearly does. Is there a good reason that the "keyword" function can't throw an exception? With the other special rules on namespaces within symbol names, the "keyword" function really should be doing validation.

Another solution would be to allow a ruby-like :"symbol with disallowed characters" literal, but that would also be confusing with how the namespace is handled.

https://groups.google.com/forum/?fromgroups=#!topic/clojure/Ct5v9w0yNAE has some older discussion on this topic.

Comment by Andy Fingerhut [ 17/Sep/12 7:43 PM ]

Disclaimer: I'm not a Clojure/core member, just an interested contributor who doesn't know all the design decisions that were made here.

Steven, I think perhaps a couple of concerns are: (1) doing such checks would be slower than not doing them, and (2) implementing such checks means having to update them if/when the rules for legal symbols, keywords, namespace names, etc. change.

Would you be interested in writing strict versions of functions like symbol and keyword for addition to a contrib library? And test suites that try to hit a significant number of corner cases in the rules for what is legal and what is not? I mean those as serious questions, not rhetorical ones. This would permit people that want to use the strict versions of these functions to do so, and at the same time make it easy to measure performance differences between the strict and loose versions.

Comment by Steven Ruppert [ 13/Jan/13 10:58 PM ]

Looking back at this, the root cause of the problem is that the {pr} function, although it by default "print[s] in a way that objects can be read by the reader" [0], doesn't always do so. Thus, the easiest "fix" is to change its docstring to warn that not all keywords can be read back.

The deeper problem is that symbol don't have a reader form that can represent all actually possible keywords (in this case, those with "@" in them). Restricting the actually-possible keywords to match the reader form, i.e. writing a strict "keyword" function actually seems like a worse solution overall to me. The better solution would be to somehow extend the keyword reader form to allow it to express all possible keywords, possibly ruby's :"keyword" syntax. Plus, that solution would avoid having to keep hypothetical strict keyword/symbol functions in sync with reader operation, and write test cases for that, and so on.

Thus, the resolution of this bug comes down to how far we're willing to go. Changing the docstring would be the easiest, but extending the keyword form would be the "best" resolution, IMO.

[0]: http://clojuredocs.org/clojure_core/clojure.core/pr

Generated at Wed May 22 07:43:23 CDT 2013 using JIRA 4.4#649-r158309.