<< Back to previous view

[CLJS-677] cljs.reader doesn't support keywords starting with a digit Created: 12/Nov/13  Updated: 10/Sep/15

Status: Open
Project: ClojureScript
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Chas Emerick Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: None

ClojureScript:foo> (r/read-string ":0")
"Error evaluating:" (r/read-string ":0") :as "cljs.reader.read_string.call(null,\":0\")"
org.mozilla.javascript.EcmaError: TypeError: Cannot read property "0.0" from null (file:/home/chas/dev/clojure/cljs/.repl/cljs/reader.js#451)

The topic of leading digits in keywords came up separately, as they've been supported in Clojure for some time, but can now be considered part of the spec, as it were. See CLJ-1286.

BTW, this is another simple-check win...

Comment by Chas Emerick [ 21/Nov/13 9:38 AM ]

This is not a simple regex change, as I had hoped given the recent flurry in Clojure. The symbol pattern in cljs.reader is faithful to Clojure HEAD, but the processing of matches isn't. I think it may be a wash as to whether it'd be easier to fix what's there vs. porting clojure.tools.reader.impl.commons/parse-symbol (which incidentally doesn't use a regex)…either way, leaving it for another day (or someone else, if they're up for it).

Comment by Francis Avila [ 02/Jul/14 12:35 AM ]

I think I fixed the match processing issue you're talking about (CLJS-775 CLJS-776)? However I'm still confused by this and CLJ-1286. The clojure reader docs and edn spec still say they should reject `:0`, but 1.6.0 doesn't. What's the expected behavior? Is the spec going to be fixed, or clojure reader fixed once downstream packages are fixed?

Comment by Jozef Wagner [ 02/Jul/14 1:50 AM ]

AFAIK EDN specs do not reject :0 (no rule that the second character cannot be a digit). See https://github.com/wagjo/serialization-formats for my interpretation of existing specs.

Comment by Francis Avila [ 02/Jul/14 1:35 PM ]

Ah, I think I see the source of the confusion. Both EDN and the clojure reader spec both say something like "keywords are like symbols, except beginning with a colon." The confusion lies in whether we interpret that as meaning

  1. First character is a colon, then the second character and after are matched against the symbol definition.
  2. The first character is a colon, and the whole form is matched against the symbol definition.

CLJ-1003 CLJ-1252 and CLJ-1286 and myself all seem to understand the first meaning. This might be because when we say "the first character of a keyword" we typically mean the first character after the colon, as if the colon is "special" and not part of the keyword (e.g. like a reader macro character).

However clojure 1.6 seems to be following the second meaning (and explains why `:0/a` is ok but not `:0/0`), and I'm not sure from the cited tickets and google group discussions whether this is because of downstream breakage or if this is the intended interpretation and the patch from CLJ-1252 was accepted by Alex Miller erroneously.

Note if we accept the second interpretation, then the restriction "A symbol can contain one or more non-repeating ':'s." from the clojure reader docs is incorrect for keywords. (EDN doesn't allow namespace-expanded keywords, it seems, so it's not an issue there.)

Also EDN allows contiguous colons in symbols, whereas clojure 1.6 and the reader spec do not.

Comment by Francis Avila [ 02/Jul/14 2:11 PM ]

Also clojure 1.6 allows a/:a and :a/:a (where name part violates first-character rule for symbols), even though the specs do not. (This is something your table doesn't mention. Very thorough work BTW! I wish the reader spec was more formalized and unambiguous...)

Comment by Francis Avila [ 02/Jul/14 3:08 PM ]

I think this pattern follows the specs:

(?!///) # edge case: / only allowed in name part.
# name or namespace part of symbol or keyword
 #division symbol
 # normal symbol
 # symbol starting with [-+.]
 # keyword
# name part when namespace is present
(?:/(/ # division symbol
# groups:
# 1: symbol name or namespace 2: keyword colon(s) 3: keyword name or namespace
# 4: keyword or symbol name (and groups 1 and 3 are namespaces)"


  1. Does not enforce no-repeating-colon rule (but it is easy to validate after matching).
  2. Rejects violations of first-character-rule in symbols which clojure accepts.
  3. Accepts a trailing colon on namespace (unlike clojure).
  4. Accepts foo// or :foo//, which are not clearly addressed by the specs. (Jozef's table has more background). These are both allowed in Clojure 1.6, but not 1.5 or (arguably) edn.
Comment by Francis Avila [ 02/Jul/14 6:28 PM ]

Another problem: Accepts :::a/b, which I think is ok per the specs but is not read by 1.6. Crazy example:

user=> (require ['clojure.core :as (symbol ":a")])
user=> :::a/map

RuntimeException Invalid token: :::a/map  clojure.lang.Util.runtimeException (Util.java:221)
user=> (resolve (symbol ":a" "map"))

Theoretically I might expect :::a/map to be read as :clojure.core/map?

Comment by Nicolás Berger [ 10/Sep/15 6:44 AM ]

Bumping this up, as I just scratched my head for an hour to find out this was the culprit

Comment by David Nolen [ 10/Sep/15 7:15 AM ]

Nicolás, the premise of the ticket is that this should be supported when clearly the Clojure documentation about valid keywords states that it isn't. The Clojure implementation just happens to allow it. In anycase, this needs to be sorted out in Clojure first.

Comment by Francis Avila [ 10/Sep/15 9:35 AM ]

I think CLJ-1527 is currently the ticket where this problem is pursued.

Generated at Sat Feb 06 05:53:41 CST 2016 using JIRA 4.4#649-r158309.