
[CLJ-1138] data-reader returning nil causes exception Created: 22/Dec/12  Updated: 15/Feb/17

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5, Release 1.6, Release 1.7, Release 1.8
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Steve Miner Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: reader

Environment: clojure 1.5 beta2, Mac OS X 10.8.2, java version "1.6.0_37"

Attachments: 0001-CLJ-1139-allow-nil-in-data-reader.patch, clj-1139-2.patch
Patch: Code and Test
Approval: Triaged


If a data-reader returns nil, the reader throws java.lang.RuntimeException: No dispatch macro... The error message implies that there is no dispatch macro for whatever the first character of the tag happens to be.

Here's a simple example:

user=> (binding [*data-readers* {'f/ignore (constantly nil)}] 
         (read-string "#f/ignore 42 10"))
RuntimeException No dispatch macro for: f  clojure.lang.Util.runtimeException (Util.java:219)

The original reader code did not distinguish between the absence of a data-reader and a returned value of nil from the appropriate data-reader. It therefore got confused and tried to find a dispatch macro, sending it further down the incorrect code path, ultimately yielding a misleading error message.

The original documentation did not identify nil as an illegal return value. Clearly this bug was an oversight in the original data-reader code, not an intentional feature.

The patch uses a sentinel value to distinguish the missing data-reader case from the nil returned value case.
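
A minimal sketch of why a sentinel helps, written as plain Clojure rather than the actual LispReader Java code. All names here (read-tagged-naive, dispatch-naive, not-found, read-tagged-sentinel, dispatch-with-sentinel) are hypothetical and only model the two lookup strategies described above; the real patch may differ in detail.

```clojure
;; Hypothetical model of the reader's tag lookup; NOT the real LispReader code.

(defn read-tagged-naive
  "Returns the data-reader's result, or nil when no reader is registered.
  A reader that returns nil looks exactly like a missing reader."
  [readers tag form]
  (when-let [f (get readers tag)]
    (f form)))

(defn dispatch-naive
  [readers tag form]
  (let [result (read-tagged-naive readers tag form)]
    (if (nil? result)
      ;; nil is taken to mean "no reader", so we fall through to the
      ;; dispatch-macro error and report the first character of the tag.
      (throw (RuntimeException.
              (str "No dispatch macro for: " (first (str tag)))))
      result)))

;; A unique object that no data-reader can return by accident.
(def not-found (Object.))

(defn read-tagged-sentinel
  "Returns the data-reader's result (which may be nil), or the
  not-found sentinel when no reader is registered for the tag."
  [readers tag form]
  (if-let [f (get readers tag)]
    (f form)
    not-found))

(defn dispatch-with-sentinel
  [readers tag form]
  (let [result (read-tagged-sentinel readers tag form)]
    (if (identical? result not-found)
      (throw (RuntimeException. (str "No reader function for tag " tag)))
      result)))

(def readers {'f/ignore (constantly nil)})
```

With these definitions, (dispatch-with-sentinel readers 'f/ignore 42) returns nil, while (dispatch-naive readers 'f/ignore 42) throws with the misleading message "No dispatch macro for: f".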

Patch: clj-1139-2.patch

Comment by Steve Miner [ 22/Dec/12 9:43 AM ]

clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch allows a data-reader to return nil instead of throwing. It also sanity-checks that a possible tag or record name starts with a character for which isJavaIdentifierStart() is true, and gives a better error message for special characters that might actually be dispatch macros (rather than assuming it's a tagged literal).

Comment by Steve Miner [ 22/Dec/12 10:06 AM ]

clj-1138-data-reader-return-nil-for-no-op.patch allows a data-reader returning nil to be treated as a no-op by the reader (like #_). nil is not normally a useful value for a data-reader to return (in fact, it causes an exception in Clojure 1.4 through 1.5 beta2). With this patch, one could get something like a conditional feature reader using data-readers.

Comment by Steve Miner [ 22/Dec/12 10:26 AM ]

clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch is the first patch to consider. It merely allows nil as a value from a data-reader and returns nil as the final value. I think it does what was originally intended for dispatch macros, and gives a better error message in many cases (mostly typos).

The second patch, clj-1138-data-reader-return-nil-for-no-op.patch, depends on the other being applied first. It takes an extra step to treat a nil value returned from a data-reader as a no-op for the reader (like #_).
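
For reference, this is how the standard #_ discard macro behaves, which is the behavior the second patch proposes to mirror for a nil-returning data-reader. The snippet uses only the stock reader and does not demonstrate the patched behavior; the var names are just for illustration.

```clojure
;; #_ makes the reader skip the next form entirely.
(def inside-vector (read-string "[1 #_2 3]"))
;; inside-vector is [1 3]

;; At top level, the discarded form is skipped and the following form is read.
(def top-level (read-string "#_42 10"))
;; top-level is 10
```

Under the no-op proposal, the intent as described above is that a tagged form whose data-reader returns nil would be skipped in the same way.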

Comment by Steve Miner [ 23/Dec/12 11:52 AM ]

It turns out that you can work around the original problem by having your data-reader return '(quote nil) instead of plain nil. That expression conveniently evaluates to nil so you can get a nil if necessary. This also works after applying the patches so there's still a way to return nil if you really want it.

(binding [*data-readers* {'x/nil (constantly '(quote nil))}] (read-string "#x/nil 42"))
;=> (quote nil)
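
A short sketch of what the work-around does and does not give you. The reader still returns the list (quote nil); you only get an actual nil once that form is evaluated. The var names below are just for illustration.

```clojure
;; The data-reader returns the form (quote nil), not nil itself.
(def form
  (binding [*data-readers* {'x/nil (constantly '(quote nil))}]
    (read-string "#x/nil 42")))

;; Evaluating the form yields nil, which is what happens when the
;; tagged literal appears in code that gets compiled or eval'd.
(def evaluated (eval form))

;; When used purely as data (no evaluation), the quoted form stays
;; in the result, so this is not a substitute for a real nil.
(def as-data
  (binding [*data-readers* {'x/nil (constantly '(quote nil))}]
    (read-string "[1 #x/nil 42]")))
```

Here form is (quote nil), evaluated is nil, and as-data is [1 (quote nil)].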

Comment by Andy Fingerhut [ 07/Feb/13 9:20 AM ]

Patch clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch dated Dec 22 2012 still applies cleanly to latest master if you use the following command:

% git am --keep-cr -s --ignore-whitespace < clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch

Without the --ignore-whitespace option, the patch fails only because some whitespace was changed in Clojure master recently.

Comment by Andy Fingerhut [ 13/Feb/13 11:24 AM ]

OK, now with latest master (1.5.0-RC15 at this time), patch clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch no longer applies cleanly, not even using --ignore-whitespace in the 'git am' command given above. Steve, if you could see what needs to be updated, that would be great. Using the patch command as suggested in the "Updating stale patches" section of http://dev.clojure.org/display/design/JIRA+workflow wasn't enough, so it should probably be carefully examined by hand to see what needs updating.

Comment by Steve Miner [ 14/Feb/13 12:21 PM ]

I removed my patches. Things have changed recently with the LispReader and new EdnReader.

Comment by Alex Miller [ 15/Feb/17 9:18 AM ]

Fixed whitespace warning and updated the patch so it applies. No semantic changes; attribution retained in clj-1139-2.patch.

Comment by Alex Miller [ 15/Feb/17 9:27 AM ]

Ticket needs better description of problem and approach taken in the patch.

Comment by Steve Miner [ 15/Feb/17 10:09 AM ]

If the problem isn't clear, I would ask why would a nil return value be treated specially for a data-reader? And if it is considered illegal by design, does this error message enlighten the user?

I could not find any documented restriction at the time the bug was filed and I still can't find any today. So it seems like a simple bug to me. The data-reader should be allowed to return nil, and the Clojure reader should process the nil as usual. My work-around was to return (quote nil) which gave the intended behavior without triggering the bug.

Comment by Alex Miller [ 15/Feb/17 10:55 AM ]

Would appreciate more updates to the description. My question would be whether invoking a data reader function should ever return nil. Is there a good use case to need this? It seems you are reading the description of a non-nil tagged value with the reader and thus getting back nil is confusing. That cannot possibly be round-tripped and thus seems asymmetric.

Comment by Steve Miner [ 15/Feb/17 1:34 PM ]

Nulla poena sine lege, or basically: unless you say it's illegal, I should be able to do it. My trivial example is #C NULL which seems like an obvious nil to me.

Looking back on this issue, I can see that most people think of tagged literals as a way of encoding foreign values in Clojure literals. If you only care about an extensible data notation, who needs another way of writing nil? That's a fair question.

I wanted to use data-readers as somewhat circumscribed reader macros (as used in Common Lisp). I discovered this bug while I was doing something platform specific (long before reader conditionals were implemented). In my situation, it was convenient to return nil on "other" platforms.

Many usages of data-readers are not bijective. For example, #infix (3 + 4) interpreted as constant 7 is likewise not round-trippable. Unless you're Dan Friedman or Will Byrd, round-tripping is a tough requirement.
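
A minimal sketch of the #infix example, to make the non-round-trippable point concrete. The read-infix function and the ops table are hypothetical; they are one possible way to write such a reader, not something taken from an existing library.

```clojure
;; Hypothetical operator table and reader fn for an #infix tag.
(def ops {'+ +, '- -, '* *})

(defn read-infix
  "Takes a three-element list (a op b) and computes the result at read time."
  [[a op b]]
  ((ops op) a b))

(def value
  (binding [*data-readers* {'infix read-infix}]
    (read-string "#infix (3 + 4)")))

;; Printing the value gives no hint of the tagged literal it came from.
(def printed (pr-str value))
```

Here value is 7 and printed is "7", so the original text "#infix (3 + 4)" cannot be recovered from the value.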

I will try to update my description with a bit more context, but I don't want to distract anyone from the obvious bug (and bad error message) with my unorthodox usage.

Comment by Steve Miner [ 15/Feb/17 1:45 PM ]

By the way, this bug is CLJ-1138, but the proposed patch says "1139" which might confuse some busy reviewers.

Comment by Steve Miner [ 15/Feb/17 2:13 PM ]

I tested the patch and it worked well for me with the current master. I would suggest adding another test to confirm that edn/read-string works correctly as well. Here's what I used. This also tests that overriding the default readers works. Please feel free to take the test if you want it.

;; Assumes the test namespace refers clojure.test (deftest, is)
;; and requires [clojure.edn :as edn].
(deftest clj-1138-uuid-override
  (is (nil? (binding [*data-readers* {'uuid (constantly nil)}]
              (read-string "#uuid \"550e8400-e29b-41d4-a716-446655440000\""))))
  (is (nil? (edn/read-string {:readers {'uuid (constantly nil)}}
                             "#uuid \"550e8400-e29b-41d4-a716-446655440000\""))))
Generated at Thu Oct 19 18:52:40 CDT 2017 using JIRA 4.4#649-r158309.