<< Back to previous view

[TRDR-44] The exceptions throw when parsing fails could be much more specific and helpful. Created: 12/Apr/17  Updated: 13/Apr/17

Status: Open
Project: tools.reader
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Russ Olsen Assignee: Nicola Mometto
Resolution: Unresolved Votes: 1
Labels: error-reporting

Both cljs and clojure.
OS independent.

Attachments: Text File edn.patch    
Patch: Code and Test


The exceptions thrown when edn/clojure code parsing fails are less helpful than they might be:

  • The same message is used in slightly different circumstance. For example "EOF while reading" is used in three slightly different situations in clojure/tools/reader.clj, while "EOF while reading character" is used four times in the same file.
  • The exception messages frequently do not include much of the available context. Which token is bad? Which map contains an odd number of items?
  • The line number where the error occurred is not always returned.

Comment by Nicola Mometto [ 13/Apr/17 10:03 AM ]

Hi Russ,
thanks for the patch, this is a significant change so it might take me a while to review it and I can't say whether or not I'll accept it yet.

In the meantime, can you give me a quick overview of the impl approach and a description of which error messages you changed?

Comment by Russ Olsen [ 13/Apr/17 11:28 AM ]


Sure, thanks for writing.

Let me start by saying that I'm not married to the approach that I tool with this patch. It seemed to me a good way to make things better, but certainly not the only way.
The tact that I took was:

  • Factor out generating the common exceptions and put that code in the errors namespace. I did this because – as I'm sure you are aware – there are four separate implementations of the basic reading code, one each for tools.reader and tools.reader.edn and times two for clojure and clojurescript. There is more or less one
    function in errors for each exception.
  • In a couple of places I pushed down the specific type of value we are reading into the generic function. So both read-delimited and read-token have an additional 'kind' parameter, so that when something goes wrong they can say "Opps, I was reading a vector (or symbol or keyword) and stuff happened."
  • Also added an inspect multimethod whose idea is to provide a safe way to turns values into strings to provide context. I say 'safe' in that inspect will happily turn a keyword or a symbol into a string, but will give you <seq> for lazy (and possibly never ending) sequences.
  • I also changed read-string to use an indexing reader.

As for the specifics of the messages, well there are a lot of them. Can I suggest you take a looks at errors.clj - I think the changes are apparent there. If you still need more context then I'll be happy to write up something more formal.

[TRDR-29] Simple benchmarking bash script to test before & after applying a batch Created: 17/Jul/15  Updated: 17/Jul/15

Status: Open
Project: tools.reader
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: David Nolen Assignee: David Nolen
Resolution: Unresolved Votes: 0
Labels: None

Generated at Thu May 25 16:52:08 CDT 2017 using JIRA 4.4#649-r158309.