<< Back to previous view

[TRDR-25] Source information when reading EDN Created: 06/Apr/15  Updated: 05/May/15

Status: Open
Project: tools.reader
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Andrew Childs Assignee: Nicola Mometto
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File tools.reader-edn-source-info.diff     File toos.reader-edn-source-info-2.diff    
Patch: Code

 Description   

I want to parse a file as EDN with full source information (line/column metadata), but tools.reader does not provide any metadata when using clojure.tools.reader.edn. I also want the source information for numeric literals and keywords, which cannot hold metadata. One approach is to parameterise the construction of the data, and allow the builder to add or discard metadata, including a place to hold metadata for keywords and numeric literals. I've attached an implementation as a patch.

Would this be suitable for inclusion into tools.reader? If not, is there a different approach that would be?



 Comments   
Comment by Nicola Mometto [ 07/Apr/15 5:06 AM ]

Hi Andrew, I'm definitely interested in adding such a feature to tools.reader, however before I can consider your patch/approach can address the following:

  • don't add unnecessary whitespace/indentation changes, it makes it harder to see what's actually getting changed
  • the change from :use to :require+:refer is a no-go, there's a reason why t.reader is not using :requrie+:refer: clojure 1.3 compatibility. Please revert that change and make sure tools.reader is still back compatible by running `lein test-all`
  • what are the performance implications of this change? can you attach a couple of benchmarks showing pre-patch vs post-patch timing?
  • why does this only targets the edn reader?

Thanks

Comment by Andrew Childs [ 07/Apr/15 9:11 PM ]

Hi Nicola,

  • I've attached an updated patch that addresses the first two points: removed whitespace churn and lein test-all passes.
  • For generating timing information, is there a set of benchmarks or a collection of example documents you would recommend I use?
  • This only targets the edn reader since that is what is required for my application. The same approach could be applied to the regular reader.

Thank you for considering these changes,
Andrew

Comment by Nicola Mometto [ 13/Apr/15 12:45 PM ]

Hi Andrew,
if this is going to make into tools.reader, it has to be a change in both the edn reader and the regular clojure reader.
It's a tools.reader policy that the regular clojure reader is additive to the edn reader meaning that if a feature is available for t.r.edn then it is avaliable for t.reader too (the opposite is not guaranteed).

For timing information, I suggest you generate a bunch of random edn data using something like data.generators or test.check and benchmark it pre and post patch using criterium (https://github.com/hugoduncan/criterium)

Comment by Andrew Childs [ 29/Apr/15 7:00 PM ]

Hi Nicola,

I've run some benchmarks, the details of which I have pushed to github. The output seems to indicate there's some performance penalty for small inputs, but for larger inputs it's less obvious or even faster. Since it's adding a layer of abstraction, I don't yet understand why it is sometimes faster for larger inputs.

I'm happy to do the work to integrate a similar abstraction for the regular clojure reader, but I'd like to know if you agree with the approach before I put in the time do so. I'd really like full line numbers in the EDN reader, but I'm not attached to any particular implementation.

Thanks,
Andrew

Comment by Nicola Mometto [ 05/May/15 9:34 AM ]

This is a significant change and I don't think think I will have time to review it anytime soon, in the meantime thanks for your work, I'll let you know as soon as I get to look at it more throughly.





[TRDR-29] Simple benchmarking bash script to test before & after applying a batch Created: 17/Jul/15  Updated: 17/Jul/15

Status: Open
Project: tools.reader
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: David Nolen Assignee: David Nolen
Resolution: Unresolved Votes: 0
Labels: None





[TRDR-13] Better documentation wrt interaction between reader-types and java readers Created: 11/Apr/14  Updated: 11/Apr/14

Status: Open
Project: tools.reader
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Nicola Mometto Assignee: Nicola Mometto
Resolution: Unresolved Votes: 0
Labels: None





Generated at Tue Sep 01 17:27:25 CDT 2015 using JIRA 4.4#649-r158309.