<< Back to previous view

[CLJ-1424] Reader conditionals Created: 15/May/14  Updated: 27/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Ghadi Shayban Assignee: Alex Miller
Resolution: Unresolved Votes: 1
Labels: reader

Attachments: File clj-1424-10.diff     File CLJ-1424-2.diff     File clj-1424-3.diff     File clj-1424-4.diff     File clj-1424-5.diff     File clj-1424-6.diff     File clj-1424-7.diff     File clj-1424-8.diff     File clj-1424-9.diff     File clojure-feature-expressions.diff    
Patch: Code and Test
Approval: Vetted

 Description   

See http://dev.clojure.org/display/design/Reader+Conditionals for design.

The implementation details contain several major parts:

1) LispReader changes to implement new conditional read syntax and features.

In RT:

  • suppress-read - new dynamic var
  • readString() - new arity that takes opts

In LispReader:

  • read(...) entry point - added new variants that take opts. The opts-only form will decode :eof into the correct values for eofIsError and eofValue. Internal forms take both opts and pendingForms (list).
  • All reader invoke dispatch methods now take both opts and pendingForms which get drug around the call stack.
  • Defined all option keys and well-known allowed values - OPT_EOF, OPT_FEATURES, OPT_READ_COND, EOFTHROW, PLATFORM_KEY, PLATFORM_FEATURES, COND_ALLOW, COND_PRESERVE.
  • read(...) entry point will ensure installation of {:features #{:clj}}, appropriately merging others or none. There is no path that will pass :features inside Clojure but users can call LispReader with it directly.
  • readDelimitedList() was refactored so it can share logic with readCondDelimited().
  • Luke can provide more detail on the guts of reader conditional reading if needed.

New ReaderConditional and TaggedLiteral types.

In clojure.core:

  • tagged-literal?, tagged-literal, reader-conditional?, reader-conditional - new functions to support the preserved data form of these
  • new print-method impls for TaggedLiteral and ReaderConditional
  • read - added new arity that takes opts and explain opts in docstring
  • read-string - added new arity that takes opts (placement matches edn/read-string and is designed for partial application)

2) cljc file changes.
In Compiler:

  • OPTS_COND_ALLOWED - keep around a map {:read-cond :allow} to invoke reader when reading .cljc files.
  • readerOpts() - new private function to determine reader options to use based on file name (.clj vs .cljc)
  • load() - use readerOpts() and invoke LispReader with them
  • compile() - use readerOpts() and invoke LispReader with them

In RT:

  • load(...) - check first for .clj file, then for .cljc file when looking by ns

In clojure.main - update code to allow for .clj and .cljc files
In clojure.repl - update code to allow for .clj and .cljc files

3) Tests + infrastructure updates.

  • Move test/clojure/test_clojure/reader.clj to test/clojure/test_clojure/reader.cljc so that we can test reader conditionals.
  • Bump version of test.generative to get newer version that depends on newer tools.namespace which is able to find .cljc files as source files. The test script at src/script/run_test.clj uses this to find Clojure test files and without the bump, it will not find the new reader.cljc test ns. Note: if you use the ant build, you need to re-run antsetup.sh for this to take effect.
  • src/script/run_test.clj - as part of the version bump, move from deprecated namespace to new namespace
  • Added lots of tests in reader.cljc

Patch: clj-1424-10.diff

Screened by:

Related: CLJS-27, TRDR-14



 Comments   
Comment by Jozef Wagner [ 16/May/14 2:19 AM ]

Has there been a decision that CL syntax is going to be used? Related discussion can be found at design page, google groups discussion and another discussion.

Comment by Alex Miller [ 16/May/14 8:34 AM ]

No, no decisions on anything yet.

Comment by Ghadi Shayban [ 19/May/14 7:25 PM ]

Just to echo a comment from TRDR-14:

This is WIP and just one approach for feature expressions. There seem to be at least two couple diverging approaches emerging from the various discussion (Brandon Bloom's idea of read-time splicing being the other.)

In any case having all Clojure platforms be ready for the change is probably essential. Also backwards compatibility of feature expr code to Clojure 1.6 and below is also not trivial.

Comment by Kevin Downey [ 04/Aug/14 1:39 PM ]

if you have ever tried to do tooling for a language where the "parser" tossed out information or did some partial evaluation, it is a pain. this is basically what the #+cljs style feature expressions and bbloom's read time splicing both do with clojure's reader. I think resolving this at read time instead of having the compiler do it before macro expansion is a huge mistake and makes the reader much less useful for reading code.

Comment by Ghadi Shayban [ 04/Aug/14 2:00 PM ]

Kevin, what kind of tooling use case are you alluding to?

Comment by Kevin Downey [ 04/Aug/14 3:24 PM ]

any use case that involves reading code and not immediately handing it off to the compiler. if I wanted to write a little snippet to read in a function, add an unused argument to every arity then pprint it back, reader resolved feature expressions would not round trip.

if I want to write snippet of code to generate all the methods for a deftype (not a macro, just at the repl write a `for` expression) I can generate a clojure data structure, call pprint on it, then paste it in as code, reader feature expressions don't have a representation as data so I cannot do that, I would have to generate strings directly.

Comment by Alex Miller [ 22/Aug/14 9:10 AM ]

Changing Patch setting so this is not in Screenable yet (as it's still a wip).

Comment by Alex Miller [ 07/Nov/14 4:39 PM ]

Latest patch brings up to par with related patches in CLJS-27 and TRDR-14 and importantly adds support for loading .cljc files as Clojure files.

Comment by Andy Fingerhut [ 07/Nov/14 5:55 PM ]

Maybe undesirable behavior demonstrated below with latest Clojure master plus patch clj-1424-3.diff, due to the #+cljs skipping the comment, but not the (dec a). I thought it could be fixed simply by moving RT.suppressRead() check after (ret == r) check in read(), but that isn't correct.

user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs (dec a))")
(defn foo [a] (inc a))
user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs ; foo\n (dec a))")
(defn foo [a] (inc a) (dec a))
Comment by Alex Miller [ 21/Jan/15 4:28 PM ]

Added new clj-1424-4.diff which makes a couple of modifications:

  • removed support for and/or/not (#+ and #- remain)
  • *features* has been removed
  • if you wish to have a custom feature set while reading, there is a new option map that can be passed to read (this all parallels similar changes previously made to the edn reader)

Example of adding a "custom" feature to the feature set (which will always contain "clj" feature):

(read 
  {:features #{:custom}} 
  (java.io.PushbackReader. (java.io.StringReader. "[#+custom :x]")))
Comment by Andy Fingerhut [ 21/Jan/15 5:01 PM ]

Latest patch clj-1424-4.diff also exhibits maybe-undesirable behavior in which #+cljs can suppress an immediately following comment, rather than the form following it. See 07/Nov/14 comment with example above.

Comment by Alex Miller [ 21/Jan/15 6:16 PM ]

Thanks Andy, I'm aware. Haven't looked at it yet.

Comment by Luke VanderHart [ 25/Jan/15 9:26 PM ]

Patch clj-1424-5.diff modifies the code to use "read-conditionals", as outlined by Rich at: https://groups.google.com/d/msg/clojure-dev/LW0ocQ1RcYI/IBPPyfCpM3kJ

Comment by Alex Miller [ 26/Jan/15 12:33 PM ]

Some feedback:

1) Because pendingForms is an internal thing, I would make the read() that takes it non-public.
2) In readDelimitedList, I don't see the point of constructing a new LinkedList then checking if it's empty there. Should just make the add conditional on whether it's null or not.
3) You could treat pendingForms as a Deque (which LinkedList implements) and then use pop() instead of remove(0). The addAll(0, ...) is more painful to replicate though if you're sticking to Deque. I think I'd be tempted to just commit explicitly to LinkedList for pendingForms since we fully control the construction and use of it within the reader.
4) Might be nice to update the commented-out readers to support pendingForms as I did with opts. Or remove the updates for opts. Should either do all the mods or none on the commented-out code.
5) s/read-cond-splicing/read-cond-splice/ ? Seems like where it's used it should be a verb.
6) Should just use :default and make :else and :none throw exceptions. I think Rich mentioned :except or :exception too? or maybe I misheard that.
7) Should have some more tests to tweak the error cases - bad feature, uneven forms, default out of allowed position, bad contents for splice, etc.

Comment by Alex Miller [ 26/Jan/15 2:01 PM ]

From Chouser on the mailing list: "is it intentional that reading (clojure.core/read-cond ...) does not behave the same as (#? ...)? That is, (#? ...) can be read as c.c/read-cond depending on read options, but having been read, if it is printed again it doesn't round-trip back to #?. This is different, for example, from how #(...) is read as (fn* [] (...)), which then retains its meaning."

In shouldReadConditionally(), it looks like the == check vs READ_COND will not work. Instead of:
return (first == READ_COND || first == READ_COND_SPLICING);
do
return (READ_COND.equals(first) || READ_COND_SPLICING.equals(first));

For example, this test doesn't seem to give the right answer:

user=> (read-str-opts {:preserve-read-cond false} "(clojure.core/read-cond :clj :x :default :y)")
(clojure.core/read-cond :clj :x :default :y)    ;; should be :x
Comment by Michael Blume [ 26/Jan/15 3:27 PM ]

With this patch applied to master, lein check fails on instaparse:

Compiling namespace instaparse.abnf
Exception in thread "main" clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader, compiling:(abnf.clj:186:28)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3605)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3599)
	at clojure.lang.Compiler$DefExpr.eval(Compiler.java:436)
	at clojure.lang.Compiler.eval(Compiler.java:6772)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.RT.loadResourceScript(RT.java:384)
	at clojure.lang.RT.loadResourceScript(RT.java:375)
	at clojure.lang.RT.load(RT.java:459)
	at clojure.lang.RT.load(RT.java:425)
	at clojure.core$load$fn__5424.invoke(core.clj:5850)
	at clojure.core$load.doInvoke(core.clj:5849)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at user$eval52$fn__63.invoke(form-init5310597017138984927.clj:1)
	at user$eval52.invoke(form-init5310597017138984927.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6767)
	at clojure.lang.Compiler.eval(Compiler.java:6757)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.Compiler.loadFile(Compiler.java:7150)
	at clojure.main$load_script.invoke(main.clj:275)
	at clojure.main$init_opt.invoke(main.clj:280)
	at clojure.main$initialize.invoke(main.clj:308)
	at clojure.main$null_opt.invoke(main.clj:343)
	at clojure.main$main.doInvoke(main.clj:421)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader
	at clojure.lang.AFn.throwArity(AFn.java:429)
	at clojure.lang.AFn.invoke(AFn.java:36)
	at instaparse.cfg$eval800$safe_read_string__801.invoke(cfg.clj:163)
	at instaparse.cfg$process_string.invoke(cfg.clj:180)
	at instaparse.cfg$build_rule.invoke(cfg.clj:217)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:214)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:207)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30)
	at clojure.core.protocols$fn__6436.invoke(protocols.clj:59)
	at clojure.core.protocols$fn__6389$G__6384__6402.invoke(protocols.clj:13)
	at clojure.core$reduce.invoke(core.clj:6501)
	at clojure.core$into.invoke(core.clj:6582)
	at instaparse.cfg$ebnf.invoke(cfg.clj:277)
	at clojure.lang.AFn.applyToHelper(AFn.java:154)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3600)
	... 27 more
Failed.
Comment by Michael Blume [ 26/Jan/15 3:29 PM ]

Aha, of course, Instaparse is calling into the LispReader$StringReader directly.

Is it worth providing versions of these methods with the old arities? Or should instaparse just not be using Clojure internals this way?

Comment by Michael Blume [ 26/Jan/15 3:33 PM ]

https://github.com/Engelberg/instaparse/blob/v1.3.5/src/instaparse/cfg.clj#L159

Comment by Alex Miller [ 26/Jan/15 3:33 PM ]

Instaparse is reaching pretty deep inside implementation details here, so I'd say this should expect to break. We could back-fill the old arities here but I'd really prefer not to if possible.

Comment by Luke VanderHart [ 27/Jan/15 11:23 AM ]

clj-1424-6.diff addresses all the issues mentioned above. Per a comment from Rich, it also adds tests to ensure that nested splices work properly (they do).

There were two things from your list I didn't do, Alex:

3) I kept pendingForms as a List. Because we aren't confining ourselves to a Deque interface, I don't see the benefit of calling pop() over remove(0) (with identical semantics) as justification for over-specifying the concrete type.

5) I kept "read-cond-splicing" since it parallels the form of "unquote-splicing". Seems that those should be consistent.

Comment by Luke VanderHart [ 30/Jan/15 9:03 PM ]

clj-1424-7.diff contains Rich's "reader-conditionals" proposal.

Comment by Alex Miller [ 06/Feb/15 3:45 PM ]

ReaderConditional / TaggedLiteral
1) when patch applied I see some whitespace errors in here, also line endings seem different, might want to check it
2) a common pattern in other Java classes is private constructor and public static create() method
3) could use Util.hash() to clean up the "null->0" logic in hashCode()

LispReader
4) adds unused import: java.util.Iterator
5) it looks like returnOn flag could just be collapsed into checking if returnOnChar is non-null?
6) in readCondDelimited, EOF and FINISHED are never used and can be removed presumably

Comment by Luke VanderHart [ 07/Feb/15 1:49 PM ]

I have attached clj-1424-8.diff, which addresses your most recent comments, Alex. I formatted it using `git format patch` instead of `git diff` so it should have the email address added correctly.

Your comments are all addressed, with the exception of returnOn. I don't think that can be collapsed. You really need two values: one to say what character should cause a return, and one to say what value should be returned in that scenario. You could use a convention on the return value, I suppose, (e.g, null means a completed read) but there's already precedent for passing in the value to be returned (namely, eofValue).

Comment by Alex Miller [ 09/Feb/15 9:49 AM ]

Looks good. I think I actually mis-read what returnOn was doing, so np on that. I still see the whitespace issues and the CR/LF in those two files. Were you going to change those?

Comment by Alex Miller [ 09/Feb/15 4:24 PM ]

Added new -9 patch that squashes the last patch but is otherwise identical. The older patches in that diff were the source of still seeing whitespace errors on apply.

Comment by Rich Hickey [ 20/Feb/15 8:09 AM ]

I think in the first iteration we should allow reader conditionals only in .cljc files, and support only standard features :clj, :cljs and :clr.

Comment by Andy Fingerhut [ 27/Feb/15 10:35 AM ]

Comment on -10 patch:

The doc string additions for clojure.core/read are a bit cryptic for those not already steeped in the details. It would be good to at least mention 'reader conditionals' so people have something to Google for. Current additions:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow, or :preserve to keep all branches as data
+    :features - persistent set of feature keywords
+    :eof - on eof, return value unless :eofthrow, then throw

Alternate suggestion:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow to process reader conditionals, or :preserve to keep all branches as data.  Throw if reader conditional found and this key not present.
+    :features - persistent set of feature keywords for processing reader conditionals.
+    :eof - on eof, return value unless :eofthrow, then throw
Comment by Alex Miller [ 27/Feb/15 10:39 AM ]

Thanks Andy, will consider. Indeed, this will have lengthier docs on http://clojure.org/reader so I was trying to just hit the surface but that's a reasonable suggestion.





[CLJ-1602] vals and keys return values should implement IReduceInit or Iterable Created: 25/Nov/14  Updated: 15/Jan/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File clj-1602-2.diff     File clj-1602-3.diff     File clj-1602-4.diff     File clj-1602.diff    
Patch: Code and Test
Approval: Vetted

 Description   
  • with generative tests
  • with perf demos

Background: clojure.core/keys calls RT.keys(Object) calls APersistentMap.KeySeq.create(ISeq). RT.keys() creates a sequence (of Map.Entry objects) and KeySeq just wraps it, calling .keyValue(). There is an equivalent vals -> RT.vals() -> ValSeq path. Both of these seq impls extend ASeq and provide Iterable implementations via SeqIterator (iterator wrapped over the seq).

Approach: The important thing here is to avoid creating the sequence and instead directly iterate/reduce over the map. Noting that CLJ-1499 provides support for making PHM directly Iterable and that KeySeq/ValSeq already implement Iterable, I chose to focus on making the instances returned from keys and vals support Iterable directly on the underlying map instead of via the seq.

RT.keys()/vals() created the seq and passed it to KeySeq/ValSeq which made it too late to directly cover the original map iterator. There are a few places that rely on passing a seq of Map.Entry to keys/vals (not just a map instance), so I check for IPersistentMap and in that case pass it directly to a new KeySeq factory method that remembers both the Iterable and the ISeq. There is also support in here for using a direct key/value iterator via IMapIterable as introduced in CLJ-1499.

Questions/notes:

  • Could potentially check for Map or Iterable instead of IPersistentMap in RT.keys()/vals(). Not sure how common it is to pass normal Java maps to keys/vals.
  • The direct Iterable support vanishes once you move off the head of the keys or vals seq. So (rest (keys map)) does not have Iterable support. This is not really possible unless you hold an Iterator and advance it along with the seq, but that seemed to introduce all sorts of possibilities for badness. Since maps are unordered, it seems weird to rely on any ordering or processing only parts of any map, so I suspect doing this would be quite rare.
  • This patch depends on CLJ-1499 for IMapIterable.

Performance: I tested perf using criterium to benchmark as follows:

(use 'criterium.core)
(def m (zipmap (range 1000) (range 1000)))
(bench (reduce + (keys m)))
(bench (reduce + (vals m)))
(bench expr) 1.6.0 1.7.0-alpha5 alpha5 + clj-1499 + patch
(reduce + (keys m)) 69 µs 73 µs 44 µs
(reduce + (vals m)) 75 µs 77 µs 50 µs

Tests:

  • I added some basic tests for subseq and rsubseq as those both rely on the somewhat special behavior of keys accepting a seqable of Map.Entry objects (not just a map itself). There were no other tests for subseq or rsubseq already present.

Patch: clj-1602-4.diff - requires CLJ-1499 patch first (for IMapIterable)

Screened by:

  • the changes in CollReduce are unlikely to stand but are currently essential


 Comments   
Comment by Alex Miller [ 25/Nov/14 11:53 AM ]

Could leverage CLJ-1499 for the bulk of this, may pull that back from 1.8 into 1.7. Waiting on further work till that's answered.

Comment by Alex Miller [ 03/Dec/14 11:24 PM ]

I also have a patch that extends the CLJ-1499 iterators to support providing both key and val iterators that do not require creating and unpacking a Map.Entry. Unfortunately I only saw times that were ~48 µs on the perf benchmark in the description, so it's not a huge benefit (short-lived object allocation is cheap).

Comment by Alex Miller [ 09/Jan/15 9:15 AM ]

waiting on 1499 mods

Comment by Alex Miller [ 14/Jan/15 5:42 PM ]

Updated patch based on latest CLJ-1499-v10 patch.

Comment by Michael Blume [ 15/Jan/15 1:48 AM ]

I find it concerning that the v9 patch passed generative tests, shouldn't we have seen that?

Comment by Alex Miller [ 15/Jan/15 7:21 AM ]

Actually I did see it in the generative tests for this ticket so I moved those tests into the CLJ-1499 patch.

Comment by Michael Blume [ 15/Jan/15 10:10 AM ]

Ah, cool =)





Generated at Fri Feb 27 23:22:21 CST 2015 using JIRA 4.4#649-r158309.