<< Back to previous view

[CLJ-1706] top level conditional splicing ignores all but first element Created: 15/Apr/15  Updated: 21/May/15  Resolved: 21/May/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 2
Labels: reader

Attachments: Text File 0001-CLJ-1706-Make-top-level-reader-conditional-splicing-.patch     Text File clj-1706-2.patch     Text File clj-1706-3.patch     Text File clj-1706-make-error.patch    
Patch: Code and Test
Approval: Ok

 Description   
user=> (def a (java.io.PushbackReader. (java.io.StringReader. "#?@(:clj [1 2])")))
#'user/a
user=> (read {:read-cond :allow} a)
1
user=> (read {:read-cond :allow} a)
RuntimeException EOF while reading  clojure.lang.Util.runtimeException (Util.java:221)

Currently the reader is stateless (read is a static call) but utilizes a stateful reader (and has a few hooks into compiler/runtime state for autoresolving keywords, etc). If the call into the reader at the top level calls a splicing reader conditional, then only the first one will be returned. The remaining forms are stranded in the pendingForms list and will be lost for subsequent reads.

Approach: Make top level reader conditional splicing an error:

user=> (read-string {:read-cond :allow} "#?@(:clj [1 2])")
IllegalStateException Reader conditional splicing not allowed at the top level.  clojure.lang.LispReader.read (LispReader.java:200)

Patch: clj-1706-2.patch

Alternatives:

1. Make top-level reader conditional splicing an error and throw an exception. (SELECTED)

2. Allow the caller to pass in a stateful collection to catch or return the pendingForms. This changes the effective calling API for the reader. You would only need to do this in the cases where reader conditionals were allowed/preserved.

3. Add a static (or threadlocal?) pendingForms attribute to the reader to capture the pendingForms across calls. A static field would have concurrency issues - anyone using the reader across threads would get cross-talk in this field. The pendingForms could be threadlocal which would probably achieve separation in the majority of cases, but also creates a number of lifecycle questions about those forms. When do they get cleared or reset? What happens if reading the same reader happens across threads? Another option would be an identity map keyed by reader instance - would need to be careful about lifecycle management and clean up, as it's basically a cache.

4. Add more state into the reader itself to capture the pendingForms. The reader interfaces and hierarchy would be affected. This would allow the reader to stop passing the pendingForms around inside but modifies the interface in other ways. Again, this would only be needed for the specific case where reader conditionals are allowed so other uses could continue to work as is?

5. If read is going to exit with pendingForms on the stack, they could be printed and pushed back onto the reader. This adds new read/print roundtrip requirements on things at the top level of reader conditionals that didn't exist before.

6. Wrap spliced forms at the top level in a `do`. This seems to violate the intention of splicing reader conditional to read as spliced since it is not the same as if those forms were placed separately in the input stream.



 Comments   
Comment by Alex Miller [ 15/Apr/15 2:05 PM ]

pulling into 1.7 so we can discuss

Comment by Alex Miller [ 24/Apr/15 11:18 AM ]

Compiler.load() makes calls into LispReader.read() (static call). If the reader reads a top-level splicing conditional read form, that will read the entire form, then return the first spliced element. when LispReader.read() returns, the list carrying the other pending forms is lost.

I see two options:

1) Allow the compiler to call the LispReader with a mutable pendingForms list, basically maintaining that state across the static calls from outside the reader. makes the calling model more complicated and exposes the internal details of the pendingform stuff, but is probably the smaller change.

2) Enhance the LineNumberingPushbackReader in some way to remember the pending forms. This would probably allow us to remove the pending form stuff carried around throughout the LispReader and retain the existing (sensible) api. Much bigger change but probably better direction.

Comment by Nicola Mometto [ 24/Apr/15 11:24 AM ]

What about simply disallowing cond-splicing when top level?
Both proposed options are breaking changes since read currently only requires a java.io.PushbackReader

Comment by Alex Miller [ 24/Apr/15 11:42 AM ]

We want to allow top-level cond-splicing.

Comment by Nicola Mometto [ 24/Apr/15 11:52 AM ]

Would automatically wrapping a top-level cond-splicing in a (do ..) form be out of the question?

I'm personally opposed to supporting this feature as it would change the contract of c.c/read, complicate the implementation of LineNumberingPushbackReader or LispReader and complicate significantly the implementaion of tools.reader's reader types, for no significant benefit.
Is it really that important to be able to write

#~@(:clj [1 2])

rather than

(do #~@(:clj [1 2]))

?

Comment by Rich Hickey [ 18/May/15 10:10 AM ]

Please "Make top-level reader conditional splicing an error and throw an exception" for now.

Comment by Nicola Mometto [ 19/May/15 8:50 AM ]

Might be too late since Rich already gave the OK but the proposed patch doesn't prevent single-element top level conditional splicing forms.
e.g

;; this fails
#~@(:clj [1 2])
;; this works
#~@(:clj [1])

Is this intended?

Comment by Alex Miller [ 19/May/15 11:21 AM ]

New -2 patch catches reader conditional splice of 0 or 1 element.

Comment by Nicola Mometto [ 19/May/15 11:59 AM ]

Attached alternative patch that is less intrusive than clj-1706-2.patch

Comment by Alex Miller [ 19/May/15 2:54 PM ]

clj-1706-3.patch is identical to 0001-CLJ-1706Make-top-level-reader-conditional-splicing.patch but with one whitespace change reverted. Marking latest as screened.

Comment by Alex Miller [ 20/May/15 8:38 AM ]

Rich didn't like the dynvar in -3, so switching back to -2.





[CLJ-1669] Move LazyTransformer to an iterator strategy, extend eduction capabilities Created: 04/Mar/15  Updated: 19/May/15  Resolved: 19/May/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: transducers

Attachments: Text File clj-1669-2.patch     Text File clj-1669-3.patch     Text File clj-1669-4.patch     Text File clj-1669-5.patch     Text File clj-1669-6.patch     Text File clj-1669.patch    
Patch: Code and Test
Approval: Ok

 Description   
  • LazyTransformer does a lot of work to be a seq. Instead, switch to creating a transforming iterator.
  • Change sequence to wrap iterator-seq around the transforming iterator.
  • Change the iterator-seq implementation to be chunked. IteratorSeq will no longer be used but is left in case of regressions for now.
  • Change Eduction to provide iteration directly via the transforming iterator.
  • Extend eduction to support multiple xforms.

Performance:

Compared:

  • alpha5 == before any change
  • alpha6 == after clj-1669-6.patch was applied
  • beta3 == latest, includes range enhancement, expanding mapcat enhancement, etc

;; using java 1.8
(use 'criterium.core)
(def s (range 1000))
(def v (vec s))
(def s50 (range 50))
(def v50 (vec s50))

expr alpha5 s alpha5 v alpha6 s alpha6 v beta3 s beta3 v
non-chunking transform            
(into [] (->> s (interpose 5) (partition-all 2))) 432 us 437 us 413 us 411 us 353 us 414 us
(into [] (->> s (eduction (interpose 5) (partition-all 2)))) * 117 us 118 us 117 us 113 us 116 us 113 us
1 chunking transform            
(into [] (map inc s)) 43 us 45 us 35 us 31 us 32 us 36 us
(into [] (map inc) s) 19 us 19 us 18 us 18 us 18 us 16 us
(into [] (sequence (map inc) s)) 100 us 54 us 97 us 65 us 66 us 64 us
(into [] (eduction (map inc) s)) 24 us 19 us 24 us 20 us 20 us 19 us
(doall (map inc (eduction (map inc) s))) 219 us 203 us 98 us 78 us 93 us 89 us
2 chunking transforms        
(into [] (map inc (map inc s))) 53 us 56 us 53 us 54 us 61 us 58 us
(into [] (comp (map inc) (map inc)) s) 13 us 26 us 30 us 26 us 31 us 31 us
(into [] (sequence (comp (map inc) (map inc)) s)) 111 us 64 us 98 us 73 us 83 us 80 us
(into [] (eduction (map inc) (map inc) s)) * 58 us 31 us 58 us 31 us 30 us 31 us
(doall (map inc (eduction (map inc) (map inc) s))) * 240 us 212 us 114 us 93 us 105 us 102 us
expand transform            
(into [] (mapcat range (map inc s50))) 74 us 73 us 67 us 68 us 37 us 39 us
(into [] (sequence (comp (map inc) (mapcat range)) s50)) 111 us 102 us 166 us 159 us 99 us 98 us
(into [] (eduction (map inc) (mapcat range) s50)) * 65 us 64 us 57 us 56 us 27 us 27 us
materialized eduction            
(sort (eduction (map inc) s)) ERR ERR 99 us 77 us 77 us 77 us
(->> s (filter odd?) (map str) (sort-by last)) 1.10 ms 1.25 ms 1.15 ms 1.19 ms 1.14 ms 1.13 ms
(->> s (eduction (filter odd?) (map str)) (sort-by last)) ERR ERR 1.18 ms 1.15 ms 1.13 ms 1.13 ms
  • used comp to combine xforms as eduction only took one in the before case

Patch: clj-1669-6.patch

Screened by:



 Comments   
Comment by Michael Blume [ 05/Mar/15 3:52 PM ]

Nice, I like the direction on this.

CLJ-1515 currently breaks this patch (LongRange cannot be converted to Iterable), but I imagine that'll get better when it absorbs the changes from CLJ-1603

Comment by Alex Miller [ 06/Mar/15 8:11 AM ]

Yeah. colls should be mapped through RT.iter() to catch more cases.

Comment by Alex Miller [ 06/Mar/15 9:52 AM ]

To do:

  • remove Seqable from Eduction
  • support Iterable in RT.toArray()
  • more eduction pipeline tests that require realization at end
Comment by Alex Miller [ 06/Mar/15 1:00 PM ]

Perf numbers show pretty worse results from sequence, will dig in further.

Comment by Alex Miller [ 13/Mar/15 7:41 AM ]

For the s timings, we've actually introduced more steps into the stack:

OLD reduce with s:

LazyTransformer
   seq (range) - every transformation is another layer here

NEW reduce with s:

IteratorSeq 
  TransformingIterator (handles N transformations in 1 step)
    SeqIterator
      seq (range)
Comment by Alex Miller [ 20/Mar/15 10:08 AM ]

Look at perf for:

  • ->> eduction transformation
  • transformation comparison that doesn't support chunking
  • more into vector iteration case
Comment by Alex Miller [ 21/Mar/15 8:45 AM ]

The -5 patch is same -3 except all uses of IteratorSeq have been replaced with a ChunkedCons that is effectively a chunked version of the old IteratorSeq. While no one calls it, I left IteratorSeq in the code base in case of regression.

Generally, the chunked iterator seq reduces the cost in a number of the worst cases and also is a clear benefit in making seqs over a result of eduction or sequence faster to traverse (as they are now chunked).

I think the one potential issue is that seqs over iterators are now chunked when they were not before which could change programs that expect their stateful iterator to be traversed one at a time. This change could be isolated to just to sequence and seq-iterator and mitigated by not changing RT.seqFrom() and seq-iterator to use the new chunking behavior only in sequence and/or with a new chunked-iterator-seq to make it more explicit. The sequence over xf is new so no possible regression there, everything else would just be opt-in.

Comment by Rich Hickey [ 27/Mar/15 9:49 AM ]

push as is but leave unresolved, for perf tweaks

Comment by Alex Miller [ 27/Mar/15 10:15 AM ]

clj-1669-6 is identical to clj-1669-5 but removes two commented out debugging lines that were inadvertently included.

Comment by Alex Miller [ 18/May/15 9:47 AM ]

updated for beta3 numbers





[CLJ-1717] Compiler casts System properties to String without prior type check Created: 29/Apr/15  Updated: 18/May/15  Resolved: 18/May/15

Status: Resolved
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Critical
Reporter: Laurent Petit Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: compiler, regression
Environment:

occurs for any JVM / Operating System. Encountered when using Clojure inside the Equinox OSGi framework (reference OSGi implementation used e.g. by Eclipse IDE)


Attachments: Text File clj-1717-1.patch    
Patch: Code
Approval: Triaged

 Description   

The Clojure Compiler loops over all System Properties through their EntrySet.
This interface allows for non-String properties to be found. It is leveraged by the Equinox OSGi framework (reference OSGi implementation used e.g. by the Eclipse IDE).

This means that since this new code has been introduced in the Clojure compiler, the official Clojure Jar cannot be used inside Eclipse.

The problem is that Counterclockwise, the Eclipse-based Clojure IDE, at least, is affected, since it is developed with Clojure code.

The attached patch solves the issue by skipping System Properties key/value pairs whose values aren't Strings.



 Comments   
Comment by Alex Miller [ 29/Apr/15 2:40 PM ]

Probably a regression related to CLJ-1274.

Comment by Laurent Petit [ 29/Apr/15 2:45 PM ]

Yes, CLJ-1274 moved the code from the Compile.java class to the Compiler.java class. The code already had the cast problem, but it was probably not an issue at runtime for CCW when only present in Compile.java

Comment by Alex Miller [ 29/Apr/15 3:00 PM ]

It looks like maybe this is considered a bug in Eclipse land (which it probably should be)?
https://bugs.eclipse.org/bugs/show_bug.cgi?id=445122

Comment by Laurent Petit [ 29/Apr/15 3:21 PM ]

I agree that it is an abuse on the side of Eclipse-and of a System properties hole that allows to put non-String values and then browse them through methods not using Generics.

From the last comments, it appears that it has only be released very recently, and for only the last stable version of Eclipse. So I expect CCW users with disparate Eclipse versions installed to have the problem if I do not apply the patch to my custom version of Clojure for some more months.

I can live with that if you reject the issue.

Comment by Laurent Petit [ 01/May/15 7:28 AM ]

What is the status for this issue? I understand that being considered an abuse from the OSGi implementation of the System properties contract, you might not want to add code for dealing with this in Compiler.
On the other hand, it's a small 3-lines check with an explanation of why it's done that way, so...

Anyway, feel free to get rid of it and close it so it doesn't get in the way if you think it's not worth the trouble.

Comment by Alex Miller [ 01/May/15 8:44 AM ]

We haven't had a chance to discuss it further yet. There is reluctance to change it if it's not necessary.





[CLJ-1731] Transient maps can't be assoc!'d to contain more than 8 elements. Created: 17/May/15  Updated: 17/May/15  Resolved: 17/May/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Task Priority: Major
Reporter: Peter Herniman Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: assoc!, transient
Environment:

Linux, Fedora 20
Clojure 1.6.0
OpenJDK 64-Bit Server VM 1.7.0_79-mockbuild_2015_04_15_06_33-b00



 Description   

(let [x (transient {})] (dotimes [i 30] (assoc! x i 0)) (persistent! x))
Will result in

{0 0, 1 0, 2 0, 3 0, 4 0, 5 0, 6 0, 7 0}

instead of the expected 30 element map.

I'm not sure if this is fixed in the most recent version (development) but it doesn't work in 1.6.0.



 Comments   
Comment by Alex Miller [ 17/May/15 6:54 AM ]

This is not correct usage of transient maps. You must use the return value of assoc! for further updates, not bash the same instance. In other words, the update model is the same as with normal maps. There is an outstanding ticket to tweak the docstring slightly to make this clearer.

See a similar example with transient vectors at http://clojure.org/transients.

Comment by Peter Herniman [ 17/May/15 6:23 PM ]

Ah ok, looking at the example on clojure.org clears things up a lot.
I'm not too sure if the docstring does need updating, the tutorial on transients does state that you need to capture the return value explicitly.
Thanks!





Generated at Sat May 23 03:51:51 CDT 2015 using JIRA 4.4#649-r158309.