<< Back to previous view

[CLJ-1436] Deref throws an unhelpful error message when used on something not dereferencable Created: 03/Jun/14  Updated: 26/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Phillip Lord Assignee: Unassigned
Resolution: Duplicate Votes: 3
Labels: errormsgs, newbie

Attachments: Text File deref.patch     File tests-patch.diff    
Patch: Code and Test

 Description   

Consider the following code:

(def x 1)
(def y (ref 2))

(+ @x y)

Clojure throws a ClassCastException on cast to Future. This is a very unhelpful error message; why a Future, why not Ref, Atom etc. It would be nice if this failed more gracefully.



 Comments   
Comment by Tobias Kortkamp [ 15/Sep/14 2:08 PM ]

Attached a patch with better error messages for deref. The above example now throws:

IllegalArgumentException class java.lang.Long is not derefable  clojure.core/deref (core.clj:2211)

and e.g.

(deref (delay 1) 500 :foo)

throws

IllegalArgumentException class clojure.lang.Delay is not derefable with a timeout  clojure.core/deref (core.clj:2222)
Comment by Mark Nutter [ 16/Sep/14 6:57 PM ]

Patch file clj-1436-patch-2014-09-16.diff updates the deref function so that it checks whether its arg is a future before sending it to deref-future. It also updates the deref function to provide clearer error messages. If the arg is not a future, and does not implement IDeref, the patched version of deref throws an IllegalArgumentException with a message that the arg cannot be dereferenced because it is not a ref/future/etc.

Comment by Mark Nutter [ 16/Sep/14 7:00 PM ]

Oops, I had this page open from yesterday and didn't see the patch submitted by Tobias. His has everything mine does, so I'll withdraw mine.

Comment by Mark Nutter [ 17/Sep/14 5:13 AM ]

One suggestion: the error message might sound better as "IllegalArgumentException cannot dereference clojure.lang.Delay; not a future or reference type".

Comment by Mark Nutter [ 17/Sep/14 5:44 AM ]

Tobias' patch does not contain the tests I had in mine, so I'm re-submitting just the tests as tests-patch.diff. If you install the tests patch without installing the deref patch, the tests will fail with the error message "Wrong exception type when passing non-IDeref/non-future to deref/@". Applying the deref patch as well will allow the tests to pass.

Comment by Alex Miller [ 28/Dec/14 11:06 AM ]

This patch does not seem to consider performance implications of adding this check and its impact on inlining.

Comment by Nicola Mometto [ 25/Mar/15 5:45 PM ]

Dupe of CLJ-1162

Comment by Andy Fingerhut [ 26/Mar/15 10:13 AM ]

Maybe it shouldn't be the deciding factor when selecting which duplicate ticket to close, but CLJ-1162 has 1 vote, this one has 3. Closing this one as the duplicate is discarding those votes, unless (a) they are copied over to the older ticket, or (b) consider closing the older ticket as a duplicate of this one instead.

Comment by Alex Miller [ 26/Mar/15 10:59 AM ]

I pondered that myself - I usually look at votes, current status, and patch. In this case, I liked the patch on CLJ-1162 better (although they were pretty similar).





[CLJ-1671] Clojure stream socket repl Created: 09/Mar/15  Updated: 26/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: repl

Attachments: Text File clj-1671-2.patch     Text File clj-1671-3.patch     Text File clj-1671-4.patch    
Patch: Code
Approval: Incomplete

 Description   

Programs often want to provide REPLs to users in contexts when a) network communication is desired, b) capturing stdio is difficult, or c) when more than one REPL session is desired. In addition, tools that want to support REPLs and simultaneous conversations with the host are difficult with a single stdio REPL as currently provided by Clojure.

The goal is to provide a simple streaming socket repl as part of Clojure. The socket repl should support both program-to-program communication (by exchanging data) and human-readable printing (which mirrors current behavior). The socket repl server will be started only when supplying a port to clojure.main or by explicitly starting it by calling the provided function.

Each socket connection will be given its own repl context and unique starting user namespace (user, user1, user2, etc) with separate repl stack and proper bindings. On session termination, the namespace will be removed. *in* and *out* will be bound to incoming and outgoing streams. Tools can communicate with the runtime while also providing a user repl by opening two connections to the same server.

There are two known cases where the repl interprets non-readable objects and prints them for human consumption in a non-readable way: objects with no specific print-method and when handling Throwables. Both cases (Object and Throwable) now have a print-method implementation to return a tagged-literal representation. By default the socket repl will print exceptions with the print-method data form. Optionally, the user can set a custom repl exception printer. A function that provides the current human readable exception printing will be provided.

Problems:

  1. Socket server to accept connections and serve a repl to a client
    • Runtime configuration via data
  2. Client repl sessions should be independent
    • Separate user namespace
    • Separate bindings
    • Namespace removed on client shutdown
    • Communicate solely via data for program-to-program communication
  3. Stdio has both out and err streams but socket has only single out
  4. Repls should be nestable
    • Repl within a repl binds streams appropriately
    • Means of control (exit)

Features:

  1. Printing as data
    • Of object without print-method: as #object
    • Of Throwable: as #error
  2. Start socket server from command line
    • Configure: host, port, whether to prompt, error printing
  3. Start socket server programmatically
    • Configure: host, port, whether to prompt, error printing, whether to bind err to out
    • Control: close returned socket server to stop listening
  4. Start stdio repl from command line
    • Configure: whether to prompt, error printing
  5. On socket client accept
    • Create new user namespace
    • Bind error printer according to server config
  6. In socket client repl
    • Bind new error printer function
  7. On socket client disconnect
    • Remove user namespace
    • Close socket

Impl notes: This adds a new -s option to clojure.main that will start a socket server listening on a given host:port. Each client is given a new userN namespace (starting from user1). It binds *in*, *out*, and *err*. Each client connection consumes a daemon thread named "Socket REPL Client N" (matching the user namespace). On client disconnect, the user namespace is removed and thread will die.

There are two system properties that can be used to control whether the prompt and the default error printer:

  • clojure.repl.socket.prompt (default=false) - whether to print prompts
  • clojure.repl.socket.err-printer (default=clojure.main/err->map) - function to format exceptions

The existing stdio repl behaves the same as before, but it's behavior can be influenced by two new similar system properties:

  • clojure.repl.stdio.prompt (default=true)
  • clojure.repl.socket.err-printer (default=clojure.main/err-print)

You can also start a socket repl server programmatically (shown here with all kwarg options - pick the ones you need):

(def ss 
  (clojure.main/socket-repl-server 
    :host "localhost"                   ;; default=<loopback>, accepts InetAddress or String
    :port 5555                          ;; default=0 (ephemeral)
    :use-prompt true                    ;; default=false
    :bind-err true                      ;; default=true, to bind \*err* to \*out*
    :err-printer clojure.main/err->map  ;; default=clojure.main/err->map, print Throwable to \*out*
    ))

The server socket is returned. Closing it will stop listening on the port (existing client connections will still be alive).

If you want to test with the server port, telnet makes a great client:

;; term 1:
$ java -cp target/classes -Dclojure.repl.socket.prompt=true clojure.main -s 127.0.0.1:5555

;; term 2:
$ telnet 127.0.0.1 5555
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
user1=> (+ 1 1)
2
user1=> (/ 1 0)
#error {:cause "Divide by zero",
 :via
 [{:type java.lang.ArithmeticException,
   :message "Divide by zero",
   :at [clojure.lang.Numbers divide "Numbers.java" 158]}],
 :trace
 [[clojure.lang.Numbers divide "Numbers.java" 158]
  [clojure.lang.Numbers divide "Numbers.java" 3808]
  [user1$eval1 invoke "NO_SOURCE_FILE" 1]
  [clojure.lang.Compiler eval "Compiler.java" 6784]
  [clojure.lang.Compiler eval "Compiler.java" 6747]
  [clojure.core$eval invoke "core.clj" 3078]
  [clojure.main$repl$read_eval_print__8287$fn__8290 invoke "main.clj" 265]
  [clojure.main$repl$read_eval_print__8287 invoke "main.clj" 265]
  [clojure.main$repl$fn__8296 invoke "main.clj" 283]
  [clojure.main$repl doInvoke "main.clj" 283]
  [clojure.lang.RestFn invoke "RestFn.java" 619]
  [clojure.main$socket_repl_server$fn__8342$fn__8344 invoke "main.clj" 450]
  [clojure.lang.AFn run "AFn.java" 22]
  [java.lang.Thread run "Thread.java" 724]]}
user1=> (println "hello")
hello
nil

A dynamic var clojure.main/*err-printer* is provided to customize printing of exceptions. It's bound by :err-printer if invoked programmatically or the system property if started from the command line, but it can be dynamically rebound during the session if desired:

user1=> (set! clojure.main/*err-printer* clojure.main/err-print)
#object[clojure.main$err_print 0x317bee01 "clojure.main$err_print@317bee01"]
user1=> (/ 1 0)
ArithmeticException Divide by zero  clojure.lang.Numbers.divide (Numbers.java:158)

Patch: clj-1671-4.patch



 Comments   
Comment by Timothy Baldridge [ 09/Mar/15 5:50 PM ]

Could we perhaps keep this as a contrib library? This ticket simply states "The goal is to provide a simple streaming socket repl as part of Clojure." What is the rationale for the "part of Clojure" bit?

Comment by Alex Miller [ 09/Mar/15 7:33 PM ]

We want this to be available as a Clojure.main option. It's all additive - why wouldn't you want it in the box?

Comment by Timothy Baldridge [ 09/Mar/15 10:19 PM ]

It never has really been too clear to me why some features are included in core, while others are kept in contrib. I understand that some are simply for historical reasons, but aside from that there doesn't seem to be too much of a philosophy behind it.

However it should be noted that since patches to clojure are much more guarded it's sometimes nice to have certain features in contrib, that way they can evolve with more rapidity than the one release a year that clojure has been going through.

But aside from those issues, I've found that breaking functionality into modules forces the core of a system to become more configurable. Perhaps I would like to use this repl socket feature, but pipe the data over a different communication protocol, or through a different serializer. If this feature were to be coded as a contrib library it would expose extension points that others could use to add additional functionality.

So I guess, all that to say, I'd prefer a tool I can compose rather than a pre-built solution.

Comment by Rich Hickey [ 10/Mar/15 6:25 AM ]

Please move discussions on the merits of the idea to the dev list. Comments should be about the work of resolving the ticket, approach taken by the patch, quality/perf issues etc.

Comment by Colin Jones [ 11/Mar/15 1:33 PM ]

I see that context (a) of the rationale is that network communication is desired, which sounds to me like users of this feature may want to communicate across hosts (whether in VMs or otherwise). Is that the case?

If so, it seems like specifying the address to bind to (e.g. "0.0.0.0", "::", "127.0.0.1", etc.) may become important as well as the existing port option. This way, someone who wants to communicate across hosts (or conversely, lock down access to local-only) can make that decision.

Comment by Alex Miller [ 11/Mar/15 2:07 PM ]

Colin - agreed. There are many ways to potentially customize what's in there so we need to figure out what's worth doing, both in the function and via the command line.

I think address is clearly worth having via the function and possibly in the command line too.

Comment by Kevin Downey [ 11/Mar/15 5:49 PM ]

I find the exception printing behavior really odd. for a machine you want an exception as data, but you also want some indication of if the data is an error or not, for a human you wanted a pretty printed stacktrace. making the socket repl default to printing errors this way seems to optimize for neither.

Comment by Rich Hickey [ 12/Mar/15 12:29 PM ]

Did you miss the #error tag? That indicates the data is an error. It is likely we will pprint the error data, making it not bad for both purposes

Comment by Alex Miller [ 13/Mar/15 11:29 AM ]

New -4 patch changes:

  • clojure.core/throwable-as-map now public and named clojure.core/Throwable->map
  • catch and ignore SocketException without printing in socket server repl (for client disconnect)
  • functions to print as message and as data are now: clojure.main/err-print and clojure.main/err->map. All defaults and docs updated.
Comment by David Nolen [ 18/Mar/15 12:44 PM ]

Is there any reason to not allow supplying :eval in addition to :use-prompt? In the case of projects like ClojureCLR + Unity eval generally must happen on the main thread. With :eval as something which can be configured, REPL sessions can queue forms to be eval'ed with the needed context (current ns etc.) to the main thread.

Comment by Kevin Downey [ 20/Mar/15 2:12 PM ]

I did see the #error tag, but throwables print with that tag regardless of if they are actually thrown or if they are just the value returned from a function. Admittedly returning exceptions as values is not something generally done, but the jvm does distinguish between a return value and a thrown exception. Having a repl that doesn't distinguish between the two strikes me as an odd design. The repl you get from clojure.main currently prints the message from a thrown uncaught throwable, and on master prints with #error if you have a throwable value, so it distinguishes between an uncaught thrown throwable and a throwable value. That obviously isn't great for tooling because you don't get a good data representation in the uncaught case.

It looks like the most recent patch does pretty print uncaught throwables, which is helpful for humans to distinguish between a returned value and an uncaught throwable.

Comment by Kevin Downey [ 25/Mar/15 1:10 PM ]

alex: saying this is all additive, when it has driven changes to how things are printed, using the global print-method, rings false to me

Comment by Sam Ritchie [ 25/Mar/15 1:15 PM ]

This seems like a pretty big last minute addition for 1.7. What's the rationale for adding it here vs deferring to 1.8, or trying it out as a contrib first?

Comment by Alex Miller [ 25/Mar/15 2:13 PM ]

Kevin: changing the fallthrough printing for things that are unreadable to be readable should be useful regardless of the socket repl. It shouldn't be a change for existing programs (unless they're relying on the toString of objects without print formats).

Sam: Rich wants it in the box as a substrate for tools.

Comment by Alex Miller [ 26/Mar/15 10:03 AM ]

Marking incomplete, pending at least the repl exit question.





[CLJ-878] DispatchReader always calls CtorReader when no dispatch macro found Created: 15/Nov/11  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Greg Chapman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader
Environment:

Windows 7, Java 1.7



 Description   

At the REPL, I accidentally typed:

user=> # "\w+"
RuntimeException Unsupported escape character: \w clojure.lang.Util.runtimeExce
ption (Util.java:156)
#<core$PLUS clojure.core$PLUS@6b7dc78>

You can see the confusing result (the REPL is also left in an unclosed string). After looking at LispReader.java, it seems to me DispatchReader ought to at least check for whitespace before calling CtorReader (perhaps better would be to check for a valid symbol character). Another example:

user=> (defrecord X [x])
user.X
user=> # "user.X"[5]
#user.X{:x 5}

I'm assuming the fact that the above successfully creates an X is accidental.



 Comments   
Comment by Nicola Mometto [ 25/Mar/15 5:33 PM ]

No longer reproducible

Comment by Alex Miller [ 25/Mar/15 9:33 PM ]

Seems at least partially reproducible to me. I see same behavior on first example. Second example works with a symbol {{# user.X[5]}} but fails with a reasonable error on the given case.

Comment by Nicola Mometto [ 25/Mar/15 9:56 PM ]

Ah, right.
The only way to fix the first error is to make `# "\w+"` a valid regex literal, that is, allowing whitespaces between # and the next dispatch char.
Is this something we want?

Comment by Alex Miller [ 25/Mar/15 10:06 PM ]

dunno





[CLJ-1515] range should return something that implements IReduce Created: 29/Aug/14  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Timothy Baldridge Assignee: Alex Miller
Resolution: Unresolved Votes: 2
Labels: None

Attachments: Text File clj-1515-10.patch     Text File clj-1515-11.patch     Text File clj-1515-12.patch     Text File clj-1515-2.patch     Text File clj-1515-3.patch     Text File clj-1515-4.patch     Text File clj-1515-5.patch     Text File clj-1515-6.patch     Text File clj-1515-7.patch     Text File clj-1515-8.patch     Text File clj-1515-9.patch     Text File CLJ-1515-deftype2.patch     Text File CLJ-1515-deftype3.patch     Text File CLJ-1515-deftype-nostructural-dup.patch     Text File CLJ-1515-deftype.patch     Text File clj-1515.patch     File patch.diff     File range-patch3.diff     File reified-range4.diff    
Patch: Code and Test
Approval: Vetted

 Description   

range should implement IReduce for fast reduction.

Patch: clj-1515-12.patch - Java range with long optimizations

Approach: The clj-1515-12 patch revives the unused clojure.lang.Range class. This class is similar in implementation to ChunkedCons. It differs by lazily constructing the first chunk (this is done in the existing impl by using a LazySeq wrapper) and by caching the next reference like other seq impls. The latter is done because range is frequently used in both chunked and unchunked traversal.

Additionally, 1515-12 contains an optimized version of range (LongRange) for the extremely common case where start, end, and step are all longs. This version uses primitive longs and primitive math and a customized long[] version of ArrayChunk for greater performance.

The special case of (range) is just handled with (iterate inc' 0) (which is further optimized for reduce in CLJ-1603).

Alternatives:

  • CLJ-1515-deftype3.patch took the approach of using deftype to create a seqable, reducible entity that was not actually a seq. Based on work for CLJ-1603, it was established that due to historical requirements, this is not viable and the entity returned for range should implement the ISeq and Collection interfaces directly.
  • clj-1515-11.patch uses the approach of a "split" implementation - a LazySeq that has a fast reduce path. This is a minimal patch that provides no perf difference for sequence usage but a moderate improvement for reduce paths. One important difference vs clj-1515-12 is that the fast reduce path is only obtained on the initial range. Once you walk off the head, the performance will be the same as prior (seq perf).

Performance

criterium quick-bench with java 1.8.0-b132

code 1.7.0-master Java clj-1515-12 (w/CLJ-1603) split clj-1515-11
(count (range (* 1024 1024))) 62 ms 0 ms 58 ms
(reduce + (map inc (range (* 1024 1024)))) 49 ms 38 ms 50 ms
(reduce + (map inc (map inc (range (* 1024 1024))))) 66 ms 60 ms 67 ms
(count (keep odd? (range (* 1024 1024)))) 76 ms 49 ms 67 ms
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 46 ms 29 ms 34 ms
(reduce + 0 (range (* 2048 1024))) 64 ms 26 ms 40 ms
(reduce + 0 (rest (range (* 2048 1024)))) 60 ms 17 ms 57 ms
(doall (range 0 31)) 1.20 µs 0.72 µs 1.15 µs
(doall (range 0 32)) 1.22 µs 0.82 µs 1.18 µs
(doall (range 0 4096)) 147 µs 105 µs 145 µs
(into [] (map inc (range 31))) 1.80 µs 1.32 µs 1.88 µs
(into [] (map inc) (range 31)) 1.55 µs 0.72 µs 1.02 µs
(into [] (range 128)) 4.90 µs 2.14 µs 3.27 µs
(doall (range 1/2 1000 1/3)) 1.29 ms 1.24 ms 1.28 µs
(doall (range 0.5 1000 0.33)) 136 µs 151 µs 151 µs
(into [] (range 1/2 1000 1/3)) 1.31 ms 1.18 ms 1.26 ms
(into [] (range 0.5 1000 0.33)) 151 µs 91 µs 130 µs
(count (filter odd? (take (* 1024 1024) (range)))) 194 ms 163 ms 176 ms
(transduce (take (* 1024 1024)) + (range)) 67 ms 34 ms 68 ms

Performance notes:

  • 1515-12 (Java) - significant improvements in virtually all use cases, both seq and reduce. The final two cases with (range) leverage optimizations from CLJ-1603, which was included for those timings.
  • 1515-11 (split) - a much smaller patch that gives no seq benefits and smaller reduction benefits. Only the head of the range will receive the perf benefits (see (reduce + 0 (rest (range (* 2048 1024))))).

Questions
(range) and supports auto-promotion towards infinity in this patch, which seems to be implied by the doc string but was not actually implemented or tested correctly afaict.



 Comments   
Comment by Alex Miller [ 29/Aug/14 3:19 PM ]

1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?

Comment by Timothy Baldridge [ 29/Aug/14 3:40 PM ]

1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Comment by Alex Miller [ 29/Aug/14 3:49 PM ]

6) might be useful if you're walking through it with nth, which hits count everytime, but doubt that's common
7) yep, reasonable

Comment by Andy Fingerhut [ 18/Sep/14 6:52 AM ]

I have already pointed out to Edipo in personal email the guidelines on what labels to use for Clojure JIRA tickets here: http://dev.clojure.org/display/community/Creating+Tickets

Comment by Timothy Baldridge [ 19/Sep/14 10:02 AM ]

New patch with IReduce directly on Range instead of relying on iterators

Comment by Alex Miller [ 01/Oct/14 2:00 PM ]

The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is.

Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.

Comment by Timothy Baldridge [ 03/Oct/14 10:00 AM ]

Looks like chunked seqs do make lazy seq code about 5x faster in these tests.

Comment by Ghadi Shayban [ 03/Oct/14 10:22 AM ]

I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.

Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

Comment by Timothy Baldridge [ 03/Oct/14 10:38 AM ]

I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

Comment by Alex Miller [ 03/Oct/14 10:46 AM ]

I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.

On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Comment by Ghadi Shayban [ 03/Oct/14 11:00 AM ]

Tim, I'm not suggesting unboxed math, but the singular fast-path of all-Longs that you and Alex describe. I mistakenly lower-cased Long/Float.

Comment by Timothy Baldridge [ 31/Oct/14 11:30 AM ]

Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.

Comment by Nicola Mometto [ 14/Nov/14 12:51 PM ]

As discussed with Tim in #clojure, the current patch should not change ArrayChunk's reduce impl, that's an error.

Comment by Alex Miller [ 09/Dec/14 2:40 AM ]

Still a work in progress...

Comment by Nicola Mometto [ 09/Dec/14 8:44 AM ]

Alex, while this is still a work in progress, I see that the change on ArrayChunk#reduce from previous WIP patches not only has not been reverted but has been extended. I don't think the current approach makes sense as ArrayChunk#reduce is not part of the IReduce/IReduceInit contract but of the IChunk contract and changing the behaviour to be IReduce-like in its handling of reduced introduces the burden of having to use preserve-reduced on the reducing function to no apparent benefit.

Given that the preserve-reduced is done on the clojure side, it seems to me like directly invoking .reduce rather than routing through internal-reduce should be broken but I haven't tested it.

Comment by Alex Miller [ 09/Dec/14 9:49 AM ]

That's the work in progress part - I haven't looked at yet. I have not extended or done any work re ArrayChunk, just carried through what was on the prior patch. I'll be working on it again tomorrow.

Comment by Ghadi Shayban [ 10/Dec/14 11:14 PM ]

I am impressed and have learned a ton through this exercise.

quick review of clj-1515-2
1) withMeta gives the newly formed object the wrong meta.
2) LongRange/create() is the new 0-arity constructor for range, which sets the 'end' to Double/POSITIVE_INFINITY cast as a long. Current core uses Double/POSITIVE_INFINITY directly. Not sure how many programs rely upon iterating that far, or how they would break.
3) Relatedly, depending on the previous point: Because only all-long arguments receive chunking, the very common case of (range) with no args would be unchunked. Doesn't seem like too much of a stretch to add chunking to the other impl.
4) Though the commented invariants say that Range is never empty, the implementation uses a magic value of _count == 0 to mean not cached, which is surprising to me. hashcodes have the magic value of -1
5) s/instanceof Reduced/RT.isReduced
6) is the overflow behavior of "int count()" correct?

Comment by Alex Miller [ 11/Dec/14 12:06 AM ]

1) agreed!
2) Good point. I am definitely changing behavior on this (max of 9223372036854775807). I will look at whether this can be handled without affecting perf. Really, handling an infinite end point is not compatible with several things in LongRange.
3) I actually did implement chunking for the general Range and found it was slower (the original Clojure chunking is faster). LongRange is making up for that difference with improved primitive numerics.
4) Since empty is invalid, 0 and -1 are equally invalid. But I agree -1 conveys the intent better.
5) agreed
6) probably not. ties into 2/3.

Thanks for this, will address.

Comment by Alex Miller [ 11/Dec/14 12:11 AM ]

Added -4 patch that addresses 1,4,5 but not the (range) stuff.

Comment by Alex Miller [ 11/Dec/14 12:51 PM ]

Latest -7 patch addresses all feedback and perf #s updated.

Comment by Ghadi Shayban [ 22/Dec/14 10:59 PM ]

See CLJ-1572, I believe CollReduce needs to be extended directly to clojure.lang.{LongRange,Range} inside protocols.clj

Comment by Ghadi Shayban [ 29/Dec/14 12:29 PM ]

Seems like a missing benchmark is (range 0 31) without the doall. Current patch (-9) allocates the chunk buffer at range's construction time. Maybe this can be delayed?

Comment by Alex Miller [ 02/Jan/15 11:24 AM ]

Ghadi - you are right. I reworked the patch (new -10 version) so that the chunk is only created on demand. Basically, the chunking is only used when traversing via chunked seqs and in normal seq iteration or reduce, no explicit chunks are created. That improved several timings. I also added the bench for (range 0 31) which was greatly improved by lazily creating the chunk, so good point on that.

Comment by Michael Blume [ 03/Jan/15 1:38 PM ]

I'm looking at the implementation of equiv() and wondering if it's worth checking whether the other object is also a reified range and comparing the private parameters rather than iterating through the sequences.

Comment by Ghadi Shayban [ 18/Jan/15 7:26 PM ]

Attaching a simplified implementation as a deftype. I benchmarked a billion variants and dumped bytecode. I'm attaching the best performing combination, and it beats out the Java one in most cases.

Approach:
Two deftypes, one specialized to primitive longs, the other generic math.
Implement: Seqable/Counted,IReduce,IReduceInit,Iterable
Also, marker interfaces Sequential and Serialized.

The implementations are very straightforward, but Seqable deserves mention.
'seq' delegates its implementation to the existing lazy-seq based range, which has been stripped of the strange corner cases and now only handles strictly ascending/descending ranges. It has been renamed range*, rather than range1 because all the arguments to range are already numbers. core references to range have been accordingly renamed.

The new range constructor is loaded after reduce & protocols load. It checks types, and delegates to either long-range or generic-range, which handle the wacky argument cases that are not strictly ascending or descending. I'm annoyed with the structural duplication of those conditionals, not sure how to solve it.

Inside the LongRange implementation of IReduce/IReduceInit, boxing is carefully controlled, and was verified through bytecode dumping.

I elaborated the benchmarks for comparison, and also included benchmarking without type specialization.

You discover strange things working on this stuff. Turns out having the comparator close over the 'end' field is beneficial: #(< % end).

Alex, I figured out why the Java versions had a nearly exactly 2x regression on (doall (range 0 31)), and I attached a change to 'dorun'. Instead of calling (seq coll) then (next coll), it effectively calls (next (seq coll)). I think the implementation assumes that calling 'seq' is evaluating the thunk inside LazySeq. This should also help other Seqable impls like Cycle/Iterate/Repeat CLJ-1603.

Results (criterium full runs):

code 1.7.0-alpha5 clj-1515-10.patch (Java) deftype specialized (attached) deftype unspecialized
(count (filter odd? (take (* 1024 1024) (range)))) 187.30 ms 194.92 ms 182.53 ms 184.13 ms
(transduce (take (* 1024 1024)) + (range)) 58.27 ms 89.37 ms 84.69 ms 84.72 ms
(count (range (* 1024 1024))) 62.34 ms 27.11 ns not run not run
(reduce + (map inc (range (* 1024 1024)))) 57.63 ms 46.14 ms 46.17 ms 52.94 ms
(reduce + (map inc (map inc (range (* 1024 1024))))) 82.83 ms 68.66 ms 64.36 ms 71.76 ms
(count (keep odd? (range (* 1024 1024)))) 76.09 ms 57.59 ms not run not run
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 52.17 ms 39.71 ms 28.83 ms 42.23 ms
(reduce + 0 (range (* 2048 1024))) 85.93 ms 38.03 ms 26.49 ms 42.43 ms
(doall (range 0 31)) 1.33 µs 2.89 µs 1.03 µs 1.08 µs
(doall (range 0 32)) 1.35 µs 2.97 µs 1.07 µs 1.10 µs
(doall (range 0 128)) 5.27 µs 11.93 µs 4.19 µs 4.29 µs
(doall (range 0 512)) 21.66 µs 47.33 µs 16.95 µs 17.66 µs
(doall (range 0 4096)) 171.30 µs 378.52 µs 135.45 µs 140.27 µs
(into [] (map inc (range 31))) 1.97 µs 1.57 µs 1.87 µs 1.95 µs
(into [] (map inc) (range 31)) 1.66 µs 824.67 ns 891.90 ns 1.11 µs
(into [] (range 128)) 5.11 µs 2.21 µs 2.43 µs 3.36 µs
(doall (range 1/2 1000 1/3)) 1.53 ms 1.67 ms 1.59 ms 1.52 ms
(doall (range 0.5 1000 0.33)) 164.83 µs 382.38 µs 149.38 µs 141.21 µs
(into [] (range 1/2 1000 1/3)) 1.53 ms 1.40 ms 1.43 ms 1.50 ms
(into [] (range 0.5 1000 0.33)) 157.44 µs 108.27 µs 104.18 µs 127.20 µs

Open Questions for screeners of the deftype patch:

1) What to do about the autopromotion at the Long/MAX_VALUE boundary? I've preserved the current behavior of Clojure 1.6.
2) Alex, I did not pull forward the filter chunk tweak you discovered
3) Is the structural duplication of the conditionals in generic-range long-range awful?
4) Are there any other missing interfaces? IMeta comes to mind. Not sure about IHashEq either.

Comment by Ghadi Shayban [ 18/Jan/15 7:30 PM ]

note with the deftype patch, the transduce over infinite (range) case is the only one where 'master' currently performs best. This is because the implementation was changed to (iterate inc' 0). When CLJ-1603 is applied that situation should improve better.

Comment by Ghadi Shayban [ 18/Jan/15 8:27 PM ]

fixup patch CLJ-1515-deftype

Comment by Ghadi Shayban [ 18/Jan/15 11:05 PM ]

new patch CLJ-1515-deftype-nostructural-dup.patch with the silly conditional duplication removed. Updated benchmarks including 'count' impls. These benchmarks run on a different machine with the same hardness. Blue ribbon.

code 1.7.0-alpha5 1.7.0-rangejava 1.7.0-rangespecial rangespecial / alpah5
(count (filter odd? (take (* 1024 1024) (range)))) 297.14 ms 333.93 ms 328.62 ms 1.10
(transduce (take (* 1024 1024)) + (range)) 105.55 ms 145.44 ms 164.70 ms 1.56
(count (range (* 1024 1024))) 108.92 ms 61.09 ns 26.61 ns 0
(reduce + (map inc (range (* 1024 1024)))) 97.67 ms 95.41 ms 84.62 ms 0.86
(reduce + (map inc (map inc (range (* 1024 1024))))) 140.21 ms 135.59 ms 116.38 ms 0.83
(count (keep odd? (range (* 1024 1024)))) 121.18 ms 104.63 ms 111.46 ms 0.92
(transduce (comp (map inc) (map inc)) + (range (* 1024 1024))) 100.40 ms 86.28 ms 67.17 ms 0.67
(reduce + 0 (range (* 2048 1024))) 131.77 ms 80.43 ms 63.24 ms 0.48
(doall (range 0 31)) 2.53 µs 4.36 µs 2.24 µs 0.89
(doall (range 0 32)) 2.37 µs 4.00 µs 1.99 µs 0.84
(doall (range 0 128)) 9.20 µs 14.98 µs 8.01 µs 0.87
(doall (range 0 512)) 37.28 µs 59.13 µs 35.16 µs 0.94
(doall (range 0 4096)) 331.28 µs 471.57 µs 291.76 µs 0.88
(into [] (map inc (range 31))) 2.83 µs 2.79 µs 2.67 µs 0.94
(into [] (map inc) (range 31)) 2.21 µs 1.39 µs 1.26 µs 0.57
(into [] (range 128)) 6.72 µs 3.25 µs 3.09 µs 0.46
(doall (range 1/2 1000 1/3)) 3.41 ms 4.04 ms 3.14 ms 0.92
(doall (range 0.5 1000 0.33)) 281.04 µs 530.92 µs 244.14 µs 0.87
(into [] (range 1/2 1000 1/3)) 3.32 ms 3.71 ms 2.99 ms 0.90
(into [] (range 0.5 1000 0.33)) 215.53 µs 165.93 µs 138.86 µs 0.64
Comment by Alex Miller [ 19/Jan/15 8:32 AM ]

This is looking very good and I think we should move forward with it as the preferred approach. Feel free to update the description appropriately. I'll file a separate ticket with the filter tweak. Some comments on the patch:

1) GenericRange/count - this math is broken if you start to mix infinity in there. I think just (count (seq this)) is safer.
2) GenericRange/iterable - I think if we are IReduceInit and Seqable, we can omit this. I had it in the Java one because I inherited Iterable via ASeq but that's not an issue in this impl.
3) LongRange/reduce - why Long/valueOf? Isn't start a long field?
4) LongRange/iterable - ditto #2
5) print-method - anytime I see print-method, there should probably be print-dup too. For example, this is broken: (binding [*print-dup* true] (println (range 0 10)))
6) Is serialization broken by this patch? Can you justify the test changes?

Comment by Ghadi Shayban [ 19/Jan/15 12:51 PM ]

regarding Serialization tests currently:

(defn roundtrip
  [v]
  (let [rt (-> v serialize deserialize)
        rt-seq (-> v seq serialize deserialize)]        ;; this
    (and (= v rt)            ;; this fails because the test ^ calls seq first
      (= (seq v) (seq rt))   ;; this passes
      (= (seq v) rt-seq))))  ;; this passes

These new types are merely seqable, so (not= (LongRange. 0 10 1) (seq (LongRange. 0 10 1)))

Not sure how to handle this 100%. Nothing precludes the LongRange itself from roundtripping, but just that calling seq on it returns a separate object.

Comment by Alex Miller [ 23/Jan/15 9:57 AM ]

More comments...

1) Instead of extending IReduce and IReduceInit, just extend IReduce and implement both arities (IReduce extends IReduceInit).
2) I'm slightly troubled by the .invokePrim now. Did you look at a macro that does prim type-hinting maybe?
3) On the serialization thing, is the problem really with serialization or with equality? The test that's failing is (= v rt). Because these are not IPersistentCollections, pcequiv won't be used and it's just calling LongRange.equals(), which is not implemented and falls back to identity, right? Probably need to implement the hashCode and equals stuff. Might need IHashEq too.

user=> (= (range 5) (range 5))
false
Comment by Ghadi Shayban [ 23/Jan/15 12:13 PM ]

Took care of 1) and 3). Punting on hiding the invokePrim behind a macro. It may be shameful but it works really fast.

Another case found:
(assoc {'(0 1 2 3 4) :foo} (range 5) :bar) needs to properly overwrite keys in the map.

Cause: Might be irrelevant in the face of CLJ-1375. Util/equiv for maps doesn't use IPC/equiv unless the collection is also java.util.Collection or Map. If it is then it checks for IPersistentCollection. I added a check for IPersistentCollection first.

https://github.com/ghadishayban/clojure/commits/for-screening

To get hasheq working, I added back the iterators for use by Murmur3/hashOrdered.

Comment by Ghadi Shayban [ 23/Jan/15 3:20 PM ]

Handle hash and equality not through IPersistentCollection. w/ test-cases too.

Comment by Alex Miller [ 20/Feb/15 11:39 AM ]

Ghadi, we need to have range implement IObj too so with-meta works.

Comment by Ghadi Shayban [ 20/Feb/15 12:09 PM ]

Will add pronto but maybe someone can clarify something: Adding a simple clojure.lang.IObj/withMeta to the deftype results in an AbstractMethodError when trying to print a range. This is because the impl of vary-meta used in the default printer assumes that if all IObjs are also IMetas. Seems like a problem with either vary-meta's impl or the interface separation.

I can fix by:
1) adding a _meta field to Ranges and handling IMeta as well as IObj.

;; vary-meta:
(with-meta obj (apply f (meta obj) args)))

;; default printer
(defmethod print-method :default [o, ^Writer w]
  (if (instance? clojure.lang.IObj o)
    (print-method (vary-meta o #(dissoc % :type)) w)
    (print-simple o w)))
Comment by Ghadi Shayban [ 20/Feb/15 12:25 PM ]

Ugh never mind that last bit I fell out of the hammock prematurely. IMeta << IObj.

Alex, CLJ-1603 needs IMeta/meta too.

Comment by Alex Miller [ 24/Feb/15 11:30 AM ]

current direction is pending results of where CLJ-1603 goes





[CLJ-1670] Odd error when evaling a quoted object within a macro with a field name containing a '-' Created: 09/Mar/15  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Tim Engler Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: dash, macro, quote
Environment:

linux gentoo



 Description   

;;Somewhat self explanatory:

(deftype x [ this-is-a-bad-field ])

(defmacro xxx [] `(quote ~(->x 1)))
(xxx)
;;creates error: CompilerException java.lang.IllegalArgumentException: No matching field found: this-is-a-bad-field for class user.x, compiling:(NO_SOURCE_PATH:1:1)

;;But this works:
(deftype y [ ThisIsAGoodField ])

(defmacro yyy [] `(quote ~(->y 1)))
(yyy)

;;returns: #<y user.y@79fd87c8>



 Comments   
Comment by Alex Miller [ 09/Mar/15 12:41 PM ]

possibly a dupe of CLJ-1399

Comment by Tim Engler [ 09/Mar/15 8:21 PM ]

I applied the patch in one of the attachments of CLJ-1399, and this fixed it. So I agree, a dupe.

Comment by Nicola Mometto [ 25/Mar/15 5:22 PM ]

should be closed as dupe of CLJ-1399





[CLJ-1683] test-reduce doesn't catch off-by-one error in IReduce Created: 25/Mar/15  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None

Attachments: Text File clj-1683-v1.patch    
Patch: Code and Test
Approval: Screened

 Description   

To implement the no-init arity of reduce, you have to use the first element of the sequence as an init value, and then you have to skip the first element when you iterate through the sequence. The tests in test-reduce, which focus on using reduce to sum up a bunch of numbers in a range, don't actually pin down this behavior, because the range used starts with zero, and an extra zero doesn't affect the sum.

Screened by: Alex Miller - this is a good tweak to get better tests for the hard this easy to mess up reduce behavior.






[CLJ-809] fn's created using defn should not lexical shadow the var that holds them Created: 11/Jun/11  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Kevin Downey Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None


 Description   

currently (defn foo [x] (foo x)) expands to something like (def foo
(fn foo [x] (foo x))) so the fn is bound to foo lexically in the scope
of the fn body.

because of this lexical shadowing self calls to fns defined with defn
do not incur the overhead of var dereferencing, I assume this is the
reason it was added.

the lexical shadowing also breaks memoization if you create the
memoized functions via (alter-var-root #'foo memoize) and several
macros here and there emit code in this style, since reusing defn is
the easiest way to get all the good juicy metadata bits like arglists,
etc.

1.3 has changes to eliminate some of the cost of going through the var.

since going through the var is much cheaper now it would be nice to
eliminate the lexical shadowing.

http://groups.google.com/group/clojure-dev/browse_thread/thread/33b52b24616967f?hl=en



 Comments   
Comment by Kevin Downey [ 11/Jun/11 2:40 PM ]

tests indicate that accessing the fn through the var vs. accessing through the lexical binding have very similar access times

https://gist.github.com/1013008

results in

$ java -jar clojure.jar ~/src/lexicaltest.clj
lexical binding run 0
"Elapsed time: 86.739 msecs"
lexical binding run 1
"Elapsed time: 8.062 msecs"
lexical binding run 2
"Elapsed time: 16.182 msecs"
var binding run 0
"Elapsed time: 61.136 msecs"
var binding run 1
"Elapsed time: 40.317 msecs"
var binding run 2
"Elapsed time: 11.641 msecs"
$

Comment by Jason Wolfe [ 13/Jun/11 5:05 PM ]

Here's another test that makes sure no JIT funny business (e.g. dead code elimination) is going on, and also looks at primitive hinted versions.

https://gist.github.com/1023816

var-obj result, time: 14930352 ,"Elapsed time: 965.577 msecs"
lex-obj result, time: 14930352 ,"Elapsed time: 957.741 msecs"
var-prim result, time: 14930352 ,"Elapsed time: 770.411 msecs"
lex-prim result, time: 14930352 ,"Elapsed time: 113.412 msecs"
nil

I'm not sure how to properly hint the var in the var-prim version to get truly comparable results. In any case, this use case should definitely be kept in mind.

Comment by Kevin Downey [ 13/Jun/11 5:35 PM ]

I think the point is we want to make sure the jvm can optimize both just as well, so making the code purposefully unoptimizable is kind of silly, yes?

Comment by Jason Wolfe [ 21/Jun/11 4:41 PM ]

@Kevin In your gist, the return value of foo is not used, and foo has no side effects. A sufficiently smart compiler could optimize the calls out entirely, which would not tell us much about the runtime of foo.

Comment by Aaron Bedra [ 28/Jun/11 6:49 PM ]

Rich: Given the performance testing (in the email thread) what is the next step here?

  • more thorough testing
  • patch?

Are there things that this might break that we need to consider?

Comment by Rich Hickey [ 29/Jul/11 7:39 AM ]

Someone needs to make a patch, and then test perf with and without the patch. These simulated tests aren't necessarily indicative. Naive fib is an ok test once you have a patch. And yes, more things might break - needs careful assessment.

Comment by Nicola Mometto [ 11/Apr/14 5:29 PM ]

If I understand this ticket correctly, it looks like this has been done at some point in time.

Clojure 1.7.0-master-SNAPSHOT
user=> (macroexpand-1 '(defn x []))
(def x (clojure.core/fn ([])))
Comment by Nicola Mometto [ 25/Mar/15 5:40 PM ]

This has already been fixed, should be closed

Comment by Alex Miller [ 25/Mar/15 9:22 PM ]

Already done probably long ago:

user=> (pprint (macroexpand '(defn foo [x] (foo x))))
(def foo (clojure.core/fn ([x] (foo x))))




[CLJ-1043] Unordered literals does not preserve left-to-right evaluation of arguments Created: 16/Aug/12  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Brandon Bloom Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: None


 Description   

Given: (defn f [x] (println x) x)

#{(f 2) (f 1)}

Prints:

1
2

But expected would be:

2
1

This issue is related to CLJS-288



 Comments   
Comment by Andy Fingerhut [ 23/Sep/12 6:01 PM ]

I have the same question as David Nolen for CLJS-288: Is this a bug, or just behavior you didn't expect?

It seems that vectors preserve the order of evaluation, so if you really want to control evaluation order you could use something like (set [(f 2) (f 1)]) or (set (map f [2 1])).

Comment by Brandon Bloom [ 23/Sep/12 7:38 PM ]

I'd consider the expected default behavior of any syntax or macro to evaluate each sub-form once each, from left to right. Conditional, repeated, or out-of-order evaluation should be documented as deviations from that norm. If you buy that, then this is either a code or a documentation bug. My vote is for code bug.

Comment by Nicola Mometto [ 25/Mar/15 5:05 PM ]

I don't think it's possible to preserve left-to-right evaluation of args on unordered literals, the original order will be lost during read-time, the compiler has no way to know about it.

Comment by Alex Miller [ 25/Mar/15 5:40 PM ]

Sets are unordered.

Comment by Brandon Bloom [ 25/Mar/15 8:22 PM ]

Sometime over the past two years, I've come around to the perspective that this is expected behavior. However, I still maintain that this should be documented somewhere!

On this page: http://clojure.org/evaluation

This paragraph:

Vectors, Sets and Maps yield vectors and (hash) sets and maps whose contents are the evaluated values of the objects they contain. The same is true of metadata maps. If the vector or map has metadata, the evaluated metadata map will become the metadata of the resulting value.

I'd propose inserting a sentence like:

Contents are evaluated sequentially in vectors, but in an unspecified order for sets and maps.

Comment by Alex Miller [ 25/Mar/15 9:11 PM ]

I added a comment to that page.

Comment by Brandon Bloom [ 25/Mar/15 9:19 PM ]

Awesome, thanks!

Small typo: "Vectors elements" should be "Vector elements".





[CLJ-1596] Using keywords in place of symbols for defrecord fields causes a compiler exception with incorrect line number Created: 20/Nov/14  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Kyle Kingsbury Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: compiler, defrecord

Approval: Triaged

 Description   

Possibly related to http://dev.clojure.org/jira/browse/CLJ-1261: a defrecord like

(defn foo [x])

(defrecord Bar [:b])

Throws an exception, like you'd expect:

java.lang.ClassCastException: clojure.lang.Keyword cannot be cast to clojure.lang.IObj, compiling:(tesser/quantiles_test.clj:45:15)

However, this exception's line and character indicates the error is in the previous form: the defn, not the defrecord. This can be really tricky to figure out when the expressions are more complicated.



 Comments   
Comment by Alex Miller [ 20/Nov/14 4:17 PM ]

Related: CLJ-1261

Comment by Alex Miller [ 20/Nov/14 4:18 PM ]

Possibly fixed by CLJ-1561, not sure.

Comment by Nicola Mometto [ 25/Mar/15 5:55 PM ]

No longer reproducible, fixed by CLJ-1561





[CLJ-1322] doseq with several bindings causes "ClassFormatError: Invalid Method Code length" Created: 10/Jan/14  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Miikka Koskinen Assignee: Unassigned
Resolution: Unresolved Votes: 5
Labels: None
Environment:

Clojure 1.5.1, java 1.7.0_25, OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1ubuntu0.12.04.2)


Attachments: Text File doseq-bench.txt     Text File doseq.patch     File script.clj    
Patch: Code
Approval: Incomplete

 Description   

Important Perf Note the new impl is faster for collections that are custom-reducible but not chunked, and is also faster for large numbers of bindings. The original implementation is hand tuned for chunked collections, and wins for larger chunked coll/smaller binding count scenarios, presumably due to the fn call/return tracking overhead of reduce. Details are in the comments.
Screened By
Patch doseq.patch

user=> (def a1 (range 10))
#'user/a1
user=> (doseq [x1 a1 x2 a1 x3 a1 x4 a1 x5 a1 x6 a1 x7 a1 x8 a1] (do))
CompilerException java.lang.ClassFormatError: Invalid method Code length 69883 in class file user$eval1032, compiling:(NO_SOURCE_PATH:2:1)

While this example is silly, it's a problem we've hit a couple of times. It's pretty surprising when you have just a couple of lines of code and suddenly you get the code length error.



 Comments   
Comment by Kevin Downey [ 18/Apr/14 12:20 AM ]

reproduces with jdk 1.8.0 and clojure 1.6

Comment by Nicola Mometto [ 22/Apr/14 5:35 PM ]

A potential fix for this is to make doseq generate intermediate fns like `for` does instead of expanding all the code directly.

Comment by Ghadi Shayban [ 25/Jun/14 8:39 PM ]

Existing doseq handles chunked-traversal internally, deciding the
mechanics of traversal for a seq. In addition to possibly conflating
concerns, this is causing a code explosion blowup when more bindings are
added, approx 240 bytes of bytecode per binding (without modifiers).

This approach redefs doseq later in core.clj, after protocol-based
reduce (and other modern conveniences like destructuring.)

It supports the existing :let, :while, and :when modifiers.

New is a stronger assertion that modifiers cannot come before binding
expressions. (Same semantics as let, i.e. left to right)

valid: [x coll :when (foo x)]
invalid: [:when (foo x) x coll]

This implementation does not suffer from the code explosion problem.
About 25 bytes of bytecode + 1 fn per binding.

Implementing this without destructuring was not a party, luckily reduce
is defined later in core.

Comment by Andy Fingerhut [ 26/Jun/14 12:25 AM ]

For anyone reviewing this patch, note that there are already many tests for correct functionality of doseq in file test/clojure/test_clojure/for.clj. It may not be immediately obvious, but every test for 'for' defined with deftest-both is a test for 'for' and also for 'doseq'.

Regarding the current implementation of doseq: it in't simply that it is too many bytes per binding, it is that the code size doubles with each additional binding. See these results, which measures size of the macroexpanded form rather than byte code size, but those two things should be fairly linearly related to each other here:

(defn formsize [form]
  (count (with-out-str (print (macroexpand form)))))

user=> (formsize '(doseq [x (range 10)] (print x)))
652
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
1960
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
4584
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
9947
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))
20997

Here are results for the same expressions after Ghadi's patch doseq.patch dated June 25 2014:

user=> (formsize '(doseq [x (range 10)] (print x)))
93
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
170
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
247
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
324
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))
401

It would be good to see some performance results with and without this patch, too.

Comment by Stuart Halloway [ 28/Jun/14 2:21 PM ]

In the tests below, the new impl is called "doseq2", vs. the original impl "doseq"

(def hund (into [] (range 100)))
(def ten (into [] (range 10)))
(def arr (int-array 100))
(def s "superduper")

;; big seq, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a (range 100000000)])))
;; 1.2 sec

(dotimes [_ 5]
  (time (doseq2 [a (range 100000000)])))
;; 1.8 sec

;; small unchunked reducible, few bindings: doseq2 wins
(dotimes [_ 5]
  (time (doseq [a s b s c s])))
;; 0.5 sec

(dotimes [_ 5]
  (time (doseq2 [a s b s c s])))
;; 0.2 sec

(dotimes [_ 5]
  (time (doseq [a arr b arr c arr])))
;; 40 msec

(dotimes [_ 5]
  (time (doseq2 [a arr b arr c arr])))
;; 8 msec

;; small chunked reducible, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a hund b hund c hund])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a hund b hund c hund])))
;; 8 msec

;; more bindings: doseq2 wins bigger and bigger
(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten ])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten ])))
;; 0.4 msec

(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten e ten])))
;; 18 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten e ten])))
;; 1 msec
Comment by Ghadi Shayban [ 28/Jun/14 6:23 PM ]

Hmm, I cannot reproduce your results.

I'm not sure whether you are testing with lein, on what platform, what jvm opts.

Can we test using this little harness instead directly against clojure.jar? I've attached a the harness and two runs of results (one w/ default heap, the other 3GB w/ G1GC)

I added a medium and small (range) too.

Anecdotally, I see doseq2 outperform in all cases except the small range. Using criterium shows a wider performance gap favoring doseq2.

I pasted the results side by side for easier viewing.

core/doseq                          doseq2
"Elapsed time: 1610.865146 msecs"   "Elapsed time: 2315.427573 msecs"
"Elapsed time: 2561.079069 msecs"   "Elapsed time: 2232.479584 msecs"
"Elapsed time: 2446.674237 msecs"   "Elapsed time: 2234.556301 msecs"
"Elapsed time: 2443.129809 msecs"   "Elapsed time: 2224.302855 msecs"
"Elapsed time: 2456.406103 msecs"   "Elapsed time: 2210.383112 msecs"

;; med range, few bindings:
core/doseq                          doseq2
"Elapsed time: 28.383197 msecs"     "Elapsed time: 31.676448 msecs"
"Elapsed time: 13.908323 msecs"     "Elapsed time: 11.136818 msecs"
"Elapsed time: 18.956345 msecs"     "Elapsed time: 11.137122 msecs"
"Elapsed time: 12.367901 msecs"     "Elapsed time: 11.049121 msecs"
"Elapsed time: 13.449006 msecs"     "Elapsed time: 11.141385 msecs"

;; small range, few bindings:
core/doseq                          doseq2
"Elapsed time: 0.386334 msecs"      "Elapsed time: 0.372388 msecs"
"Elapsed time: 0.10521 msecs"       "Elapsed time: 0.203328 msecs"
"Elapsed time: 0.083378 msecs"      "Elapsed time: 0.179116 msecs"
"Elapsed time: 0.097281 msecs"      "Elapsed time: 0.150563 msecs"
"Elapsed time: 0.095649 msecs"      "Elapsed time: 0.167609 msecs"

;; small unchunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 2.351466 msecs"      "Elapsed time: 2.749858 msecs"
"Elapsed time: 0.755616 msecs"      "Elapsed time: 0.80578 msecs"
"Elapsed time: 0.664072 msecs"      "Elapsed time: 0.661074 msecs"
"Elapsed time: 0.549186 msecs"      "Elapsed time: 0.712239 msecs"
"Elapsed time: 0.551442 msecs"      "Elapsed time: 0.518207 msecs"

core/doseq                          doseq2
"Elapsed time: 95.237101 msecs"     "Elapsed time: 55.3067 msecs"
"Elapsed time: 41.030972 msecs"     "Elapsed time: 30.817747 msecs"
"Elapsed time: 42.107288 msecs"     "Elapsed time: 19.535747 msecs"
"Elapsed time: 41.088291 msecs"     "Elapsed time: 4.099174 msecs"
"Elapsed time: 41.03616 msecs"      "Elapsed time: 4.084832 msecs"

;; small chunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 31.793603 msecs"     "Elapsed time: 40.082492 msecs"
"Elapsed time: 17.302798 msecs"     "Elapsed time: 28.286991 msecs"
"Elapsed time: 17.212189 msecs"     "Elapsed time: 14.897374 msecs"
"Elapsed time: 17.266534 msecs"     "Elapsed time: 10.248547 msecs"
"Elapsed time: 17.227381 msecs"     "Elapsed time: 10.022326 msecs"

;; more bindings:
core/doseq                          doseq2
"Elapsed time: 4.418727 msecs"      "Elapsed time: 2.685198 msecs"
"Elapsed time: 2.421063 msecs"      "Elapsed time: 2.384134 msecs"
"Elapsed time: 2.210393 msecs"      "Elapsed time: 2.341696 msecs"
"Elapsed time: 2.450744 msecs"      "Elapsed time: 2.339638 msecs"
"Elapsed time: 2.223919 msecs"      "Elapsed time: 2.372942 msecs"

core/doseq                          doseq2
"Elapsed time: 28.869393 msecs"     "Elapsed time: 2.997713 msecs"
"Elapsed time: 22.414038 msecs"     "Elapsed time: 1.807955 msecs"
"Elapsed time: 21.913959 msecs"     "Elapsed time: 1.870567 msecs"
"Elapsed time: 22.357315 msecs"     "Elapsed time: 1.904163 msecs"
"Elapsed time: 21.138915 msecs"     "Elapsed time: 1.694175 msecs"
Comment by Ghadi Shayban [ 28/Jun/14 6:47 PM ]

It's good that the benchmarks contain empty doseq bodies in order to isolate the overhead of traversal. However, that represents 0% of actual real-world code.

At least for the first benchmark (large chunked seq), adding in some tiny amount of work did not change results signifantly. Neither for (map str [a])

(range 10000000) =>  (map str [a])
core/doseq
"Elapsed time: 586.822389 msecs"
"Elapsed time: 563.640203 msecs"
"Elapsed time: 369.922975 msecs"
"Elapsed time: 366.164601 msecs"
"Elapsed time: 373.27327 msecs"
doseq2
"Elapsed time: 419.704021 msecs"
"Elapsed time: 371.065783 msecs"
"Elapsed time: 358.779231 msecs"
"Elapsed time: 363.874448 msecs"
"Elapsed time: 368.059586 msecs"

nor for intrisics like (inc a)

(range 10000000)
core/doseq
"Elapsed time: 317.091849 msecs"
"Elapsed time: 272.360988 msecs"
"Elapsed time: 215.501737 msecs"
"Elapsed time: 206.639181 msecs"
"Elapsed time: 206.883343 msecs"
doseq2
"Elapsed time: 241.475974 msecs"
"Elapsed time: 193.154832 msecs"
"Elapsed time: 198.757873 msecs"
"Elapsed time: 197.803042 msecs"
"Elapsed time: 200.603786 msecs"

I still see reduce-based doseq ahead of the original, except for small seqs

Comment by Ghadi Shayban [ 04/Aug/14 2:55 PM ]

A form like the following will not work with this patch:

(go (doseq [c chs] (>! c :foo)))

as the go macro doesn't traverse fn boundaries. The only such code I know is core.async/mapcat*, a private fn supporting a fn that is marked deprecated.

Comment by Ghadi Shayban [ 07/Aug/14 2:09 PM ]

I see #'clojure.core/run! was just added, which has a similar limitation

Comment by Rich Hickey [ 29/Aug/14 8:19 AM ]

Please consider Ghadi's feedback, esp re: closures.

Comment by Ghadi Shayban [ 22/Sep/14 4:36 PM ]

The current expansion of a doseq [1] under a go form is less than ideal due to the amount of control flow. 14 states in the state machine vs. 7 with loop/recur

[1] Comparison of macroexpansion of (go ... doseq) vs (go ... loop/recur)
https://gist.github.com/ghadishayban/639009900ce1933256a1

Comment by Nicola Mometto [ 25/Mar/15 6:07 PM ]

Related: CLJ-77





[CLJ-1646] Small filter performance enhancement Created: 19/Jan/15  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Critical
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: performance

Attachments: Text File clj-1646.patch    
Patch: Code
Approval: Triaged

 Description   

I found when working on other tickets for 1.7 that filter repeats each call to .nth on the chunk.
Reusing the result is a small perf boost for all filter uses.

Timing with criterium quick-bench:

What stock patch applied comment
(into [] (filter odd? (range 1024))) 54.3 µs 50.2 µs 7.5% reduction

Approach: storing the nth value as to avoid double lookup.

Patch: clj-1646.patch
Screened by:



 Comments   
Comment by Michael Blume [ 25/Mar/15 5:09 PM ]

This seems have been rolled into the patch for CLJ-1515 – should it be closed?

Comment by Alex Miller [ 25/Mar/15 5:13 PM ]

I'm not sure yet which patch will be applied for 1515. I will close this if it's included.





[CLJ-1057] Var's .setDynamic does not set :dynamic in metadata Created: 02/Sep/12  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Trivial
Reporter: Brandon Bloom Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None


 Description   
((juxt (comp :dynamic meta) #(.isDynamic %)) #'*agent*)


 Comments   
Comment by Timothy Baldridge [ 03/Dec/12 9:10 AM ]

This is actually an enhancement as no where in the clojure code is provision made for syncing var's metadata and dynamic state. .isDynamic is the authoritative source, and the calling of .setDynamic is configured by the compiler. If you'd like to see this change, please, feel free to bring it up on clojure-dev for a discussion.

Comment by Nicola Mometto [ 25/Mar/15 4:52 PM ]

Dupe of CLJ-859

Comment by Alex Miller [ 25/Mar/15 5:02 PM ]

dupe of CLJ-859





[CLJ-270] defn-created fns inherit old metadata from the Var they are assigned to Created: 13/Feb/10  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Anonymous Assignee: Rich Hickey
Resolution: Not Reproducible Votes: 0
Labels: None


 Description   
  • What (small set of) steps will reproduce the problem?

user> (def #^{:foo "bar"} x 5)
#'user/x
user> (meta #'x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_FILE", :line 1, :foo "bar"}
user> (defn x [] 5)
#'user/x
user> (meta #'x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_FILE", :line 1, :arglists ([])}
user> (meta x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_FILE", :line 1, :foo "bar"}

  • What is the expected output? What do you see instead?

I expect (meta #'x) to evaluate to the value of the final (meta x) in the above.

  • What version are you using?

Current master (commit 61202d2ff6925002400a9843e8fbd080f3bef3a5).

  • Was this discussed on the group? If so, please provide a link to the discussion.

http://groups.google.com/group/clojure/browse_thread/thread/4c7151aa9c4d919c/d1b033ef5a13dd89?lnk=gst&q=off-by-one#d1b033ef5a13dd89

Prompted by discussion at

http://groups.google.com/group/clojure/browse_thread/thread/6553d48c981019eb/3c55b0bd43a5d8e9?lnk=gst&q=off-by-one#3c55b0bd43a5d8e9

  • Initial attempt at a diagnosis:

I think this is due to DefExpr's eval method binding the Var to init.eval() first and attaching the supplied metadata to the Var later – see Compiler.java lines 341-352. (Note the Var is always already in place when init.eval() is called, regardless of whether it existed prior to the evaluation of the def / defn.) Thus the init expression supplied by defn sees the old (and wrong) metadata on the Var.



 Comments   
Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/270
Attachments:
dont-copy-val-metadata-onto-new-var-value-in-defn.patch - https://www.assembla.com/spaces/clojure/documents/dwK4yssayr37y_eJe5d-aX/download/dwK4yssayr37y_eJe5d-aX
0001-set-meta-on-vars-before-evaluating-their-init-see-27.patch - https://www.assembla.com/spaces/clojure/documents/arrhbiAI4r35lQeJe5cbLr/download/arrhbiAI4r35lQeJe5cbLr

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

stu said: [file:dwK4yssayr37y_eJe5d-aX]

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

stu said: The problem happens with defn, but not with def+fn, so I think the original diagnosis is incorrect.

Another stab at diagnosis: The defn macro copies metadata from a var onto its new value, so if you defn a var that already exists, the old var metadata becomes metadata on the new value. I don't know why this would be the right thing to do, and if you eliminate this behavior (see patch) all the Clojure and Contrib tests still pass.

If this is correct, please assign back to me and I will write tests. If this is wrong, please tell me what's going on here so I know how to write the tests.

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: Child association with ticket #363 was added

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: I think the original diagnosis is correct. Note that the behavior differ when def iscompiled:

user=> (def #^{:foo "bar"} x 5)
#'user/x
user=> (let [] (defn x [] 5))
#'user/x
user=> (meta x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_PATH", :line 83, :arglists ([])}
user=> (meta #'x)
{:ns #<Namespace user>, :name x, :file "NO_SOURCE_PATH", :line 83, :arglists ([])}

See also http://groups.google.com/group/clojure/browse_thread/thread/6afd81896ca368b2#

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: [file:arrhbiAI4r35lQeJe5cbLr]

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: My patch aligns DefExpr.eval with DefExpr.emit (first set meta then eval the init value).

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: Forget it, my current patch is broken.

Comment by Assembla Importer [ 24/Aug/10 6:23 AM ]

cgrand said: My current patch is broken because of #352, what is the rational for #352?

Comment by Nicola Mometto [ 25/Mar/15 4:51 PM ]

not reproducible, metadata is no longer attached to the fns, could be closed

Comment by Alex Miller [ 25/Mar/15 5:01 PM ]

no longer relevant or reproducible





[CLJ-1681] reflection warning throws NPE for literal nil arg Created: 24/Mar/15  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Michael Blume Assignee: Michael Blume
Resolution: Unresolved Votes: 0
Labels: regression, typehints

Attachments: Text File clj-1681-v1.patch     Text File clj-1681-v2.patch     Text File clj-1681-v3.patch    
Patch: Code and Test
Approval: Screened

 Description   

Seen on another project, but reproducible with this:

user=> (set! *warn-on-reflection* true)
true
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
user=> (pst *e)
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:21:13)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6740)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyze (Compiler.java:6485)
	clojure.lang.Compiler$BodyExpr$Parser.parse (Compiler.java:5861)
	clojure.lang.Compiler$FnMethod.parse (Compiler.java:5296)
	clojure.lang.Compiler$FnExpr.parse (Compiler.java:3925)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6731)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6721)
	clojure.lang.Compiler.analyze (Compiler.java:6524)
Caused by:
NullPointerException
	clojure.lang.Compiler.getTypeStringForArgs (Compiler.java:2440)
	clojure.lang.Compiler$InstanceMethodExpr.<init> (Compiler.java:1490)
	clojure.lang.Compiler$HostExpr$Parser.parse (Compiler.java:1000)
	clojure.lang.Compiler.analyzeSeq (Compiler.java:6733)
	clojure.lang.Compiler.analyze (Compiler.java:6524)

Cause: Regression in Clojure 1.6+ from patch for CLJ-1248. In printing the reflection warning message, if the argument's java class is null, an NPE results.

Approach: Check for null before getting class name.

Patch: clj-1681-v3.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 24/Mar/15 3:54 PM ]

Patch welcome.

Comment by Alex Miller [ 24/Mar/15 4:31 PM ]

Need test in the patch and having a simple way to repro in the ticket would be great.

Comment by Nicola Mometto [ 25/Mar/15 3:46 PM ]

here's a repro:

Clojure 1.7.0-master-SNAPSHOT
user=> (set! *warn-on-reflection* true)
true
user=> (defn f [a] (.divide 1M a nil))
CompilerException java.lang.NullPointerException, compiling:(NO_SOURCE_PATH:2:13)
Comment by Michael Blume [ 25/Mar/15 3:49 PM ]

Thanks! I was having some trouble with that.

Comment by Nicola Mometto [ 25/Mar/15 3:54 PM ]

Michael, me and Alex were discussing this ticket in IRC and Alex was proposing replacing the if expression with an implementation like:

(arg.hasJavaClass() && arg.getJavaClass() != null) ? arg.getJavaClass().getName() : "unknown"
Comment by Michael Blume [ 25/Mar/15 3:58 PM ]

Hm, that makes sense, I was thinking print "nil" instead of "unknown" but I guess the fact that the user is passing nil doesn't actually tell us the expected class of the argument (apart from that it's not a primitive)





[CLJ-451] fn literals lack name/arglists/namespace metadata Created: 05/Oct/10  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Anonymous Assignee: Unassigned
Resolution: Declined Votes: 1
Labels: None


 Description   

I would expect (meta (fn not-so-anonymous [a b c])) to include {:name not-so-anonymous :arglists ([a b c])} alongside line number information and possibly namespace/file as well, but currently it only includes :line.



 Comments   
Comment by Assembla Importer [ 05/Oct/10 12:29 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/451

Comment by Nicola Mometto [ 25/Mar/15 4:48 PM ]

Since metadata is no longer attached by the compiler directly to fns but to vars, I guess this ticket is no longer relevant

Comment by Alex Miller [ 25/Mar/15 4:51 PM ]

see comments





[CLJ-1226] set! of a deftype field using field-access syntax causes ClassCastException Created: 26/Jun/13  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: compiler, deftype, interop

Attachments: Text File 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r.patch     Text File 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r-v2.patch    
Patch: Code and Test

 Description   

Clojure 1.6.0-master-SNAPSHOT

user=> (defprotocol p (f [_]))
p
user=> (deftype t [^:unsynchronized-mutable x] p (f [_] (set! (.x _) 1)))
user.t
user=> (f (t. 1))
ClassCastException user.t cannot be cast to compile__stub.user.t user.t (NO_SOURCE_FILE:1

After patch:
Clojure 1.6.0-master-SNAPSHOT
user=> (defprotocol p (f [_]))
p
user=> (deftype t [^:unsynchronized-mutable x] p (f [_] (set! (.x _) 1)))
user.t
user=> (f (t. 1))
1

Patch: 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r-v2.patch



 Comments   
Comment by Nicola Mometto [ 27/Jun/13 5:30 AM ]

This patch offers a better workaround for CLJ-1075, making it possible to write
(deftype foo [^:unsynchronized-mutable x] MutableX (set-x [this v] (try (set! (.x this) v)) v))

Comment by Nicola Mometto [ 25/Mar/15 4:39 PM ]

Updated patch to apply to current master





[CLJ-1603] cycle, iterate, repeat return vals should IReduceInit Created: 25/Nov/14  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File clj-1603-10.patch     Text File clj-1603-11.patch     Text File clj-1603-12-2.patch     Text File clj-1603-12.patch     Text File clj-1603-13.patch     Text File clj-1603-14.patch     Text File clj-1603-15.patch     Text File clj-1603-2.patch     Text File clj-1603-3-2.patch     Text File clj-1603-3-3.patch     Text File clj-1603-3-4.patch     Text File clj-1603-3.patch     Text File clj-1603-4.patch     Text File clj-1603-5.patch     Text File clj-1603-6.patch     Text File clj-1603-7.patch     Text File clj-1603-8.patch     Text File clj-1603-9.patch     Text File clj-1603.patch    
Patch: Code and Test
Approval: Vetted

 Description   

Screened

clj-1603-15.patch (Java approach)

Alternatives:

There were a number of possible approaches for these enhancements:
1) Straight Java impl. The benefit of this approach is improving the performance of both the seq and reduce paths at the expense of writing a bunch of Java. See clj-1603-15.patch for the best of these impls.

2) Clojure deftype. This required moving cycle and iterate and providing a repeat1 implementation until deftype is defined (similar to the approach with reduce1). The deftype version returns a Seqable, IReduce object and has effectively the same former implementation for seq with a new fast implementation for IReduce. This makes reduce paths fast, but leaves seq paths about the same, with the benefit of no new Java code. The downside was that the new seqs did not implement the full range of interfaces from prior impls, which could potentially break code. See clj-1603-12-2.patch for a patch that covers this.

3) Add Iterable or IReduceInit directly to LazySeq. Conceptually, this does not make sense for general lazy seqs. Seqs materialize and cache each value once and doing this along with the ability to iterate/reduce introduces issues with caching (might as well use seqs for that) and synchronization. However, doing this for just these specific lazy seqs is possible - see latest patch.

Approach: Latest patch makes LazySeq implement IReduce and internal macro reducible-lazy-seq lets callers supply a function to implement the reduce path.

A few things to note:

  • Added some example-based tests for iterate, cycle, and repeat where I thought they were needed. Did not add generative tests - not clear to me what these would be that would actually be valuable. All of these functions are pretty simple and the examples cover the special cases.

Performance:

Some example timing, all in µs:

Expression 1.6.0 1.7.0-alpha5 master + clj-1603-15 (Java) master + clj-1603-12-2 (deftype) master + clj-1603-14 (split)
(doall (take 1000 (repeat 1))) 87 93 63 89 92
(into [] (take 1000) (repeat 1)) n/a 67 25 27 33
(doall (repeat 1000 1)) 87 94 16 94 89
(into [] (repeat 1000 1)) 99 110 13 12 12
(reduce + 0 (repeat 1000 1)) 99 126 20 22 25
(into [] (take 1000) (repeat 1)) n/a 67 28 33 27
(doall (take 1000 (cycle [1 2 3]))) 101 106 85 108 103
(into [] (take 1000) (cycle [1 2 3])) n/a 73 38 45 44
(doall (take 1000 (iterate inc 0))) 93 98 75 123 116
(into [] (take 1000) (iterate inc 0)) n/a 85 38 40 39

Notes on timings above:

  • All reduce timings (with into) comparable across the impls and significantly better than the current behavior over seqs.
  • The Java impl is faster across the board with doall. doall repeatedly calls seq() and next() to walk the sequence. The Java class versions of Repeat, Cycle, and Iterate extend ASeq and seq() just return this. next() constructs and returns a new instance of the class, which is immutable. In the lazy seq versions, LazySeq is mutable and requires synchronization and handling the caching safely. So, simple immutable instance ftw here.
  • The Java finite repeat has an extra benefit from using a primitive long for the counter.
  • One performance difference that's not visible in the timings is that the Java implementations have the benefit of being both seqs and reducibles as they are traversed, so you can always get a fast reduce. The deftype and split impls are only reducible at the initial instance, walking off that initial head reverts to lazy seqs that are not quickly reducible.

Patch: The two patches in leading contention are:

  • clj-1603-15.patch - for the Java version
  • clj-1603-14.patch - for the split impl

Alex opinion: I have swung back and forth on this but my current recommendation is for the Java implementation (clj-1603-15.patch). It's faster for both seqs and reduce, both in the timings above and importantly in maintaining reducibility as they are traversed. There is more Java, but I've made my peace with that - the code maximally leverages existing ASeq infrastructure and the implementation is easy to understand.



 Comments   
Comment by Ghadi Shayban [ 25/Nov/14 11:01 AM ]

Stu, do you intend these to be in Java or Clojure? It could be trickier to implement in Clojure directly, as loading would have to be deferred until core_deftype loads. It's certainly tractable without breaking any backwards compatibility, and I've explored this while experimenting with Range as a deftype https://github.com/ghadishayban/clojure/commit/906cd6ed4d7c624dc4e553373dabfd57550eeff2

A macro to help with Seq&List participation could be certainly useful, as efficiently being both a Seq/List and IReduceInit isn't a party.

May be useful to list requirements for protocol/iface participation.

It seems like 'repeatedly' is another missing link in the IReduceInit story.

Rich mentioned the future integration of reduce-kv at the conj, it would also be useful to know how that could fit in.

---- Other concerns and ops that may belong better on the mailing list ----

In experimenting with more reducible sources, I put out a tiny repo (github.com/ghadishayban/reducers) a couple weeks ago that includes some sources and operations. The sources were CollReduce and not ISeq.

Relatedly, caching the hashcode as a Java `transient` field is not supported when implementing a collection using deftype (patch w/ test in CLJ-1573).

Sources:
Iterate was one of them https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51
Repeatedly https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L43-L51

Reduce/transduce-based Operations that accept transducers:
some, any, yield-first https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L52-L80
(any could use a better name, equiv to (first (filter...)))
some and any have a symmetry like filter/remove.

Novelty maybe for 1.8:
A transducible context for Iterables similar to LazyTransformer:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L157-L161

The unless-reduced macro was very useful in implementing the collections:
https://github.com/ghadishayban/reducers/blob/master/src/ghadi/reducers.clj#L7-L15
It is different than the ensure-reduced and unreduced functions in core.

Comment by Alex Miller [ 25/Nov/14 12:01 PM ]

When we discussed this in the past, it was in the vein of reusing some of the range work (in Java) to implement cycle and iterate (per CLJ-1515).

Comment by Ghadi Shayban [ 25/Nov/14 9:20 PM ]

Never mind about 'repeatedly'. Being both ISeq and IReduceInit for repeatedly doesn't make sense for something that relies on side-effects. Current users of repeatedly can reduce over it many times and only realize the elements once.

Comment by Alex Miller [ 05/Dec/14 11:17 PM ]

attached wip Java impl and posted some example timings

Comment by Ghadi Shayban [ 11/Dec/14 4:35 PM ]

NB iterate in this patch does not cache the realized ISeq, but recalcs it at every call to realize the tail. This is not a change in the promised behavior (docstring says "f must be side-effect free") but an implementation change, as worth noting in the changelog.

Comment by Stuart Halloway [ 02/Jan/15 1:32 PM ]

It looks like all the reduce with no inital value paths are still seq-y, and slower, as shown by e.g.

(dotimes [_ 10]
  (time
   (reduce
    +
    (repeat 10000000 1))))

(dotimes [_ 20]
  (time
   (reduce
    +
    0
    (repeat 10000000 1))))
Comment by Alex Miller [ 02/Jan/15 2:01 PM ]

On that example in master before / after patch I see:

before:

  • no init = 844 ms
  • with init = 920 ms

after:

  • no init = 124 ms
  • with init = 90 ms

Is that similar to what you see or not?

Comment by Alex Miller [ 02/Jan/15 4:21 PM ]

The clj-1603-3.patch has been updated to use effectively the same algorithm for both versions of reduce. With the -3 patch, I got ~96 ms on both examples in the prior comment. I re-ran the tests in the description and updated those as well (about the same as expected).

Comment by Stuart Halloway [ 16/Jan/15 1:18 PM ]

The tests do not seem to hit the unseeded reduce branches – do we even want these branches? The original ticket was for IReduceInit.

Comment by Michael Blume [ 18/Jan/15 1:48 PM ]

Probably worth noting – Git will happily apply the latest patch for CLJ-1603 on top of the latest patch for CLJ-1515, but the result does not compile because 1515 uses iterate and 1603 moves the definition of iterate lower in clojure.core. Not sure if this is worth fixing now or just noting for when they're actually applied.

Comment by Michael Blume [ 18/Jan/15 1:52 PM ]

Actually, here, this just moves the declare statement further up the file.

Comment by Michael Blume [ 18/Jan/15 2:19 PM ]

OK, no, the two patches are still incompatible even with the declaration order fixed:

[java] ERROR in (test-range) (LongRange.java:95)
     [java] expected: (= (take 3 (range 3 9 0)) (quote (3 3 3)))
     [java]   actual: java.lang.ClassCastException: clojure.core.InfiniteRepeat cannot be cast to clojure.lang.ISeq
     [java]  at clojure.lang.LongRange.create (LongRange.java:95)
Comment by Alex Miller [ 18/Jan/15 2:31 PM ]

The 1515 patch is actually being reworked right now - we will patch things up at application time if needed.

Comment by Alex Miller [ 19/Jan/15 10:12 AM ]

Removing screened marking so can be re-screened. Added new -7 patch that handles print-method, print-dup, and unmapping the deftype constructors so they're not visible. Thanks to Ghadi in CLJ-1515 for the idea on those.

Comment by Ghadi Shayban [ 20/Jan/15 8:08 AM ]

Review of -7 patch:

Seqable/seq implementations that return a separate ISeq like these do should forward a call to seq on the result, like eduction does. [1] (This is not necessary in these particular impls, as the LazySeqs returned are themselves ISeqs. Also because Cycle's deftype is only constructed for non-empty cycles, the fact that there is a guaranteed seq is implicit. Probably a best practice to add an innocuous seq call if users look to these impls as a recipe.)

The performance regression in (doall (repeat 1000 1)) should go away completely with the dorun tweak in CLJ-1515. This is because dorun is effectively calling seq twice (it calls seq, throws away the result, then calls next.)

minor nits
1) repeat1 seems to be identical to repeat-seq and has both arities necessary for the deftypes
2) inside FiniteRepeat s/(> i 0)/(pos? i) also inside the repeat constructor
3) some things are defn- , some are ^:private
4) Cycle/reduce the recur binding can be (recur rr (or (next s) coll)) rather than nil? check

[1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7324

Comment by Alex Miller [ 22/Jan/15 10:34 PM ]

Ghadi - good comments! Fixed 1,2,4. #3 - ^:private is because defn- is not yet defined. New -8 patch.

Comment by Alex Miller [ 23/Jan/15 10:03 AM ]

Bah.

user=> (= (repeat 5 :a) (repeat 5 :a))
false
Comment by Alex Miller [ 23/Jan/15 3:04 PM ]

Updated to -9 patch that handles hash and equality for finite repeat case.

Comment by Ghadi Shayban [ 26/Jan/15 2:24 PM ]

metadata in the wrong place on #'repeat1

Comment by Alex Miller [ 26/Jan/15 3:27 PM ]

Thanks, fixed in -10.

Comment by Stuart Halloway [ 20/Feb/15 10:19 AM ]

The collections returned by these APIs promise several things the new deftypes do not deliver. For example, 1.6's repeat currently has the following ancestors that are lost in the propopsed deftype:

  • clojure.lang.ISeq
  • java.io.Serializable
  • clojure.lang.IPending
  • java.util.Collection
  • java.util.List
  • java.lang.Iterable
  • clojure.lang.Obj
  • clojure.lang.IPersistentCollection
  • clojure.lang.IMeta
  • clojure.lang.IObj

Losing metadata and serializability are certainly regressions, other stuff maybe as well. I suspect similar problems in all the other API returns.

We could improve testing by taking advantage of the property-checking fns already in test-clojure/data-structures, e.g. is-same-collection

Comment by Alex Miller [ 20/Feb/15 10:42 AM ]

I think we should consider carefully what is contractually required by the return of repeat. My opinion is that repeat must return something seqable, not literally a seq (or lazy seq). With that perspective, ISeq, IPending (there re lazy seq laziness), Collection, List, Iterable, and IPersistentCollection are non-essential. Obj is a concrete class and we shouldn't commit to that.

  • IObj is a gap, this should work but doesn't: (with-meta (repeat 1 :a) {:a 1})
  • IMeta doesn't need to be added, will never have meta right now and meta handles this correctly
  • Serializable - doesn't make sense for the infinite seqs but should probably fix for finite range
Comment by Alex Miller [ 20/Feb/15 11:03 AM ]

Added -11 patch that adds IObj for all the new deftypes and Serializable for FiniteRepeat. Still need to add more tests.

Comment by Stuart Halloway [ 20/Feb/15 12:27 PM ]

I think all the java. interfaces are mandatory for interop. Leaving out any one of the clojure. interfaces creates observable change in behavior composing with other core fns (admittedly the IPending case would be bizarre, who uses that?), so those seem mandatory too.

Agreed Obj should be irrelevant, and if it isn't the bug is elsewhere.

Comment by Alex Miller [ 27/Feb/15 9:41 AM ]

pending further discussion w/ stu

Comment by Alex Miller [ 13/Mar/15 7:02 AM ]

for a clean run and more explanation of the perf numbers

Comment by Alex Miller [ 17/Mar/15 4:23 PM ]

refreshed numbers in table

Comment by Alex Miller [ 18/Mar/15 9:24 AM ]

Refreshed perf numbers and older patch variants, added some more explanation of the perf numbers for comparison.

Comment by Stuart Halloway [ 22/Mar/15 7:50 AM ]

Screened clj-1603-3-3.patch

Comment by Nicola Mometto [ 23/Mar/15 2:00 PM ]

Shouldn't iterate cache its next slot? the current implementation recalculates it every time:

user=> (def coll (iterate #(do (println %) (inc %)) 0))
#'user/coll
user=> (nth coll 3)
0
1
2
3
user=> (nth coll 3)
0
1
2
3

I realize the docstring for iterate explicitely warns that f must be side-effect free but this is just for demonstration purposes, imagine that f is a computationally-heavy function and you'll understand why I don't think the new proposed behaviour is desiderable.

Comment by Nicola Mometto [ 23/Mar/15 2:26 PM ]

I attached clj-1603-3-4.patch which is the same as clj-1603-3-3.patch except it caches the next slot of iterate.

Here is the diff between -3-3 and -3-4 for ease of review:

diff --git a/src/jvm/clojure/lang/Iterate.java b/src/jvm/clojure/lang/Iterate.java
index f23ddca..aeef998 100644
--- a/src/jvm/clojure/lang/Iterate.java
+++ b/src/jvm/clojure/lang/Iterate.java
@@ -16,6 +16,7 @@ public class Iterate extends ASeq implements IReduce {
 
 private final IFn f;      // never null
 private final Object seed;
+private volatile ISeq _next;
 
 private Iterate(IFn f, Object seed){
     this.f = f;
@@ -37,7 +38,14 @@ public Object first(){
 }
 
 public ISeq next(){
-    return new Iterate(f, f.invoke(seed));
+    ISeq ret = _next;
+    if (ret == null)
+        synchronized(this) {
+            ret = _next;
+            if (ret == null)
+                _next = ret = new Iterate(f, f.invoke(seed));
+        }
+    return ret;
 }
 
 public Iterate withMeta(IPersistentMap meta){
Comment by Alex Miller [ 23/Mar/15 3:32 PM ]

It's a good question. That impl kills all of the performance improvement for iterate on the seq use case (prob the current common case). I'll take a look at it though.

Comment by Nicola Mometto [ 23/Mar/15 3:57 PM ]

My opinion is that the performance improvement for iterate that the benchmark shows won't reflect a real performance improvement in "real" usage cases of iterate where `f` actually does some computation

Comment by Nicola Mometto [ 23/Mar/15 4:04 PM ]

This is also true for its reduce impl btw.
I'm sorry if I'm pointing this out late in the life of this ticket but it never occurred to me that, contrary to cycle, repeat or range (CLJ-1515), this change for iterate will possibly make accessing large sequences slower since we lose the caching aspect of lazy-seqs.

Comment by Alex Miller [ 23/Mar/15 4:58 PM ]

Added new -15 patch, which is the same as -3-3 but caches _next for all three types. This has no impact on the reduce performance and was 2-8 us slower in the timings for the seqs above. That seems like a small hit for the reduction in perf and memory on all cases where the seq is traversed multiple times. That seems like a reasonably common thing to do, particularly for finite repeat.

Comment by Alex Miller [ 23/Mar/15 5:02 PM ]

Stu - the only change in -15 vs -3-3 is in the next() methods of each new type. Each has a new "private volatile ISeq _next;"

Cycle:

public ISeq next(){
    if(_next == null) {
        ISeq next = current.next();
        if (next != null)
            _next = new Cycle(all, next);
        else
            _next = new Cycle(all, all);
    }
    return _next;
}

Iterate:

public ISeq next(){
    if(_next == null) {
        _next = new Iterate(f, f.invoke(seed));
    }
    return _next;
}

Repeat:

public ISeq next() {
    if(_next == null) {
        if(count > 1)
            _next = new Repeat(count - 1, val);
        else if(count == INFINITE)
            _next = this;
    }
    return _next;
}
Comment by Michael Blume [ 23/Mar/15 6:47 PM ]

For Cycle, would it be worth holding the head and linking back to it when we loop, so that the underlying sequence only has to be traversed once? Something like this

https://github.com/MichaelBlume/clojure/commit/cache-cycle-head

On my machine this saves about 10 microseconds on

(doall (take 1000 (cycle [1 2 3])))

Comment by Michael Blume [ 23/Mar/15 7:07 PM ]

Also, for the Cycle reduce implementations, it might make sense to assume the underlying sequence is going to be traversed many times, copy it into an array, and loop over it with an index.

Comment by Michael Blume [ 23/Mar/15 7:17 PM ]

like so: https://github.com/MichaelBlume/clojure/commit/array-loop

Comment by Alex Miller [ 23/Mar/15 8:28 PM ]

The problem with holding a pointer back to the head is that the intervening cycle can be arbitrarily large. You are then literally holding the head on an arbitrary size seq. A silly example that fails with the patch but works without it (depending on your jvm size):

(dorun (take 10000000 (cycle (range 10000000))))

I played with the cycle reduce too but it has the same effective potential issue of creating and holding an arbitrarily sized array. Actually this one is worse in that it realizes an arbitrarily large array before knowing how many inputs it will need - it could be a take 1. Also, things that formerly worked could blow up with an oome. What if someone relies on passing an infinite sequence to cycle in a fallthrough case? For example, this works with the current version but would blow up with the array.

(into [] (take 1000) (cycle (iterate inc 0)))

Neither of these seem worth the risk to me.

Comment by Michael Blume [ 23/Mar/15 10:53 PM ]

Yeah, fair enough.

Comment by Alex Miller [ 25/Mar/15 4:14 PM ]

updated version of the -15 patch that gives proper credit to Ghadi for part.





[CLJ-1115] multi arity into Created: 25/Nov/12  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Trivial
Reporter: Yongqian Li Assignee: Unassigned
Resolution: Declined Votes: 1
Labels: None

Attachments: File multi-arity-into.diff    
Patch: Code and Test

 Description   

Any reason why into isn't multi arity?

(into to & froms) => (reduce into to froms)

(into #{} [3 3 4] [2 1] ["a"]) looks better than (reduce into #{} [[3 3 4] [2 1] ["a"]])



 Comments   
Comment by Timothy Baldridge [ 27/Nov/12 11:25 AM ]

Seems to be a valid enhancement. I can't see any issues we'd have with it. Vetted.

Comment by Timothy Baldridge [ 29/Nov/12 2:06 PM ]

Added patch & test. This patch retains the old performance characteristics of into in the case that there is only one collection argument. For example: (into [] [1 2 3]) .

Since the multi-arity version will be slightly slower, I opted to provide it as a second body instead of unifying both into a single body. If someone has a problem with this, I can rewrite the patch. At least this way, into won't get slower.

Comment by Rich Hickey [ 09/Dec/12 7:39 AM ]

This is a good example of an idea for an enhancement I haven't approved, and thus is not yet vetted.

Comment by Yongqian Li [ 05/Feb/14 1:16 AM ]

What's the plan for this bug?

Comment by Alex Miller [ 05/Feb/14 11:27 AM ]

@Yongqian This ticket is one of several hundred open enhancements right now. It is hard to predict when it will be considered. Voting (or encouraging others to vote) on an issue will raise it's priority. The list at http://dev.clojure.org/display/community/Screening+Tickets indicates how we prioritize triage.

Comment by Andy Fingerhut [ 06/Aug/14 2:12 PM ]

Patch multi-arity-into.diff dated Nov 29 2012 no longer applied cleanly to latest Clojure master due to commits made earlier today. Those changes add another arity to the function clojure.core/into, so the multi-arity generalization of into seems unlikely to be compatible with those changes.

Comment by Nicola Mometto [ 25/Mar/15 3:33 PM ]

This is probably out of the question now that into has an additional arity for transducers

Comment by Alex Miller [ 25/Mar/15 3:41 PM ]

That is true. Also, you can use the cat transducer to do something like this:

(into #{} cat [[3 3 4] [2 1] ["a"]])




[CLJ-273] def with a function value returns meta {:macro false}, but def itself doesn't have meta Created: 23/Feb/10  Updated: 25/Mar/15  Resolved: 25/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Anonymous Assignee: Rich Hickey
Resolution: Not Reproducible Votes: 0
Labels: None


 Description   

On the master (1.2) branch, if you create a def with an initial function value, {:macro false} is added to the metadata of the return value for def. However, if you look again at the metadata on the var itself, the {:macro false} is not present! This breaks the use of contrib's defalias when aliasing macros, because the new alias is marked as {:macro false}.

The code below demonstrates the issue, which was introduced in http://github.com/richhickey/clojure/commit/430dd4fa711d0008137d7a82d4b4cd27b6e2d6d1, "metadata for fns."

;; all running on 1.2, DIFF noted in comments

(defmacro foo [])
-> #'user/foo

(meta (def bar (.getRoot #'foo)))
-> {:macro false, :ns #<Namespace user>, :name bar, :file "NO_SOURCE_PATH", :line 83}
;; DIFF: where did that :macro false come from??

(def bar (.getRoot #'foo))
-> #'user/bar

(meta #'bar)
-> {:ns #<Namespace user>, :name bar, :file "NO_SOURCE_PATH", :line 84}
;; LIKE 1.1, but really weird: now the :macro false is gone again!


 Comments   
Comment by Assembla Importer [ 24/Aug/10 9:32 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/273

Comment by Kevin Downey [ 17/Apr/14 10:27 PM ]

this belongs in the isssues for one of the new contrib projects (if defalias ever got moved)

Comment by Alex Miller [ 17/Apr/14 10:35 PM ]

actually, it sounds like defalias is just a place where the issue is observed but the issue is in core. The old contrib/def.clj moved to core.incubator, but defalias did not go with it.

Comment by Nicola Mometto [ 18/Apr/14 9:12 AM ]

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Var.java#L270 this is probably relevant

Comment by Nicola Mometto [ 25/Mar/15 11:03 AM ]

as of 1.7.0-master-SNAPSHOT this is not reproducible (even replacing .getRoot with .getRawRoot)

Comment by Alex Miller [ 25/Mar/15 12:49 PM ]

see comment





[CLJ-1680] quot and rem handle doubles badly Created: 24/Mar/15  Updated: 25/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Francis Avila Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: math

Attachments: Text File clj-1680_no_div0.patch    
Patch: Code and Test
Approval: Triaged

 Description   

quot and rem in the doubles case (where any one of the arguments is a floating point) gives strange results for non-finite arguments:

(quot Double/POSITIVE_INFINITY 2) ; Java: Infinity
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(quot 0 Double/NaN) ; Java: NaN
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(quot Double/POSITIVE_INFINITY Double/POSITIVE_INFINITY) ; Java: NaN
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(rem Double/POSITIVE_INFINITY 2) ; Java: NaN
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(rem 0 Double/NaN) ; Java: NaN
NumberFormatException Infinite or NaN  java.math.BigDecimal.<init> (BigDecimal.java:808)
(rem 1 Double/POSITIVE_INFINITY) ; The strangest one. Java: 1.0
=> NaN

quot and rem also do divide-by-zero checks for doubles, which is inconsistent with how doubles act for division:

(/ 1.0 0)
=> NaN
(quot 1.0 0) ; Java: NaN
ArithmeticException Divide by zero  clojure.lang.Numbers.quotient (Numbers.java:176)
(rem 1.0 0); Java: NaN
ArithmeticException Divide by zero  clojure.lang.Numbers.remainder (Numbers.java:191)

Attached patch does not address this because I'm not sure if this is intended behavior. There were no tests asserting any of the behavior mentioned above.

Fundamentally the reason for this behavior is that the implementation for quot and rem sometimes (when result if division larger than a long) take a double, coerce it to BigDecimal, then BigInteger, then back to a double. The coersion means it can't handle nonfinite intermediate values. All of this is completely unnecessary, and I think is just leftover detritus from when these methods used to return a boxed integer type (long or BigInteger). That changed at this commit to return primitive doubles but the method body was not refactored extensively enough.

The method bodies should instead be simply:

static public double quotient(double n, double d){
    if(d == 0)
        throw new ArithmeticException("Divide by zero");
    double q = n / d;
    return (q >= 0) ? Math.floor(q) : Math.ceil(q);
}

static public double remainder(double n, double d){
    if(d == 0)
        throw new ArithmeticException("Divide by zero");
    return n % d;
}

Which is what the attached patch does. (And I'm not even sure the d==0 check is appropriate.)

Even if exploding on non-finite results is a desirable property of quot and rem, there is no need for the BigDecimal+BigInteger coersion. I can prepare a patch that preserves existing behavior but is more efficient.

More discussion at Clojure dev.



 Comments   
Comment by Francis Avila [ 24/Mar/15 12:55 PM ]

More testing revealed that n % d does not preserve the relation (= n (+ (* d (quot n d)) (rem n d))) as well as (n - d * (quot n d)), which doesn't make sense to me since that is the very relation the spec says % preserves. % is apparently not simply Math.IEEEremainder() with a different quotient rounding.

Test case: (rem 40.0 0.1) == 0.0; 40.0 % 0.1 == 0.0999... (Smaller numerators will still not land at 0 precisely, but land closer than % does.)

Updated patch which rolls back some parts of the simplification to remainder and adds this test case.





[CLJ-1682] clojure.set/intersection occasionally allows non-set arguments. Created: 24/Mar/15  Updated: 24/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Valerie Houseman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs

Approval: Triaged

 Description   

clojure.set/intersection, by intent and documentation, is meant to be operations between two sets. However, it sometimes allows (and returns correct opreations upon) non-set arguments. This confuses the intention that non-set arguments are not to be used.

Here's an example with Set vs. KeySeq:
If there happens to be an intersection, you'll get a result. This may lead someone coding this to think that's okay, or to not notice they've used an incompatible data type. As soon as the intersection is empty, however, an appropriate type error ensues, albeit by accident because the first argument to clojure.core/disj should be a set.

user=> (require '[clojure.set :refer [intersection]])
nil
user=> (intersection #{:key_1 :key_2} (keys {:key_1 "na"}))   ;This works, but shouldn't
(:key_1)
user=> (intersection #{:key_1 :key_2} (keys {:key_3 "na"}))   ;This fails, because intersection assumes the second argument was a Set
ClassCastException clojure.lang.APersistentMap$KeySeq cannot be cast to clojure.lang.IPersistentSet  clojure.core/disj (core.clj:1449)

(disj (keys {:key_1 "na"}) #{:key_1 :key_2})   ;The assumption that intersection made
ClassCastException clojure.lang.APersistentMap$KeySeq cannot be cast to clojure.lang.IPersistentSet  clojure.core/disj (core.clj:1449)

Enforcing type security on a library that's clearly meant for a particular type seems like the responsible thing to do. It prevents buggy code from being unknowingly accepted as correct, until the right data comes along to step on the bear trap.



 Comments   
Comment by Valerie Houseman [ 24/Mar/15 6:11 PM ]

Please reroute to the CLJ project, as I lack access to do so - my bad.

Comment by Andy Fingerhut [ 24/Mar/15 7:19 PM ]

CLJ-810 was similar, except it was for function clojure.set/difference. That one was declined with the comment "set/difference's behavior is not documented if you don't pass in a set." I do not know what core team will judge ought to be done with this ticket, but wanted to provide some history.

Dynalint [1] and I think perhaps Dire [2] can be used to add dynamic argument checking to core functions.

[1] https://github.com/frenchy64/dynalint
[2] https://github.com/MichaelDrogalis/dire

Comment by Alex Miller [ 24/Mar/15 9:00 PM ]

Now that `set` is faster for sets, I think we could actually add checking for sets in some places where we might not have before. So, it's worth looking at with fresh eyes.





[CLJ-1237] reduce gives a SO for pathological seqs Created: 27/Jul/13  Updated: 24/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Gary Fredericks Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: None
Environment:

1.5.1


Attachments: Text File CLJ-1237c.patch    
Patch: Code and Test

 Description   

reduce gives a StackOverflowError on long sequences that contain many transitions between chunked and unchunked:

(->> (repeat 50000 (cons :x [:y]))
     (apply concat)
     (reduce (constantly nil)))
;; throws StackOverflowError

Such a sequence is well behaved under most other sequence operations, and its underlying structure can even be masked such that reduce succeeds:

(->> (repeat 50000 (cons :x [:y]))
     (apply concat)
     (take 10000000)
     (reduce (constantly nil)))
;; => nil

I don't think Clojure developers normally worry about mixing chunked and unchunked seqs, so the existence of such a sequence is not at all unreasonable (and indeed this happened to me at work and was very difficult to debug).

It seems obvious what causes this given the implementation of reduce – it bounces back and forth between the chunked impl and the unchunked impl, consuming more and more stack as it goes. Without proper tail call optimization, it's not obvious to me what a good fix would be.

Presumed bad solutions

Degrade to naive impl after first chunk

In the IChunkedSeq implementation, instead of calling internal-reduce when the
sequence stops being chunked, it could have an (inlined?) unoptimized implementation,
ensuring that no further stack space is taken up. This retains the behavior that a
generic seq with a chunked tail will still run in an optimized fashion, but a seq with
two chunked portions would only be optimized the first time.

Use clojure.core/trampoline

This would presumably work, but requires wrapping the normal return values from all
implementations of internal-reduce.

Proposed Solution

(attached as CLJ-1237c.patch)

Similar to using trampoline, but create a special type (Unreduced) that signals
an implementation change. The two implementation-change points in internal-reduce
(in the IChunkedSeq impl and the Object impl) are converted to return an instance
of Unreduced instead of a direct call to internal-reduce.

Then seq-reduce is converted to check for instances of Unreduced before returning,
and recurs if it finds one.

Pros

  • Only requires one additional check in most cases
  • Reduces stack usage for existing heterogeneous reductions that weren't extreme enough to crash
  • Should be compatible with 3rd-party implementations of internal-reduce, which can still use the old style (direct recursive calls to internal-reduce) or the optimized style if desired.

Cons

  • internal-reduce is slightly more complicated
  • There's an extra check at the end of seq-reduce – is that a performance worry?


 Comments   
Comment by Gary Fredericks [ 25/Aug/13 4:13 PM ]

Added patch.

Comment by Gary Fredericks [ 24/Mar/15 11:33 AM ]

My coworker just ran into this, independently.





[CLJ-1602] vals and keys return values should implement IReduceInit or Iterable Created: 25/Nov/14  Updated: 23/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Stuart Halloway Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File clj-1602-2.diff     File clj-1602-3.diff     File clj-1602-4.diff     File clj-1602-5.diff     File clj-1602-6.diff     File clj-1602.diff    
Patch: Code and Test
Approval: Screened

 Description   

Background: clojure.core/keys calls RT.keys(Object) calls APersistentMap.KeySeq.create(ISeq). RT.keys() creates a sequence (of Map.Entry objects) and KeySeq just wraps it, calling .keyValue(). There is an equivalent vals -> RT.vals() -> ValSeq path. Both of these seq impls extend ASeq and provide Iterable implementations via SeqIterator (iterator wrapped over the seq).

Approach: The important thing here is to avoid creating the sequence and instead directly iterate/reduce over the map. Noting that CLJ-1499 provides support for making PHM directly Iterable and that KeySeq/ValSeq already implement Iterable, I chose to focus on making the instances returned from keys and vals support Iterable directly on the underlying map instead of via the seq.

RT.keys()/vals() created the seq and passed it to KeySeq/ValSeq which made it too late to directly cover the original map iterator. There are a few places that rely on passing a seq of Map.Entry to keys/vals (not just a map instance), so I check for IPersistentMap and in that case pass it directly to a new KeySeq factory method that remembers both the Iterable and the ISeq. There is also support in here for using a direct key/value iterator via IMapIterable as introduced in CLJ-1499.

Questions/notes:

  • Could potentially check for Map or Iterable instead of IPersistentMap in RT.keys()/vals(). Not sure how common it is to pass normal Java maps to keys/vals.
  • The direct Iterable support vanishes once you move off the head of the keys or vals seq. So (rest (keys map)) does not have Iterable support. This is not really possible unless you hold an Iterator and advance it along with the seq, but that seemed to introduce all sorts of possibilities for badness. Since maps are unordered, it seems weird to rely on any ordering or processing only parts of any map, so I suspect doing this would be quite rare.
  • This patch depends on CLJ-1499 for IMapIterable.

Performance: I tested perf using criterium to benchmark as follows:

(use 'criterium.core)
(def m (zipmap (range 1000) (range 1000)))
(bench (reduce + (keys m)))
(bench (reduce + (vals m)))
(bench expr) 1.6.0 1.7.0-alpha5 alpha5 + clj-1499 + patch
(reduce + (keys m)) 69 µs 73 µs 44 µs
(reduce + (vals m)) 75 µs 77 µs 50 µs

Tests:

  • I added some basic tests for subseq and rsubseq as those both rely on the somewhat special behavior of keys accepting a seqable of Map.Entry objects (not just a map itself). There were no other tests for subseq or rsubseq already present.
  • Some coverage from CLJ-1499 tests d760db
  • Show that metadata works on keys/vals
  • Show that iterator works correctly after you step off the head of the seq, invalidating the iterable
  • Show that the fallthrough branch of iterator works (from PersistentTreeMap)

Patch: clj-1602-6.diff

Screened by:

  • the changes in CollReduce are unlikely to stand but are currently essential


 Comments   
Comment by Alex Miller [ 25/Nov/14 11:53 AM ]

Could leverage CLJ-1499 for the bulk of this, may pull that back from 1.8 into 1.7. Waiting on further work till that's answered.

Comment by Alex Miller [ 03/Dec/14 11:24 PM ]

I also have a patch that extends the CLJ-1499 iterators to support providing both key and val iterators that do not require creating and unpacking a Map.Entry. Unfortunately I only saw times that were ~48 µs on the perf benchmark in the description, so it's not a huge benefit (short-lived object allocation is cheap).

Comment by Alex Miller [ 09/Jan/15 9:15 AM ]

waiting on 1499 mods

Comment by Alex Miller [ 14/Jan/15 5:42 PM ]

Updated patch based on latest CLJ-1499-v10 patch.

Comment by Michael Blume [ 15/Jan/15 1:48 AM ]

I find it concerning that the v9 patch passed generative tests, shouldn't we have seen that?

Comment by Alex Miller [ 15/Jan/15 7:21 AM ]

Actually I did see it in the generative tests for this ticket so I moved those tests into the CLJ-1499 patch.

Comment by Michael Blume [ 15/Jan/15 10:10 AM ]

Ah, cool =)

Comment by Stuart Halloway [ 22/Mar/15 8:16 AM ]

clj-1602-5.diff merely updates -4 to apply on current master

Comment by Stuart Halloway [ 22/Mar/15 8:37 AM ]

requested some tests in the description

Comment by Alex Miller [ 23/Mar/15 1:51 PM ]

Updated to -6 patch that adds the requested tests.





[CLJ-1679] Add fast path in seq comparison for structurally sharing seqs Created: 21/Mar/15  Updated: 23/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: collections, seq

Attachments: Text File 0001-CLJ-1679-do-pointer-checks-in-seq-equality.patch     Text File CLJ-1679-v2.patch     Text File CLJ-1679-v3.patch    
Patch: Code and Test

 Description   

Currently comparing two non identical seqs requires iterating through both seqs comparing value by value, ignoring the possibility of seq `a` and `b` having the same (pointer-equal) rest.

The proposed patch adds a pointer equality check on the seq tails that can make the equality short-circuit if the test returns true, which is helpful when comparing large (or possibly infinite) seqs that share a common subseq.

After this patch, comparisons like

(let [x (range)] (= x (cons 0 (rest x))))
which currently don't terminate, return true in constant time.

Patch: CLJ-1679-v3.patch



 Comments   
Comment by Michael Blume [ 23/Mar/15 12:52 PM ]

When this test fails (it fails on my master, but I've got a bunch of other development patches, I'm still figuring out where the conflict is), it fails by hanging forever. Maybe it'd be better to check equality in a future and time out the future?

Comment by Michael Blume [ 23/Mar/15 1:01 PM ]

like so =)

Comment by Nicola Mometto [ 23/Mar/15 1:11 PM ]

Makes sense, thanks for the updated patch

Comment by Michael Blume [ 23/Mar/15 1:24 PM ]

Hm, previous patch had a problem where the reporting logic still tries to force the sequence and OOMs, this patch prevents that.

Comment by Michael Blume [ 23/Mar/15 1:41 PM ]

ok, looks like CLJ-1515, CLJ-1603, and this patch, all combine to fail together, though any two of them work fine.

Comment by Michael Blume [ 23/Mar/15 1:43 PM ]

(And really there's nothing wrong with the source of this patch, it still works nicely to short-circuit = where there's structural sharing, it's just that the other two patches break structural sharing for ranges, so the test fails)

Comment by Nicola Mometto [ 23/Mar/15 2:02 PM ]

I see, I guess we'll have to change the test if the patches for those tickets get applied.





[CLJ-1667] Socket test can fail if hard-coded port is unavailable Created: 26/Feb/15  Updated: 22/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: io, test

Attachments: Text File socket-test.patch    
Patch: Code
Approval: Screened

 Description   

I was unable to run the Clojure tests due to this problem. There is a test that hardcodes a port and something else on my machine happened to be using that port.

The patch avoids binding a hard-coded port in the test.

Patch: socket-test.patch

Screened by:



 Comments   
Comment by Andy Fingerhut [ 26/Feb/15 11:31 AM ]

I used to try running the prescreen tests in parallel for two different JDKs on the same machine, and I probably stopped doing that because of this. My use case is a very unusual one, and not a good reason to change this by itself, but my use case certainly made this conflict happen regularly.

Comment by Alex Miller [ 26/Feb/15 11:57 AM ]

No good reason not to fix it! Silly test.





[CLJ-1669] Move LazyTransformer to an iterator strategy, extend eduction capabilities Created: 04/Mar/15  Updated: 21/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: transducers

Attachments: Text File clj-1669-2.patch     Text File clj-1669-3.patch     Text File clj-1669-4.patch     Text File clj-1669-5.patch     Text File clj-1669.patch    
Patch: Code and Test
Approval: Vetted

 Description   

LazyTransformer does a lot of work to be a seq. Instead, switch to creating a transforming iterator and leverage the existing IteratorSeq to convert it to a seq. Change sequence to use this approach.
Change Eduction to provide iteration directly via the transforming iterator.
Extend eduction to support multiple xforms.

Performance:

(use 'criterium.core)
(def s (range 1000))
(def v (vec s))
(def s50 (range 50))
(def v50 (vec s50))

expr master s master v 1669-3 s 1669-3 v 1669-5 s 1669-5 v
non-chunking transform            
(into [] (->> s (interpose 5) (partition-all 2))) 466 us 459 us 525 us 508 us 476 us 501 us
(into [] (->> s (eduction (interpose 5) (partition-all 2)))) * 113 us 112 us 117 us 122 us 108 us 108 us
1 chunking transform            
(into [] (map inc s)) 28 us 31 us 30 us 31 us 30 us 31 us
(into [] (map inc) s) 17 us 19 us 19 us 20 us 19 us 17 us
(into [] (sequence (map inc) s)) 58 us 46 us 142 us 148 us 94 us 67 us
(into [] (eduction (map inc) s)) 21 us 20 us 25 us 21 us 23 us 21 us
(doall (map inc (eduction (map inc) s))) 219 us 208 us 204 us 191 us 117 us 97 us
2 chunking transforms        
(into [] (map inc (map inc s))) 49 us 50 us 50 us 50 us 54 us 55 us
(into [] (comp (map inc) (map inc)) s) 23 us 23 us 28 us 23 us 23 us 23 us
(into [] (sequence (comp (map inc) (map inc)) s)) 73 us 58 us 144 us 135 us 104 us 82 us
(into [] (eduction (map inc) (map inc) s)) * 54 us 51 us 54 us 29 us 55 us 30 us
(doall (map inc (eduction (map inc) (map inc) s))) * 230 us 213 us 213 us 196 us 124 us 104 us
expand transform            
(into [] (mapcat range (map inc s50))) 83 us 81 us 80 us 84 us 71 us 72 us
(into [] (sequence (comp (map inc) (mapcat range)) s50)) 122 us 117 us 256 us 254 us 161 us 156 us
(into [] (eduction (map inc) (mapcat range) s50)) * 78 us 79 us 80 us 82 us 60 us 61 us
materialized eduction            
(sort (eduction (map inc) s)) ERR ERR 120 us 84 us 106 us 89 us
(->> s (filter odd?) (map str) (sort-by last)) 1.13 ms 1.21 ms 1.19 ms 1.20 ms 1.19 ms 1.20 ms
(->> s (eduction (filter odd?) (map str)) (sort-by last)) ERR ERR 1.18 ms 1.17 ms 1.22 ms 1.23 ms
  • used comp to combine xforms as eduction only took one in the before case

Patch: clj-1669-5.patch

Screened by:



 Comments   
Comment by Michael Blume [ 05/Mar/15 3:52 PM ]

Nice, I like the direction on this.

CLJ-1515 currently breaks this patch (LongRange cannot be converted to Iterable), but I imagine that'll get better when it absorbs the changes from CLJ-1603

Comment by Alex Miller [ 06/Mar/15 8:11 AM ]

Yeah. colls should be mapped through RT.iter() to catch more cases.

Comment by Alex Miller [ 06/Mar/15 9:52 AM ]

To do:

  • remove Seqable from Eduction
  • support Iterable in RT.toArray()
  • more eduction pipeline tests that require realization at end
Comment by Alex Miller [ 06/Mar/15 1:00 PM ]

Perf numbers show pretty worse results from sequence, will dig in further.

Comment by Alex Miller [ 13/Mar/15 7:41 AM ]

For the s timings, we've actually introduced more steps into the stack:

OLD reduce with s:

LazyTransformer
   seq (range) - every transformation is another layer here

NEW reduce with s:

IteratorSeq 
  TransformingIterator (handles N transformations in 1 step)
    SeqIterator
      seq (range)
Comment by Alex Miller [ 20/Mar/15 10:08 AM ]

Look at perf for:

  • ->> eduction transformation
  • transformation comparison that doesn't support chunking
  • more into vector iteration case
Comment by Alex Miller [ 21/Mar/15 8:45 AM ]

The -5 patch is same -3 except all uses of IteratorSeq have been replaced with a ChunkedCons that is effectively a chunked version of the old IteratorSeq. While no one calls it, I left IteratorSeq in the code base in case of regression.

Generally, the chunked iterator seq reduces the cost in a number of the worst cases and also is a clear benefit in making seqs over a result of eduction or sequence faster to traverse (as they are now chunked).

I think the one potential issue is that seqs over iterators are now chunked when they were not before which could change programs that expect their stateful iterator to be traversed one at a time. This change could be isolated to just to sequence and seq-iterator and mitigated by not changing RT.seqFrom() and seq-iterator to use the new chunking behavior only in sequence and/or with a new chunked-iterator-seq to make it more explicit. The sequence over xf is new so no possible regression there, everything else would just be opt-in.





[CLJ-1424] Reader conditionals Created: 15/May/14  Updated: 20/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Critical
Reporter: Ghadi Shayban Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: reader

Attachments: File clj-1424-10.diff     Text File clj-1424-11.patch     File CLJ-1424-2.diff     File clj-1424-3.diff     File clj-1424-4.diff     File clj-1424-5.diff     File clj-1424-6.diff     File clj-1424-7.diff     File clj-1424-8.diff     File clj-1424-9.diff     File clojure-feature-expressions.diff    
Patch: Code and Test
Approval: Screened

 Description   

See http://dev.clojure.org/display/design/Reader+Conditionals for design.

The implementation details contain several major parts:

1) LispReader changes to implement new conditional read syntax and features.

In RT:

  • suppress-read - new dynamic var
  • readString() - new arity that takes opts

In LispReader:

  • read(...) entry point - added new variants that take opts. The opts-only form will decode :eof into the correct values for eofIsError and eofValue. Internal forms take both opts and pendingForms (list).
  • All reader invoke dispatch methods now take both opts and pendingForms which get drug around the call stack.
  • Defined all option keys and well-known allowed values - OPT_EOF, OPT_FEATURES, OPT_READ_COND, EOFTHROW, PLATFORM_KEY, PLATFORM_FEATURES, COND_ALLOW, COND_PRESERVE.
  • read(...) entry point will ensure installation of {:features #{:clj}}, appropriately merging others or none. There is no path that will pass :features inside Clojure but users can call LispReader with it directly.
  • readDelimitedList() was refactored so it can share logic with readCondDelimited().
  • Luke can provide more detail on the guts of reader conditional reading if needed.

New ReaderConditional and TaggedLiteral types.

In clojure.core:

  • tagged-literal?, tagged-literal, reader-conditional?, reader-conditional - new functions to support the preserved data form of these
  • new print-method impls for TaggedLiteral and ReaderConditional
  • read - added new arity that takes opts and explain opts in docstring
  • read-string - added new arity that takes opts (placement matches edn/read-string and is designed for partial application)

2) cljc file changes.
In Compiler:

  • OPTS_COND_ALLOWED - keep around a map {:read-cond :allow} to invoke reader when reading .cljc files.
  • readerOpts() - new private function to determine reader options to use based on file name (.clj vs .cljc)
  • load() - use readerOpts() and invoke LispReader with them
  • compile() - use readerOpts() and invoke LispReader with them

In RT:

  • load(...) - check first for .clj file, then for .cljc file when looking by ns

In clojure.main - update code to allow for .clj and .cljc files
In clojure.repl - update code to allow for .clj and .cljc files

3) Tests + infrastructure updates.

  • Move test/clojure/test_clojure/reader.clj to test/clojure/test_clojure/reader.cljc so that we can test reader conditionals.
  • Bump version of test.generative to get newer version that depends on newer tools.namespace which is able to find .cljc files as source files. The test script at src/script/run_test.clj uses this to find Clojure test files and without the bump, it will not find the new reader.cljc test ns. Note: if you use the ant build, you need to re-run antsetup.sh for this to take effect.
  • src/script/run_test.clj - as part of the version bump, move from deprecated namespace to new namespace
  • Added lots of tests in reader.cljc

Patch: clj-1424-11.patch

Screened by:

Related: CLJS-27, TRDR-14



 Comments   
Comment by Jozef Wagner [ 16/May/14 2:19 AM ]

Has there been a decision that CL syntax is going to be used? Related discussion can be found at design page, google groups discussion and another discussion.

Comment by Alex Miller [ 16/May/14 8:34 AM ]

No, no decisions on anything yet.

Comment by Ghadi Shayban [ 19/May/14 7:25 PM ]

Just to echo a comment from TRDR-14:

This is WIP and just one approach for feature expressions. There seem to be at least two couple diverging approaches emerging from the various discussion (Brandon Bloom's idea of read-time splicing being the other.)

In any case having all Clojure platforms be ready for the change is probably essential. Also backwards compatibility of feature expr code to Clojure 1.6 and below is also not trivial.

Comment by Kevin Downey [ 04/Aug/14 1:39 PM ]

if you have ever tried to do tooling for a language where the "parser" tossed out information or did some partial evaluation, it is a pain. this is basically what the #+cljs style feature expressions and bbloom's read time splicing both do with clojure's reader. I think resolving this at read time instead of having the compiler do it before macro expansion is a huge mistake and makes the reader much less useful for reading code.

Comment by Ghadi Shayban [ 04/Aug/14 2:00 PM ]

Kevin, what kind of tooling use case are you alluding to?

Comment by Kevin Downey [ 04/Aug/14 3:24 PM ]

any use case that involves reading code and not immediately handing it off to the compiler. if I wanted to write a little snippet to read in a function, add an unused argument to every arity then pprint it back, reader resolved feature expressions would not round trip.

if I want to write snippet of code to generate all the methods for a deftype (not a macro, just at the repl write a `for` expression) I can generate a clojure data structure, call pprint on it, then paste it in as code, reader feature expressions don't have a representation as data so I cannot do that, I would have to generate strings directly.

Comment by Alex Miller [ 22/Aug/14 9:10 AM ]

Changing Patch setting so this is not in Screenable yet (as it's still a wip).

Comment by Alex Miller [ 07/Nov/14 4:39 PM ]

Latest patch brings up to par with related patches in CLJS-27 and TRDR-14 and importantly adds support for loading .cljc files as Clojure files.

Comment by Andy Fingerhut [ 07/Nov/14 5:55 PM ]

Maybe undesirable behavior demonstrated below with latest Clojure master plus patch clj-1424-3.diff, due to the #+cljs skipping the comment, but not the (dec a). I thought it could be fixed simply by moving RT.suppressRead() check after (ret == r) check in read(), but that isn't correct.

user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs (dec a))")
(defn foo [a] (inc a))
user=> (read-string "(defn foo [a] #+clj (inc a) #+cljs ; foo\n (dec a))")
(defn foo [a] (inc a) (dec a))
Comment by Alex Miller [ 21/Jan/15 4:28 PM ]

Added new clj-1424-4.diff which makes a couple of modifications:

  • removed support for and/or/not (#+ and #- remain)
  • *features* has been removed
  • if you wish to have a custom feature set while reading, there is a new option map that can be passed to read (this all parallels similar changes previously made to the edn reader)

Example of adding a "custom" feature to the feature set (which will always contain "clj" feature):

(read 
  {:features #{:custom}} 
  (java.io.PushbackReader. (java.io.StringReader. "[#+custom :x]")))
Comment by Andy Fingerhut [ 21/Jan/15 5:01 PM ]

Latest patch clj-1424-4.diff also exhibits maybe-undesirable behavior in which #+cljs can suppress an immediately following comment, rather than the form following it. See 07/Nov/14 comment with example above.

Comment by Alex Miller [ 21/Jan/15 6:16 PM ]

Thanks Andy, I'm aware. Haven't looked at it yet.

Comment by Luke VanderHart [ 25/Jan/15 9:26 PM ]

Patch clj-1424-5.diff modifies the code to use "read-conditionals", as outlined by Rich at: https://groups.google.com/d/msg/clojure-dev/LW0ocQ1RcYI/IBPPyfCpM3kJ

Comment by Alex Miller [ 26/Jan/15 12:33 PM ]

Some feedback:

1) Because pendingForms is an internal thing, I would make the read() that takes it non-public.
2) In readDelimitedList, I don't see the point of constructing a new LinkedList then checking if it's empty there. Should just make the add conditional on whether it's null or not.
3) You could treat pendingForms as a Deque (which LinkedList implements) and then use pop() instead of remove(0). The addAll(0, ...) is more painful to replicate though if you're sticking to Deque. I think I'd be tempted to just commit explicitly to LinkedList for pendingForms since we fully control the construction and use of it within the reader.
4) Might be nice to update the commented-out readers to support pendingForms as I did with opts. Or remove the updates for opts. Should either do all the mods or none on the commented-out code.
5) s/read-cond-splicing/read-cond-splice/ ? Seems like where it's used it should be a verb.
6) Should just use :default and make :else and :none throw exceptions. I think Rich mentioned :except or :exception too? or maybe I misheard that.
7) Should have some more tests to tweak the error cases - bad feature, uneven forms, default out of allowed position, bad contents for splice, etc.

Comment by Alex Miller [ 26/Jan/15 2:01 PM ]

From Chouser on the mailing list: "is it intentional that reading (clojure.core/read-cond ...) does not behave the same as (#? ...)? That is, (#? ...) can be read as c.c/read-cond depending on read options, but having been read, if it is printed again it doesn't round-trip back to #?. This is different, for example, from how #(...) is read as (fn* [] (...)), which then retains its meaning."

In shouldReadConditionally(), it looks like the == check vs READ_COND will not work. Instead of:
return (first == READ_COND || first == READ_COND_SPLICING);
do
return (READ_COND.equals(first) || READ_COND_SPLICING.equals(first));

For example, this test doesn't seem to give the right answer:

user=> (read-str-opts {:preserve-read-cond false} "(clojure.core/read-cond :clj :x :default :y)")
(clojure.core/read-cond :clj :x :default :y)    ;; should be :x
Comment by Michael Blume [ 26/Jan/15 3:27 PM ]

With this patch applied to master, lein check fails on instaparse:

Compiling namespace instaparse.abnf
Exception in thread "main" clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader, compiling:(abnf.clj:186:28)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3605)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3599)
	at clojure.lang.Compiler$DefExpr.eval(Compiler.java:436)
	at clojure.lang.Compiler.eval(Compiler.java:6772)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.RT.loadResourceScript(RT.java:384)
	at clojure.lang.RT.loadResourceScript(RT.java:375)
	at clojure.lang.RT.load(RT.java:459)
	at clojure.lang.RT.load(RT.java:425)
	at clojure.core$load$fn__5424.invoke(core.clj:5850)
	at clojure.core$load.doInvoke(core.clj:5849)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at user$eval52$fn__63.invoke(form-init5310597017138984927.clj:1)
	at user$eval52.invoke(form-init5310597017138984927.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6767)
	at clojure.lang.Compiler.eval(Compiler.java:6757)
	at clojure.lang.Compiler.load(Compiler.java:7194)
	at clojure.lang.Compiler.loadFile(Compiler.java:7150)
	at clojure.main$load_script.invoke(main.clj:275)
	at clojure.main$init_opt.invoke(main.clj:280)
	at clojure.main$initialize.invoke(main.clj:308)
	at clojure.main$null_opt.invoke(main.clj:343)
	at clojure.main$main.doInvoke(main.clj:421)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: clojure.lang.ArityException: Wrong number of args (2) passed to: StringReader
	at clojure.lang.AFn.throwArity(AFn.java:429)
	at clojure.lang.AFn.invoke(AFn.java:36)
	at instaparse.cfg$eval800$safe_read_string__801.invoke(cfg.clj:163)
	at instaparse.cfg$process_string.invoke(cfg.clj:180)
	at instaparse.cfg$build_rule.invoke(cfg.clj:217)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:214)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:215)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core$apply.invoke(core.clj:626)
	at instaparse.cfg$build_rule.invoke(cfg.clj:211)
	at instaparse.cfg$build_rule.invoke(cfg.clj:207)
	at clojure.core$map$fn__4523.invoke(core.clj:2612)
	at clojure.lang.LazySeq.sval(LazySeq.java:40)
	at clojure.lang.LazySeq.seq(LazySeq.java:49)
	at clojure.lang.RT.seq(RT.java:504)
	at clojure.core$seq__4103.invoke(core.clj:135)
	at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30)
	at clojure.core.protocols$fn__6436.invoke(protocols.clj:59)
	at clojure.core.protocols$fn__6389$G__6384__6402.invoke(protocols.clj:13)
	at clojure.core$reduce.invoke(core.clj:6501)
	at clojure.core$into.invoke(core.clj:6582)
	at instaparse.cfg$ebnf.invoke(cfg.clj:277)
	at clojure.lang.AFn.applyToHelper(AFn.java:154)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3600)
	... 27 more
Failed.
Comment by Michael Blume [ 26/Jan/15 3:29 PM ]

Aha, of course, Instaparse is calling into the LispReader$StringReader directly.

Is it worth providing versions of these methods with the old arities? Or should instaparse just not be using Clojure internals this way?

Comment by Michael Blume [ 26/Jan/15 3:33 PM ]

https://github.com/Engelberg/instaparse/blob/v1.3.5/src/instaparse/cfg.clj#L159

Comment by Alex Miller [ 26/Jan/15 3:33 PM ]

Instaparse is reaching pretty deep inside implementation details here, so I'd say this should expect to break. We could back-fill the old arities here but I'd really prefer not to if possible.

Comment by Luke VanderHart [ 27/Jan/15 11:23 AM ]

clj-1424-6.diff addresses all the issues mentioned above. Per a comment from Rich, it also adds tests to ensure that nested splices work properly (they do).

There were two things from your list I didn't do, Alex:

3) I kept pendingForms as a List. Because we aren't confining ourselves to a Deque interface, I don't see the benefit of calling pop() over remove(0) (with identical semantics) as justification for over-specifying the concrete type.

5) I kept "read-cond-splicing" since it parallels the form of "unquote-splicing". Seems that those should be consistent.

Comment by Luke VanderHart [ 30/Jan/15 9:03 PM ]

clj-1424-7.diff contains Rich's "reader-conditionals" proposal.

Comment by Alex Miller [ 06/Feb/15 3:45 PM ]

ReaderConditional / TaggedLiteral
1) when patch applied I see some whitespace errors in here, also line endings seem different, might want to check it
2) a common pattern in other Java classes is private constructor and public static create() method
3) could use Util.hash() to clean up the "null->0" logic in hashCode()

LispReader
4) adds unused import: java.util.Iterator
5) it looks like returnOn flag could just be collapsed into checking if returnOnChar is non-null?
6) in readCondDelimited, EOF and FINISHED are never used and can be removed presumably

Comment by Luke VanderHart [ 07/Feb/15 1:49 PM ]

I have attached clj-1424-8.diff, which addresses your most recent comments, Alex. I formatted it using `git format patch` instead of `git diff` so it should have the email address added correctly.

Your comments are all addressed, with the exception of returnOn. I don't think that can be collapsed. You really need two values: one to say what character should cause a return, and one to say what value should be returned in that scenario. You could use a convention on the return value, I suppose, (e.g, null means a completed read) but there's already precedent for passing in the value to be returned (namely, eofValue).

Comment by Alex Miller [ 09/Feb/15 9:49 AM ]

Looks good. I think I actually mis-read what returnOn was doing, so np on that. I still see the whitespace issues and the CR/LF in those two files. Were you going to change those?

Comment by Alex Miller [ 09/Feb/15 4:24 PM ]

Added new -9 patch that squashes the last patch but is otherwise identical. The older patches in that diff were the source of still seeing whitespace errors on apply.

Comment by Rich Hickey [ 20/Feb/15 8:09 AM ]

I think in the first iteration we should allow reader conditionals only in .cljc files, and support only standard features :clj, :cljs and :clr.

Comment by Andy Fingerhut [ 27/Feb/15 10:35 AM ]

Comment on -10 patch:

The doc string additions for clojure.core/read are a bit cryptic for those not already steeped in the details. It would be good to at least mention 'reader conditionals' so people have something to Google for. Current additions:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow, or :preserve to keep all branches as data
+    :features - persistent set of feature keywords
+    :eof - on eof, return value unless :eofthrow, then throw

Alternate suggestion:

+  Opts is a persistent map with valid keys:
+    :read-cond - :allow to process reader conditionals, or :preserve to keep all branches as data.  Throw if reader conditional found and this key not present.
+    :features - persistent set of feature keywords for processing reader conditionals.
+    :eof - on eof, return value unless :eofthrow, then throw
Comment by Alex Miller [ 27/Feb/15 10:39 AM ]

Thanks Andy, will consider. Indeed, this will have lengthier docs on http://clojure.org/reader so I was trying to just hit the surface but that's a reasonable suggestion.

Comment by Fogus [ 06/Mar/15 11:10 AM ]

I've looked over the -10 patch and have some feedback.

  • clojure.core/read

I agree with Andy's recommendations about enhancing the docstring to mention reader conditionals.

  • clojure.core/read-string

I would also like to see a back-reference to the read function to refer to the available opts.

  • clojure.core/tagged-literal

A symbol is expected, so the docstring should probably mention that.

  • clojure.core/reader-conditional

I realize that the '?' implies that true/false is expected for the splicing? argument, but it might be worth saying so in the docstring.

  • clojure.core/print-method clojure.lang.TaggedLiteral
  • Question: The TaggedLiteral prints without a space, but the concrete realization will most likely print with a space (e.g. #uuid "..."). While this does not cause problems in round-tripping, should we be consistent, or was the space removed to distinguish between the TaggedLiteral and its concretion?
  • Compiler.readOpts

The #endsWith check is against the string "cljc" but elsewhere similar checks are against ".cljc". This is minor of course, but consistency is nice.

  • LispReader.read(rdr, opts)

Question: Do we really need to check for and cast to IPersistentMap? Would Map work just as well?

  • ConditionalReader.hasFeature() #1318

The code's explicitly checking for keyword-ness, but the error message thrown is abiguous. If a kw is expected then say so.

  • ConditionalReader.invoke() #1438

Can this error message be more clear? The term "read-cond form"" could just be "A reader condition's body" no? (or something like that)

  • ConditionalReader.readCondDelimited()
  • Question: If the conditional form given to read contains no features that can resolve (e.g. {{#?(:cljs [1 2 3])}} in a Clojure context) then ConditionalReader.readCondDelimited leaves the Reader instance in an empty state. I understand that the reason for this is that unpreserved conditional forms are effectively wiped from the stream leaving nothing left. This is effectively the same as running (read-string ""), but I wonder if we have enough context to report a more pointed error. Is it worth doing so?
Comment by Alex Miller [ 11/Mar/15 4:26 PM ]

Re fogus comments in new -11 patch:

  • clojure.core/read - updated docstring
  • clojure.core/read-string - clarified opts back-reference in docstring
  • clojure.core/tagged-literal - updated docstring
  • clojure.core/reader-conditional - updated docstring
  • clojure.core/print-method clojure.lang.TaggedLiteral - agreed, updated print form to match
  • Compiler.readerOpts - updated as suggested
  • LispReader.read(rdr, opts) - the IPersistentMap is intentional there and used via that interface, so no change
  • ConditionalReader.hasFeature() - improved error msg
  • ConditionalReader.invoke() - slightly altered error msg but I'm on the fence on this one. "form" is pretty precise there.
  • ConditionalReader.readCondDelimited() - this is in my opinion correct and expected as is
Comment by Fogus [ 20/Mar/15 12:33 PM ]

I'm happy with the latest changes.





[CLJ-1561] Incorrect line numbers are emitted Created: 10/Oct/14  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Defect Priority: Major
Reporter: Paul Stadig Assignee: Unassigned
Resolution: Completed Votes: 24
Labels: errormsgs

Attachments: Text File clj-1561.patch    
Patch: Code and Test
Approval: Ok

 Description   

The Clojure JVM compiler marks the line number for a form before emitting the children for that form. Marking the line number before emitting children leads to incorrect line numbers when a runtime error occurs. For example, when

 (foo bar
      baz)

is emitted the compiler will visit the line number for the expression, then emit the children expressions ('bar' and 'baz') which will mark their own line numbers, then come back and emit the invoke bytecode for 'foo', but since the last line number to be marked was that of 'baz', if 'foo' throws an exception the line number of 'baz' will be reported instead of the line number for the expression as a whole.

This same issue was being manifested with special forms and inlined functions, and was especially bad in the case of the threading macro '->', because it is usually spread across several lines, and the line number reported could end up being very different than the line actually causing an exception.

A demonstration of the incorrect line numbers (and how the fix affects line numbers) can be seen here https://github.com/pjstadig/clojure-line-numbers

Patch: clj-1561.patch

Screened by: Alex Miller



 Comments   
Comment by Jozef Wagner [ 10/Oct/14 1:57 PM ]

additions in your patch mixes tabs and spaces. Could you please update the patch so that your added lines indent only with tab characters? Not everyone has tab set at 4 spaces...

Comment by Paul Stadig [ 10/Oct/14 2:42 PM ]

There's already a mixture of just tabs, just spaces, and tabs & spaces in Compiler.java. I'm not sure what the "standard" is, but I've changed the patch to match the surrounding lines.

Comment by Paul Stadig [ 10/Oct/14 2:42 PM ]

Patch with whitespace changes.

Comment by Alex Miller [ 20/Oct/14 8:38 AM ]

These changes will affect the line number tables for a variety of Clojure constructs when compiled. It would be very helpful to me to have a set of examples that covered each case touched in the patch so that I could compile them and look at the bytecode vs the source. This would greatly accelerate the screening process.

Comment by Paul Stadig [ 20/Oct/14 2:29 PM ]

Alex,
I have created a repo on github that has a sample file demonstrating the line number changes.

https://github.com/pjstadig/clojure-line-numbers

Hope that helps!

BTW, I'd be glad to do a skype call or hangout, if you have questions.

Comment by Alex Miller [ 20/Oct/14 2:34 PM ]

This is very helpful, thanks!!

Comment by Alex Miller [ 22/Oct/14 11:35 AM ]

In the hunk at 3191 in KeywordInvokeExpr, a call to visitLineNumber was added, but the prior call 4 lines earlier was not removed. Should it be?

Comment by Paul Stadig [ 22/Oct/14 12:05 PM ]

I left that in thinking that if something goes wrong with the getstatic instruction (null pointer exception? class cast exception?) it should report the line number of the KeywordInvokeExpr. It may be that there isn't a realistic possibility that anything could actually happen with that getstatic instruction, but that was the thought process.

My general rule of thumb was if an emit method emits any instructions before it calls the emit method on another expr, then it should mark its line number before and after the recursive emit call (assuming that the recursive emit call would mark its own line number). In cases where an emit method immediately calls another emit method, then I don't bother to mark a line number until afterwards.

Comment by Bozhidar Batsov [ 10/Jan/15 3:09 AM ]

Seems we need to add tests before this can be merged - https://groups.google.com/forum/#!topic/clojure-dev/7pFhG8LMvGo

Comment by Daniel Solano Gómez [ 11/Jan/15 8:42 PM ]

Here's an updated version of Paul's patch that applies cleanly to master.

I based my test code from his GitHub repository, but I have made a few changes:

  1. I have added test cases for when emitUnboxed or the reflecting branch of emit is called.
  2. The original keyword invocation test doesn't actually go through KeywordInvokeExpr. It is resolved by the change to InvokeExpr.emitArgsAndCall, so I have removed it.

Additionally, I have been unable to create test cases that verify that the changes for intrinsic predicates, KeywordInvokExprs, and for the changes inside of InvokeExpr.emit. These could be that they are redundant, but I am not sure.

Comment by Paul Stadig [ 12/Jan/15 5:51 AM ]

Thanks Daniel!

I applied the patches for my commit (as updated by Daniel) and Daniel's commit. The namespace docstring had shifted all the line numbers down by 4, and the tests were failing. I amended Daniel's commit with updated line numbers and combined both commits into a single patch file (clj-1561.patch).

I also reverted my commit and re-ran Daniel's tests to verify that they failed as expected.

I believe the combined patch file should include everything we need other than the difficult test cases that Daniel mentions in his comment.

Cheers!

Comment by Alex Miller [ 12/Jan/15 8:07 AM ]

Whenever people feel this is ready for screening, just switch the Approval from Incomplete to Vetted.

Comment by Paul Stadig [ 12/Jan/15 2:20 PM ]

I have added a test for KeywordInvokeExpr, and a new test to test the logic in InvokeExpr.emit.

I removed the changes I made to StaticMethodExpr.emitIntrinsicPredicate, since I was unable to find a way to create a test case. The issue is that we mark the lines of child expressions, then come back to the parent expression and try to invoke without marking its line, but in the case of intrinsic predicates there is no invocation. When we come back to the parent we directly emit some bytecodes. I don think the change I made in that case actually changed anything, but I removed it since there is no need for it.

Comment by Alex Miller [ 14/Jan/15 10:07 PM ]

Could we squash the patch into a single commit?

Comment by Alex Miller [ 15/Jan/15 9:21 AM ]

Also, in addition to squashing the patch, can we rename examples_clj_1561.clj to line_number_examples.clj or something w/o the ticket in it? I'm ready to mark as screened when those are done. (And if that can get done today, it's likely to get pushed forward tomorrow.)

Comment by Daniel Solano Gómez [ 15/Jan/15 9:24 AM ]

In case Paul doesn't get to it sooner, I can take care of it this afternoon.

Comment by Paul Stadig [ 15/Jan/15 10:40 AM ]

Alex,
I'd like to maintain attribution. I could squash my test related commits into Daniel's commit, so we'd have two commits, my commit for the change and Daniel's for the tests. Would that suffice? If so, I can take care of it now.

Comment by Alex Miller [ 15/Jan/15 11:13 AM ]

Sure, that's fine.

Comment by Alex Miller [ 15/Jan/15 11:14 AM ]

note that one of the commits has a whitespace error (I think Daniel's) that got removed in a later commit. In the end patch, please make sure it doesn't warn about any whitespace errors when applied.

Comment by Alex Miller [ 15/Jan/15 12:46 PM ]

Paul, this patch still seems to have examples_clj_1561.clj in it - would prefer name that is meaningful vs jira issue.

Comment by Paul Stadig [ 15/Jan/15 1:44 PM ]

Yeah, sorry, I'm still working on it. Should be up shortly.

Comment by Paul Stadig [ 15/Jan/15 2:05 PM ]

I squashed the commits down to two: my commit for the change, and Daniel's for the tests. I renamed the "examples_clj_1561" file to "line_number_examples". I removed some trailing whitespace from Daniel's commit. There are still whitespace errors (on my machine) from my commit because tabs are used for indenting. I used tabs to indent because that is how the lines surrounding were formatted, and Jozef had already complained about the formatting. It also appears that these whitespace errors are dependent on your git config. I have:

[core]
whitespace = trailing-space,space-before-tab,tab-in-indent

But when I comment that out of my gitconfig I get no whitespace errors. I guess the bottom line is if we want to agree on a standard config for that I'd be glad to accomodate, but I don't think we want to reformat Compiler.java, so the best choice seems to be to just match the surrounding context, which is what I've done. If you want me to do anything further with that, then let me know.

Ball is back in your court, Alex.





[CLJ-1677] Add setLineNumber to LineNumberingPushbackReader Created: 17/Mar/15  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: reader

Attachments: Text File clj-1677.patch    
Patch: Code and Test
Approval: Ok

 Description   

Add setLineNumber() to set line number on underlying LineNumberingReader.






[CLJ-1638] Regression - PersistentVector.create(List) was removed in 1.7.0-alpha5 Created: 12/Jan/15  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: collections, regression
Environment:

1.7.0-alpha5


Attachments: Text File clj-1638-2.patch     Text File clj-1638.patch    
Patch: Code
Approval: Ok

 Description   

For CLJ-1546, PersistentVector.create(List) was replaced with PersistentVector.create(ArrayList). At least one library (flatland) was calling this method directly and was broken by the change.

Approach: Change create(ArrayList) to more general prior method create(List).

Patch: clj-1638-2.patch

Screened by:



 Comments   
Comment by Rich Hickey [ 20/Feb/15 7:42 AM ]

Is there a good reason to have both PersistentVector.create(List)and PersistentVector.create(ArrayList)?

Comment by Fogus [ 27/Feb/15 9:08 AM ]

This couldn't possibly be more straight-forward.





[CLJ-1499] Replace seq-based iterators with direct iterators for all non-seq collections that use SeqIterator Created: 08/Aug/14  Updated: 20/Mar/15  Resolved: 20/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.7

Type: Enhancement Priority: Major
Reporter: Rich Hickey Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: None

Attachments: File clj-1499-all.diff     Text File clj-1499-v10.patch     Text File clj-1499-v11.patch     Text File clj-1499-v12.patch     File clj-1499-v2.diff     File clj-1499-v3.diff     File clj-1499-v6.diff     File clj-1499-v7.diff     File clj-1499-v8.diff     Text File clj-1499-v9.patch     Text File defrecord-iterator.patch     File defrecord-iterator-v2.diff    
Patch: Code and Test
Approval: Ok

 Description   

Motivation:

Currently, the process of reducing over a map or set involves converting them to seqs. This action thus causes more intermediate objects than we'd like. Therefore, we'd like for reduce to be more efficient about the way that it reduces over these types. One way is for reduction to use the types' core collections while another way is to instead use a smart iterator type that abstracts the direct operation.

Solution

Add support for direct iterators instead of seq-based iterators for non-seq collections that use SeqIterator.

Patch adds support for direct iterators on the following (removing use of SeqIterator):

  • PersistentHashMap - new internal iterator (~20% faster)
  • APersistentSet - use internal map impl iterator (~5% faster)
  • PersistentQueue
  • PersistentStructMap
  • records (in core_deftype.clj)

Patch does not change use of SeqIterator in:

  • LazyTransformer$MultiStepper (not sure if this could be changed)
  • ASeq
  • LazySeq

In preparation for use in CLJ-1602, there is also a new IMapIterable interface that allows a map to publish direct Iterators of keys and values (without creating and pulling apart MapEntry objs). PersistentHashMap and PersistentArrayMap implement it right now and PersistentHashSet will check for and use the direct key iterator if possible.

Performance:

I re-ran the set of tests from CLJ-1546 for vec improvements which use these new iterators for PersistentArrayMap, PersistentHashMap, and PersistentHashSet. These test the time to run vec on a PHS or PHM of the given size. All timings in nanoseconds.

coll size 1.7.0-alpha4 1.7.0-alpha5 alpha5 + patch alpha5 + patch / alpha5
PHS 4 236 406 204 0.86
PHS 16 857 896 382 0.45
PHS 64 3637 3832 1723 0.47
PHS 256 14235 14672 6525 0.46
PHS 1024 67743 68857 44654 0.66
PHM 4 112 232 232 2.07
PHM 16 550 832 442 0.80
PHM 64 3014 3155 2014 0.67
PHM 256 11661 12082 7522 0.65
PHM 1024 57388 59787 44240 0.77

I also ran a set of tests just on reduce with reduce-kv for comparison (reduce-kv uses a different unaffected reduction path so it's a good baseline for comparison).

expr a-set / a-map size 1.6.0 1.7.0-alpha5 (ns) 1.7.0-alpha5 + patch (ns) patch / alpha5 patch / 1.6
(reduce + 0 a-set) 4 171 241 114 0.47 0.67
(reduce + 0 a-set) 16 663 826 395 0.48 0.60
(reduce + 0 a-set) 64 3637 4456 2361 0.53 0.65
(reduce + 0 a-set) 256 14772 16158 8872 0.55 0.60
(reduce + 0 a-set) 1024 72548 79010 49626 0.63 0.68
(reduce + 0 a-set) 16384 1197962 1234286 729335 0.59 0.61
(reduce #(+ %1 (val %2)) 0 a-map) 4 69 103 104 1.01 1.51
(reduce #(+ %1 (val %2)) 0 a-map) 16 640 752 506 0.67 0.79
(reduce #(+ %1 (val %2)) 0 a-map) 64 2861 3238 2388 0.74 0.83
(reduce #(+ %1 (val %2)) 0 a-map) 256 11998 12848 9526 0.74 0.79
(reduce #(+ %1 (val %2)) 0 a-map) 1024 59650 61913 54717 0.88 0.92
(reduce #(+ %1 (val %2)) 0 a-map) 16384 982317 969168 837842 0.86 0.85
(reducekv #(%1 %3) 0 a-map) 4 58 62 60 0.97 1.03
(reducekv #(%1 %3) 0 a-map) 16 233 250 248 0.99 1.06
(reducekv #(%1 %3) 0 a-map) 64 1477 1525 1442 0.95 0.98
(reducekv #(%1 %3) 0 a-map) 256 5470 5258 5201 0.99 0.95
(reducekv #(%1 %3) 0 a-map) 1024 27344 27442 27410 1.00 1.00
(reducekv #(%1 %3) 0 a-map) 16384 399591 397090 396267 1.00 0.99

Patch: clj-1499-v12.patch

Screened by:



 Comments   
Comment by Alex Miller [ 13/Aug/14 1:57 PM ]

The list of non-seqs that uses SeqIterator are:

  • records (in core_deftype.clj)
  • APersistentSet - fallback, maybe is ok?
  • PersistentHashMap
  • PersistentQueue
  • PersistentStructMap

Seqs (that do not need to be changed) are:

  • ASeq
  • LazySeq.java

LazyTransformer$MultiStepper - not sure

Comment by Ghadi Shayban [ 27/Sep/14 2:16 PM ]

attached iterator impl for defrecords. ready to leverage iteration for extmap when PHM iteration lands.

Comment by Alex Miller [ 29/Sep/14 12:52 PM ]

PHM patch

Comment by Alex Miller [ 29/Sep/14 10:26 PM ]

New patch that fixes bugs with PHMs with null keys (and added tests to expose those issues), added support for PHS.

Comment by Ghadi Shayban [ 29/Sep/14 10:45 PM ]

Alex, the defrecord patch already uses the iterator for extmap. It's just made better by the PHM patch.

Comment by Alex Miller [ 29/Sep/14 10:47 PM ]

Comment by Ghadi Shayban [ 30/Sep/14 4:17 PM ]

Heh. Skate to where the puck is going to be – Gretzky

Re: defrecord iterator: As is, it propagates exceptions from reaching the end of the ExtMap's iterator. As noted in CLJ-1453, PersistentArrayMap's iterator improperly returns an ArrayIndexOutOfBoundsException, rather than NoSuchElementException.

Comment by Alex Miller [ 30/Sep/14 6:41 PM ]

Hey Ghadi, rather than rebuilding the case map to pass to the RecordIterator, why don't you just pass the fields in iteration order to it and leverage the case map via .valAt like everything else?

Comment by Ghadi Shayban [ 30/Sep/14 7:30 PM ]

defrecord-iterator-v2.diff reuses valAt and minimizes macrology.

Comment by Alex Miller [ 07/Oct/14 2:04 PM ]

Comments from Stu (found under the couch):

"1. some of the impls (e.g. queue manually concatenate two iters. Would implementing iter-cat and calling that be simpler and more robust?
2. I found this tweak to the generative testing more useful in reporting failure, non-dependent on clojure.test, and capable of expecting failures. Waddya think?

(defn seq-iter-match
  [seqable iterable]
  (let [i (.iterator iterable)]
    (loop [s (seq seqable)
           n 0]
      (if (seq s)
        (do
          (when-not (.hasNext i)
            (throw (ex-info "Iterator exhausted before seq"
                            {:pos n :seqable seqable :iterable iterable})))
          (when-not (= (.next i) (first s))
            (throw (ex-info "Iterator and seq did not match"
                            {:pos n :seqable seqable :iterable iterable})))
          (recur (rest s) (inc n)))
        (when (.hasNext i)
          (throw (ex-info "Seq exhausted before iterator"
                          {:pos n :seqable seqable :iterable iterable})))))))

		(defspec seq-and-iter-match-for-maps
  identity
  [^{:tag clojure.test-clojure.data-structures/gen-map} m]
  (seq-iter-match m m))

3. similar generative approach would be good for the other types (looks like we just do maps)"

Comment by Alex Miller [ 02/Dec/14 3:03 PM ]

Latest patch (clj-1499-v6.diff) makes Stu's suggested change #2 above and adds tests recommended in #3. I looked at #1 but decided in the end that it wasn't going to make anything easier.

Comment by Alex Miller [ 09/Jan/15 11:11 AM ]

v7 patch just updates to current master

Comment by Alex Miller [ 14/Jan/15 4:08 PM ]

found a bug in last impl in maps with null values...fixing

Comment by Stuart Halloway [ 16/Jan/15 1:40 PM ]

v11 updates v10 to apply cleanly on master

Comment by Fogus [ 30/Jan/15 1:36 PM ]

Added a motivation section to the ticket.

Comment by Rich Hickey [ 20/Feb/15 7:54 AM ]

Any reason why MAKE_ENTRY/KEY/VAL are not final?

Comment by Alex Miller [ 20/Feb/15 8:00 AM ]

Nope, just an oversight. Will fix.

Comment by Alex Miller [ 20/Feb/15 9:46 AM ]

New -v12 patch is identical except makes MAKE_ENTRY, MAKE_KEY, MAKE_VAL final.

Comment by Fogus [ 27/Feb/15 8:45 AM ]

Screened latest v12 (with added finals) and would be happy to work in this code if the need arose. Additionally, with the added "motivation" the ticket and its solution makes more sense.





[CLJ-1232] Functions with non-qualified return type hints force import of hinted classes when called from other namespace Created: 18/Jul/13  Updated: 20/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5, Release 1.6, Release 1.7
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Tassilo Horn Assignee: Unassigned
Resolution: Unresolved Votes: 8
Labels: compiler, typehints

Attachments: Text File 0001-auto-qualify-arglists-class-names.patch     Text File 0001-auto-qualify-arglists-class-names-v2.patch     Text File 0001-auto-qualify-arglists-class-names-v3.patch     Text File 0001-throw-on-non-qualified-class-names-that-are-not-auto.patch    
Patch: Code and Test
Approval: Vetted

 Description   

You can add a type hint to function arglists to indicate the return type of a function like so.

user> (import '(java.util List))
java.util.List
user> (defn linkedlist ^List [] (java.util.LinkedList.))
#'user/linkedlist
user> (.size (linkedlist))
0

The problem is that now when I call `linkedlist` exactly as above from another namespace, I'll get an exception because java.util.List is not imported in there.

user> (in-ns 'user2)
#<Namespace user2>
user2> (refer 'user)
nil
user2> (.size (linkedlist))
CompilerException java.lang.IllegalArgumentException: Unable to resolve classname: List, compiling:(NO_SOURCE_PATH:1:1)
user2> (import '(java.util List)) ;; Too bad, need to import List here, too.
java.util.List
user2> (.size (linkedlist))
0

There are two workarounds: You can import the hinted type also in the calling namespace, or you always use fully qualified class names for return type hints. Clearly, the latter is preferable.

But clearly, that's a bug that should be fixed. It's not in analogy to type hints on function parameters which may be simple-named without having any consequences for callers from other namespaces.

Approach: Make the compiler resolve the return tags when necessary (tag is not a string, primitive tag (^long) or array tag (^longs)) and update the Var's :arglist appropriately.

Patch: 0001-auto-qualify-arglists-class-names-v3.patch



 Comments   
Comment by Andy Fingerhut [ 16/Apr/14 3:47 PM ]

To make sure I understand, Nicola, in this ticket you are asking that the Clojure compiler change behavior so that the sample code works correctly with no exceptions, the same way as it would work correctly without exceptions if one of the workarounds were used?

Comment by Tassilo Horn [ 17/Apr/14 12:18 AM ]

Hi Andy. Tassilo here, not Nicola. But yes, the example should work as-is. When I'm allowed to use type hints with simple imported class names for arguments, then doing so for return values should work, too.

Comment by Rich Hickey [ 10/Jun/14 10:41 AM ]

Type hints on function params are only consumed by the function definition, i.e. in the same module as the import/alias. Type hints on returns are just metadata, they don't get 'compiled' and if the metadata is not useful to consumers in other namespaces, it's not a useful hint. So, if it's not a type in the auto-imported set (java.lang), it should be fully qualified.

Comment by Alex Miller [ 10/Jun/14 11:55 AM ]

Based on Rich's comment, this ticket should probably morph into an enhancement request on documentation, probably on http://clojure.org/java_interop#Java Interop-Type Hints.

Comment by Andy Fingerhut [ 10/Jun/14 3:13 PM ]

I would suggest something like the following for a documentation change, after this part of the text on the page Alex links in the previous comment:

For function return values, the type hint can be placed before the arguments vector:

(defn hinted
(^String [])
(^Integer [a])
(^java.util.List [a & args]))

-> #user/hinted

If the return value type hint is for a class that is outside of java.lang, which is the part auto-imported by Clojure, then it must be a fully qualified class name, e.g. java.util.List, not List.

Comment by Nicola Mometto [ 10/Jun/14 4:02 PM ]

I don't understand why we should enforce this complexity to the user.
Why can't we just make the Compiler (or even defn itself) update all the arglists tags with properly resolved ones? (that's what I'm doing in tools.analyzer.jvm)

Comment by Alexander Kiel [ 19/Jul/14 10:02 AM ]

I'm with Nicola here. I also think that defn should resolve the type hint according the imports of the namespace defn is used in.

Comment by Max Penet [ 22/Jul/14 7:06 AM ]

Same here, I was bit by this in the past. The current behavior is clearly counterintuitive.

Comment by Nicola Mometto [ 28/Aug/14 12:58 PM ]

Attached two patches implementing two different solutions:

  • 0001-auto-qualify-arglists-class-names.patch makes the compiler automatically qualify all the tags in the :arglists
  • 0001-throw-on-non-qualified-class-names-that-are-not-auto.patch makes the compiler throw an exception for all public defs whose return tag is a symbol representing a non-qualified class that is not in the auto-import list (approach proposed in IRC by Alex Miller)
Comment by Tassilo Horn [ 29/Aug/14 1:49 AM ]

For what it's worth, I'd prefer the first patch because the second doesn't help in situations where the caller lives in a namespace where the called function's return type hinted class is `ns-unmap`-ed. And there a good reasons for doing that. For example, Process is a java.lang class and Process is a pretty generic name. So in some namespace, I want to define my own Process deftype or defrecord. Without unmapping 'Process first to get rid of the java.lang.Process auto-import, I'd get an exception:

user> (deftype Process [])
IllegalStateException Process already refers to: class java.lang.Process in namespace: user  clojure.lang.Namespace.referenceClass (Namespace.java:140)

Now when I call some function from some library that has a `^Process` return type hint (meaning java.lang.Process there), I get the same exception as in my original report.

I can even get into troubles when only using standard Clojure functions because those have `^String` and `^Class` type hints. IMO, Class is also a pretty generic name I should be able to name my custom deftype/defrecord. And I might also want to have a custom String type/record in my astrophysics system.

Comment by Andy Fingerhut [ 30/Sep/14 4:39 PM ]

Not sure whether the root cause of this behavior is the same as the example in the description or not, but seems a little weird that even for fully qualified Java class names hinting the arg vector, it makes a difference whether it is done with defn or def:

Clojure 1.6.0
user=> (set! *warn-on-reflection* true)
true
user=> (defn f1 ^java.util.LinkedList [coll] (java.util.LinkedList. coll))
#'user/f1
user=> (def f2 (fn ^java.util.LinkedList [coll] (java.util.LinkedList. coll)))
#'user/f2
user=> (.size (f1 [2 3 4]))
3
user=> (.size (f2 [2 3 4]))
Reflection warning, NO_SOURCE_PATH:5:1 - reference to field size can't be resolved.
3
Comment by Alex Miller [ 30/Sep/14 6:21 PM ]

Andy, can you file that as a separate ticket?

Comment by Andy Fingerhut [ 30/Sep/14 9:08 PM ]

Created ticket CLJ-1543 for the issue raised in my comment earlier on 30 Sep 2014.

Comment by Andy Fingerhut [ 01/Oct/14 12:38 PM ]

Tassilo (or anyone), is there a reason to prefer putting the tag on the argument vector in your example? It seems that putting it on the Var name instead avoids this issue:

user=> (clojure-version)
"1.6.0"
user=> (set! *warn-on-reflection* true)
true
user=> (import '(java.util List))
java.util.List
user=> (defn ^List linkedlist [] (java.util.LinkedList.))
#'user/linkedlist
user=> (.size (linkedlist))
0
user=> (ns user2)
nil
user2=> (refer 'user)
nil
user2=> (.size (linkedlist))
0

I suppose that only allows a single type tag, rather than an independent one for each arity.

Comment by Tassilo Horn [ 02/Oct/14 3:16 AM ]

I wasn't aware of the fact that you can put it on the var's name. That's not documented at http://clojure.org/java_interop#Java Interop-Type Hints. But IMHO the documented version with putting the tag on the argument vector is more general since it supports different return type hints for the different arity version. In any case, if both forms are permitted then they should be equivalent in the case the function has only one arity.

Comment by Rich Hickey [ 16/Mar/15 12:02 PM ]

Please work on the simplest patch that resolves the names

Comment by Alex Miller [ 16/Mar/15 4:34 PM ]

Nicola, in this:

if (tag != null &&
                        !(tag instanceof String) &&
                        primClass((Symbol)tag) == null &&
                        !tagClass((Symbol) tag).getName().startsWith("["))
                        {
                            argvec = (IPersistentVector)((IObj)argvec).withMeta(RT.map(RT.TAG_KEY, Symbol.intern(tagClass((Symbol)tag).getName())));
                        }

doesn't tagClass already handle most of these cases properly already? Can this be simplified? Is there an optimization case in avoiding lookup for a dotted name?

Comment by Nicola Mometto [ 16/Mar/15 5:10 PM ]

Patch 0001-auto-qualify-arglists-class-names-v2.patch avoids doing unnecessary lookups (dotted names, special tags (primitive tags, array tags)) and adds a testcase





[CLJ-1678] Update failing tests for IBM JDK 1.7 and 1.8 Created: 19/Mar/15  Updated: 20/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Andy Fingerhut Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: test
Environment:

IBM JDK 1.7 and 1.8


Approval: Triaged

 Description   

For Sun/Oracle JDKs, and IBM JDK 1.6, we have this:

user=> (.hashCode 9223372039002259457N)
1

For IBM JDKs 1.7 and 1.8, it changed to this (I do not know why):

user=> (.hashCode 9223372039002259457N)
33

This causes a few example-based tests in Clojure to fail when run on those IBM JDK versions. There does not appear to be any bug in Clojure here. Those tests were written with particular constant values that are different, but have equal .hashCode values, to test Clojure's code generated that selects between branches in a case. In particular, these tests in control.clj fail:

;; line 386 in Clojure 1.6.0 and 1.7.0-master-SNAPSHOT as of Mar 19 2015:
    (is (== (.hashCode 1) (.hashCode 9223372039002259457N)))

;; and later on line 423 in the same file:
  (testing "test warn for hash collision"
    (should-print-err-message
     #"Performance warning, .*:\d+ - hash collision of some case test constants; if selected, those entries will be tested sequentially..*\r?\n"
     (case 1 1 :long 9223372039002259457N :big 2)))

There are other tests in the same file with the same constant 9223372039002259457N that do not fail with IBM JDKs 1.7 and 1.8, but they do not test hash collisions as they were intended to.

Some possibilities for what could be changed:

1. Pick a different pair of number other than 1 and 9223372039002259457N when running tests on IBM JDKs 1.7 and 1.8, so that the hash values do collide. For example, 33 and 9223372039002259457N.

2. skip these tests completely when running on IBM JDKs 1.7 and 1.8.



 Comments   
Comment by Alex Miller [ 20/Mar/15 4:03 AM ]

I think my preference would be to skip these tests for the ibm jdk.





[CLJ-1657] proxy creates bytecode that calls super methods of abstract classes Created: 08/Feb/15  Updated: 18/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Alexander Yakushev Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Everywhere, but so far relevant only on Android 5.0


Attachments: File CLJ-1657-patch.diff    

 Description   

When proxy is used to extend abstract classes (e.g. java.io.Writer), the bytecode it produces include the call to non-existing super methods. For example, here's decompiled method from clojure/pprint/column_writer.clj:

public void close()
    {
        Object obj;
label0:
        {
            obj = RT.get(__clojureFnMap, "close");
            if(obj == null)
                break label0;
            ((IFn)obj).invoke(this);
            break MISSING_BLOCK_LABEL_31;
        }
        JVM INSTR pop ;
        super.close();
    }

As you can see on the last line, super.close() tries to call a non-defined method (because close() is abstract in Writer).

This hasn't been an issue anywhere until Android 5.0 came out. Its bytecode optimizer is very aggressive and rejects such code. Google guys claim that it is a bug in their code, which they already fixed[1]. Still I wonder if having faulty bytecode, that is not valid by Java standards, might cause issues in future (not only on Android, but in other enviroments too).

[1] https://code.google.com/p/android/issues/detail?id=80687



 Comments   
Comment by Alexander Yakushev [ 18/Mar/15 5:31 AM ]

I attached a patch that resolves the issue. The change makes `generate-proxy` treat abstract methods like interface methods. Which means, if the implementation for the method is not provided, it will throw unsupported exception rather than try to call the parent method (which doesn't exist).

Comment by Michael Blume [ 18/Mar/15 12:50 PM ]

Alexander: Awesome, thanks =)

Note: If you use git format-patch after making a commit, you can generate a patch file with your name/e-mail and a commit message that a clojure maintainer can apply directly to clojure as a new commit.

Comment by Alex Miller [ 18/Mar/15 12:53 PM ]

The patch process is documented here: http://dev.clojure.org/display/community/Developing+Patches

Comment by Alexander Yakushev [ 18/Mar/15 4:38 PM ]

Sorry, I should have checked the guidelines first. I uploaded a new patch, hope it is correct now.





[CLJ-1372] Inconsistent hash with java collections Created: 09/Mar/14  Updated: 16/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Jozef Wagner Assignee: Unassigned
Resolution: Unresolved Votes: 9
Labels: collections, interop
Environment:

1.6.0 master


Attachments: Text File 0001-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M-alternative.patch     Text File 0001-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M.patch     Text File 0001-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M-substring.patch     Text File 0005-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M.patch     Text File 0006-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M.patch     Text File 0007-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M.patch     File clj-1372-2.diff     File clj-1372.diff    
Patch: Code and Test
Approval: Triaged

 Description   

c.c/hash always use hashCode for java collections, which is incompatible when comparing with Clojure collections, which use Murmur3.

user=> (== (hash (java.util.ArrayList. [1 2 3])) (hash [1 2 3]))
false
user=> (= (java.util.ArrayList. [1 2 3]) [1 2 3])
true

One way to fix it is to add a special case in Util/hasheq for java.util.Collections, as it is now for Strings.

Link to a discussion of this topic in the Clojure group: https://groups.google.com/forum/#!topic/clojure/dQhdwZsyIEw



 Comments   
Comment by Jozef Wagner [ 09/Mar/14 8:02 AM ]

Same problem for maps, so hasheq should have a special case for java.util.Map too.

Comment by Jozef Wagner [ 09/Mar/14 9:21 AM ]

Added patch with fix for j.u. Map, Set and List.

Comment by Andy Fingerhut [ 09/Mar/14 6:02 PM ]

Add patch clj-1372-2.diff that is identical to Jozef Wagner's clj-1372.diff, except it also adds some new tests that fail without his changes, and pass with them.

Comment by Alex Miller [ 10/Mar/14 9:31 AM ]

I think the contract on equiv/hasheq is more narrowly scoped than this and only applies if both collections are IPersistentCollection. In other words, I don't think this is wanted or required.

Note that the Java .equals/.hashCode contract is maintained here - these collections will compare as .equals() and do have the same .hashCode().

Comment by Jozef Wagner [ 10/Mar/14 9:38 AM ]

Without the patch the following statement is not valid: "If two objects are equal with c.c/=, than their hash returned by c.c/hash is the same number". We can say that this is valid only iff both objects are 'clojure' objects, but this goes against clojures interop principles (interop is easy, fast, no surprises).

Comment by Jozef Wagner [ 10/Mar/14 9:54 AM ]

Manifestation of this bug

user=> (assoc (hash-map [1 2 3] :foo) (java.util.ArrayList. [1 2 3]) :bar)
{[1 2 3] :bar, [1 2 3] :foo}
user=> (get (hash-map [1 2 3] :foo) (java.util.ArrayList. [1 2 3]))
nil
Comment by Alex Miller [ 10/Mar/14 10:58 AM ]

I agree that would be a nice thing to say without qualification.

There is a real cost to adding more branches in hasheq - adding those collection checks affects every hasheq. Running a full Clojure build, I see the following set of classes with >100 occurences where this happens (note that exactly 0 of these are the Java collections - this case doesn't exist in the Clojure build itself):

clojure.lang.Var 107001502
java.lang.Class 2651389
java.lang.Character 2076322 
java.util.UUID 435235 
java.util.Date 430956
clojure.lang.Compiler$LocalBinding 116830
java.lang.Boolean 112361
java.util.regex.Pattern 325

We'd be adding 4 more instanceof checks in the path of every one of those hasheqs. This would also likely blow any JVM inlining.

Rich says "all bets should be off for hasheq/equiv of non-values" where Java collections obviously fall into the class of "non-values".

Comment by Andy Fingerhut [ 10/Mar/14 11:04 AM ]

Would a doc patch be considered? Say one that modified the doc of clojure.core/hash to include a phrase indicating that it is only promised to be consistent with clojure.core/= for immutable values? It could even perhaps mention that Floats are out, too: see CLJ-1036

Comment by Alex Miller [ 10/Mar/14 12:00 PM ]

I think it would be preferred to do any detailed docs about hash at http://clojure.org/data_structures rather than in the docstring. Although the docstring on hash probably could use an update and a pointer to the web site after the latest changes.

Comment by Jozef Wagner [ 10/Mar/14 12:14 PM ]

Neverthless it is a breaking change from 1.5, and it should be mentioned in changelog. What still bugs me is that c.c/= is supported in such cases but the c.c/hash is not. If supporting c.c/hash is expensive, isn't it better to drop support for c.c/= in such cases? It will eliminate surprises such as:

user=> (apply distinct? (hash-set [1 2 3] (java.util.Collections/unmodifiableList [1 2 3])))
false
Comment by Alex Miller [ 10/Mar/14 2:05 PM ]

I'm not sure it's a "breaking" change if something not considered to be guaranteed changes. But I take your point.

I don't think it's feasible to drop = support for Clojure and Java collections - that seems important and useful. And if it were free to do so, I would like to be able to say without qualification that if equiv=true, then hasheq is the same.

It's unclear to me that the examples listed on this ticket are actually real problems people are likely to encounter. The main users of hasheq are hash map and hash set. So to manifest, you would need to be putting a mixture of Clojure and Java collections into one of those, in particular a mixture of collections that compare as equal.

Still thinking about it.

Comment by Jozef Wagner [ 10/Mar/14 3:27 PM ]

Sorry for spamming but there may be another option, to not fallback into hashCode in hasheq, but to instead throw in cases where hasheq is requested for non-values. This will lead to a cleaner separation of hash types. Of course it will prevent putting non-values into hash-set.

Comment by Alex Miller [ 10/Mar/14 3:34 PM ]

There is no simple check for "valueness" though?

Comment by Andy Fingerhut [ 10/Mar/14 3:37 PM ]

An idea, for what it might be worth: Add one test for instance of java.util.Collection in Util.hasheq method instead of 3 separate tests for Set, List, and Map. It doesn't cover Map.Entry.

Comment by Alex Miller [ 10/Mar/14 3:38 PM ]

Map doesn't extend Collection either.

Comment by Stuart Halloway [ 11/Mar/14 10:44 AM ]

I think this needs more consideration and should not hold up 1.6.

Comment by Andy Fingerhut [ 20/Mar/14 2:01 PM ]

Both patches clj-1372.diff and clj-1372-2.diff fail to apply cleanly as of latest Clojure master on Mar 20 2014. They did apply cleanly before the Mar 19 2014 commit, I believe, and the only issue appears to be a changed line of diff context. Given the discussion about whether such a change is desired, it sounds like more thought is needed before deciding what change should be made, if any.

Comment by Mike Anderson [ 11/May/14 2:31 PM ]

This is a pretty bad defect. It absolutely needs to be fixed. It's not really about whether using a mix of Clojure and Java collections is a likely use case or not (it probably isn't...), it's about providing consistent guarantees that people can rely upon.

For example, now I'm really unsure about whether some of the library functions I have that use sets or maps are broken or not. I'd be particularly worried about anything that implements object caches / memoisation / interning based on hashed values. Such code may now have some really nasty subtle defects.

Since they are library functions, I can't guarantee what kind of objects are passed in so the code has to work with all possible inputs (either that or I need to write a clear docstring and throw an exception if the input is not supported).

Comment by Michał Marczyk [ 12/May/14 11:29 PM ]

This patch (0001-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M.patch) makes hasheq consistent with = for java.util.{List,Map,Map.Entry,Set}. Additionally it extends the special treatment of String (return hasheq of hashCode) to all types not otherwise handled (see below for a comment on this).

It is also available here:

https://github.com/michalmarczyk/clojure/tree/alien-hasheq-2

An earlier version is available here:

https://github.com/michalmarczyk/clojure/tree/alien-hasheq

If I understand correctly, what needs to be benchmarked is primarily the "dispatch time" for clojure.lang.Util/hasheq given a Clojure type. So, I ran a Criterium benchmark repeatedly hashing the same persistent hash map, on the theory that this will measure just the dispatch time on IHashEq instances. I then ran a separate benchmark hashing a PHM, a string and a long and adding up the results with unchecked-add. Hopefully this is a good start; I've no doubt additional benchmarks would be useful.

The results are somewhat surprising to me: hasheq on PHM is actually slightly faster in this benchmark on my build than on 1.6.0; the "add three hasheqs" benchmark is slightly faster on 1.6.0.

;;; 1.6.0

;;; NB. j.u.HM benchmark irrelevant
user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] #_(assert (= (hash phm) (hash juhm))) (c/bench (clojure.lang.Util/hasheq phm)) (c/bench (clojure.lang.Util/hasheq juhm)))
WARNING: Final GC required 1.24405836928592 % of runtime
Evaluation count : 5549560980 in 60 samples of 92492683 calls.
             Execution time mean : 9.229881 ns
    Execution time std-deviation : 0.156716 ns
   Execution time lower quantile : 8.985994 ns ( 2.5%)
   Execution time upper quantile : 9.574039 ns (97.5%)
                   Overhead used : 1.741068 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 6.2652 % Variance is slightly inflated by outliers
Evaluation count : 35647680 in 60 samples of 594128 calls.
             Execution time mean : 1.695145 µs
    Execution time std-deviation : 20.186554 ns
   Execution time lower quantile : 1.670049 µs ( 2.5%)
   Execution time upper quantile : 1.740329 µs (97.5%)
                   Overhead used : 1.741068 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] #_(assert (= (hash phm) (hash juhm))) (c/bench (unchecked-add (clojure.lang.Util/hasheq phm) (unchecked-add (clojure.lang.Util/hasheq "foo") (clojure.lang.Util/hasheq 123)))))
WARNING: Final GC required 1.028614538339401 % of runtime
Evaluation count : 1029948300 in 60 samples of 17165805 calls.
             Execution time mean : 56.797488 ns
    Execution time std-deviation : 0.732221 ns
   Execution time lower quantile : 55.856731 ns ( 2.5%)
   Execution time upper quantile : 58.469940 ns (97.5%)
                   Overhead used : 1.836671 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

;;; patch applied

user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] (assert (= (hash phm) (hash juhm))) (c/bench (clojure.lang.Util/hasheq phm)) (c/bench (clojure.lang.Util/hasheq juhm)))
Evaluation count : 5537698680 in 60 samples of 92294978 calls.
             Execution time mean : 8.973200 ns
    Execution time std-deviation : 0.157079 ns
   Execution time lower quantile : 8.733544 ns ( 2.5%)
   Execution time upper quantile : 9.289350 ns (97.5%)
                   Overhead used : 1.744772 ns
Evaluation count : 2481600 in 60 samples of 41360 calls.
             Execution time mean : 24.287800 µs
    Execution time std-deviation : 288.124326 ns
   Execution time lower quantile : 23.856445 µs ( 2.5%)
   Execution time upper quantile : 24.774097 µs (97.5%)
                   Overhead used : 1.744772 ns
nil

user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] #_(assert (= (hash phm) (hash juhm))) (c/bench (unchecked-add (clojure.lang.Util/hasheq phm) (unchecked-add (clojure.lang.Util/hasheq "foo") (clojure.lang.Util/hasheq 123)))))
WARNING: Final GC required 1.298136122909759 % of runtime
Evaluation count : 954751500 in 60 samples of 15912525 calls.
             Execution time mean : 61.681794 ns
    Execution time std-deviation : 0.712110 ns
   Execution time lower quantile : 60.622003 ns ( 2.5%)
   Execution time upper quantile : 62.904801 ns (97.5%)
                   Overhead used : 1.744772 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

As a side note, the earlier version of the patch available on the other branch doesn't have a separate branch for String. This made hasheq faster for objects implementing IHashEq, but slowed down the "three hashes" benchmark roughly by a factor of 2.

Comment by Alex Miller [ 12/May/14 11:39 PM ]

Just for clarity, please refer to patches attached here by name so as time goes on we don't have to correlate attachment time with comment time.

I'm not particularly worried about the cost of things that implement IHashEq as they should be unaffected other than potential inlining issues. I am curious about the cost of hasheq for objects that fall through to the end of the cases and pay the cost for all of the checks. The list farther up in the comments is a good place to start - things like Class, Character, and Var (which could possibly be addressed in Var).

Comment by Michał Marczyk [ 12/May/14 11:47 PM ]

Good point, I've edited the above comment to include the patch name.

Thanks for the benchmarking suggestions – I'll post some new results in ~6 minutes.

Comment by Michał Marczyk [ 13/May/14 12:18 AM ]

First, for completeness, here's a new patch (0001-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M-alternative.patch) which doesn't do the extra murmuring for types not otherwise handled. It's slower for the single PHM case; see below for details. Also, here's the branch on GitHub:

https://github.com/michalmarczyk/clojure/tree/alien-hasheq-3

As for the new results, the perf hit is quite large, I'm afraid:

;;; with patch (murmur hashCode for default version)
user=> (let [class-instance java.lang.String character-instance \a var-instance #'hash] (c/bench (clojure.lang.Util/hasheq class-instance)) (c/bench (clojure.lang.Util/hasheq character-instance)) (c/bench (clojure.lang.Util/hasheq var-instance)))
WARNING: Final GC required 1.409118084170768 % of runtime
Evaluation count : 655363680 in 60 samples of 10922728 calls.
             Execution time mean : 96.459888 ns
    Execution time std-deviation : 1.019817 ns
   Execution time lower quantile : 95.079086 ns ( 2.5%)
   Execution time upper quantile : 98.684168 ns (97.5%)
                   Overhead used : 1.708347 ns
Evaluation count : 675919140 in 60 samples of 11265319 calls.
             Execution time mean : 88.965959 ns
    Execution time std-deviation : 0.825226 ns
   Execution time lower quantile : 87.817159 ns ( 2.5%)
   Execution time upper quantile : 90.755688 ns (97.5%)
                   Overhead used : 1.708347 ns
Evaluation count : 574987680 in 60 samples of 9583128 calls.
             Execution time mean : 103.881498 ns
    Execution time std-deviation : 1.103615 ns
   Execution time lower quantile : 102.257474 ns ( 2.5%)
   Execution time upper quantile : 106.071144 ns (97.5%)
                   Overhead used : 1.708347 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

;;; 1.6.0
user=> (let [class-instance java.lang.String character-instance \a var-instance #'hash] (c/bench (clojure.lang.Util/hasheq class-instance)) (c/bench (clojure.lang.Util/hasheq character-instance)) (c/bench (clojure.lang.Util/hasheq var-instance)))
WARNING: Final GC required 1.3353133083866688 % of runtime
Evaluation count : 1829305260 in 60 samples of 30488421 calls.
             Execution time mean : 34.205701 ns
    Execution time std-deviation : 0.379106 ns
   Execution time lower quantile : 33.680636 ns ( 2.5%)
   Execution time upper quantile : 34.990138 ns (97.5%)
                   Overhead used : 1.718257 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 1 (1.6667 %)
	low-mild	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Evaluation count : 1858100340 in 60 samples of 30968339 calls.
             Execution time mean : 30.401309 ns
    Execution time std-deviation : 0.213878 ns
   Execution time lower quantile : 30.095976 ns ( 2.5%)
   Execution time upper quantile : 30.871497 ns (97.5%)
                   Overhead used : 1.718257 ns
Evaluation count : 1592932200 in 60 samples of 26548870 calls.
             Execution time mean : 36.292934 ns
    Execution time std-deviation : 0.333512 ns
   Execution time lower quantile : 35.795063 ns ( 2.5%)
   Execution time upper quantile : 36.918183 ns (97.5%)
                   Overhead used : 1.718257 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

One PHM and Class/Character/Var results with the new patch (no extra murmur step in the default case):

user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] #_(assert (= (hash phm) (hash juhm))) (c/bench (unchecked-add (clojure.lang.Util/hasheq phm) (unchecked-add (clojure.lang.Util/hasheq "foo") (clojure.lang.Util/hasheq 123)))))
WARNING: Final GC required 1.258952964663877 % of runtime
Evaluation count : 1007768460 in 60 samples of 16796141 calls.
             Execution time mean : 58.195608 ns
    Execution time std-deviation : 0.482804 ns
   Execution time lower quantile : 57.655857 ns ( 2.5%)
   Execution time upper quantile : 59.154655 ns (97.5%)
                   Overhead used : 1.567532 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
user=> (let [class-instance java.lang.String character-instance \a var-instance #'hash] (c/bench (clojure.lang.Util/hasheq class-instance)) (c/bench (clojure.lang.Util/hasheq character-instance)) (c/bench (clojure.lang.Util/hasheq var-instance)))
Evaluation count : 647944080 in 60 samples of 10799068 calls.
             Execution time mean : 91.275863 ns
    Execution time std-deviation : 0.659943 ns
   Execution time lower quantile : 90.330980 ns ( 2.5%)
   Execution time upper quantile : 92.711120 ns (97.5%)
                   Overhead used : 1.567532 ns
Evaluation count : 699506160 in 60 samples of 11658436 calls.
             Execution time mean : 84.564131 ns
    Execution time std-deviation : 0.517071 ns
   Execution time lower quantile : 83.765607 ns ( 2.5%)
   Execution time upper quantile : 85.569206 ns (97.5%)
                   Overhead used : 1.567532 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Evaluation count : 594919980 in 60 samples of 9915333 calls.
             Execution time mean : 100.336792 ns
    Execution time std-deviation : 0.811312 ns
   Execution time lower quantile : 99.313490 ns ( 2.5%)
   Execution time upper quantile : 102.167675 ns (97.5%)
                   Overhead used : 1.567532 ns

Found 3 outliers in 60 samples (5.0000 %)
	low-severe	 3 (5.0000 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
Comment by Michał Marczyk [ 13/May/14 1:05 AM ]

Here's a new patch (0001-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M-substring.patch) that takes the outrageous approach of replacing the Iterable/Map/Entry test with a .startsWith("java.util.") on the class name. (I experimented with .getClass().getPackage(), but the performance of that was terrible.) The branch is here:

https://github.com/michalmarczyk/clojure/tree/alien-hasheq-4

Hash perf on the "fall-through" cases with this patch seems to be very good:

user=> (let [class-instance java.lang.String character-instance \a var-instance #'hash] (c/bench (clojure.lang.Util/hasheq class-instance)) (c/bench (clojure.lang.Util/hasheq character-instance)) (c/bench (clojure.lang.Util/hasheq var-instance)))
WARNING: Final GC required 1.31690036780011 % of runtime
Evaluation count : 1661453640 in 60 samples of 27690894 calls.
             Execution time mean : 35.099750 ns
    Execution time std-deviation : 0.422800 ns
   Execution time lower quantile : 34.454839 ns ( 2.5%)
   Execution time upper quantile : 35.953584 ns (97.5%)
                   Overhead used : 1.556642 ns
Evaluation count : 1630167600 in 60 samples of 27169460 calls.
             Execution time mean : 35.487409 ns
    Execution time std-deviation : 0.309872 ns
   Execution time lower quantile : 35.083030 ns ( 2.5%)
   Execution time upper quantile : 36.190015 ns (97.5%)
                   Overhead used : 1.556642 ns

Found 4 outliers in 60 samples (6.6667 %)
	low-severe	 3 (5.0000 %)
	low-mild	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Evaluation count : 1440434700 in 60 samples of 24007245 calls.
             Execution time mean : 40.894457 ns
    Execution time std-deviation : 0.529510 ns
   Execution time lower quantile : 40.055991 ns ( 2.5%)
   Execution time upper quantile : 41.990985 ns (97.5%)
                   Overhead used : 1.556642 ns
nil
Comment by Michał Marczyk [ 13/May/14 1:28 AM ]

The new patch (...-substring.patch) returns hashCode for java.util.** classes other than List, Map, Map.Entry and Set, of course, so no behaviour change there.

Here are the benchmarks for repeated PHM lookups (slightly slower than 1.6.0 apparently, though within 1 ns) and the "add three hasheqs" benchmark (66 ns with patch vs. 57 ns without):

user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] (assert (= (hash phm) (hash juhm))) (c/bench (clojure.lang.Util/hasheq phm)) (c/bench (clojure.lang.Util/hasheq juhm)))
Evaluation count : 5183841240 in 60 samples of 86397354 calls.
             Execution time mean : 10.076893 ns
    Execution time std-deviation : 0.182592 ns
   Execution time lower quantile : 9.838456 ns ( 2.5%)
   Execution time upper quantile : 10.481086 ns (97.5%)
                   Overhead used : 1.565749 ns
Evaluation count : 3090420 in 60 samples of 51507 calls.
             Execution time mean : 19.596627 µs
    Execution time std-deviation : 224.380257 ns
   Execution time lower quantile : 19.288347 µs ( 2.5%)
   Execution time upper quantile : 20.085620 µs (97.5%)
                   Overhead used : 1.565749 ns
nil

user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] #_(assert (= (hash phm) (hash juhm))) (c/bench (unchecked-add (clojure.lang.Util/hasheq phm) (unchecked-add (clojure.lang.Util/hasheq "foo") (clojure.lang.Util/hasheq 123)))))
WARNING: Final GC required 1.418253438197936 % of runtime
Evaluation count : 879210900 in 60 samples of 14653515 calls.
             Execution time mean : 66.939309 ns
    Execution time std-deviation : 0.747984 ns
   Execution time lower quantile : 65.667310 ns ( 2.5%)
   Execution time upper quantile : 68.155046 ns (97.5%)
                   Overhead used : 1.724002 ns
nil

It is important to note that I have obtained the no-patch result for the "three hasheqs" benchmarks on a fresh JVM when benchmarking 1.6.0, so that's also how I repeated the benchmark with the patch applied. Hashing many different types changes the results noticeably – presumably HotSpot backs off from some optimizations after seeing several different types passed in to hasheq?

Comment by Michał Marczyk [ 13/May/14 8:04 AM ]

Here's a new patch (0005-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M.patch) that introduces a new isAlien static method that checks for instanceof Map/Map.Entry/Iterable and uses this method to test for "alien collection".

Initial benchmarking results are promising:

;;; "fall-through" benchmark
user=> (let [class-instance java.lang.String character-instance \a var-instance #'hash] (c/bench (clojure.lang.Util/hasheq class-instance)) (c/bench (clojure.lang.Util/hasheq character-instance)) (c/bench (clojure.lang.Util/hasheq var-instance)))
WARNING: Final GC required 1.258979068087473 % of runtime
Evaluation count : 1598432100 in 60 samples of 26640535 calls.
             Execution time mean : 36.358882 ns
    Execution time std-deviation : 0.566925 ns
   Execution time lower quantile : 35.718889 ns ( 2.5%)
   Execution time upper quantile : 37.414722 ns (97.5%)
                   Overhead used : 1.823120 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Evaluation count : 1626362460 in 60 samples of 27106041 calls.
             Execution time mean : 35.426993 ns
    Execution time std-deviation : 0.294517 ns
   Execution time lower quantile : 35.047064 ns ( 2.5%)
   Execution time upper quantile : 36.058667 ns (97.5%)
                   Overhead used : 1.823120 ns

Found 1 outliers in 60 samples (1.6667 %)
	low-severe	 1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Evaluation count : 1461423180 in 60 samples of 24357053 calls.
             Execution time mean : 39.541873 ns
    Execution time std-deviation : 0.423707 ns
   Execution time lower quantile : 38.943560 ns ( 2.5%)
   Execution time upper quantile : 40.499433 ns (97.5%)
                   Overhead used : 1.823120 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil

;;; "three hasheqs" benchmark
user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] #_(assert (= (hash phm) (hash juhm))) (c/bench (unchecked-add (clojure.lang.Util/hasheq phm) (unchecked-add (clojure.lang.Util/hasheq "foo") (clojure.lang.Util/hasheq 123)))))
WARNING: Final GC required 1.5536755331464491 % of runtime
Evaluation count : 820376460 in 60 samples of 13672941 calls.
             Execution time mean : 71.999365 ns
    Execution time std-deviation : 0.746588 ns
   Execution time lower quantile : 70.869739 ns ( 2.5%)
   Execution time upper quantile : 73.565908 ns (97.5%)
                   Overhead used : 1.738155 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
Comment by Michał Marczyk [ 13/May/14 8:28 AM ]

Ah, I left out the repeated phm hasheq lookup + hasheq of a java.util.HashMap instance pair of benchmarks from the above – here it is for completeness (no surprises though):

user=> (let [phm (apply hash-map (interleave (range 128) (range 128))) juhm (java.util.HashMap. phm)] (assert (= (hash phm) (hash juhm))) (c/bench (clojure.lang.Util/hasheq phm)) (c/bench (clojure.lang.Util/hasheq juhm)))
WARNING: Final GC required 1.260853406580491 % of runtime
Evaluation count : 5369135760 in 60 samples of 89485596 calls.
             Execution time mean : 10.380464 ns
    Execution time std-deviation : 3.407284 ns
   Execution time lower quantile : 9.510624 ns ( 2.5%)
   Execution time upper quantile : 11.461485 ns (97.5%)
                   Overhead used : 1.566301 ns

Found 5 outliers in 60 samples (8.3333 %)
	low-severe	 3 (5.0000 %)
	low-mild	 2 (3.3333 %)
 Variance from outliers : 96.4408 % Variance is severely inflated by outliers
Evaluation count : 3078180 in 60 samples of 51303 calls.
             Execution time mean : 19.717981 µs
    Execution time std-deviation : 209.896848 ns
   Execution time lower quantile : 19.401811 µs ( 2.5%)
   Execution time upper quantile : 20.180163 µs (97.5%)
                   Overhead used : 1.566301 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 2 (3.3333 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
Comment by Alex Miller [ 13/May/14 9:17 AM ]

Please don't submit any patches that change hashcode for anything other than making Java collections match Clojure collections - any other change is out of scope of this ticket.

In general, I would prefer just the execution time mean report for the moment rather than everything - the full criterium output makes these comments much harder to read and compare.

Comment by Alex Miller [ 13/May/14 9:33 AM ]

Could I get a summary of approaches, and a timing of 1.6.0 vs each patch for a consistent set of tests - say time of hash for Long, PHM, juHM, Class, and the "three hasheqs" test?

Comment by Rich Hickey [ 13/May/14 9:47 AM ]

"Hashing many different types changes the results noticeably – presumably HotSpot backs off from some optimizations after seeing several different types passed in to hasheq?"

Right - if your benchmarks do not treat this site as megamorphic you will get all sorts of distorted results.

Comment by Michał Marczyk [ 14/May/14 3:15 AM ]

Ok, I have what I think is an improved microbenchmark for this: xor of hasheqs for a long, a double, a string, a class, a character and a PHM (single instance, so it'll be a hash lookup). The results are not very encouraging.

Single form including the require to make it convenient to run; also bundled is a j.u.HashMap (128 entries) hasheq benchmark:

(do
  (require '[criterium.core :as c])
  (let [l    41235125123
        d    123.456
        s    "asdf;lkjh"
        k    BigInteger
        c    \S
        phm  (apply hash-map (interleave (range 128) (range 128)))
        juhm (java.util.HashMap. phm)
        f    (fn f []
               (-> (clojure.lang.Util/hasheq l)
                   (bit-xor (clojure.lang.Util/hasheq d))
                   (bit-xor (clojure.lang.Util/hasheq s))
                   (bit-xor (clojure.lang.Util/hasheq k))
                   (bit-xor (clojure.lang.Util/hasheq c))
                   (bit-xor (clojure.lang.Util/hasheq phm))))]
    (c/bench (f))
    (c/bench (hash juhm))))

Mean execution time as reported by Criterium:

version xor (ns) j.u.HM (µs)
unpatched 1.6.0 148.128748 1.701640
0005 patch 272.039667 21.201178
original patch 268.670316 21.169436
-alternative patch 271.747043 20.755397

The substring patch is broken (see below), so I skipped it. The patch I'm describing as the "original" one is attached as 0001-CLJ-1372-consistent-hasheq-for-java.util.-List-Map-M.patch.

Decisions common to all the patches:

1. One extra if statement in hasheq just above the default return with a three-way instanceof check.

2. The types tested for are j.u.Iterable, j.u.Map.Entry and j.u.Map.

3. Murmur3.hashOrdered takes Iterable, so that's why it's on the list. Map does not extend Iterable, so it's listed separately. Map.Entry is on the list, because ultimately the way to hash maps is to iterate over and hash their entries.

4. The actual hashing of the "alien" / host types is done by a separate static method – clojure.lang.Util.doalienhasheq – on the theory that this will permit hasheq to be inlined more aggressively and limit the worst of the perf hit to alien collections.

5. doalienhasheq checks for Map, Map.Entry, Set and List; entries are converted to lists for hashing, maps are hashed through entry sets and lists and sets are passed directly to Murmur3.

6. There is also a default case for other Iterable types – we must return hashCode or the result of composing some other function with hashCode for these, since we use equals to test them for equivalence.

The 0005 patch has hasheq call a separate private static method to perform the three-way type check, whereas the others put the check directly in the actual if test. The -alternative patch and the 0005 patch return hashCode in the default case, whereas the original patch composes Murmur3.hashInt with hashCode.

The substring patch only works for java.util.** classes and so doesn't solve the problem (it wouldn't correctly hash Guava collections, for example).

All of the patches change c.l.Util.hasheq and add one or two new static methods to clojure.lang.Util that act as helpers for hasheq. None of them changes anything else. Murmuring hashCode was a performance experiment that appeared to have a slight positive impact on some of the "fast cases" (in fact it's still the best performer among the current three patches in the microbenchmark presented above, although the margin of victory is of course extremely tiny). Thus I think all the current patches are in fact limited in scope to changes directly relevant to the ticket; the -alternative patch and the 0005 patch certainly are.

Comment by Michał Marczyk [ 14/May/14 3:29 AM ]

For completeness, branching on Map, Set etc. directly in hasheq, as with Jozef's original patch, results in the following timings in the microbenchmark introduced in my previous comment:

xor 315.866626 ns
juhm 18.520133 µs
Comment by Michał Marczyk [ 14/May/14 4:01 AM ]

New patch (0006) that leaves out the Map.Entry check; instead, two methods are introduced in the Murmur3 class to handle j.u.maps.

Java map entries aren't really integrated into Clojure – you can't use them like vectors, can't call seq on them etc. – so I don't think they need to match Clojure map entries in hasheq as long as j.u.maps do.

Timings:

xor 233.341689 ns
juhm 9.104637 µs
Comment by Michał Marczyk [ 14/May/14 4:17 AM ]

Checking for Map/Iterable in-line doesn't seem to affect xor benchmark results very much, but makes juhm hashing quicker. This is rather surprising to me. In any case, here's a new patch (0007) and the timings:

xor 233.062337 ns
juhm 8.629149 µs
Comment by Alex Miller [ 14/May/14 7:17 AM ]

What are equivalent timings without the patch?

Comment by Michał Marczyk [ 14/May/14 7:43 AM ]

They're listed in the table in the comment introducing the benchmark – 148.128748 ns for xor, 1.701640 µs for juhm.

Comment by Alex Miller [ 14/May/14 8:02 AM ]

What if we override hasheq for different types instead of using instanceof?

Comment by Michał Marczyk [ 14/May/14 12:50 PM ]

Overloaded methods are resolved statically, so there's no avoiding testing for type in the Object overload.

A more specific overload could be used to speed up hashing for its parameter type given a type hint or for literals, since the compiler would emit calls to that overload given appropriate compile-time information. There wouldn't be any speed-up in "implicit" hashing during hash map / set ops, however.

Comment by Daniel Compton [ 12/Nov/14 9:19 PM ]

This hit me when upgrading Factual/skuld from 1.5.1 to 1.6. clojure.data.fressian serialises c.l.PersistentHashSet sets into java.util.HashSet. This breaks equality checking in https://github.com/Factual/skuld/blob/b720feb142e6d274e85be208dc1d6d8634801719/test/skuld/net_test.clj#L8-L29 as we are comparing a set of maps where the original set contains a PersistentSet and the serialised and deserialised one contains a HashSet.

Comment by Daniel Compton [ 12/Nov/14 11:54 PM ]

This has come up again for me, details are in http://dev.clojure.org/jira/browse/DFRS-7

Comment by Max Penet [ 16/Mar/15 11:22 AM ]

This bit me again today as well (wasted a lot of time before figuring this one out). Any chance we'll have a patch for 1.7?

Comment by Alex Miller [ 16/Mar/15 12:18 PM ]

afaik, we are still in search of an approach with tolerable performance impacts before this can be considered.





[CLJ-1656] Unroll assoc and assoc! for small numbers of arguments Created: 06/Feb/15  Updated: 16/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7, Release 1.8
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Tom Crayford Assignee: Unassigned
Resolution: Unresolved Votes: 5
Labels: performance

Attachments: File assoc.diff     Text File CLJ-1656-v1.patch     Text File CLJ-1656-v2.patch     Text File CLJ-1656-v3.patch     Text File CLJ-1656-v4.patch     Text File CLJ-1656-v5.patch     File cpuinfo     File javaversion     File output     File uname    
Patch: Code
Approval: Triaged

 Description   

Whilst doing performance work recently, I discovered that unrolling to single assoc calls were significantly faster than using multiple keys (~10% for my particular application). Zachary Tellman then pointed out that clojure.core doesn't unroll assoc at all, even for the case of relatively low numbers of keys.

We already unroll other performance critical functions that call things via `apply`, e.g. `update` https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L5914, but `assoc` (which is, I think in the critical path for quite a bunch of applications and libraries), would likely benefit from this.

I have not yet developed patches for this, but I did some standalone benchmarking work:

https://github.com/yeller/unrolling-assoc-benchmarks

benchmark results:

code: https://github.com/yeller/unrolling-assoc-benchmarks/blob/master/src/bench_assoc_unrolling.clj

  1 2 3 4
empty array map (not unrolled) 23ns 93ns 156ns 224ns
empty array map (unrolled assoc) N/A 51ns 80ns 110ns
         
20 element persistent hashmap (not unrolled) 190ns 313ns 551ns 651ns
20 element persistent hashmap (unrolled assoc) N/A 250ns 433ns 524ns
         
record (not unrolled) 12ns 72ns 105ns 182ns
record (unrolled assoc) N/A 21ns 28ns 41ns

Each measurement was made in a separate JVM, to avoid JIT path dependence.

Benchmarks were ran on a commodity server (8 cpus, 32gb ram), with ubuntu 12.04 and a recent release of Java 8. Attached are `cpuinfo`, `uname` and `java -version` output.

Relatively standard JVM production flags were enabled, and care was taken to disable leiningen's startup time optimizations (which disable many of the JIT optimizations).

Benchmarks can be run by cloning the repository, and running `script/bench`

There's one outstanding question for this patch: How far should we unroll these calls? `update` (which is unrolled in the 1.7 alphas) is unrolled to 3 arguments. Adding more unrolling isn't difficult, but it does impact the readability of assoc.



 Comments   
Comment by Tom Crayford [ 09/Feb/15 12:01 PM ]

Ok, attached `assoc.diff`, which unrolls this to a single level more than the current code (so supporting two key/value pairs without recursion). The code's going to get pretty complex in the case with more than the unrolled number of keys if we continue on this way, so I'm unsure if this is a good approach, but the performance benefits seem very compelling.

Comment by Michael Blume [ 09/Feb/15 3:35 PM ]

Since the unroll comes out kind of hairy, why not have a macro write it for us?

Comment by Michael Blume [ 09/Feb/15 4:03 PM ]

Patch v2 includes assoc!

Comment by Tom Crayford [ 09/Feb/15 5:01 PM ]

I benchmarked conj with similar unrolling, across a relatively wide range of datatypes from core (lists, sets, vectors, each one empty and then again with 20 elements):

  1 2 3 4
empty vector (not unrolled) 19ns 57ns 114ns 126ns
empty vector (unrolled conj) N/A 44ns 67ns 91ns
         
20 element vector (not unrolled) 27.35ns 69ns 111ns 107ns
20 element vector (unrolled conj) N/A 54ns 79ns 104ns
         
empty list (not unrolled) 7ns 28ns 53ns 51ns
empty list (unrolled conj) N/A 15ns 20ns 26ns
         
twenty element list (not unrolled) 8.9ns 26ns 49ns 49ns
twenty element list (unrolled) N/A 15ns 19ns 30ns
         
empty set (not unrolled) 64ns 170ns 286ns 290ns
empty set (unrolled) N/A 154ns 249ns 350ns
         
twenty element set (not unrolled) 33ns 81ns 132ns 130ns
twenty element set (unrolled) N/A 69ns 108ns 139ns

Benchmarks were run on the same machine as before. There's a less clear benefit here, except for lists, where the overhead of creating seqs and recursing seems to be clearly dominating the cost of actually doing the conj (which makes sense - conj on any element list should be a very cheap operation). Raw benchmark output is here: https://gist.github.com/tcrayford/51a3cd24b8b0a8b7fd74

Comment by Tom Crayford [ 09/Feb/15 5:04 PM ]

Michael Blume: I like those patches! They read far nicer to me than my original patch. Did you check if any of those macro generated methods blew Hotspot's hot code inlining limit? (it's 235 bytecodes). That'd be my only worry with using macros here - it's easy to generate code that defeats the inliner.

Comment by Michael Blume [ 09/Feb/15 5:57 PM ]

Thanks! This is new for me, so I might be doing the wrong thing, but I just ran nodisassemble over both definitions and the "instruction numbers" next to each line go up to 219 for the varargs arity assoc and up to 251 for assoc!, so, assuming I'm looking at the right thing, maybe that one needs to have an arity taken off? If I remove the highest arity I get 232 for varargs which is just under the line.

I guess alternatively we could call assoc! instead of assoc!* in the varargs arity, which removes a lot of code – in that case it's 176 for varargs and 149 for six pairs.

Comment by Michael Blume [ 09/Feb/15 6:01 PM ]

Gah, I forgot to include coll in the varargs call to assoc!

which reminds me that this patch needs tests.

Comment by Michael Blume [ 09/Feb/15 10:27 PM ]

OK, this has some fixes I made after examining the disassembled output. There's a change to the assoc!* macro to be sure it type-hints correctly – I'm honestly not sure why it didn't type-hint properly before, but it does now. Also, I changed the call to assoc! rolling up the first six entries at the top of the varargs version from a macro call to a function call so it'd fit within the 251 inlineable bytecodes. (This, again, is assuming I'm reading the output correctly).

Comment by Tom Crayford [ 10/Feb/15 6:38 AM ]

Michael: Wanna push a branch with these patches to clojars or something? Then I can rerun the benchmarks with the exact code in the patches.

Comment by Michael Blume [ 10/Feb/15 2:36 PM ]

Hmm, not sure I know how to do that – here's a branch on github though https://github.com/MichaelBlume/clojure/tree/unroll-assoc

Comment by Michael Blume [ 12/Feb/15 1:12 PM ]

v5 marks the helper macros private.

Comment by Tom Crayford [ 13/Feb/15 4:11 AM ]

Michael: was that branch just based off clojure/clojure master? I tried running benchmarks off it, but ran into undefined var errors when building this code (which doesn't happen with alpha5):

(Retrieving com/yellerapp/clojure-unrolled-assoc/1.7.0-unrollassoc-SNAPSHOT/clojure-unrolled-assoc-1.7.0-unrollassoc-20150213.092242-1.pom from clojars)
(Retrieving com/yellerapp/clojure-unrolled-assoc/1.7.0-unrollassoc-SNAPSHOT/clojure-unrolled-assoc-1.7.0-unrollassoc-20150213.092242-1.jar from clojars)
(Retrieving org/clojure/clojure/1.3.0/clojure-1.3.0.jar from central)
Exception in thread "main" java.lang.RuntimeException: Unable to resolve symbol: bench in this context, compiling:(bench_assoc_unrolling.clj:5)
at clojure.lang.Compiler.analyze(Compiler.java:6235)
at clojure.lang.Compiler.analyze(Compiler.java:6177)
at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3452)
at clojure.lang.Compiler.analyzeSeq(Compiler.java:6411)
at clojure.lang.Compiler.analyze(Compiler.java:6216)
at clojure.lang.Compiler.analyze(Compiler.java:6177)
at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5572)
at clojure.lang.Compiler$FnMethod.parse(Compiler.java:5008)

Comment by Michael Blume [ 23/Feb/15 5:08 PM ]

Ok, how are you building? Why the strange clojure group?

Comment by Michael Blume [ 23/Feb/15 5:09 PM ]

The existing version of assoc does runtime checking that an even number of varargs are passed in, but assoc! does not. Do we want to preserve this behavior or do checks in both?

Comment by Michael Blume [ 23/Feb/15 6:00 PM ]

Also, I'm curious how relevant inlining is here – does HotSpot inlining actually work with Var invocation when there's a getRootBinding step in the way?

Comment by Alex Miller [ 23/Feb/15 7:59 PM ]

Yes, inlining works through var invocation.

Comment by Tom Crayford [ 16/Mar/15 7:05 AM ]

Michael,

That group is just an uploaded version of clojure master with your patches applied, built in just the same way as before (you should be able to check out the repo and replicate).





[CLJ-1676] map destructuring: prevent evaluation of values in :or when they are not used/needed Created: 14/Mar/15  Updated: 15/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Max Penet Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: destructuring

Approval: Triaged

 Description   

The name :or implies this should behave as "or" and be "lazy" but it's not the case currently.
The following gist shows the issue. :x is present in the map but we eval the default value:

(defn foo
  [{:keys [x]
    :or {x (println :set-default)}}] 
  x)
 
 
 
user> (foo {:x 1})
:set-default
1


 Comments   
Comment by Ghadi Shayban [ 15/Mar/15 2:40 PM ]

1.2 - current all behave this way, doesn't seem like a recent change.

Comment by Max Penet [ 15/Mar/15 2:55 PM ]

Right, I thought it might have been a regression, but wasn't sure at all.
It seems it would be safe to change the current behavior, I doubt it would break any ones code.





[CLJ-1644] into-array fails for sequences starting with nil Created: 15/Jan/15  Updated: 13/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Michael Blume Assignee: Michael Blume
Resolution: Unresolved Votes: 0
Labels: arrays, ft

Attachments: Text File CLJ-1644-array-first-nil-v1.patch     Text File CLJ-1644-array-first-nil-v2.patch    
Patch: Code and Test
Approval: Triaged

 Description   
(into-array [nil 1 2]) 
;;=> NullPointerException

Found during work on CLJ-1643

Approach: Loop through the seq until the first non-nil element and use that to determine the class of the elms in the seq
Patch: CLJ-1644-array-first-nil-v1.patch
Screened by:



 Comments   
Comment by Michael Blume [ 15/Jan/15 1:45 PM ]

uploading patch v1 (adding nil as a cons-able element in CLJ-1643 will also bring this out, but I don't want to make the one patch depend on the other)

Comment by Phill Wolf [ 12/Mar/15 7:06 PM ]

Searching through the sequence for a non-null item is not consistent with into-array's docstring. The docstring says, "The array's component type is type if provided, or the type of the first value in aseq if present, or Object." In keeping with the docstring, wouldn't a null first item suggest an array of Object?

Working harder than that (by searching the sequence) only delays the inevitable: a whole sequence of nulls producing an Object array, instead of an array of the type the programmer expected, and triggering a run-time crash.

In summary: this patch goes farther than necessary, but even so, it does not cure the risk of unexpected results from nulls. A simpler remedy – returning an array of Object if the first item is null – would be consistent with the docstring and avoid raising unfounded expectations. Adding a statement that null is of type Object to the docstring could help programmers avoid falling into the trap.

Comment by Alex Miller [ 12/Mar/15 8:29 PM ]

No search through the sequence will pass screening, please just add a nil check.

Comment by Michael Blume [ 13/Mar/15 4:39 PM ]

done





[CLJ-1565] pprint issues infinite output for a protocol Created: 15/Oct/14  Updated: 12/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Michael Nygard Assignee: Unassigned
Resolution: Unresolved Votes: 4
Labels: print, protocols

Attachments: File fix-pprint-var.diff    
Patch: Code
Approval: Triaged

 Description   

Using pprint with a protocol name generates an unending stream of output. pprint appears to recurse through the Var reference as the value of the :var key in the protocol definition itself.

To reproduce:

user=> (defprotocol Foo (foo-you [this]))
Foo
user=> (pprint Foo)
{:on user.Foo,
 :on-interface user.Foo,
 :sigs {:foo-you {:doc nil, :arglists ([this]), :name foo-you}},
 :var
 #<Var@6a3b02d8:
   {:on user.Foo,
    :on-interface user.Foo,
    :sigs {:foo-you {:doc nil, :arglists ([this]), :name foo-you}},
    :var
    #<Var@6a3b02d8:
      {:on user.Foo,
       :on-interface user.Foo,
       :sigs {:foo-you {:doc nil, :arglists ([this]), :name foo-you}},
       :var
       #<Var@6a3b02d8:
         {:on user.Foo,
          :on-interface user.Foo,
          :sigs
          {:foo-you {:doc nil, :arglists ([this]), :name foo-you}},
          :var
          #<Var@6a3b02d8:
            {:on user.Foo,
             :on-interface user.Foo,
             :sigs
             {:foo-you {:doc nil, :arglists ([this]), :name foo-you}},
             :var
             #<Var@6a3b02d8:
               {:on user.Foo,
                :on-interface user.Foo,
                :sigs
                {:foo-you
                 {:doc nil, :arglists ([this]), :name foo-you}},

Approach: Add a dispatch method which doesn't deref the Var
Patch: fix-pprint-var.diff
Screened by:



 Comments   
Comment by Andy Fingerhut [ 12/Nov/14 1:26 PM ]

I've run across this issue while debugging Eastwood. It probably does more than what you want in terms of modifying pprint behavior, but check out eastwood.util/pprint-meta here: https://github.com/jonase/eastwood/blob/master/src/eastwood/util.clj#L206

Comment by Daniel Marjenburgh [ 12/Nov/14 2:29 PM ]

The issue is that the simple-dispatch multifn dispatched a clojure.lang.Var to clojure.lang.IDeref, which dereferenced the Var before printing it. We have created a patch which dispatches a Var to the default print fn.

– With regards from the Amsterdam Clojure meetup group

Comment by Nicola Mometto [ 12/Nov/14 7:13 PM ]

The patch for this ticket also addressed http://dev.clojure.org/jira/browse/CLJ-1576





[CLJ-1586] Compiler doesn't preserve metadata for LazySeq literals Created: 12/Nov/14  Updated: 12/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: compiler, metadata, typehints

Attachments: Text File 0001-Compiler-doesn-t-preserve-metadata-for-lazyseq-liter.patch    
Patch: Code
Approval: Triaged

 Description   

The analyzer in Compiler.java forces evaluation of lazyseq literals, but loses the compile time original metadata of that form, meaning that a type hint will be lost.

Example demonstrating this issue:

user=> (set! *warn-on-reflection* true)
true
user=> (list '.hashCode (with-meta (concat '(identity) '("foo")) {:tag 'String}))
(.hashCode (identity "foo"))
user=> (eval (list '.hashCode (with-meta (concat '(identity) '("foo")) {:tag 'String})))
Reflection warning, NO_SOURCE_PATH:1:1 - reference to field hashCode can't be resolved.
101574

Forcing the concat call to an ASeq rather than a LazySeq fixes this issue:

user=> (eval (list '.hashCode (with-meta (seq (concat '(identity) '("foo"))) {:tag 'String})))
101574

This ticket blocks http://dev.clojure.org/jira/browse/CLJ-1444 since clojure.core/sequence might return a lazyseq.

This bug affected both tools.analyzer and tools.reader and forced me to commit a fix in tools.reader to work around this issue, see: http://dev.clojure.org/jira/browse/TANAL-99

The proposed patch trivially preserves the form metadata after realizing the lazyseq

Approach: Keep a copy of the original form, and apply its metadata to the realized lazyseq
Patch: 0001-Compiler-doesn-t-preserve-metadata-for-lazyseq-liter.patch
Screened by:






[CLJ-1588] StackOverflow in clojure.test macroexpand with `are` and anonymous `fn` Created: 13/Nov/14  Updated: 12/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Nikita Prokopov Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File clojure-test.recursion.patch    
Approval: Triaged

 Description   

I noticed that this happens when an argument in anonymous `fn` is named the same as one of the binding in `are` form

(use 'clojure.test)
(deftest x
  (are [x y] (= x y)
    ((fn [x] (inc x)) 1) 2))
=>
clojure.lang.Compiler$CompilerException: java.lang.StackOverflowError, compiling:(/Users/nprokopov/Dropbox/ws/clojure.unicode/test/clojure/test_unicode.clj:54:3)

Approach: Replace prewalk-replace with postwalk-replace. prewalk-replace first replaces the form and then goes inside it looking for more replacements. postwalk-replace avoids that.
Patch: clojure-test.recursion.patch
Screened by:



 Comments   
Comment by Michael Blume [ 13/Nov/14 11:45 PM ]

yep, I bet it's trying to replace the x with (fn [x] (inc x)) and then replacing the x in that and...

this doesn't seem like that much of a defect? Like why write the code with the same variable names in the first place?

Comment by Nikita Prokopov [ 13/Nov/14 11:49 PM ]

Well, logically these are two completely separate, isolated Xes. It can be avoided, sure, but this behavior is not expected and I’m sure can be easily fixed.

Comment by Nikita Prokopov [ 13/Nov/14 11:53 PM ]

I mean, there shouldn’t be any recursion at all in the first place, right?

Comment by Nikita Prokopov [ 14/Nov/14 2:12 AM ]

Patch

Comment by Nikita Prokopov [ 14/Nov/14 2:13 AM ]

I fixed the issue by replacing prewalk with postwalk. It was caused by prewalk-replace that first replaced the form and then goes inside it looking for more replacements. Postwalk-replace avoids that.

Comment by Alex Miller [ 14/Nov/14 9:27 AM ]

1) Needs tests.
2) As a change in template, this affects many possible users. Can you assess other possible users either in core or in other external projects and whether they are affected? I have found http://crossclj.info to be helpful for questions like this.

Comment by Alex Miller [ 14/Nov/14 11:42 AM ]

Please include tests in a single combined patch and update the description to include a line specifying the current active patch for consideration.

Comment by Nikita Prokopov [ 14/Nov/14 11:46 AM ]

Alex, I attached a test case.

I also took a look at all use-cases of apply-template and do-template in both clojure.core and third-party projects. It’s not used very often, and in all cases use of do-template is pretty straightforward (take [x y z] and just replace it with some completely different forms), it does not depend on recursion and change from prewalk to postwalk will not cause any change in behavior. I think this patch is safe.

Comment by Nikita Prokopov [ 14/Nov/14 12:00 PM ]

I’m afraid I cannot edit description...

src/clj/clojure/template.clj => line 43
test/clojure/test_clojure/test.clj => lines 83-85

Comment by Alex Miller [ 14/Nov/14 10:14 PM ]

You should have edit rights now on jira issues.





[CLJ-1609] Fix an edge case in the Reflector's search for a public method declaration Created: 05/Dec/14  Updated: 12/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Jeremy Heiler Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: interop

Attachments: Text File reflector_method_bug.patch    
Patch: Code
Approval: Triaged

 Description   

The Reflector was not taking into account that a non-public class can implement an interface, and have a non-public parent class contain an implementation of a method on that interface.

The solution I took is to pass in the target object's class instead of the declaring class of the method object.

I've outlined a small example here: https://github.com/jeremyheiler/clj-reflector-bug

The repo contains a Java example that works, and the same example in Clojure that doesn't work. It also includes a patched version of Clojure 1.6.0, and shows that the patch solves the issue. Also, in `src/foo/m.clj`, there is a real example of this bug occurring by using the Java Debug Interface API in the tools.jar library.

I would have added tests to the patch, but I don't think the test runner compiles test Java code, which is required.

Approach: pass the target objects class instead of the declaring class of the method object
Patch: reflector_method_bug.patch
Screened by:



 Comments   
Comment by Alex Miller [ 05/Dec/14 8:48 AM ]

Probably a dupe of CLJ-259. I'll probably dupe that one to this though.

Comment by Jeremy Heiler [ 05/Dec/14 1:33 PM ]

Thanks. Sorry for not finding that myself. CLJ-259 refers to CLJ-126, which I think is covered with this patch.

Comment by Jeremy Heiler [ 06/Dec/14 12:37 PM ]

I was looking into whether or not the target could be null, and it can be when invoking a static method. However, I don't think that code path would make it to the target.getClass() because non-public static methods aren't returned by getMethods().





[CLJ-130] Namespace metadata lost in AOT compile Created: 19/Jun/09  Updated: 12/Mar/15

Status: In Progress
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Critical
Reporter: Stuart Sierra Assignee: Unassigned
Resolution: Unresolved Votes: 5
Labels: aot, metadata

Attachments: Text File 0001-CLJ-130-preserve-metadata-for-AOT-compiled-namespace.patch     File aot-drops-metadata-demo.sh    
Patch: Code
Approval: Triaged

 Description   

AOT-compilation drops namespace metadata.

This also affects all of the namespaces packaged with Clojure, except clojure.core, for which metadata is explicitly added in core.clj.

Cause of the bug:

  • a namespace inherits the metadata of the symbol used to create that namespace the first time
  • the namespace is created in the load() method, that is invoked after the __init() method
  • the __init0() method creates all the Vars of the namespace
  • interning a Var in a namespace that doesn't exist forces that namespace to be created

This means that the namespace will have been already created (with nil metadata) by the time the load() method gets invoked and thus the call to in-ns will be a no-op and the metadata will be lost.

Approach: The attached patch fixes this issue by explicitely attaching the metadata to the namespace after its creation (via ns) using a .resetMeta call
Patch: 0001-CLJ-130-preserve-metadata-for-AOT-compiled-namespace.patch
Screened by:



 Comments   
Comment by Assembla Importer [ 24/Aug/10 6:45 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/130

Comment by Assembla Importer [ 24/Aug/10 6:45 AM ]

richhickey said: Updating tickets (#127, #128, #129, #130)

Comment by Assembla Importer [ 24/Aug/10 6:45 AM ]

juergenhoetzel said: This is still a issue on

Clojure 1.2.0-master-SNAPSHOT

Any progress, hints? I prefer interactive documentiation via slime/repl

Comment by Howard Lewis Ship [ 09/Sep/14 9:44 AM ]

This is of great concern to me, as the Rook web services framework we're building depends on availability of namespace metadata at runtime.

Comment by Howard Lewis Ship [ 09/Sep/14 9:53 AM ]

BTW, I verified that this still exists in 1.6.0.

Comment by Howard Lewis Ship [ 09/Sep/14 10:11 AM ]

For me personally, I would raise the priority of this issue. And I think in general, anything that works differently with AOT vs. non-AOT should be major, if not blocker, priority.

Comment by Howard Lewis Ship [ 09/Sep/14 10:25 AM ]

Alex Miller:

@hlship I think the question is where it would go. note no one has suggested a solution in last 5 yrs.

Alas, I have not delved into the AOT compilation code (since, you know, I value my sanity). But it seems to me like the __init class for the namespace could construct the map and update the Namespace object.

Comment by Howard Lewis Ship [ 09/Sep/14 4:27 PM ]

Just playing with javap, I can see that the meta data is being assembled in some way, so it's a question of why it is not accessible ...

  public static void __init0();
    Code:
       0: ldc           #108                // String clojure.core
       2: ldc           #110                // String in-ns
       4: invokestatic  #116                // Method clojure/lang/RT.var:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Var;
       7: checkcast     #12                 // class clojure/lang/Var
      10: putstatic     #10                 // Field const__0:Lclojure/lang/Var;
      13: aconst_null
      14: ldc           #118                // String fan.auth
      16: invokestatic  #122                // Method clojure/lang/Symbol.intern:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Symbol;
      19: checkcast     #124                // class clojure/lang/IObj
      22: iconst_4
      23: anewarray     #4                  // class java/lang/Object
      26: dup
      27: iconst_0
      28: aconst_null
      29: ldc           #126                // String meta-foo
      31: invokestatic  #130                // Method clojure/lang/RT.keyword:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Keyword;
      34: aastore
      35: dup
      36: iconst_1
      37: aconst_null
      38: ldc           #132                // String meta-bar
      40: invokestatic  #130                // Method clojure/lang/RT.keyword:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Keyword;
      43: aastore
      44: dup
      45: iconst_2
      46: aconst_null
      47: ldc           #134                // String doc
      49: invokestatic  #130                // Method clojure/lang/RT.keyword:(Ljava/lang/String;Ljava/lang/String;)Lclojure/lang/Keyword;
      52: aastore
      53: dup
      54: iconst_3
      55: ldc           #136                // String Defines the resources for the authentication service.
      57: aastore
      58: invokestatic  #140                // Method clojure/lang/RT.map:([Ljava/lang/Object;)Lclojure/lang/IPersistentMap;
      61: checkcast     #64                 // class clojure/lang/IPersistentMap
      64: invokeinterface #144,  2          // InterfaceMethod clojure/lang/IObj.withMeta:(Lclojure/lang/IPersistentMap;)Lclojure/lang/IObj;

If I'm reading the code correctly, a Symbol named after the namespace is interned, and the meta-data for the namespace is applied to the symbol, so it's just a question of commuting that meta data to the Namespace object. I must be missing something.

Comment by Nicola Mometto [ 30/Sep/14 6:45 PM ]

Attached patch fixes this issue by explicitely attaching the metadata to the namespace after its creation using a .resetMeta call.

Comment by Nicola Mometto [ 30/Sep/14 7:46 PM ]

Here's an explaination of why this bug happens:

  • a namespace inherits the metadata of the symbol used to create that namespace the first time
  • the namespace is created in the load() method, that is invoked after the __init() method
  • the __init0() method creates all the Vars of the namespace
  • interning a Var in a namespace that doesn't exist forces that namespace to be created

This means that the namespace will have been already created (with nil metadata) by the time the load() method gets invoked and thus the call to in-ns will be a no-op and the metadata will be lost.





[CLJ-1290] clojure.xml parse docstring omits InputSource Created: 01/Nov/13  Updated: 12/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Phill Wolf Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: docstring, newbie, xml

Attachments: Text File CLJ-1290.patch    
Patch: Code
Approval: Triaged

 Description   

The clojure.xml parse docstring mentions that parameter s "can be a File, InputStream or String naming a URI." But those choices do not cover a common case, parsing the value of a String. Actually, parse also allows InputSource, which solves the problem. The docstring should mention InputSource (or clarify its omission, if not inadvertent).

user> (use '[clojure.xml :as xml])
nil
user> (import '[java.io StringReader])
java.io.StringReader
user> (import '[org.xml.sax InputSource])
org.xml.sax.InputSource
user> (xml/parse (InputSource. (StringReader. "<egg>green</egg>")))
{:tag :egg, :attrs nil, :content ["green"]}

Approach: Update doc-string to reflect that s also can be an InputSource
Patch: CLJ-1290.patch
Screened by:



 Comments   
Comment by Édipo L Féderle [ 15/Sep/14 3:57 PM ]

You and mean that de (doc xml/parse) should include also "can be a xml String" ?
I don't know if I understand you right.
Thanks.

Comment by Phill Wolf [ 24/Sep/14 6:06 AM ]

InputSource is the use of xml/parse that is not encompassed by the docstring:

(xml/parse (InputSource. (StringReader. "<egg>green</egg>")))

Perhaps xml/parse aimed to hide InputSource by making specific provision for some of InputSource's capabilities. But reading a String is important, and xml/parse does not accept a StringReader, so InputSource remains important.





[CLJ-1674] Boolean return type-hint confusing the compiler Created: 12/Mar/15  Updated: 12/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Mikkel Kamstrup Erlandsen Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: typehints
Environment:

OSX, Clojure 1.6.0



 Description   

Saving the below snippet and running it like

java -jar clojure-1.6.0.jar snippet.clj

Produces

$ java -jar clojure-1.6.0.jar snippet.clj 
Exception in thread "main" java.lang.IllegalArgumentException: Unable to resolve classname: clojure.core$boolean@1356d4d4, compiling:(/Users/kamstrup/tmp/snippet.clj:15:1)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6651)
	at clojure.lang.Compiler.analyze(Compiler.java:6445)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6632)
	at clojure.lang.Compiler.analyze(Compiler.java:6445)
	at clojure.lang.Compiler.access$100(Compiler.java:38)
	at clojure.lang.Compiler$DefExpr$Parser.parse(Compiler.java:538)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6644)
	at clojure.lang.Compiler.analyze(Compiler.java:6445)
	at clojure.lang.Compiler.analyze(Compiler.java:6406)
	at clojure.lang.Compiler.eval(Compiler.java:6707)
	at clojure.lang.Compiler.load(Compiler.java:7130)
	at clojure.lang.Compiler.loadFile(Compiler.java:7086)
	at clojure.main$load_script.invoke(main.clj:274)
	at clojure.main$script_opt.invoke(main.clj:336)
	at clojure.main$main.doInvoke(main.clj:420)
	at clojure.lang.RestFn.invoke(RestFn.java:408)
	at clojure.lang.Var.invoke(Var.java:379)
	at clojure.lang.AFn.applyToHelper(AFn.java:154)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: java.lang.IllegalArgumentException: Unable to resolve classname: clojure.core$boolean@1356d4d4
	at clojure.lang.Compiler$HostExpr.tagToClass(Compiler.java:1069)
	at clojure.lang.Compiler$InvokeExpr.getJavaClass(Compiler.java:3659)
	at clojure.lang.Compiler$LocalBinding.hasJavaClass(Compiler.java:5657)
	at clojure.lang.Compiler$LocalBindingExpr.hasJavaClass(Compiler.java:5751)
	at clojure.lang.Compiler.maybePrimitiveType(Compiler.java:1283)
	at clojure.lang.Compiler$IfExpr.doEmit(Compiler.java:2631)
	at clojure.lang.Compiler$IfExpr.emit(Compiler.java:2613)
	at clojure.lang.Compiler$BodyExpr.emit(Compiler.java:5826)
	at clojure.lang.Compiler$LetExpr.doEmit(Compiler.java:6180)
	at clojure.lang.Compiler$LetExpr.emit(Compiler.java:6133)
	at clojure.lang.Compiler$BodyExpr.emit(Compiler.java:5826)
	at clojure.lang.Compiler$FnMethod.doEmit(Compiler.java:5374)
	at clojure.lang.Compiler$FnMethod.emit(Compiler.java:5232)
	at clojure.lang.Compiler$FnExpr.emitMethods(Compiler.java:3771)
	at clojure.lang.Compiler$ObjExpr.compile(Compiler.java:4410)
	at clojure.lang.Compiler$FnExpr.parse(Compiler.java:3904)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:6642)
	... 19 more

The snippet:

snippet.clj
;; Bug in the Clojure compiler (1.6.0): If we annotate the return here with ^boolean we get:
;; 'IllegalArgumentException: Unable to resolve classname: clojure.core$boolean' from the compiler.
;; Removing it, everything is as expected
(defn ^boolean foo-bar?
  [node]
  (= node "foo-bar"))

;; Check it out, we can have ^boolean here, but not on foo-bar? !! :-)
(defn ^boolean bar-foo?
  [node]
  (= node "bar-foo"))

;; Instead of removing the ^boolean return on foo-bar? we can also remove this function
;; to have all work as expected
(defn ^boolean interesting?
  [node]
  (or (foo-bar? node) (bar-foo? node)))

(println "Foo-Bar?" (foo-bar? "baz"))


 Comments   
Comment by Mikkel Kamstrup Erlandsen [ 12/Mar/15 5:25 AM ]

Typo in comment 2 in the snippet: s/xtc-scenario?/foo-bar?/

Comment by Nicola Mometto [ 12/Mar/15 6:01 AM ]

Metadata on def's symbol is evaluated as per the doc (http://clojure.org/special_forms), evaluating `boolean` results in the clojure.core/boolean function which is not a valid type hint.

As a rule of thumb, attach the return tag in the argvec rather than on the def symbol, in this case you should write

(defn foo-bar?
   ^boolean [node]
  (= node "foo-bar"))

I understand why the fact that

(defn ^boolean foo [] true)

and

(defn foo ^boolean [] true)

behave differently and the fact that the compiler will throw iff the type hint is used rather than throwing at the function definition time is confusing (and I've complained about this and the lack of documentation/specification regarding type hints for a while) but this is not a bug

Comment by Mikkel Kamstrup Erlandsen [ 12/Mar/15 6:36 AM ]

Thanks for clarifying Nicola, you are indeed correct.

Putting return type annotations before the method name seems to be common practice in a lot of Clojure code I've read online. Perpetuated by some online tutorials, and the clojure.org docs them selves (fx.

(defn ^:private ^String my-fn ...)
is found in http://clojure.org/cheatsheet)

Comment by Andy Fingerhut [ 12/Mar/15 8:36 AM ]

Mikkel: If the type tags are Java classes, not primitives, then ^Classname is a correct type tag. If you use Eastwood, it can warn about these incorrect type tags, and has some documentation on what works and what does not here: https://github.com/jonase/eastwood#wrong-tag

Also here: https://github.com/jonase/eastwood#unused-meta-on-macro





[CLJ-1675] IOFactory protocol extension on String does not call its Coercions Created: 12/Mar/15  Updated: 12/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Trivial
Reporter: Andrea Richiardi Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: io
Environment:

-


Attachments: File fix-string-protocol.diff    
Patch: Code

 Description   

The IOFactory protocol extension on String doesn't call the respective as-file and as-url implemented in Coercions.

I found it odd and fixed it. If it had been done on purpose, I apologize.






[CLJ-1673] Improve clojure.repl/dir-fn to work on namespace aliases in addition to canonical namespaces. Created: 11/Mar/15  Updated: 11/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Jason Whitlark Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: repl

Attachments: Text File improve_dir.patch    
Patch: Code
Approval: Triaged

 Description   

Extend clojure.repl/dir to work with the aliases in the current namespace

Patch: improve_dir.patch

Question: This does introduce a dependency on *ns*, but only for looking up aliases. If no alias is found, behavior is the same as the current implementation.



 Comments   
Comment by Jason Whitlark [ 11/Mar/15 4:00 PM ]

Possible unit test, since clojure.string is aliased in the test file:

(is (= (dir-fn 'clojure.string) (dir-fn 'str)))





[CLJ-1672] Better error message when passing a list to update-in Created: 11/Mar/15  Updated: 11/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: John Gabriele Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs, errormsgs
Environment:

OpenJDK 1.7 on GNU/Linux



 Description   

This one confused me when I'd accidentally passed a list (returned by a function) in to `update-in` instead of a vector.

Example:

some-app.core=> (update-in [:a :b :c] [1] name)
[:a "b" :c]
some-app.core=> (update-in '(:a :b :c) [1] name)

NullPointerException   clojure.core/name (core.clj:1518)

Similar result if passing in another function; for example:

some-app.core=> (update-in ["a" "b" "c"] [1] str/capitalize)
["a" "B" "c"]
some-app.core=> (update-in '("a" "b" "c") [1] str/capitalize)

NullPointerException   clojure.string/capitalize (string.clj:199)


 Comments   
Comment by Alex Miller [ 11/Mar/15 9:26 AM ]

I think this is effectively a dupe of CLJ-1107 re throwing on get with a non-Associative collection?





[CLJ-1208] Namespace is not loaded on defrecord class init Created: 03/May/13  Updated: 11/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.8

Type: Enhancement Priority: Major
Reporter: Tim McCormack Assignee: Unassigned
Resolution: Unresolved Votes: 5
Labels: compiler, defrecord, deftype

Attachments: Text File 0001-CLJ-1208-load-own-namespace-in-deftype-defrecord-cla.patch     Text File 0001-CLJ-1208-load-own-namespace-in-deftype-defrecord-cla-v2.patch     Text File 0001-CLJ-1208-load-own-namespace-in-deftype-defrecord-cla-v3.patch    
Patch: Code and Test
Approval: Vetted

 Description   

As a user of Clojure interop from Java, I want defrecords (and deftypes?) to load their namespaces upon class initialization so that I can simply construct and use AOT'd record classes without manually requiring their namespaces first.

Calling the defrecord's constructor may or may not result in "Attempting to call unbound fn" exceptions, depending on what code has already been run.

This issue has been raised several times over the years, but I could not find an existing ticket for it:



 Comments   
Comment by Nicola Mometto [ 18/Jan/15 7:10 AM ]

The attached patch approaches this issue by adding a :load-ns options to deftype/defrecord which defaults to false.
When true, the type/record be compiled with a call to clojure.core/require to its originating namespace in its static initializer.

The patch has two known limitations:

  • clojure.core deftypes/defrecords cannot have :load-ns since we use clojure.core/require to load the namespaces so clojure.core needs to be loaded manually anyway
  • clojure.lang.Compiler/demunge is used to get the originating namespace from the deftype/defrecord class name, this means that namespaces like foo_bar are not supported since they get demunged into foo-bar. If this is something that needs to be addressed, it shouldn't be too hard to just pass the unmunged namespace name in the opts map.
Comment by Nicola Mometto [ 22/Jan/15 12:59 PM ]

Updated patch fixing a whitespace error and mentionint :load-ns in the docstrings of deftype/defrecord

Comment by Nicola Mometto [ 11/Mar/15 6:12 AM ]

Updated patch so it applies on lastest HEAD





[CLJ-991] partition-by reducer Created: 10/May/12  Updated: 09/Mar/15  Resolved: 05/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Enhancement Priority: Minor
Reporter: Kevin Downey Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: reducers

Attachments: File reducer-partition-by2.diff     File reducer-partition-by3.diff     File reducer-partition-by4.diff     File reducer-partition-by.diff    
Patch: Code and Test

 Comments   
Comment by Rich Hickey [ 14/Aug/12 1:52 PM ]

I'd like to see something much faster than this.

Comment by Kevin Downey [ 15/Aug/12 1:58 AM ]

For reference here is a benchmark of a non-reducers (seq based) process that uses partition-by

user=> (def x (vec (range 1e6)))
#'user/x
user=> (bench (reduce + (map count (partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 60
             Execution time mean : 1.072157 sec  95.0% CI: (1.070606 sec, 1.073381 sec)
    Execution time std-deviation : 165.818282 ms  95.0% CI: (163.873585 ms, 168.271261 ms)
         Execution time lower ci : 972.562000 ms  95.0% CI: (972.562000 ms, 973.301850 ms)
         Execution time upper ci : 1.419148 sec  95.0% CI: (1.419148 sec, 1.419148 sec)

Found 7 outliers in 60 samples (11.6667 %)
	low-severe	 2 (3.3333 %)
	low-mild	 5 (8.3333 %)
 Variance from outliers : 85.8489 % Variance is severely inflated by outliers
nil
user=>

Same again using r/partition-by from reducer-partition-by.diff

user=> (bench (r/reduce + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 60
             Execution time mean : 1.418350 sec  95.0% CI: (1.417738 sec, 1.418948 sec)
    Execution time std-deviation : 66.736477 ms  95.0% CI: (66.186568 ms, 67.610777 ms)
         Execution time lower ci : 1.370419 sec  95.0% CI: (1.370419 sec, 1.370419 sec)
         Execution time upper ci : 1.544151 sec  95.0% CI: (1.544151 sec, 1.544156 sec)

Found 10 outliers in 60 samples (16.6667 %)
	low-severe	 2 (3.3333 %)
	low-mild	 8 (13.3333 %)
 Variance from outliers : 33.5591 % Variance is moderately inflated by outliers
nil
user=> 

(using https://github.com/hugoduncan/criterium for benchmarking)

Comment by Kevin Downey [ 15/Aug/12 2:17 AM ]

same again for r/partition-by from reducers-partition-by2.diff

user=> (bench (r/reduce + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 180
             Execution time mean : 307.596806 ms  95.0% CI: (307.271339 ms, 307.961550 ms)
    Execution time std-deviation : 34.060809 ms  95.0% CI: (33.613169 ms, 34.416837 ms)
         Execution time lower ci : 285.339333 ms  95.0% CI: (285.339333 ms, 285.339333 ms)
         Execution time upper ci : 385.087950 ms  95.0% CI: (385.087950 ms, 385.087950 ms)

Found 10 outliers in 60 samples (16.6667 %)
	low-severe	 4 (6.6667 %)
	low-mild	 6 (10.0000 %)
 Variance from outliers : 73.8053 % Variance is severely inflated by outliers
nil
user=> 

same again driven using r/fold (for grins) instead of r/reduce

user=> (bench (r/fold + (r/map count (r/partition-by #(or (zero? (mod % 3)) (zero? (mod % 5))) x))))
Evaluation count             : 360
             Execution time mean : 215.214486 ms  95.0% CI: (214.915417 ms, 215.664236 ms)
    Execution time std-deviation : 36.588464 ms  95.0% CI: (36.305548 ms, 36.847846 ms)
         Execution time lower ci : 185.575000 ms  95.0% CI: (185.575000 ms, 185.575000 ms)
         Execution time upper ci : 287.605175 ms  95.0% CI: (286.547833 ms, 287.605175 ms)

Found 6 outliers in 60 samples (10.0000 %)
	low-severe	 3 (5.0000 %)
	low-mild	 3 (5.0000 %)
 Variance from outliers : 87.6303 % Variance is severely inflated by outliers
nil
user=> 

reducers-partition-by2.diff is faster, but I am not wild about introducing a type and a protocol. also not sure about the CollFold impl.

Comment by Rich Hickey [ 15/Aug/12 6:58 AM ]

Let's leave fold out for this first patch, please.

Comment by Kevin Downey [ 15/Aug/12 2:34 PM ]

reducer-partition-by3.diff is a cleaned up version of reducer-partition-by2.diff without fold

Comment by Devin Walters [ 20/Oct/12 7:07 PM ]

Per Andy Fingerhut's email reducer-partition-by3.diff was failing to apply. This patch should apply cleanly to current master.

Comment by Andy Fingerhut [ 01/Nov/12 6:59 PM ]

Presumptuously changing Approval from Incomplete to None, since the reason for its being marked Incomplete seems to have been addressed with the latest patch.

Comment by Kevin Downey [ 04/Mar/13 2:49 PM ]

should this be assigned to someone?

Comment by Andy Fingerhut [ 03/Sep/14 8:09 PM ]

Patch reducer-partition-by4.diff dated Oct 20 2012 no longer applies cleanly to latest master after some commits were made to Clojure on Sep 3, 2014.

I have not checked whether this patch is straightforward to update, nor do I know if there is interest in an updated patch, or whether transducers are the preferred way to go over reducers, and thus all reducers-related tickets will be closed. See the section "Updating stale patches" at http://dev.clojure.org/display/community/Developing+Patches for suggestions on how to update patches.

Comment by Kevin Downey [ 05/Mar/15 11:41 AM ]

the single arity for completion added to reducing functions in 1.7 makes partition-by easier to write, and low and behold clojure.core/parition-by can return a partitioning transducing function





[CLJ-1593] Use PAM for small maps when assigned to a var rather than always using PHMs Created: 15/Nov/14  Updated: 09/Mar/15  Resolved: 09/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: collections, compiler, maps

Attachments: Text File 0001-Use-PAM-rather-than-always-using-PHMs-for-small-maps.patch    
Patch: Code

 Description   

I'm reproposing the fix I implemented for http://dev.clojure.org/jira/browse/CLJ-944 a while ago as an enhancement rather than as a defect.

Currently when a map is used as the value of a `def` expression, unless it's an empty map, it will always be a PersistentHashMap even if it's a small map.

user=> (def a {:foo :bar})
#'user/a
user=> (class a)
clojure.lang.PersistentHashMap

The current patch makes makes small maps be compiled to PAMs, consistently with how it's handled in lexical contexts, only using PHMs when the number of elements is above the threshold

user=> (def a {:foo :bar})
#'user/a
user=> (class a)
clojure.lang.PersistentArrayMap
user=> (class (let [a {:foo :bar}] a))
clojure.lang.PersistentArrayMap
user=> (def a {0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9})
#'user/a
user=> (class a)
clojure.lang.PersistentHashMap


 Comments   
Comment by Alex Miller [ 15/Nov/14 12:17 PM ]

This might be subsumed under the small collections CLJ-1517, not sure.

Comment by Nicola Mometto [ 08/Dec/14 9:19 AM ]

This is now out of scope for CLJ-1517 now that's focused only on vectors.

Comment by Alex Miller [ 08/Dec/14 9:47 AM ]

We're just splitting the ticket apart, maps will be a separate ticket/patch.

Comment by Nicola Mometto [ 09/Mar/15 1:40 PM ]

This change has been included by Rich in commit https://github.com/clojure/clojure/commit/692645c73c86d12c93a97c858dc6e8b0f4280a0b#diff-f17f860d14163523f1e1308ece478ddb





[CLJ-1660] Unify Stepper and MultiStepper in LazyTransformer Created: 16/Feb/15  Updated: 04/Mar/15  Resolved: 04/Mar/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Michael Blume Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: None

Attachments: Text File CLJ-1660-v1.patch    
Patch: Code

 Description   

This seemed worthwhile to me mainly because some of the stepper logic is actually pretty fiddly, so it seems better not to have it duplicated.



 Comments   
Comment by Alex Miller [ 04/Mar/15 1:06 PM ]

Based on CLJ-1669 direction which eliminates this code.





[CLJ-1599] Add a reset! that returns old value Created: 24/Nov/14  Updated: 02/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Steven Yi Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: atom

Attachments: File get-and-set.diff    
Patch: Code
Approval: Triaged

 Description   

DESCRIPTION

This patch adds the equivalent of reset!, here called get-and-set!, to core to allow getting the last value from an atom and setting it to a new value. This is useful for atomically draining an atom of its value and setting to a new value. The implementation delegates to Java's AtomicReference.getAndSet().

The equivalent operation in Clojure code would be:

(defn get-and-set! [atm newv]
(loop [oldv @atm]
(if (compare-and-set! atm oldv newv)
oldv
(recur @atm))))

This is close to a 1:1 translation of the Java code in sun.misc.Unsafe's getAndSetObject, used by AtomicReference (as of current JDK9 source code).

APPLICATIONS

  • User may want to check if an operation has occurred before by using an atom as a flag. I.e.,

(def has-run-once (atom false))
...
(when-not (get-and-set! has-run-once true)
(do something))

  • User may want to use an atom similarly to a java.util.concurrent.LinkedTransferQueue, for the case of pairing up adds by writers and drainTo by readers:

Thread 1: (swap! atm conj item1)
Thread 2: (swap! atm conj item2)
Thread 3: (let [new-vals (get-and-set! atm [])]
(do-something new-vals))

ALTERNATIVES

  • For case of atom as flag, user can use existing compare-and-set!:

(def has-run-once (atom false))
...
(when-not (compare-and-set! has-run-once false true)
(do something))

Argument: get-and-set! is a little clearer in intent as it is using the value of the atom, rather than the success of the cas operation. Also, this would not be applicable to situations where the value stored is not a boolean.

  • User can just go ahead and use LinkedTransferQueue.

Argument: User not fluent in Java may not be readily able to use this.

==

Argument for: This seems like a sufficiently primitive operation to include in core for atoms. I am unsure of the rationale, but assume it was vetted to include into Java's AtomicReference for a reason. Also, if users are using atoms and have this available, they are less likely to try to do this incorrectly, such as:

(let [vals @some-atom]
(reset! some-atom [])
(do-something-with vals))

Argument against: This may not be sufficiently primitive enough to include in core. Users have a workaround to implement the get-and-set! operation in user-code as given above.

Note: This request is similar to CLJ-1454 (http://dev.clojure.org/jira/browse/CLJ-1454), but differs in that this is not a swap! operation that accepts an IFn argument. Also, I looked to add a test in test/clojure/test_clojure/atoms.clj, but saw that the other operations weren't tested. (I assume this is due to the other operations delegating to AtomicReference and hence not deemed test-worthy.)



 Comments   
Comment by Michael Blume [ 02/Mar/15 5:03 PM ]

I tend to wind up inevitably adding this to my projects, be nice to have it in core

Comment by Alex Miller [ 02/Mar/15 7:27 PM ]

So CLJ-1454 is swap! and return old and CLJ-1599 is reset! and return old?

Comment by Steven Yi [ 02/Mar/15 7:48 PM ]

Yes, I think that's an accurate interpretation of the two tickets.





[CLJ-1454] Add a swap! that returns old value Created: 28/Jun/14  Updated: 02/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Philip Potter Assignee: Unassigned
Resolution: Unresolved Votes: 6
Labels: atom

Approval: Triaged

 Description   

Sometimes, when mutating an atom, it's desirable to know what the value before the swap happened. The existing swap! function returns the new value, so is unsuitable for this use case. Currently, the only option is to roll your own using a loop and compare-and-set!

An example of this would be where the atom contains a PersistentQueue and you want to atomically remove the head of the queue and process it: if you run (swap! a pop), you have lost the reference to the old head of the list so you can't process it.

It would be good to have a new function swap-returning-old! which returned the old value instead of the new.



 Comments   
Comment by Philip Potter [ 28/Jun/14 4:00 PM ]

Overtone already defines functions like this in overtone.helpers.ref, which get used by overtone.libs.event. These return both the old and the new value, although in all existing use cases only the old value gets used.

flatland/useful defines a trade! fn which returns the old value, although the implementation is less clean than a compare-and-set! based solution would be.

Comment by Philip Potter [ 29/Jun/14 6:23 AM ]

Chris Ford suggested "swap-out!" as a name for this function. I definitely think "swap-returning-old!" isn't the ideal name.

Comment by Jozef Wagner [ 30/Jun/14 1:33 AM ]

I propose a switch! name. The verb switch is defined as "substitute (two items) for each other; exchange.", and as you get the old value back, it evokes slightly the exchange of items.

Comment by Philip Potter [ 30/Jun/14 3:03 AM ]

Medley also has a deref-swap! which does the same thing.

Comment by Alex Miller [ 30/Jun/14 8:20 AM ]

I think deref-swap! seems like a morally equivalent name to Java's AtomicReference.getAndSet() which is the same idea.

Comment by Philip Potter [ 30/Jun/14 1:19 PM ]

Funny you say that Alex, because prismatic/plumbing defines a get-and-set! (also defined by other projects), equivalent to deref-reset! in medley. Plumbing also defines swap-pair! which returns both old and new values, like the overtone fn, although once again the only usage I can find only uses the old value.

Comment by Alex Miller [ 30/Jun/14 3:37 PM ]

I think it's important to retain the notion that you are not switching/exchanging values but applying the update model of applying a function to the old value to produce the new value. I don't even particularly like "swap!" as I think that aspect is lost in the name (alter and alter-var-root are better). I like that "deref-swap!" combines two words with existing connotations and orders them appropriately.

Comment by Timothy Baldridge [ 30/Jun/14 3:43 PM ]

except that that naming doesn't fit well compared to functions like nfirst which are defined as (comp next first). This function is not (comp deref swap!).





[CLJ-1668] ns macro throws NPE if empty reference is specified Created: 02/Mar/15  Updated: 02/Mar/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Philipp Meier Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: errormsgs, macro, namespace


 Description   

The following invocations of `ns` will all throw a NPE

(ns foo ())
(ns foo [])
(ns foo (:require clojure.core) ())

;; throw


1. Unhandled java.lang.NullPointerException
   (No message)

                      core.clj: 1518  clojure.core/name
                      core.clj: 5330  clojure.core/ns/process-reference
                      core.clj: 2559  clojure.core/map/fn
                  LazySeq.java:   40  clojure.lang.LazySeq/sval
                  LazySeq.java:   49  clojure.lang.LazySeq/seq
                       RT.java:  484  clojure.lang.RT/seq
                      core.clj:  133  clojure.core/seq
                      core.clj:  694  clojure.core/concat/cat/fn
                  LazySeq.java:   40  clojure.lang.LazySeq/sval
                  LazySeq.java:   49  clojure.lang.LazySeq/seq
                     Cons.java:   39  clojure.lang.Cons/next
                       RT.java: 1654  clojure.lang.RT/boundedLength
                      AFn.java:  148  clojure.lang.AFn/applyToHelper
                      Var.java:  700  clojure.lang.Var/applyTo

I'd expect an exception that is describing the cause of the error, not an "symptom".






[CLJ-1060] 'list*' returns not a list Created: 03/Sep/12  Updated: 26/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Trivial
Reporter: Andrei Zhlobich Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File list-star-fix.diff    
Patch: Code and Test

 Description   

Function 'list*' returns sequence, but not a list.
It is a bit confusing.

user=> (list? (list* 1 '(2 3)))
false


 Comments   
Comment by Stuart Halloway [ 17/Sep/12 6:52 AM ]

should the docstring for list* change to say it returns a seq?

Comment by Timothy Baldridge [ 27/Nov/12 11:58 AM ]

Is there a reason why we can't have list* actually return a list? The cost of creating a list vs a cons is negligible.

Comment by Marek Srank [ 04/Jan/13 2:02 PM ]

The question is what to do with the one-argument case of list*, because in cases like: (list* {:a 1 :b 2}) it doesn't return IPersistentList as well. I propose just applying 'list'.

I added patch 'list-star-fix.diff' (dated 04/Jan/2013) with Cons implementing IPersistentList and doing (apply list args) in one-argument case of list*. To be able to use 'apply' in list* I had to declare it before the definition of list* in the source code. The apply itself also uses list*, but luckily not the one-argument version of list*, so it should be safe... The patch also contains simple unit tests.

Discussion is also here: https://groups.google.com/forum/#!topic/clojure/co8lcKymfi8

Comment by Michał Marczyk [ 04/Jan/13 4:11 PM ]

(apply list args) would make (list* (range)) hang, where currently it returns a partially-realized lazy seq. Also, even for non-lazy seqs – possibly lists themselves – it would always build a new list from scratch, right?

Also, if I'm reading the patch correctly, it would make 2+-ary list* overloads and cons return lists – that is, IPersistentList instances – always (Conses would now be lists), but repeatedly calling next on such a list might eventually produce a non-list. The only way around that would be to make all seqs into IPersistentLists – that doesn't seem desirable at first glance...?

On a side note, for the case where the final "args" argument is in fact a list, we already have a "prepend many items" function, namely conj. list* currently acts as the vararg variant of cons (in line with Lisp tradition); I'm actually quite happy with that.

Comment by Michał Marczyk [ 04/Jan/13 4:19 PM ]

I'm in favour of the "list" -> "seq" tweak to the docstring though, assuming impl remains unchanged.

Comment by Marek Srank [ 04/Jan/13 6:13 PM ]

Yep, these are all valid points, thanks! I see this as a question whether the list* function is a list constructor or not. If yes (and it would be possible to implement it in a satisfactory way), it should probably return a list.

We could avoid building a new list by sth like:

(if (list? args)
  args
  (apply list args))

(btw, 'vec' also creates a new vector even when the argument itself is a vector)

The contract of next seems to be to return a seq, not a list - its docstring reads: "Returns a seq of the items after the first. Calls seq on its argument. If there are no more items, returns nil."

Btw, in some Lisp/Scheme impls I checked, cons seems to be a list as well. E.g. in CLisp (and similar in Guile and Racket):

> (listp (cons 2 '()))
T
> (listp (list* 1 2 '()))
T
Comment by Steve Miner [ 26/Feb/15 3:11 PM ]

I bump into this every once in a while and it bothers my pedantic side.

I think it's too late to change the implementation of `list*`. There's a risk of breaking existing code (dealing with lazy-seqs, etc.) It would be good to change the doc string to say it returns a seq, not a list.

But the real issue is the name of the function implies that it will return a list. You could deprecate `list*` (but keep it forever for backwards compatibility.) A better name for the same implementation might be `seq*`.





[CLJ-720] check that argument to keys/vals is a Map Created: 18/Jan/11  Updated: 24/Feb/15  Resolved: 23/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Backlog
Fix Version/s: Backlog

Type: Enhancement Priority: Minor
Reporter: Stuart Sierra Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: checkargs

Attachments: Text File keys-vals-type-1.patch     Text File keys-vals-type-2.patch    
Patch: Code and Test
Approval: Ok

 Description   

Current behavior:

  • If you call keys or vals on something that is not a Map, you do not get a ClassCastException until the KeySeq or ValSeq is consumed
  • Calling keys or vals on an empty collection of any type, even non-Map types, returns nil

The attached patch:

  • checks the type of the argument to keys and vals and throws IllegalArgumentException if it is neither java.util.Map nor null
  • changes tests for keys and vals to check that those functions throw IllegalArgumentException on empty collections that are not maps


 Comments   
Comment by Rich Hickey [ 20/Jan/11 7:50 AM ]

Please don't test for specific exception types - thanks

Comment by Stuart Sierra [ 20/Jan/11 7:59 AM ]

keys-vals-type-2.patch replaces previous. Tests check for Exception instead of IllegalArgumentException

Comment by OHTA Shogo [ 20/Feb/15 7:37 PM ]

Is there any reason why this patch hasn't been applied yet?
Recently I suffered from the ClassCastException raised by something like (keys [[:a 0]]), which caused my CIDER to miss stack traces somehow and made debug harder.

Comment by Andy Fingerhut [ 20/Feb/15 9:48 PM ]

It was committed here: https://github.com/clojure/clojure/commit/13d9404b5227f3b9e8f86371d211be890e5302a9

later it was reverted with this commit: https://github.com/clojure/clojure/commit/d93ef6293bbcc6059e6da77dbf6ed26167f36cdb
The comment on that commit was 'breaks subset'. I don't know the details of how it breaks subseq.

Comment by Andy Fingerhut [ 20/Feb/15 9:50 PM ]

'breaks subset' in my previous comment should have been 'breaks subseq'. Perhaps JIRA only lets one edit their comments for tickets that are still open.

Comment by Andy Fingerhut [ 20/Feb/15 9:52 PM ]

I don't know if dynalint [1] includes a check for this, but you might consider suggesting the addition of such a run-time check for that project.

There is also Dire [2] which may help add dynamic run-time checks to functions in clojure.core (I haven't used it for that before).

[1] https://github.com/frenchy64/dynalint
[2] https://github.com/MichaelDrogalis/dire

Comment by Alex Miller [ 20/Feb/15 9:54 PM ]

Subseq and rsubseq are kind of special cases where iirc they work with seqs of mapentries instead of a map. I ran into this recently with CLJ-1602 and added some tests there for subseq.

Comment by Andy Fingerhut [ 23/Feb/15 5:53 PM ]

Temporarily reopening ticket to add a 'checkargs' label to it. Will return to original state (except for the new label) soon.

Comment by OHTA Shogo [ 24/Feb/15 7:32 PM ]

Thank you for comments.
I didn't know keys/vals should work with seqs of mapentries (it doesn't seem their docstrings are saying so either).





[CLJ-17] GC Issue 13: validate in (keyword s) and (symbol s) Created: 17/Jun/09  Updated: 24/Feb/15  Resolved: 24/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Backlog

Type: Enhancement Priority: Blocker
Reporter: Rich Hickey Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: checkargs


 Description   
Reported by richhickey, Dec 17, 2008
Make sure they create readable keywords/symbols

Comment 1 by p...@hagelb.org, Apr 27, 2009
Could this be done with a regex, or should it try to confirm the name using the reader?
Comment 2 by p...@hagelb.org, Apr 27, 2009
I've implemented this in the attached patch. One thing that could be improved is that
invalid names simply raise an Exception, though it seems LispReader's ReaderException
would be more appropriate. I wasn't sure how to raise that though since it needs a
line number; it's not clear how to get the current line number.

The patch is still an improvement on the current state of things, though I'd
appreciate a tip as to how to raise the right exception.

I've also attached a patch to the test suite that ensures (symbol s) and (keyword s)
work properly in the context of invalid names. I can re-submit this to the contrib
project if that's desired if the core patch is accepted.
 0001-Test-invalid-symbol-keyword-names-raise-exceptions.patch
1.6 KB Download
Comment 3 by p...@hagelb.org, Apr 27, 2009
Last patch had a problem; used things like defn- etc. before they were defined in
core.clj. This attachment fixes that.
 validate-symbol-keyword-names.patch
2.1 KB Download
Comment 4 by p...@hagelb.org, Jun 13 (3 days ago)
This exists as a git branch too now: 
http://github.com/technomancy/clojure/tree/validate-symbols-issue-13


 Comments   
Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/17
Attachments:
0001-Test-invalid-symbol-keyword-names-raise-exceptions.patch - https://www.assembla.com/spaces/clojure/documents/cpC-Qow3ar3P8LeJe5afGb/download/cpC-Qow3ar3P8LeJe5afGb
validate-symbol-keyword-names.patch - https://www.assembla.com/spaces/clojure/documents/cpDbtWw3ar3P8LeJe5afGb/download/cpDbtWw3ar3P8LeJe5afGb
17-validate-keywords-symbols.diff - https://www.assembla.com/spaces/clojure/documents/d61sHuVFer3OGKeJe5afGb/download/d61sHuVFer3OGKeJe5afGb

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

cemerick said: [file:cpC-Qow3ar3P8LeJe5afGb]

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

cemerick said: [file:cpDbtWw3ar3P8LeJe5afGb]

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

richhickey said: Updating tickets (#8, #19, #30, #31, #126, #17, #42, #47, #50, #61, #64, #69, #71, #77, #79, #84, #87, #89, #96, #99, #103, #107, #112, #113, #114, #115, #118, #119, #121, #122, #124)

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

technomancy said: These patches no longer cleanly apply. Working on an updated version.

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

technomancy said: [file:d61sHuVFer3OGKeJe5afGb]: test and implementation

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

richhickey said: I just wonder if we can afford the overhead of this validation all the time

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

richhickey said: Anyone have any bright ideas on how to avoid the overhead?

Comment by Assembla Importer [ 28/Sep/10 6:50 AM ]

cemerick said: I'm sure this was brought up a long time ago in prior discussions, but: should validity of symbols and keywords even be defined? Insofar as libraries use them for naming things, especially as read from external representations (e.g. XML, RDF, SGML, etc), it seems like any validation would be an entirely artificial limitation.

For my money, having keywords and symbols print in a failsafe-readable way, e.g. #|symbol with whitespace| (maybe checking at print-time to see if the more common printing is suitable, as would be most common) would:

  • guarantee that symbols and keywords are readably printable
  • not limit libraries from using keywords and such in very convenient ways
  • provide a pleasant shortcut for using symbols and keywords that might not be "valid", but still somehow necessary
Comment by Rich Hickey [ 07/Oct/11 8:03 AM ]

Runtime validation off the table for perf reasons. cemerick's suggestion that arbitrary symbol support will render them valid is sound, but arbitrary symbol support is a different ticket/idea.

Comment by Andy Fingerhut [ 24/Feb/15 12:03 PM ]

Temporarily reopening to add a label. Will return to original state shortly.





[CLJ-1033] pr-str and read-string don't handle @ symbols inside keywords properly Created: 26/Jul/12  Updated: 24/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Steven Ruppert Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs, print, reader
Environment:

Ubuntu 12.04 LTS; Java 1.7.0_05 Java HotSpot(TM) Client VM



 Description   
user=> (read-string (pr-str {(keyword "key@other") :stuff}))
RuntimeException Map literal must contain an even number of forms  clojure.lang.Util.runtimeException (Util.java:170)

pr-str emits "{:key@other :stuff}", which read-string fails to interpret correctly. Either pr-str needs to escape the @ symbol, or read-string needs to handle the symbol inside a keyword.

Background: I'm passing a map with email addresses as keys through Storm bolts, which require a thrift-serializable form. Using the pr-str/read-string combo fails on these keys, so I've fallen back to JSON serialization.



 Comments   
Comment by Stuart Halloway [ 10/Aug/12 12:51 PM ]

The '@' character is not a legal character for keywords or symbols (see http://clojure.org/reader). Recategorized as enhancement request.

Comment by Steven Ruppert [ 10/Aug/12 1:04 PM ]

Then why doesn't (keyword "keywith@") throw an exception? It seems inconsistent with your statement.

Comment by Andy Fingerhut [ 13/Sep/12 2:23 PM ]

It is a long standing property of Clojure that it does not throw exceptions for everything that is illegal.

Comment by Steven Ruppert [ 17/Sep/12 2:16 PM ]

Yeah, but read-string clearly does. Is there a good reason that the "keyword" function can't throw an exception? With the other special rules on namespaces within symbol names, the "keyword" function really should be doing validation.

Another solution would be to allow a ruby-like :"symbol with disallowed characters" literal, but that would also be confusing with how the namespace is handled.

https://groups.google.com/forum/?fromgroups=#!topic/clojure/Ct5v9w0yNAE has some older discussion on this topic.

Comment by Andy Fingerhut [ 17/Sep/12 7:43 PM ]

Disclaimer: I'm not a Clojure/core member, just an interested contributor who doesn't know all the design decisions that were made here.

Steven, I think perhaps a couple of concerns are: (1) doing such checks would be slower than not doing them, and (2) implementing such checks means having to update them if/when the rules for legal symbols, keywords, namespace names, etc. change.

Would you be interested in writing strict versions of functions like symbol and keyword for addition to a contrib library? And test suites that try to hit a significant number of corner cases in the rules for what is legal and what is not? I mean those as serious questions, not rhetorical ones. This would permit people that want to use the strict versions of these functions to do so, and at the same time make it easy to measure performance differences between the strict and loose versions.

Comment by Steven Ruppert [ 13/Jan/13 10:58 PM ]

Looking back at this, the root cause of the problem is that the {pr} function, although it by default "print[s] in a way that objects can be read by the reader" [0], doesn't always do so. Thus, the easiest "fix" is to change its docstring to warn that not all keywords can be read back.

The deeper problem is that symbol don't have a reader form that can represent all actually possible keywords (in this case, those with "@" in them). Restricting the actually-possible keywords to match the reader form, i.e. writing a strict "keyword" function actually seems like a worse solution overall to me. The better solution would be to somehow extend the keyword reader form to allow it to express all possible keywords, possibly ruby's :"keyword" syntax. Plus, that solution would avoid having to keep hypothetical strict keyword/symbol functions in sync with reader operation, and write test cases for that, and so on.

Thus, the resolution of this bug comes down to how far we're willing to go. Changing the docstring would be the easiest, but extending the keyword form would be the "best" resolution, IMO.

[0]: http://clojuredocs.org/clojure_core/clojure.core/pr

Comment by Andy Fingerhut [ 19/Aug/13 10:54 AM ]

I happened across ticket CLJ-17 yesterday. Its discussion thread shows that the topic of validating the contents of constructed keywords and symbols has arisen before. At the time, a patch was written that modified the functions "symbol" and "keyword" so that the symbol/keyword was constructed as it does now, but then it was double-checked for readability using the clojure.lang.RT/readString method on the string arg given. It threw an exception if the intern and readString methods returned non-equal symbols (or if readString threw an exception).

Rich was concerned that this run-time overhead would be too high, and asked if anyone knew a faster way of doing it. Chas Emerick proposed making all symbols readable using a syntax like Common Lisp's #|symbol with whitespace|, with some checks for the common case where the quoting would be unnecessary. Rich was open to the idea of quoting arbitrary symbols, but that it would be a different ticket than that one.

I am not aware of anyone creating a ticket to introduce quoting of arbitrary symbols since then, but I could have missed it. This ticket could become that ticket, but its description would need significant editing, and code changes would be needed in a variety of places in Clojure.





[CLJ-1087] clojure.data/diff uses set union on key seqs Created: 15/Oct/12  Updated: 24/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Tom Jack Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs, performance

Attachments: Text File clj-1087-diff-perf-enhance-patch-v1.txt    
Patch: Code

 Description   

clojure.data/diff, on line 118, defines:

java.util.Map
(diff-similar [a b]
  (diff-associative a b (set/union (keys a) (keys b))))

Since keys returns a key seq, this seems like an error. clojure.set/union has strange and inconsistent behavior with regard to non-sets, and in this case the two key seqs are concatenated. Based on a cursory benchmark, it seems that this bug is a slight performance gain when the maps have no common keys, and a significant performance loss when the maps have the same keys. The results are still correct because of the merging reduce in diff-associative.

The patch is easy (just call set on each key seq).



 Comments   
Comment by Andy Fingerhut [ 15/Oct/12 2:52 PM ]

clj-1087-diff-perf-enhance-patch-v1.txt dated Oct 15 2012 implements Tom's suggested performance enhancement, although not exactly in the way he suggested. It does calculate the union of the two key sequences.





[CLJ-1099] better error message when passing non-seq to seq Created: 01/Nov/12  Updated: 24/Feb/15

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Stuart Halloway Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: checkargs

Attachments: Text File better-error-message-for-seq.patch    
Patch: Code
Approval: Vetted

 Description   

Design discussion here.

This patch improves Clojure's error message for a single common error: passing a non-seq where a seq is neede. More importantly, it is intended as a prototype for other similar improvements in the future.

Error message before:

(cons 1 2)
=> IllegalArgumentException Don't know how to create ISeq from: java.lang.Long

Error message after:

user=> (cons 1 2)
ExceptionInfo Don't know how to create ISeq from: java.lang.Long
user=> (ex-data *e)
{:instance 2}

Patch: better-error-message-for-seq.patch
NOTE: This patch was reverted as it affected the inlining of RT.seqFrom().



 Comments   
Comment by Michael Klishin [ 12/Nov/12 10:34 AM ]

Wouldn't it be better to make it read "Don't know how to create ISeq from: 2 (java.lang.Long)"? How many beginners will figure
out ex-data exists and how to use it?

Comment by Stuart Halloway [ 12/Apr/13 11:36 AM ]

Hi Michael,

ex-info messages should not, in general, pr-str things into their bodies. This raises the question of print-length and print-level in a place where the user doesn't have good control, while the whole point of ex-info is to be in the data business, not the string business. Users can control printing from ex-data any way they like.

There are two possible ways to make beginners aware of ex-data: Tell them about it in one (or a few places) in docs, or in an infinite number of places saying "This would have been useful here, but we didn't use it because you might not know about it." I prefer the former.

That said, I think it would be great to increase the visibility of ex-info and ex-data early on in documentation for beginners, and to make sure that things like exception printing in logs are flexible enough not to lose the benefits of ex-info.

Comment by Andy Fingerhut [ 25/Mar/14 5:14 PM ]

Just a comment that this fix was committed before release 1.6.0, and then reverted very shortly before release 1.6.0. I believe the reason for reverting was due to concerns that this change made performance about 5% slower in some relatively common cases, with a suspicion that it could have affected inlining of the seqFrom method.

Not sure whether the ticket should be reopened or not.





[CLJ-1438] bit-* functions don't check for overflow Created: 05/Jun/14  Updated: 24/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Pascal Germroth Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs, math


 Description   

The bit* functions, in contrast to the other numerical functions, don't appear to check for overflow, i.e. (bit-test 13 200000) returns true.

It would be nice if the behaviour would fit the other numerical operators, i.e. throw on overflow and provide a variant that doesn't, and one that works with arbitrary precision, also not currently supported:
(bit-test (bigint 13) 20000), (bit-test (biginteger 13) 20000) throw IllegalArgumentException.






[CLJ-1346] clojure.core.VecSeq does not implement method equals Created: 09/Feb/14  Updated: 24/Feb/15  Resolved: 24/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5, Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Andy Fingerhut Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None


 Description   

.equals is asymmetric for seq's of primitive vectors and PersistentVectors, because clojure.core.VecSeq does not implement Java's equals method. It implements equiv, so clojure.core/= is fine:

user=> (def v1 [1 2 3])
#'user/v1
user=> (def v2 (vector-of :long 1 2 3))
#'user/v2
user=> (= v1 v2)
true
user=> (.equals v1 v2)
true
user=> (= (seq v1) (seq v2))
true
user=> (.equals (seq v1) (seq v2))
true
user=> (= v2 v1)
true
user=> (.equals v2 v1)
true
user=> (= (seq v2) (seq v1))
true
;; This is the one that is not like the others, and a symptom of the problem
user=> (.equals (seq v2) (seq v1))
false


 Comments   
Comment by Andy Fingerhut [ 24/Feb/15 10:32 AM ]

CLJ-1364 includes this as well as other things that are missing in clojure.core.VecSeq implementation.





[CLJ-1341] keyword function returns nil on bad input Created: 05/Feb/14  Updated: 24/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs, errormsgs

Attachments: Text File keyword-1341-2014-02-12.2.patch     Text File keyword-1341-2014-02-12.patch    
Patch: Code and Test

 Description   

The keyword function should throw an exception on bad input rather than return nil.

user=> (keyword 5)
nil
user=> (keyword [])
nil

Cause: The keyword function is defined as a cond with cases for keywords, symbols, and strings. There is no :else so all other cases return nil.

Proposal: Add an :else branch and throw an exception in keyword.

Patch:



 Comments   
Comment by Eric Normand [ 12/Feb/14 7:17 PM ]

The keyword function should throw an IllegalArgumentException on wrong argument type rather than return nil. For consistency, the two-argument case should throw an IllegalArgumentException if not both arguments are strings.

The find-keyword function should behave similarly to maintain the same signature.

Current behavior:

user=> (keyword 5)
nil
user=> (keyword [])
nil
user=> (keyword 1 1)
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.String

Cause: The keyword function is defined as a cond with cases for keywords, symbols, and strings. There is no :else so all other cases return nil. The same goes for find-keyword, which should behave in the same way. The two-argument case does throw an exception, but the message is not very helpful.

Proposal: I have added an :else branch to the cond that throws an IllegalArgumentException with a message that indicates the acceptable types and prints the actual argument. I made the same change to find-keyword. There are also simple tests.

Patch: keyword-1341-2014-02-12.patch

Note: This change does not check for all bad input, just the type. For instance, it is still possible to pass in a string with "illegal" keyword characters.

Comment by Alex Miller [ 12/Feb/14 9:20 PM ]

Hey Eric, thanks for the patch! The 1 arg change looks good.

On the 2 arg change I have a concern - I'm worried that we are adding new checks into a pretty hot code path (keyword creation). The 2 arg path is not a silent failure as you'll get a ClassCastException so I do not think adding these checks here is worth it. In the 1-arg case you've already fallen through the else, so there's no additional cost.

Comment by Eric Normand [ 12/Feb/14 9:35 PM ]

Understood. I'll remove the two-argument case.

Comment by Eric Normand [ 12/Feb/14 9:51 PM ]

The keyword function should throw an IllegalArgumentException on wrong argument type rather than return nil. The two-argument case already throws an exception.

The find-keyword function should behave similarly to maintain the same signature.

Current behavior:

user=> (keyword 5)
nil
user=> (keyword [])
nil
user=> (keyword 1 1)
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.String

Cause: The keyword function is defined as a cond with cases for keywords, symbols, and strings. There is no :else so all other cases return nil. The same goes for find-keyword, which should behave in the same way.

Proposal: I have added an :else branch to the cond that throws an IllegalArgumentException with a message that indicates the acceptable types and prints the actual argument. I made the same change to find-keyword. There are also simple tests.

Alternatives: Adding checks for the two-argument case was considered but it was feared that adding the extra overhead was not worth it since it already threw an exception. No significant overhead is added in the single-argument case since it will only affect erroneous input.

Patch: keyword-1341-2014-02-12.2.patch

Note: This change does not check for all bad input, just the type. For instance, it is still possible to pass in a string with "illegal" keyword characters.





[CLJ-995] sorted-set doesn't support IEditableCollection Created: 13/May/12  Updated: 24/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Moritz Ulrich Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: collections


 Description   

I think sorted-set (PersistentTreeSet) should implement the transient interface. It's a special-purpose set and should be usable just like every normal set.



 Comments   
Comment by Michel Alexandre Salim [ 04/Jun/12 2:32 AM ]

Note that this would require PersistentTreeMap to implement IEditableCollection as well.

Comment by Andy Fingerhut [ 24/Feb/15 10:00 AM ]

Anyone wishing for sorted sets/maps with transients should consider the data.avl library, which has them: https://github.com/clojure/data.avl





[CLJ-810] clojure.set/difference Created: 15/Jun/11  Updated: 23/Feb/15  Resolved: 23/Feb/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Joel Kaasinen Assignee: Unassigned
Resolution: Declined Votes: 1
Labels: checkargs
Environment:

Linux, openjdk 6



 Description   

This seems a bit weird. Maybe set difference could check that the collections are sets?

(clojure.set/difference (set [0 300 :a]) [:a :b])
#{0 300}
(clojure.set/difference (set [0 300 :a]) [:a :b :c :d])
#{:a 300}
(clojure.set/difference (set [0 300 :a]) (set [:a :b :c :d]))
#{0 300}

The problem is still present in the master branch too.



 Comments   
Comment by Stuart Halloway [ 20/Jun/11 7:04 AM ]

set/difference's behavior is not documented if you don't pass in a set.

Comment by Andy Fingerhut [ 23/Feb/15 9:23 PM ]

Temporarily reopening only to add a label, and then will return to its original state.





[CLJ-790] Primitive type hints on function names should print error message Created: 10/May/11  Updated: 23/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Alan Dipert Assignee: Alan Dipert
Resolution: Unresolved Votes: 0
Labels: errormsgs


 Description   

Functions returning primitives are hinted with metadata on the argument list, not on the function name. Using a primitive type hint on a function name should print an error message.

Currently, misplaced primitive hints are read without error.



 Comments   
Comment by Andy Fingerhut [ 23/Feb/15 9:20 PM ]

One can type hint a primitive value on a Var naming a function, or any value one wants, like so:

(def {:tag 'long} foo 17)

(defn {:tag 'double} bar [x y]
(* 2.0 x y))

I think it is odd that one must use {:tag 'long} instead of ^long, since trying to use ^long ends up giving the useless type hint that is the value of the function clojure.core/long.

However, the Clojure compiler will use the primitive type hints as shown in the examples above to avoid reflection in appropriate Java interop calls, so making them an error seems undesirable.





[CLJ-758] Add Clojure program to generate HTML and LaTeX cheatsheets from common source Created: 14/Mar/11  Updated: 23/Feb/15  Resolved: 17/Feb/12

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.2
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Andy Fingerhut Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: None

Attachments: File clj758-patch-v1.diff    

 Description   

Discussion thread on clojure-dev: http://groups.google.com/group/clojure-dev/browse_thread/thread/265b887a940219b2#

Steve Tayon created a cheat sheet of Clojure functions, macros, and vars organized into categories. He created it in LaTeX, editing it through several revisions. One or a few revisions were manually converted to HTML for publication at URL http://clojure.org/cheatsheet

Alex Miller proposed creating a single source that could then be used to automatically generate both HTML and LaTeX/PDF versions, where each symbol on the cheatsheet would be a link to documentation on clojure.github.com

Andy Fingerhut created a Clojure program that, from a single data structure contained within the program, generates either HTML or LaTeX, with or without the links. The LaTeX can be used to generate PDF files in either A4 or US letter size pages.



 Comments   
Comment by Andy Fingerhut [ 14/Mar/11 9:19 PM ]

First attempt at patch for adding cheat sheet generator program. See readme.txt inside there for some open questions on how this ought to be packaged, e.g. does it belong here or in clojure-contrib? What kinds of ant/maven targets should it have?

Comment by Andy Fingerhut [ 17/Feb/12 2:36 AM ]

Updated versions of this code for newer versions of the cheatsheet will be added to a github repo soon, perhaps this one: http://github.com/fogus/clojure-cheatsheets

Comment by Andy Fingerhut [ 23/Feb/15 9:15 PM ]

The actual Github repo is here: https://github.com/jafingerhut/clojure-cheatsheets





[CLJ-706] make use of deprecated namespaces/vars easier to spot Created: 05/Jan/11  Updated: 23/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Stuart Halloway Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: errormsgs

Attachments: File 706-deprecated-var-warning.diff     Text File 706-deprecated-var-warning-patch-v2.txt     File 706-fix-deprecation-warnings-agents.diff     File 706-fix-deprecation-warnings-on-replicate.diff     File 706-fix-deprecation-warning-test-junit.diff     File 706-warning-on-deprecated-ns.diff    
Patch: Code

 Description   

From the mailing list http://groups.google.com/group/clojure/msg/c41d909bd58e4534. It is easy to use deprecated namespaces without knowing you are doing so. The documentation warnings are small, and there is no compiler warning.

Some possibilities include:

  1. much more visible deprecation styling in the documentation
  2. stderr warnings when referencing a deprecated thing.

I don't love the idea of stderr warnings on all the time. Rich: is there an approach to this that you would like to see a patch for?



 Comments   
Comment by Rich Hickey [ 07/Jan/11 9:38 AM ]

I don't mind warning to stderr

Comment by Luke VanderHart [ 26/Oct/12 1:37 PM ]

706-deprecated-var-warning.diff adds warnings when using deprecated vars. The other three patches clean up deprecation warnings.

Comment by Andy Fingerhut [ 26/Oct/12 2:23 PM ]

Great stuff. I looked through the first patch, and didn't see anything in there that lets someone disable deprecation warnings from the command line, the way that warn-on-reflection can today be set to true with a command line option.

Is that something important to have for deprecation warnings?

Comment by Andy Fingerhut [ 28/Oct/12 4:57 PM ]

I was hoping it would be quick and easy to add source file, line, and column info to the deprecation warning messages. It isn't as easy as adding them to the format() call, because the method analyzeSymbol doesn't receive these values as args. Is this deprecation check being done in a place where it is not easy to relate it to the source file, line, and column? Could it be done in a place where that info is easily available?

Comment by Ghadi Shayban [ 29/Oct/12 1:02 AM ]

Another patch - this time to warn on loading deprecated namespaces, instead of vars. This patch requires the first one.

Re: line/column, I'll figure out how to thread the compile context through if it's desired.

Re: Compile flag. I have a patch for this also, but I'm still verifying how to invoke. How is warn-on-reflection set by command line?

Comment by Andy Fingerhut [ 29/Oct/12 1:43 AM ]

Re: Compile flag. Don't hold off on implementing a flag if you want to, but it might be worth hearing from others whether such a command line option is even desired. I was asking in hopes of eliciting such a response.

For the way that it is handled in the Clojure compiler, search for REFLECTION_WARNING_PROP and related code in Compile.java. If you invoke the Clojure compiler directly via a Java command line, use -Dclojure.compile.warn-on-reflection=true (default is false). See the recent email thread sent to Clojure Dev Google group if you want to know how to do it via ant or Maven. Link: https://mail.google.com/mail/?shva=1#label/clojure-dev/13aa0e34530196c3

There is also a separate command-line flag called compiler-options (see Compile.java) that is implemented as a map inside the compiler. It was added more recently than warn-on-reflection, and might be the preferred method to add more such options, to avoid having to continue to add more arguments to the pushThreadBindings calls done in several places.

Comment by Ghadi Shayban [ 29/Oct/12 10:36 AM ]

Thanks, Andy.

Alternately for the last ns patch, it is equivalent to call (print-method msg err), rather than binding out to err, may be more readable. I'll be glad to send that in if it's preferable.

Comment by Andy Fingerhut [ 13/Feb/13 12:38 AM ]

706-deprecated-var-warning-patch-v2.txt dated Feb 12 2013 is identical to 706-deprecated-var-warning.diff dated Oct 26 2012, except it applies cleanly to latest master.

Comment by Andy Fingerhut [ 23/Feb/15 8:21 PM ]

For the information of anyone examining this ticket wishing for this feature, the Eastwood lint tool reports calls to deprecated Clojure functions, and also to deprecated Java methods. https://github.com/jonase/eastwood





[CLJ-77] GC Issue 74: Clojure compiler emits too-large classfiles (results in ClassFormatError) Created: 17/Jun/09  Updated: 23/Feb/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Chas Emerick Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: errormsgs


 Description   
Reported by cemer...@snowtide.com, Feb 10, 2009

The jvm has certain implementation limits around the maximum size of
classfiles, literal strings, method length, etc; however, in certain
circumstances, the Clojure compiler can currently emit classfiles that
violate some of those limitations, causing an error later when the
classfile is loaded.

While test coverage would necessarily detect this sort of problem on a
project-by-project basis when one's tests attempted to load a project's
classfiles, it seems like Clojure should do the following to ensure failure
as quickly as possible:

- throw an exception immediately if, while compiling a lib, it is detected
that the resulting classfile(s) would violate any classfile implementati