<< Back to previous view

[CLJ-1671] Clojure socket server Created: 09/Mar/15  Updated: 04/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.8

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: repl

Attachments: Text File clj-1671-10.patch     Text File clj-1671-11.patch     Text File clj-1671-12.patch     Text File clj-1671-2.patch     Text File clj-1671-3.patch     Text File clj-1671-4.patch     Text File clj-1671-5.patch     Text File clj-1671-6.patch     Text File clj-1671-7.patch     Text File clj-1671-8.patch     Text File clj-1671-9.patch    
Patch: Code and Test
Approval: Vetted

 Description   

Programs often want to provide REPLs to users in contexts when a) network communication is desired, b) capturing stdio is difficult, or c) when more than one REPL session is desired. In addition, tools that want to support REPLs and simultaneous conversations with the host are difficult with a single stdio REPL as currently provided by Clojure.

Tooling and users often need to enable a REPL on a program without changing the program, e.g. without asking author or program to include code to start a REPL host of some sort. Thus a solution must be externally and declaratively configured (no user code changes). A REPL is just a special case of a socket service. Rather than provide a socket server REPL, provide a built-in socket server that composes with the existing repl function.

For design background, see: http://dev.clojure.org/display/design/Socket+Server+REPL

Start a socket server by supplying an extra system property (classpath and clojure.main here, but this would generally be starting your own app instead - we won't use the repl it starts):

java -cp target/classes -Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}" clojure.main

where options are:

  • address = host or address, defaults to loopback
  • port = port, required
  • accept = namespaced function to invoke on socket accept, required
  • args = sequential collection of args to pass to accept
  • bind-err = defaults to true, binds err to out stream
  • server-daemon = defaults to true, socket server thread doesn't block exit
  • client-daemon = defaults to true, socket client threads don't block exit

Run a repl client using telnet:

$ telnet 127.0.0.1 5555
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
user=> (println "hello")
hello
nil
user=> clojure.core.server/*session*
{:server "repl", :client "1"}
user=> (ns foo)
nil
foo=> (+ 1 1)
2
foo=>

Now open a 2nd telnet client and "attach" to the session of the first client:

telnet 127.0.0.1 5555
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
user=> :repl/sessions
[{:server "repl", :client "1"} {:server "repl", :client "2"}]
user=> :repl/attach {:server "repl", :client "1"}
true
1|foo=> *ns*
#object[clojure.lang.Namespace 0x6023edaa "foo"]
1|foo=> *1
2
1|foo=> :repl/detach
true
user=> *1
true
user=> :repl/quit
Connection closed by foreign host.

Note that when the session is attached, the prompt changes, showing the client id and namespace of the attached session instead. When attached to another session, your expressions are evaluated in the context of the attached session, so you can grab the current namespace or even the *1/*2/*3 results. This is an example of how a tool could attach and be "inside" the context of the user's repl.

Patch: clj-1671-12.patch



 Comments   
Comment by Timothy Baldridge [ 09/Mar/15 5:50 PM ]

Could we perhaps keep this as a contrib library? This ticket simply states "The goal is to provide a simple streaming socket repl as part of Clojure." What is the rationale for the "part of Clojure" bit?

Comment by Alex Miller [ 09/Mar/15 7:33 PM ]

We want this to be available as a Clojure.main option. It's all additive - why wouldn't you want it in the box?

Comment by Timothy Baldridge [ 09/Mar/15 10:19 PM ]

It never has really been too clear to me why some features are included in core, while others are kept in contrib. I understand that some are simply for historical reasons, but aside from that there doesn't seem to be too much of a philosophy behind it.

However it should be noted that since patches to clojure are much more guarded it's sometimes nice to have certain features in contrib, that way they can evolve with more rapidity than the one release a year that clojure has been going through.

But aside from those issues, I've found that breaking functionality into modules forces the core of a system to become more configurable. Perhaps I would like to use this repl socket feature, but pipe the data over a different communication protocol, or through a different serializer. If this feature were to be coded as a contrib library it would expose extension points that others could use to add additional functionality.

So I guess, all that to say, I'd prefer a tool I can compose rather than a pre-built solution.

Comment by Rich Hickey [ 10/Mar/15 6:25 AM ]

Please move discussions on the merits of the idea to the dev list. Comments should be about the work of resolving the ticket, approach taken by the patch, quality/perf issues etc.

Comment by Colin Jones [ 11/Mar/15 1:33 PM ]

I see that context (a) of the rationale is that network communication is desired, which sounds to me like users of this feature may want to communicate across hosts (whether in VMs or otherwise). Is that the case?

If so, it seems like specifying the address to bind to (e.g. "0.0.0.0", "::", "127.0.0.1", etc.) may become important as well as the existing port option. This way, someone who wants to communicate across hosts (or conversely, lock down access to local-only) can make that decision.

Comment by Alex Miller [ 11/Mar/15 2:07 PM ]

Colin - agreed. There are many ways to potentially customize what's in there so we need to figure out what's worth doing, both in the function and via the command line.

I think address is clearly worth having via the function and possibly in the command line too.

Comment by Kevin Downey [ 11/Mar/15 5:49 PM ]

I find the exception printing behavior really odd. for a machine you want an exception as data, but you also want some indication of if the data is an error or not, for a human you wanted a pretty printed stacktrace. making the socket repl default to printing errors this way seems to optimize for neither.

Comment by Rich Hickey [ 12/Mar/15 12:29 PM ]

Did you miss the #error tag? That indicates the data is an error. It is likely we will pprint the error data, making it not bad for both purposes

Comment by Alex Miller [ 13/Mar/15 11:29 AM ]

New -4 patch changes:

  • clojure.core/throwable-as-map now public and named clojure.core/Throwable->map
  • catch and ignore SocketException without printing in socket server repl (for client disconnect)
  • functions to print as message and as data are now: clojure.main/err-print and clojure.main/err->map. All defaults and docs updated.
Comment by David Nolen [ 18/Mar/15 12:44 PM ]

Is there any reason to not allow supplying :eval in addition to :use-prompt? In the case of projects like ClojureCLR + Unity eval generally must happen on the main thread. With :eval as something which can be configured, REPL sessions can queue forms to be eval'ed with the needed context (current ns etc.) to the main thread.

Comment by Kevin Downey [ 20/Mar/15 2:12 PM ]

I did see the #error tag, but throwables print with that tag regardless of if they are actually thrown or if they are just the value returned from a function. Admittedly returning exceptions as values is not something generally done, but the jvm does distinguish between a return value and a thrown exception. Having a repl that doesn't distinguish between the two strikes me as an odd design. The repl you get from clojure.main currently prints the message from a thrown uncaught throwable, and on master prints with #error if you have a throwable value, so it distinguishes between an uncaught thrown throwable and a throwable value. That obviously isn't great for tooling because you don't get a good data representation in the uncaught case.

It looks like the most recent patch does pretty print uncaught throwables, which is helpful for humans to distinguish between a returned value and an uncaught throwable.

Comment by Kevin Downey [ 25/Mar/15 1:10 PM ]

alex: saying this is all additive, when it has driven changes to how things are printed, using the global print-method, rings false to me

Comment by Sam Ritchie [ 25/Mar/15 1:15 PM ]

This seems like a pretty big last minute addition for 1.7. What's the rationale for adding it here vs deferring to 1.8, or trying it out as a contrib first?

Comment by Alex Miller [ 25/Mar/15 2:13 PM ]

Kevin: changing the fallthrough printing for things that are unreadable to be readable should be useful regardless of the socket repl. It shouldn't be a change for existing programs (unless they're relying on the toString of objects without print formats).

Sam: Rich wants it in the box as a substrate for tools.

Comment by Alex Miller [ 26/Mar/15 10:03 AM ]

Marking incomplete, pending at least the repl exit question.

Comment by Laurent Petit [ 29/Apr/15 2:18 PM ]

Hello, I intend to work on this, if it appears it still has a good probability of being included in clojure 1.7.
There hasn't been much visible activity on it lately.
What is the current status of the pending question, and do you think it will still make it in 1.7?

Comment by Alex Miller [ 29/Apr/15 2:29 PM ]

This has been pushed to 1.8 and is on my plate. The direction has diverged quite a bit from the original description and we don't expect to modify clojure.main as is done in the prior patches. So, I would recommend not working on it as described here.

Comment by Laurent Petit [ 01/May/15 7:24 AM ]

OK thanks for the update.

Is the discussion about the new design / goal (you say the direction has diverged) available somewhere so that I can keep in touch with what the Hammock Time is producing? Because on my own hammock time I'm doing some mental projections for CCW support of this, based on what is publicly available here -

Also, as soon as you have something available for testing please don't hesitate to ping me, I'll see what I can do to help depending on my schedule. Cheers.

Comment by Alex Miller [ 01/May/15 8:44 AM ]

Some design work is here - http://dev.clojure.org/display/design/Socket+Server+REPL.

Comment by Laurent Petit [ 05/May/15 11:41 AM ]

Thanks for the link. It seems that the design is totally revamped indeed. Better to wait then.

Comment by Andy Fingerhut [ 04/Jul/15 1:21 PM ]

Alex, just a note that the Java method getLoopbackAddress [1] appears to have been added with Java 1.7, so your patches that use that method do not compile with Java 1.6. If the plan was for the next release of Clojure to drop support for Java 1.6 anyway, then no worries.

[1] http://docs.oracle.com/javase/7/docs/api/java/net/InetAddress.html#getLoopbackAddress%28%29

Comment by Stuart Halloway [ 08/Jul/15 8:31 AM ]

Lifecycle concerns

1. atoms are weak when actions (starting threads / sockets) are coordinated with recorded state (the map)
2. I see why the plumbing needs access to socket, but what is the motivation for expoosing it to outside code, seems like opportunity to break stuff
3. stating a server has a race condition
4. what happens if somebody wants to call start-server explicitly – how do they know whether that happens before or after config-driven process launches
5. what guarantees about when config-driven launch happens, vis-a-vis other startup-y things
6. is there a use for stop-servers other than shutdown?
7. does Clojure now need a shutdown-clojure-resources? I don't want to have to remember shutdown-agents, plus stop-servers, plus whatever we add next year
8. what happens on misconfiguration (e.g. nonexistent namespace)? will other servers still launch? what thread dies, and where does is report? will the app main process die before even reaching the user?

Comment by Alex Miller [ 08/Jul/15 9:12 AM ]

1. it's not in the patch, but the intention is that in the runtime on startup there is a call to (start-servers (System/getProperties)) and generally you're not starting servers on the fly (although it's broken out to make that possible).
2. just trying to make resources available, not sure how much locking down we want/need
3. I'm expecting this to be a startup thing primarily
4. I'm assuming the config-driven process will start in the runtime and thus it will happen first. Not sure how much we need to support the manual stuff - it just seemed like a good idea to make it possible.
5. dunno, haven't looked at where to do that yet. Probably somewhere similar to data_readers stuff?
6. no
7. the threads are daemons by default meaning that it will shut down regardless. If you set the daemon properties to false, then you're in control and need to call stop-servers where it's appropriate.
8. I thought about these questions and do not have good answers. Lots more of that stuff needs to be handled.

Comment by Alex Miller [ 10/Aug/15 10:55 AM ]

More feedback forwarded from Stu:

1. Need to switch from atom to locking around the server/client state map.
2. It is not clear to me exactly what gets pushed by :repl/push – what does "like ns" mean?
3. :repl/quit feels categorically different from push and pull. People may come to expect this on all Clojure REPLs, should we install it at the bottom instead of in the REPL server?
4. The design anticipates the possibility of using multiple connections sharing state to implement a higher level "session". I think we should implement an example of this, both to flesh out the idea and as an example for users.
5. I am opposed to adding variety to the config options more work for unclear payoff and would prefer to leave the file-based option and REPL convenience as a possibility for later
6. We have one name too few - it doesn't make sense for client id to be bound to a tuple that includes server-name and client id the word 'coordinates' has been used in some places for the bigger thing
7. Explicitly enumerate where exceptions end up may be sufficient to document that UncaughtExceptionHandler gets all of them
8. Should programs have direct access to sockets? pro: more control (e.g. design says you can grab socket from server state map and close it). con: leaking abstraction

Comment by Alex Miller [ 10/Aug/15 11:22 AM ]

1. Updating patch (but geez does this feel weird to do in Clojure)
2. I'm not entirely sure what info a tooling connection needs about a user's state. *ns* is an obvious one. Maybe *e or the other repl vars too? In some ways this is a similar problem to binding conveyance - giving one thread another's dynamic "state".
3. Agreed on :repl/quit. I had a version of this patch that actually added the "command" capability to the base repl and installed :repl/quit as a default command. I think that's a better answer but I did it this way to make the patch additive and easier to evaluate, idea-wise. So, in other words, agreed and that's do-able. Repl invokers would then need a way to install new commands, which the socket repl could do.
4. I think the "session" notion is a more useful way of saying this than I have done so far. With the benefit of time, I'm not happy with the push/pull stuff and I don't think it really solves the problem. In particular, it would be better if no explicit actions were required for the "push" part. I will do some more design thinking on this on the design page.
5. ok
6. Again, bound up in the push/pull stuff and maybe not necessary.
7. Will add to design page.
8. Given the locking and other stuff, I think I'd now say no - there should be an API for closing etc but not direct access to these.

Comment by Laurent Petit [ 14/Aug/15 4:09 AM ]

Just to check, I'd like to understand if we share the same detail view of how the `:accept` key will work.

Its default value is `clojure.repl/repl`.

I expect to be able to specify a value like `my.not.yet.loaded.namespace/my-repl-fn` and that the server code will try to require `my.not.yet.loaded.namespace` from the java classpath.

Also, when will the resolution of `:accept` take place? Before or after all the -e and -i expressions / files have been evaluated? That is, will it be possible to load `my.not.yet.loaded.namespace` via a -i option or in the list of files to load, instead of having to tweak the classpath? (Some tools / IDEs abstracting away the notion of classpath, program arguments, jvm parameters make it difficult if possible at all to tweak the classpath, but really easy to add JVM parameters / program arguments before launching.

Comment by Alex Miller [ 14/Aug/15 8:11 AM ]

Hey Laurent,

Keep in mind this is a work in progress still. The idea here is that you can pass these system properties when running ANY Clojure program. The Clojure runtime will scan system properties and start socket servers as needed (that hook is not yet installed).

So you can do clojure.main if you like but if you're passing the server system properties the socket server will be launched. I have not determined yet exactly where that RT loading will happen but I would expect it to be prior to the -i or -e being evaluated.

Comment by Laurent Petit [ 14/Aug/15 9:08 AM ]

Sure, I know this is WIP, and that's why I find it valuable to add my 2 cents now, rather that when things are almost wrapped up.

So it will be triggered not by the fact that clojure.main is invoked, but during the initialization of the clojure runtime ?

Then it seems indeed difficult to ask to wait for clojure.main to have finished starting.

My first remark still stands, though:

  • the fully qualified var symbol String passed as value of `:accept` parameter should not require that its namespace be already loaded, or this will really limit to choice of vars that can be passed on. Some dynamic `require` call and var `resolve` should be taking place here.
Comment by Alex Miller [ 14/Aug/15 9:43 AM ]

Oh sorry, I thought you were just confirming what was already there - the current patch does execute a require for the namespace containing the accept function.

Comment by Alex Miller [ 19/Aug/15 3:27 PM ]

The -8 patch integrates the server startup into the Clojure runtime. It also improves validation and error-handling on the :repl commands and attached sessions now report the attached session ns as the repl prompt and properly allow access to *1, *2, *3.

Comment by Alex Miller [ 19/Aug/15 3:27 PM ]

Oh, and I got rid of the use of getLoopbackAddress() so this works in jdk 1.6.

Comment by Alex Miller [ 25/Aug/15 9:51 AM ]

The -10 patch defers requiring the accept ns until the socket makes a connection - this should allow a lot more freedom for the app to initialize code as needed.

Comment by Ragnar Dahlen [ 25/Aug/15 10:14 AM ]

Two comments on latest patch:

  • Swallowing the exception without notice in stop-server might be confusing for a user. If stop-server actually fails (but seemingly succeeds), calling start-server might fail too because port is already open, for example, but as a user it would be hard to understand why.
  • Would be helpful document return values in public API (if there is one), for example start-server/stop-server
Comment by Alex Miller [ 25/Aug/15 3:37 PM ]

Ragnar, good comments. I've updated the patch. That stop code was broken in a couple other ways too but should be good now.

Comment by Stuart Halloway [ 04/Sep/15 1:41 PM ]

Major stuff:

  • I don't feel like we have nailed down the requirements for the session feature, so hard to comment on that
  • attach/detach metaphor for sessions feels weird, e.g. I cannot set a dynamic var inside there and have the other session see my change. We would instead just session state query and eval-in-session APIs.
  • something needs to happen if the session you are attached to goes away out from under you

Minor stuff:

  • socket backlog arg of 0 means "use default", is there some reason 50 is better?
  • I think 'Clojure' better than 'Socket' more disinguishing as thread names
  • catch-all exception eating scares me, couldn't stop-servers call the uncaught exception handler?
  • vec + map can be replaced by mapv
  • start-client seems a very odd name to me, as this is the server side client handler
Comment by Alex Miller [ 04/Sep/15 4:45 PM ]

Added new -12 patch with the following changes per Stu's comments:

  • In the scenario where an attached session goes away, you now lose that attachment and get an exception tell you so the next time you try to eval an expression.
  • Changed socket backlog to 0 (50 is the default, but I wasn't aware that 0 would get the default)
  • Changed thread names
  • Changed stop-server to shutdown each server in a future independently - exceptions will bubble up to default uncaught exception handler
  • Replaced vec+map with mapv
  • Changed start-client to accept-connection

Deferred discussion of session intent.





[CLJ-1806] group-by as reducer / reduction fn Created: 29/Aug/15  Updated: 04/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Karsten Schmidt Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reducers, transducers

Attachments: Text File clj-1806-1.patch    
Patch: Code and Test

 Description   

Whilst working on a query engine heavily based on transducers, I noticed that it'd be great to have group-by able to be used as reduction fn. The attached patch adds a single arity version to group-by which enables this use case and also includes a few tests.



 Comments   
Comment by Karsten Schmidt [ 29/Aug/15 8:08 PM ]

patch w/ described changes





[CLJ-1807] Add prefer-proto, like prefer-method but for protocols Created: 30/Aug/15  Updated: 04/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: protocols

Attachments: Text File 0001-CLJ-1807-add-prefer-proto.patch    
Patch: Code and Test
Approval: Triaged

 Description   

Currently it's possible to extend a protocol to multiple interfaces but there's no mechanism like prefer-method for multimethods to prefer one implementation over another, as a result, if multiple interfaces match, a random one is picked.

One particular example where this is a problem, is trying to handle generically records and maps (this come up in tools.analyzer): when extending a protocol to both IRecord and IPersistentMap there's no way to make the IRecord implementation be chosen over the IPersistentMap one and thus protocols can't be used.

The attached patch adds a prefer-proto function that's like prefer-method but for protocols.

No performance penalty is paid if prefer-proto is never used, if it's used there will be a penalty during the first protocol method dispatch to lookup the perference table but the protocol method cache will remove that penalty for further calls.

Example:

user=> (defprotocol p (f [_]))
p
user=> (extend-protocol p clojure.lang.Counted (f [_] 1) clojure.lang.IObj (f [_] 2))
nil
user=> (f [1])
2
user=> (prefer-proto p clojure.lang.Counted clojure.lang.IObj)
nil
user=> (f [1])
1

Patch: 0001-CLJ-1807-add-prefer-proto.patch






[CLJ-1808] map-invert should use transients and reduce-kv instead of reduce Created: 30/Aug/15  Updated: 04/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Nikita Prokopov Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: ft

Attachments: Text File clj-1808-map-invert-should-use-reduce-kv-and-transient.patch    
Patch: Code
Approval: Triaged

 Description   

Two performance enhancements on clojure.set/map-invert:

1) Use reduce-kv to avoid realizing mapentry's from input map
2) Use transients to create the output map

Patch: clj-1808-map-invert-should-use-reduce-kv-and-transient.patch



 Comments   
Comment by Alex Miller [ 04/Sep/15 10:42 AM ]

Would be nice to see a quick perf test that compared before/after times.





[CLJ-1810] ATransientMap not marked public Created: 31/Aug/15  Updated: 04/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7, Release 1.8
Fix Version/s: None

Type: Enhancement Priority: Trivial
Reporter: Gregg Reynolds Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: ft, transient
Environment:

all


Attachments: Text File CLJ-1810-v1.patch    
Patch: Code
Approval: Triaged

 Description   

All the other abstract classes are explicitly marked "public". Seems like ATransientMap should be marked like the others.



 Comments   
Comment by Michael Blume [ 31/Aug/15 5:04 PM ]

Here is the obvious patch =)





[CLJ-1811] test line reporting doesn't always report test's file & line number Created: 02/Sep/15  Updated: 04/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7, Release 1.8
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Ryan Fowler Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: test

Attachments: File error_reporting.clj     Text File LINE_REPORING_1.patch    
Patch: Code and Test
Approval: Triaged

 Description   

If an Exception is thrown during test execution, the filename and line number are frequently not helpful for finding the problem. For instance, this code:

error_reporting.clj
(require '[clojure.test :refer [deftest test-var]])

(deftest foo
  (meta))

(test-var #'foo)

Will output an error at AFn.java:429.

ERROR in (foo) (AFn.java:429)
Uncaught exception, not in assertion.
expected: nil
  actual: clojure.lang.ArityException: Wrong number of args (0) passed to: core/meta--4144
 at clojure.lang.AFn.throwArity (AFn.java:429)
    clojure.lang.AFn.invoke (AFn.java:28)
    user/fn (error_reporting.clj:4)
    clojure.test$test_var$fn__7670.invoke (test.clj:704)
    clojure.test$test_var.invoke (test.clj:704)
    user$eval6.invoke (error_reporting.clj:6)
    clojure.lang.Compiler.eval (Compiler.java:6782)
    ...etc

Rich's Comment 24016 on CLJ-377 says that he thinks the message should report the test file line rather than where the exception was thrown.

Approach: The patch matches the the classname of the top testing-vars element against stack trace elements to find the test's file name and line number.

After applying the patch, the above example outputs error_report.clj:4:

ERROR in (foo) (error_reporting.clj:4)
Uncaught exception, not in assertion.
expected: nil
  actual: clojure.lang.ArityException: Wrong number of args (0) passed to: core/meta--4141
 at clojure.lang.AFn.throwArity (AFn.java:429)
    clojure.lang.AFn.invoke (AFn.java:28)
    user$fn__3.invokeStatic (error_reporting.clj:4)
    user/fn (error_reporting.clj:-1)
    clojure.test$test_var$fn__114.invoke (test.clj:705)
    clojure.test$test_var.invokeStatic (test.clj:705)
    clojure.test$test_var.invoke (test.clj:-1)
    user$eval6.invokeStatic (error_reporting.clj:6)
    user$eval6.invoke (error_reporting.clj:-1)
    clojure.lang.Compiler.eval (Compiler.java:6939)
    ...etc

Patch: LINE_REPORING_1.patch



 Comments   
Comment by Ryan Fowler [ 02/Sep/15 5:36 PM ]

example file from description

Comment by Alex Miller [ 04/Sep/15 10:04 AM ]

A quick search on Github shows many cases where people call into the (admittedly private) file-and-line function. These users would be broken by the patch. Perhaps it would be better to create a new function or a new arity rather than removing the existing arity.

Just eyeballing it, but I suspect you've introduced reflection in a couple places in the new code, particularly these might need another type hint:

1. (.getName (.getClass (:test (meta test-var))))
2. #(= (.getClassName %) test-var-class-name)

I need to look at more of the code to make a judgement on everything else. Seeing testing-vars in there means that this function is now dependent on external state, so need to think carefully to be sure that every calling context has that global state (or won't fail in bad ways if it doesn't). It would be helpful to see a discussion of your thinking about that in the "approach" section of the ticket description.





[CLJ-668] Improve slurp performance by using native Java StringWriter and jio/copy Created: 01/Nov/10  Updated: 03/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Enhancement Priority: Critical
Reporter: Jürgen Hötzel Assignee: Timothy Baldridge
Resolution: Unresolved Votes: 0
Labels: ft, io, performance

Attachments: File slurp-perf-patch.diff    
Patch: Code
Approval: Triaged

 Description   

Instead of copying each character from InputReader to StringBuffer.

Performance improvement:

Generate a 10meg file:
user> (spit "foo.txt" (apply str (repeat (* 1024 1024 10) "X")))

Test code:
user> (dotimes [x 100] (time (do (slurp "foo.txt") 0)))

From:
...
Elapsed time: 136.387 msecs"
"Elapsed time: 143.782 msecs"
"Elapsed time: 153.174 msecs"
"Elapsed time: 211.51 msecs"
"Elapsed time: 155.429 msecs"
"Elapsed time: 145.619 msecs"
"Elapsed time: 142.641 msecs"
...


To:
...
"Elapsed time: 23.408 msecs"
"Elapsed time: 25.876 msecs"
"Elapsed time: 41.449 msecs"
"Elapsed time: 28.292 msecs"
"Elapsed time: 25.765 msecs"
"Elapsed time: 24.339 msecs"
"Elapsed time: 32.047 msecs"
"Elapsed time: 23.372 msecs"
"Elapsed time: 24.365 msecs"
"Elapsed time: 26.265 msecs"
...

Approach: Use StringWriter and jio/copy vs character by character copy. Results from the current patch see a 4-5x perf boost after the jit warms up, with purely in-memory streams (ByteArrayInputStream over a 6MB string).

Patch: slurp-perf-patch.diff
Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 21/Apr/14 3:28 PM ]

This is double-better with the changes in Clojure 1.6 to improve jio/copy performance by using the NIO impl. Rough timing difference on a 25M file: old= 2316.021 msecs, new= 93.319 msecs.

Filer did not supply a patch and is not a contributor. If someone wants to make a patch (and better timing info demonstrating performance improvements), that would be great.

Comment by Timothy Baldridge [ 10/Sep/14 10:29 PM ]

Fixed the ticket formatting a bit, and added a patch I coded up tonight. Should be pretty close to the old patch, as we both use StringWriter, but I didn't really look at the old patch beyond noticing that it was using StringWriter.

Comment by Alex Miller [ 11/Sep/14 7:01 AM ]

Can you update the perf comparison on latest code and do both a small and big file?

Comment by Michael Blume [ 03/Sep/15 7:44 PM ]

Is this blocked on a new perf comparison? Should I post one?

Comment by Michael Blume [ 03/Sep/15 8:19 PM ]

https://github.com/MichaelBlume/slurp-test

Comment by Michael Blume [ 03/Sep/15 8:28 PM ]
$ lein bench 20000 /tmp/hello
writing test file
filename: /tmp/hello
file size: 20000 bytes
benching old slurp
WARNING: Final GC required 1.210381595203919 % of runtime
Evaluation count : 298920 in 60 samples of 4982 calls.
             Execution time mean : 202.329756 µs
    Execution time std-deviation : 6.314504 µs
   Execution time lower quantile : 191.929679 µs ( 2.5%)
   Execution time upper quantile : 218.094592 µs (97.5%)
                   Overhead used : 1.982912 ns

Found 4 outliers in 60 samples (6.6667 %)
	low-severe	 1 (1.6667 %)
	low-mild	 2 (3.3333 %)
	high-mild	 1 (1.6667 %)
 Variance from outliers : 17.4347 % Variance is moderately inflated by outliers
benching new slurp
Evaluation count : 906180 in 60 samples of 15103 calls.
             Execution time mean : 67.856649 µs
    Execution time std-deviation : 1.729056 µs
   Execution time lower quantile : 64.529704 µs ( 2.5%)
   Execution time upper quantile : 70.907626 µs (97.5%)
                   Overhead used : 1.982912 ns

Found 6 outliers in 60 samples (10.0000 %)
	low-severe	 4 (6.6667 %)
	low-mild	 1 (1.6667 %)
	high-mild	 1 (1.6667 %)
 Variance from outliers : 12.6203 % Variance is moderately inflated by outliers
$ lein bench 20000000 /tmp/hello
writing test file
filename: /tmp/hello
file size: 20000000 bytes
benching old slurp
WARNING: Final GC required 3.245241304626936 % of runtime
Evaluation count : 360 in 60 samples of 6 calls.
             Execution time mean : 191.645108 ms
    Execution time std-deviation : 7.985328 ms
   Execution time lower quantile : 177.640022 ms ( 2.5%)
   Execution time upper quantile : 199.548233 ms (97.5%)
                   Overhead used : 1.733362 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 1 (1.6667 %)
	low-mild	 1 (1.6667 %)
 Variance from outliers : 28.6500 % Variance is moderately inflated by outliers
benching new slurp
Evaluation count : 1380 in 60 samples of 23 calls.
             Execution time mean : 45.151513 ms
    Execution time std-deviation : 1.298921 ms
   Execution time lower quantile : 41.675855 ms ( 2.5%)
   Execution time upper quantile : 47.746682 ms (97.5%)
                   Overhead used : 1.733362 ns

Found 5 outliers in 60 samples (8.3333 %)
	low-severe	 3 (5.0000 %)
	low-mild	 2 (3.3333 %)
 Variance from outliers : 15.7926 % Variance is moderately inflated by outliers
$ lein bench 20 /tmp/hello
writing test file
filename: /tmp/hello
file size: 20 bytes
benching old slurp
WARNING: Final GC required 1.413345976323001 % of runtime
Evaluation count : 2707140 in 60 samples of 45119 calls.
             Execution time mean : 22.617091 µs
    Execution time std-deviation : 409.896754 ns
   Execution time lower quantile : 21.777914 µs ( 2.5%)
   Execution time upper quantile : 23.316498 µs (97.5%)
                   Overhead used : 1.917000 ns
benching new slurp
Evaluation count : 2430420 in 60 samples of 40507 calls.
             Execution time mean : 25.023136 µs
    Execution time std-deviation : 476.636242 ns
   Execution time lower quantile : 23.910619 µs ( 2.5%)
   Execution time upper quantile : 25.666224 µs (97.5%)
                   Overhead used : 1.917000 ns

Found 2 outliers in 60 samples (3.3333 %)
	low-severe	 1 (1.6667 %)
	low-mild	 1 (1.6667 %)
 Variance from outliers : 7.8349 % Variance is slightly inflated by outliers
Comment by Michael Blume [ 03/Sep/15 8:30 PM ]

This is with an SSD, no idea what the numbers would look like with a spinny disk.





[CLJ-1449] Add clojure.string functions for portability to ClojureScript Created: 19/Jun/14  Updated: 03/Sep/15  Resolved: 03/Sep/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.8

Type: Enhancement Priority: Major
Reporter: Bozhidar Batsov Assignee: Nola Stowe
Resolution: Completed Votes: 29
Labels: ft, string

Attachments: Text File add_functions_to_strings-2.patch     Text File add_functions_to_strings-3.patch     Text File add_functions_to_strings-4.patch     Text File add_functions_to_strings-5.patch     Text File add_functions_to_strings-6.patch     Text File add_functions_to_strings-7.patch     Text File add_functions_to_strings.patch     Text File clj-1449-more-v1.patch    
Patch: Code and Test
Approval: Ok

 Description   

It would be useful if a few common functions from Java's String were available as Clojure functions for the purposes of increasing portability to other Clojure platforms like ClojureScript.

The functions below also cover the vast majority of cases where Clojure users currently drop into Java interop for String calls - this tends to be an issue for discoverability and learning. While the goal of this ticket is increased portability, improving that is a nice secondary benefit.

Proposed clojure.string fn java.lang.String method
index-of indexOf
last-index-of lastIndexOf
starts-with? startsWith
ends-with? endsWith
includes? contains

Patch:

  • clj-1449-7.patch - uses nil to indicate not-found in index-of

Performance: Tested the following with criterium for execution mean time (all times in ns).

Java Clojure Java Clojure (-4 patch) Clojure (-6 patch)
(.indexOf "banana" "n") (clojure.string/index-of "banana" "n") 8.70 9.03 9.27
(.indexOf "banana" "n" 1) (clojure.string/index-of "banana" "n" 1) 7.29 7.61 7.66
(.indexOf "banana" (int \n)) (clojure.string/index-of "banana" \n) 5.34 6.20 6.20
(.indexOf "banana" (int \n) 1) (clojure.string/index-of "banana" \n 1) 5.38 6.19 6.24
(.startsWith "apple" "a") (clojure.string/starts-with? "apple" "a") 4.84 5.71 5.65
(.endsWith "apple" "e") (clojure.string/ends-with? "apple" "e") 4.82 5.68 5.70
(.contains "baked apple pie" "apple") (clojure.string/includes? "baked apple pie" "apple") 10.78 11.99 12.17

Screened by: Stu, who prefers nil over -1 - add_functions_to_strings-6.patch

Note: In both Java and JavaScript, indexOf will return -1 to indicate not-found, so this is (at least in these two cases) the status quo. The benefit of nil return in Clojure is that it can be used as a found/not-found condition. The benefit of a -1 return is that it matches Java/JavaScript and that the return type can be hinted as a primitive long, allowing you to feed it into some other arithmetic or string operation without boxing.



 Comments   
Comment by Alex Miller [ 19/Jun/14 12:53 PM ]

Re substring, there is a clojure.core/subs for this (predates the string ns I believe).

clojure.core/subs
([s start] [s start end])
Returns the substring of s beginning at start inclusive, and ending
at end (defaults to length of string), exclusive.

Comment by Jozef Wagner [ 20/Jun/14 3:21 AM ]

As strings are collection of characters, you can use Clojure's sequence facilities to achieve such functionality:

user=> (= (first "asdf") \a)
true
user=> (= (last "asdf") \a)
false
Comment by Alex Miller [ 20/Jun/14 8:33 AM ]

Jozef, String.startsWith() checks for a prefix string, not just a prefix char.

Comment by Bozhidar Batsov [ 20/Jun/14 9:42 AM ]

Re substring, I know about subs, but it seems very odd that it's not in the string ns. After all most people will likely look for string-related functionality in clojure.string. I think it'd be best if `subs` was added to clojure.string and clojure.core/subs was deprecated.

Comment by Pierre Masci [ 01/Aug/14 5:27 AM ]

Hi, I was thinking the same about starts-with and .ends-with, as well as (.indexOf s "c") and (.lastIndexOf "c").

I read the whole Java String API recently, and these 4 functions seem to be the only ones that don't have an equivalent in Clojure.
It would be nice to have them.

Andy Fingerhut who maintains the Clojure Cheatsheet told me: "I maintain the cheatsheet, and I put .indexOf and .lastIndexOf on there since they are probably the most common thing I saw asked about that is in the Java API but not the Clojure API, for strings."
Which shows that there is a demand.

Because Clojure is being hosted on several platforms, and might be hosted on more in the future, I think these functions should be part of the de-facto ecosystem.

Comment by Andy Fingerhut [ 30/Aug/14 3:39 PM ]

Updating summary line and description to add contains? as well. I can back this off if it changes your mind about triaging it, Alex.

Comment by Andy Fingerhut [ 30/Aug/14 3:40 PM ]

Patch clj-1449-basic-v1.patch dated Aug 30 2014 adds starts-with? ends-with? contains? functions to clojure.string.

Patch clj-1449-more-v1.patch is the same, except it also replaces several Java method calls with calls to these Clojure functions.

Comment by Andy Fingerhut [ 05/Sep/14 1:02 PM ]

Patch clj-1449-basic-v1.patch dated Sep 5 2014 is identical to the patch I added recently called clj-1149-basic-v1.patch. It is simply renamed without the typo'd ticket number in the file name.

Comment by Yehonathan Sharvit [ 02/Dec/14 3:09 PM ]

What about an implementation that works also in cljs?

Comment by Bozhidar Batsov [ 02/Dec/14 3:11 PM ]

Once this is added to Clojure it will be implemented in ClojureScript as well.

Comment by Yehonathan Sharvit [ 02/Dec/14 3:22 PM ]

Great! Any idea when it will be added to Clojure?
Also, will it be automatically added to Clojurescript or someone will have to write a particular code for it.
The suggested patch relies on Java so I am curious to understand who is going to port the patch to cljs.

Comment by Bozhidar Batsov [ 02/Dec/14 3:27 PM ]

No idea when/if this will get merged. Upvote the ticket to improve the odds of this happening sooner.
Someone on the ClojureScript team will have to implement this in terms of JavaScript.

Comment by Alex Miller [ 02/Dec/14 4:01 PM ]

Some things that would be helpful:

1) It would be better to combine the two patches into a single patch - I think changing current uses into new users is a good thing to include. Also, please keep track of the "current" patch in the description.
2) Patch needs tests.
3) Per the instructions at the top of the clojure.string ns (and the rest of the functions), the majority of these functions are implemented to take the broader CharSequence interface. Similar to those implementations, you will need to provide a CharSequence implementation while also calling into the String functions when you actually have a String.
4) Consider return type hints - I'm not sure they're necessary here, but I would examine bytecode for typical calling situations to see if it would be helpful.
5) Check performance implications of the new versions vs the old with a good tool (like criterium). You've put an additional var invocation and (soon) type check in the calling path for these functions. I think providing a portable target is worth a small cost, but it would be good to know what the cost actually is.

I don't expect we will look at this until after 1.7 is released.

Comment by Andy Fingerhut [ 02/Dec/14 8:05 PM ]

Alex, all your comments make sense.

If you think a ready-and-waiting patch that does those things would improve the odds of the ticket being vetted by Rich, please let us know.

My guess is that his decision will be based upon the description, not any proposed patches. If that is your belief also, I'll wait until he makes that decision before working on a patch. Of course, any other contributor is welcome to work on one if they like.

Comment by Alex Miller [ 02/Dec/14 8:40 PM ]

Well nothing is certain of course, but I keep a special report of things I've "screened" prior to vetting that makes possible moving something straight from Triaged all the way through into Screened/Ok when Rich is able to look at them. This is a good candidate if things were in pristine condition.

That said, I don't know whether Rich will approve it or not, so it's up to you. I think the argument for portability is a strong one and complements the feature expression.

Comment by Alex Miller [ 14/May/15 8:55 AM ]

I'd love to have a really high-quality patch on this ticket to consider for 1.8 that took into account my comments above.

Also, it occurred to me that I don't think this should be called "contains?", overlapping the core function with a different meaning (contains value vs contains key). Maybe "includes?"?

Comment by Andy Fingerhut [ 14/May/15 9:14 AM ]

clojure.string already has 2 name conflicts with clojure.core, for which most people probably do something like (require '[clojure.string :as str]) to avoid that:

user=> (use 'clojure.string)
WARNING: reverse already refers to: #'clojure.core/reverse in namespace: user, being replaced by: #'clojure.string/reverse
WARNING: replace already refers to: #'clojure.core/replace in namespace: user, being replaced by: #'clojure.string/replace
nil
Comment by Alex Miller [ 14/May/15 10:05 AM ]

I'm not concerned about overlapping the name. I'm concerned about overlapping it and meaning something different, particularly vs one of the most confusing functions in core already. clojure.core/contains? is better than linear time key search. clojure.string/whatever will be a linear time subsequence match.

Comment by Alex Miller [ 14/May/15 10:18 AM ]

Ruby uses "include?" for this.

Comment by Devin Walters [ 19/May/15 4:56 PM ]

I agree with Alex's comment about the overlap. Personally, I prefer the way "includes?" reads over "include?", but IMO either one is better than adding to the "contains?" confusion.

Comment by Bozhidar Batsov [ 20/May/15 12:10 AM ]

I'm fine with `includes?`. Ruby is famous for the bad English used in its core library.

Comment by Andrew Rosa [ 17/Jun/15 12:29 PM ]

Andy Fingerhut, since you are the author of the original patch do you desire to continue the work on it? If you prefer (or even need) to focus on different stuff, I would like to tackle this issue.

Comment by Andy Fingerhut [ 17/Jun/15 1:08 PM ]

Andrew Rosa, I am perfectly happy if someone else wants to work on a revised patch for this. If you are interested, go for it! And from Alex's comments, I believe my initial patch covers such a small fraction of what he wants, do not worry about keeping my name associated with any patch you submit – if yours is accepted, my patch will not have helped you in getting there.

Comment by Nola Stowe [ 03/Jul/15 9:01 AM ]

Andrew Rosa, I am interested on working on this. Are you currently working it?

Comment by Andrew Rosa [ 03/Jul/15 12:08 PM ]

Thanks Andy Fingerhut, didn't saw the answer here. I'm happy that Nola Stowe could take this one.

Comment by Nola Stowe [ 08/Jul/15 12:46 PM ]

Updated patch to rename methods as specified and add tests.

Comment by Bozhidar Batsov [ 08/Jul/15 3:26 PM ]

A few remarks:

1. added should say 1.8, not 1.9.
2. The docstrings could be improved a bit - normally they should end with a . and refer to all the params of the function in question.

Comment by Nola Stowe [ 08/Jul/15 3:29 PM ]

Thanks Bozhidar Batsov .. I will, and I also got feedback from the slack clojure-dev group (Alex, Ghadi, AndrewHr) so I will be submitting an improved patch soon

Comment by Andy Fingerhut [ 09/Jul/15 11:57 AM ]

Also, I am not sure, but Alex Miller made this comment earlier: "Per the instructions at the top of the clojure.string ns, functions should take the broader CharSequence interface instead of String. Similar to existing clojure.string functions, you will need to provide a CharSequence implementation while also calling into the String functions when you actually have a String."

Looking at the Java classes and interfaces, one of the classes that implements CharSequence is CharBuffer, and it has no indexOf or lastIndexOf methods. If one were to call the proposed index-of or last-index-of with a CharBuffer, I believe you would get an exception for an unknown method.

I don't know if Alex's sentence is essential to any patch to be considered, but wanted to point out that it seems like it effectively means writing your own implementation of each of these methods for some of the classes implementing the CharSequence interface.

Comment by Nola Stowe [ 09/Jul/15 4:15 PM ]

Andy, yes you are right. Alex helped me with my current changes (not posted yet) and where needed I cast the value to a string so that it would have the indexOf methods.

Comment by Nola Stowe [ 10/Jul/15 7:43 AM ]

Added type hints, using StringBuffer instead of string in tests.

Comment by Nola Stowe [ 10/Jul/15 11:53 PM ]

Use critium and did some bench marking, I am not familiar enough to say "hey its good or no, too slow". I tried an optimization but it didn't make much difference.

notes here:

https://www.evernote.com/l/ABLYZbDr8ElIs6fkODAd3poBfIm_Bfr5kLo

Appreciate any feedback

Comment by Alex Miller [ 13/Jul/15 10:37 AM ]

Nola, some comments:

1) In index-of and last-index-of, I think we should use (instance? Character value) instead of char? - this will avoid a function invocation.

2) In index-of and last-index-of, in the branches where you know you have a Character, we should just invoke the proper function on the character instance directly rather than calling int. In this case that is (.charValue ^Character value). However, this function returns char so you still want the ^int hint.

3) In index-of and last-index-of, we should do a couple of things to maximally utilize primitive support for from-index. We should type-hint the incoming value as a ^long (only long and double are specialized cases in Clojure, not int). We should then use the function unchecked-int to optimally convert the long primitive to an int primitive.

4) We should type-hint the return value to a primitive long to avoid boxing the return.

Putting 1+2+3+4 together:

(defn index-of
  "Return index of value (string or char) in s, optionally searching forward from from-index."
  {:added "1.8"}
  (^long [^CharSequence s value]
   (if (instance? Character value)
     (.indexOf (.toString s) ^int (.charValue ^Character value))
     (.indexOf (.toString s) ^String value)))
  (^long [^CharSequence s value ^long from-index]
  (if (instance? Character value)
    (.indexOf (.toString s) ^int (.charValue ^Character value) (unchecked-int from-index))
    (.indexOf (.toString s) ^String value (unchecked-int from-index)))))

Similar changes in last-index-of.

5) In starts-with? and ends-with?, I would move the type hint for substr to the parameter declaration. This is more of a style preference on my part, no functional difference I believe.

6) On includes? I would do the same as #5, except .contains actually takes a CharSequence for the substr, so I would change the type hint from ^String to the broader ^CharSequence.

7) Can you edit the description and add a table with the "execution time mean" number for direct Java vs new Clojure fn for the patch after these changes? Please add tests for the from-index variants of index-of and last-index-of as well.

Comment by Nola Stowe [ 14/Jul/15 5:46 PM ]

optimizations as suggested by Alex Miller

Comment by Andy Fingerhut [ 14/Jul/15 5:50 PM ]

Nola, not a big deal, but Rich has requested that patch file names end with ".patch" or ".diff", I think because whatever he uses to view them recognizes that suffix and puts it in a different viewing/editing mode.

Comment by Nola Stowe [ 14/Jul/15 6:03 PM ]

Andy, so no need to keep around patches when used in comparisons?

Comment by Andy Fingerhut [ 14/Jul/15 8:31 PM ]

It is ok to have multiple versions of patches attached, but names like add_functions_to_strings-2.patch etc. would be preferred over names like add_functions_to_strings.patch-2, so the suffix is ".patch".

Comment by Nola Stowe [ 14/Jul/15 8:55 PM ]

original patch submitted by Nola

Comment by Nola Stowe [ 14/Jul/15 8:56 PM ]

Optimizations suggested by Alex

Comment by Alex Miller [ 14/Jul/15 9:44 PM ]

Nola, I re-did the timings on my machine - these were done with Java 1.8 and no Leiningen involved.

Comment by Alex Miller [ 14/Jul/15 9:57 PM ]

-4 patch is identical to -3 but removes one errant ^long type hint

Otherwise, looks good to me!

Comment by Alex Miller [ 17/Jul/15 10:24 PM ]

Rich, since you didn't put any comments on here, I'm not sure if there's anything else you wanted, so marking screened.

Comment by Stuart Halloway [ 19/Jul/15 7:36 AM ]

I dislike the approach here for several reasons:

  • floats Java-isms (-1 for missing instead of nil?) up into the Clojure API
  • makes narrow string fns out of things that could/should be seq fns

Stepping back, this is all library work. Why should it be in core? If this started in a contrib, it could evolve much more freely.

Comment by Alex Miller [ 19/Jul/15 8:49 AM ]

re Java-isms: totally fair comment worth changing

re narrow string fns: if these were seq fns, they would presumably take seqables of chars and return seqables of chars. When I have strings, I want to work with strings (as with clojure.string/join and clojure.string/reverse vs their seq equivalents).

re library: these particular functions are the most commonly used string functions where current users drop to Java interop. Note that all 5 of these functions have over time been added to the cheatsheet (http://clojure.org/cheatsheet) because they are commonly used. I believe there is significant value in including them in core and standardizing their name and behavior across to CLJS.

Comment by Bozhidar Batsov [ 19/Jul/15 10:29 AM ]

I'm obviously biased, but I totally agree with everything Alex said. Having one core string library and a separate contrib string library is just impractical. With clojure.string already part of the language it's much more sensible to extend it where/when needed instead of encouraging the proliferation of tons of alternative libraries (there are already a couple of those out there, mostly because essential functions are missing). Initial designs are rarely perfect, there's nothing wrong in revisiting and improving them.

I believe this is one of the top voted tickets, which clearly indicates that the Clojure community is quite supportive of this additions.

Comment by Michael Blume [ 19/Jul/15 12:26 PM ]

None of these functions return Strings so they could very easily be seq functions which use the String methods as a fast path when called with Strings, the trouble is then they wouldn't necessarily belong in clojure.string.

Comment by Bozhidar Batsov [ 19/Jul/15 1:49 PM ]

Such an argument can be made for existing functions like clojure.core/subs and clojure.string/reverse - we could have just went with generic functions that operate on seqs and converted the results to strings, but we didn't. It might be just me, but I'm really thinking this is being overanalysed. Sometimes it's really preferable to deal with some things as strings (not to mention more efficient).

Comment by Rich Hickey [ 20/Jul/15 9:39 AM ]

I completely agree with Alex and Bozhidar re: strings vs seqs. Other langs that have only lists and use them for strings are really bad at strings. So - string-specific fns are ok. What about the Java-isms?

Comment by Bozhidar Batsov [ 20/Jul/15 10:08 AM ]

We should address them for sure. Let's wait for a couple of days for Nola to update the patch. If she's too busy I'll do the update myself.

Comment by Nola Stowe [ 20/Jul/15 10:50 AM ]

I will be happy to work on it Saturday. Is nil an acceptable response when index-of "apple" "p" is not found? The other functions return booleans and I think those are ok.

Comment by Bozhidar Batsov [ 20/Jul/15 10:55 AM ]

Yep, we should go with nil.

Comment by Nola Stowe [ 25/Jul/15 4:33 PM ]

Changed index-of and last-index-of to return nil instead of -1 (java-ism), thus had to remove the type hint for the return value.

Seems to have slowed down the function compared to having the function return -1. My benchmarks: https://www.evernote.com/l/ABL2wead73BJZYAkpH5yxsfX9JM8cpUiNhw

Comment by Nola Stowe [ 25/Jul/15 9:05 PM ]

Updated patch to include return value in doc string for index-of last-index-of, otherwise same as last update.

Comment by Alex Miller [ 26/Jul/15 8:40 AM ]

Removing the hint causes the return type to be Object, which forces boxing in the "found" cases.

Comment by Nola Stowe [ 26/Jul/15 8:56 AM ]

Is it worth it to remove the java-ism of returning -1 when not found? Personally, I don't really like a function returning one of two things... found index or nil ... I prefer it being an int at all times.

Comment by Alex Miller [ 26/Jul/15 9:00 AM ]

There is one whitespace issue in the -5 patch:

[~/code/clojure (master)]$ git apply ~/Downloads/add_functions_to_strings-5.patch
/Users/alex/Downloads/add_functions_to_strings-5.patch:34: trailing whitespace.
  (let [result
warning: 1 line adds whitespace errors.

I updated the timings in the description to include both -4 and -5 - the boxing there does have a significant impact.

Comment by Nola Stowe [ 26/Jul/15 9:28 AM ]

Fixed whitespace issue.

Comment by Alex Miller [ 26/Jul/15 9:32 AM ]

Added -6 patch and timings that address most of the performance impact.

Comment by Stuart Halloway [ 27/Jul/15 9:35 AM ]

Alex: Note that all of these functions return numbers and booleans, not strings nor seqs of anything. This is partially why they were left out of the original string implementation.

It isn't clear to me why index-of should be purely a string fn, I regularly need index-of with other things, and both my book and JoC use index-of as an example of a nice-to-have.

Comment by Nicola Mometto [ 27/Jul/15 9:44 AM ]

Stuart, I don't think the fact that those functions don't return string is really relevant, the important fact is that they operate on strings (just like c.set/superset? or subset? don't return sets) and are frequently used on them.

Comment by Bozhidar Batsov [ 27/Jul/15 9:51 AM ]

I'm with Nicola on this one - we have to be practical. There's a reason why this is the top-voted JIRA ticket - Clojure programmers want those functions. That's says plenty.

Anyways, having a generic `index-of` is definitely not a bad idea.

Comment by Stuart Halloway [ 31/Jul/15 11:27 AM ]

How does this compare to making them all seq fns and adding them to clojure.core?

Comment by Andy Fingerhut [ 31/Jul/15 11:39 AM ]

Given Rich's comment earlier "I completely agree with Alex and Bozhidar re: strings vs seqs. Other langs that have only lists and use them for strings are really bad at strings. So - string-specific fns are ok.", do we need to write a separate patch that makes these all into seq fns and do performance comparisons between those vs. the string-specific ones?

Comment by Stuart Halloway [ 31/Jul/15 1:26 PM ]

Andy,

Definitely not! I want to have a design conversation.

We have protocols, and could use them to provide core fns that are specialized to strings, so it isn't clear to me why we would have to be really bad at it.

Comment by Stuart Halloway [ 07/Aug/15 2:19 PM ]

Discussed with Rich. He does not want sequence fns with these performance characteristics, so string-specific fns it shall be.

Comment by Rich Hickey [ 08/Aug/15 10:11 AM ]

So which patch is screened?

Comment by Alex Miller [ 08/Aug/15 11:40 AM ]

Stu screened add_functions_to_strings-6.patch where index-of returns nil on not-found. The add_functions_to_strings-4.patch is almost identical but returns -1 from index-of (which matches Java and JavaScript index-of behavior and may be slightly faster for callers that can be monomorphic on primitive longs?).

Comment by Rich Hickey [ 24/Aug/15 11:27 AM ]

nil-returning index fns should document that return when not found

Comment by Nola Stowe [ 24/Aug/15 11:56 AM ]

adds better doc string to indicated nil is return when not found for index-of and last-index-of.

Comment by Alex Miller [ 24/Aug/15 11:59 AM ]

Added clj-1449-7.patch that is identical to -6 but adds statement of return expectation on index-of and last-index-of.

Comment by Alex Miller [ 24/Aug/15 12:19 PM ]

jinx - same patch! Replacing with Nola's.

Comment by Colin Taylor [ 30/Aug/15 6:48 PM ]

Bit late on this - but .equalsIgnoreCase (=ic? equals-ic ? ) is currently a visible wart for our portability.

Comment by Alex Miller [ 30/Aug/15 10:03 PM ]

Hi Colin, that's a good idea but you have missed the window on this ticket - feel free to file a different one.

Comment by Tom Marble [ 03/Sep/15 3:46 PM ]

I have just proposed an analogous patch to ClojureScript under
http://dev.clojure.org/jira/browse/CLJS-1441





[CLJ-1322] doseq with several bindings causes "ClassFormatError: Invalid Method Code length" Created: 10/Jan/14  Updated: 03/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Miikka Koskinen Assignee: Unassigned
Resolution: Unresolved Votes: 6
Labels: None
Environment:

Clojure 1.5.1, java 1.7.0_25, OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1ubuntu0.12.04.2)


Attachments: Text File doseq-bench.txt     Text File doseq.patch     File script.clj    
Patch: Code
Approval: Incomplete

 Description   

Important Perf Note the new impl is faster for collections that are custom-reducible but not chunked, and is also faster for large numbers of bindings. The original implementation is hand tuned for chunked collections, and wins for larger chunked coll/smaller binding count scenarios, presumably due to the fn call/return tracking overhead of reduce. Details are in the comments.
Screened By
Patch doseq.patch

user=> (def a1 (range 10))
#'user/a1
user=> (doseq [x1 a1 x2 a1 x3 a1 x4 a1 x5 a1 x6 a1 x7 a1 x8 a1] (do))
CompilerException java.lang.ClassFormatError: Invalid method Code length 69883 in class file user$eval1032, compiling:(NO_SOURCE_PATH:2:1)

While this example is silly, it's a problem we've hit a couple of times. It's pretty surprising when you have just a couple of lines of code and suddenly you get the code length error.



 Comments   
Comment by Kevin Downey [ 18/Apr/14 12:20 AM ]

reproduces with jdk 1.8.0 and clojure 1.6

Comment by Nicola Mometto [ 22/Apr/14 5:35 PM ]

A potential fix for this is to make doseq generate intermediate fns like `for` does instead of expanding all the code directly.

Comment by Ghadi Shayban [ 25/Jun/14 8:39 PM ]

Existing doseq handles chunked-traversal internally, deciding the
mechanics of traversal for a seq. In addition to possibly conflating
concerns, this is causing a code explosion blowup when more bindings are
added, approx 240 bytes of bytecode per binding (without modifiers).

This approach redefs doseq later in core.clj, after protocol-based
reduce (and other modern conveniences like destructuring.)

It supports the existing :let, :while, and :when modifiers.

New is a stronger assertion that modifiers cannot come before binding
expressions. (Same semantics as let, i.e. left to right)

valid: [x coll :when (foo x)]
invalid: [:when (foo x) x coll]

This implementation does not suffer from the code explosion problem.
About 25 bytes of bytecode + 1 fn per binding.

Implementing this without destructuring was not a party, luckily reduce
is defined later in core.

Comment by Andy Fingerhut [ 26/Jun/14 12:25 AM ]

For anyone reviewing this patch, note that there are already many tests for correct functionality of doseq in file test/clojure/test_clojure/for.clj. It may not be immediately obvious, but every test for 'for' defined with deftest-both is a test for 'for' and also for 'doseq'.

Regarding the current implementation of doseq: it in't simply that it is too many bytes per binding, it is that the code size doubles with each additional binding. See these results, which measures size of the macroexpanded form rather than byte code size, but those two things should be fairly linearly related to each other here:

(defn formsize [form]
  (count (with-out-str (print (macroexpand form)))))

user=> (formsize '(doseq [x (range 10)] (print x)))
652
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
1960
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
4584
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
9947
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))
20997

Here are results for the same expressions after Ghadi's patch doseq.patch dated June 25 2014:

user=> (formsize '(doseq [x (range 10)] (print x)))
93
user=> (formsize '(doseq [x (range 10) y (range 10)] (print x y)))
170
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10)] (print x y z)))
247
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10)] (print x y z w)))
324
user=> (formsize '(doseq [x (range 10) y (range 10) z (range 10) w (range 10) p (range 10)] (print x y z w p)))
401

It would be good to see some performance results with and without this patch, too.

Comment by Stuart Halloway [ 28/Jun/14 2:21 PM ]

In the tests below, the new impl is called "doseq2", vs. the original impl "doseq"

(def hund (into [] (range 100)))
(def ten (into [] (range 10)))
(def arr (int-array 100))
(def s "superduper")

;; big seq, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a (range 100000000)])))
;; 1.2 sec

(dotimes [_ 5]
  (time (doseq2 [a (range 100000000)])))
;; 1.8 sec

;; small unchunked reducible, few bindings: doseq2 wins
(dotimes [_ 5]
  (time (doseq [a s b s c s])))
;; 0.5 sec

(dotimes [_ 5]
  (time (doseq2 [a s b s c s])))
;; 0.2 sec

(dotimes [_ 5]
  (time (doseq [a arr b arr c arr])))
;; 40 msec

(dotimes [_ 5]
  (time (doseq2 [a arr b arr c arr])))
;; 8 msec

;; small chunked reducible, few bindings: doseq2 LOSES
(dotimes [_ 5]
  (time (doseq [a hund b hund c hund])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a hund b hund c hund])))
;; 8 msec

;; more bindings: doseq2 wins bigger and bigger
(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten ])))
;; 2 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten ])))
;; 0.4 msec

(dotimes [_ 5]
  (time (doseq [a ten b ten c ten d ten e ten])))
;; 18 msec

(dotimes [_ 5]
  (time (doseq2 [a ten b ten c ten d ten e ten])))
;; 1 msec
Comment by Ghadi Shayban [ 28/Jun/14 6:23 PM ]

Hmm, I cannot reproduce your results.

I'm not sure whether you are testing with lein, on what platform, what jvm opts.

Can we test using this little harness instead directly against clojure.jar? I've attached a the harness and two runs of results (one w/ default heap, the other 3GB w/ G1GC)

I added a medium and small (range) too.

Anecdotally, I see doseq2 outperform in all cases except the small range. Using criterium shows a wider performance gap favoring doseq2.

I pasted the results side by side for easier viewing.

core/doseq                          doseq2
"Elapsed time: 1610.865146 msecs"   "Elapsed time: 2315.427573 msecs"
"Elapsed time: 2561.079069 msecs"   "Elapsed time: 2232.479584 msecs"
"Elapsed time: 2446.674237 msecs"   "Elapsed time: 2234.556301 msecs"
"Elapsed time: 2443.129809 msecs"   "Elapsed time: 2224.302855 msecs"
"Elapsed time: 2456.406103 msecs"   "Elapsed time: 2210.383112 msecs"

;; med range, few bindings:
core/doseq                          doseq2
"Elapsed time: 28.383197 msecs"     "Elapsed time: 31.676448 msecs"
"Elapsed time: 13.908323 msecs"     "Elapsed time: 11.136818 msecs"
"Elapsed time: 18.956345 msecs"     "Elapsed time: 11.137122 msecs"
"Elapsed time: 12.367901 msecs"     "Elapsed time: 11.049121 msecs"
"Elapsed time: 13.449006 msecs"     "Elapsed time: 11.141385 msecs"

;; small range, few bindings:
core/doseq                          doseq2
"Elapsed time: 0.386334 msecs"      "Elapsed time: 0.372388 msecs"
"Elapsed time: 0.10521 msecs"       "Elapsed time: 0.203328 msecs"
"Elapsed time: 0.083378 msecs"      "Elapsed time: 0.179116 msecs"
"Elapsed time: 0.097281 msecs"      "Elapsed time: 0.150563 msecs"
"Elapsed time: 0.095649 msecs"      "Elapsed time: 0.167609 msecs"

;; small unchunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 2.351466 msecs"      "Elapsed time: 2.749858 msecs"
"Elapsed time: 0.755616 msecs"      "Elapsed time: 0.80578 msecs"
"Elapsed time: 0.664072 msecs"      "Elapsed time: 0.661074 msecs"
"Elapsed time: 0.549186 msecs"      "Elapsed time: 0.712239 msecs"
"Elapsed time: 0.551442 msecs"      "Elapsed time: 0.518207 msecs"

core/doseq                          doseq2
"Elapsed time: 95.237101 msecs"     "Elapsed time: 55.3067 msecs"
"Elapsed time: 41.030972 msecs"     "Elapsed time: 30.817747 msecs"
"Elapsed time: 42.107288 msecs"     "Elapsed time: 19.535747 msecs"
"Elapsed time: 41.088291 msecs"     "Elapsed time: 4.099174 msecs"
"Elapsed time: 41.03616 msecs"      "Elapsed time: 4.084832 msecs"

;; small chunked reducible, few bindings:
core/doseq                          doseq2
"Elapsed time: 31.793603 msecs"     "Elapsed time: 40.082492 msecs"
"Elapsed time: 17.302798 msecs"     "Elapsed time: 28.286991 msecs"
"Elapsed time: 17.212189 msecs"     "Elapsed time: 14.897374 msecs"
"Elapsed time: 17.266534 msecs"     "Elapsed time: 10.248547 msecs"
"Elapsed time: 17.227381 msecs"     "Elapsed time: 10.022326 msecs"

;; more bindings:
core/doseq                          doseq2
"Elapsed time: 4.418727 msecs"      "Elapsed time: 2.685198 msecs"
"Elapsed time: 2.421063 msecs"      "Elapsed time: 2.384134 msecs"
"Elapsed time: 2.210393 msecs"      "Elapsed time: 2.341696 msecs"
"Elapsed time: 2.450744 msecs"      "Elapsed time: 2.339638 msecs"
"Elapsed time: 2.223919 msecs"      "Elapsed time: 2.372942 msecs"

core/doseq                          doseq2
"Elapsed time: 28.869393 msecs"     "Elapsed time: 2.997713 msecs"
"Elapsed time: 22.414038 msecs"     "Elapsed time: 1.807955 msecs"
"Elapsed time: 21.913959 msecs"     "Elapsed time: 1.870567 msecs"
"Elapsed time: 22.357315 msecs"     "Elapsed time: 1.904163 msecs"
"Elapsed time: 21.138915 msecs"     "Elapsed time: 1.694175 msecs"
Comment by Ghadi Shayban [ 28/Jun/14 6:47 PM ]

It's good that the benchmarks contain empty doseq bodies in order to isolate the overhead of traversal. However, that represents 0% of actual real-world code.

At least for the first benchmark (large chunked seq), adding in some tiny amount of work did not change results signifantly. Neither for (map str [a])

(range 10000000) =>  (map str [a])
core/doseq
"Elapsed time: 586.822389 msecs"
"Elapsed time: 563.640203 msecs"
"Elapsed time: 369.922975 msecs"
"Elapsed time: 366.164601 msecs"
"Elapsed time: 373.27327 msecs"
doseq2
"Elapsed time: 419.704021 msecs"
"Elapsed time: 371.065783 msecs"
"Elapsed time: 358.779231 msecs"
"Elapsed time: 363.874448 msecs"
"Elapsed time: 368.059586 msecs"

nor for intrisics like (inc a)

(range 10000000)
core/doseq
"Elapsed time: 317.091849 msecs"
"Elapsed time: 272.360988 msecs"
"Elapsed time: 215.501737 msecs"
"Elapsed time: 206.639181 msecs"
"Elapsed time: 206.883343 msecs"
doseq2
"Elapsed time: 241.475974 msecs"
"Elapsed time: 193.154832 msecs"
"Elapsed time: 198.757873 msecs"
"Elapsed time: 197.803042 msecs"
"Elapsed time: 200.603786 msecs"

I still see reduce-based doseq ahead of the original, except for small seqs

Comment by Ghadi Shayban [ 04/Aug/14 2:55 PM ]

A form like the following will not work with this patch:

(go (doseq [c chs] (>! c :foo)))

as the go macro doesn't traverse fn boundaries. The only such code I know is core.async/mapcat*, a private fn supporting a fn that is marked deprecated.

Comment by Ghadi Shayban [ 07/Aug/14 2:09 PM ]

I see #'clojure.core/run! was just added, which has a similar limitation

Comment by Rich Hickey [ 29/Aug/14 8:19 AM ]

Please consider Ghadi's feedback, esp re: closures.

Comment by Ghadi Shayban [ 22/Sep/14 4:36 PM ]

The current expansion of a doseq [1] under a go form is less than ideal due to the amount of control flow. 14 states in the state machine vs. 7 with loop/recur

[1] Comparison of macroexpansion of (go ... doseq) vs (go ... loop/recur)
https://gist.github.com/ghadishayban/639009900ce1933256a1

Comment by Nicola Mometto [ 25/Mar/15 6:07 PM ]

Related: CLJ-77

Comment by Nicola Mometto [ 03/Sep/15 12:59 PM ]

The general solution to this would be to automatically split methods when too large, using something like https://bitbucket.org/sperber/asm-method-size





[CLJ-1609] Fix an edge case in the Reflector's search for a public method declaration Created: 05/Dec/14  Updated: 03/Sep/15  Resolved: 03/Sep/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Jeremy Heiler Assignee: Unassigned
Resolution: Completed Votes: 1
Labels: interop

Attachments: Text File clj-1609-2.patch     Text File clj-1609-3.patch     Text File clj-1609-test.patch     Text File reflector_method_bug.patch    
Patch: Code and Test
Approval: Ok

 Description   
// all: package demo;

public interface IBar {
  void stuff();
}

class Bar {
  public void stuff() { System.out.println("stuff"); }
}

class SubBar extends Bar implements IBar {
  // implements IBar, via non-public superclass Bar
}

public class Factory {
  public static IBar get() { return new SubBar(); }
}

Can't invoke from Clojure:

(import '[demo Factory IBar])
(def inst (Factory/get))  ;; instance of SubBar, inferred as IBar
(.stuff inst)
;; IllegalArgumentException Can't call public method of non-public class: public void demo.Bar.stuff()  clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:88)

Cause: The Reflector is not taking into account that a non-public class can implement an interface, and have a non-public parent class contain an implementation of a method on that interface.

Approach: The solution I took is to pass in the target object's class instead of the declaring class of the method object.

Patch: clj-1609-3.patch + clj-1609-test.patch

Screened by: Fogus



 Comments   
Comment by Alex Miller [ 05/Dec/14 8:48 AM ]

Probably a dupe of CLJ-259. I'll probably dupe that one to this though.

Comment by Jeremy Heiler [ 05/Dec/14 1:33 PM ]

Thanks. Sorry for not finding that myself. CLJ-259 refers to CLJ-126, which I think is covered with this patch.

Comment by Jeremy Heiler [ 06/Dec/14 12:37 PM ]

I was looking into whether or not the target could be null, and it can be when invoking a static method. However, I don't think that code path would make it to the target.getClass() because non-public static methods aren't returned by getMethods().

Comment by Stuart Halloway [ 31/Jul/15 9:24 AM ]

Jeremy,

The compile-tests phase of the build provides a point for compiling Java classes that that tests can then consume. Does that provide the hook you would need to add a test?

Comment by Alex Miller [ 07/Aug/15 9:10 AM ]

afaict, this ticket is incomplete pending a test from Stu's last comment so moving there

Comment by Alex Miller [ 18/Aug/15 4:58 PM ]

clj-1609-2.patch is identical to previous patch, just adds jira id to beginning of commit message

clj-1609-test.patch adds a reflection test that fails without the patch and passes with it.

Comment by Alex Miller [ 18/Aug/15 4:58 PM ]

should be ready to screen now

Comment by Fogus [ 21/Aug/15 12:53 PM ]

The patch and test are straight-forward and clean.

Comment by Rich Hickey [ 24/Aug/15 9:54 AM ]

Can I get some more context in this diff please?

Comment by Alex Miller [ 24/Aug/15 12:11 PM ]

Added clj-1609-3.patch which is identical to prior but has 15 lines of context instead.





[CLJ-1586] Compiler doesn't preserve metadata for LazySeq literals Created: 12/Nov/14  Updated: 03/Sep/15  Resolved: 03/Sep/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: compiler, metadata, typehints

Attachments: Text File 0001-CLJ-1586-Compiler-doesn-t-preserve-metadata-for-lazy.patch     Text File 0001-Compiler-doesn-t-preserve-metadata-for-lazyseq-liter.patch     Text File clj-1586-2.patch    
Patch: Code and Test
Approval: Ok

 Description   

The analyzer in Compiler.java forces evaluation of lazyseq literals, but loses the compile time original metadata of that form, meaning that a type hint will be lost.

Example demonstrating this issue:

user=> (set! *warn-on-reflection* true)
true
user=> (list '.substring (with-meta (concat '(identity) '("foo")) {:tag 'String}) 0)
(.substring (identity "foo") 0)
user=> (eval (list '.substring (with-meta (concat '(identity) '("foo")) {:tag 'String}) 0))
Reflection warning, NO_SOURCE_PATH:6:1 - call to method substring on java.lang.Object can't be resolved (no such method).
"foo"

Forcing the concat call to an ASeq rather than a LazySeq fixes this issue:

user=> (eval (list '.hashCode (with-meta (seq (concat '(identity) '("foo"))) {:tag 'String})))
101574

This ticket blocks http://dev.clojure.org/jira/browse/CLJ-1444 since clojure.core/sequence might return a lazyseq.

This bug affected both tools.analyzer and tools.reader and forced me to commit a fix in tools.reader to work around this issue, see: http://dev.clojure.org/jira/browse/TANAL-99

The proposed patch trivially preserves the form metadata after realizing the lazyseq

Approach: Keep a copy of the original form, and apply its metadata to the realized lazyseq
Patch: clj-1586-2.patch
Screened by: Alex Miller



 Comments   
Comment by Stuart Halloway [ 31/Jul/15 9:19 AM ]

Please add a test case.

Comment by Nicola Mometto [ 31/Jul/15 9:32 AM ]

Updated patch with testcase

Comment by Alex Miller [ 11/Aug/15 10:55 AM ]

clj-1586-2.patch is same as prior, just updated to apply to current master





[CLJ-1232] Functions with non-qualified return type hints force import of hinted classes when called from other namespace Created: 18/Jul/13  Updated: 03/Sep/15  Resolved: 03/Sep/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5, Release 1.6, Release 1.7
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Tassilo Horn Assignee: Unassigned
Resolution: Completed Votes: 8
Labels: compiler, typehints

Attachments: Text File 0001-auto-qualify-arglists-class-names.patch     Text File 0001-auto-qualify-arglists-class-names-v2.patch     Text File 0001-auto-qualify-arglists-class-names-v3.patch     Text File 0001-auto-qualify-arglists-class-names-v4.patch     Text File 0001-CLJ-1232-auto-qualify-arglists-class-names.patch     Text File 0001-throw-on-non-qualified-class-names-that-are-not-auto.patch    
Patch: Code and Test
Approval: Ok

 Description   

You can add a type hint to function arglists to indicate the return type of a function like so.

user> (import '(java.util List))
java.util.List
user> (defn linkedlist ^List [] (java.util.LinkedList.))
#'user/linkedlist
user> (.size (linkedlist))
0

The problem is that now when I call `linkedlist` exactly as above from another namespace, I'll get an exception because java.util.List is not imported in there.

user> (in-ns 'user2)
#<Namespace user2>
user2> (refer 'user)
nil
user2> (.size (linkedlist))
CompilerException java.lang.IllegalArgumentException: Unable to resolve classname: List, compiling:(NO_SOURCE_PATH:1:1)
user2> (import '(java.util List)) ;; Too bad, need to import List here, too.
java.util.List
user2> (.size (linkedlist))
0

There are two workarounds: You can import the hinted type also in the calling namespace, or you always use fully qualified class names for return type hints. Clearly, the latter is preferable.

Approach: Resolve the tags in the defn macro.

Patch: 0001-CLJ-1232-auto-qualify-arglists-class-names.patch

Screened by: Fogus

Alternate approach: Make the compiler resolve the return tags when necessary (tag is not a string, primitive tag (^long) or array tag (^longs)) and update the Var's :arglist appropriately. Patch: 0001-auto-qualify-arglists-class-names-v4.patch. Note that this patch had problems with Hystrix, which was using non-conformant arglists - Hystrix has since been patched.



 Comments   
Comment by Andy Fingerhut [ 16/Apr/14 3:47 PM ]

To make sure I understand, Nicola, in this ticket you are asking that the Clojure compiler change behavior so that the sample code works correctly with no exceptions, the same way as it would work correctly without exceptions if one of the workarounds were used?

Comment by Tassilo Horn [ 17/Apr/14 12:18 AM ]

Hi Andy. Tassilo here, not Nicola. But yes, the example should work as-is. When I'm allowed to use type hints with simple imported class names for arguments, then doing so for return values should work, too.

Comment by Rich Hickey [ 10/Jun/14 10:41 AM ]

Type hints on function params are only consumed by the function definition, i.e. in the same module as the import/alias. Type hints on returns are just metadata, they don't get 'compiled' and if the metadata is not useful to consumers in other namespaces, it's not a useful hint. So, if it's not a type in the auto-imported set (java.lang), it should be fully qualified.

Comment by Alex Miller [ 10/Jun/14 11:55 AM ]

Based on Rich's comment, this ticket should probably morph into an enhancement request on documentation, probably on http://clojure.org/java_interop#Java Interop-Type Hints.

Comment by Andy Fingerhut [ 10/Jun/14 3:13 PM ]

I would suggest something like the following for a documentation change, after this part of the text on the page Alex links in the previous comment:

For function return values, the type hint can be placed before the arguments vector:

(defn hinted
(^String [])
(^Integer [a])
(^java.util.List [a & args]))

-> #user/hinted

If the return value type hint is for a class that is outside of java.lang, which is the part auto-imported by Clojure, then it must be a fully qualified class name, e.g. java.util.List, not List.

Comment by Nicola Mometto [ 10/Jun/14 4:02 PM ]

I don't understand why we should enforce this complexity to the user.
Why can't we just make the Compiler (or even defn itself) update all the arglists tags with properly resolved ones? (that's what I'm doing in tools.analyzer.jvm)

Comment by Alexander Kiel [ 19/Jul/14 10:02 AM ]

I'm with Nicola here. I also think that defn should resolve the type hint according the imports of the namespace defn is used in.

Comment by Max Penet [ 22/Jul/14 7:06 AM ]

Same here, I was bit by this in the past. The current behavior is clearly counterintuitive.

Comment by Nicola Mometto [ 28/Aug/14 12:58 PM ]

Attached two patches implementing two different solutions:

  • 0001-auto-qualify-arglists-class-names.patch makes the compiler automatically qualify all the tags in the :arglists
  • 0001-throw-on-non-qualified-class-names-that-are-not-auto.patch makes the compiler throw an exception for all public defs whose return tag is a symbol representing a non-qualified class that is not in the auto-import list (approach proposed in IRC by Alex Miller)
Comment by Tassilo Horn [ 29/Aug/14 1:49 AM ]

For what it's worth, I'd prefer the first patch because the second doesn't help in situations where the caller lives in a namespace where the called function's return type hinted class is `ns-unmap`-ed. And there a good reasons for doing that. For example, Process is a java.lang class and Process is a pretty generic name. So in some namespace, I want to define my own Process deftype or defrecord. Without unmapping 'Process first to get rid of the java.lang.Process auto-import, I'd get an exception:

user> (deftype Process [])
IllegalStateException Process already refers to: class java.lang.Process in namespace: user  clojure.lang.Namespace.referenceClass (Namespace.java:140)

Now when I call some function from some library that has a `^Process` return type hint (meaning java.lang.Process there), I get the same exception as in my original report.

I can even get into troubles when only using standard Clojure functions because those have `^String` and `^Class` type hints. IMO, Class is also a pretty generic name I should be able to name my custom deftype/defrecord. And I might also want to have a custom String type/record in my astrophysics system.

Comment by Andy Fingerhut [ 30/Sep/14 4:39 PM ]

Not sure whether the root cause of this behavior is the same as the example in the description or not, but seems a little weird that even for fully qualified Java class names hinting the arg vector, it makes a difference whether it is done with defn or def:

Clojure 1.6.0
user=> (set! *warn-on-reflection* true)
true
user=> (defn f1 ^java.util.LinkedList [coll] (java.util.LinkedList. coll))
#'user/f1
user=> (def f2 (fn ^java.util.LinkedList [coll] (java.util.LinkedList. coll)))
#'user/f2
user=> (.size (f1 [2 3 4]))
3
user=> (.size (f2 [2 3 4]))
Reflection warning, NO_SOURCE_PATH:5:1 - reference to field size can't be resolved.
3
Comment by Alex Miller [ 30/Sep/14 6:21 PM ]

Andy, can you file that as a separate ticket?

Comment by Andy Fingerhut [ 30/Sep/14 9:08 PM ]

Created ticket CLJ-1543 for the issue raised in my comment earlier on 30 Sep 2014.

Comment by Andy Fingerhut [ 01/Oct/14 12:38 PM ]

Tassilo (or anyone), is there a reason to prefer putting the tag on the argument vector in your example? It seems that putting it on the Var name instead avoids this issue:

user=> (clojure-version)
"1.6.0"
user=> (set! *warn-on-reflection* true)
true
user=> (import '(java.util List))
java.util.List
user=> (defn ^List linkedlist [] (java.util.LinkedList.))
#'user/linkedlist
user=> (.size (linkedlist))
0
user=> (ns user2)
nil
user2=> (refer 'user)
nil
user2=> (.size (linkedlist))
0

I suppose that only allows a single type tag, rather than an independent one for each arity.

Comment by Tassilo Horn [ 02/Oct/14 3:16 AM ]

I wasn't aware of the fact that you can put it on the var's name. That's not documented at http://clojure.org/java_interop#Java Interop-Type Hints. But IMHO the documented version with putting the tag on the argument vector is more general since it supports different return type hints for the different arity version. In any case, if both forms are permitted then they should be equivalent in the case the function has only one arity.

Comment by Rich Hickey [ 16/Mar/15 12:02 PM ]

Please work on the simplest patch that resolves the names

Comment by Alex Miller [ 16/Mar/15 4:34 PM ]

Nicola, in this:

if (tag != null &&
                        !(tag instanceof String) &&
                        primClass((Symbol)tag) == null &&
                        !tagClass((Symbol) tag).getName().startsWith("["))
                        {
                            argvec = (IPersistentVector)((IObj)argvec).withMeta(RT.map(RT.TAG_KEY, Symbol.intern(tagClass((Symbol)tag).getName())));
                        }

doesn't tagClass already handle most of these cases properly already? Can this be simplified? Is there an optimization case in avoiding lookup for a dotted name?

Comment by Nicola Mometto [ 16/Mar/15 5:10 PM ]

Patch 0001-auto-qualify-arglists-class-names-v2.patch avoids doing unnecessary lookups (dotted names, special tags (primitive tags, array tags)) and adds a testcase

Comment by Michael Blume [ 20/May/15 1:13 PM ]

I'm seeing an odd failure with this patch and hystrix defcommands, will post a small reproduction shortly

Comment by Michael Blume [ 20/May/15 1:20 PM ]

https://github.com/MichaelBlume/hystrix-demo

passes lein check with 1.7 beta3, fails with v3 patch applied

Comment by Nicola Mometto [ 20/May/15 1:40 PM ]

During analysis the compiler understands only arglists in the form of (quote ([..]*)) (see https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L557-L558), hystrix emits arglists in the form of (list (quote [..])*).

Not sure what to do about this.

Comment by Michael Blume [ 20/May/15 1:51 PM ]

Possibly just ask Hystrix not to do that?

Comment by Alex Miller [ 20/May/15 2:30 PM ]

test.generative uses non-standard arglists too. I haven't looked at the patch, but if it's sensitive to that, it's probably not good enough.

Comment by Nicola Mometto [ 20/May/15 6:51 PM ]

test.generative uses non-standard :tag, not :arglists

Comment by Alex Miller [ 20/May/15 10:22 PM ]

ah, yes. sorry.

Comment by Stuart Halloway [ 10/Jul/15 12:15 PM ]

Opened https://github.com/Netflix/Hystrix/issues/831 to see what is up with hystrix.

Comment by Stuart Halloway [ 31/Jul/15 9:16 AM ]

Can somebody with context update this patch to apply to latest master?

Comment by Nicola Mometto [ 31/Jul/15 9:37 AM ]

patch updated

Comment by Matt Jacobs [ 06/Aug/15 12:59 PM ]

I've released hystrix-clj 1.4.14, with a fix for https://github.com/Netflix/Hystrix/issues/831: https://github.com/Netflix/Hystrix/releases/tag/v1.4.14. If anything else is needed on the Hystrix side, please open another issue.

Comment by Fogus [ 07/Aug/15 11:03 AM ]

Applied the 0001-CLJ-1232-auto-qualify-arglists-class-names.patch to master (1.8) and ran through the example scenario to check the veracity of the change. Additionally, I ran a modified snippet using def and verified that it too worked. Finally, after checking the code it seems reasonable in implementation and scope.

Comment by Rich Hickey [ 08/Aug/15 9:56 AM ]

I'd rather not change the compiler and it seems hystrix was broken. Also please make it clear what the single strategy and patch is at the top of the ticket.

Comment by Alex Miller [ 11/Aug/15 5:01 PM ]

Made preferred approach clear in the description and moved back to vetted. I believe the other ticket was screened by Fogus so this needs to be re-screened with the preferred approach ticket.

Comment by Alex Miller [ 11/Aug/15 5:08 PM ]

Nicola and Fogus, would appreciate an eye as to whether I just made the proper alterations in the description.

Comment by Nicola Mometto [ 12/Aug/15 6:17 AM ]

Alex Miller Reading Fogus' last comment, it seems to me that he actually screened the current patch (the one who resolves tags in defn rather than in the compiler) "Applied the 0001-CLJ-1232-auto-qualify-arglists-class-names.patch to master [..]"

Comment by Fogus [ 14/Aug/15 9:18 AM ]

It would be better if the description said the word "preferred," but based on my reading resolving the tags in defn is the winner (0001-CLJ-1232-auto-qualify-arglists-class-names.patch). Thankfully that's the one that I screened as I agree that non-trivial changes to the compiler are to be avoided. That said, of course that patch does modify the compiler but since it's a change that pulls out some code into a useful method I let it through. To my mind that was a trivial (and justifiable change). Of course, if we want to avoid any compiler changes then this patch would have to be reworked.

Comment by Alex Miller [ 18/Aug/15 5:03 PM ]

Moving back to screened as fogus screened the preferred (by Rich) version.





[CLJ-1766] Serializing+deserializing lists breaks their hash Created: 21/Jun/15  Updated: 03/Sep/15  Resolved: 03/Sep/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6, Release 1.7
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Chris Vermilion Assignee: Unassigned
Resolution: Completed Votes: 0
Labels: interop
Environment:

Tested on OS X 10.10.3, Clojure 1.6.0, Java 1.8.0_45-b14, but don't think this is a factor.


Attachments: Text File clj-1766-2.patch     Text File CLJ-1766.patch     File serialization_test_mod.diff     File serialize-test.clj    
Patch: Code and Test
Approval: Ok

 Description   

clojure.lang.PersistentList implements Serializable but a serialization roundtrip results in a hash of 0. This appears to be caused by ASeq marking its _hash property as transient; other collection base classes don't do this. I don't know if there is a rationale for the distinction, but the result is that serializing, then deserializing, a list results in its _hash property being 0 instead of either its previous, calculated value, or -1, which would trigger recalculation.

This means that any code that relies on a list's hash value can break in subtle ways.

Examples:

(import '[java.io ByteArrayInputStream ByteArrayOutputStream ObjectInputStream ObjectOutputStream])

(defn obj->bytes [obj]
  (let [bytestream (ByteArrayOutputStream.)]
    (doto (ObjectOutputStream. bytestream) (.writeObject obj))
    (.toByteArray bytestream)))

(defn bytes->obj [bs]
  (.readObject (ObjectInputStream. (ByteArrayInputStream. bs))))

(defn round-trip [x] (bytes->obj (obj->bytes x)))
user=> (hash '(1))
-1381383523
user=> (hash (round-trip '(1)))
0
user=> #{'(1) (round-trip '(1))}
#{(1) (1)}
user=> (def m {'(1) 1})
#'user/m
user=> (= m (round-trip m))
true
user=> (= (round-trip m) m)
false

Approach: Use 0 as the "uncomputed" hash value, not -1. The auto initialized value on serialization construction will then automatically trigger a re-computation.

Alternate approaches:

  • Implement a readObject method that sets the _hash property to -1. This is somewhat complicated in the face of subclasses, etc.

Note: Also need to consider other classes that use -1 instead of 0 as the uncomputed hash value: APersistentMap, APersistentSet, APersistentVector, PersistentQueue - however I believe in those cases they are not transient and thus may avoid the issue. Perhaps they should be transient though.

Patch: clj-1766-2.patch
Screened by: Alex Miller



 Comments   
Comment by Chris Vermilion [ 21/Jun/15 1:10 PM ]

Yikes, sorry about the formatting above; JIRA newbie and looks like I can't edit. Also, I should have noted that the above functions require an import: (import '[java.io ByteArrayInputStream ByteArrayOutputStream ObjectInputStream ObjectOutputStream]).

Comment by Alex Miller [ 22/Jun/15 7:55 AM ]

Yes, this is a bug. My preference would be to use 0 to indicate "not computed" and thus sidestep this issue. The downside of changing from -1 to 0 is that serialization/deserialization is then broken between versions of Clojure (although maybe if it was already broken, that's not an issue). -1 is used like this for lazily computed hashes in many classes in Clojure so the issue should really be fixed across the board.

Comment by Alex Miller [ 22/Jun/15 8:22 AM ]

There are some serialization round-trip tests in clojure.test-clojure.serialization - seems like those should be updated to compare the .hashCode and hash before/after, which should catch this. I attached a diff (not a proper patch) with that change just as a demonstration, which indeed produces a bunch of failures. That should be incorporated into any fix, possibly along with other failures.

Comment by Alex Miller [ 31/Jul/15 9:04 AM ]

Also, don't worry about crediting me on that test change, please just incorporate it into whatever gets made here.

Comment by Andrew Rosa [ 31/Jul/15 9:31 AM ]

Alex Miller I hope to work on this issue this weekend. There are some conflict with the alpha release schedule?

Comment by Alex Miller [ 31/Jul/15 9:47 AM ]

No conflict - when it's ready we'll look at it.

Comment by Andrew Rosa [ 04/Aug/15 8:21 AM ]

As Alex Miller recomended, I followed the change-default-to-zero path.
Not only it sidesteps the serialization issue, it looks more aligned with
the semantics of transient. Of course, there is no guarantees
about how different JVM implementations will deal with it - sometimes
they will be serialized, sometimes they will not.

So, together with the patch I've made some manual tests between some versions. The script used is attached for further inspection.

Running on a 1.6 REPL has shown on the original described issue:

serialize-test=> *clojure-version*
{:major 1, :minor 6, :incremental 0, :qualifier nil}
serialize-test=> (def f "seq-1-6.dump")
#'serialize-test/f
serialize-test=> (def x {'(1) 1})
#'serialize-test/x
serialize-test=> (dump-bytes f (serialize x))
nil
serialize-test=> (deserialize (slurp-bytes f))
{(1) 1}
serialize-test=> (hash x)
202919476
serialize-test=> (hash (deserialize (serialize x)))
202919476
serialize-test=> (hash (deserialize (slurp-bytes f)))
-1619826937
serialize-test=> (= x (deserialize (slurp-bytes f)))
true
serialize-test=> (= (deserialize (slurp-bytes f)) x)
false

Running on 1.8 master. Despite of the = behavior looking ok, the
hashes are different between roundtrips, like we saw on 1.6:

serialize-test=> *clojure-version*
{:major 1, :minor 8, :incremental 0, :qualifier "master", :interim true}
serialize-test=> (def f "seq-1-8-master.dump")
#'serialize-test/f
serialize-test=> (def x {'(1) 1})
#'serialize-test/x
serialize-test=> (dump-bytes f (serialize x))
nil
serialize-test=> (deserialize (slurp-bytes f))
{(1) 1}
serialize-test=> (hash x)
202919476
serialize-test=> (hash (deserialize (serialize x)))
202919476
serialize-test=> (hash (deserialize (slurp-bytes f)))
-1619826937
serialize-test=> (= x (deserialize (slurp-bytes f)))
true
serialize-test=> (= (deserialize (slurp-bytes f)) x)
true

Running on 1.8 after patch the hashes are correctly computed on both cases:

serialize-test=> *clojure-version*
{:major 1, :minor 8, :incremental 0, :qualifier "master", :interim true}
serialize-test=> (def f "seq-1-8-patch.dump")
#'serialize-test/f
serialize-test=> (def x {'(1) 1})
#'serialize-test/x
serialize-test=> (dump-bytes f (serialize x))
nil
serialize-test=> (deserialize (slurp-bytes f))
{(1) 1}
serialize-test=> (hash x)
202919476
serialize-test=> (hash (deserialize (serialize x)))
202919476
serialize-test=> (hash (deserialize (slurp-bytes f)))
202919476
serialize-test=> (= x (deserialize (slurp-bytes f)))
true
serialize-test=> (= (deserialize (slurp-bytes f)) x)
true

Please note I've dumped each serialized data into different files, so I could test the interoperability between those versions. What I found:

  • 1.6 serialization is already incompatible with 1.8, on both directions;
  • 1.8 data could be exchange between master and patched versions, not affecting version behavior (hashes are computed only on patched REPL).

Did I miss something or everything looks correct for you too? I'm open to do some more exhaustive testing if anyone could give me some directions about where to explore.

Comment by Rich Hickey [ 08/Aug/15 9:46 AM ]

This patch both fixes the serialization problem and changes the implementation for no reason related to the problem. The implementation works in place in order to avoid head-holding, which the implementation change reintroduces. Screeners - please make sure patches contain the minimal code to address the problem and don't incorporate other extraneous 'improvements'.

Comment by Ghadi Shayban [ 08/Aug/15 11:59 AM ]

Rich Hickey The only change I see is the removal of a commented-out impl.

Comment by Alex Miller [ 08/Aug/15 10:33 PM ]

I agree with Ghadi - there is no change in implementation here, just some commented (non-used) code was removed. Moving back to Screened.

Comment by Alex Miller [ 18/Aug/15 12:26 PM ]

Latest patch is identical, just does not remove the commented out code.





[CLJ-701] Primitive return type of loop and try is lost Created: 03/Jan/11  Updated: 03/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Backlog
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Chouser Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: None
Environment:

Clojure commit 9052ca1854b7b6202dba21fe2a45183a4534c501, version 1.3.0-master-SNAPSHOT


Attachments: Text File 0001-CLJ-701-add-HoistedMethod-to-the-compiler-for-hoisti.patch     Text File 0001-CLJ-701-add-HoistedMethod-to-the-compiler-for-hoisti-v2.patch     File hoistedmethod-pass-1.diff     File hoistedmethod-pass-2.diff     File hoistedmethod-pass-3.diff     File hoistedmethod-pass-4.diff     File hoistedmethod-pass-5.diff     File hoistedmethod-pass-6.diff     File hoistedmethod-pass-7.diff    
Patch: Code and Test
Approval: Vetted

 Description   
(set! *warn-on-reflection* true)
(fn [] (loop [b 0] (recur (loop [a 1] a))))

Generates the following warnings:

recur arg for primitive local: b is not matching primitive, had: Object, needed: long
Auto-boxing loop arg: b

This is interesting for several reasons. For one, if the arg to recur is a let form, there is no warning:

(fn [] (loop [b 0] (recur (let [a 1] a))))

Also, the compiler appears to understand the return type of loop forms just fine:

(use '[clojure.contrib.repl-utils :only [expression-info]])
(expression-info '(loop [a 1] a))
;=> {:class long, :primitive? true}

The problem can of course be worked around using an explicit cast on the loop form:

(fn [] (loop [b 0] (recur (long (loop [a 1] a)))))

Reported by leafw in IRC: http://clojure-log.n01se.net/date/2011-01-03.html#10:31

See Also: CLJ-1422

Patch: 0001-CLJ-701-add-HoistedMethod-to-the-compiler-for-hoisti-v2.patch



 Comments   
Comment by a_strange_guy [ 03/Jan/11 4:36 PM ]

The problem is that a 'loop form gets converted into an anonymous fn that gets called immediately, when the loop is in a expression context (eg. its return value is needed, but not as the return value of a method/fn).

so

(fn [] (loop [b 0] (recur (loop [a 1] a))))

gets converted into

(fn [] (loop [b 0] (recur ((fn [] (loop [a 1] a))))))

see the code in the compiler:
http://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L5572

this conversion already bites you if you have mutable fields in a deftype and want to 'set! them in a loop

http://dev.clojure.org/jira/browse/CLJ-274

Comment by Christophe Grand [ 23/Nov/12 2:28 AM ]

loops in expression context are lifted into fns because else Hotspot doesn't optimize them.
This causes several problems:

  • type inference doesn't propagate outside of the loop[1]
  • the return value is never a primitive
  • mutable fields are inaccessible
  • surprise allocation of one closure objects each time the loop is entered.

Adressing all those problems isn't easy.
One can compute the type of the loop and emit a type hint but it works only with reference types. To make it works with primitive, primitie fns aren't enough since they return only long/double: you have to add explicit casts.
So solving the first two points can be done in a rather lccal way.
The two other points require more impacting changes, the goal would be to emit a method rather than a fn. So it means at the very least changing ObjExpr and adding a new subclassof ObjMethod.

[1] beware of CLJ-1111 when testing.

Comment by Alex Miller [ 21/Oct/13 10:28 PM ]

I don't think this is going to make it into 1.6, so removing the 1.6 tag.

Comment by Kevin Downey [ 21/Jul/14 7:14 PM ]

an immediate solution to this might be to hoist loops out in to distinct non-ifn types generated by the compiler with an invoke method that is typed to return the getJavaClass() type of the expression, that would give us the simplifying benefits of hoisting the code out and free use from the Object semantics of ifn

Comment by Kevin Downey [ 22/Jul/14 8:39 PM ]

I have attached a 3 part patch as hoistedmethod-pass-1.diff

3ed6fed8 adds a new ObjMethod type to represent expressions hoisted out in to their own methods on the enclosing class

9c39cac1 uses HoistedMethod to compile loops not in the return context

901e4505 hoists out try expressions and makes it possible for try to return a primitive expression (this might belong on http://dev.clojure.org/jira/browse/CLJ-1422)

Comment by Kevin Downey [ 22/Jul/14 8:54 PM ]

with hoistedmethod-pass-1.diff the example code generates bytecode like this

user=> (println (no.disassemble/disassemble (fn [] (loop [b 0] (recur (loop [a 1] a))))))
// Compiled from form-init1272682692522767658.clj (version 1.5 : 49.0, super bit)
public final class user$eval1675$fn__1676 extends clojure.lang.AFunction {
  
  // Field descriptor #7 Ljava/lang/Object;
  public static final java.lang.Object const__0;
  
  // Field descriptor #7 Ljava/lang/Object;
  public static final java.lang.Object const__1;
  
  // Method descriptor #10 ()V
  // Stack: 2, Locals: 0
  public static {};
     0  lconst_0
     1  invokestatic java.lang.Long.valueOf(long) : java.lang.Long [16]
     4  putstatic user$eval1675$fn__1676.const__0 : java.lang.Object [18]
     7  lconst_1
     8  invokestatic java.lang.Long.valueOf(long) : java.lang.Long [16]
    11  putstatic user$eval1675$fn__1676.const__1 : java.lang.Object [20]
    14  return
      Line numbers:
        [pc: 0, line: 1]

 // Method descriptor #10 ()V
  // Stack: 1, Locals: 1
  public user$eval1675$fn__1676();
    0  aload_0 [this]
    1  invokespecial clojure.lang.AFunction() [23]
    4  return
      Line numbers:
        [pc: 0, line: 1]
  
  // Method descriptor #25 ()Ljava/lang/Object;
  // Stack: 3, Locals: 3
  public java.lang.Object invoke();
     0  lconst_0
     1  lstore_1 [b]
     2  aload_0 [this]
     3  lload_1 [b]
     4  invokevirtual user$eval1675$fn__1676.__hoisted1677(long) : long [29]
     7  lstore_1 [b]
     8  goto 2
    11  areturn
      Line numbers:
        [pc: 0, line: 1]
      Local variable table:
        [pc: 2, pc: 11] local: b index: 1 type: long
        [pc: 0, pc: 11] local: this index: 0 type: java.lang.Object

 // Method descriptor #27 (J)J
  // Stack: 2, Locals: 5
  public long __hoisted1677(long b);
    0  lconst_1
    1  lstore_3 [a]
    2  lload_3
    3  lreturn
      Line numbers:
        [pc: 0, line: 1]
      Local variable table:
        [pc: 2, pc: 3] local: a index: 3 type: long
        [pc: 0, pc: 3] local: this index: 0 type: java.lang.Object
        [pc: 0, pc: 3] local: b index: 1 type: java.lang.Object

}
nil
user=> 
  

the body of the method __hoisted1677 is the inner loop

for reference the part of the bytecode from the same function compiled with 1.6.0 is pasted here https://gist.github.com/hiredman/f178a690718bde773ba0 the inner loop body is missing because it is implemented as its own IFn class that is instantiated and immediately executed. it closes over a boxed version of the numbers and returns an boxed version

Comment by Kevin Downey [ 23/Jul/14 12:43 AM ]

hoistedmethod-pass-2.diff replaces 901e4505 with f0a405e3 which fixes the implementation of MaybePrimitiveExpr for TryExpr

with hoistedmethod-pass-2.diff the largest clojure project I have quick access to (53kloc) compiles and all the tests pass

Comment by Alex Miller [ 23/Jul/14 12:03 PM ]

Thanks for the work on this!

Comment by Kevin Downey [ 23/Jul/14 2:05 PM ]

I have been working through running the tests for all the contribs projects with hoistedmethod-pass-2.diff, there are some bytecode verification errors compiling data.json and other errors elsewhere, so there is still work to do

Comment by Kevin Downey [ 25/Jul/14 7:08 PM ]

hoistedmethod-pass-3.diff

49782161 * add HoistedMethod to the compiler for hoisting expresssions out well typed methods
e60e6907 * hoist out loops if required
547ba069 * make TryExpr MaybePrimitive and hoist tries out as required

all contribs whose tests pass with master pass with this patch.

the change from hoistedmethod-pass-2.diff in this patch is the addition of some bookkeeping for arguments that take up more than one slot

Comment by Nicola Mometto [ 26/Jul/14 1:37 AM ]

Kevin there's still a bug regarding long/doubles handling:
On commit 49782161, line 101 of the diff, you're emitting gen.pop() if the expression is in STATEMENT position, you need to emit gen.pop2() instead if e.getReturnType is long.class or double.class

Test case:

user=> (fn [] (try 1 (finally)) 2)
VerifyError (class: user$eval1$fn__2, method: invoke signature: ()Ljava/lang/Object;) Attempt to split long or double on the stack  user/eval1 (NO_SOURCE_FILE:1)
Comment by Kevin Downey [ 26/Jul/14 1:46 AM ]

bah, all that work to figure out the thing I couldn't get right and of course I overlooked the thing I knew at the beginning. I want to get rid of some of the code duplication between emit and emitUnboxed for TryExpr, so when I get that done I'll fix the pop too

Comment by Kevin Downey [ 26/Jul/14 12:52 PM ]

hoistedmethod-pass-4.diff logically has the same three commits, but fixes the pop vs pop2 issue and rewrites emit and emitUnboxed for TryExpr to share most of their code

Comment by Kevin Downey [ 26/Jul/14 12:58 PM ]

hoistedmethod-pass-5.diff fixes a stupid mistake in the tests in hoistedmethod-pass-4.diff

Comment by Nicola Mometto [ 21/Aug/15 1:03 PM ]

Kevin Downey FWIW the patch no longer applies

Comment by Michael Blume [ 21/Aug/15 3:06 PM ]

A naive attempt to bring the patch up to date results in

compile-clojure:
     [java] Exception in thread "main" java.lang.ExceptionInInitializerError
     [java] 	at clojure.lang.Compile.<clinit>(Compile.java:29)
     [java] Caused by: java.lang.ClassCastException: clojure.lang.Compiler$HoistedMethod cannot be cast to clojure.lang.Compiler$FnMethod, compiling:(clojure/core.clj:439:11)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7360)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7154)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7341)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7154)
     [java] 	at clojure.lang.Compiler.access$300(Compiler.java:38)
     [java] 	at clojure.lang.Compiler$DefExpr$Parser.parse(Compiler.java:589)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7353)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7154)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7112)
     [java] 	at clojure.lang.Compiler.eval(Compiler.java:7416)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7859)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:372)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:363)
     [java] 	at clojure.lang.RT.load(RT.java:453)
     [java] 	at clojure.lang.RT.load(RT.java:419)
     [java] 	at clojure.lang.RT.doInit(RT.java:461)
     [java] 	at clojure.lang.RT.<clinit>(RT.java:331)
     [java] 	... 1 more
     [java] Caused by: java.lang.ClassCastException: clojure.lang.Compiler$HoistedMethod cannot be cast to clojure.lang.Compiler$FnMethod
     [java] 	at clojure.lang.Compiler$FnExpr.parse(Compiler.java:4466)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7351)
     [java] 	... 17 more
Comment by Kevin Downey [ 25/Aug/15 1:00 PM ]

the pass 6 patch builds cleanly on 1d5237f9d7db0bc5f6e929330108d016ac7bf76c(HEAD of master as of this moment) and runs clojure's tests. I have not verified it against other projects as I did with the previous patches (I don't remember how I did that)

Comment by Michael Blume [ 25/Aug/15 1:38 PM ]

I get the same error I shared before applying the new patch to 1d5237f

Comment by Alex Miller [ 25/Aug/15 1:56 PM ]

I get some whitespace warnings with hoistedmethod-pass-6.diff but the patch applies. On a compile I get:

compile-clojure:
     [java] Exception in thread "main" java.lang.ExceptionInInitializerError
     [java] 	at clojure.lang.Compile.<clinit>(Compile.java:29)
     [java] Caused by: java.lang.ClassCastException: clojure.lang.Compiler$HoistedMethod cannot be cast to clojure.lang.Compiler$FnMethod, compiling:(clojure/core.clj:439:11)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7362)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7156)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7343)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7156)
     [java] 	at clojure.lang.Compiler.access$300(Compiler.java:38)
     [java] 	at clojure.lang.Compiler$DefExpr$Parser.parse(Compiler.java:588)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7355)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7156)
     [java] 	at clojure.lang.Compiler.analyze(Compiler.java:7114)
     [java] 	at clojure.lang.Compiler.eval(Compiler.java:7418)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7861)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:372)
     [java] 	at clojure.lang.RT.loadResourceScript(RT.java:363)
     [java] 	at clojure.lang.RT.load(RT.java:453)
     [java] 	at clojure.lang.RT.load(RT.java:419)
     [java] 	at clojure.lang.RT.doInit(RT.java:461)
     [java] 	at clojure.lang.RT.<clinit>(RT.java:331)
     [java] 	... 1 more
     [java] Caused by: java.lang.ClassCastException: clojure.lang.Compiler$HoistedMethod cannot be cast to clojure.lang.Compiler$FnMethod
     [java] 	at clojure.lang.Compiler$FnExpr.parse(Compiler.java:4468)
     [java] 	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7353)
     [java] 	... 17 more
Comment by Nicola Mometto [ 25/Aug/15 2:36 PM ]

Looks like direct linking interacts with the diffs in this patch in non trivial ways.

Comment by Kevin Downey [ 25/Aug/15 3:05 PM ]

I must have screwed up running the tests some how, I definitely get the same error now

Comment by Kevin Downey [ 25/Aug/15 4:00 PM ]

after you get past the cast issuing (just adding some conditional logic there)

it looks like HoistedMethodInvocationExpr needs to be made aware of if it is emitting in an instance method or a static method, and do the right thing with regard to this pointers and argument offsets. this will likely require HoistedMethod growing the ability to be a static method (and maybe preferring static methods when possible).

if you cause HoistedMethod to set usesThis to true on methods that use it, then everything appears hunky-dory (if I ran the tests correctly), but this largely negates the new direct linking stuff, which is not good.

Comment by Kevin Downey [ 27/Aug/15 9:30 PM ]

hoistedmethod-pass-7.diff adds a single commit to hoistedmethod-pass-5.diff

the single commit changes hoisted methods to always be static methods, and adjusts the arguments in the invocation of the hoisted method based on if the containing function is a static/direct function or not.

again I haven't done the extended testing with this patch.

here is an example of what the hoisted methods look like

user> (println (disassemble (fn ^long [^long y] (let [x (try (/ 1 0) (catch Throwable t 0))] x))))                                                                       

// Compiled from form-init3851661302895745152.clj (version 1.5 : 49.0, super bit)                                                                                        
public final class user$eval1872$fn__1873 extends clojure.lang.AFunction implements clojure.lang.IFn$LL {                                                                
                                                                                                                                                                         
  // Field descriptor #9 Lclojure/lang/Var;                                                                                                                              
  public static final clojure.lang.Var const__0;                                                                                                                         
                                                                                                                                                                         
  // Field descriptor #11 Ljava/lang/Object;                                                                                                                             
  public static final java.lang.Object const__1;                                                                                                                         
                                                                                                                                                                         
  // Field descriptor #11 Ljava/lang/Object;                                                                                                                             
  public static final java.lang.Object const__2;                                                                                                                         
                                                                                                                                                                         
  // Method descriptor #14 ()V                                                                                                                                           
  // Stack: 2, Locals: 0                                                                                                                                                 
  public static {};                                                                                                                                                      
     0  ldc <String "clojure.core"> [16]                                                                                                                                 
     2  ldc <String "/"> [18]                                                                                                                                            
     4  invokestatic clojure.lang.RT.var(java.lang.String, java.lang.String) : clojure.lang.Var [24]                                                                     
     7  checkcast clojure.lang.Var [26]                                                                                                                                  
    10  putstatic user$eval1872$fn__1873.const__0 : clojure.lang.Var [28]                                                                                                
    13  lconst_1    
    14  invokestatic java.lang.Long.valueOf(long) : java.lang.Long [34]                                                                                                  
    17  putstatic user$eval1872$fn__1873.const__1 : java.lang.Object [36]                                                                                                
    20  lconst_0                                                                                                                                                         
    21  invokestatic java.lang.Long.valueOf(long) : java.lang.Long [34]                                                                                                  
    24  putstatic user$eval1872$fn__1873.const__2 : java.lang.Object [38]                                                                                                
    27  return                                                                                                                                                           
      Line numbers:                                                                                                                                                      
        [pc: 0, line: 1]                                                                                                                                                 
                                                                                                                                                                         
  // Method descriptor #14 ()V                                                                                                                                           
  // Stack: 1, Locals: 1                                                                                                                                                 
  public user$eval1872$fn__1873();                                                                                                                                       
    0  aload_0 [this]                                                                                                                                                    
    1  invokespecial clojure.lang.AFunction() [41]                                                                                                                       
    4  return                                                                                                                                                            
      Line numbers:                                                                                                                                                      
        [pc: 0, line: 1]                                                                                                                                                 
                                                                                                                                                                         
  // Method descriptor #43 (J)J                                                                                                                                          
  // Stack: 2, Locals: 4  
  public final long invokePrim(long y);                                                                                                                                  
     0  lload_1 [y]                                                                                                                                                      
     1  invokestatic user$eval1872$fn__1873.__hoisted1874(long) : java.lang.Object [47]                                                                                  
     4  astore_3 [x]                                                                                                                                                     
     5  aload_3 [x]                                                                                                                                                      
     6  aconst_null                                                                                                                                                      
     7  astore_3                                                                                                                                                         
     8  checkcast java.lang.Number [50]                                                                                                                                  
    11  invokevirtual java.lang.Number.longValue() : long [54]                                                                                                           
    14  lreturn                                                                                                                                                          
      Line numbers:                                                                                                                                                      
        [pc: 0, line: 1]                                                                                                                                                 
      Local variable table:                                                                                                                                              
        [pc: 5, pc: 8] local: x index: 3 type: java.lang.Object                                                                                                          
        [pc: 0, pc: 14] local: this index: 0 type: java.lang.Object                                                                                                      
        [pc: 0, pc: 14] local: y index: 1 type: long                                                                                                                     
                                                                                                                                                                         
  // Method descriptor #59 (Ljava/lang/Object;)Ljava/lang/Object;                                                                                                        
  // Stack: 5, Locals: 2                                                                                                                                                 
  public java.lang.Object invoke(java.lang.Object arg0);                                                                                                                 
     0  aload_0 [this]                                                                                                                                                   
     1  aload_1 [arg0]                                                                                                                                                   
     2  checkcast java.lang.Number [50]                                                                                                                                  
     5  invokestatic clojure.lang.RT.longCast(java.lang.Object) : long [63]                                                                                              
     8  invokeinterface clojure.lang.IFn$LL.invokePrim(long) : long [65] [nargs: 3]                                                                                      
    13  new java.lang.Long [30]                                                                                                                                          
    16  dup_x2                                                                                                                                                           
    17  dup_x2                                                                                                                                                           
    18  pop                                                                                                                                                              
    19  invokespecial java.lang.Long(long) [68]                                                                                                                          
    22  areturn                                                                                                                                                          
                                                                                                                                                                         
                                                                                                                                                                         
  // Method descriptor #45 (J)Ljava/lang/Object;                                                                                                                         
  // Stack: 4, Locals: 4                                                                                                                                                 
  public static java.lang.Object __hoisted1874(long arg0);                                                                                                               
     0  lconst_1                                                                                                                                                         
     1  lconst_0                                      
                                                                                                                                                        
     2  invokestatic clojure.lang.Numbers.divide(long, long) : java.lang.Number [74]                                                                                     
     5  astore_2                                                                                                                                                         
     6  goto 17                                                                                                                                                          
     9  astore_3 [t]                                                                                                                                                     
    10  getstatic user$eval1872$fn__1873.const__2 : java.lang.Object [38]                                                                                                
    13  astore_2                                                                                                                                                         
    14  goto 17                                                                                                                                                          
    17  aload_2                                                                                                                                                          
    18  areturn                                                                                                                                                          
      Exception Table:                                                                                                                                                   
        [pc: 0, pc: 6] -> 9 when : java.lang.Throwable                                                                                                                   
      Line numbers:                                                                                                                                                      
        [pc: 0, line: 1]                                                                                                                                                 
        [pc: 2, line: 1]                                                                                                                                                 
      Local variable table:                                                                                                                                              
        [pc: 9, pc: 14] local: t index: 3 type: java.lang.Object                                                                                                         
        [pc: 0, pc: 18] local: y index: 1 type: java.lang.Object                                                                                                         
                                                                                                                                                                         
}                           

Comment by Kevin Downey [ 27/Aug/15 9:43 PM ]

there is still an issue with patch 7, (defn f [^long y] (let [x (try (+ 1 0) (catch Throwable t y))] x)) causes a verifier error

Comment by Nicola Mometto [ 28/Aug/15 11:41 AM ]

Patch 0001-CLJ-701-add-HoistedMethod-to-the-compiler-for-hoisti.patch is patch hoistedmethod-pass-7 with the following changes:

  • fixes the bug mentioned in the last comment by Kevin Downey
  • removes unnecessary whitespace and indentation changes
  • conforms indentation with the surrounding lines
  • squashes the commits into one as preferred by Rich

Authorship is maintained for Kevin Downey

Comment by Kevin Downey [ 28/Aug/15 5:40 PM ]

maybe we need a "Bronsa's Guide to Submitting Patches" to supplement http://dev.clojure.org/display/community/Developing+Patches, I had no idea a single commit was preferred, but that makes sense given the format, although I just noticed the bit on deleting old patches to avoid confusion. Is there a preferred format for patch names too?
If you have recommendations beyond patch formatting into the code itself I am all ears.

Comment by Nicola Mometto [ 28/Aug/15 6:08 PM ]

Kevin, having a single commit per patch is something that I've seen Rich and Alex ask for in a bunch of tickets, as I guess it makes it easier to evaluate the overall diff (even though it sacrifices granularity of description).
No idea for a preferred patch name, I find it hard to imagine it would matter – I just use whatever git format patch outputs.

One thing I personally prefer is to add the ticket name at the beginning of the commit message, it makes it easier to understand changes when using e.g. git blame

Comment by Andy Fingerhut [ 28/Aug/15 6:33 PM ]

Just now I added a suggestion to http://dev.clojure.org/display/community/Developing+Patches that one read their patches before attaching them, and remove any spurious white space changes. Also to consider submitting patches with a single commit, rather than ones broken up into multiple commits, as reviewers tend to prefer those.

Alex Miller recently edited that page with the note about putting the ticket id first in the commit comment.

Only preference on patch file names is the one on that page – that they end with '.patch' or '.diff', because Rich's preferred editor for reading them recognizes those suffixes and displays the file in a patch-specific mode, I would guess.

Comment by Nicola Mometto [ 28/Aug/15 6:38 PM ]

0001-CLJ-701-add-HoistedMethod-to-the-compiler-for-hoisti-v2.patch is the same as 0001-CLJ-701-add-HoistedMethod-to-the-compiler-for-hoisti.patch but it changes some indentation to avoid mixing tabs and spaces

Comment by Michael Blume [ 01/Sep/15 5:27 PM ]

I have a verify error on the latest patch, will attempt to provide a small test case:

Exception in thread "main" java.lang.VerifyError: (class: com/climate/scouting/homestead_test$fn__17125, method: invokeStatic signature: (Ljava/lang/Object;)Ljava/lang/Object Expecting to find integer on stack, compiling:(homestead_test.clj:43:3)

Comment by Michael Blume [ 01/Sep/15 6:29 PM ]

https://github.com/MichaelBlume/hoist-fail

I'd rather remove the dependency on compojure-api but I'm having trouble reproducing without it

Comment by Michael Blume [ 01/Sep/15 8:01 PM ]

Ok, less stupid test case:

(let [^boolean foo true]
  (do
    (try foo
      (catch Throwable t))
    nil))
Comment by Michael Blume [ 01/Sep/15 8:30 PM ]

...I wonder how hard it would be to write a generative test which produced small snippets of valid Clojure code, compiled them, and checked for errors. I bet we could've caught this.

Comment by Nicola Mometto [ 02/Sep/15 12:53 AM ]

Michael Blume I'm not sure this should be considered a compiler bug.
Contrary to numeric literals, boolean literals are never emitted unboxed and type hints are not casts so

(let [^boolean a true])
is lying to the compiler telling it a is an unboxed boolean when infact it is a boxed Boolean.

I'd say we wait for what Alex Miller or Rich Hickey think of this before considering Kevin Downey's current patch bugged, I personally (given the current compiler implementation) would consider this an user-code bug, currently ignored, that this patch merely exposes. IOW an instance of currently working code that is not however valid code

(I rest my case that if we had some better spec on what type-hints are supposed to be valid and some better compile-time validation on them, issues like this would not arise)

Comment by Nicola Mometto [ 02/Sep/15 1:13 AM ]

If this is considered a bug, the fix is trivial btw

@@ -5888,7 +5888,7 @@ public static class LocalBinding{
 
 	public boolean hasJavaClass() {
 		if(init != null && init.hasJavaClass()
-		   && Util.isPrimitive(init.getJavaClass())
+		   && (Util.isPrimitive(init.getJavaClass()) || Util.isPrimitive(tag))
 		   && !(init instanceof MaybePrimitiveExpr))
 			return false;
 		return tag != null
Comment by Kevin Downey [ 03/Sep/15 11:22 AM ]

I imagine ^boolean type hints (that don't do anything and are ignored) are going to become very common in clojure code, given everyone's keen interest in code sharing between clojure and clojurescript, and iirc clojurescript actually using ^boolean





[CLJ-1435] 'numerator and 'denominator fail to handle integral values (i.e. N/1) Created: 30/May/14  Updated: 01/Sep/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Aaron Brooks Assignee: Unassigned
Resolution: Unresolved Votes: 11
Labels: None

Approval: Triaged

 Description   

Because ratio values reduce to lowest terms and, for integral values where the lowest term is N/1, are auto-converted to BigInts (and formerly Longs), the current behavior of clojure.core/numerator and clojure.core/denominator yield unexpected results.

user=> (numerator 1/3)
1
user=> (numerator (+ 1/3 2/3))

ClassCastException clojure.lang.BigInt cannot be cast to clojure.lang.Ratio  clojure.core/numerator (core.clj:3306)
user=> (denominator 1/3)
3
user=> (denominator (+ 1/3 2/3))

ClassCastException clojure.lang.BigInt cannot be cast to clojure.lang.Ratio  clojure.core/denominator (core.clj:3314)
user=>

The auto-conversion to Longs is not really the problem in my mind. I'd like to see numerator return the original value when presented with a BigInt and denominator always return 1 when presented with a BigInt. It seems reasonable to request the same for Longs.

If desired, I'd be happy to produce a patch.



 Comments   
Comment by Andy Fingerhut [ 30/May/14 6:35 PM ]

I don't know the official stance on this ticket, but will add some notes.

Aaron, numerator and denominator are pretty clearly documented to work on Ratio types only.

It is pretty easy to write my-numerator and my-denominator that work exactly as you wish, checking for the type of arg and using numerator, denominator for Ratio types, and doing whatever you think is correct for other numeric types.

Comment by Aaron Brooks [ 30/May/14 7:44 PM ]

I'm aware that they are documented as such. Part of my point is that you can be working entirely with Ratio types and, via arithmetic operations between them, sometimes wind up with a non-Ratio number unexpectedly.

Also consider:

user=> (numerator 2/1)
ClassCastException java.lang.Long cannot be cast to clojure.lang.Ratio  clojure.core/numerator (core.clj:3238)

You're then left either implementing a try/catch correction or always checking the type before using numerator or denominator which is a loss in performance.

The patch I have in mind is creating a protocol, extended to Ratio, BigInt and Long which calls the appropriate method (Ratios) or returns either the given number or 1 (numerator/denominator) for the integral types. I expect this to maintain the current level of performance in the cases where it works and behave properly in the cases currently not handled.

Comment by Gary Fredericks [ 27/Aug/15 10:38 AM ]

I've definitely written the helper functions Andy describes on several occasions.

Comment by Felipe Micaroni Lalli [ 01/Sep/15 4:58 PM ]

Related issue: https://stackoverflow.com/questions/25194809/how-to-convert-any-number-to-a-clojure-lang-ratio-type-in-clojure

A workaround to that is (numerator (clojure.lang.Numbers/toRatio (rationalize <put any type of number here>)))





[CLJ-1809] Compiler produces VerifyError when compiling simple let expression inside a finally block Created: 30/Aug/15  Updated: 30/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: Release 1.8

Type: Defect Priority: Critical
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: compiler

Attachments: Text File 0001-CLJ-1809-fix-off-by-one-error-in-direct-linking.patch    
Patch: Code and Test
Approval: Vetted

 Description   

Compiling

(defn x [y]
  (try
    (finally
      (let [z y]))))

produces

VerifyError (class: user$x, method: invokeStatic signature: (Ljava/lang/Object;)Ljava/lang/Object;) Can only throw Throwable objects  java.lang.Class.getDeclaredConstructors0 (Class.java:-2)

Disabling the verifier, here's a dump of the emitted bytecode for inspection

public static java.lang.Object invokeStatic(java.lang.Object);
    Code:
       0: aconst_null
       1: astore_1
       2: aload_0
       3: aconst_null
       4: astore_0
       5: astore_2
       6: aconst_null
       7: pop
       8: goto 20
      11: astore_2
      12: aload_0
      13: aconst_null
      14: astore_0
      15: astore_2
      16: aconst_null
      17: pop
      18: aload_2
      19: athrow
      20: aload_1
      21: areturn
    Exception table:
       from to target type
           0 2 11 any

Here's the bytecode emitted by clojure 1.7.0 for comparison

public java.lang.Object invoke(java.lang.Object);
    Code:
       0: aconst_null
       1: astore_2
       2: aload_1
       3: aconst_null
       4: astore_1
       5: astore_3
       6: aconst_null
       7: pop
       8: goto          22
      11: astore        4
      13: aload_1
      14: aconst_null
      15: astore_1
      16: astore_3
      17: aconst_null
      18: pop
      19: aload         4
      21: athrow
      22: aload_2
      23: areturn
    Exception table:
       from    to  target type
           0     2    11   any


 Comments   
Comment by Nicola Mometto [ 30/Aug/15 11:30 PM ]

The attached patch fixes the issue however I'm unfamiliar with the direct linking support in the compiler so I'm not sure this is the right fix.





[CLJ-1805] :rettag encourages wrong runtime type hints Created: 28/Aug/15  Updated: 29/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: typehints

Approval: Triaged

 Description   

Currently :rettag works only for expressions like:

(defn ^long x [..] ..)

or

(defn ^double x [..] ..)

But at runtime those type-hints are resolved to

#<clojure.core$long ..>
and
#<clojure.core$double ..>

which can cause the compiler to fail, see CLJ-1674 for an example



 Comments   
Comment by Nicola Mometto [ 29/Aug/15 1:27 PM ]

It also appears that the current impl is completely useless: see David Miller's comment: https://github.com/clojure/clojure/commit/2344de2b2aadd5b0e47f1594a6f9e4eb2fdbdf5c#commitcomment-12962027





[CLJ-1119] inconsistent behavior of lazy-seq w/ macro & closure on excptions Created: 03/Dec/12  Updated: 28/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Hank Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

lazy-seq seems to evaluate inconsistently when body includes a macro and throws and exception. 1st evalutation throws the exceptions, subsequent ones return empty sequence.

demo code:

(defn gen-lazy []
  (let [coll [1 2 3]]
    (lazy-seq
      (when-let [s (seq coll)]
        (throw (Exception.))))))

(def lazy (gen-lazy))

(try
  (println "lazy:" lazy)
  (catch Exception ex
    (println ex)))

(try
  (println "lazy, again:" lazy)
  (catch Exception ex
    (println ex)))

It should throw an exception both times but only does on 1st. Generally speaking an expression shouldn't evaluate to different things depending on whether it's been evaluated before.

When removing the closure ...

(defn gen-lazy []
  (lazy-seq
    (when-let [s (seq [1 2 3])]
      (throw (Exception.)))))

... or removing the when-let macro ...

(defn gen-lazy []
  (let [coll [1 2 3]]
    (lazy-seq
      (seq coll)
      (throw (Exception.)))))

It works i.e. consistently throws the exception, so seems to be some interaction between the closure and the macro at work here. This particular combination is used in the 'map' function.

See also: https://groups.google.com/forum/?fromgroups=#!topic/clojure/Z3EiBUQ7Inc



 Comments   
Comment by Hank [ 03/Dec/12 4:26 AM ]

N.B. The primary use case I have for this, in case it matters, is interrupting the evaluation of a 'map' expression in which the mapped fn is slow to evaluate (throwing an InterruptedException), because I am not interested in its result any more. Then later I re-evaluate it because I am interested in the result after all, however with above bug the lazy sequence terminates instead of recommencing where it left off.

(UPDATE: This use case is similar to the kind of ersatz continuations that Jetty does (RetryRequest runtime exception) or even Clojure itself when barging STM transactions with the RetryEx exception.)

Comment by Hank [ 03/Dec/12 8:45 AM ]

Related to CLJ-457 according to Christophe. His patch fixes this,too.

Comment by Hank [ 04/Dec/12 5:02 AM ]

Sorry Christophe's patch doesn't work for me here. It avoids evaluating the LazySeq a second time by prematurely throwing an exception. However a LazySeq might evaluate properly the second time around b/c the situation causing the exception was transient. As per comment above an evaluation might get interrupted, throwing InterruptedException the first time around but not the second time.

Also the observation with the closure and macro need explanation IMHO.

Comment by Hank [ 08/Dec/12 3:51 AM ]

further insight: 'delay' exhibits the same behavior and is a more simple case to examine. the macro suspicion is a red herring: as demoed below it is actually the closed over variable magically turns to nil, the when-let macro simply turned that into a nil for the whole expression.

(def delayed
  (let [a true]
    (delay
      (print "a=" a)
      (throw (Exception.)))))

(try
  (print "delayed 1:")
  (force delayed)
  (catch Exception ex (println ex)))

(try
  (print "delayed 2:")
  (force delayed)
  (catch Exception ex (println ex)))

prints:

delayed 1:a= true#<Exception java.lang.Exception>
delayed 2:a= nil#<Exception java.lang.Exception>

Comment by Hank [ 09/Dec/12 1:31 AM ]

The above leads to dodgy outcomes such as: The following expression leads to an Exception on 1st evaluation and to "w00t!" on subsequent ones.

(def delayed
  (let [a true]
    (delay
      (if a
        (throw (Exception.))
        "w00t!"))))

Try it like this:

(try
  (print "delayed 1:" )
  (println (force delayed))
  (catch Exception ex (println ex)))

(try
  (print "delayed 2:")
  (println (force delayed))
  (catch Exception ex (println ex)))

Results in:
delayed 1:#<Exception java.lang.Exception>
delayed 2:w00t!

This code shows that the problem is tied to the :once flag as suspected.

(def test-once
  (let [a true]
    (^{:once true} fn* foo []
	    (println "a=" a)
	    (throw (Exception.)))))

Invoking the fn twice will show 'a' turning from 'true' to 'nil', try it like this:

(try
  (print "test-once 1:")
  (test-once)
  (catch Exception ex (println ex)))

(try
  (print "test-once 2:")
  (test-once)
  (catch Exception ex (println ex)))

Results in:
test-once 1:a= true
#<Exception java.lang.Exception>
test-once 2:a= nil
#<Exception java.lang.Exception>

That doesn't happen when the ^{:once true} is removed. Now one could argue that above fn is invoked twice, which is exactly what one is not supposed to do when decorated with the :once flag, but I argue that an unsuccessful call doesn't count as invocation towards the :once flag. The delay and lazy-seq macros agree with me there as the resulting objects are not considered realized (as per realized? function) if the evaluation of the body throws an exception, and realization/evaluation of the body is therefore repeated on re-evaluation of the delay/lazy-seq.

Try this using

(realized? delayed)
after the first evaluation in the code above. In the implementation this can be seen e.g. here for clojure.lang.Delay (similarly for LazySeq), the body-fn is set to null (meaning realized) after the invocation returns successfully only.

The :once flag affects this part of the compiler only. Some field is set to nil there in the course of a function invocation, for the good reason of letting the garbage compiler collect objects, however this should to be done after the function successfully completes only. Can this be changed?

Comment by Hank [ 16/Dec/12 4:02 AM ]

A workaround for the case of the 'map' function as described in the 1st comment, works as this: The original map function, if you take out the cases for several colls, the performance enhancements for chunked seqs and forcing the coll argument to a seq, looks like this:

(defn map [f s]
  (lazy-seq
    (cons (f (first s)) (map f (rest s)))))

In my workaround I evaluate f twice:

(defn map [f s]
  (lazy-seq
    (f (first s))
    (cons (f (first s)) (map f (rest s)))))

Because the downstream functions that are slow to evaluate are all of the deref kind that cache their result (more lazy-seqs, delays, futures, promises), the InterruptedException can only happen during the 1st evaluation, while the tail call optimization that sets closed-over variables to nil (it pays to read this here: http://clojure.org/lazy) only happens on the second call. The first still creates an fn that captures the head of the sequence 's', however this is not being held onto as it is not returned.

I use this special version of map (and other, similarly rewritten functions based on lazy-seq such as iterate) when I want interruptible, restartable seq evaluations.

Comment by Hank [ 06/Aug/13 9:06 AM ]

Instead of above hack, implementing map and all other combinators using a lazy Y-combinator, and removing the :once meta-data tag in the lazy-seq definition fixes things properly. Since the compiler sort-of-hack that clears closed-over variables that's triggered by the :once meta-data tag is basically only for cases of recursion that can be implemented using the Y-combinator, that won't be needed anymore either.

Comment by Alex Miller [ 08/Aug/13 3:20 AM ]

Re the Delay reference above, this is covered (and addressed) in CLJ-1175 which is waiting for a final screen.





[CLJ-1803] Enable destructuring of sequency maps Created: 22/Aug/15  Updated: 26/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Reid McKenzie Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: destructuring

Attachments: Text File 0001-Enable-destructuring-of-seq-map-types.patch    
Patch: Code
Approval: Triaged

 Description   

At present, Clojure's destructuring implementation will create a hash-map from any encountered value satisfying clojure.core/seq? This has the I argue undesirable side effect of making it impossible to employ destructuring on a custom Associative type which is also a Seq. This came up when trying to destructure instances of a tagged value class which for the purpose of pattern matching behave as [k v] seqs, but since the v is known to be a map, are also associative on the map part so as to avoid the syntactic overhead of updates preserving the tag.

;; A sketch of such a type
(deftype ATaggedVal [t v]
  clojure.lang.Indexed
  (nth [self i]    (nth self i nil))
  (nth [self i o] (case i (0) t (1) v o))

  clojure.lang.Sequential
  clojure.lang.ISeq
  (next  [this]     (seq this))
  (first [this]     t)
  (more  [this]     (.more (seq this)))
  (count [this]     2)
  (equiv [this obj] (= (seq this) obj))
  (seq   [self]     (cons t (cons v nil)))

  clojure.lang.Associative
  (entryAt [self key] (.entryAt v key))
  (assoc [_ sk sv]    (ATaggedVal. t (.assoc v sk sv)))

  clojure.lang.ILookup
  (valAt [self k]   (.valAt v k))
  (valAt [self k o] (.valAt v k o))

  clojure.lang.IPersistentMap
  (assocEx [_ sk sv] (ATaggedVal. t (.assocEx v sk sv)))
  (without [_ sk]    (ATaggedVal. t (.without v sk))))

So using such a thing,

(let [{:keys [x]} (ATaggedVal. :foo {:x 3 :y 4})] x)
;; expect 3
=> nil

Since for any type T such that clojure.core/get will behave, T should satisfy clojure.core/map? it should be correct simply to change the behavior of destructure to only build a hash-map if map? isn't already satisfied.

The attached patch makes this change.



 Comments   
Comment by Alex Miller [ 22/Aug/15 1:21 PM ]

Probably worth watching CLJ-1778 too which might cause this not to apply anymore.

Could you add an example of what doesn't work to the description?

Comment by Ragnar Dahlen [ 24/Aug/15 1:20 AM ]

The current patch for CLJ-1778 does not address this issue.

The idea seems sound to me, if we're map destructuring a value that's
already a map (as determined by map?), we don't need to create a
new one by calling seq and HashMap/create, unless there's a
really good reason it should be exactly that map implementation (I
don't see one).

I don't think the current patch is OK though as it makes an
(unneccessary) breaking change to the behaviour of map destructuring.
Previously, destructuring a non-seqable value returned nil, but with
patch, seq is always called on value and for non-seqble types this
will instead throw an exception. It should be trivial to change the
patch to retain the original behaviour.

1.8.0-master:

user=> (let [{:keys [x]} (java.util.Date.)] x)
nil

with 0001-Enable-destructuring-of-seq-map-types.patch:

user=> (let [{:keys [x]} (java.util.Date.)] x)
IllegalArgumentException Don't know how to create ISeq from: java.util.Date  clojure.lang.RT.seqFrom (RT.java:528)
Comment by Reid McKenzie [ 26/Aug/15 1:15 PM ]

I contend that the behavior broken is, at best, undefined behavior consequent from the implementation and that the failure to cast exception is at least clearer than the silent nil behavior of the original implementation.

I would personally prefer to extend the destructuring checks logically to (cond map? x seq? (hash-map (seq x)) :else (throw "Failed to destructure non-map type") but I think that change is sufficiently large that it would meaningfully decrease the chances of this patch being accepted.





[CLJ-1804] take transducer optimization Created: 25/Aug/15  Updated: 26/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Karsten Schmidt Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: performance, transducers

Attachments: Text File clj-1804-1.patch     Text File clj-1804-2.patch    
Patch: Code
Approval: Triaged

 Description   

A basic refactoring to remove the let form and only requires a single counter check for each iteration, yields an 25% performance increase. With the patch, 2 checks are only required for the last iteration (in case counter arg was <= 0)...

;; master
(quick-bench (into [] (take 1000) (range 2000)))
WARNING: Final GC required 34.82584189073624 % of runtime
Evaluation count : 13050 in 6 samples of 2175 calls.
             Execution time mean : 46.921254 µs
    Execution time std-deviation : 1.904733 µs
   Execution time lower quantile : 45.124921 µs ( 2.5%)
   Execution time upper quantile : 49.427201 µs (97.5%)
                   Overhead used : 2.367243 ns

;; w/ patch
(quick-bench (into [] (take 1000) (range 2000)))
WARNING: Final GC required 34.74448252054369 % of runtime
Evaluation count : 18102 in 6 samples of 3017 calls.
             Execution time mean : 34.301193 µs
    Execution time std-deviation : 1.714105 µs
   Execution time lower quantile : 32.341349 µs ( 2.5%)
   Execution time upper quantile : 37.046851 µs (97.5%)
                   Overhead used : 2.367243 ns


 Comments   
Comment by Karsten Schmidt [ 25/Aug/15 10:35 AM ]

Proposed patch, passes all existing tests

Comment by Alex Miller [ 25/Aug/15 11:28 AM ]

From a superficial skim, I see checks for pos? and then neg? which both exclude 0 and that gives me doubts about that branch. That may actually be ok but I will have to read it more closely.

Comment by Karsten Schmidt [ 25/Aug/15 11:58 AM ]

Hi Alex, try running the tests... AFAICT it's all still working as advertised: For (take 0) or (take -1) then the pos? check fails, but we must ensure to not call rf for that iteration. For all other (take n) cases only the pos? check is executed apart from the last iteration (which causes a single superfluous neg? call) The current path/impl always does 2 checks for all iterations and hence is (much) slower.

Comment by Alex Miller [ 25/Aug/15 7:22 PM ]

The only time that neg? case will be hit is if take is passed n <= 0, so I think this is correct. However, you might consider some different way to handle that particular case - for example, it could be pulled out of the transducer function entirely and be created as a separate transducer function entirely. I'm not sure that's worth doing, but it's an idea.

Comment by Karsten Schmidt [ 26/Aug/15 6:01 AM ]

Good idea, Alex! This 2nd patch removes the neg? check and adds a quick bail transducer for cases where n <= 0. It also made it slightly faster still:

(quick-bench (into [] (take 1000) (range 2000)))
Evaluation count : 20370 in 6 samples of 3395 calls.
             Execution time mean : 30.302673 µs

(now ~35% faster than original)





[CLJ-1778] let-bound namespace-qualified bindings should throw (if not map destructuring) Created: 16/Jul/15  Updated: 24/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Ragnar Dahlen
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File clj-1778-2-with-tests.patch     Text File clj-1778.patch    
Patch: Code and Test
Approval: Screened

 Description   

Seen in a tweet...

user=> (let [a/x 42] x) ; throws CompilerException "Can't let qualified name ..."
user=> (let [a/x 42, [y] [1]] x) ;=> 42

The second one should throw like the first one (also see CLJ-1318 where support for map destructuring of namespaced symbols was added).

Approach: Rather than allowing namespaced symbols to be returned from the map destructuring (the pmap fn), convert those symbols before returning them, so that any namespaced symbols encountered in the main cond of pb are an error and can be handled uniformly.

Patch: clj-1778-2-with-tests.patch

Screened by: Alex Miller



 Comments   
Comment by Ragnar Dahlen [ 16/Jul/15 11:41 AM ]

I can confirm that this is a regression from CLJ-1318. After that change, namespaces were removed from symbols regardless of whether they were used as part of a map destruring or not. The only reason the exception is caught in the first test case is because that binding form has no destructuring at all, in which case the destructuring logic is bypassed.

I've attached a patch that moves the namespace removal (and keyword handling) into `pmap` instead as it's really a special case for map destructuring only.

Comment by Ragnar Dahlen [ 16/Jul/15 3:53 PM ]

Updated patch with two changes:

1. Removed initial check for keywords in binding keys as that only looked for keywords at top-level and failed to catch cases like:

(let [[:x] [1]] x)
;; => 1

Keywords in non-map destructuring binding positions (like the example) will now fail with "Unsupported binding form: :x" instead. This is a change from the current behaviour where only top-level keywords would be caught, but the current exception is the slightly more specific "Unsupported binding key: :x". If a better, more specific exception is required, I'd be happy to update the patch.

2. Tests: added test for regression, updated/amended existing tests.

Comment by Rich Hickey [ 24/Aug/15 9:37 AM ]

Is this the smallest change that can be made?

Comment by Ragnar Dahlen [ 24/Aug/15 4:51 PM ]

I thought so, but in writing this reply I realised a smaller change
might be possible, depending on the desired behaviour/outcome.

CLJ-1318 ("Support destructuring maps with namespaced keywords")
introduced two potentially undesired behaviours:

1. Namespaces were removed from symbols occurring in any binding key
position regardless of whether they were used in map destructuring
or not. This lead to the behaviour initially reported in this
issue; the illusion of being able to bind a qualified name in a
non-map-destructuring context:

user=> (let [a/x 1, [y] [2]] x) ;=> 1

This currently "works" because namespaces are unconditionally removed
from symbols. However:

user=> (let [a/x 42] x) ; throws CompilerException "Can't let qualified name ..."

throws because destructuring logic is bypassed if every binding key is
a symbol.

2. Keywords were allowed in any binding key positions. Keywords are
converted to symbols (retaining namespace) and treated according to
the rules of symbols in the rest of the destructuring logic. From
what I understand the idea was to allow keywords only in map
destructuring, but again the change change was effected for any
binding key. This leads the following:

(let [[:a :b :c] [1 2 3]] a) ;=> 1

A top-level check was put in place to (from my understanding) try and
prevent the usage of keywords in such positions:

(if-let [kwbs (seq (filter #(keyword? (first %)) bents))]
  (throw (new Exception (str "Unsupported binding key: " (ffirst kwbs))))
  (reduce1 process-entry [] bents))

However, this check only checks the top level binding entries, and
fails the recursive nature of destructuring (as demonstrated by
example above).

The current patch makes the following changes:

1. Revert problematic changes introduced by CLJ-1318:
1.1 revert unconditional removal of namespace from any namespaced symbol encountered
1.2 revert acceptence of keywords in any binding key position
1.3 revert insufficient check for keywords used in "illegal" positions.
2. Re-implement support for namespaced symbols, and keywords, but only
in the context of map destructuring.

If that is what we want to accomplish I believe the patch is the
smallest possible. However, if the keyword behaviour is actually
desired (basically allowing keywords being used interchangably with
symbols for binding keys) then the patch could be smaller.

Please excuse the rather lengthy reply.

Comment by Alex Miller [ 24/Aug/15 5:05 PM ]

Ragnar, I do not believe we wish to use keywords interchangeably with symbols for binding keys.

Furthermore, I think this is a good patch that solves the problems introduced in CLJ-1318 (I'll take the blame for those!).





[CLJ-1793] Clear 'this' before calls in tail position Created: 05/Aug/15  Updated: 24/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: None

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Unresolved Votes: 0
Labels: compiler
Environment:

1.8.0-alpha2 - 1.8.0-alpha4


Attachments: Text File 0001-Clear-this-before-calls-in-tail-position.patch     Text File clj-1793-2.patch    
Patch: Code and Test
Approval: Vetted

 Description   

(This ticket started life as CLJ-1250, was committed in 1.8.0-alpha2, pulled out after alpha4, and this is the new version that fixes the logic about whether in a tail call as well as addresses direct linking added in 1.8.0-alpha3.)

Problem: Original example was with reducers holding onto the head of a lazy seq:

(time (reduce + 0 (map identity (range 1e8))))    ;; works
(time (reduce + 0 (r/map identity (range 1e8))))  ;; oome from holding head of range

Example of regression from CLJ-1250 that now works correctly (doesn't clear this in nested loop) with CLJ-1793:

(let [done (atom false)
        f (future-call
            (fn inner []
              (while (not @done)
                (loop [found []]
                  (println (conj found 1))))))]
    (doseq [elem [:a :b :c :done]]
      (println "queue write " elem))
    (reset! done true)
    @f)

Approach: When invoking a method in a tail call, clear this prior to invoking.

The criteria for when a tail call is a safe point to clear 'this':

1) Must be in return position
2) Not in a try block (might need 'this' during catch/finally)
3) When not direct linked

Return position (#1) isn't simply (context == C.RETURN) because loop bodies are always parsed in C.RETURN context

A new dynvar METHOD_RETURN_CONTEXT tracks whether an InvokeExpr in tail position can directly leave the body of the compiled java method. It is set to RT.T in the outermost parsing of a method body and invalidated (set to null) when a loop body is being parsed where the context for the loop expression is not RETURN parsed. Added clear in StaticInvokeExpr as that is now a thing with direct linking again.

Removes calls to emitClearLocals(), which were a no-op.

Patch: clj-1793-2.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 05/Aug/15 12:16 PM ]

The this ref is cleared prior to the println, but the next time through the while loop it needs the this ref to look up the closed over done field (via getfield).

Adding an additional check to the inTailCall() method to not include tail call in a loop addresses this case:

static boolean inTailCall(C context) {
-    return (context == C.RETURN) && (IN_TRY_BLOCK.deref() == null);
+    return (context == C.RETURN) && (IN_TRY_BLOCK.deref() == null) && (LOOP_LOCALS.deref() == null);
}

But want to check some more things before concluding that's all that's needed.

Comment by Alex Miller [ 05/Aug/15 1:36 PM ]

This change undoes the desired behavior in the original CLJ-1250 (new tests don't pass). For now, we are reverting the CLJ-1250 patch in master.

Comment by Ghadi Shayban [ 05/Aug/15 3:12 PM ]

Loop exit edges are erroneously being identified as places to clear 'this'. Only exits in the function itself or the outermost loop are safe places to clear.

Comment by Ghadi Shayban [ 05/Aug/15 8:43 PM ]

Patch addresses this bug and the regression in CLJ-1250.

See the commit message for an extensive-ish comment.

Comment by Alex Miller [ 18/Aug/15 12:33 PM ]

New patch is same as old, just adds jira id to beginning of commit message.

Comment by Rich Hickey [ 24/Aug/15 10:00 AM ]

Not doing this for 1.8, more thought needs to go into whether this is the right solution to the problem. And, what is the problem? This title of this patch is just something to do.

Comment by Alex Miller [ 24/Aug/15 10:21 AM ]

changing to vetted so this is at a valid place in the jira workflow

Comment by Ghadi Shayban [ 24/Aug/15 10:45 AM ]

Rich the original context is in CLJ-1250 which was a defect/problem. It was merged and revert because of a problem in the impl. This ticket is the continuation of the previous one, but unfortunately the title lost the context and became approach-oriented and not problem-oriented. Blame Alex. (I kid, it's an artifact of the mutable approach to issue management.)





[CLJ-1523] Add 'doseq' like macro for transducers Created: 08/Sep/14  Updated: 22/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Trivial
Reporter: Timothy Baldridge Assignee: Unassigned
Resolution: Unresolved Votes: 2
Labels: None

Attachments: File doreduced2.diff     File doreduced.diff    
Patch: Code and Test
Approval: Triaged

 Description   

Doseq is currently a good way to execute a lazy sequence and perform side-effects. It would be nice to have a matching macro for transducers.

Approach: The included patch simply calls transduce with the provided xform, collection, and a reducing function that throws away the accumulated value at each step. The value from each reducing step is bound to the provided symbol. A shorter arity is provided for those cases when no xform is desired, but fast doseq-like semantics are still wanted.

Patch: doreduced2.diff



 Comments   
Comment by Jozef Wagner [ 09/Sep/14 4:19 AM ]

How about making xform parameter optional? And you have a typo in docstring example, doseq -> doreduced.

Comment by Timothy Baldridge [ 09/Sep/14 7:52 AM ]

Good point, fixed typeo, added other arity.





[CLJ-1802] StackTraceElement's FileName null Created: 21/Aug/15  Updated: 21/Aug/15  Resolved: 21/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5, Release 1.6, Release 1.7
Fix Version/s: None

Type: Defect Priority: Critical
Reporter: Amir Assignee: Unassigned
Resolution: Not Reproducible Votes: 0
Labels: None
Environment:

linux



 Description   

I tried DBAppender from logback but after logging 16-20 events no event was being logged in Database, when i debuged the app i came to know that FileName from StackTraceElement is null which is raising the following exception

java.sql.SQLIntegrityConstraintViolationException: ORA-01400: cannot insert NULL into ("XYZ"."LOGGING_EVENT"."CALLER_FILENAME")

I made a simple application in java and issue was not present there.

The DBAppender of lobback fills information about log event in method bindCallerDataWithPreparedStatement.

i suspect clojure is not passing the information completely sometimes. in this case when i call the following

(dotimes[x 100]
(logr/infor "abcd" x))

please keep in mind it happens when using a marker

<appender name="REPORTING-DB" class="ch.qos.logback.classic.db.DBAppender">
<filter class="ch.qos.logback.core.filter.EvaluatorFilter">
<evaluator class="ch.qos.logback.classic.boolex.OnMarkerEvaluator">
<marker>DB_REPORT</marker>
</evaluator>
<onMismatch>DENY</onMismatch>
<onMatch>ACCEPT</onMatch>
</filter>

<connectionSource class="ch.qos.logback.core.db.DataSourceConnectionSource">
<dataSource class="com.zaxxer.hikari.HikariDataSource">
<!-- <dataSource class="com.mchange.v2.c3p0.ComboPooledDataSource"> -->
<driverClassName>oracle.jdbc.driver.OracleDriver</driverClassName>
<jdbcUrl>jdbc:oracle:thin:xxxx/xxxx@xdomain:1521:EDB
</jdbcUrl>
<user>xxxx</user>
<password>xxxx</password>
</dataSource>
</connectionSource>
</appender>

<appender name="FILE" class="ch.qos.logback.core.FileAppender">
<file>test.log</file>
<append>true</append>
<encoder>
<pattern>%-4relative [%thread] %-5level %logger{35} - %msg%n
</pattern>
</encoder>
</appender>

<root level="INFO">
<appender-ref ref="REPORTING-DB" />
<appender-ref ref="FILE" />
</root>



 Comments   
Comment by Amir [ 21/Aug/15 5:27 AM ]

(defmacro infor [ & args ]
`(let logger# (impl/get-logger logging/*logger-factory* ~*ns*)
(println (class logger#))
(when (impl/enabled? logger# :info)
(.info logger# DB_REPORT (print-str ~@args)))))

Comment by Alex Miller [ 21/Aug/15 7:15 AM ]

First, there is not enough info here to reproduce the problem apart from the logging framework and oracle db.

Second, StackTraceElement's getFileName() is allowed to return null, so this seems like a faulty assumption in the database.

If there is a specific example (with code to reproduce) where a stack trace is missing a file name, but one could have been provided, then this could be reopened.





[CLJ-1780] Test OOME from locals clearing change Created: 17/Jul/15  Updated: 21/Aug/15  Resolved: 21/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Alex Miller Assignee: Unassigned
Resolution: Not Reproducible Votes: 0
Labels: regression
Environment:

1.8.0-alpha2


Attachments: Text File strengthen-clearing-test.patch    
Approval: Triaged

 Description   

IBM JDK 1.6 in test matrix is throwing errors from the new test for CLJ-1250 in 1.8.0-alpha2.

[java] ERROR in (test-closed-over-clearing) (Range.java:133)
     [java] expected: (number? (reduce + 0 (r/map identity (range 1.0E8))))
     [java]   actual: java.lang.OutOfMemoryError: null
     [java]  at clojure.lang.Range.forceChunk (Range.java:133)
     [java]     clojure.lang.Range.chunkedFirst (Range.java:150)
     [java]     clojure.core$chunk_first.invoke (core.clj:667)
     [java]     clojure.core.protocols/fn (protocols.clj:136)
     [java]     clojure.core.protocols$fn__6478$G__6473__6487.invoke (protocols.clj:19)
     [java]     clojure.core.protocols$seq_reduce.invoke (protocols.clj:31)
     [java]     clojure.core.protocols/fn (protocols.clj:95)
     [java]     clojure.core.protocols$fn__6452$G__6447__6465.invoke (protocols.clj:13)
     [java]     clojure.core.reducers$folder$reify__21452.coll_reduce (reducers.clj:126)
     [java]     clojure.core$reduce.invoke (core.clj:6519)
     [java]     clojure.test_clojure.reducers/fn (reducers.clj:95)
     [java]     clojure.test$test_var$fn__7669.invoke (test.clj:703)
     [java]     clojure.test$test_var.invoke (test.clj:703)
     [java]     clojure.test$test_vars$fn__7691$fn__7696.invoke (test.clj:721)
     [java]     clojure.test$default_fixture.invoke (test.clj:673)
     [java]     clojure.test$test_vars$fn__7691.invoke (test.clj:721)
     [java]     clojure.test$default_fixture.invoke (test.clj:673)
     [java]     clojure.test$test_vars.invoke (test.clj:717)
     [java]     clojure.test$test_all_vars.invoke (test.clj:727)
     [java]     clojure.test$test_ns.invoke (test.clj:746)
     [java]     clojure.core$map$fn__4553.invoke (core.clj:2624)
     [java]     clojure.lang.LazySeq.sval (LazySeq.java:40)
     [java]     clojure.lang.LazySeq.seq (LazySeq.java:49)
     [java]     clojure.lang.Cons.next (Cons.java:39)
     [java]     clojure.lang.RT.next (RT.java:681)
     [java]     clojure.core/next (core.clj:64)
     [java]     clojure.core$reduce1.invoke (core.clj:909)
     [java]     clojure.core$reduce1.invoke (core.clj:900)
     [java]     clojure.core$merge_with.doInvoke (core.clj:2936)
     [java]     clojure.lang.RestFn.applyTo (RestFn.java:139)
     [java]     clojure.core$apply.invoke (core.clj:632)
     [java]     clojure.test$run_tests.doInvoke (test.clj:761)
     [java]     clojure.lang.RestFn.applyTo (RestFn.java:137)
     [java]     clojure.core$apply.invoke (core.clj:630)
     [java]     user$eval28810.invoke (run_test.clj:8)
     [java]     clojure.lang.Compiler.eval (Compiler.java:6850)


 Comments   
Comment by Nicola Mometto [ 18/Jul/15 12:02 AM ]

I don't understand how forcing a 32 element chunk could consume the memory. If locals clearing worked properly there should be no other part of that sequence in memory but I might be missing some detail.

Might there be something going on with the recent change in impl for Range?

Comment by Ghadi Shayban [ 18/Jul/15 12:19 AM ]

An interesting note is that (reduce + 0 (r/map identity (range 1e8))) is going through the seq path, despite reducers && reducible range. This is because coll-reduce doesn't prefer IReduceInit.

The CLJ-1250 test should be modified to intentionally hold the head of a seq in order to exercise the locals clearing. A good hypothesis from Alex is that GC is a bit slower on the archaic IBM JDK.

Comment by Ghadi Shayban [ 18/Jul/15 1:21 AM ]

I can't reproduce this on Linux x86-64 with IBM JDK and an artificially tiny heap.

[ghadi@amdhex ibm-java-x86_64-60]$ ./bin/java -Xmx128m -cp $HOME/jsr166y-1.7.0.jar:$HOME/clojure-1.8.0-master-SNAPSHOT.jar clojure.main
Clojure 1.8.0-master-SNAPSHOT
user=> (require '[clojure.core.reducers :as r])
nil
user=> (number? (reduce + 0 (r/map identity (range 1e8))))
true

Can you post details on the IBM JDK environment in CI? Need point release, heap size, kernel, distro, and jvm opts.

I've strengthened the CLJ-1250 test case by relying on neither reducer impl nor range impl, and I reverified that the bug is in fact present on <1.7 and gone on -master using Oracle JDK and 128m heap.

Comment by Alex Miller [ 18/Jul/15 8:52 AM ]

OS is CentOS 5.5 afaict

JDK details:

java version "1.6.0"
Java(TM) SE Runtime Environment (build pxa6460sr9fp2-20110625_01(SR9 FP2))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr9-20110624_85526 (JIT enabled, AOT enabled)
J9VM - 20110624_085526
JIT - r9_20101028_17488ifx17
GC - 20101027_AA)
JCL - 20110530_01

As far as I can tell, nothing is being done to alter the default heap size or other jvm opts during the build.

Comment by Ghadi Shayban [ 18/Jul/15 2:27 PM ]

The IBM JDK6 version I used was:

java version "1.6.0"
Java(TM) SE Runtime Environment (build pxa6460sr16fp7-20150708_01(SR16 FP7))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr16fp7-20150701_255724 (JIT enabled, AOT enabled)
J9VM - 20150701_255724
JIT - r9_20150630_95420
GC - GA24_Java6_SR16_20150701_1008_B255724)
JCL - 20150628_01

Seems like SR16 FP7 == 6.0.16.7, and the one on the CI build is SR9 FP2 == 6.0.9.2, a four or five year difference between point releases.

Comment by Ghadi Shayban [ 18/Jul/15 2:51 PM ]

OK I found a download for the (old) IBM JDK 6.0.9.2 and installed it on Linux x86-64. I cannot reproduce the bug.

Comment by Alex Miller [ 18/Jul/15 3:53 PM ]

I'd be happy to update the ibm jdk 1.6 version and n the build box too to see what happens.

Comment by Alex Miller [ 21/Aug/15 3:09 AM ]

Since the CLJ-1250 patch has been reverted, this is no longer failing in the build and I'm going to close it for now. Will reopen if necessary when CLJ-1793 (the updated version of CLJ-1250) goes in.





[CLJ-304] clojure.repl/source does not work with deftype Created: 20/Apr/10  Updated: 21/Aug/15

Status: In Progress
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Assembla Importer Assignee: Unassigned
Resolution: Unresolved Votes: 20
Labels: repl

Approval: Triaged

 Description   

clojure.repl/source does not work on a deftype

user> (deftype Foo [a b])
user.Foo
user> (source Foo)
Source not found

Cause: deftype creates a class but not a var so no file/line info is attached anywhere.

Approach:

Patch:

Screened by:



 Comments   
Comment by Assembla Importer [ 24/Aug/10 4:38 PM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/304

Comment by Assembla Importer [ 24/Aug/10 4:38 PM ]

chouser@n01se.net said: That's a great question. get-source just needs a file name and line number.

If IMeta were a protocol, it could be extended to Class. That implementation could look for a "well-known" static field, perhaps? __clojure_meta or something? Then deftype would just have to populate that field, and get-source would be all set.

Does that plan have any merit? Is there a better place to store a file name and line number?

Comment by Assembla Importer [ 24/Aug/10 4:38 PM ]

stu said: Seems like a reasonable idea, but this is going to get back-burnered for now, unless there is a dire use case we have missed.

Comment by Gary Trakhman [ 19/Feb/14 10:31 AM ]

I could use this for cider's file/line jump-around mechanism as well.

With records, I can work around it by deriving and finding the corresponding constructor var, but it's a bit nasty.

Comment by Bozhidar Batsov [ 03/Mar/14 6:37 AM ]

I'd also love to see this fixed.

Comment by Andy Fingerhut [ 03/Mar/14 8:33 AM ]

Bozhidar, voting on a ticket (clicking the Vote link in the right of the page when viewing the ticket) can help push it upwards on listings of tickets by # of votes.

Comment by Bozhidar Batsov [ 19/Sep/14 1:17 PM ]

Andy, thanks for the pointer. They should have made this button much bigger, I hadn't noticed it all until now.





[CLJ-1522] Enhance multimethods metadata Created: 08/Sep/14  Updated: 21/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Bozhidar Batsov Assignee: Unassigned
Resolution: Unresolved Votes: 17
Labels: metadata

Approval: Triaged

 Description   

I think that multimethod metadata can be extended a bit with some property indicating the var in question is referring to a multimethod (we have something similar for macros) and some default arglists property.

I'm raising this issue because as a tool writer (CIDER) I'm having hard time determining if something is a multimethod (I have to resort to code like (instance? clojure.lang.MultiFn obj) which is acceptable, but not ideal I think (compared to macros and special forms)). There's also the problem that I cannot provide the users with eldoc (function signature) as it's not available in the metadata (this issue was raised on the mailing list as well https://groups.google.com/forum/#!topic/clojure/crje_RLTWdk).

I feel that we really have a problem with the missing arglist and we should solve it somehow. I'm not sure I'm suggesting the best solution and I'll certainly take any solution.



 Comments   
Comment by Bozhidar Batsov [ 09/Sep/14 4:24 AM ]

Btw, I failed to mention this as I thought it was obvious, but I think we should use the dispatch function's arglist in the multimethod metadata.





[CLJ-1800] Doc that lazy-seq with-meta forces realization Created: 13/Aug/15  Updated: 19/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Max Penet Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: docstring

Attachments: Text File CLJ-1800-no-realize-v1.patch     Text File CLJ-1800-v2.patch    
Approval: Triaged

 Description   

Applying meta to a lazy-seq causes realization:

(def x (vary-meta (lazy-seq (prn :realized)) assoc :foo :bar))
:realized

This might be surprising, so modify docstring of lazy-seq to mention it.

Patch:



 Comments   
Comment by Alex Miller [ 13/Aug/15 9:02 AM ]

I think it's likely that seq() is called here so that the old LazySeq instance and the new one share the sequence. Otherwise the pre-meta and post-meta versions would be performing the same function calls on the same inputs but would be disconnected, which seems bad.

Comment by Alex Miller [ 13/Aug/15 9:03 AM ]

I'm not really sure where this would be documented. Maybe on the http://clojure.org/metadata page?

Comment by Max Penet [ 13/Aug/15 9:18 AM ]

That would make sense yes and on the docstring of lazy-seq as well.

Comment by Alex Miller [ 13/Aug/15 9:47 AM ]

I added a sentence to the metadata page and updated the description to be more applicable here to a docstring change.

Comment by Michael Blume [ 13/Aug/15 1:29 PM ]

With this patch, with-meta doesn't realize the seq, but realization still only happens once – would this be an acceptable approach?

Comment by Michael Blume [ 19/Aug/15 4:46 PM ]

Added test





[CLJ-888] defprotocol should throw error when signatures include variable number of parameters Created: 29/Nov/11  Updated: 17/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Greg Chapman Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: errormsgs, protocols

Attachments: Text File 0001-Forbid-vararg-declaration-in-defprotocol-definterfac.patch     Text File CLJ-888-v2.patch    
Patch: Code and Test
Approval: Triaged

 Description   

I tried to use & in the signature for a method in defprotocol. Apparently (see below), this is compiled so that & becomes a simple parameter name, and there is no special handling for variable number of parameters. I think the use of & in a protocol signature ought to be detected and immediately cause an exception (I also think this restriction on the signatures ought to be documented; I couldn't find it specified in the current documentation, though of course it is implied (as I later realized) by the fact that defprotocol creates a Java interface).

user=> (defprotocol Applier (app [this f & args]))
Applier
user=> (deftype A [] Applier (app [_ f & args] (prn f & args) (apply f args)))
user.A
user=> (app (A.) + 1 2)
#<core$PLUS clojure.core$PLUS@5d9d0d20> 1 2
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long
clojure.lang.RT.seqFrom (RT.java:487)



 Comments   
Comment by Alex Coventry [ 21/Oct/13 4:21 PM ]

Patch with test code attached. I have it throwing a CompilerException so that it shows source code location. Not sure whether this is kosher in clojure code, but I wish more macros provided this in their error handling.

Comment by Tassilo Horn [ 22/Oct/13 6:26 AM ]

This issue has already been discussed in CLJ-1024. There I provided a patch that forbids varargs and destructuring forms at various places including defprotocol/definterface. My patch had been applied shortly before clojure 1.5 was released, but it had a bug (forbid too many uses), so it got reverted and the bug closed and declined.

I was told to bring up the issue again after 1.5 has been released.

So here is my patch again. This time it's much more relaxed and only forbids varargs in defprotocol/definterface method declarations, and in deftype/defrecord and reify method implementations.

Comment by Alex Coventry [ 22/Oct/13 7:30 AM ]

Thanks, Tassilo. If there's anywhere in the JIRA system where I could check for prior work like that for other similar issues, I'd be grateful for a pointer.

Best regards,
Alex

Comment by Tassilo Horn [ 22/Oct/13 7:39 AM ]

New version of my patch.

Now I use a CompilerException with proper file/line/column information like Alex did. I also added his test case (which passes).

Concerning your question, Alex: a search for "varargs" would have listed CLJ-1024, but probably you wouldn't have looked into it anyway, because it's a closed issue...

Comment by Tassilo Horn [ 22/Oct/13 7:44 AM ]

Alex, if you don't object could we remove your patch in favor of mine which covers a bit more cases?

Comment by Alex Coventry [ 22/Oct/13 10:57 AM ]

Yep. Just read through 1024 and the associated mailing list discussion. You should totally get the credit: Your patch is more comprehensive and you have been on this a long time. Thanks for folding in the good parts of my patch.

Best regards,
Alex

Comment by Tassilo Horn [ 22/Oct/13 12:15 PM ]

Ok, great.

It seems I don't have the permissions to delete other peoples' attachments, so could you please delete your patch yourself?

Comment by Alex Coventry [ 23/Oct/13 2:44 PM ]

Sure, Tassilo. It's done.

I think this also needs a regression test for the case hugod originally pointed out. I initially made the same mistake as you there, but amalloy pointed it out[1] before I submitted the patch, so it is a natural mistake to make and should probably be documented in the source code.

Best regards,
Alex

[1] http://logs.lazybot.org/irc.freenode.net/%23clojure/2013-10-21.txt search for 14:48:34.

Comment by Tassilo Horn [ 24/Oct/13 2:00 AM ]

Alex, I've added the regression test you suggested. Thanks for pointing that out.

Also, I added tests checking definterface method declarations, and tests checking inline method implementations made with defrecord, deftype, and reify.

However, there's a problem with the tests for deftype and reify I don't know how to fix. When I eval the macroexpand forms used in the tests in a REPL, I can see that the CompilerException is successfully thrown and printed. But it also seems to be caught somewhere in the middle, so that the macroexpand returns a form and the exception doesn't make it to the (is (thrown? ...)). Therefore, I've commented the these tests and added a big FIXME.

Comment by Tassilo Horn [ 24/Oct/13 2:28 AM ]

New version of the patch with now all tests uncommented and passing. Andy Fingerhut made me aware that for the 4 deftype and reify tests, I need eval instead of just macroexpand.

Comment by Andy Fingerhut [ 25/Oct/13 6:25 PM ]

I have not investigated the reason yet, but patch 0001-Forbid-vararg-declaration-in-defprotocol-definterfac.patch no longer applies cleanly after the latest commits to Clojure master on Oct 25 2013.

Comment by Tassilo Horn [ 28/Oct/13 2:21 AM ]

I've rebased the patch onto the current master so that it applies cleanly again.

Comment by Tassilo Horn [ 28/Oct/13 2:25 AM ]

Stu, I've assigned this issue to you because you've been assigned to CLJ-1165 which I have closed as duplicate of this issue.

One minor difference between my patch to this issue and CLJ-1165 is that here I use a CompilerException with file/line/column info whereas in CLJ-1165 I've used `ex-info`. I think the CE is more appropriate/informative, as the error is already triggered during macro expansion.

Comment by Michael Blume [ 16/Aug/15 4:36 PM ]

Rebased onto master

Comment by Michael Blume [ 17/Aug/15 12:46 PM ]

Be nice if we got this merged – I just got a pull request using a varargs protocol that seemed to work by accident because the only time the varargs arity was called, it was called with two additional arguments, matching the '& and the 'args in the interface. Confused the hell out of me before I worked out what was going on.





[CLJ-1493] Fast keyword intern Created: 06/Aug/14  Updated: 14/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: dennis zhuang Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: keywords, performance
Environment:

Mac OS X 10.9.4 / 2.6 GHz Intel Core i5 / 8 GB 1600 MHz DDR3
java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)


Attachments: File fast_keyword_intern.diff    
Patch: Code
Approval: Triaged

 Description   

Keyword's intern(Symbol) method uses recursive invocation to get a valid keyword instance.I think it can be rewrite into a 'for loop'
to reduce method invocation cost.
So i developed this patch, and make some simple benchmark.Run the following command line three times after 'ant jar':

java -Xms64m -Xmx64m -cp test:clojure.jar clojure.main -e "(time (dotimes [n 10000000] (keyword (str n))))"

Before patched:

"Elapsed time: 27343.827 msecs"
"Elapsed time: 26172.653 msecs"
"Elapsed time: 25673.764 msecs"

After patched:

"Elapsed time: 24884.142 msecs"
"Elapsed time: 23933.423 msecs"
"Elapsed time: 25382.783 msecs"

It looks the patch make keyword's intern a little more fast.

The patch is attached and test.

Thanks.

P.S. I've signed the contributor agreement, and my email is killme2008@gmail.com .



 Comments   
Comment by Alex Miller [ 07/Aug/14 9:01 AM ]

Looks intriguing (and would be a nice change imo). I ran this on a json parsing benchmark I used for the keyword changes and saw ~3% improvement.

Comment by dennis zhuang [ 07/Aug/14 9:54 PM ]

Updated the patch, remove the 'k == null' clause in for loop,it's not necessary.

Comment by Andy Fingerhut [ 11/Aug/14 1:29 AM ]

Dennis, while JIRA can handle multiple patches with the same name, it can be confusing for people discussing the patches, and for some scripts I have to evaluate them. Please consider giving the patches different names (e.g. with version numbers in them), or removing older ones if they are obsolete.

Comment by dennis zhuang [ 11/Aug/14 9:19 AM ]

Hi,andy

Thank you for reminding me.I deleted the old patch.

Comment by dennis zhuang [ 11/Sep/14 10:34 AM ]

I am glad to see it is helpful.I benchmark the patch with current master branch,it's fine too.

Comment by dennis zhuang [ 14/Aug/15 9:12 AM ]

Is this patch can be merged? Or is it rejected?

Comment by Alex Miller [ 14/Aug/15 9:41 AM ]

As a minor enhancement, this patch has not yet been high enough priority to be considered yet.

Comment by dennis zhuang [ 14/Aug/15 11:31 AM ]

All right.Hope to merge it.Thanks.





[CLJ-1799] Replace refs in pprint Created: 13/Aug/15  Updated: 14/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: performance, print

Attachments: Text File CLJ-1799.patch    

 Description   

I noticed that pprint uses refs and dosync transactions in a number of places, which seems unlikely to be necessary. It seems like these could be replaced by atoms, or even volatiles, given that printing typically happens in a single thread. Presumably this would improve performance of pprint significantly.



 Comments   
Comment by dennis zhuang [ 14/Aug/15 11:28 AM ]

I develop a patch to fix this issue.I run all the tests in clojure and clojure.data.json, and no one fails.

Use criterium to do a simple benchmark as below:

(use 'criterium.core)
(require '[clojure.data.json :as json])
(bench (json/write-str 
  {:a 1 :b 2 :c (range 10) :d "hello world"
   :e (apply hash-set (range 10))}))

before patch:

Evaluation count : 6180060 in 60 samples of 103001 calls.
             Execution time mean : 10.302604 µs
    Execution time std-deviation : 597.958933 ns
   Execution time lower quantile : 9.631444 µs ( 2.5%)
   Execution time upper quantile : 11.618551 µs (97.5%)
                   Overhead used : 1.724553 ns

After patch:

Evaluation count : 6000900 in 60 samples of 100015 calls.
             Execution time mean : 10.212543 µs
    Execution time std-deviation : 564.874941 ns
   Execution time lower quantile : 9.528383 µs ( 2.5%)
   Execution time upper quantile : 11.334033 µs (97.5%)
                   Overhead used : 1.827143 ns




[CLJ-1453] Most Iterator implementations do not correctly implement next failing to throw the required NoSuchElementException Created: 24/Jun/14  Updated: 14/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Meikel Brandmeyer Assignee: Alex Miller
Resolution: Unresolved Votes: 3
Labels: interop

Attachments: Text File 0001-Fix-iterator-implementations-to-throw-NSEE-when-exha.patch     Text File 0001-Throw-NSEE-in-gvec-iterator.patch     Text File clj-1453-2.patch     Text File CLJ-1453.patch     Text File CLJ-1453-tests.patch    
Patch: Code and Test
Approval: Incomplete

 Description   

Iterators on Clojure's collections should follow the expected JDK behavior of throwing NoSuchElementException on next() when an iterator is exhausted. Current collections have a variety of other behaviors.

Issue encountered in real world code using http://pipes.tinkerpop.com.

To reproduce:

(-> [] .iterator .next)

This throws a NPE instead of NSEE.

(doto (.iterator [1 2]) .next .next .next)

This throws an ArrayIndexOutOfBoundsException instead of NSEE.

The attached patch fixes the methods by adding a check for hasNext before actually trying to provide the next element. If there is no next element the correct exception is thrown.

Performance:

(use 'criterium.core)

(def v (vec (range 10000)))
(def sm (into (sorted-map) (zipmap (range 10000) (range 10000))))
(def pv (apply vector-of :long (range 10000)))

(defn consume-iter [^Iterable src]
  (let [^java.util.Iterator i (.iterator ^Iterable src)]
    (loop [c []]
      (if (.hasNext i)
        (recur (conj c (.next i)))
        c))))

(bench (consume-iter v))
(bench (consume-iter sm))
(bench (consume-iter pv)
coll 1.8.0-alpha4 +patch
v 300 us 335 us
sm 776 us 784 us
pv 519 us 508 us

Patch: clj-1453-2.patch and tests: CLJ-1453-tests.patch

Screened by: Alex Miller



 Comments   
Comment by Tim McCormack [ 15/Jul/14 9:56 PM ]

To establish a baseline, this piece of code checks all the iterators I could find within Clojure 1.6.0 and makes sure they throw an appropriate exception:

iterator-checker.clj
(defn next-failure
  []
  (let [ok (atom [])]
    (doseq [[tp v]
            (sorted-map
             :vec-0 []
             :vec-n [1 2 3]
             :vec-start (subvec [1 2 3 4] 0 1)
             :vec-end (subvec [1 2 3 4] 3 4)
             :vec-ls-0 (.listIterator [])
             :vec-ls-n (.listIterator [1 2 3])
             :vec-start-ls (.listIterator (subvec [1 2 3 4] 0 1))
             :vec-end-ls (.listIterator (subvec [1 2 3 4] 3 4))
             :seq ()
             :list-n '(1 2 3)
             :set-hash-0 (hash-set)
             :set-hash-n (hash-set 1 2 3)
             :set-sor-0 (sorted-set)
             :set-sor-n (sorted-set 1 2 3)
             :map-arr-0 (array-map)
             :map-arr-n (array-map 1 2, 3 4)
             :map-hash-0 (hash-map)
             :map-hash-n (hash-map 1 2, 3 4)
             :map-sor-n (sorted-map)
             :map-sor-n (sorted-map 1 2, 3 4)
             :map-sor-ks-0 (.keys (sorted-map))
             :map-sor-ks-n (.keys (sorted-map 1 2, 3 4))
             :map-sor-vs-0 (.vals (sorted-map))
             :map-sor-vs-n (.vals (sorted-map 1 2, 3 4))
             :map-sor-rev-0 (.reverseIterator (sorted-map))
             :map-sor-rev-n (.reverseIterator (sorted-map 1 2, 3 4))
             :map-ks-0 (.keySet {})
             :map-ks-n (.keySet {1 2, 3 4})
             :map-vs-0 (.values {})
             :map-vs-n (.values {1 2, 3 4})
             :gvec-int-0 (vector-of :long)
             :gvec-int-n (vector-of :long 1 2 3))]
      (let [it (if (instance? java.util.Iterator v)
                 v
                 (.iterator v))]
        (when-not it
          (println "Null iterator:" tp))
        (try (dotimes [_ 50]
               (.next it))
             (catch java.util.NoSuchElementException nsee
               (swap! ok conj tp))
             (catch Throwable t
               (println tp "threw" (class t))))))
    (println "OK:" @ok)))

The output as of current Clojure master (201a0dd970) is:

:gvec-int-0 threw java.lang.IndexOutOfBoundsException
:gvec-int-n threw java.lang.IndexOutOfBoundsException
:map-arr-0 threw java.lang.ArrayIndexOutOfBoundsException
:map-arr-n threw java.lang.ArrayIndexOutOfBoundsException
:map-hash-0 threw java.lang.ArrayIndexOutOfBoundsException
:map-ks-0 threw java.lang.ArrayIndexOutOfBoundsException
:map-ks-n threw java.lang.ArrayIndexOutOfBoundsException
:map-sor-ks-0 threw java.util.EmptyStackException
:map-sor-ks-n threw java.util.EmptyStackException
:map-sor-n threw java.util.EmptyStackException
:map-sor-rev-0 threw java.util.EmptyStackException
:map-sor-rev-n threw java.util.EmptyStackException
:map-sor-vs-0 threw java.util.EmptyStackException
:map-sor-vs-n threw java.util.EmptyStackException
:map-vs-0 threw java.lang.ArrayIndexOutOfBoundsException
:map-vs-n threw java.lang.ArrayIndexOutOfBoundsException
:vec-0 threw java.lang.NullPointerException
:vec-end threw java.lang.ArrayIndexOutOfBoundsException
:vec-end-ls threw java.lang.IndexOutOfBoundsException
:vec-ls-0 threw java.lang.IndexOutOfBoundsException
:vec-ls-n threw java.lang.IndexOutOfBoundsException
:vec-n threw java.lang.ArrayIndexOutOfBoundsException
:vec-start threw java.lang.ArrayIndexOutOfBoundsException
:vec-start-ls threw java.lang.IndexOutOfBoundsException
OK: [:list-n :map-hash-n :seq :set-hash-0 :set-hash-n :set-sor-0 :set-sor-n]
Comment by Tim McCormack [ 15/Jul/14 9:57 PM ]

0001-Fix-iterator-implementations-to-throw-NSEE-when-exha.patch missed one thing: clojure.gvec. With the patch in place, my checker outputs the following:

:gvec-int-0 threw java.lang.IndexOutOfBoundsException
:gvec-int-n threw java.lang.IndexOutOfBoundsException
OK: [:list-n :map-arr-0 :map-arr-n :map-hash-0 :map-hash-n :map-ks-0 :map-ks-n :map-sor-ks-0 :map-sor-ks-n :map-sor-n :map-sor-rev-0 :map-sor-rev-n :map-sor-vs-0 :map-sor-vs-n :map-vs-0 :map-vs-n :seq :set-hash-0 :set-hash-n :set-sor-0 :set-sor-n :vec-0 :vec-end :vec-end-ls :vec-ls-0 :vec-ls-n :vec-n :vec-start :vec-start-ls]

That should be a quick fix.

Comment by Michał Marczyk [ 15/Jul/14 10:01 PM ]

CLJ-1416 includes a fix for gvec's iterator impls (and some other improvements to interop).

Comment by Tim McCormack [ 15/Jul/14 10:17 PM ]

Attaching a fix for the gvec iterator. Combined with the existing patch, this fixes all broken iterators that I could find.

Comment by Andy Fingerhut [ 07/Aug/14 10:25 AM ]

I believe this Clojure commit: https://github.com/clojure/clojure/commit/e7215ce82215bda33f4f0887cb88570776d558a0

introduces more implementations of the java.util.Iterator interface where next() returns null instead of throwing a NoSuchElementException

Comment by Alex Miller [ 29/Apr/15 11:34 AM ]

Would love to have a patch that:
1) combined patches to date
2) updated to master
3) reviewed for new iterators since this ticket was written
4) added the tests in the comments

Comment by Alex Miller [ 18/Jun/15 3:30 PM ]

Bump - would be happy to screen this for 1.8 if my last comments were addressed.

Comment by Andrew Rosa [ 18/Jul/15 4:38 PM ]

Alex Miller Addressed you comments on the first patch. On the process of applying them to master, I've changed the code to follow a single "format" of bound checking and throw, so everything will be more consistent with the code style of Clojure codebase. The commit themselves still maintain the original authors, to give correct credit.

The second patch includes only the tests for these features, which I used test.check to do them, as I talked to you on clojure-dev channel. If you think they are too much complex or prefer a different style of testing, please let me know - that's why I made a separate patch only for the tests.

Comment by Alex Miller [ 18/Jul/15 5:04 PM ]

Thanks Andrew. It will likely be a couple weeks before I have time to look at this.

Comment by Alex Miller [ 30/Jul/15 11:02 PM ]

clj-1453-2.patch squashes CLJ-1453.patch and updates to current master.

Comment by Rich Hickey [ 08/Aug/15 9:53 AM ]

have we looked at the perf impacts of the redundant hashNext calls? I understand hasNext should be idempotent, but could do work. An alternative would be putting in try and throwing NSE





[CLJ-1801] Boxed Boolean values break `if` special form Created: 13/Aug/15  Updated: 13/Aug/15  Resolved: 13/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Critical
Reporter: Adrian Medina Assignee: Unassigned
Resolution: Declined Votes: 1
Labels: None


 Description   

We came across this in production code where at some point a value in a map (e.g., {:some-key false}) was being used as the test form of a conditional statement which was evaluating counter-intuitively. After much bewilderment, we figured out that it was because somewhere along the line the literal false value was being boxed. You can reproduce this by evaluating the following form in a REPL:

(if (Boolean. false)
true
false)

=> true



 Comments   
Comment by Ghadi Shayban [ 13/Aug/15 3:05 PM ]

Not a bug in Clojure. See [1] Make sure the library you are using does the proper canonicalization of boolean [2].

[1] http://clojure.org/special_forms#toc2

[2] https://github.com/ghadishayban/squee/blob/master/src/squee/impl/resultset.clj#L66-L69

Comment by Alex Miller [ 13/Aug/15 3:08 PM ]

Long ago, it was decided that only the canonical Boolean/FALSE value would be considered false in Clojure and there are no plans to change this.

Comment by Adrian Medina [ 13/Aug/15 3:15 PM ]

This is really not considered a bug? I didn't mean to imply we were using boxed booleans purposefully - in fact we weren't (the map in question get assoc'd a literal false value), and I had no idea the boxing was occurring until I dug deeper into the bug. The printed representation of a boxed boolean false value is false making debugging this issue very difficult. Perhaps the printed representation of a boxed boolean value should be changed?

Comment by Nicola Mometto [ 13/Aug/15 5:42 PM ]

Boolean/FALSE and Boolean/TRUE are already boxed booleans so your code must be constructing a new boxed boolean and using that.





[CLJ-1798] The RetryEx in LockingTransaction should be static Created: 13/Aug/15  Updated: 13/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: dennis zhuang Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: STM, performance
Environment:

java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Mac OSX 10.10.4


Attachments: PNG File profile.png     Text File static_retryex.patch    
Patch: Code
Approval: Triaged

 Description   

We are using clojure.data.json, and we profiled our project with jprofiler, it shows that there is a hotspot in LockingTransaction.Too many RetryEx instances were created during the profiling.
The retryex instance variable should be static as a class variable to avoid creating when a new STM transaction begins.

The attacments are the profile result screen snapshot and the patch to make the retryex to be static.
I don't do any benchmark right now,but maybe a clojure.data.json performance benchmark will approve the patch works better.



 Comments   
Comment by Alex Miller [ 13/Aug/15 8:04 AM ]

I think the suggestion here is sound, but I have a hard time believing it will make much of a difference. My real question is why pprint is using dosync.

Comment by dennis zhuang [ 13/Aug/15 9:03 AM ]

I found it uses many dosync in writer as below:

https://github.com/clojure/clojure/blob/master/src/clj/clojure/pprint/pretty_writer.clj

It seems that using the dosync to change some mutable states.Maybe they can be rewrote into other forms such as atom,java object/lock etc.
But it's another story.





[CLJ-1714] Some static initialisers still run at compile time if used in type hints Created: 22/Apr/15  Updated: 12/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Adam Clements Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: compiler, typehints

Attachments: Text File CLJ-1714.patch     Text File CLJ-1714-v2.patch    
Patch: Code
Approval: Triaged

 Description   

AOT compiling on an x86 machine to be run on an ARM machine when a Java dependency has a native component and the class with the native dependency is used in a type hint.

In this situation, the only native library available on the classpath is the ARM dependency, and obviously won't load on the compiling x86 machine. Java libraries tend to load the native dependencies in the static initialiser of the class, which will fail in this situation as the architecture is x86 and the dependencies are ARM, for which reason CLJ-1315 made the change to not run static initialisers at compile time.

This covers a case which didn't come up as part of CLJ-1315, that the same problem occurs if rather than constructing the class, you simply use it as a type hint (which IMO is doubly surprising as something to have a side-effect).

This patch fixes that - happy to try and create a test, but would appreciate some advice on the shape such a test would take - presumably loading a java native library would be undesirable. I could simply check for static initialisers being run, but first would need some agreement that this is universally undesirable at compile time.

I have been using this patch in production for over a year with no adverse effects (as has anybody using the clojure-android build of clojure).



 Comments   
Comment by Alex Miller [ 22/Apr/15 10:53 AM ]

I think this might have been logged already but I'm not sure.

Comment by Michael Blume [ 22/Apr/15 12:30 PM ]

Patch won't apply to master for me

Comment by Adam Clements [ 22/Apr/15 2:39 PM ]

Really sorry, don't know what happened there. I checked out a fresh copy of the repo and re-applied the changes, deleted the old patch as it was garbage. Try the new one, timestamped 2:37pm

Comment by Stuart Halloway [ 30/Jul/15 1:52 PM ]

Please add an example of the problem, and if possible a failing test.

Comment by Alex Miller [ 30/Jul/15 5:14 PM ]

Reset to "Open" as moving from Triaged->Incomplete is not valid in our current workflow.

Comment by Adam Clements [ 31/Jul/15 10:56 AM ]

Example problem:
AOT compiling on an x86 machine to be run on an ARM machine when a Java dependency has a native component and the class with the native dependency is used in a type hint.

In this situation, the only native library available on the classpath is the ARM dependency, and obviously won't load on the compiling x86 machine. Java libraries tend to load the native dependencies in the static initialiser of the class, which will fail in this situation as the architecture is x86 and the dependencies are ARM, for which reason CLJ-1315 made the change to not run static initialisers at compile time.

This covers a case which didn't come up as part of CLJ-1315, that the same problem occurs if rather than constructing the class, you simply use it as a type hint (which IMO is doubly surprising as something to have a side-effect).

This patch fixes that - happy to try and create a test, but would appreciate some advice on the shape such a test would take - presumably loading a java native library would be undesirable. I could simply check for static initialisers being run, but first would need some agreement that this is universally undesirable at compile time.

I have been using this patch in production for over a year with no adverse effects (as has anybody using the clojure-android build of clojure).

Comment by Stuart Halloway [ 31/Jul/15 11:34 AM ]

Hi Adam,

Thanks for the quick response. I think checking for static initializers being run is OK for a test.

Comment by Adam Clements [ 12/Aug/15 9:12 AM ]

Added failing tests which now pass





[CLJ-1226] set! of a deftype field using field-access syntax causes ClassCastException Created: 26/Jun/13  Updated: 11/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: compiler, deftype, interop

Attachments: Text File 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r.patch     Text File 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r-v2.patch    
Patch: Code and Test
Approval: Screened

 Description   

set! can be used to set a public field on an instance with (set! (.field inst) val). This does not work inside a protocol function defined on a deftype with a mutable field for an instance of that type itself.

user=> (defprotocol p (f [_]))
p
user=> (deftype t [^:unsynchronized-mutable x] p (f [this] (set! (.x this) 1)))
user.t
user=>  (f (t. 1))   ;; expect: 1
ClassCastException user.t cannot be cast to compile__stub.user.t  user.t (NO_SOURCE_FILE:1

Cause: The type assigned in the bytecode at this point is the compile_stub type, not the expected class type.

Approach: Use getType(targetClass) instead of Type.getType(targetClass)

Patch: 0001-CLJ-1226-fix-set-of-instance-field-expression-that-r-v2.patch
Screened by: Fogus



 Comments   
Comment by Nicola Mometto [ 27/Jun/13 5:30 AM ]

This patch offers a better workaround for CLJ-1075, making it possible to write
(deftype foo [^:unsynchronized-mutable x] MutableX (set-x [this v] (try (set! (.x this) v)) v))

Comment by Nicola Mometto [ 25/Mar/15 4:39 PM ]

Updated patch to apply to current master

Comment by Fogus [ 07/Aug/15 3:10 PM ]

Straight-forward fix and test.

Comment by Rich Hickey [ 08/Aug/15 10:05 AM ]

Screeners - please make sure the patch has an Approach section that explains how and why the patch will fix the problem. Symptom is X and now after patch behavior is Y is not good enough.





[CLJ-1773] Support for REPL commands for tooling Created: 01/Jul/15  Updated: 11/Aug/15  Resolved: 11/Aug/15

Status: Resolved
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Release 1.8

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Duplicate Votes: 0
Labels: repl


 Description   

Per http://dev.clojure.org/display/design/Socket+Server+REPL, want to enhance repl to support "commands" useful for nested repls or for parallel tooling repls communicating over sockets (CLJ-1671).

Commands are defined as keywords in the "repl" namespace. The REPL will trap these after reading but before evaluation. Currently this is a closed set, but perhaps it could be open - the server socket repl could then install new ones if desired when repl is invoked.

Commands:

  • :repl/quit - same as Ctrl-D but works in terminal environments where that's not feasible. Allows for backing out of a nested REPL.
  • :repl/push - push the current repl "state" (tbd what that is, but at least ns) to a stateful map in the runtime. Returns coordinates that can be used to retrieve it elsewhere
  • :repl/pull <coords> - retrieves the repl state defined by the coordinates

In the tooling scenario, it is expected that there are two repls - the client repl and the tooling repl. The tooling can send :repl/push over the client repl before startup and retrieve the coordinates (which don't change). Then the tooling repl can call :repl/pull at any time to retrieve the state of the client repl.



 Comments   
Comment by Alex Miller [ 11/Aug/15 12:24 PM ]

I originally was trying to split up the stuff in the socket repl ticket with this but so far it has been far easier to work on them in tandem, so I'm going to just dupe this into that one (CLJ-1671).





[CLJ-1797] Mention cljc when require fails Created: 10/Aug/15  Updated: 10/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Mike Fikes Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: errormsgs

Approval: Triaged

 Description   

If you attempt to require a namespace for which the code cannot be located, now that cljc files are considered when locating code, it could be mentioned in the list of files considered on the classpath.

$ java -jar clojure-1.7.0.jar 
Clojure 1.7.0
user=> (require 'foo.bar)
FileNotFoundException Could not locate foo/bar__init.class or foo/bar.clj on classpath.  clojure.lang.RT.load (RT.java:449)

Note: FWIW, ClojureScript has similar error messages, and the order listed in the error message is the order tried when loading. If this were followed, I suspect the text of the error message above would end up looking like:

Could not locate foo/bar__init.class, foo/bar.clj, or foo/bar.cljc on classpath.





[CLJ-1659] compile leaks files Created: 16/Feb/15  Updated: 10/Aug/15  Resolved: 17/Jul/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Ralf Schmitt Assignee: Unassigned
Resolution: Completed Votes: 3
Labels: compiler, ft

Attachments: Text File clj-1659.patch     Text File clj-1659-v2.patch     Text File clj-1659-v3.patch    
Patch: Code
Approval: Ok

 Description   

clojure's compile function leaks file descriptors, i.e. it relies on garbage collection to close the files. I'm trying to use boot [1] on windows and ran into the problem, that files could not be deleted intermittently [2]. The problem is that clojure's compile function, or rather clojure.lang.RT.lastModified() relies on garbage collection to close files. lastModified() looks like:

static public long lastModified(URL url, String libfile) throws IOException{
	if(url.getProtocol().equals("jar")) {
		return ((JarURLConnection) url.openConnection()).getJarFile().getEntry(libfile).getTime();
	}
	else {
		return url.openConnection().getLastModified();
	}
}

Here's the stacktrace from file leak detector [3]:

#205 C:\Users\ralf\.boot\tmp\Users\ralf\home\steinmetz\2mg\-x24pa9\steinmetz\fx\config.clj by thread:clojure-agent-send-off-pool-0 on Sat Feb 14 19:58:46 UTC 2015
    at java.io.FileInputStream.(FileInputStream.java:139)
    at java.io.FileInputStream.(FileInputStream.java:93)
    at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
    at sun.net.www.protocol.file.FileURLConnection.initializeHeaders(FileURLConnection.java:110)
    at sun.net.www.protocol.file.FileURLConnection.getLastModified(FileURLConnection.java:178)
    at clojure.lang.RT.lastModified(RT.java:390)
    at clojure.lang.RT.load(RT.java:421)
    at clojure.lang.RT.load(RT.java:411)
    ...

Cause: getLastModified() opens the URLConnection's InputStream but does not close it.

Approach: On Stackoverflow [4] there's a discussion on how to close the URLConnection correctly.

On non-Windows operating systems this shouldn't be much of a problem. But on windows this hurts very much, since you can't delete files that are opened by some process.

Patch: clj-1659-v3.patch

Screened by: Alex Miller

[1] http://boot-clj.com/
[2] https://github.com/boot-clj/boot/issues/117
[3] http://file-leak-detector.kohsuke.org/
[4] http://stackoverflow.com/questions/9150200/closing-urlconnection-and-inputstream-correctly



 Comments   
Comment by Pietro Menna [ 06/May/15 11:10 AM ]

First attempt Patch

Comment by Pietro Menna [ 06/May/15 11:13 AM ]

Hi Alex,

This is my first patch to Clojure and to any OSS. So maybe I will need a little guidance. I follow the steps on how to generate the patch and just uploaded the patch to this thread.

The link from Stack Overflow was good, but unfortunately it is not possible to cast to HttpURLConnection in order to have the .disconnect() method.

Please, let me know if I should attempt anything else.

Kind regards,

Pietro

Comment by Andy Fingerhut [ 06/May/15 11:20 AM ]

Seems that creating a test for this to be run on every build might be difficult.

Have you verified that on Windows it has the desired effect?

Comment by Ralf Schmitt [ 06/May/15 4:49 PM ]

I don't understand how the patch solves that issue. It just sets connection to null. Or am I missing something?

You can test for the file leak with the following program. This works on windows and Linux for me.

leak-test.clj
;; test file leak
;; on UNIX-like systems set hard limit on open files via
;; ulimit -H -n 200 and then run
;; java -jar clojure-1.6.0.jar leak-test.clj

(let [file (java.io.File. "test-leak.txt")
      url (.. file toURI toURL)]
  (doseq [x (range 2000)]
    (print x " ")
    (flush)
    (spit file "")
    (clojure.lang.RT/lastModified url nil)
    (assert (.delete file) "delete failed")))
Comment by Ralf Schmitt [ 06/May/15 5:21 PM ]

dammit, the formatting is wrong. but this patch seems to fix the problem for me (tested on linux).

Comment by Ralf Schmitt [ 06/May/15 6:16 PM ]

indentation fixed

Comment by Andy Fingerhut [ 07/May/15 8:45 AM ]

Ralf, thanks for the patch. I can't say if or when this ticket will be considered for a change to Clojure, but I do know that patches are only considered if they were written by someone who has signed a CA. Were you considering doing so? You can do it on-line here if you wish: http://clojure.org/contributing

Also, patches should be in a slightly different format that include the author's name, email, date of change, etc. Instructions for creating a patch in that format are here: http://dev.clojure.org/display/community/Developing+Patches

Comment by Ralf Schmitt [ 07/May/15 8:50 AM ]

Thanks for the explanation. I've signed the CA and I will update the patch.

Comment by Ralf Schmitt [ 07/May/15 9:34 AM ]

patch vs current master, created with git format-patch

Comment by Andy Fingerhut [ 26/May/15 8:41 AM ]

Ralph, it would be good if all attached files on a ticket have different names, for clarity when referring to them. Could you remove your clj-1659.patch and upload it with a different name, e.g. clj-1659-v2.patch ?

Comment by Ralf Schmitt [ 26/May/15 8:49 AM ]

upload my last patch with non-conflicting filename

Comment by Ralf Schmitt [ 26/May/15 8:52 AM ]

yes, sorry about that. I don't know how to remove my previous patches. (ok, found it now)

Comment by Sven Richter [ 08/Jun/15 2:57 AM ]

This one is keeping me from using boot on windows and seeing how far the 1.7 release process is I would like to express my strong wish that this one makes it into the 1.7 release.

Thanks everyone for their hard work

Comment by Alex Miller [ 18/Jun/15 4:09 PM ]

The clj-1659-v3.patch is same as v2, just removed unneeded braces to match rest of compiler.

Comment by Erik Dannenberg [ 06/Aug/15 12:09 PM ]

Issue still persists with Clojure 1.8.0-alpha4 and latest boot. Running Windows 8.1.

Example app that should reproduce the error: https://github.com/danielsz/sente-boot

clone, run boot dev, change a source file => file exists errors

Any chances to get a fix for this into Clojure 1.7? It's pretty much a show stopper for us as it prevents the reloaded workflow for our windows devs.

Comment by Ralf Schmitt [ 06/Aug/15 4:21 PM ]

This must be another issue. Please try to find the leak with http://file-leak-detector.kohsuke.org/

(I once dug into a similar issue in boot's bug tracker and the problem was somewhere in the cljs compiler).

Comment by Erik Dannenberg [ 10/Aug/15 11:46 AM ]

I attached the file leak detector and ran boot with clojure 1.7 and 1.8-alpha4, the stacktraces for both are a bit different but from what i can tell the leak cause is still the same and looking very similiar to the stacktrace posted in the issue description. I do however see the cljs compiler further down in the stacktrace.

Should i open an issue on CLJS tracker?

Comment by Ralf Schmitt [ 10/Aug/15 11:51 AM ]

This is hard to tell without a stacktrace...

Comment by Erik Dannenberg [ 10/Aug/15 11:59 AM ]
1 descriptors are open
#1 C:\Users\ed\.boot\cache\tmp\Users\ed\projekte\sente-system\5ho\-c4hx1l\adzerk\boot_reload.cljs by thread:clojure-agent-send-off-pool-0 on Mon Aug 10 17:54:39 CEST 2015
	at java.io.FileInputStream.<init>(FileInputStream.java:139)
	at java.io.FileInputStream.<init>(FileInputStream.java:93)
	at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
	at sun.net.www.protocol.file.FileURLConnection.initializeHeaders(FileURLConnection.java:110)
	at sun.net.www.protocol.file.FileURLConnection.getLastModified(FileURLConnection.java:178)
	at cljs.util$last_modified.invokeStatic(util.cljc:136)
	at cljs.util$last_modified.invoke(util.cljc)
	at cljs.analyzer$requires_analysis_QMARK_.invokeStatic(analyzer.cljc:2179)
	at cljs.analyzer$requires_analysis_QMARK_.invoke(analyzer.cljc)
	at cljs.analyzer$requires_analysis_QMARK_.invokeStatic(analyzer.cljc:2164)
	at cljs.analyzer$requires_analysis_QMARK_.invoke(analyzer.cljc)
	at cljs.analyzer$analyze_file.invokeStatic(analyzer.cljc:2220)
	at cljs.analyzer$analyze_file.invoke(analyzer.cljc)
	at cljs.analyzer$analyze_deps.invokeStatic(analyzer.cljc:1291)
	at cljs.analyzer$analyze_deps.invoke(analyzer.cljc)
	at cljs.analyzer$eval1868$fn__1870.invoke(analyzer.cljc:1545)
	at clojure.lang.MultiFn.invoke(MultiFn.java:251)
	at cljs.analyzer$analyze_seq.invokeStatic(analyzer.cljc:1900)
	at cljs.analyzer$analyze_seq.invoke(analyzer.cljc)
	at cljs.analyzer$analyze$fn__2118.invoke(analyzer.cljc:1992)
	at cljs.analyzer$analyze.invokeStatic(analyzer.cljc:1985)
	at cljs.analyzer$analyze.invoke(analyzer.cljc)
	at cljs.compiler$compile_file_STAR_$fn__3216.invoke(compiler.cljc:1027)
	at cljs.compiler$with_core_cljs.invokeStatic(compiler.cljc:968)
	at cljs.compiler$with_core_cljs.invoke(compiler.cljc)
	at cljs.compiler$compile_file_STAR_.invokeStatic(compiler.cljc:988)
	at cljs.compiler$compile_file_STAR_.invoke(compiler.cljc)
	at cljs.compiler$compile_file$fn__3248.invoke(compiler.cljc:1129)
	at cljs.compiler$compile_file.invokeStatic(compiler.cljc:1109)
	at cljs.compiler$compile_file.invoke(compiler.cljc)
	at cljs.compiler$compile_root.invokeStatic(compiler.cljc:1181)
	at cljs.compiler$compile_root.invoke(compiler.cljc)
	at cljs.closure$compile_dir.invokeStatic(closure.clj:385)
	at cljs.closure$compile_dir.invoke(closure.clj)
	at cljs.closure$eval3614$fn__3615.invoke(closure.clj:425)
	at cljs.closure$eval3567$fn__3568$G__3558__3575.invoke(closure.clj:331)
	at cljs.closure$eval3627$fn__3628.invoke(closure.clj:439)
	at cljs.closure$eval3567$fn__3568$G__3558__3575.invoke(closure.clj:331)
	at adzerk.boot_cljs.impl.CljsSourcePaths$fn__4011.invoke(impl.clj:14)
Comment by Alex Miller [ 10/Aug/15 12:30 PM ]

Looks like similar problem in CLJS to me - I would open a ticket there.

Comment by Erik Dannenberg [ 10/Aug/15 1:35 PM ]

Done. CLJS-1416





[CLJ-1796] Protocol functions fail to find future extensions when assigned to a local or new var Created: 08/Aug/15  Updated: 10/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Nathan Marz Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: protocols

Approval: Triaged

 Description   
(defprotocol TestProtocol
  (tester [o]))

(let [t tester]
  (defn another-tester [o]
  	(t o)))

(def another-tester2 tester)

(extend-protocol TestProtocol
  String
  (tester [o] (println "Strings work!")))

(another-tester "A") ;; Error
(another-tester2 "A") ;; Error
(tester "A") ;; Works fine

(let [t tester]
  (defn another-tester [o]
  	(t o)))

(another-tester "A") ;; Works fine

(def another-tester2 tester)

(another-tester2 "A") ;; Works fine

(extend-protocol TestProtocol
  Long
  (tester [o] (println "Longs work!")))

(another-tester "A") ;; Works fine
(another-tester 3) ;; Error
(another-tester2 3) ;; Error


 Comments   
Comment by Nathan Marz [ 08/Aug/15 12:47 PM ]

This issue appears to be Clojure specific – I did some testing in CLJS and was unable to reproduce the issue.

Comment by Ghadi Shayban [ 09/Aug/15 9:51 AM ]

Nathan,
Not sure if you tried this, but using:

(def another-handle #'the-protocol-function)
rather than dereffing outright.

Comment by Nathan Marz [ 09/Aug/15 6:25 PM ]

That's a good workaround but it does seem that my test case should work. I ran into this because I was passing around functions dynamically and saving them for later execution – and this issue popped up with protocol methods. Having to pass around protocol methods differently than regular functions doesn't seem right.

Comment by Kevin Downey [ 10/Aug/15 11:21 AM ]

this is a result of the protocol implementation in clojure, protocol extension mutates the vars, once you have taken then value of the var (which happens once for top level forms) you will not see further mutations of the var so no more protocol extension





[CLJ-1794] Sorting vector yields non-indexed ArraySeq Created: 05/Aug/15  Updated: 10/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: collections

Attachments: Text File 0001-CLJ-1794-Make-ArraySeqs-implement-Indexed.patch    
Approval: Triaged

 Description   

Sorting a vector gives back an ArraySeq with O(n) gets instead of O(log N) gets. This means it can be more efficient to take a vector, sort, then turn it back into a vector.

Cause: sort works by copying the collection to be sorted into an array, calls Arrays/sort to sort it, and then returns a seq on the sorted array. The seq returned is an ArraySeq and doesn't implement Indexed.

Alternatives:

1. Make ArraySeq (and primitive specializations thereof) implement Indexed, providing constant time lookup by index.
2. Specialize sorting for different collection types
3. ???



 Comments   
Comment by Ragnar Dahlen [ 06/Aug/15 2:28 AM ]

Update description, attach patch.

Comment by Ragnar Dahlen [ 06/Aug/15 2:31 AM ]

Added link to current patch.

Comment by Alex Miller [ 06/Aug/15 6:50 AM ]

Another alternative to consider here is to have sort do something smarter.

Comment by Ragnar Dahlen [ 06/Aug/15 7:44 AM ]

Having thought a bit more about the approach and implications of this I'm not sure this patch is a good idea at all. It makes a little bit sense for the particular case of sorting a vector, but on the other hand sort only promises to return a sorted sequence of given coll. Implementing Indexed for a sequence type just because the underlying data structure supports efficient lookup by index feels wrong. Like you suggest, effort is maybe better spent thinking about making sort smarter, which is a different issue, or just using sorted collections instead.

Comment by Kevin Downey [ 06/Aug/15 12:49 PM ]

It seems like the best thing here would be to change sort to return a vector. Usages of sort in the middle of sequence pipelines will continue to work, but a sort followed by conj will break (I cannot recall an instance of this off hand, but I am sure they exist). Sorting seems to imply a fully realized collection, and vectors are the "strongest" realized collections that can be returned here.

Given the conservative nature of core, and the issue with conj ordering above, the next best thing might be to add a sortv similar to the existing mapv.

Another option might be to remove the call to seq, so sort returns the sorted array. This would actually be really useful because you can use Arrays.binarySearch. Calls to conj after a sort would then fail with an exception instead of conj to the "wrong" place.





[CLJ-1795] Protocol functions don't work properly when metadata is added to them Created: 08/Aug/15  Updated: 10/Aug/15  Resolved: 10/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Nathan Marz Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: metadata, protocols
Environment:

Clojure 1.7.0



 Description   

When you add metadata to a protocol function, the version with metadata will not work for any extensions added afterwards.

(defprotocol TestProtocol
  (tester [o]))

(def tester-with-meta (with-meta tester {:a 1}))

(extend-protocol TestProtocol
  String
  (tester [o] (println "Strings work!")))

(tester-with-meta "A") ;; Error
(tester "A") ;; Works fine

(def tester-with-meta (with-meta tester {:a 1}))

(extend-protocol TestProtocol
  Long
  (tester [o] (println "Longs work!")))

(tester-with-meta "A") ;; Works fine
(tester-with-meta 3) ;; Error


 Comments   
Comment by Alex Miller [ 08/Aug/15 9:16 AM ]

Can you specify version you're testing with too...

Comment by Nathan Marz [ 08/Aug/15 9:21 AM ]

Clojure 1.7.0

Comment by Nathan Marz [ 08/Aug/15 10:53 AM ]

This is subsumed by http://dev.clojure.org/jira/browse/CLJ-1796 which seems to be closer to the root cause

Comment by Alex Miller [ 10/Aug/15 9:10 AM ]

Subsumed by CLJ-1796





[CLJ-1791] Issue defining a defrecord protocol method named "clear" Created: 04/Aug/15  Updated: 08/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Mike Anderson Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None


 Description   

There seems to be a problem in trying to define a protocol with a method named "clear"

(defprotocol PClear
(clear [o]))
=> PClear

(defrecord Foo []
PClear
(clear [o] o))
=> CompilerException java.lang.ClassFormatError: Duplicate method name&signature in class file xxxx/Foo, compiling:(NO_SOURCE_PATH:1:1)

I assume this is due to a name conflict with the Java method Collection.clear() in the underlying implementation. However the error is very unclear about this, and the potential for conflict appears to be undocumented as far as I can see.

There seem to be two possible approaches to fixing this:
a) Disallow the use of "clear" as a protocol method name (in which case the error should be more informative, and the rule should be documented)
b) Find a way to support this in the class file format (possibly by overloading on JVM return types, since Collection.clear() returns void??)



 Comments   
Comment by Nicola Mometto [ 04/Aug/15 6:58 AM ]

Mike, the jvm doesn't support return type overloading so your second suggestion is not technically possible.

Reading the doc for defrecord " The class will have implementations of several (clojure.lang)
interfaces generated automatically: IObj (metadata support) and
IPersistentMap, and all of their superinterfaces.
"

Perharps java.util.Collection (or even better, java.util.Map) should be mentioned here.

Comment by Alex Miller [ 04/Aug/15 7:46 AM ]

I think this should be a doc enhancement request.

Comment by OHTA Shogo [ 08/Aug/15 3:42 AM ]

It might be out of the scope of this ticket, but protocol method conflicts can cause some other kinds of errors:

user=> (defprotocol P1 (finalize [this]))
P1
user=> (defrecord R1 [] P1 (finalize [this]))

CompilerException java.lang.VerifyError: (class: user/R1, method: finalize signature: ()Ljava/lang/Object;) Unable to pop operand off an empty stack, compiling: ...
user=> (defprotocol P2 (wait [this]))
P2
user=> (defrecord R2 [] P2 (wait [this]))
user.R2
user=> (def r (->R2))
#'user/r
user=> (wait r)
CompilerException java.lang.IllegalArgumentException: No single method: wait of interface: user.P2 found for function: wait of protocol: P2, compiling: ...
user=>

IMHO it would be nicer if defprotocol would warn method conflicts with a more informative message.





[CLJ-1380] Three-arg ExceptionInfo constructor permits nil data Created: 13/Mar/14  Updated: 07/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Gordon Syme Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: checkargs, ft

Attachments: File clj-1380.diff     Text File clj-1380-v2.patch     Text File clj-1380-v3.patch    
Patch: Code and Test
Approval: Triaged

 Description   

The argument check in the two-arg clojure.lang.ExceptionInfo constructor isn't present in the three-arg constructor so it's possible to create an ExceptionInfo with arbitrary (or nil) data.

E.g.:

user=> (clojure-version)
"1.5.1"

user=> (ex-info "hi" nil)
IllegalArgumentException Additional data must be a persistent map: null  clojure.lang.ExceptionInfo.<init> (ExceptionInfo.java:26)

user=> (ex-info "hi" nil (Throwable.))
NullPointerException   clojure.lang.ExceptionInfo.toString (ExceptionInfo.java:40)

Patch: clj-1380-v3.patch

Screened by: Alex Miller



 Comments   
Comment by Gordon Syme [ 13/Mar/14 10:47 AM ]

Sorry, didn't meant to classify as "major" and I don't have permissions to edit.

Comment by Gordon Syme [ 13/Mar/14 11:11 AM ]

Patch + tests

I'm not at all familiar with the project so may have put tests in the wrong language and/or wrong place.

The ex-info-works test is a bit dorky but shows that both constructors are equivalent (and passes without the patch to ExceptionInfo).

Comment by Alex Miller [ 13/Mar/14 12:18 PM ]

No worries on the classification - I adjust most incoming tickets in some way or another.

Thanks for the patch, however it cannot be considered unless you complete the Clojure Contributor's Agreement - http://clojure.org/contributing. This is an important step in the process that keeps the Clojure codebase on a sound legal basis.

Someone else could develop a clean room patch implementation for this ticket later, but of course it would be ideal if you could become a contributor!

Comment by Gordon Syme [ 13/Mar/14 1:15 PM ]

Hi Alex,

sure, that makes sense. I'll get the contributor's agreement in the post. It may take a while to arrive since I'm based in Europe.

Comment by Gordon Syme [ 25/Mar/14 10:03 AM ]

I just checked http://clojure.org/contributing, looks like my CCA made it through

Comment by Andy Fingerhut [ 01/Oct/14 6:48 PM ]

Gordon, I do not know if your patch is of interest to the Clojure developers, so I can't comment on that aspect of this ticket.

Instructions for creating a patch in the expected format is given on the wiki page below. Your patch is not in the expected format.

http://dev.clojure.org/display/community/Developing+Patches

Comment by Gordon Syme [ 27/Oct/14 5:30 AM ]

Whoops, sorry Andy.

I've rebased against master and added a correctly formatted patch.

Comment by Michael Blume [ 05/Aug/15 5:51 PM ]

Rebased onto current master (1d5237)

Comment by Alex Miller [ 07/Aug/15 9:25 AM ]

Can we change the data instanceof IPersistentMap check to data != null (which is the real thing being checked? If something non-null and not IPersistentMap is passed, a ClassCastException will be thrown on invocation.

The error message should then be modified as well to something like "Additional data must be non-nil."

Comment by Gordon Syme [ 07/Aug/15 3:03 PM ]

Done and done.
Throws on null rather than instanceof failure and updated the error message





[CLJ-1093] Empty PersistentCollections get incorrectly evaluated as their generic clojure counterpart Created: 24/Oct/12  Updated: 07/Aug/15  Resolved: 05/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Completed Votes: 5
Labels: collections, compiler

Attachments: Text File 0001-CLJ-1093-fix-compilation-of-empty-PersistentCollecti.patch     Text File 0001-CLJ-1093-v2.patch     Text File clj-1093-fix-empty-record-literal-patch-v2.txt    
Patch: Code and Test

 Description   
user> (defrecord x [])
user.x
user> #user.x[]   ;; expect: #user.x{}
{}
user> #user.x{}   ;; expect: #user.x{}
{}
user> #clojure.lang.PersistentTreeMap[]
{}
user> (class *1)  ;; expect: clojure.lang.PersistentTreeMap
clojure.lang.PersistentArrayMap

Cause: Compiler's ConstantExpr parser returns an EmptyExpr for all empty persistent collections, even if they are of types other than the core collections (for example: records, sorted collections, custom collections). EmptyExpr reports its java class as one the classes - IPersistentList/IPersistentVector/IPersistentMap/IPersistentSet rather than the original type.

Proposed: If one of the Persistent* classes, then create EmptyExpr as before, otherwise retain the ConstantExpression of the original collection.
Since EmptyExpr is a compiler optimization that applies only to some concrete clojure collections, making EmptyExpr dispatch on concrete types rather than on generic interfaces makes the compiler behave as expected

Patch: 0001-CLJ-1093-v2.patch

Screened by:



 Comments   
Comment by Timothy Baldridge [ 27/Nov/12 11:41 AM ]

Unable to reproduce this bug on latest version of master. Most likely fixed by some of the recent changes to data literal readers.

Marking Not-Approved.

Comment by Timothy Baldridge [ 27/Nov/12 11:41 AM ]

Could not reproduce in master.

Comment by Nicola Mometto [ 01/Mar/13 1:23 PM ]

I just checked, and the problem still exists for records with no arguments:

Clojure 1.6.0-master-SNAPSHOT
user=> (defrecord a [])
user.a
user=> #user.a[]
{}

Admittedly it's an edge case and I see little usage for no-arguments records, but I think it should be addressed aswell since the current behaviour is not what one would expect

Comment by Herwig Hochleitner [ 02/Mar/13 8:14 AM ]

Got the following REPL interaction:

% java -jar ~/.m2/repository/org/clojure/clojure/1.5.0/clojure-1.5.0.jar
user=> (defrecord a [])
user.a
user=> (a.)
#user.a{}
user=> #user.a{}
{}
#user.a[]
{}

This should be reopened or declined for another reason than reproducability.

Comment by Nicola Mometto [ 10/Mar/13 2:18 PM ]

I'm reopening this since the bug is still there.

Comment by Andy Fingerhut [ 13/Mar/13 2:04 PM ]

Patch clj-1093-fix-empty-record-literal-patch-v2.txt dated Mar 13, 2013 is identical to Bronsa's patch 001-fix-empty-record-literal.patch dated Oct 24, 2012, except that it applies cleanly to latest master. I'm not sure why the older patch doesn't but git doesn't like something about it.

Comment by Nicola Mometto [ 26/Jun/13 8:06 PM ]

Patch 0001-CLJ-1093-fix-empty-records-literal-v2.patch solves more issues than the previous patch that was not evident to me at the time.

Only collections that are either PersistentList or PersistentVector or PersistentHash[Map|Set] or PersistentArrayMap can now be EmptyExpr.
This is because we don't want every IPersistentCollection to be emitted as a generic one eg.

user=> (class #clojure.lang.PersistentTreeMap[])
clojure.lang.PersistentArrayMap

Incidentally, this patch also solves CLJ-1187
This patch should be preferred over the one on CLJ-1187 since it's more general

Comment by Jozef Wagner [ 09/Aug/13 2:08 AM ]

Maybe this is related:

user=> (def x `(quote ~(list 1 (clojure.lang.PersistentTreeMap/create (seq [1 2 3 4])))))
#'user/x
user=> x
(quote (1 {1 2, 3 4}))
user=> (class (second (second x)))
clojure.lang.PersistentTreeMap
user=> (eval x)
(1 {1 2, 3 4})
user=> (class (second (eval x)))
clojure.lang.PersistentArrayMap

Even if the collection is not evaluated, it is still converted to the generic clojure counterpart.

Comment by Alex Miller [ 24/Apr/14 4:44 PM ]

In the change for ObjectExpr.emitValue() where you've added PersistentArrayMap to the PersistentHashMap case, should the IPersistentVector case below that be PersistentVector instead, otherwise it would snare a custom IPersistentVector that's not a PersistentVector, right?

This line: "else if(form instanceof ISeq)" at the end of the Compiler diff has different leading tabs which makes the diff slightly more confusing than it could be.

Would be nice to add a test for the sorted map case in the description.

Marking incomplete to address some of these.

Comment by Alex Miller [ 13/May/14 10:43 PM ]

bump

Comment by Nicola Mometto [ 14/May/14 4:24 AM ]

Attached patch 0001-CLJ-1093-fix-empty-collection-literal-evaluation.patch which implements your suggestions.

replacing IPersistentVector with PersistentVector in ObjectExpr.emitValue() exposes a print-dup limitation: it expects every IPersistentCollection to have a static "create" method.

This required special casing for MapEntry and APersistentVector$SubVector

Comment by Nicola Mometto [ 16/May/14 3:57 PM ]

I updated the patch adding print-dups for APersistentVector$SubVec and other IPersistentVectors rather than special casing them in the compiler

Comment by Alex Miller [ 23/May/14 4:21 PM ]

All of the checks on concrete classes in the Compiler parts of this patch don't sit well with me. I understand how you got to this point and I don't have an alternate recommendation (yet) but all of that just feels like the wrong direction.

We want to be built on abstractions such that internal collections are not special; they should conform to the same expectations as an external collection and both should follow the same paths in the compiler - needing to check for those types is a flag for me that something is amiss.

I am marking Incomplete for now based on these thoughts.

Comment by Nicola Mometto [ 06/Jul/14 10:01 AM ]

I've been thinking for a while about this issue and I've come to the conclusion that in my later patches I've been trying to incorporate fixes for 3 different albeit related issues:

1- Clojure transforms all empty IPersistentCollections in their generic Clojure counterpart

user> (defrecord x [])
user.x
user> #user.x[]   ;; expected: #user.x{}
{}
user> #user.x{}   ;; expected: #user.x{}
{}
user> #clojure.lang.PersistentTreeMap[]
{}
user> (class *1)  ;; expected: clojure.lang.PersistentTreeMap
clojure.lang.PersistentArrayMap

2- Clojure transforms all the literals of collections implementing the Clojure interfaces (IPersistentList, IPersistentVector ..) that are NOT defined with deftype or defrecord, to their
generic Clojure counterpart

user=> (class (eval (sorted-map 1 1)))
clojure.lang.PersistentArrayMap ;; expected: clojure.lang.PersistentTreeMap

3- print-dup is broken for some Clojure persistent collections

user=> (print-dup (subvec [1] 0) *out*)
#=(clojure.lang.APersistentVector$SubVector/create [1])
user=> #=(clojure.lang.APersistentVector$SubVector/create [1])
IllegalArgumentException No matching method found: create  clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:53)

I'll keep this ticket regarding issue #1 and open two other tickets for issue #2 and #3

Comment by Nicola Mometto [ 06/Jul/14 10:15 AM ]

I've attached a new patch fixing only this issue, the approach is explained in the description

Comment by Nicola Mometto [ 12/Sep/14 5:45 PM ]

0001-CLJ-1093-v2.patch is an updated patch that correctly handles metadata evaluation semantics on empty literals and adds some tests for it

Comment by Stuart Halloway [ 10/Jul/15 12:12 PM ]

Nicola, thanks for making this smaller. Two questions:

  • why is the meta check added only to parse, and not analyze
  • do the tests cover both the parse and analyze code paths?
Comment by Nicola Mometto [ 10/Jul/15 12:25 PM ]

Stuart, the meta check is added only in ConstantExpr.parse and not in analyze because:

  • EmptyExpr.parse which all codepaths delegate to has a meta check that wraps it in a MetaExpr if meta is found
  • We don't want metadata on ConstantExprs to be handled by MetaExpr since MetaExpr handles metadata on non-quoted literals, evaluating the meta.

IOW it encodes the difference between

^{:foo (println "bar")} {}
and
'^{:foo (println "bar")} {}

The tests handle both code paths

Comment by Nicola Mometto [ 10/Jul/15 3:34 PM ]

Just a note that this patch will probably need to be updated considering CLJ-1517 and CLJ-1610 if the compiler will be changed to emit those unrolled collections (which is not in the current patches)

Comment by Alex Miller [ 13/Jul/15 9:08 AM ]

Nicola, I think that highlights my biggest qualm with this patch: the hardcoded list of concrete classes in the compiler. To me, that just feels wrong as I do not want that set of classes to be "special". I have not thought about it enough to have an alternative suggestion though.

Comment by Nicola Mometto [ 13/Jul/15 9:43 AM ]

Alex, I generally would agree with you that hardcoding concrete classes is a wrong approach but I honestly think that in this particular case it's the sanest thing to do.

This also reflects my opinion that, with the ever growing number of custom collections and the addition of tagged literals and ctor literals in clojure allowing for the embedding custom collections at read-time, the current approach the compiler has of relying on generic interfaces rather than on a closed set of known internal classes to guide code emission, is and will be more and more problematic (see also CLJ-1460,CLJ-1492, CLJ-1575, CLJ-1461)

This is an optimization and as such it makes sense for it to target specific classes: we only want to apply the optimization on such empty literals whose "conversion" to a generic impl won't change behaviour, and this is a decision the compiler can only do using a closed set of classes. Using interfaces or the abstract classes won't be possible (sorted maps implement APersistentMap too).

Other than my proposed solution or removing EmptyExpr altogether I can't think of any other way to fix this issue without

Comment by Stuart Halloway [ 30/Jul/15 2:55 PM ]

More subtle than, but not a bug for the same reasons as, CLJ-1460.

Comment by Nicola Mometto [ 30/Jul/15 3:13 PM ]

Really? The fact that record ctor syntax doesn't roundtrip is not a bug?

Comment by Andy Fingerhut [ 30/Jul/15 4:04 PM ]

Is it true that the only record ctors affected by this are those that have 0 fields? At least cursory testing with similar examples as in the description, but with 1 field in the record, show what seems to be the expected behavior.

Comment by Nicola Mometto [ 30/Jul/15 4:08 PM ]

Yes, this ticket is about empty IPCs, and the only way a record can be empty is if it has 0 fields

Comment by Alex Miller [ 05/Aug/15 3:45 PM ]

Rich made a change for this here:

https://github.com/clojure/clojure/commit/1d5237f9d7db0bc5f6e929330108d016ac7bf76c





[CLJ-1250] Reducer (and folder) instances hold onto the head of seqs Created: 03/Sep/13  Updated: 07/Aug/15  Resolved: 07/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.8

Type: Defect Priority: Major
Reporter: Christophe Grand Assignee: Unassigned
Resolution: Completed Votes: 10
Labels: compiler, memory, reducers

Attachments: Text File after-change.txt     Text File before-change.txt     Text File CLJ-1250-08-29.patch     Text File CLJ-1250-08-29-ws.patch     Text File CLJ-1250-20131211.patch     Text File clj-1250-2.patch     Text File CLJ-1250-AllInvokeSites-20140204.patch     Text File CLJ-1250-AllInvokeSites-20140320.patch     Text File clj1250.patch     Text File do-not-clear-this-when-direct-linked.patch    
Patch: Code and Test
Approval: Vetted

 Description   

Problem Statement
A shared function #'clojure.core.reducers/reducer holds on to the head of a reducible collection, causing it to blow up when the collection is a lazy sequence.

Reproduction steps:
Compare the following calls:

(time (reduce + 0 (map identity (range 1e8))))
(time (reduce + 0 (r/map identity (range 1e8))))

The second call should fail on a normal or small heap.

(If reducers are faster than seqs, increase the range.)

Cause: #'reducer closes over a collection when in order to reify CollReduce, and the closed-over is never cleared. When code attempts to reduce over this anonymous transformed collection, it will realize the tail while the head is stored in the closed-over.

Patch: clj-1250-2.patch

Approach:

Clear the reference to 'this' on the stack just before a tail call occurs

Removes calls to emitClearLocals(), which is a no-op.

When the context is RETURN (indicating a tail call), and the operation
is an InvokeExpr, StaticMethodExpr, or InstanceMethodExpr, clear the
reference to 'this' which is in slot 0 of the locals.

Edge-case: Inside the body of a try block, we cannot clear 'this' at the tail
position as we might need to keep refs around for use in the catch or finally
clauses. Introduces another truthy dynamic binding var to track position being
inside a try block.

Adds two helpers to emitClearThis and inTailCall.

Advantages: Fixes this case with no user code changes. Enables GC to do reclaim closed-overs references earlier.
Disadvantages: A compiler change.

Screened by: Alex Miller

Alternate Approaches:

1) Reimplement the #'reducer (and #'folder) transformation fns similar to the manner that Christophe proposes here:

(defrecord Reducer [coll xf])

(extend-protocol 
  clojure.core.protocols/CollReduce
  Reducer
      (coll-reduce [r f1]
                   (clojure.core.protocols/coll-reduce r f1 (f1)))
      (coll-reduce [r f1 init]
                   (clojure.core.protocols/coll-reduce (:coll r) ((:xf r) f1) init)))

(def rreducer ->Reducer) 

(defn rmap [f coll]
  (rreducer coll (fn [g] 
                   (fn [acc x]
                     (g acc (f x))))))

Advantages: Relatively non-invasive change.
Disadvantages: Not evident code. Additional protocol dispatch, though only incurred once

2) Alternate approach

from Christophe Grand:
Another way would be to enhance the local clearing mechanism to also clear "this" but it's complex since it may be needed to keep a reference to "this" for caches long after obvious references to "this" are needed.

Advantages: Fine-grained
Disadvantages: Complex, invasive, and the compiler is hard to hack on.

Mitigations
Avoid reducing on lazy-seqs and instead operate on vectors / maps, or custom reifiers of CollReduce or CollFold. This could be easier with some implementations of common collection functions being available (like iterate and partition).

See https://groups.google.com/d/msg/clojure-dev/t6NhGnYNH1A/2lXghJS5HywJ for previous discussion.



 Comments   
Comment by Gary Fredericks [ 03/Sep/13 8:53 AM ]

Fixed indentation in description.

Comment by Ghadi Shayban [ 11/Dec/13 11:08 PM ]

Adding a patch that clears "this" before tail calls. Verified that Christophe's repro case is fixed.

Will upload a diff of the bytecode soon.

Any reason this juicy bug was taken off 1.6?

Comment by Ghadi Shayban [ 11/Dec/13 11:17 PM ]

Here's the bytecode for the clojure.core.reducers/reducer reify before and after the change... Of course a straight diff isn't useful because all the line numbers changed. Kudos to Gary Trakhman for the no.disassemble lein plugin.

Comment by Christophe Grand [ 12/Dec/13 6:58 AM ]

Ghadi, I'm a bit surprised by this part of the patch: was the local clearing always a no-op here?

-		if(context == C.RETURN)
+		if(shouldClear)
 			{
-			ObjMethod method = (ObjMethod) METHOD.deref();
-			method.emitClearLocals(gen);
+                            gen.visitInsn(Opcodes.ACONST_NULL);
+                            gen.visitVarInsn(Opcodes.ASTORE, 0);
 			}

The problem with this approach (clear this on tail call) is that it adds yet another special case. To me the complexity stem from having to keep this around even if the user code doesn't refer to it.

Comment by Ghadi Shayban [ 12/Dec/13 7:19 AM ]

Thank you - I failed to mention this in the commit message: it appears that emitClearLocals() belonging to both ObjMethod and FnMethod (its child) are empty no-ops. I believe the actual local clearing is on line 4855.

I agree re: another special case in the compiler.

Comment by Alex Miller [ 12/Dec/13 8:56 AM ]

Ghadi re 1.6 - this ticket was never in the 1.6 list, it has not yet been vetted by Rich but is ready to do so when we open up again after 1.6.

Comment by Ghadi Shayban [ 12/Dec/13 8:59 AM ]

Sorry I confused the critical list with the Rel1.6 list.

Comment by Ghadi Shayban [ 14/Dec/13 11:16 AM ]

New patch 20131214 that handles all tail invoke sites (InvokeExpr + StaticMethodExpr + InstanceMethodExpr). 'StaticInvokeExpr' seems like an old remnant that had no active code path, so that was left as-is.

The approach taken is still the same as the original small patch that addressed only InvokeExpr, except that it is now using a couple small helpers. The commit message has more details.

Also a 'try' block with no catch or finally clause now becomes a BodyExpr. Arguably a user error, historically accepted, and still accepted, but now they are a regular BodyExpr, instead of being wrapped by a the no-op try/catch mechanism. This second commit can be optionally discarded.

With this patch on my machine (4/8 core/thread Ivy Bridge) running on bare clojure.main:
Christophe's test cases both run i 3060ms on a artificially constrained 100M max heap, indicating a dominant GC overhead. (But they now both work!)

When max heap is at a comfortable 2G the reducers version outpaces the lazyseq at 2100ms vs 2600ms!

Comment by Ghadi Shayban [ 13/Jan/14 10:48 AM ]

Updating stale patch after latest changes to master. Latest is CLJ-1250-AllInvokeSites-20140113

Comment by Ghadi Shayban [ 04/Feb/14 3:50 PM ]

Updating patch after murmur changes

Comment by Tassilo Horn [ 13/Feb/14 4:52 AM ]

Ghadi, I suffer from the problem of this issue. Therefore, I've applied your patch CLJ-1250-AllInvokeSites-20140204.patch to the current git master. However, then I get lots of "java.lang.NoSuchFieldError: array" errors when the clojure tests are run:

     [java] clojure.test-clojure.clojure-set
     [java] 
     [java] java.lang.NoSuchFieldError: array
     [java] 	at clojure.core.protocols$fn__6026.invoke(protocols.clj:123)
     [java] 	at clojure.core.protocols$fn__5994$G__5989__6003.invoke(protocols.clj:19)
     [java] 	at clojure.core.protocols$fn__6023.invoke(protocols.clj:147)
     [java] 	at clojure.core.protocols$fn__5994$G__5989__6003.invoke(protocols.clj:19)
     [java] 	at clojure.core.protocols$seq_reduce.invoke(protocols.clj:31)
     [java] 	at clojure.core.protocols$fn__6017.invoke(protocols.clj:48)
     [java] 	at clojure.core.protocols$fn__5968$G__5963__5981.invoke(protocols.clj:13)
     [java] 	at clojure.core$reduce.invoke(core.clj:6213)
     [java] 	at clojure.set$difference.doInvoke(set.clj:61)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:442)
     [java] 	at clojure.test_clojure.clojure_set$fn__1050$fn__1083.invoke(clojure_set.clj:109)
     [java] 	at clojure.test_clojure.clojure_set$fn__1050.invoke(clojure_set.clj:109)
     [java] 	at clojure.test$test_var$fn__7123.invoke(test.clj:704)
     [java] 	at clojure.test$test_var.invoke(test.clj:704)
     [java] 	at clojure.test$test_vars$fn__7145$fn__7150.invoke(test.clj:721)
     [java] 	at clojure.test$default_fixture.invoke(test.clj:674)
     [java] 	at clojure.test$test_vars$fn__7145.invoke(test.clj:721)
     [java] 	at clojure.test$default_fixture.invoke(test.clj:674)
     [java] 	at clojure.test$test_vars.invoke(test.clj:718)
     [java] 	at clojure.test$test_all_vars.invoke(test.clj:727)
     [java] 	at clojure.test$test_ns.invoke(test.clj:746)
     [java] 	at clojure.core$map$fn__2665.invoke(core.clj:2515)
     [java] 	at clojure.lang.LazySeq.sval(LazySeq.java:40)
     [java] 	at clojure.lang.LazySeq.seq(LazySeq.java:49)
     [java] 	at clojure.lang.Cons.next(Cons.java:39)
     [java] 	at clojure.lang.RT.boundedLength(RT.java:1655)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:130)
     [java] 	at clojure.core$apply.invoke(core.clj:619)
     [java] 	at clojure.test$run_tests.doInvoke(test.clj:761)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:137)
     [java] 	at clojure.core$apply.invoke(core.clj:617)
     [java] 	at clojure.test.generative.runner$run_all_tests$fn__527.invoke(runner.clj:255)
     [java] 	at clojure.test.generative.runner$run_all_tests$run_with_counts__519$fn__523.invoke(runner.clj:251)
     [java] 	at clojure.test.generative.runner$run_all_tests$run_with_counts__519.invoke(runner.clj:251)
     [java] 	at clojure.test.generative.runner$run_all_tests.invoke(runner.clj:253)
     [java] 	at clojure.test.generative.runner$test_dirs.doInvoke(runner.clj:304)
     [java] 	at clojure.lang.RestFn.applyTo(RestFn.java:137)
     [java] 	at clojure.core$apply.invoke(core.clj:617)
     [java] 	at clojure.test.generative.runner$_main.doInvoke(runner.clj:312)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:408)
     [java] 	at user$eval564.invoke(run_tests.clj:3)
     [java] 	at clojure.lang.Compiler.eval(Compiler.java:6657)
     [java] 	at clojure.lang.Compiler.load(Compiler.java:7084)
     [java] 	at clojure.lang.Compiler.loadFile(Compiler.java:7040)
     [java] 	at clojure.main$load_script.invoke(main.clj:274)
     [java] 	at clojure.main$script_opt.invoke(main.clj:336)
     [java] 	at clojure.main$main.doInvoke(main.clj:420)
     [java] 	at clojure.lang.RestFn.invoke(RestFn.java:408)
     [java] 	at clojure.lang.Var.invoke(Var.java:379)
     [java] 	at clojure.lang.AFn.applyToHelper(AFn.java:154)
     [java] 	at clojure.lang.Var.applyTo(Var.java:700)
     [java] 	at clojure.main.main(main.java:37)
Comment by Ghadi Shayban [ 13/Feb/14 8:23 AM ]

Can you give some details about your JVM/environment that can help reproduce? I'm not encountering this error.

Comment by Tassilo Horn [ 13/Feb/14 9:41 AM ]

Sure. It's a 64bit ThinkPad running GNU/Linux.

% java -version
java version "1.7.0_51"
OpenJDK Runtime Environment (IcedTea 2.4.5) (ArchLinux build 7.u51_2.4.5-1-x86_64)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)
Comment by Ghadi Shayban [ 13/Feb/14 10:19 AM ]

Strange, that is exactly my mail env, OpenJDK7 on Arch, 64-bit. I have also tested on JDK 6/7/8 on OSX mavericks. Are you certain that the git tree is clean besides the patch? (Arch users unite!)

Comment by Tassilo Horn [ 14/Feb/14 1:13 AM ]

Yes, the tree is clean. But now I see that I get the same error also after resetting to origin/master, so it's not caused by your patch at all. Oh, now the error vanished after doing a `mvn clean`! So problem solved.

Comment by Nicola Mometto [ 19/Feb/14 12:32 PM ]

Ghandi, FnExpr.parse should bind IN_TRY_BLOCK to false before analyzing the fn body, consider the case

(try (do something (fn a [] (heap-consuming-op a))) (catch Exception e ..))

Here in the a function the this local will never be cleared even though it's perfectly safe to.
Admittedly this is an edge case but we should cover this possibility too.

Comment by Ghadi Shayban [ 19/Feb/14 2:06 PM ]

You may have auto-corrected my name to Ghandi instead of Ghadi. I wish I were that wise =)

I will update the patch for FnExpr (that seems reasonable), but maybe after 1.6 winds down and the next batch of tickets get scrutiny. It would be nice to get input on a preferred approach from Rich or core after it gets vetted – or quite possibly not vetted.

Comment by Nicola Mometto [ 19/Feb/14 6:11 PM ]

hah, sorry for the typo on the name

Seems reasonable to me, in the meantime I just pushed to tools.analyzer/tools.emitter complete support for "this" clearing, I'll test this a bit in the next few days to make sure it doesn't cause unexpected problems.

Comment by Andy Fingerhut [ 24/Feb/14 12:13 PM ]

Patch CLJ-1250-AllInvokeSites-20140204.patch no longer applies cleanly to latest master as of Feb 23, 2014. It did on Feb 14, 2014. Most likely some of its context lines are changed by the commit to Clojure master on Feb 23, 2014 – I haven't checked in detail.

Comment by Ghadi Shayban [ 20/Mar/14 4:39 PM ]

Added a patch that 1) applies cleanly, 2) binds the IN_TRY_EXPR to false initially when analyzing FnExpr and 3) uses RT.booleanCast

Comment by Alex Miller [ 22/Aug/14 9:31 AM ]

Can you squash the patch and add tests to cover all this stuff?

Comment by Ghadi Shayban [ 22/Aug/14 9:47 AM ]

Sure. Have any ideas for how to test proper behavior of reference clearing? Know of some prior art in the test suite?

Comment by Alex Miller [ 22/Aug/14 10:25 AM ]

Something like the test in the summary would be a place to start. I don't know of any test that actually inspects bytecode or anything but that's probably not wise anyways. Need to make that kind of a test but get coverage on the different kinds of scenarios you're covering - try/catch, etc.

Comment by Ghadi Shayban [ 22/Aug/14 12:13 PM ]

Attached new squashed patch with a couple of tests.

Removed (innocuous but out-of-scope) second commit that analyzed try blocks missing a catch or finally clause as BodyExprs

Comment by Ghadi Shayban [ 29/Aug/14 11:43 AM ]

Rebased to latest master. Current patch CLJ-1250-08-29

Comment by Jozef Wagner [ 29/Aug/14 2:40 PM ]

CLJ-1250-08-29.patch is fishy, 87k size and it includes many unrelated commits

Comment by Alex Miller [ 29/Aug/14 2:44 PM ]

Agreed, Ghadi that last rebase looks wrong.

Comment by Ghadi Shayban [ 29/Aug/14 3:06 PM ]

Oops. Used format-patch against the wrong base. Updated.

Apologies that ticket is longer than War & Peace

Comment by Alex Miller [ 08/Sep/14 7:02 PM ]

I have not had enough time to examine all the bytecode diffs that I want to on this yet but preliminary feedback:

Compiler.java

  • need to use tabs instead of spaces to blend into the existing code better
  • why do StaticFieldExpr and InstanceFieldExpr not need this same logic?

compilation.clj

  • has some whitespace diffs that you could get rid of
  • there seem to be more cases in the code than are covered in the tests here?
Comment by Ghadi Shayban [ 08/Sep/14 11:19 PM ]

The germ of the issue is to clear the reference to 'this' (arg 0) when transferring control to another activation frame. StaticFieldExpr and InstanceFieldExpr do not transfer control to another frame. (StaticMethod and InstanceMethod do transfer control, and are covered by the patch)

Comment by Alex Miller [ 25/Sep/14 9:03 AM ]

Makes sense - can you address the tabs and whitespace issues?

Comment by Ghadi Shayban [ 26/Sep/14 12:51 PM ]

Latest patch CLJ-1250-08-29-ws.patch with whitespace issues fixed.

Comment by Michael Blume [ 17/Jun/15 4:43 PM ]

Patch doesn't apply, will attempt to fix and upload

Comment by Michael Blume [ 17/Jun/15 4:58 PM ]

Nope, sorry, admitting defeat on this one. Ghadi, can you update?

Comment by Ghadi Shayban [ 22/Jun/15 9:40 PM ]

I've updated the patch for first 1.8 merge window.

Comment by Alex Miller [ 23/Jun/15 7:37 AM ]

Added clj-1250-2.patch which is same as clj1250.patch but squashes commits and fixes whitespace tab/space issues.

Comment by Nicola Mometto [ 17/Jul/15 8:24 PM ]

Not sure what it means but the testcase is failing with an OOM exception in the IBM JDK 1.6 instance http://build.clojure.org/job/clojure-test-matrix/284/jdk=IBM%20JDK%201.6/console

Comment by Alex Miller [ 17/Jul/15 10:29 PM ]

CLJ-1780 created to track the ibm failure. maybe just difference in gc speed or something?

Comment by Alex Miller [ 04/Aug/15 2:32 PM ]

Here's a greatly simplified version of some real code that now fails due to this change:

(let [done (atom 2)]
    (while (pos? @done)
      (swap! done dec)
      (loop [found []]
        (println (conj found 1)))))
  • Before: [1][1]
  • After: [1] NullPointerException
Comment by Ghadi Shayban [ 04/Aug/15 5:18 PM ]

Reproduced on 1.8.0-alpha4. Could not reproduce with the previously committed patch applied directly to 1.7.

Cause: Tail clearing is insensitive to direct linking

In a direct linking static method (through invokeStatic) there's no "this" to clear.

Patch addresses the root cause

Comment by Alex Miller [ 05/Aug/15 9:02 AM ]

Ok, backing out a little, here's another example that still fails with the patch:

(let [done (atom false)
        f (future-call
            (fn inner []
              (while (not @done)
                (loop [found []]
                  (println (conj found 1))))))]
    (doseq [elem [:a :b :c :done]]
      (println "queue write " elem))
    (reset! done true)
    @f)

1.7:
queue write :a
queue write :b
queue write :c
queue write :done
[1]
nil

1.8:
queue write :a
[1]
queue write :b
queue write :c
queue write :done
NullPointerException

Comment by Alex Miller [ 05/Aug/15 10:58 AM ]

Moved latest case to CLJ-1793

Comment by Ghadi Shayban [ 05/Aug/15 8:44 PM ]

Both this regression and the loop regression are fixed on CLJ-1793.

Comment by Alex Miller [ 07/Aug/15 8:56 AM ]

The patch for this ticket has been reverted in master after 1.8.0-alpha4. Further work on re-implementation is moved to CLJ-1793.





[CLJ-1792] array-map and hash-map differ in handling of keys comparing identical but not equal Created: 04/Aug/15  Updated: 04/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Nicola Mometto Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Text File 0001-do-pointer-check-in-Util.EquivPred.patch    
Patch: Code

 Description   
user=> (let [a (array-map Double/NaN 1)] (assoc a (key (first a)) "foo"))
{NaN 1, NaN "foo"}
user=> (let [a (hash-map Double/NaN 1)] (assoc a (key (first a)) "foo"))
{NaN "foo"}

Approach: Array-map's comparison skips identity-check and always delegates to .equals calls. The attached patch alignes array-map behaviour with hash-map by doing the appropriate pointer checks before delegating to .equals






[CLJ-457] lazy recursive definition giving incorrect results Created: 13/Oct/10  Updated: 03/Aug/15

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Assembla Importer Assignee: Christophe Grand
Resolution: Unresolved Votes: 0
Labels: None

Attachments: File CLJ-457-2.diff     File clj-457-3.diff    
Patch: Code and Test

 Description   

If you define a global data var in terms of a lazy sequence referring to that same var, you can get different results depending on the chunkiness of laziness of the functions being used to build the collection.

Clojure's lazy sequences don't promise to support this, but they shouldn't return wrong answers. In the example given in http://groups.google.com/group/clojure/browse_thread/thread/1c342fad8461602d (and repeated below), Clojure should not return bad data. An error message would be good, and even an infinite loop would be more reasonable than the current behavior.

(Similar issue reported here: https://groups.google.com/d/topic/clojure/yD941fIxhyE/discussion)

(def nums (drop 2 (range Long/MAX_VALUE)))
(def primes (cons (first nums)
             (lazy-seq (->>
               (rest nums)
               (remove
                 (fn [x]
                   (let [dividors (take-while #(<= (* % %) x)
primes)]
                     (println (str "primes = " primes))
                     (some #(= 0 (rem x %)) dividors))))))))
(take 5 primes)

It prints out:
(primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
2 3 5 7 9)


 Comments   
Comment by Assembla Importer [ 13/Oct/10 3:00 PM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/457

Comment by Aaron Bedra [ 10/Dec/10 9:08 AM ]

Stu and Rich talked about making this an error, but it would break some existing code to do so.

Comment by Rich Hickey [ 17/Dec/10 8:03 AM ]

Is there a specific question on this?

Comment by Aaron Bedra [ 05/Jan/11 9:05 PM ]

Stu, you and I went over this but I can't remember exactly what the question was here.

Comment by Christophe Grand [ 28/Nov/12 12:24 PM ]

Tentative patch attached.
Have you an example of existing code which is broken by such a patch (as mention by Aaron Bedra)?

Comment by Rich Hickey [ 30/Nov/12 9:43 AM ]

The patch intends to do what? We have only a problem description and code. Please enumerate the plan rather than make us decipher the patch.

As a first principle, I don't want Clojure to promise that such recursively defined values are possible.

Comment by Christophe Grand [ 30/Nov/12 10:23 AM ]

The proposal here is to catch recursive seq realization (ie when computing the body of a lazy-seq attempts to access the same seq) and throw an exception.

Currently when such a case happens, the recursive access to the seq returns nil. This results in incorrect code seemingly working but producing incorrect results or even incorrect code producing correct results out of luck (see https://groups.google.com/d/topic/clojure/yD941fIxhyE/discussion for such an example).

So this patch moves around the modification to the LazySeq state (f, sv and s fields) before all potentially recursive method call (.sval in the while of .seq and .invoke in .sval) so that, upon reentrance, the state of the LazySeq is coherent and able to convey the fact the seq is already being computed.

Currently a recursive call may find f and sv cleared and concludes the computation is done and the result is in s despite s being unaffected yet.

Currently:

State f sv s
Unrealized not null null null
Realized null null anything
Being realized/recursive call from fn.invoke not null null null
Being realized/recursive call from ls.sval null null null

Note that "Being realized" states overlap with Unrealized or Realized.
(NB: "anything" includes null)

With the patch:

State f sv s
Unrealized not null null null
Realized null null anything but this
Being realized null null this
Comment by Andy Fingerhut [ 30/Nov/12 2:06 PM ]

That last comment, Christophe, goes a long way to explaining the idea to me, at least. Any chance comments with similar content could be added as part of the patch?

Comment by Christophe Grand [ 03/Dec/12 11:18 AM ]

New patch with a comment explaining the expected states.
Note: I tidied the states table up.

// Before calling user code (f.invoke() in sval and, indirectly,
// ((LazySeq)ls).sval() in seq -- and even RT.seq() in seq), ensure that 
// the LazySeq state is in one of these states:
//
// State            f          sv
// ================================
// Unrealized       not null   null
// Realized         null       null
// Being realized   null       this

CLJ-1119 is also fixed by this patch.

Comment by Andy Fingerhut [ 23/Nov/13 12:35 AM ]

Patch clj-457-3.diff is identical to Christophe's CLJ-457-2.diff, except it has been updated to no longer conflict with the commit made for CLJ-949 on Nov 22, 2013, which basically removed the try/catch in method sval(). It passes tests, and I don't see anything wrong with it, but it would be good if Christophe could give it a look, too.

Comment by Alex Miller [ 02/Aug/15 7:46 PM ]

No longer reproducible

Comment by Nicola Mometto [ 03/Aug/15 10:01 AM ]

Alex, the posted example is no longer reproducible because `(range)` does not produce a chunked-seq anymore.
OTOH `(range n)` does so replacing `(range)` with something like `(range Long/MAX_VALUE)` in the code example in the ticket description is sufficient to resurface the bug:

Clojure 1.8.0-master-SNAPSHOT
user=> (def nums (drop 2 (range Long/MAX_VALUE)))
#'user/nums
(def primes (cons (first nums)
             (lazy-seq (->>
               (rest nums)
               (remove
                 (fn [x]
                   (let [dividors (take-while #(<= (* % %) x)
primes)]
                     (println (str "primes = " primes))
                     (some #(= 0 (rem x %)) dividors))))))))
#'user/primes
user=> (take 5 primes)
(primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
primes = (2)
2 3 5 7 9)




[CLJ-401] Add seqable? predicate Created: 13/Jul/10  Updated: 03/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 5
Labels: None

Approval: Triaged

 Description   

Many people have found a need for this function and one exists in clojure.core.incubator that is sometimes used and/or copied elsewhere:

https://github.com/clojure/core.incubator/blob/master/src/main/clojure/clojure/core/incubator.clj#L83

This predicate would be valuable to have as it is not a simple check on Seqable since RT.seq() covers a number of additional cases. Alternatively, there could be a protocol for this that could be extended to both Seqable as well as other supported Java use cases turning this into a satisfies? check.

Old prior discussion (although this also comes up regularly on #clojure):



 Comments   
Comment by Assembla Importer [ 24/Aug/10 9:19 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/401

Comment by Jeremy Heiler [ 26/Jul/14 5:37 PM ]

A reference to the implementation in contrib: https://github.com/clojure/clojure-contrib/blob/master/modules/core/src/main/clojure/clojure/contrib/core.clj#L78

It seems like that the only thing that is inconsistent with RT.seqFrom is that seqable? checks for String instead of CharSequence.

Comment by Alex Miller [ 03/Aug/15 10:11 AM ]

In the proposed patch referenced in the ticket above, if seqable? could be used in place of sequential? flatten could be more powerful and work with maps/sets/java collections. Here's how it would look:

(defn flatten [coll] 
  (lazy-seq 
    (when-let [coll (seq coll)] 
      (let [x (first coll)] 
        (if (seqable? x) 
          (concat (flatten x) (flatten (next coll))) 
          (cons x (flatten (next coll))))))))

And an example:

user=> (flatten #{1 2 3 #{4 5 {6 {7 [8 9 10 #{11 12 (java.util.ArrayList. [13 14 15]) (int-array [16 17 18])}]}}}}) 
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18)




[CLJ-1212] Silent truncation/downcasting of primitive type on reflection call to overloaded method (Math/abs) Created: 28/May/13  Updated: 03/Aug/15

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Matthew Willson Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: primitives, typehints
Environment:

Clojure 1.5.1
OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1)



 Description   

I realise relying on reflection when calling these kinds of methods isn't a great idea performance-wise, but it shouldn't lead to incorrect or dangerous behaviour.

Here it seems to trigger a silent downcast of the input longs, giving a truncated integer output:

user> (defn f [a b] (Math/abs (- a b)))
Reflection warning, NO_SOURCE_PATH:1:15 - call to abs can't be resolved.
#'user/f
user> (f 1000000000000 2000000000000)
727379968
user> (class (f 1000000000000 2000000000000))
java.lang.Integer
user> (defn f [^long a ^long b] (Math/abs (- a b)))
#'user/f
user> (f 1000000000000 2000000000000)
1000000000000
user> (class (f 1000000000000 2000000000000))
java.lang.Long



 Comments   
Comment by Matthew Willson [ 28/May/13 12:50 PM ]

For an even simpler way to replicate the issue:

user> (#(Math/abs %) 1000000000000)
Reflection warning, NO_SOURCE_PATH:1:3 - call to abs can't be resolved.
727379968

Comment by Andy Fingerhut [ 28/May/13 1:36 PM ]

I was able to reproduce the behavior you see with these Java 6 JVMs on Ubuntu 12.04.2:

java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

However, I tried two Java 7 JVMs, and it gave the following behavior which looks closer to what you would hope for. I do not know what is the precise difference between Java 6 and Java 7 that leads to this behavior difference, but this is some evidence that this has something to do with Java 6 vs. Java 7.

user=> (set! warn-on-reflection true)
true
user=> (defn f [a b] (Math/abs (- a b)))
Reflection warning, NO_SOURCE_PATH:1:15 - call to abs can't be resolved.
#'user/f
user=> (f 1000000000000 2000000000000)
1000000000000
user=> (class (f 1000000000000 2000000000000))
java.lang.Long

Above behavior observed with Clojure 1.5.1 on these JVMs:

Ubuntu 12.04.2 plus this JVM:
java version "1.7.0_21"
Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

Mac OS X 10.8.3 plus this JVM:
java version "1.7.0_15"
Java(TM) SE Runtime Environment (build 1.7.0_15-b03)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Comment by Matthew Willson [ 29/May/13 5:17 AM ]

Ah, interesting.
Maybe it's a difference in the way the reflection API works in java 7?

Here's the bytecode generated incase anyone's curious:

public java.lang.Object invoke(java.lang.Object);
Code:
0: ldc #14; //String java.lang.Math
2: invokestatic #20; //Method java/lang/Class.forName:(Ljava/lang/String;)Ljava/lang/Class;
5: ldc #22; //String abs
7: iconst_1
8: anewarray #24; //class java/lang/Object
11: dup
12: iconst_0
13: aload_1
14: aconst_null
15: astore_1
16: aastore
17: invokestatic #30; //Method clojure/lang/Reflector.invokeStaticMethod:(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/Object;
20: areturn

Comment by Matthew Willson [ 29/May/13 5:20 AM ]

Just an idea (and maybe this is what's happening under java 7?) but given it's a static method and all available overloaded variants are presumably known at compile time, perhaps it could generate code along the lines of:

(cond
(instance? Long x) (Math/abs (long x))
(instance? Integer x) (Math/abs (int x))
;; ...
)

Comment by Andy Fingerhut [ 29/May/13 3:19 PM ]

In Reflector.java method invokeStaticMethod(Class c, String methodName, Object[] args) there is a call to getMethods() followed by a call to invokeMatchingMethod(). getMethods() returns the 4 java.lang.Math/abs methods in different orders on Java 6 and 7, causing invokeMatchingMethod() to pick a different one on the two JVMs:

java version "1.6.0_39"
Java(TM) SE Runtime Environment (build 1.6.0_39-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)

user=> (pprint (seq (clojure.lang.Reflector/getMethods java.lang.Math 1 "abs" true)))
(#<Method public static int java.lang.Math.abs(int)>
#<Method public static long java.lang.Math.abs(long)>
#<Method public static float java.lang.Math.abs(float)>
#<Method public static double java.lang.Math.abs(double)>)
nil

java version "1.7.0_21"
Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

user=> (pprint (seq (clojure.lang.Reflector/getMethods java.lang.Math 1 "abs" true)))
(#<Method public static double java.lang.Math.abs(double)>
#<Method public static float java.lang.Math.abs(float)>
#<Method public static long java.lang.Math.abs(long)>
#<Method public static int java.lang.Math.abs(int)>)
nil

That might be a sign of undesirable behavior in invokeMatchingMethod() that is too dependent upon the order of methods given to it.

As you mention, type hinting is good for avoiding the significant performance hit of reflection.

Comment by Alex Miller [ 02/Aug/15 8:24 PM ]

Not reproducible on 1.8.0-alpha3.

Comment by Nicola Mometto [ 03/Aug/15 9:52 AM ]

Alex, I could reproduce using 1.8.0-master-SNAPSHOT and jdk 1.8:

[~]> java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
[~]> clj
Clojure 1.8.0-master-SNAPSHOT
user=> (set! *warn-on-reflection* true)
true
user=> (#(Math/abs %) 1000000000000)
Reflection warning, NO_SOURCE_PATH:2:3 - call to static method abs on java.lang.Math can't be resolved (argument
727379968
user=> (class *1)
java.lang.Integer

Andy's last comment mentions that clojure.lang.Reflector.invokeStaticMethod is dependant on the order of methods passed to it and that that order can change between jdk versions so maybe that's why you couldn't reproduce it





[CLJ-1319] array-map fails lazily if passed an odd number of arguments Created: 08/Jan/14  Updated: 03/Aug/15

Status: Reopened
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.8

Type: Defect Priority: Minor
Reporter: Stuart Sierra Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: errormsgs, ft

Attachments: Text File 0001-CLJ-1319-Throw-on-odd-arguments-to-PersistentArrayMa.patch     Text File 0002-CLJ-1319-Throw-on-odd-arguments-to-PersistentArrayMa.patch    
Patch: Code and Test
Approval: Ok

 Description   

If called with an odd number of arguments, array-map does not throw an exception until the map is realized, when it throws the confusing ArrayIndexOutOfBoundsException.

Example, in 1.5.1 and 1.6.0-alpha3:

user=> (def m (hash-map :a 1 :b))
IllegalArgumentException No value supplied for key: :b  clojure.lang.PersistentHashMap.create (PersistentHashMap.java:77)

user=> (def m (array-map :a 1 :b))
#'user/m
user=> (prn m)
ArrayIndexOutOfBoundsException 3
  clojure.lang.PersistentArrayMap$Seq.first
  (PersistentArrayMap.java:313)

Approach: Catch on construction and throw same error as hash-map.

user=> (def m (array-map :a 1 :b))
IllegalArgumentException No value supplied for key: :b  clojure.lang.PersistentArrayMap.createAsIfByAssoc (PersistentArrayMap.java:78)

Patch: 0002-CLJ-1319-Throw-on-odd-arguments-to-PersistentArrayMa.patch
Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 08/Jan/14 11:01 AM ]

PersistentArrayMap.createAsIfByAssoc could check length is even to catch this

Comment by Jason Felice [ 27/Jan/14 1:01 PM ]

A better error message would be nice... this is the best I could think of.

Comment by OHTA Shogo [ 14/May/15 7:06 AM ]

I suffered from this problem recently. Anything to resolve before applying the patch?

Comment by Alex Miller [ 14/May/15 8:21 AM ]

The error message should match the message for hash-map:

user=> (hash-map 1 2 3)
IllegalArgumentException No value supplied for key: 3  clojure.lang.PersistentHashMap.create (PersistentHashMap.java:77)
user=> (array-map 1 2 3)
IllegalArgumentException Odd number of arguments when creating map  clojure.lang.PersistentArrayMap.createAsIfByAssoc (PersistentArrayMap.java:78)
Comment by OHTA Shogo [ 14/May/15 8:49 PM ]

Let me have a try. I just updated the patch!

0002-CLJ-1319-Throw-on-odd-arguments-to-PersistentArrayMa.patch

Comment by Alex Miller [ 15/May/15 9:10 AM ]

Looks good.

Comment by Alex Miller [ 03/Aug/15 8:05 AM ]

Reopened - does not seem to have correct behavior in 1.8.0-alpha4.

user=> (array-map 1 2 3)
ArrayIndexOutOfBoundsException 3  clojure.lang.PersistentArrayMap$Seq.first (PersistentArrayMap.java:321)
user=> (def m (array-map :a 1 :b))
#'user/m
Comment by Alex Miller [ 03/Aug/15 8:17 AM ]

Doesn't like this was merged so moving to Ok but not closed for next round.





[CLJ-1229] count silently overflows for sequences with more than Integer/MAX_VALUE elements Created: 09/Jul/13  Updated: 02/Aug/15  Resolved: 02/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Andy Fingerhut Assignee: Unassigned
Resolution: Duplicate Votes: 1
Labels: math

Attachments: Text File clj-1229-count-overflow-patch-v1.txt    
Patch: Code

 Description   

Found by John Jacobsen: https://mail.google.com/mail/?shva=1#label/clojure/13fbba0c3e4ba6b7

user> (time (count (range (*' 1000 1000 1000 3))))
"Elapsed time: 375225.663 msecs"
-1294967296



 Comments   
Comment by Andy Fingerhut [ 09/Jul/13 1:41 AM ]

Patch clj-1229-count-overflow-patch-v1.txt dated Jul 8 2013 modifies count to throw an ArithmeticException for sequences with more than Integer/MAX_VALUE elements.

Performance experiments with this expression show no significant time differences before and after the change. I verified with -XX:+PrintCompilation, and only paid attention to timing results after no further JIT compilation was performed when executing the expression:

(defn foo [n m] (doall (for [i (range n)] (count (range m)))))
(time (foo 100 1000000))

No tests are added. A test verifying that an exception is thrown for such a long sequence would be prohibitively slow.

Comment by Mike Anderson [ 09/Jul/13 3:08 AM ]

I think a better strategy might be to convert all count-like functions from int to long. int overflow exceptions aren't a particularly nice result, especially since they probably won't get picked up in tests for the reasons Andy mentioned.

That buys us a quite a few years more future proofing...

Comment by Andy Fingerhut [ 09/Jul/13 10:15 AM ]

Mike, unless I am missing something, that would require changing the method count() in the Counted interface to return a long, and that in turn requires little changes throughout the code base wherever Counted is used. It could be done, of course, but it is not a small change.

Comment by John Jacobsen [ 09/Jul/13 12:35 PM ]

I agree that Mike's approach is nicer overall, but think Andy's patch is an immediate improvement over what we have now, and could be implemented until someone takes the time to correctly make all the detailed changes Mike is suggesting.

Comment by John Jacobsen [ 09/Jul/13 12:52 PM ]

FWIW I did apply the patch, build, and test manually:

user=> (count (range (* 1000 1000 1000 3)))
ArithmeticException integer overflow clojure.lang.RT.countFrom (RT.java:549)
user=>

Comment by Alex Miller [ 11/Jul/13 3:26 PM ]

Perhaps of interest, Java's Collection.size() returns Integer.MAX_VALUE if the size of the collection > MAX_VALUE. I can't say that either that behavior or overflow is particularly helpful in practice of course.

Comment by Alex Miller [ 02/Aug/15 8:27 PM ]

Dupe to CLJ-1729 (similar change logged at Rich's request)





[CLJ-1207] Importing a class that does not exist fails to report the name of the class that did not exist Created: 29/Apr/13  Updated: 02/Aug/15  Resolved: 02/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Howard Lewis Ship Assignee: Unassigned
Resolution: Not Reproducible Votes: 1
Labels: errormsgs
Environment:

1.5.1, OS X


Waiting On: Howard Lewis Ship

 Description   

Pop quiz: What Java class is missing from the classpath?

java.lang.NoClassDefFoundError: Could not initialize class com.annadaletech.nexus.util.logging__init
 at java.lang.Class.forName0 (Class.java:-2)
    java.lang.Class.forName (Class.java:264)
    clojure.lang.RT.loadClassForName (RT.java:2098)
    clojure.lang.RT.load (RT.java:430)
    clojure.lang.RT.load (RT.java:411)
    clojure.core$load$fn__5018.invoke (core.clj:5530)
    clojure.core$load.doInvoke (core.clj:5529)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    clojure.core$load_one.invoke (core.clj:5336)
    clojure.core$load_lib$fn__4967.invoke (core.clj:5375)
    clojure.core$load_lib.doInvoke (core.clj:5374)
    clojure.lang.RestFn.applyTo (RestFn.java:142)
    clojure.core$apply.invoke (core.clj:619)
    clojure.core$load_libs.doInvoke (core.clj:5413)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invoke (core.clj:619)
    clojure.core$require.doInvoke (core.clj:5496)
    clojure.lang.RestFn.invoke (RestFn.java:512)
    novate.console.app$eval1736$loading__4910__auto____1737.invoke (app.clj:1)
    novate.console.app$eval1736.invoke (app.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:6619)
    clojure.lang.Compiler.eval (Compiler.java:6608)
    clojure.lang.Compiler.load (Compiler.java:7064)
    user$eval1732.invoke (NO_SOURCE_FILE:1)
    clojure.lang.Compiler.eval (Compiler.java:6619)
    clojure.lang.Compiler.eval (Compiler.java:6582)
    clojure.core$eval.invoke (core.clj:2852)
    clojure.main$repl$read_eval_print__6588$fn__6591.invoke (main.clj:259)
    clojure.main$repl$read_eval_print__6588.invoke (main.clj:259)
    clojure.main$repl$fn__6597.invoke (main.clj:277)
    clojure.main$repl.doInvoke (main.clj:277)
    clojure.lang.RestFn.invoke (RestFn.java:1096)
    clojure.tools.nrepl.middleware.interruptible_eval$evaluate$fn__584.invoke (interruptible_eval.clj:56)
    clojure.lang.AFn.applyToHelper (AFn.java:159)
    clojure.lang.AFn.applyTo (AFn.java:151)
    clojure.core$apply.invoke (core.clj:617)
    clojure.core$with_bindings_STAR_.doInvoke (core.clj:1788)
    clojure.lang.RestFn.invoke (RestFn.java:425)
    clojure.tools.nrepl.middleware.interruptible_eval$evaluate.invoke (interruptible_eval.clj:41)
    clojure.tools.nrepl.middleware.interruptible_eval$interruptible_eval$fn__625$fn__628.invoke (interruptible_eval.clj:171)
    clojure.core$comp$fn__4154.invoke (core.clj:2330)
    clojure.tools.nrepl.middleware.interruptible_eval$run_next$fn__618.invoke (interruptible_eval.clj:138)
    clojure.lang.AFn.run (AFn.java:24)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1110)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:603)
    java.lang.Thread.run (Thread.java:722)

If you guess "com.annadaletech.nexus.util.logging__init" you are wrong!

Wait, I'll give you a hint:

(ns com.annadaletech.nexus.util.logging
  (:use [clojure.string :only [trim-newline]]
        [clojure.pprint :only [code-dispatch pprint with-pprint-dispatch *print-right-margin*]])
  (:import [java.io StringWriter]
           [org.slf4j MDC MarkerFactory Marker LoggerFactory]
           [java.util.concurrent.locks ReentrantLock]))

Oh, sorry, did that not help?

The correct answer is "org.slf4j.MDC".

Having that information in the stack trace would have saved me nearly an hour. I think it is worth the effort to get that reported correctly.



 Comments   
Comment by Gabriel Horner [ 10/May/13 1:56 PM ]

When I try this on a fresh project, I get this error:
"ClassNotFoundException org.slf4j.MDC
java.net.URLClassLoader$1.run (URLClassLoader.java:202)
java.security.AccessController.doPrivileged (AccessController.java:-2)"

Howard, could you give us a project.clj or better yet a github repository that recreates this issue?

Comment by Howard Lewis Ship [ 10/May/13 4:51 PM ]

I'll see what I can do. Probably be next week. Thanks for looking at this.

Comment by Gary Fredericks [ 26/May/13 8:20 AM ]

This reminds me of an issue with `lein run` that resulted from it trying to figure out whether you wanted to run a namespace or a java class:

https://github.com/technomancy/leiningen/issues/1182

Comment by Alex Miller [ 02/Aug/15 8:23 PM ]

I tried a few things but there is not enough info here for me to reproduce it. Please re-open if you can do so.





[CLJ-1199] Record values are not 'eval'uated, unlike values of PersistentMap: Created: 13/Apr/13  Updated: 02/Aug/15  Resolved: 02/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Jason Wolfe Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: None


 Description   

I'm not sure if this is by design, but it caught me off guard.

user> (defrecord A [x])
user.A

user> (eval (hash-map :x `long))
{:x #<core$long clojure.core$long@5de54eb7>}
user> (eval (->A `long))
#user.A{:x clojure.core/long}
user> (eval (map->A (hash-map :x `long)))
#user.A{:x clojure.core/long}

and in case it matters, here's a simplified version of the real use case where this came up, with no eval – just a macro:

user> (defmacro munge-meta1 [x] (assoc x :schema (->A (:schema (meta x)))))
#'user/munge-meta1
user> (munge-meta1 ^{:schema long} {})
{:schema #user.A{:x long}}

user> (defmacro munge-meta2 [x] (assoc x :schema (hash-map :x (:schema (meta x)))))
#'user/munge-meta2
user> (munge-meta2 ^{:schema long} {})
{:schema {:x #<core$long clojure.core$long@5de54eb7>}}

This seems to be fixed by moving the record creation post-evaluation, so it's not a big deal, just surprising (plus I haven't yet convinced myself that this will always work if the user's schema itself contains record-creating forms, although it seems to work OK):

user> (defmacro munge-meta1 [x] (assoc x :schema `(->A ~(:schema (meta x)))))
#'user/munge-meta1
user> (munge-meta1 ^{:schema long} {})
{:schema #user.A{:x #<core$long clojure.core$long@5de54eb7>}}

I brought this up on the mailing list here:

https://groups.google.com/forum/?fromgroups=#!topic/clojure-dev/UgD35E1RQTo



 Comments   
Comment by Alex Miller [ 02/Aug/15 8:18 PM ]

This is by design, see http://clojure.org/datatypes - "each element in the vector form is passed to the deftype/defrecord's constructor un-evaluated"





[CLJ-1189] Map-destructuring :or fumble needs compiler warning Created: 31/Mar/13  Updated: 02/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Phill Wolf Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: errormsgs

Attachments: Text File CLJ-1189-p1.patch    
Patch: Code and Test

 Description   

Here is a map-destructuring blunder that I wish the compiler warned about:

(defn server
[{servlet ::servlet
type ::type
:or {::type :jetty}
:as service-map}]

It would be splendid to get a warning that :or keys that are not symbols being bound have no effect.

The incomplete code snippet above comes from Pedestal.service 0.1.0.

Here is a complete one-line example with the coding error:

user> (defn picnic [{botulism :botulism :or {:botulism 6}}] botulism)
#'user/picnic
user> (picnic {})
nil
user> ;; I intended 6.



 Comments   
Comment by Gary Fredericks [ 26/May/13 8:25 AM ]

Should this be a warning or an exception? I don't know of any similar example of the compiler giving a warning, so I would expect the latter.

Comment by Gary Fredericks [ 26/May/13 9:54 AM ]

Added a patch that throws an exception when :or is not a map or its keys are not symbols. Also some tests.





[CLJ-1138] data-reader returning nil causes exception Created: 22/Dec/12  Updated: 02/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4, Release 1.5, Release 1.6, Release 1.7, Release 1.8
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Steve Miner Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: reader
Environment:

clojure 1.5 beta2, Mac OS X 10.8.2, java version "1.6.0_37"


Attachments: Text File 0001-CLJ-1139-allow-nil-in-data-reader.patch    
Patch: Code and Test
Approval: Triaged

 Description   

If a data-reader returns nil, the reader throws java.lang.RuntimeException: No dispatch macro... The error message implies that there is no dispatch macro for whatever the first character of the tag happens to be.

Here's a simple example:

(binding [*data-readers* {'f/ignore (constantly nil)}] (read-string "#f/ignore 42 10"))

RuntimeException No dispatch macro for: f clojure.lang.Util.runtimeException (Util.java:219)



 Comments   
Comment by Steve Miner [ 22/Dec/12 9:43 AM ]

clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch allows a data-reader to return nil instead of throwing. Does sanity check that possible tag or record isJavaIdentifierStart(). Gives better error message for special characters that might actually be dispatch macros (rather than assuming it's a tagged literal).

Comment by Steve Miner [ 22/Dec/12 10:06 AM ]

clj-1138-data-reader-return-nil-for-no-op.patch allows a data-reader returning nil to be treated as a no-op by the reader (like #_). nil is not normally a useful value (actually it causes an exception in Clojure 1.4 through 1.5 beta2) for a data-reader to return. With this patch, one could get something like a conditional feature reader using data-readers.

Comment by Steve Miner [ 22/Dec/12 10:26 AM ]

clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch is the first patch to consider. It merely allows nil as a value from a data-reader and returns nil as the final value. I think it does what was originally intended for dispatch macros, and gives a better error message in many cases (mostly typos).

The second patch, clj-1138-data-reader-return-nil-for-no-op.patch, depends on the other being applied first. It takes an extra step to treat a nil value returned from a data-reader as a no-op for the reader (like #_).

Comment by Steve Miner [ 23/Dec/12 11:52 AM ]

It turns out that you can work around the original problem by having your data-reader return '(quote nil) instead of plain nil. That expression conveniently evaluates to nil so you can get a nil if necessary. This also works after applying the patches so there's still a way to return nil if you really want it.

(binding [*data-readers* {'x/nil (constantly '(quote nil))}] (read-string "#x/nil 42"))
;=> (quote nil)

Comment by Andy Fingerhut [ 07/Feb/13 9:20 AM ]

Patch clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch dated Dec 22 2012 still applies cleanly to latest master if you use the following command:

% git am --keep-cr -s --ignore-whitespace < clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch

Without the --ignore-whitespace option, the patch fails only because some whitespace was changed in Clojure master recently.

Comment by Andy Fingerhut [ 13/Feb/13 11:24 AM ]

OK, now with latest master (1.5.0-RC15 at this time), patch clj-1138-allow-data-reader-to-return-nil-instead-of-throwing.patch no longer applies cleanly, not even using --ignore-whitespace in the 'git am' command given above. Steve, if you could see what needs to be updated, that would be great. Using the patch command as suggested in the "Updating stale patches" section of http://dev.clojure.org/display/design/JIRA+workflow wasn't enough, so it should probably be carefully examined by hand to see what needs updating.

Comment by Steve Miner [ 14/Feb/13 12:21 PM ]

I removed my patches. Things have changes recently with the LispReader and new EdnReader.





[CLJ-955] java object reader constructor doesn't work Created: 18/Mar/12  Updated: 02/Aug/15  Resolved: 02/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.4
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Brent Millare Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: reader


 Description   

Here is a transcript:

;user=> clojure-version
{:major 1, :minor 4, :incremental 0, :qualifier "beta5"}
;user=> (.getProtocol #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
java.lang.RuntimeException: Can't embed object in code, maybe print-dup not defined: file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs, compiling:(null:2)
;user=> (.getProtocol (java.net.URL. "file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"))
"file"

Another transcript from google groups https://groups.google.com/forum/?fromgroups&hl=en#!topic/clojure/vlsFgVaKcSQ

user=> (def x #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
#'user/x
user=> x
#<URL file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs>
user=> (.getProtocol x)
"file"

user=> (.getProtocol #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
CompilerException java.lang.RuntimeException: Can't embed object in code, maybe print-dup not defined: file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs, compiling:(NO_SOURCE_PATH:5)
user=> (defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w (str o)))
#<MultiFn clojure.lang.MultiFn@2e694f12>

user=> (.getProtocol #java.net.URL["file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
ClassCastException clojure.lang.Symbol cannot be cast to java.net.URL user/eval11 (NO_SOURCE_FILE:7)
user=>

user=> (def x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"])
#'user/x
user=> (printf "(class x)=%s x='%s'\n" (class x) x)
(class x)=class java.net.URL x='file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs'
nil
user=> (let [x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"]]
(printf "(class x)=%s x='%s'\n" (class x) x))
CompilerException java.lang.RuntimeException: Can't embed object in code, maybe print-dup not defined: file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs, compiling:(NO_SOURCE_PATH:4)

user=> (defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w (str o)))
#<MultiFn clojure.lang.MultiFn@3362a63>
user=> (let [x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"]]
(printf "(class x)=%s x='%s'\n" (class x) x))
(class x)=class clojure.lang.Symbol x='file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs'
nil



 Comments   
Comment by Andy Fingerhut [ 19/Mar/12 2:58 AM ]

I've confirmed this behavior with 1.4.0 beta5. I've only tracked it down as far as finding the "Can't embed object in code" message, which is easy to find in Compiler.java in method emitValue. That exception is thrown because printString in RT.java throws an exception.

It isn't clear to me what should be done instead, though.

Comment by Fogus [ 19/Mar/12 9:03 AM ]

What would it mean to construct an arbitrary Java object for the purpose of embedding it in code in a generic way? You could say that it's just a matter of calling its constructor with the right args but very often in Java that is not enough to make an object considered "initialize". Sometimes there are init methods or putters or whatever that are required for object construction. So right there I hope it's clear that even though #some.Klass["foo"] provides a way to call an arbitrary constructor, it's in no way a guarantee that an instance is properly constructed. The reason that (for example) defrecords are embeddable anywhere is because we know for certain that the constructor creates a fully initialized instance. If you need to embed specific instances in your own code then you have two options:

  • Implement a print-dup for the class that guarantees a fully initialized object is built.
  • Use Clojure 1.4's tagged literal feature to do the same.

Quick point of note:

Your code

(defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w (str o)))

Doesn't do what you think it does. It spits out exactly file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs that Clojure reads in as a symbol. Something like the following might be more appropriate:

(defmethod print-dup java.net.URL [o, ^java.io.Writer w] (.write w "#java.net.URL") (.write w (str [ (str o) ])))

(let [x #java.net.URL["file:///home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs"]]                             
  (printf "(class x)=%s x='%s'\n" (class x) x) x)                                                                          

; (class x)=class java.net.URL x='file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs'

;=> #<URL file:/home/hara/dj/usr/src/clojurescript/src/cljs/cljs/core.cljs>

(.getProtocol *1)                        
;=> "file"
Comment by Brent Millare [ 19/Mar/12 10:09 AM ]

Fogus, can you please elaborate on using clojure 1.4's tagged literal features. While I understand that you can define data-reader functions, for example

(binding [*data-readers* {'user/f (fn [x] (java.io.File. (first x)))}] (read-string "#user/f [\"hello\"]")) ;=> #<File hello>

however, I feel this is only half a fix, compared with the first mentioned solution (involving print-dup), since clojure's tagged literals only are important for reading, not for printing. Does using tagged literals, so that (read-string (pr-str (read-string ...) works, imply that you must also define print methods per class? If this is true, it seems problematic since if different code wants to define different print methods, this will conflict since defining print-dup methods is global. Is there a good solution for printing objects depending on the context? As an alternative solution, I propose making the default print-method of all objects that didn't already have a printed representation to be a tagged literal, this way, users can customize what it means to read it. (See https://groups.google.com/forum/?fromgroups&hl=en#!topic/clojure/GdT5cO6JoSQ )

Comment by Fogus [ 19/Mar/12 10:18 AM ]

"Fogus, can you please elaborate on using clojure 1.4's tagged literal features."

I can, but it falls outside of the scope of this particular ticket. It might be better to take the broader conversation to the design page at http://dev.clojure.org/display/design/Tagged+Literals and the clojure-dev list.

Has the original motivation for this ticket been addressed?

Comment by Brent Millare [ 19/Mar/12 10:26 AM ]

I think you've explained the underlying problem, so yes. One possible caveat might be that it may be a concern that the current error message is not telling that the underlying problem lies in an improperly initialized object. (or maybe it is, and that's the error message I should expect).





[CLJ-878] DispatchReader always calls CtorReader when no dispatch macro found Created: 15/Nov/11  Updated: 02/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Greg Chapman Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: reader
Environment:

Windows 7, Java 1.7



 Description   

At the REPL, I accidentally typed:

user=> # "\w+"
RuntimeException Unsupported escape character: \w clojure.lang.Util.runtimeExce
ption (Util.java:156)
#<core$PLUS clojure.core$PLUS@6b7dc78>

You can see the confusing result (the REPL is also left in an unclosed string). After looking at LispReader.java, it seems to me DispatchReader ought to at least check for whitespace before calling CtorReader (perhaps better would be to check for a valid symbol character).



 Comments   
Comment by Nicola Mometto [ 25/Mar/15 5:33 PM ]

No longer reproducible

Comment by Alex Miller [ 25/Mar/15 9:33 PM ]

Seems at least partially reproducible to me. I see same behavior on first example. Second example works with a symbol {{# user.X[5]}} but fails with a reasonable error on the given case.

Comment by Nicola Mometto [ 25/Mar/15 9:56 PM ]

Ah, right.
The only way to fix the first error is to make `# "\w+"` a valid regex literal, that is, allowing whitespaces between # and the next dispatch char.
Is this something we want?

Comment by Alex Miller [ 25/Mar/15 10:06 PM ]

dunno

Comment by Alex Miller [ 02/Aug/15 7:53 PM ]

Removed 2nd (no longer present) example in description.





[CLJ-379] problem with classloader when run as windows service Created: 13/Jun/10  Updated: 02/Aug/15  Resolved: 02/Aug/15

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Anonymous Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: None


 Description   

I found following error when I run clojure application as MS Windows service (via procrun from Apache Daemon project). When I tried to do 'require' during run-time, I got NullPointerException. This happened as baseLoader function from RT class returned null in such environment (the value of Thread.currentThread().getContextClassLoader()). (Although my app works fine when I run my application as standalone program, not as service).
This error was fixed by explicit setting of class loader with following code:

(.setContextClassLoader (Thread/currentThread) (java.lang.ClassLoader/getSystemClassLoader))

before any call to 'require'....

May be you need to modify 'baseLoader' function, so it will check is value of Thread.currentThread().getContextClassLoader() is null or not, and if null, then return value of java.lang.ClassLoader.getSystemClassLoader() ?



 Comments   
Comment by Assembla Importer [ 24/Aug/10 10:40 AM ]

Converted from http://www.assembla.com/spaces/clojure/tickets/379
Attachments:
ticket-379-fix.diff - https://www.assembla.com/spaces/clojure/documents/c5XWHcD4yr34HveJe5ccaP/download/c5XWHcD4yr34HveJe5ccaP

Comment by Assembla Importer [ 24/Aug/10 10:40 AM ]

alexott said: possible fix is attached

Comment by Assembla Importer [ 24/Aug/10 10:40 AM ]

alexott said: [file:c5XWHcD4yr34HveJe5ccaP]

Comment by Alex Miller [ 02/Aug/15 7:42 PM ]

Too old, too few details to repro, too many changes since logging to properly evaluate.





[CLJ-1364] Primitive VecSeq does not implement equals or hashing methods Created: 19/Feb/14  Updated: 02/Aug/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: collections

Approval: Triaged

 Description   

VecSeq (as produced by (seq (vector-of :int 1 2 3))) does not implements equals, hashCode, or hasheq and does not play with any other Clojure collections or sequences appropriately in this regard.

user=> (def rs (range 3))
user=> (def vs (seq (vector-of :int 0 1 2)))
user=> rs
(0 1 2)
user=> vs
(0 1 2)
user=> (.equals rs vs)
true
user=> (.equals vs rs)    ;; expect: true
false
user=> (.equiv rs vs)
true
user=> (.equiv vs rs)
true
user=> (.hashCode rs)
29824
user=> (.hashCode vs)     ;; expect to match (.hashCode rs)
2081327893
user=> (System/identityHashCode vs)  ;; show that we're just getting Object hashCode
2081327893
user=> (.hasheq rs)
29824
user=> (.hasheq vs)       ;; expect same as (.hasheq rs) but not implemented at all
IllegalArgumentException No matching field found: hasheq for class clojure.core.VecSeq  clojure.lang.Reflector.getInstanceField (Reflector.java:271)





[CLJ-1517] Unrolled small vectors Created: 01/Sep/14  Updated: 31/Jul/15

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.7
Fix Version/s: Backlog

Type: Enhancement Priority: Critical
Reporter: Zach Tellman Assignee: Unassigned
Resolution: Unresolved Votes: 21
Labels: collections, performance

Attachments: File unrolled-collections-2.diff     File unrolled-collections.diff     Text File unrolled-vector-2.patch     Text File unrolled-vector.patch    
Patch: Code
Approval: Incomplete

 Description   

As discussed on the mailing list [1], this patch has two unrolled variants of vectors and maps, with special inner classes for each cardinality. Currently both grow to six elements before spilling over into the general versions of the data structures, which is based on rough testing but can be easily changed. At Rich's request, I haven't included any integration into the rest of the code, and there are top-level static create() methods for each.

The sole reason for this patch is performance, both in terms of creating data structures and performing operations on them. This can be seen as a more verbose version of the trick currently played with PersistentArrayMap spilling over into PersistentHashMap. Based on the benchmarks, which can be run by cloning cambrian-collections [2] and running 'lein test :benchmark', this should supplant PersistentArrayMap. Performance is at least on par with PAM, and often much faster. Especially noteworthy is the creation time, which is 5x faster for maps of all sizes (lein test :only cambrian-collections.map-test/benchmark-construction), and on par for 3-vectors, but 20x faster for 5-vectors. There are similar benefits for hash and equality calculations, as well as calls to reduce().

This is a big patch (over 5k lines), and will be kind of a pain to review. My assumption of correctness is based on the use of collection-check, and the fact that the underlying approach is very simple. I'm happy to provide a high-level description of the approach taken, though, if that will help the review process.

I'm hoping to get this into 1.7, so please let me know if there's anything I can do to help accomplish that.

[1] https://groups.google.com/forum/#!topic/clojure-dev/pDhYoELjrcs
[2] https://github.com/ztellman/cambrian-collections

Patch: unrolled-vector-2.patch

Screener Notes: The approach is clear and understandable. Given the volume of generated code, I believe that best way to improve confidence in this code is to get people using it asap, and add collection-test [3] to the Clojure test suite. I would also like to get the generator [4] included in the Clojure repo. We don't need to necessarily automate running it, but would be nice to have it nearby if we want to tweak something later.

[3] https://github.com/ztellman/collection-check/blob/master/src/collection_check.clj
[4] https://github.com/ztellman/cambrian-collections/blob/master/generate/cambrian_collections/vector.clj



 Comments   
Comment by Zach Tellman [ 01/Sep/14 10:13 PM ]

Oh, I forgot to mention that I didn't make a PersistentUnrolledSet, since the existing wrappers can use the unrolled map implementation. However, it would be moderately faster and more memory efficient to have one, so let me know if it seems worthwhile.

Comment by Nicola Mometto [ 02/Sep/14 5:23 AM ]

Zach, the patch you added isn't in the correct format, they need to be created using `git format-patch`

Comment by Nicola Mometto [ 02/Sep/14 5:31 AM ]

Also, I'm not sure if this is on-scope with the ticket but those patches break with *print-dup*, as it expects a static create(x) method for each inner class.

I'd suggest adding a create(Map x) static method for the inner PersistentUnrolledMap classes and a create(ISeq x) one for the inner PersistentUnrolledVector classes

Comment by Alex Miller [ 02/Sep/14 8:14 AM ]

Re making patches, see: http://dev.clojure.org/display/community/Developing+Patches

Comment by Jozef Wagner [ 02/Sep/14 9:16 AM ]

I wonder what is the overhead of having meta and 2 hash fields in the class. Have you considered a version where the hash is computed on the fly and where you have two sets of collections, one with meta field and one without, using former when the actual metadata is attached to the collection?

Comment by Zach Tellman [ 02/Sep/14 12:13 PM ]

I've attached a patch using the proper method. Somehow I missed the detailed explanation for how to do this, sorry. I know the guidelines say not to delete previous patches, but since the first one isn't useful I've deleted it to minimize confusion.

I did the print-dup friendly create methods, and then realized that once these are properly integrated, 'pr' will just emit these as vectors. I'm fairly sure the create methods aren't necessary, so I've commented them out, but I'm happy to add them back in if they're useful for some reason I can't see.

I haven't given a lot of thought to memory efficiency, but I think caching the hashes are worthwhile. I can see an argument for creating a "with-meta" version of each collection, but since that would double the size of an already enormous patch, I think that should probably wait.

Comment by Zach Tellman [ 03/Sep/14 4:31 PM ]

I found a bug! Like PersistentArrayMap, I have a special code path for comparing keywords, but my generators for collection-check were previously using only integer keys. There was an off-by-one error in the transient map implementation [1], which was not present for non-keyword lookups.

I've taken a close look for other gaps in my test coverage, and can't find any. I don't think this substantively changes the risk of this patch (an updated version of which has been uploaded as 'unrolled-collections-2.diff'), but obviously where there's one bug, there may be others.

[1] https://github.com/ztellman/cambrian-collections/commit/eb7dfe6d12e6774512dbab22a148202052442c6d#diff-4bf78dbf5b453f84ed59795a3bffe5fcR559

Comment by Zach Tellman [ 03/Oct/14 2:34 PM ]

As an additional data point, I swapped out the data structures in the Cheshire JSON library. On the "no keyword-fn decode" benchmark, the current implementation takes 6us, with the unrolled data structures takes 4us, and with no data structures (just lexing the JSON via Jackson) takes 2us. Other benchmarks had similar results. So at least in this scenario, it halves the overhead.

Benchmarks can be run by cloning https://github.com/dakrone/cheshire, unrolled collections can be tested by using the 'unrolled-collections' branch. The pure lexing benchmark can be reproduced by messing around with the cheshire.parse namespace a bit.

Comment by Zach Tellman [ 06/Oct/14 1:31 PM ]

Is there no way to get this into 1.7? It's an awfully big win to push off for another year.

Comment by Alex Miller [ 07/Oct/14 2:08 PM ]

Hey Zach, it's definitely considered important but we have decided to drop almost everything not fully done for 1.7. Timeframe for following release is unknown, but certainly expected to be significantly less than a year.

Comment by John Szakmeister [ 30/Oct/14 2:53 PM ]

You are all free to determine the time table, but I thought I'd point out that Zach is not entirely off-base. Clojure 1.4.0 was released April 5th, 2012. Clojure 1.5.0 was released March 1st, 2013 with 1.6.0 showing up March 25th, 2014. So it appears that the current cadence is around a year.

Comment by Alex Miller [ 30/Oct/14 3:40 PM ]

John, there is no point to comments like this. Let's please keep issue comments focused on the issue.

Comment by Zach Tellman [ 13/Nov/14 12:23 PM ]

I did a small write-up on this patch which should help in the eventual code review: http://blog.factual.com/using-clojure-to-generate-java-to-reimplement-clojure

Comment by Zach Tellman [ 07/Dec/14 10:34 PM ]

Per my conversation with Alex at the Conj, here's a patch that only contains the unrolled vectors, and uses the more efficient constructor for PersistentVector when spilling over.

Comment by Alex Miller [ 08/Dec/14 1:10 PM ]

Zach, I created a new placeholder for the map work at http://dev.clojure.org/jira/browse/CLJ-1610.

Comment by Jean Niklas L'orange [ 09/Dec/14 1:52 PM ]

It should probably be noted that core.rrb-vector will break for small vectors by this patch, as it peeks into the underlying structure. This will also break other libraries which peeks into the vector implementation internals, although I'm not aware of any other – certainly not any other contrib library.

Also, two comments on unrolled-vector.patch:

private transient boolean edit = true;
in the Transient class should probably be
private volatile boolean edit = true;
as transient means something entirely different in Java.

conj in the Transient implementation could invalidate itself without any problems (edit = false;) if it is converted into a TransientVector (i.e. spills over) – unless it has a notable overhead. The invalidation can prevent some subtle bugs related to erroneous transient usage.

Comment by Alex Miller [ 09/Dec/14 1:58 PM ]

Jean - understanding the scope of the impact will certainly be part of the integration process for this patch. I appreciate the heads-up. While we try to minimize breakage for things like this, it may be unavoidable for libraries that rely on implementation internals.

Comment by Michał Marczyk [ 09/Dec/14 2:03 PM ]

I'll add support for unrolled vectors to core.rrb-vector the moment they land on master. (Probably with some conditional compilation so as not to break compatibility with earlier versions of Clojure – we'll see when the time comes.)

Comment by Michał Marczyk [ 09/Dec/14 2:06 PM ]

I should say that it'd be possible to add generic support for any "vector lookalikes" by pouring them into regular vectors in linear time. At first glance it seems to me that that'd be out of line with the basic promise of the library, but I'll give it some more thought before the changes actually land.

Comment by Zach Tellman [ 09/Dec/14 5:43 PM ]

Somewhat predictably, the day after I cut the previous patch, someone found an issue [1]. In short, my use of the ArrayChunk wrapper applied the offset twice.

This was not caught by collection-check, which has been updated to catch this particular failure. It was, however, uncovered by Michael Blume's attempts to merge the change into Clojure, which tripped a bunch of alarms in Clojure's test suite. My own attempt to do the same to "prove" that it worked was before I added in the chunked seq functionality, hence this issue persisting until now.

As always, there may be more issues lurking. I hope we can get as many eyeballs on the code between now and 1.8 as possible.

[1] https://github.com/ztellman/cambrian-collections/commit/2e70bbd14640b312db77590d8224e6ed0f535b43
[2] https://github.com/MichaelBlume/clojure/tree/test-vector

Comment by Zach Tellman [ 10/Jul/15 1:54 PM ]

As a companion to the performance analysis in the unrolled map issue, I've run the benchmarks and posted the results at https://gist.github.com/ztellman/10e8959501fb666dc35e. Some notable results:

Comment by Alex Miller [ 13/Jul/15 9:02 AM ]

Stu: I do not think this patch should be marked "screened" until the actual integration and build work (if the generator is integrated) has been completed.

Comment by Alex Miller [ 14/Jul/15 4:33 PM ]

FYI, we have "reset" all big features for 1.8 for the moment (except the socket repl work). We may still include it - that determination will be made later.

Comment by Zach Tellman [ 14/Jul/15 4:43 PM ]

Okay, any idea when the determination will be made? I was excited that we seemed to be finally moving forward on this.

Comment by Alex Miller [ 14/Jul/15 4:51 PM ]

No, but it is now on my work list.

Comment by Rich Hickey [ 15/Jul/15 8:17 AM ]

I wonder if all of the overriding of APersistentVector yields important benefits - e.g. iterator, hashcode etc.

Comment by Zach Tellman [ 15/Jul/15 11:51 AM ]

In the case of hashcode, definitely: https://gist.github.com/ztellman/10e8959501fb666dc35e#file-gistfile1-txt-L1013-L1076. This was actually one of the original things I wanted to speed up.

In the case of the iterator, probably not. I'd be fine removing that.

Comment by Zach Tellman [ 16/Jul/15 5:17 PM ]

So am I to infer from https://github.com/clojure/clojure/commit/36d665793b43f62cfd22354aced4c6892088abd6 that this issue is defunct? If so, I think there's a lot of improvements being left on the table for no particular reason.

Comment by Rich Hickey [ 16/Jul/15 6:34 PM ]

Yes, that commit covers this functionality. It takes a different approach from the patch in building up from a small core, and maximizing improvements to the bases rather than having a lot of redundant definitions per class. That also allowed for immediate integration without as much concern for correctness, as there is little new code. It also emphasizes the use case for tuples, e.g. small vectors used as values that won't be changed, thus de-emphasizing the 'mutable' functions. I disagree that many necessary improvements are being left out. The patch 'optimized' many things that don't matter. Further, there are not big improvements to the pervasive inlining. In addition, the commit includes the integration work at a fraction of the size of the patch. In all, it would have taken much more back and forth to get the patch to conform with this approach than to just get it all done, but I appreciate the inspiration and instigation - thanks!

Comment by Rich Hickey [ 16/Jul/15 6:46 PM ]

That said, this commit need not be the last word - it can serve as a baseline for further optimization. But I'd rather that be driven by need. Clojure could become 10x as large optimizing things that don't matter.

Comment by Zach Tellman [ 19/Jul/15 1:36 PM ]

What is our reference for "relevant" performance? I (or anyone else) can provide microbenchmarks for calculating hashes or whatever else, but that doesn't prove that it's an important improvement. I previously provided benchmarks for JSON decoding in Cheshire, but that's just one of many possible real-world benchmarks. It might be useful to have an agreed-upon list of benchmarks that we can use when debating what is and isn't useful.

Comment by Mike Anderson [ 19/Jul/15 11:14 PM ]

I was interested in this implementation so created a branch that integrates Zach's unrolled vectors on top of clojure 1.8.0-alpha2. I also simplified some of the code (I don't think the metadata handling or unrolled seqs are worthwhile, for example)

Github branch: https://github.com/mikera/clojure/tree/clj-1517

Then I ran a set of micro-benchmarks created by Peter Taoussanis

Results: https://gist.github.com/mikera/72a739c84dd52fa3b6d6

My findings from this testing:

  • Performance is comparable (within +/- 20%) on the majority of tests
  • Zach's approach is noticeably faster (by 70-150%) for 4 operations (reduce, mapv, into, equality)

My view is that these additional optimisations are worthwhile. In particular, I think that reduce and into are very important operations. I also care about mapv quite a lot for core.matrix (It's fundamental to many numerical operations on arrays implemented using Clojure vectors).

Happy to create a patch along these lines if it would be acceptable.

Comment by Zach Tellman [ 19/Jul/15 11:45 PM ]

The `reduce` improvements are likely due to the unrolled reduce and kvreduce impls, but the others are probably because of the unrolled transient implementation. The extra code required to add these would be pretty modest.

Comment by Mike Anderson [ 20/Jul/15 9:20 PM ]

I actually condensed the code down to a single implementation for `Transient` and `TupleSeq`. I don't think these really need to be fully unrolled for each Tuple type. That helps by making the code even smaller (and probably also helps performance, given JVM inline caching etc.)

Comment by Peter Taoussanis [ 21/Jul/15 11:46 AM ]

Hey folks,

Silly question: is there actually a particular set of reference benchmarks that everyone's currently using to test the work on tuples? It took me a while to notice how bad the variance was with my own set of micro benchmarks.

Bumping all the run counts up till the noise starts ~dying down, I'm actually seeing numbers now that don't seem to agree with others here .

Google Docs link: https://docs.google.com/spreadsheets/d/1QHY3lehVF-aKrlOwDQfyDO5SLkGeb_uaj85NZ7tnuL0/edit?usp=sharing
gist with the benchmarks: https://gist.github.com/ptaoussanis/0a294809bc9075b6b02d

Thanks, cheers!

Comment by Zach Tellman [ 21/Jul/15 6:52 PM ]

Hey Peter, I can't reproduce your results, and some of them are so far off what I'd expect that I have to think there was some data gathering error. For instance, the assoc operation being slower is kind of inexplicable, considering the unrolled version doesn't do any copying, etc. Also, all of your numbers are significantly slower than the ones on my 4 year old laptop, which is also a bit strange.

Just to make sure that we're comparing apples to apples, I've adapted your benchmarks into something that pretty-prints the mean runtime and variance for 1.7, 1.8-alpha2, and Mike's 1517 fork. It can be found at https://github.com/ztellman/tuple-benchmark, and the results of a run at https://gist.github.com/ztellman/3701d965228fb9eda084.

Comment by Mike Anderson [ 22/Jul/15 2:24 AM ]

Hey Zach just looked at your benchmarks and they are definitely more consistent with what I am seeing. The overall nanosecond timings look about right from my experience with similar code (e.g. working with small vectors in vectorz-clj).

Comment by Peter Taoussanis [ 22/Jul/15 2:41 AM ]

Hi Zach, thanks for that!

Have updated the results -
Gist: https://gist.github.com/ptaoussanis/0a294809bc9075b6b02d
Google docs: https://goo.gl/khgT83

Note that I've added an extra sheet/tab to the Google doc for your own numbers at https://gist.github.com/ztellman/3701d965228fb9eda084.

Am still struggling to produce results that show any consistent+significant post-JIT benefit to either of the tuple implementations against the micro benchmarks and one larger small-vec-heavy system I had handy.

It's looking to me like it's maybe possible that the JIT's actually optimising away most of the non-tuple inefficiencies in practice?

Of course it's very possible that my results are off, or my expectations wrong. The numbers have been difficult to pin down.

It definitely helped to have a standardised reference micro benchmark to work against (https://github.com/ztellman/tuple-benchmark). Could I perhaps suggest a similar reference macro benchmark (maybe something from core.matrix, Mike?)

Might also be a good idea to define a worthwhile target performance delta for ops like these that run in the nanosecond range (or for the larger reference macro benchmark)?

Just some thoughts from someone passing through in case they're at all useful; know many of you have been deeply involved in this for some time so please feel free to ignore any input that's not helpful

Appreciate all the efforts, cheers!

Comment by Rich Hickey [ 22/Jul/15 9:24 AM ]

I think everyone should back off on their enthusiasm for this approach. After much analysis, I am seeing definite negative impacts to tuples, especially the multiple class approach proposed by Zach. What happens in real code is that the many tuple classes cause call sites that see different sized vectors to become megamorphic, and nothing gets adequately optimized. In particular, call sites that will see tuple-sized and large vectors (i.e. a lot of library code) will optimize differently depending upon which they see often first. So, if you run your first tight loop on vector code that sees tuples, that code could later do much worse (50-100%) on large vectors than before the tuples patch was in place. Being much slower on large collections is a poor tradeoff for being slightly faster on small ones.

Certainly different tuple classes for arity 0-6 is a dead idea. You get as good or better optimization (at some cost in size) from a single class e.g. one with four fields, covering sizes 0-4. I have a working version of this in a local branch. It is better in that sites that include pvectors are only bi-morphic, but I am still somewhat skittish given what I've seen.

The other takeaway is that the micro benchmarks are nearly worthless for detecting these issues.

Comment by Zach Tellman