Clojure

Clojure can leak memory when used in a servlet container

Details

  • Type: Defect Defect
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: Release 1.6
  • Component/s: None
  • Labels:
    None
  • Patch:
    Code
  • Approval:
    Incomplete

Description

When used within a servlet container
(Jetty/Tomcat/JBossAS/Immutant/etc), the thread locals Var.dvals (used
to store dynamic bindings) and LockingTransaction.transaction (used to
store the currently active transaction(s)) prevent all of the classes
loaded by an application's clojure runtime from being garbage collected,
resulting in a memory leak.

The issue comes from threads living beyond the lifetime of a
deployment - servlet containers use thread pools that are shared
across all applications within the container. Currently, the dvals and
transaction thread locals are not discarded when they are no longer
needed, causing their contents to retain a hard reference to their
classloaders, which, in turn, causes all of the classes loaded under
the application's classloader to be retained until the thread exits
(which is generally at JVM shutdown).

I've attached a patch that does the following:

  • Var.dvals is now removed when the thread bindings are popped
  • Var.dvals no longer has an initialValue, so checking to see if it is
    set will no longer set it to an empty Frame
  • The outer transaction in LockingTransaction.transaction now removes
    the thread local when it is finished

There is still the opportunity for memory leaks if agents or futures
are used, and the executors used for them are not shutdown when the
app is undeployed. That's a solvable problem, but should probably be
solved by the containers themselves (and/or the war generation tools)
instead of in clojure itself.

This patch has a small performance impact: its use of a try/finally
around running transactions to remove the outer transaction adds
4-6 microseconds to each transaction call on my hardware.

Providing an automated test for this patch is difficult - I've tested
it locally with repeated deployments to a container while monitoring
GC and permgen. All of clojure's tests pass with it applied.

The above is a condensation of:
https://groups.google.com/d/topic/clojure-dev/3CXDe8_9G58/discussion

I'm happy to provide whatever feedback/work is needed to get this
applied.

Activity

Hide
Colin Jones added a comment -

This patch works great for me to avoid OOM/PermGen crashes from classloaders being retained [mine is a non-servlet use case].

Show
Colin Jones added a comment - This patch works great for me to avoid OOM/PermGen crashes from classloaders being retained [mine is a non-servlet use case].
Hide
Stuart Halloway added a comment -

Does Tomcat create warnings for Clojure, as described e.g. here?

If so, does this patch make the warnings go away?

Show
Stuart Halloway added a comment - Does Tomcat create warnings for Clojure, as described e.g. here? If so, does this patch make the warnings go away?
Hide
Toby Crawley added a comment -

Stu: that's a good question. I'll take a look at Tomcat this afternoon.

Show
Toby Crawley added a comment - Stu: that's a good question. I'll take a look at Tomcat this afternoon.
Hide
Stuart Halloway added a comment -

The code that calls transaction.remove() seems unncessarily subtle. There are two exits from the method, and only one is protected by the finally block.

If the "outer" case was a top-level if, the logic would be more clear, and only the "outer" case would need try/finally, which might reduce the performance penalty in the case of deeply nested dosyncs.

Did your transaction overhead of 4-6 microseconds test only one level of dosync, or many?

Show
Stuart Halloway added a comment - The code that calls transaction.remove() seems unncessarily subtle. There are two exits from the method, and only one is protected by the finally block. If the "outer" case was a top-level if, the logic would be more clear, and only the "outer" case would need try/finally, which might reduce the performance penalty in the case of deeply nested dosyncs. Did your transaction overhead of 4-6 microseconds test only one level of dosync, or many?
Hide
Stuart Halloway added a comment -

Because the unwind code calls remove at the top (as opposed to set(null)), the code should now be safe for use with Clojure-defined ThreadLocal subclasses.

Therefore, Var's use of an initialValue should be irrelevant to this patch, and it should be possible to fix this bug with a patch half the size of the current patch, touching only LockingTransaction.runInTransaction and Var.popThreadBindings.

Show
Stuart Halloway added a comment - Because the unwind code calls remove at the top (as opposed to set(null)), the code should now be safe for use with Clojure-defined ThreadLocal subclasses. Therefore, Var's use of an initialValue should be irrelevant to this patch, and it should be possible to fix this bug with a patch half the size of the current patch, touching only LockingTransaction.runInTransaction and Var.popThreadBindings.

People

Vote (8)
Watch (6)

Dates

  • Created:
    Updated: