There is a PermGen memory leak that we have tracked down to protocol methods and multimethods called inside an eval, because of the caches these methods use. The problem only arises when the value being cached is an instance of a class (such as a function or a reify) that was defined inside the eval. Thus extending a protocol to IFn or dispatching a multimethod on an IFn are likely triggers.
- multifn_weak_method_cache.diff - a WeakReference solution
- naive-lru-for-multimethods-and-protocols.diff - an LRU cache solution
Reproducing: The easiest way that I have found to test this is to set "-XX:MaxPermSize" to a reasonable value so you don't have to wait too long for the PermGen spaaaaace to fill up, and to use "-XX:+TraceClassLoading" and "-XX:+TraceClassUnloading" to see the classes being loaded and unloaded.
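Since the session is started through lein, one way to pass these flags is the :jvm-opts key in project.clj. This is only a sketch: the project name, the Clojure version, and the 64m cap are assumptions chosen for illustration, not values from this report.

```clojure
;; project.clj
;; Hypothetical project name and version; 64m is an arbitrary small cap
;; so that PermGen fills up quickly.
(defproject permgen-leak-repro "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]] ; assumed version
  :jvm-opts ["-XX:MaxPermSize=64m"
             "-XX:+TraceClassLoading"
             "-XX:+TraceClassUnloading"])
```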
You can use lein swank 45678 and connect with slime in emacs via M-x slime-connect.
To monitor the PermGen usage, you can find the Java process to watch with "jps -lmvV" and then run "jstat -gcold <pid> 1s", where <pid> is the process id reported by jps. According to the jstat docs, the first column (PC) is the "Current permanent space capacity (KB)" and the second column (PU) is the "Permanent space utilization (KB)". VisualVM is also a nice tool for monitoring this.
Evaluating the following code will run a loop that evals (take* (fn foo [])).
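A minimal sketch of such a loop, assuming take* dispatches on the type of its argument and has a method for clojure.lang.Fn. Only take* and foo come from this report; run-loop and the iteration count are hypothetical, and the loop is bounded here so it can be stopped easily.

```clojure
;; Sketch only: dispatch on the argument's class, so each new foo class
;; becomes a new dispatch value.
(defmulti take* (fn [a] (type a)))
(defmethod take* clojure.lang.Fn [a] '())

(defn run-loop
  "Evals (take* (fn foo [])) n times. Each eval compiles a fresh class
  for foo and then dispatches take* on an instance of it."
  [n]
  (dotimes [_ n]
    ;; syntax-quote qualifies take* so the form resolves in any namespace
    (eval `(take* (fn ~'foo [])))))

;; At the REPL, run it off the main thread with a large count, e.g.
;; (future (run-loop 1000000))
```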
In the lein swank session, you will see many lines like below listing the classes being created and loaded.
These lines will stop once the PermGen space fills up.
In the jstat monitoring, you'll see the amount of used PermGen space (PU) increase to the max and stay there.
A workaround is to run prefer-method before the PermGen space is all used up, e.g.
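A sketch of the call, assuming take* has a method for clojure.lang.Fn as described above. The particular pair of dispatch values is an arbitrary choice; any valid preference should have the same effect on the cache.

```clojure
;; Assumed definitions, repeated so the snippet stands alone.
(defmulti take* (fn [a] (type a)))
(defmethod take* clojure.lang.Fn [a] '())

;; Declaring a preference goes through MultiFn.preferMethod, which
;; resets the method cache as a side effect.
(prefer-method take* clojure.lang.Fn java.lang.Object)
```

Dispatch is otherwise unchanged by this preference, because take* has no method for java.lang.Object.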
Then, when the used PermGen space is close to the max, in the lein swank session, you will see the classes created by the eval'ing being unloaded.
In the jstat monitoring, there will be a long pause when used PermGen space stays close to the max, and then it will drop down, and start increasing again when more eval'ing occurs.
The defmulti defines a cache that uses the dispatch values as keys. Each eval call in the loop defines a new foo class which is then added to the cache when take* is called, preventing the class from ever being GCed.
The prefer-method workaround works because it calls clojure.lang.MultiFn.preferMethod, which calls the private MultiFn.resetCache method, which completely empties the cache.
The leak with protocol methods similarly involves a cache. You see essentially the same behavior as the multimethod leak if you run the following code using protocols.
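A minimal sketch of a protocol version, assuming a protocol with a single take* method extended to clojure.lang.Fn. The protocol name ITake, run-loop, and the iteration count are hypothetical.

```clojure
;; Sketch only: ITake is an assumed name for the protocol.
(defprotocol ITake
  (take* [a]))

(extend-type clojure.lang.Fn
  ITake
  (take* [this] '()))

(defn run-loop
  "Evals (take* (fn foo [])) n times. Each eval compiles a fresh class
  for foo and then calls the protocol method on an instance of it."
  [n]
  (dotimes [_ n]
    (eval `(take* (fn ~'foo [])))))

;; At the REPL, e.g. (future (run-loop 1000000))
```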
Again, the cache is in the take* method itself, using each new foo class as a key.
Workaround: run -reset-methods on the protocol before the PermGen space is all used up, e.g.
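A sketch of the call, assuming the protocol is named ITake (a hypothetical name) and is extended to clojure.lang.Fn.

```clojure
;; Assumed definitions, repeated so the snippet stands alone.
(defprotocol ITake
  (take* [a]))

(extend-type clojure.lang.Fn
  ITake
  (take* [this] '()))

;; Rebuilds each protocol method with a fresh, empty cache.
(-reset-methods ITake)
```

Existing implementations are kept, so take* still works on functions after the reset.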
This works because -reset-methods replaces the cache with an empty MethodImplCache.