Affects Version/s: Release 1.5
Fix Version/s: None
Description of one example with poor performance discovered by Michał Marczyk in the discussion thread linked below.
The difference between the compiled versions of:
is quite significant. foo gets compiled to a single class, with invocations handled by a single invoke method; bar gets compiled to a class for bar + an extra class for an inner function which handles the (locking o (dec x)) part – probably very similar to the output for the version with the hand-coded locking-part (although I haven't really looked at that yet). The inner function is a closure, so calling it involves an allocation of a closure object; its ctor receives the closed-over locals as arguments and stores them in two fields (lockee and x). Then they get loaded from the fields in the body of the closure's invoke method etc.
Note: The summary line may be too narrow a description of the root cause, and simply the first example of a case where this issue was noticed and examined. Please make the summary and this description more accurate if you diagnose this issue.
See discussion thread on Clojure group here: https://groups.google.com/forum/#!topic/clojure/x86VygZYf4Y