<< Back to previous view

[CLJ-1793] Reducer instances hold onto the head of seqs Created: 05/Aug/15  Updated: 21/Sep/16

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: Release 1.8
Fix Version/s: Release 1.9

Type: Defect Priority: Critical
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Unresolved Votes: 3
Labels: compiler
Environment:

1.8.0-alpha2 - 1.8.0-alpha4


Attachments: Text File 0001-Clear-this-before-calls-in-tail-position.patch     Text File clj-1793-2.patch     Text File clj-1793-3.patch    
Patch: Code and Test
Approval: Screened

 Description   

(This ticket started life as CLJ-1250, was committed in 1.8.0-alpha2, pulled out after alpha4, and this is the new version that fixes the logic about whether in a tail call as well as addresses direct linking added in 1.8.0-alpha3.)

Problem: Original example was with reducers holding onto the head of a lazy seq:

(time (reduce + 0 (map identity (range 1e8))))    ;; works
(time (reduce + 0 (r/map identity (range 1e8))))  ;; oome from holding head of range

Trickier example from CLJ-1250 that doesn't clear `this` in nested loop:

(let [done (atom false)
      f (future-call
          (fn inner []
            (while (not @done)
              (loop [found []]
                (println (conj found 1))))))]
  (doseq [elem [:a :b :c :done]]
    (println "queue write " elem))
  (reset! done true)
  @f)

Problem: #'reducer closes over a collection in order to reify CollReduce, and the closed-over collection reference is never cleared. When code attempts to reduce over this anonymous transformed collection, it will realize the tail while the head is stored in the closed-over.

Approach: When invoking a method in a tail call, clear 'this' prior to invoking.

The criteria for when a tail call is a safe point to clear 'this':

1) Must be in return position
2) Not in a try block (might need 'this' during catch/finally)
3) Not direct linked

Return position (#1) isn't simply (context == C.RETURN) because loop bodies are always parsed in C.RETURN context

A new dynvar METHOD_RETURN_CONTEXT tracks whether an InvokeExpr in tail position can directly leave the body of the compiled java method. It is set to RT.T in the outermost parsing of a method body and invalidated (set to null) when a loop body is being parsed where the context for the loop expression is not RETURN parsed. Added clear in StaticInvokeExpr as that is now a thing with direct linking again.

Removes calls to emitClearLocals(), which were a no-op.

Patch: clj-1793-3.patch

Screened by: Alex Miller



 Comments   
Comment by Alex Miller [ 05/Aug/15 12:16 PM ]

The this ref is cleared prior to the println, but the next time through the while loop it needs the this ref to look up the closed over done field (via getfield).

Adding an additional check to the inTailCall() method to not include tail call in a loop addresses this case:

static boolean inTailCall(C context) {
-    return (context == C.RETURN) && (IN_TRY_BLOCK.deref() == null);
+    return (context == C.RETURN) && (IN_TRY_BLOCK.deref() == null) && (LOOP_LOCALS.deref() == null);
}

But want to check some more things before concluding that's all that's needed.

Comment by Alex Miller [ 05/Aug/15 1:36 PM ]

This change undoes the desired behavior in the original CLJ-1250 (new tests don't pass). For now, we are reverting the CLJ-1250 patch in master.

Comment by Ghadi Shayban [ 05/Aug/15 3:12 PM ]

Loop exit edges are erroneously being identified as places to clear 'this'. Only exits in the function itself or the outermost loop are safe places to clear.

Comment by Ghadi Shayban [ 05/Aug/15 8:43 PM ]

Patch addresses this bug and the regression in CLJ-1250.

See the commit message for an extensive-ish comment.

Comment by Alex Miller [ 18/Aug/15 12:33 PM ]

New patch is same as old, just adds jira id to beginning of commit message.

Comment by Rich Hickey [ 24/Aug/15 10:00 AM ]

Not doing this for 1.8, more thought needs to go into whether this is the right solution to the problem. And, what is the problem? This title of this patch is just something to do.

Comment by Alex Miller [ 24/Aug/15 10:21 AM ]

changing to vetted so this is at a valid place in the jira workflow

Comment by Ghadi Shayban [ 24/Aug/15 10:45 AM ]

Rich the original context is in CLJ-1250 which was a defect/problem. It was merged and revert because of a problem in the impl. This ticket is the continuation of the previous one, but unfortunately the title lost the context and became approach-oriented and not problem-oriented. Blame Alex. (I kid, it's an artifact of the mutable approach to issue management.)

Comment by Nicola Mometto [ 23/Mar/16 7:34 AM ]

Just a note that the original ticket for this issue had 10 votes

Comment by Nicola Mometto [ 30/Mar/16 8:50 AM ]

The following code currently eventually causes an OOM to happen, the patch in this ticket correctly helps not holding onto the collection and doesn't cause memory to run infinitely

Before patch:

user=> (defn range* [x] (cons x (lazy-seq (range* (inc x)))))
#'user/range*
user=> (reduce + 0 (eduction (range* 0)))
OutOfMemoryError Java heap space  clojure.lang.RT.cons (RT.java:660)

After patch:

user=> (defn range* [x] (cons x (lazy-seq (range* (inc x)))))
#'user/range*
user=> (reduce + 0 (eduction (range* 0)))
;; runs infinitely without causing OOM
Comment by Alex Miller [ 08/Sep/16 5:14 PM ]

Refreshed patch to apply to master. No semantic changes, attribution retained.





Generated at Sat Oct 01 12:16:30 CDT 2016 using JIRA 4.4#649-r158309.