Clojure

StackOverflowError on exception in reducef for PersistentHashMap fold

Details

  • Type: Defect Defect
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Completed
  • Affects Version/s: Release 1.5
  • Fix Version/s: Release 1.6
  • Component/s: None
  • Labels:
  • Environment:
    Clojure 1.5.0-alpha4, Sun Java 1.6.0_35, with [org.codehaus.jsr166-mirror/jsr166y "1.7.0"]
  • Patch:
    Code and Test
  • Approval:
    Ok

Description

If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem.

user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError   clojure.lang.Numbers.add (Numbers.java:126)

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt

Cause: The problem area in the code catches Exception and swallows it - this is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java#L444

Approach: Throw the Exception and allow it to propagate instead of swallowing it.

Patch: sneakyThrow-clj-1058.diff

Screened by: Alex Miller

Activity

Timothy Baldridge made changes -
Field Original Value New Value
Assignee Timothy Baldridge [ halgari ]
Hide
Timothy Baldridge added a comment -

Verified as a bug.

Show
Timothy Baldridge added a comment - Verified as a bug.
Timothy Baldridge made changes -
Approval Vetted [ 10003 ]
Priority Minor [ 4 ] Major [ 3 ]
Assignee Timothy Baldridge [ halgari ]
Rich Hickey made changes -
Fix Version/s Release 1.6 [ 10157 ]
Hide
Dave Sann added a comment -

Some comments here:

https://groups.google.com/d/msg/clojure-dev/kwJEw9YjeGc/VKxlixGVnMIJ

in short
//aargh

can be replaced by

throw new RuntimeException(e);

or maybe: Util.sneakyThrow(e);

which will surface the exception rather than swallow it.

I have questions about clean-up - in the thread - I am not very familiar with the fork join framework.

Show
Dave Sann added a comment - Some comments here: https://groups.google.com/d/msg/clojure-dev/kwJEw9YjeGc/VKxlixGVnMIJ in short //aargh can be replaced by throw new RuntimeException(e); or maybe: Util.sneakyThrow(e); which will surface the exception rather than swallow it. I have questions about clean-up - in the thread - I am not very familiar with the fork join framework.
Alex Miller made changes -
Labels reducers
Hide
Bruce Adams added a comment -

Don't ignore exceptions thrown by tasks

Show
Bruce Adams added a comment - Don't ignore exceptions thrown by tasks
Bruce Adams made changes -
Attachment patch-for-clj-1058.diff [ 12266 ]
Andy Fingerhut made changes -
Patch Code [ 10001 ]
Hide
Bruce Adams added a comment -

This change removes the problematic
try ... catch
and enhances "foldTasks" and "fold" method declarations by adding "throws Exception".

Show
Bruce Adams added a comment - This change removes the problematic try ... catch and enhances "foldTasks" and "fold" method declarations by adding "throws Exception".
Bruce Adams made changes -
Attachment declare-throws-clj-1058.diff [ 12269 ]
Hide
Andy Fingerhut added a comment -

Bruce, others can comment better than I can on this comment, but a large number of places in the Clojure implementation avoid the need for declaring checked exceptions by using Util.sneakyThrow. Search for that in the Java code for examples, and how it is implemented if you are curious. It might be that this is another place this should be done.

Show
Andy Fingerhut added a comment - Bruce, others can comment better than I can on this comment, but a large number of places in the Clojure implementation avoid the need for declaring checked exceptions by using Util.sneakyThrow. Search for that in the Java code for examples, and how it is implemented if you are curious. It might be that this is another place this should be done.
Hide
Bruce Adams added a comment -

I don't yet know the Clojure code base very well. Doing a quick count, there are slightly more methods with a declared "throws" clause then there are "sneakyThrows" calls. I am assuming (based on other Java experience) that sneakyThrows is only used when declaring a throws clause is problematic. In my proposed fix, throws declarations are needed only on four methods, all in the same module. The actual exceptions appear to be dealt with cleanly.

What I haven't yet figured out is how to write relevant tests for this. The clearly broken thing about the existing code is falling out of the if count==1 block into code that assumes count>1. Even just returning a null at least avoids the infinite recursion problem (and also swallows the exception, which seems wrong). So a test for once-and-only-once seems relevant. Also, a test for exception behavior seems called for.

(Also, I don't know protocol for assigning bugs. I'm tempted to hit the "Assign To Me" button in JIRA, but I'm not sure what it means in the larger scheme of things.)

Show
Bruce Adams added a comment - I don't yet know the Clojure code base very well. Doing a quick count, there are slightly more methods with a declared "throws" clause then there are "sneakyThrows" calls. I am assuming (based on other Java experience) that sneakyThrows is only used when declaring a throws clause is problematic. In my proposed fix, throws declarations are needed only on four methods, all in the same module. The actual exceptions appear to be dealt with cleanly. What I haven't yet figured out is how to write relevant tests for this. The clearly broken thing about the existing code is falling out of the if count==1 block into code that assumes count>1. Even just returning a null at least avoids the infinite recursion problem (and also swallows the exception, which seems wrong). So a test for once-and-only-once seems relevant. Also, a test for exception behavior seems called for. (Also, I don't know protocol for assigning bugs. I'm tempted to hit the "Assign To Me" button in JIRA, but I'm not sure what it means in the larger scheme of things.)
Hide
Alex Miller added a comment -

"Assign to Me" means that you are currently working on something for that ticket and others should probably check with you before investing time. From my perspective, anyone is welcome to assign to themselves with the caveat that someone else may still create a patch and/or interpret non-progress as an invitation to work on it.

Is this ticket in a state where it should be screened or do you feel it is incomplete? I am trying to judge whether I should invest time in screening it.

Show
Alex Miller added a comment - "Assign to Me" means that you are currently working on something for that ticket and others should probably check with you before investing time. From my perspective, anyone is welcome to assign to themselves with the caveat that someone else may still create a patch and/or interpret non-progress as an invitation to work on it. Is this ticket in a state where it should be screened or do you feel it is incomplete? I am trying to judge whether I should invest time in screening it.
Hide
Bruce Adams added a comment -

I believe that declare-throws-clj-1058.diff correctly deals with the problem. It certainly behaves better than code it replaces. I can imagine you screening, and then Rich accepting, it as is.

However, I think we will all be happier with tests demonstrating behavior. I've been struggling with writing tests, especially tests that cleanly fail on the old code, instead of hard crashing the test suite with a stack overflow.

Show
Bruce Adams added a comment - I believe that declare-throws-clj-1058.diff correctly deals with the problem. It certainly behaves better than code it replaces. I can imagine you screening, and then Rich accepting, it as is. However, I think we will all be happier with tests demonstrating behavior. I've been struggling with writing tests, especially tests that cleanly fail on the old code, instead of hard crashing the test suite with a stack overflow.
Bruce Adams made changes -
Description If reducef throws an exception, the PHM fold code can descend into an infinite loop, causing a stack overflow which masks the problem. This situation is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).

To reproduce:
{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
;; boom
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt
If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem. The problem area in the code is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).

To reproduce:
{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError clojure.lang.Numbers.add (Numbers.java:126)
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt
Bruce Adams made changes -
Assignee Bruce Adams [ bruceadams ]
Hide
Bruce Adams added a comment - - edited

The code fix in "patch-with-tests-clj-1058.diff" is identical to my earlier "declare-throws-clj-1058.diff" This new patch includes an additional commit adding a unit test.

This is ready for screening.

Show
Bruce Adams added a comment - - edited The code fix in "patch-with-tests-clj-1058.diff" is identical to my earlier "declare-throws-clj-1058.diff" This new patch includes an additional commit adding a unit test. This is ready for screening.
Bruce Adams made changes -
Attachment patch-with-tests-clj-1058.diff [ 12334 ]
Bruce Adams made changes -
Attachment patch-for-clj-1058.diff [ 12266 ]
Alex Miller made changes -
Patch Code [ 10001 ] Code and Test [ 10002 ]
Alex Miller made changes -
Description If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem. The problem area in the code is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).

To reproduce:
{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError clojure.lang.Numbers.add (Numbers.java:126)
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt
If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem.

{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError clojure.lang.Numbers.add (Numbers.java:126)
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt

*Cause:* The problem area in the code catches Exception and swallows it - this is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java#L444

*Approach:* Throw the Exception instead of swallowing it to allow it to propagate.

*Patch:* patch-with-tests-clj-1058.diff

*Screened by:*
Hide
Alex Miller added a comment -

I don't think we want to throw Exception out of (at least) fold() and possibly the others as well. Can you rework with sneakyThrow?

Show
Alex Miller added a comment - I don't think we want to throw Exception out of (at least) fold() and possibly the others as well. Can you rework with sneakyThrow?
Alex Miller made changes -
Approval Vetted [ 10003 ] Incomplete [ 10006 ]
Hide
Bruce Adams added a comment -

I can use sneakyThrow. The patch I uploaded in September 17 did exactly that. I have since deleted that patch.

I've now had two people I respect, you and Andy Fingerhut, suggest a need for using sneakyThrow in this code. Each of you know the Clojure code far better than I do. I'm very inclined to trust your intuition that sneakyThrow is the right thing here, but I would really like to understand why. In my testing, checked exceptions (non-RuntimeExceptions) out of this code already get wrapped in a RuntimeException. I assume this wrapping happens higher up the Java stack and is implemented via sneakyThrow. Why do we want to add yet another layer of exception wrapping? It adds complexity to this Java code.

I am happy to add tests demonstrating the existing checked exception wrapping behavior.

Can you help me understand why we want sneakyThrow here? I would really appreciate it.

Show
Bruce Adams added a comment - I can use sneakyThrow. The patch I uploaded in September 17 did exactly that. I have since deleted that patch. I've now had two people I respect, you and Andy Fingerhut, suggest a need for using sneakyThrow in this code. Each of you know the Clojure code far better than I do. I'm very inclined to trust your intuition that sneakyThrow is the right thing here, but I would really like to understand why. In my testing, checked exceptions (non-RuntimeExceptions) out of this code already get wrapped in a RuntimeException. I assume this wrapping happens higher up the Java stack and is implemented via sneakyThrow. Why do we want to add yet another layer of exception wrapping? It adds complexity to this Java code. I am happy to add tests demonstrating the existing checked exception wrapping behavior. Can you help me understand why we want sneakyThrow here? I would really appreciate it.
Hide
Alex Miller added a comment -

Most of the internal Clojure code eschews checked exceptions (they are an artifact of Java, not a JVM thing) and I think it's unlikely that Rich would accept a patch that explicitly declared it in this case. My expectation is that wrapping the exception at the beginning via sneakyThrow will prevent it from being wrapped again farther up, so I don't know that this change will nest things any more deeply.

Show
Alex Miller added a comment - Most of the internal Clojure code eschews checked exceptions (they are an artifact of Java, not a JVM thing) and I think it's unlikely that Rich would accept a patch that explicitly declared it in this case. My expectation is that wrapping the exception at the beginning via sneakyThrow will prevent it from being wrapped again farther up, so I don't know that this change will nest things any more deeply.
Bruce Adams made changes -
Attachment declare-throws-clj-1058.diff [ 12269 ]
Bruce Adams made changes -
Attachment patch-with-tests-clj-1058.diff [ 12334 ]
Hide
Bruce Adams added a comment -

My earlier patch had an extra, unneeded "throws Exception" declaration on a publicly visible "fold" method. This new patch, declarative-clj-1058.diff, only adds "throws Exception" to methods used within src/jvm/clojure/lang/PersistentHashMap.java.

To make this locality clear, I've attached another patch, private-interface-INode.diff, which declares the internal INode interface as "private". I don't think the "private" declaration needs to go into the Clojure code.

These patches do not taking your advice to use sneakyThrow here. Using sneakyThrow still feels wrong to me. I will (attempt to) start a conversation on the Clojure-dev mailing asking people for thoughts about the two approaches.

Show
Bruce Adams added a comment - My earlier patch had an extra, unneeded "throws Exception" declaration on a publicly visible "fold" method. This new patch, declarative-clj-1058.diff, only adds "throws Exception" to methods used within src/jvm/clojure/lang/PersistentHashMap.java. To make this locality clear, I've attached another patch, private-interface-INode.diff, which declares the internal INode interface as "private". I don't think the "private" declaration needs to go into the Clojure code. These patches do not taking your advice to use sneakyThrow here. Using sneakyThrow still feels wrong to me. I will (attempt to) start a conversation on the Clojure-dev mailing asking people for thoughts about the two approaches.
Bruce Adams made changes -
Attachment declarative-clj-1058.diff [ 12694 ]
Attachment private-interface-INode.diff [ 12693 ]
Hide
Bruce Adams added a comment -

I forgot to thank Alex for expressing concerns with my earlier patchset. That got me to think harder about what exactly is happening here and to learn more about the Java code implementing Clojure. I enjoy thinking and learning.

Thank you, Alex!

Show
Bruce Adams added a comment - I forgot to thank Alex for expressing concerns with my earlier patchset. That got me to think harder about what exactly is happening here and to learn more about the Java code implementing Clojure. I enjoy thinking and learning. Thank you, Alex!
Hide
Bruce Adams added a comment -

Thanks to Kevin Downey on the Clojure-dev mailing list, I now understand sneakyThrow better.

This patch uses sneakyThrow to surface exceptions.

Show
Bruce Adams added a comment - Thanks to Kevin Downey on the Clojure-dev mailing list, I now understand sneakyThrow better. This patch uses sneakyThrow to surface exceptions.
Bruce Adams made changes -
Attachment sneakyThrow-clj-1058.diff [ 12707 ]
Hide
Alex Miller added a comment -

Switching to Vetted so this is available to be screened.

Show
Alex Miller added a comment - Switching to Vetted so this is available to be screened.
Alex Miller made changes -
Approval Incomplete [ 10006 ] Vetted [ 10003 ]
Description If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem.

{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError clojure.lang.Numbers.add (Numbers.java:126)
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt

*Cause:* The problem area in the code catches Exception and swallows it - this is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java#L444

*Approach:* Throw the Exception instead of swallowing it to allow it to propagate.

*Patch:* patch-with-tests-clj-1058.diff

*Screened by:*
If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem.

{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError clojure.lang.Numbers.add (Numbers.java:126)
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt

*Cause:* The problem area in the code catches Exception and swallows it - this is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java#L444

*Approach:* Throw the Exception and allow it to propagate instead of swallowing it.

*Patch:* sneakyThrow-clj-1058.diff

*Screened by:*
Hide
Alex Miller added a comment -

Bruce, what's with the k-fail and def stuff in the tests?

Show
Alex Miller added a comment - Bruce, what's with the k-fail and def stuff in the tests?
Hide
Alex Miller added a comment -

Moving back to Incomplete - I think the tests should be simplified - I don't see why all the internal def stuff is necessary to demonstrate this.

Show
Alex Miller added a comment - Moving back to Incomplete - I think the tests should be simplified - I don't see why all the internal def stuff is necessary to demonstrate this.
Alex Miller made changes -
Approval Vetted [ 10003 ] Incomplete [ 10006 ]
Hide
Bruce Adams added a comment -

Yes, I agree. I'll update the patch with simpler tests. It'll take a few days.

Show
Bruce Adams added a comment - Yes, I agree. I'll update the patch with simpler tests. It'll take a few days.
Bruce Adams made changes -
Attachment declarative-clj-1058.diff [ 12694 ]
Bruce Adams made changes -
Attachment private-interface-INode.diff [ 12693 ]
Bruce Adams made changes -
Attachment sneakyThrow-clj-1058.diff [ 12707 ]
Hide
Bruce Adams added a comment -

Same fix as before, using sneakyThrow, but with a single, much simpler test.

Show
Bruce Adams added a comment - Same fix as before, using sneakyThrow, but with a single, much simpler test.
Bruce Adams made changes -
Attachment sneakyThrow-clj-1058.diff [ 12765 ]
Alex Miller made changes -
Approval Incomplete [ 10006 ] Vetted [ 10003 ]
Alex Miller made changes -
Approval Vetted [ 10003 ] Screened [ 10004 ]
Description If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem.

{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError clojure.lang.Numbers.add (Numbers.java:126)
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt

*Cause:* The problem area in the code catches Exception and swallows it - this is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java#L444

*Approach:* Throw the Exception and allow it to propagate instead of swallowing it.

*Patch:* sneakyThrow-clj-1058.diff

*Screened by:*
If reducef throws an exception, the exception is swallowed sometimes (that is: not propagated up to the caller of r/fold). This can lead to infinite recursion, causing a stack overflow which masks the problem.

{code}
user> (require '[clojure.core.reducers :as r])
nil
user> (r/fold (fn ([]) ([ret k v] (+ 3 "foo") ret)) (into {} (map (juxt identity identity) (range 10000))))
StackOverflowError clojure.lang.Numbers.add (Numbers.java:126)
{code}

This results in a stack like: https://raw.github.com/gist/3bab917287a7fd635a84/f38bfe3e270556e467f3fc02062af7ea10781390/gistfile1.txt

*Cause:* The problem area in the code catches Exception and swallows it - this is commented "aargh" in PersistentHashMap.java line 444 (as of 412a51d).
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java#L444

*Approach:* Throw the Exception and allow it to propagate instead of swallowing it.

*Patch:* sneakyThrow-clj-1058.diff

*Screened by:* Alex Miller
Rich Hickey made changes -
Approval Screened [ 10004 ] Ok [ 10007 ]
Stuart Halloway made changes -
Resolution Completed [ 1 ]
Status Open [ 1 ] Closed [ 6 ]

People

Vote (1)
Watch (3)

Dates

  • Created:
    Updated:
    Resolved: