Affects Version/s: Release 1.5
Fix Version/s: Release 1.7
Patch:Code and Test
reduce gives a StackOverflowError on long sequences that contain many transitions between chunked and unchunked:
Such a sequence is well behaved under most other sequence operations, and its underlying structure can even be masked such that reduce succeeds:
I don't think Clojure developers normally worry about mixing chunked and unchunked seqs, so the existence of such a sequence is not at all unreasonable (and indeed this happened to me at work and was very difficult to debug).
Cause: The implementation of reduce can bounce back and forth between CollReduce and InternalReduce, consuming a stack frame in every transition. Given enough transitions, the stack will overflow.
Approach: Degrade to naive impl after first chunk
In the IChunkedSeq implementation, instead of calling internal-reduce when the
sequence stops being chunked, it could have an (inlined?) unoptimized implementation,
ensuring that no further stack space is taken up. This retains the behavior that a
generic seq with a chunked tail will still run in an optimized fashion, but a seq with
two chunked portions would only be optimized the first time.
1) Use clojure.core/trampoline - This would presumably work, but requires wrapping the normal return values from all implementations of internal-reduce.
2) Create a special type (Unreduced) that signals
an implementation change. The two implementation-change points in internal-reduce
(in the IChunkedSeq impl and the Object impl) are converted to return an instance
of Unreduced instead of a direct call to internal-reduce. Then seq-reduce is converted to check for instances of Unreduced before returning,
and recurs if it finds one.
Screened by: Alex Miller