SeqIterator can return incorrect results

Description

As of 1.7.0-alpha5, we are seeing SeqIterator return iterated results that do reflect the values of the underlying seq, in particular acting as if the seq contains a nil value when it does not. This problem is intermittent but has at times caused clojure master to fail in compilation (which is why this is marked as a blocker).

Two recent changes during 1.7 have created and exposed this problem:

1) This commit https://github.com/clojure/clojure/commit/43cc1854508d655e58e377f84836ba128971f90c changed the SeqIterator implementation to be lazier and to use "this" as a sentinel object in SeqIterator. (1.7.0-alpha2)
2) CLJ-1546 changed the implementation of vec such that PersistentHashMap and PersistentHashSet are now converted using iterator() rather than seq(). PHS/PHM use SeqIterator for their Iterator implementation. (1.7.0-alpha5)

Because of #2, we are now stressing #1 much more than before. In particular, things like defining defrecords rely heavily on vec (and set) of PHS and PHM.

Example stack trace: https://gist.github.com/puredanger/f56e3253f0668a515ec5 (seen compiling Clojure itself)

Cause: Setting seq==this; in the constructor of SeqIterator is allowing unsafe publication of the partially constructed "this" object, which can cause subtle problems in the hasNext() implementation. In particular, it seems that after inlining, on the first call, the seq==this condition when comparing the cached partially constructed instance in seq and the fully constructed version in this will return false, even though these have the same object identity. This causes the wrong path to be executed in hasNext().

Approach: Do not use this as a sentinel value.

Patch: 0001-CLJ-1636-don-t-use-this-as-a-sentinel-in-SeqIterator.patch

Screened by: Alex Miller

Environment

clojure-1.7.0-alpha5

Attachments

2
  • 13 Jan 2015, 11:32 PM
  • 11 Jan 2015, 07:58 PM

Activity

Show:

Nicola Mometto January 18, 2015 at 5:36 PM

I've spent some time reading through both the jvm and the java specs and I can't find a reasonable explaination for what was happening, I can only think this is a bug in some hotspot inlining optimization.

Alex Miller January 17, 2015 at 12:01 AM

While we've applied the patch, I would still love to understand wtf is happening here and would love to see that too. For me, I can quite reliably reproduce it building Clojure itself. To support the theory of it happening during an inlining transition, it's unlikely to reproduce outside the context of other code however. I see it get embedded inside a big wad of calling code when I watching inlining.

Andy Fingerhut January 16, 2015 at 10:59 PM

Out of curiosity, does anyone have a smaller test case that causes this incorrect return value from the SeqIterator, without running the 'lein with-profile 1.7 midje' command on Midje? e.g. running a page or less worth of Clojure code?

Andy Fingerhut January 15, 2015 at 7:17 PM

If you want to collect test results, seems like it would be good for people to respond with the OS version and JVM version they tested on, and whether it was the Midje test in the description of this ticket that they tried, or some other test.

Michael Blume January 15, 2015 at 6:48 PM

Does for me, yes =)

Completed

Details

Assignee

Reporter

Approval

Patch

Priority

Affects versions

Fix versions

Created January 11, 2015 at 5:34 AM
Updated January 18, 2015 at 5:36 PM
Resolved January 18, 2015 at 5:36 PM