Clojure

Reduce broken on some primitive vectors

Details

  • Type: Defect Defect
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: Release 1.3, Release 1.4, Release 1.5, Release 1.6
  • Fix Version/s: Release 1.7
  • Component/s: None
  • Labels:
  • Patch:
    Code and Test
  • Approval:
    Screened

Description

In some cases, reduce over a sequence from a primitive vector created with vector-of will return incorrect answers:

user=> (into [] (drop 32 (into [] (range 33))))
[32]
user=> (into [] (drop 32 (into (vector-of :int) (range 33))))
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32]

Second call should return [32] just like the first one.

Cause: VecSeq (seq on primitive Vec obtained with vector-of) maintains two flags: i is the total number of elements prior to the current node in this seq. offset is the offset in the current anode. When using internal-reduce on a VecSeq, the starting index for the reduce was using offset and ignoring i.

Solution: Use (+ i offset) as the starting index.

Patch: clj-1362-v1.patch

Screened by:

Activity

Alex Miller made changes -
Field Original Value New Value
Labels bug collections
Alex Miller made changes -
Summary vector-of, split-at, and into Primitive vectors and split-at leads to incorrect result
Alex Miller made changes -
Description I'm seeing some really weird behavior under certain conditions when using vector-of vectors when they are split and then the second half is put into another vector.

The following test illustrates the problem:

{code}
(ns vector-of-test.core-test
  (:use clojure.test
        vector-of-test.core))

(defn verify [expected actual]
  (doseq [i (range (count expected))]
    (testing (str "i=" i)
      (is (= (apply concat (split-at i expected))
             (apply concat (split-at i actual))))
      (is (= (second (split-at i expected))
             (second (split-at i actual))))
      (is (= (into [] (second (split-at i expected)))
             (into [] (second (split-at i actual))))))))

(deftest test-vector-of
  (doseq [j (range 35)]
    (testing (str "j=" j)
      (let [test-seq (range j)]
        (verify (into [] test-seq)
                (into (vector-of :int) test-seq))))))
{code}

It appears that for 31 < i < j, the test will fail to pass verify's third check. Basically, the result derived from the (vector-of :int) sequence contains the entire original sequence instead of just the last few items. Oddly, the first two checks succeed -- it's only when you try to put it into something that the problem manifests.. Also, it doesn't seem to matter what kind of vector you put the results in (I'm using a PersistentVector here, but I see the same results with a (vector-of :int), for example).
I'm seeing some really weird behavior under certain conditions when using vector-of vectors when they are split and then the second half is put into another vector.

Simpler test than original report below:

{code}
user=> (into [] (drop 32 (into [] (range 33))))
[32]
user=> (into [] (drop 32 (into (vector-of :int) (range 33))))
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32]
{code}

Second call should return [32] as well.

The following test illustrates the problem:

{code}
(ns vector-of-test.core-test
  (:use clojure.test
        vector-of-test.core))

(defn verify [expected actual]
  (doseq [i (range (count expected))]
    (testing (str "i=" i)
      (is (= (apply concat (split-at i expected))
             (apply concat (split-at i actual))))
      (is (= (second (split-at i expected))
             (second (split-at i actual))))
      (is (= (into [] (second (split-at i expected)))
             (into [] (second (split-at i actual))))))))

(deftest test-vector-of
  (doseq [j (range 35)]
    (testing (str "j=" j)
      (let [test-seq (range j)]
        (verify (into [] test-seq)
                (into (vector-of :int) test-seq))))))
{code}

It appears that for 31 < i < j, the test will fail to pass verify's third check. Basically, the result derived from the (vector-of :int) sequence contains the entire original sequence instead of just the last few items. Oddly, the first two checks succeed -- it's only when you try to put it into something that the problem manifests.. Also, it doesn't seem to matter what kind of vector you put the results in (I'm using a PersistentVector here, but I see the same results with a (vector-of :int), for example).
Alex Miller made changes -
Summary Primitive vectors and split-at leads to incorrect result Reduce broken on some primitive vectors
Alex Miller made changes -
Patch Code and Test [ 10002 ]
Approval Triaged [ 10120 ]
Description I'm seeing some really weird behavior under certain conditions when using vector-of vectors when they are split and then the second half is put into another vector.

Simpler test than original report below:

{code}
user=> (into [] (drop 32 (into [] (range 33))))
[32]
user=> (into [] (drop 32 (into (vector-of :int) (range 33))))
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32]
{code}

Second call should return [32] as well.

The following test illustrates the problem:

{code}
(ns vector-of-test.core-test
  (:use clojure.test
        vector-of-test.core))

(defn verify [expected actual]
  (doseq [i (range (count expected))]
    (testing (str "i=" i)
      (is (= (apply concat (split-at i expected))
             (apply concat (split-at i actual))))
      (is (= (second (split-at i expected))
             (second (split-at i actual))))
      (is (= (into [] (second (split-at i expected)))
             (into [] (second (split-at i actual))))))))

(deftest test-vector-of
  (doseq [j (range 35)]
    (testing (str "j=" j)
      (let [test-seq (range j)]
        (verify (into [] test-seq)
                (into (vector-of :int) test-seq))))))
{code}

It appears that for 31 < i < j, the test will fail to pass verify's third check. Basically, the result derived from the (vector-of :int) sequence contains the entire original sequence instead of just the last few items. Oddly, the first two checks succeed -- it's only when you try to put it into something that the problem manifests.. Also, it doesn't seem to matter what kind of vector you put the results in (I'm using a PersistentVector here, but I see the same results with a (vector-of :int), for example).
In some cases, reduce over a sequence from a primitive vector created with vector-of will return incorrect answers:

{code}
user=> (into [] (drop 32 (into [] (range 33))))
[32]
user=> (into [] (drop 32 (into (vector-of :int) (range 33))))
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32]
{code}

Second call should return [32] just like the first one.

*Cause:* VecSeq (seq on primitive Vec obtained with vector-of) maintains two flags: {{i}} is the total number of elements prior to the current node in this seq. {{offset}} is the offset in the current anode. When using internal-reduce on a VecSeq, the starting index for the reduce was using offset and ignoring i.

*Solution:* Use (+ i offset) as the starting index.

*Patch:* clj-1362-v1.patch

*Screened by:*

The following test illustrates the problem:

{code}
(ns vector-of-test.core-test
  (:use clojure.test
        vector-of-test.core))

(defn verify [expected actual]
  (doseq [i (range (count expected))]
    (testing (str "i=" i)
      (is (= (apply concat (split-at i expected))
             (apply concat (split-at i actual))))
      (is (= (second (split-at i expected))
             (second (split-at i actual))))
      (is (= (into [] (second (split-at i expected)))
             (into [] (second (split-at i actual))))))))

(deftest test-vector-of
  (doseq [j (range 35)]
    (testing (str "j=" j)
      (let [test-seq (range j)]
        (verify (into [] test-seq)
                (into (vector-of :int) test-seq))))))
{code}

It appears that for 31 < i < j, the test will fail to pass verify's third check. Basically, the result derived from the (vector-of :int) sequence contains the entire original sequence instead of just the last few items. Oddly, the first two checks succeed -- it's only when you try to put it into something that the problem manifests.. Also, it doesn't seem to matter what kind of vector you put the results in (I'm using a PersistentVector here, but I see the same results with a (vector-of :int), for example).
Attachment clj-1362-v1.patch [ 12829 ]
Alex Miller made changes -
Description In some cases, reduce over a sequence from a primitive vector created with vector-of will return incorrect answers:

{code}
user=> (into [] (drop 32 (into [] (range 33))))
[32]
user=> (into [] (drop 32 (into (vector-of :int) (range 33))))
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32]
{code}

Second call should return [32] just like the first one.

*Cause:* VecSeq (seq on primitive Vec obtained with vector-of) maintains two flags: {{i}} is the total number of elements prior to the current node in this seq. {{offset}} is the offset in the current anode. When using internal-reduce on a VecSeq, the starting index for the reduce was using offset and ignoring i.

*Solution:* Use (+ i offset) as the starting index.

*Patch:* clj-1362-v1.patch

*Screened by:*

The following test illustrates the problem:

{code}
(ns vector-of-test.core-test
  (:use clojure.test
        vector-of-test.core))

(defn verify [expected actual]
  (doseq [i (range (count expected))]
    (testing (str "i=" i)
      (is (= (apply concat (split-at i expected))
             (apply concat (split-at i actual))))
      (is (= (second (split-at i expected))
             (second (split-at i actual))))
      (is (= (into [] (second (split-at i expected)))
             (into [] (second (split-at i actual))))))))

(deftest test-vector-of
  (doseq [j (range 35)]
    (testing (str "j=" j)
      (let [test-seq (range j)]
        (verify (into [] test-seq)
                (into (vector-of :int) test-seq))))))
{code}

It appears that for 31 < i < j, the test will fail to pass verify's third check. Basically, the result derived from the (vector-of :int) sequence contains the entire original sequence instead of just the last few items. Oddly, the first two checks succeed -- it's only when you try to put it into something that the problem manifests.. Also, it doesn't seem to matter what kind of vector you put the results in (I'm using a PersistentVector here, but I see the same results with a (vector-of :int), for example).
In some cases, reduce over a sequence from a primitive vector created with vector-of will return incorrect answers:

{code}
user=> (into [] (drop 32 (into [] (range 33))))
[32]
user=> (into [] (drop 32 (into (vector-of :int) (range 33))))
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32]
{code}

Second call should return [32] just like the first one.

*Cause:* VecSeq (seq on primitive Vec obtained with vector-of) maintains two flags: {{i}} is the total number of elements prior to the current node in this seq. {{offset}} is the offset in the current anode. When using internal-reduce on a VecSeq, the starting index for the reduce was using offset and ignoring i.

*Solution:* Use (+ i offset) as the starting index.

*Patch:* clj-1362-v1.patch

*Screened by:*
Rich Hickey made changes -
Fix Version/s Release 1.7 [ 10250 ]
Approval Triaged [ 10120 ] Vetted [ 10003 ]
Stuart Sierra made changes -
Approval Vetted [ 10003 ] Screened [ 10004 ]

People

Vote (0)
Watch (2)

Dates

  • Created:
    Updated: