range confused by large bounds

Description

There are a number of issues related to counting and overflow in the current LongRange implementation.

expression

1.6.0

1.7.0-beta2

+patch

comment

(range (- Long/MAX_VALUE 2) Long/MAX_VALUE)

(9223372036854775805 9223372036854775806)

OOME

(9223372036854775805 9223372036854775806)

top of long range

(count (range (- Long/MAX_VALUE 2) Long/MAX_VALUE))

2

2

2

top of long range

(range (+ Long/MIN_VALUE 2) Long/MIN_VALUE -1)

(-9223372036854775806 -9223372036854775807)

OOME

(-9223372036854775806 -9223372036854775807)

bottom of long range

(count (range (+ Long/MIN_VALUE 2) Long/MIN_VALUE -1))

2

2

2

bottom of long range

(range Long/MIN_VALUE Long/MAX_VALUE Long/MAX_VALUE)

ArithmeticEx

OOME

(-9223372036854775806 -1 9223372036854775806)

large positive step

(count (range Long/MIN_VALUE Long/MAX_VALUE Long/MAX_VALUE))

ArithmeticEx

0

3

large positive step

(range Long/MAX_VALUE Long/MIN_VALUE Long/MIN_VALUE)

ArithmeticEx

OOME

(9223372036854775807 -1)

large negative step

(count (range Long/MAX_VALUE Long/MIN_VALUE Long/MIN_VALUE))

ArithmeticEx

0

2

large negative step

(count (range 0 Long/MAX_VALUE))

overflows to nonsense

-2147483648

ArithmeticEx

number of values in range > Integer.MAX_VALUE

Cause: There were several bugs, both old and new, in the counting related code for range, particularly around overflow and (introduced in CLJ-1709) coercion error.

Approach: The patched code:

  • Uses only exact values (no double conversion in there)

  • Fixes the algorithm for integer ceiling to be correct

  • Explicitly does overflow-checking calculations when necessary (chunking, counting, and iterator)

  • In the case of overflow, falls back to slower stepping algorithm if necessary (this is only in pathological cases like above)

  • Added many new tests thanks to Andy Fingerhut and Steve Miner.

One particular question is what to do in the case where the count of a range is > Integer.MAX_VALUE. The choices are:
1. Return Integer.MAX_VALUE (Java collection size() solution)
2. Throw ArithmeticOverflowException (since your answer is going to be wrong)
3. Overflow and let bad stuff happen (Clojure 1.6 does this)

The current patch takes approach #2, per Rich.

Performance check:

expr

1.6

beta1

beta2

beta2+patch

(count (range (* 1024 1024)))

63 ms

0 ms

0 ms

0 ms

(reduce + (map inc (range (* 1024 1024))))

55 ms

35 ms

34 ms

32 ms

(reduce + (map inc (map inc (range (* 1024 1024)))))

74 ms

59 ms

56 ms

54 ms

(count (keep odd? (range (* 1024 1024))))

77 ms

52 ms

48 ms

49 ms

(transduce (comp (map inc) (map inc)) + (range (* 1024 1024)))

n/a

30 ms

26 ms

26 ms

(reduce + 0 (range (* 2048 1024)))

72 ms

29 ms

29 ms

21 ms

(reduce + 0 (rest (range (* 2048 1024))))

73 ms

29 ms

30 ms

21 ms

(doall (range 0 31))

1.38 us

0.97 us

0.73 us

0.75 us

(doall (range 0 32))

1.38 us

0.99 us

0.76 us

0.77 us

(doall (range 0 4096))

171 us

126 us

125 us

98 us

(into [] (map inc (range 31)))

1.87 us

1.34 us

1.27 us

1.33 us

(into [] (map inc) (range 31))

n/a

0.76 ms

0.76 ms

0.76 ms

(into [] (range 128))

5.26 us

2.18 us

2.15 us

2.22 us

Patch: clj-1727-4.patch

Environment

None

Attachments

6

Activity

Nicola Mometto June 5, 2015 at 3:53 PM

Patch committed and the ticket marked as resolved but not closed. I'm closing it.

Steve Miner May 12, 2015 at 9:44 PM

And I will resist suggesting a count' (ala inc' and dec'). Thanks for fixing these edge cases. The original report came from real life experience when I was trying to fix my own double math problems with TCHECK-67.

Andy Fingerhut May 12, 2015 at 4:27 PM

Re: clj-1727-4.patch
I, for one, will never be filing a bug that (count (range Long/MIN_VALUE Long/MAX_VALUE)) overflows a long

Alex Miller May 11, 2015 at 8:43 AM

New -3 patch changes the fallback to use the iterator. The iterator was also not safe from overflow, so I fixed that too.

For counting ranges > Integer/MAX_VALUE, now:

Steve Miner May 10, 2015 at 9:28 PM

Good point about the case of a large step potentially causing the exception, in which case the range may still have a reasonable count.

Completed

Details

Assignee

Reporter

Approval

Patch

Priority

Affects versions

Fix versions

Created May 7, 2015 at 4:38 PM
Updated June 5, 2015 at 3:53 PM
Resolved June 5, 2015 at 3:53 PM