nth fails with ArrayIndexOutOfBoundsException 33 clojure.core.rrb-vector.rrbt.Vector/fn--19277 (rrbt.clj:373)

Description

Was playing with advent of code and puzzle 19 second part.

I have a program that mostly does catvec + subvec at https://github.com/mattiasw2/adventofcode1/blob/master/src/adventofcode1/nineteen_b.clj

The smallest sample that fails with

ArrayIndexOutOfBoundsException 33 clojure.core.rrb-vector.rrbt.Vector/fn--19277 (rrbt.clj:373)

is

(puzzle-b 978)

(I was using (stest/check `puzzle-b) )

I reimplemented the program using plain vectors, and everything works fine (except that I haven't been able to run the big sample)

https://github.com/mattiasw2/adventofcode1/blob/master/src/adventofcode1/nineteen_c.clj

which works fine.

Environment

None

Activity

Show:

Andy Fingerhut September 18, 2019 at 6:02 PM

build.clojure.org test results were good for the committed fix. Resolving as fixed.

Andy Fingerhut September 18, 2019 at 4:47 PM

New automated test cases that fail with the existing core.rrb-vector code, plus a fix that in all testing before has been found to preserve the O(log N) run time of catvec, have been committed here:

I will close this issue as fixed if build.clojure.org gives good test results, as those changes already have on my machine.

Andy Fingerhut August 6, 2019 at 6:58 PM

The best fix for this issue is to make it so that the depth of the tree returned by catvec never gets too large, within the desired O(log N) run-time bounds. I don’t have such a fix yet.

In thinking about the problem, I realized that if one wanted a simpler fix to get going with, but would not be guaranteed O(log N) run-time bounds, one could, inside of catvec, check whether the result tree was “too tall and skinny” just before returning the result, and if it was, do an O(N) creation of a perfectly regular vector, i.e. the densest kind, as created by clojure.core/vector, and return that instead.

It is kind of like a “stop the world garbage collector” :-)

Andy Fingerhut August 5, 2019 at 10:15 PM

A likely culprit, and probably the only one where this could actually be caused, is in catvec returning data structures that add new levels to the tree when it should not.

Andy Fingerhut August 5, 2019 at 4:55 PM

This is an interesting one. From a bit of investigation, the internal tree data structure gets so “tall and skinny”, i.e. so far from an average branching factor of 32-way, that even a vector with only around 1,000 elements gets deeper than 12 levels. The ‘shift’ values go up to around 65, which causes the code to misbehave, since it is written assuming that the shift values will never go over 30.

Fixing this means finding the part of the code that is trying to keep the trees wide and shallow, and fixing it.

Fixed

Details

Assignee

Reporter

Priority

Created December 20, 2016 at 9:00 AM
Updated September 18, 2019 at 6:02 PM
Resolved September 18, 2019 at 6:02 PM