Clojure

aset-* and aget perform poorly on multi-dimensional arrays even with type hints.

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: Release 1.5
  • Fix Version/s: None
  • Component/s: None
  • Environment:
    Clojure 1.5.1.

    Dependencies: criterium
  • Patch:
    Code
  • Approval:
    Triaged

Description

Here's a transcript of the behavior. I don't know for sure that reflection is being done, but the performance penalty (about 1300x) suggests it.

user=> (use 'criterium.core)
nil
user=> (def b (make-array Double/TYPE 1000 1000))
#'user/b
user=> (quick-bench (aget ^"[[D" b 304 175))
WARNING: Final GC required 3.5198021166354323 % of runtime
WARNING: Final GC required 29.172288684474303 % of runtime
Evaluation count : 63558 in 6 samples of 10593 calls.
             Execution time mean : 9.457308 µs
    Execution time std-deviation : 126.220954 ns
   Execution time lower quantile : 9.344450 µs ( 2.5%)
   Execution time upper quantile : 9.629202 µs (97.5%)
                   Overhead used : 2.477107 ns

One workaround is to use multiple agets.

user=> (quick-bench (aget ^"[D" (aget ^"[[D" b 304) 175))
WARNING: Final GC required 40.59820310542545 % of runtime
Evaluation count : 62135436 in 6 samples of 10355906 calls.
             Execution time mean : 6.999273 ns
    Execution time std-deviation : 0.112703 ns
   Execution time lower quantile : 6.817782 ns ( 2.5%)
   Execution time upper quantile : 7.113845 ns (97.5%)
                   Overhead used : 2.477107 ns

Cause: The inlined version only applies to arity 2, and otherwise it reflects.

Activity

Alex Miller made changes -
Field Original Value New Value
Labels bug arrays performance
Alex Miller made changes -
Priority Minor [ 4 ] Major [ 3 ]
Hide
Gary Fredericks added a comment -

A glance at the source makes it obvious that the hypothesis is correct – the inlined version only applies to arity 2, and otherwise it reflects.

I thought this would be as simple as converting the inline function to be variadic (using reduce), but after trying it I realized this is tricky as you have to generate the correct type hints for each step. E.g., given ^"[[D" the inline function needs to type-hint the intermediate result with ^"[D". This isn't difficult if we're just dealing with strings that begin with square brackets, but I don't know for sure that those are the only possibilities.

Show
Gary Fredericks added a comment - A glance at the source makes it obvious that the hypothesis is correct – the inlined version only applies to arity 2, and otherwise it reflects. I thought this would be as simple as converting the inline function to be variadic (using reduce), but after trying it I realized this is tricky as you have to generate the correct type hints for each step. E.g., given ^"[[D" the inline function needs to type-hint the intermediate result with ^"[D". This isn't difficult if we're just dealing with strings that begin with square brackets, but I don't know for sure that those are the only possibilities.
Alex Miller made changes -
Approval Triaged [ 10120 ]
Description Here's a transcript of the behavior. I don't know for sure that reflection is being done, but the performance penalty (about 1300x) suggests it.

user=> (use 'criterium.core)
nil
user=> (def b (make-array Double/TYPE 1000 1000))
#'user/b
user=> (quick-bench (aget ^"[[D" b 304 175))
WARNING: Final GC required 3.5198021166354323 % of runtime
WARNING: Final GC required 29.172288684474303 % of runtime
Evaluation count : 63558 in 6 samples of 10593 calls.
             Execution time mean : 9.457308 µs
    Execution time std-deviation : 126.220954 ns
   Execution time lower quantile : 9.344450 µs ( 2.5%)
   Execution time upper quantile : 9.629202 µs (97.5%)
                   Overhead used : 2.477107 ns
nil

A (n ugly) workaround is to use multiple agets.

user=> (quick-bench (aget ^"[D" (aget ^"[[D" b 304) 175))
WARNING: Final GC required 40.59820310542545 % of runtime
Evaluation count : 62135436 in 6 samples of 10355906 calls.
             Execution time mean : 6.999273 ns
    Execution time std-deviation : 0.112703 ns
   Execution time lower quantile : 6.817782 ns ( 2.5%)
   Execution time upper quantile : 7.113845 ns (97.5%)
                   Overhead used : 2.477107 ns
nil
Here's a transcript of the behavior. I don't know for sure that reflection is being done, but the performance penalty (about 1300x) suggests it.

{code}
user=> (use 'criterium.core)
nil
user=> (def b (make-array Double/TYPE 1000 1000))
#'user/b
user=> (quick-bench (aget ^"[[D" b 304 175))
WARNING: Final GC required 3.5198021166354323 % of runtime
WARNING: Final GC required 29.172288684474303 % of runtime
Evaluation count : 63558 in 6 samples of 10593 calls.
             Execution time mean : 9.457308 µs
    Execution time std-deviation : 126.220954 ns
   Execution time lower quantile : 9.344450 µs ( 2.5%)
   Execution time upper quantile : 9.629202 µs (97.5%)
                   Overhead used : 2.477107 ns
{code}

One workaround is to use multiple agets.

{code}
user=> (quick-bench (aget ^"[D" (aget ^"[[D" b 304) 175))
WARNING: Final GC required 40.59820310542545 % of runtime
Evaluation count : 62135436 in 6 samples of 10355906 calls.
             Execution time mean : 6.999273 ns
    Execution time std-deviation : 0.112703 ns
   Execution time lower quantile : 6.817782 ns ( 2.5%)
   Execution time upper quantile : 7.113845 ns (97.5%)
                   Overhead used : 2.477107 ns
{code}

*Cause:* The inlined version only applies to arity 2, and otherwise it reflects.
Issue Type Defect [ 1 ] Enhancement [ 4 ]
Summary aset-* and aget: on multi-dimensional arrays (e.g. double[][]) these fn's reflect (and, thus, perf. poorly) even with type hints. aset-* and aget perform poorly on multi-dimensional arrays even with type hints.
Hide
Yaron Peleg added a comment - - edited

Bump. I just got bitten bad by this.

There are two seperate issues here:
1) (aget 2d-array-doubles 0 0 ) doesn't emit a reflection warning.
2) It seems like the compiler has enough information to avoid the reflective call here.

Note this gets exp. worse as number of dimensions grows, i.e (get doubles3d 0 0 0)
will be 1M slower, etc' Not true, unless you iterate over all elements. it's
simply n_dims*1000x per lookup.

Nasty surprise, especially considering you often go to primitive arrays for speed,
and a common use case is an inner loop(s) that iterate over arrays.

Show
Yaron Peleg added a comment - - edited Bump. I just got bitten bad by this. There are two seperate issues here: 1) (aget 2d-array-doubles 0 0 ) doesn't emit a reflection warning. 2) It seems like the compiler has enough information to avoid the reflective call here. Note this gets exp. worse as number of dimensions grows, i.e (get doubles3d 0 0 0) will be 1M slower, etc' Not true, unless you iterate over all elements. it's simply n_dims*1000x per lookup. Nasty surprise, especially considering you often go to primitive arrays for speed, and a common use case is an inner loop(s) that iterate over arrays.
Hide
Gary Fredericks added a comment -

I can probably take a stab at this.

Show
Gary Fredericks added a comment - I can probably take a stab at this.
Hide
Gary Fredericks added a comment -

I think the reflection warning problem is pretty much impossible to solve without changing code elsewhere in the compiler, because the reflection done in aget is a different kind than normal clojure reflection – it's explicitly in the function body rather than emitted by the compiler. Since the compiler isn't emitting it, it doesn't reasonably know it's even there. So even if aget is fixed for other arities, you still won't get the warning when it's not inlined.

I can imagine some sort of metadata that you could put on a function telling the compiler that it will reflect if not inlined. Or maybe a more generic not-inlined warning?

The global scope of adding another compiler flag seems about balanced by the seriousness of array functions not being able to warn on reflection.

Show
Gary Fredericks added a comment - I think the reflection warning problem is pretty much impossible to solve without changing code elsewhere in the compiler, because the reflection done in aget is a different kind than normal clojure reflection – it's explicitly in the function body rather than emitted by the compiler. Since the compiler isn't emitting it, it doesn't reasonably know it's even there. So even if aget is fixed for other arities, you still won't get the warning when it's not inlined. I can imagine some sort of metadata that you could put on a function telling the compiler that it will reflect if not inlined. Or maybe a more generic not-inlined warning? The global scope of adding another compiler flag seems about balanced by the seriousness of array functions not being able to warn on reflection.
Hide
Gary Fredericks added a comment - - edited

Attached CLJ-1289-p1.patch which simply inlines variadic calls to aget. It assumes that if it sees a :tag on the array arg that is a string beginning with [, it can assume that the return value from one call to aget can be tagged with the same string with the leading [ stripped off.

I'm not a jvm expert, but having read through the spec a little bit I think this is a reasonable assumption.

Show
Gary Fredericks added a comment - - edited Attached CLJ-1289-p1.patch which simply inlines variadic calls to aget. It assumes that if it sees a :tag on the array arg that is a string beginning with [, it can assume that the return value from one call to aget can be tagged with the same string with the leading [ stripped off. I'm not a jvm expert, but having read through the spec a little bit I think this is a reasonable assumption.
Gary Fredericks made changes -
Attachment CLJ-1289-p1.patch [ 12788 ]
Andy Fingerhut made changes -
Patch Code [ 10001 ]
Hide
Alex Miller added a comment -

I think this probably is actually true, but a more official way to ask that question would be to get the array class and ask for Class.getComponentType() (and less janky than string munging).

Show
Alex Miller added a comment - I think this probably is actually true, but a more official way to ask that question would be to get the array class and ask for Class.getComponentType() (and less janky than string munging).
Hide
Gary Fredericks added a comment -

How would you get the array class based on the :tag type hint?

Show
Gary Fredericks added a comment - How would you get the array class based on the :tag type hint?
Hide
Gary Fredericks added a comment -

I see (-> s (Class/forName) (.getComponentType) (.getName)) does the same thing – is that route preferred, or is there another one?

Show
Gary Fredericks added a comment - I see (-> s (Class/forName) (.getComponentType) (.getName)) does the same thing – is that route preferred, or is there another one?

People

Vote (0)
Watch (2)

Dates

  • Created:
    Updated: