Clojure

ArraySeq dead code cleanup and ArraySeq_short gap filling

Details

  • Type: Enhancement Enhancement
  • Status: Closed Closed
  • Priority: Minor Minor
  • Resolution: Completed
  • Affects Version/s: Release 1.5
  • Fix Version/s: Release 1.6
  • Component/s: None
  • Labels:
    None
  • Patch:
    Code
  • Approval:
    Ok

Description

The ArraySeq constructor and all methods retain support for primitive ArraySeq but that code has all been shunted into type-specific ArraySeq primitive array variants (ArraySeq_long, etc). The ArraySeq_short variant is currently missing.

Approach: Remove the vestigial primitive array code paths in ArraySeq and fields (ct and oa). This may provide a performance benefit but it has been hard to find a measurable impact.

The patch also fills in the missing ArraySeq_short variant. Before this patch, reduce on an array of primitive short would throw ClassCastException.

Patch: no-getComponentType--v002.patch
Screened by: Stuart Sierra

Activity

Brandon Bloom made changes -
Field Original Value New Value
Summary RestFn performance RestFn & ArraySeq performance
Alan Malloy made changes -
Labels patch patch, patch
Hide
Zach Tellman added a comment -

As a data point, I was recently profiling a Hadoop job where the code made heavy use of 'partial', which apparently unlike 'comp' and 'juxt', always uses apply [1]. As a result, .getComponentType accounted for 41% of the overall compute time. This might be as much a comment on the implementation of partial as anything else, but it certainly shows that this can have a significant effect on performance.

[1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L2388

Show
Zach Tellman added a comment - As a data point, I was recently profiling a Hadoop job where the code made heavy use of 'partial', which apparently unlike 'comp' and 'juxt', always uses apply [1]. As a result, .getComponentType accounted for 41% of the overall compute time. This might be as much a comment on the implementation of partial as anything else, but it certainly shows that this can have a significant effect on performance. [1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L2388
Hide
Kevin Downey added a comment -

prepRet usage appears to be all about enforcing canonical Boolean values, so I think using Object.class is not the best. Maybe splitting ArraySeq in to ArraySeq and ArraySeqBoolean would be better. ArraySeq would no longer do a prepRet and ArraySeqBoolean would

Show
Kevin Downey added a comment - prepRet usage appears to be all about enforcing canonical Boolean values, so I think using Object.class is not the best. Maybe splitting ArraySeq in to ArraySeq and ArraySeqBoolean would be better. ArraySeq would no longer do a prepRet and ArraySeqBoolean would
Hide
Kevin Downey added a comment -

ArraySeq actually already has specialized implementations, and interestingly the specialized boolean implementation doesn't call prepRet

Show
Kevin Downey added a comment - ArraySeq actually already has specialized implementations, and interestingly the specialized boolean implementation doesn't call prepRet
Hide
Kevin Downey added a comment -

The ArraySeq split I proposed above will fail if you have an array of Objects that happen to be Booleans, it seems like the best bet would be something like preRet that did a instanceof Boolean check without the requirement of passing in a class

Show
Kevin Downey added a comment - The ArraySeq split I proposed above will fail if you have an array of Objects that happen to be Booleans, it seems like the best bet would be something like preRet that did a instanceof Boolean check without the requirement of passing in a class
Hide
Brandon Bloom added a comment -

I traced the surrounding code & iirc, the result was always Object[], so the return value was always Object.class. I'm pretty sure that code wasn't actually doing anything useful here.

Show
Brandon Bloom added a comment - I traced the surrounding code & iirc, the result was always Object[], so the return value was always Object.class. I'm pretty sure that code wasn't actually doing anything useful here.
Alex Miller made changes -
Approval Triaged [ 10120 ]
Hide
Brandon Bloom added a comment -

What does "triaged" mean in this context? Doesn't triage require some sort of decision or classification?

Show
Brandon Bloom added a comment - What does "triaged" mean in this context? Doesn't triage require some sort of decision or classification?
Hide
Andy Fingerhut added a comment -

See http://dev.clojure.org/display/community/JIRA+workflow for way more than the answer to your question (especially the flowchart in the "Workflow" section), but basically it means that a screener believes the ticket description represents an actual problem to be addressed, and they ask that Rich examine the ticket to see if he agrees.

Show
Andy Fingerhut added a comment - See http://dev.clojure.org/display/community/JIRA+workflow for way more than the answer to your question (especially the flowchart in the "Workflow" section), but basically it means that a screener believes the ticket description represents an actual problem to be addressed, and they ask that Rich examine the ticket to see if he agrees.
Hide
Brandon Bloom added a comment -

Gotcha. Thanks.

Show
Brandon Bloom added a comment - Gotcha. Thanks.
Alex Miller made changes -
Priority Minor [ 4 ] Major [ 3 ]
Alex Miller made changes -
Labels patch performance
Rich Hickey made changes -
Approval Triaged [ 10120 ] Vetted [ 10003 ]
Fix Version/s Release 1.6 [ 10157 ]
Alex Miller made changes -
Attachment no-getComponentType--v002.patch [ 12305 ]
Alex Miller made changes -
Attachment no-getComponentType--v002.patch [ 12306 ]
Hide
Alex Miller added a comment - - edited

The existing ArraySeq class has support for different two modes - as an array of Objects and as an array of primitives. In the Object case, this.oa will be set to the array and this.ct will be set to the component type of the array. From running a bunch of code, this.ct is not always Object - it is easy to find cases of Class and many other cases as well.

However, any kind of non-primitive array will cause this.oa to be set and this prevents the calls to prepRet from being called with ct in every method where this is done. All primitive arrays are now being handled via the switch in ArraySeq.createFromObject.

The only thing prepRet is effectively doing is returning canonical Boolean objects. It is possible to create an ArraySeq on a Boolean[]. However, even in this case, Boolean[] is an instanceof Object[] and the object path is triggered.

Thus, I agree that prepRet is never being called from ArraySeq and this path can be removed, which also allows us to remove this.ct and importantly the array.getClass().getComponentType() check. This also allows us to go further though and remove the oa field entirely and all of the prepRet calls.

I have attached an updated patch that makes this change. I also found (based on some test failures) that the ArraySeq_short was inexplicably missing so I added that in as well.

ArraySeq could be made even cleaner with the addition of another variant that specifically handled the case of a null array (ArraySeq_null). I have no idea if that would be worth doing and I have not done that here.

Show
Alex Miller added a comment - - edited The existing ArraySeq class has support for different two modes - as an array of Objects and as an array of primitives. In the Object case, this.oa will be set to the array and this.ct will be set to the component type of the array. From running a bunch of code, this.ct is not always Object - it is easy to find cases of Class and many other cases as well. However, any kind of non-primitive array will cause this.oa to be set and this prevents the calls to prepRet from being called with ct in every method where this is done. All primitive arrays are now being handled via the switch in ArraySeq.createFromObject. The only thing prepRet is effectively doing is returning canonical Boolean objects. It is possible to create an ArraySeq on a Boolean[]. However, even in this case, Boolean[] is an instanceof Object[] and the object path is triggered. Thus, I agree that prepRet is never being called from ArraySeq and this path can be removed, which also allows us to remove this.ct and importantly the array.getClass().getComponentType() check. This also allows us to go further though and remove the oa field entirely and all of the prepRet calls. I have attached an updated patch that makes this change. I also found (based on some test failures) that the ArraySeq_short was inexplicably missing so I added that in as well. ArraySeq could be made even cleaner with the addition of another variant that specifically handled the case of a null array (ArraySeq_null). I have no idea if that would be worth doing and I have not done that here.
Alex Miller made changes -
Description I was profiling one of my projects and noticed a hotspot inside functions using rest arguments.

Most overloads of RestFn.invoke will construct an ArraySeq object. The ArraySeq constructor calls java.lang.Class.getComponentType, which seems to be pretty slow. The array's component type is cached in a field on the ArraySeq object for the sole purpose of passing it to Reflector.prepRet. I'm not entirely sure of the total utility of prepRet, but it's clearly a no-op when passed Object.class, such as the case with the non-specialized base class of ArraySeq: the component type of object[] is Object.class.

The attached patch eliminates the component type field from ArraySeq and passes Object.class to prepRet directly. It may be possible to eliminate calls to prepRet all together, but I'll assume that's a different ticket. With this patch, ArraySeq initialization no longer shows up as a hotspot when profiling.

Before the patch:

{code}
user=> (dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4))))
"Elapsed time: 874.742 msecs"
"Elapsed time: 900.277 msecs"
"Elapsed time: 911.164 msecs"
"Elapsed time: 872.132 msecs"
"Elapsed time: 885.495 msecs"
"Elapsed time: 897.537 msecs"
"Elapsed time: 879.691 msecs"
"Elapsed time: 888.52 msecs"
"Elapsed time: 871.556 msecs"
"Elapsed time: 1088.682 msecs"
{code}

After the patch:

{code}
user=> (dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4))))
"Elapsed time: 628.144 msecs"
"Elapsed time: 634.163 msecs"
"Elapsed time: 617.397 msecs"
"Elapsed time: 622.714 msecs"
"Elapsed time: 646.743 msecs"
"Elapsed time: 648.708 msecs"
"Elapsed time: 629.223 msecs"
"Elapsed time: 638.058 msecs"
"Elapsed time: 725.473 msecs"
"Elapsed time: 636.909 msecs"
{code}

That's about a 30% reduction in execution time.

Granted it only represents a change of 52 nanoseconds per RestFn invoke (181 ns to 129 ns), but this actually was a pretty decent win for a library that makes makes almost exclusively variadic function calls in a tight loop.
It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {Class.getComponentType} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {ct} field and later used to call into {Reflector.prepRet} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

I did not find the prior performance test {(dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4))))} to be repeatable or sufficient and would like better reports from real code bases and a better simple REPL test.

*Screened by:* Alex Miller
Alex Miller made changes -
Description It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {Class.getComponentType} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {ct} field and later used to call into {Reflector.prepRet} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

I did not find the prior performance test {(dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4))))} to be repeatable or sufficient and would like better reports from real code bases and a better simple REPL test.

*Screened by:* Alex Miller
It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {{Class.getComponentType}} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {{ct}} field and later used to call into {{Reflector.prepRet()}} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

I did not find the prior performance test {{(dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4))))}} to be repeatable or sufficient and would like better reports from real code bases and a better simple REPL test.

*Screened by:* Alex Miller
Alex Miller made changes -
Description It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {{Class.getComponentType}} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {{ct}} field and later used to call into {{Reflector.prepRet()}} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

I did not find the prior performance test {{(dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4))))}} to be repeatable or sufficient and would like better reports from real code bases and a better simple REPL test.

*Screened by:* Alex Miller
It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {{Class.getComponentType}} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {{ct}} field and later used to call into {{Reflector.prepRet()}} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

I did not find the prior performance test {code}(dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4)))){code} to be repeatable or sufficient and would like better reports from real code bases and a better simple REPL test.

*Screened by:* Alex Miller
Alex Miller made changes -
Attachment no-getComponentType--v002.patch [ 12306 ]
Alex Miller made changes -
Description It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {{Class.getComponentType}} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {{ct}} field and later used to call into {{Reflector.prepRet()}} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

I did not find the prior performance test {code}(dotimes [n 10] (time (dotimes [i 5000000] ((fn [& args]) 1 2 3 4)))){code} to be repeatable or sufficient and would like better reports from real code bases and a better simple REPL test.

*Screened by:* Alex Miller
It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {{Class.getComponentType}} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {{ct}} field and later used to call into {{Reflector.prepRet()}} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

*Performance:* This is a simple test that leans on partial application which internally uses ArraySeq:

{code}
(def f (partial + 1 2 3 4 5))
(time (reduce (fn [x y] (inc x)) 0 (map f (range 10000000))))
{code}

On my machine, I found these (best time of 20 runs):
{code}
Before - "Elapsed time: 3993.551 msecs"
After - "Elapsed time: 3580.262 msecs"
{code}

which is about a 10% improvement (about what I've seen in other ad hoc tests as well).

*Screened by:* Alex Miller
Hide
Brandon Bloom added a comment -

Hey Alex: Thanks for following through on the vestiges removal.

Show
Brandon Bloom added a comment - Hey Alex: Thanks for following through on the vestiges removal.
Alex Miller made changes -
Approval Vetted [ 10003 ] Screened [ 10004 ]
Rich Hickey made changes -
Approval Screened [ 10004 ] Ok [ 10007 ]
Hide
Rich Hickey added a comment -

please add perf tests that demonstrate benefit

Show
Rich Hickey added a comment - please add perf tests that demonstrate benefit
Rich Hickey made changes -
Approval Ok [ 10007 ] Incomplete [ 10006 ]
Hide
Alex Miller added a comment -

Removed performance angle and focus on clean-up and gap-filling (ArraySeq_short) aspects instead.

Show
Alex Miller added a comment - Removed performance angle and focus on clean-up and gap-filling (ArraySeq_short) aspects instead.
Alex Miller made changes -
Approval Incomplete [ 10006 ] Vetted [ 10003 ]
Description It has been found by the original poster and others (see comments) that the ArraySeq constructor call to {{Class.getComponentType}} is a performance hotspot in calls to RestFn.invoke(). The array's component type is cached in the {{ct}} field and later used to call into {{Reflector.prepRet()}} which currently handles Boolean canonicalization. However, this path is only used for primitive arrays; all subclasses of Object[] are handled by alternate code in every method. The ArraySeq.create() methods (the real entry point) now shunt all primitive array support to supporting variants (ArraySeq_boolean, etc).

*Patch:* no-getComponentType--v002.patch

*Approach:* Remove the vestigial support for primitive arrays in the ArraySeq class. This includes removing the ct and oa fields in ArraySeq and changing the existing array field to an Object[] as well as removing the calls to Class.getComponentType.

Removing this code also exposed the missing ArraySeq_short (the only primitive array type not covered) which has been added.

*Performance:* This is a simple test that leans on partial application which internally uses ArraySeq:

{code}
(def f (partial + 1 2 3 4 5))
(time (reduce (fn [x y] (inc x)) 0 (map f (range 10000000))))
{code}

On my machine, I found these (best time of 20 runs):
{code}
Before - "Elapsed time: 3993.551 msecs"
After - "Elapsed time: 3580.262 msecs"
{code}

which is about a 10% improvement (about what I've seen in other ad hoc tests as well).

*Screened by:* Alex Miller
The ArraySeq constructor and all methods retain support for primitive ArraySeq but that code has all been shunted into type-specific ArraySeq primitive array variants (ArraySeq_long, etc). The ArraySeq_short variant is currently missing.

*Approach:* Remove the vestigial primitive array code paths in ArraySeq and fields (ct and oa). This may provide a performance benefit but it has been hard to find a visible impact. The patch also fills in the missing ArraySeq_short variant as well.
 
*Screened by:*
Labels performance
Summary RestFn & ArraySeq performance ArraySeq dead code cleanup and ArraySeq_short gap filling
Stuart Sierra made changes -
Approval Vetted [ 10003 ] Screened [ 10004 ]
Description The ArraySeq constructor and all methods retain support for primitive ArraySeq but that code has all been shunted into type-specific ArraySeq primitive array variants (ArraySeq_long, etc). The ArraySeq_short variant is currently missing.

*Approach:* Remove the vestigial primitive array code paths in ArraySeq and fields (ct and oa). This may provide a performance benefit but it has been hard to find a visible impact. The patch also fills in the missing ArraySeq_short variant as well.
 
*Screened by:*
The ArraySeq constructor and all methods retain support for primitive ArraySeq but that code has all been shunted into type-specific ArraySeq primitive array variants (ArraySeq_long, etc). The ArraySeq_short variant is currently missing.

*Approach:* Remove the vestigial primitive array code paths in ArraySeq and fields (ct and oa). This may provide a performance benefit but it has been hard to find a measurable impact.

The patch also fills in the missing ArraySeq_short variant. Before this patch, {{reduce}} on an array of primitive {{short}} would throw ClassCastException.

*Screened by:* Stuart Sierra
Priority Major [ 3 ] Minor [ 4 ]
Rich Hickey made changes -
Approval Screened [ 10004 ] Ok [ 10007 ]
Hide
Alex Miller added a comment -

Point to considered patch, this got lost at some point in the description.

Show
Alex Miller added a comment - Point to considered patch, this got lost at some point in the description.
Alex Miller made changes -
Description The ArraySeq constructor and all methods retain support for primitive ArraySeq but that code has all been shunted into type-specific ArraySeq primitive array variants (ArraySeq_long, etc). The ArraySeq_short variant is currently missing.

*Approach:* Remove the vestigial primitive array code paths in ArraySeq and fields (ct and oa). This may provide a performance benefit but it has been hard to find a measurable impact.

The patch also fills in the missing ArraySeq_short variant. Before this patch, {{reduce}} on an array of primitive {{short}} would throw ClassCastException.

*Screened by:* Stuart Sierra
The ArraySeq constructor and all methods retain support for primitive ArraySeq but that code has all been shunted into type-specific ArraySeq primitive array variants (ArraySeq_long, etc). The ArraySeq_short variant is currently missing.

*Approach:* Remove the vestigial primitive array code paths in ArraySeq and fields (ct and oa). This may provide a performance benefit but it has been hard to find a measurable impact.

The patch also fills in the missing ArraySeq_short variant. Before this patch, {{reduce}} on an array of primitive {{short}} would throw ClassCastException.

*Patch:* no-getComponentType--v002.patch
*Screened by:* Stuart Sierra
Stuart Halloway made changes -
Resolution Completed [ 1 ]
Status Open [ 1 ] Closed [ 6 ]

People

Vote (6)
Watch (5)

Dates

  • Created:
    Updated:
    Resolved: