<< Back to previous view

[CLJ-1200] ArraySeq dead code cleanup and ArraySeq_short gap filling Created: 14/Apr/13  Updated: 31/Jan/14  Resolved: 31/Jan/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.5
Fix Version/s: Release 1.6

Type: Enhancement Priority: Minor
Reporter: Brandon Bloom Assignee: Unassigned
Resolution: Completed Votes: 6
Labels: None

Attachments: Text File no-getComponentType--v001.patch     Text File no-getComponentType--v002.patch    
Patch: Code
Approval: Ok

 Description   

The ArraySeq constructor and all methods retain support for primitive ArraySeq but that code has all been shunted into type-specific ArraySeq primitive array variants (ArraySeq_long, etc). The ArraySeq_short variant is currently missing.

Approach: Remove the vestigial primitive array code paths in ArraySeq and fields (ct and oa). This may provide a performance benefit but it has been hard to find a measurable impact.

The patch also fills in the missing ArraySeq_short variant. Before this patch, reduce on an array of primitive short would throw ClassCastException.

Patch: no-getComponentType--v002.patch
Screened by: Stuart Sierra



 Comments   
Comment by Zach Tellman [ 07/Jun/13 5:40 PM ]

As a data point, I was recently profiling a Hadoop job where the code made heavy use of 'partial', which apparently unlike 'comp' and 'juxt', always uses apply [1]. As a result, .getComponentType accounted for 41% of the overall compute time. This might be as much a comment on the implementation of partial as anything else, but it certainly shows that this can have a significant effect on performance.

[1] https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L2388

Comment by Kevin Downey [ 07/Jun/13 5:46 PM ]

prepRet usage appears to be all about enforcing canonical Boolean values, so I think using Object.class is not the best. Maybe splitting ArraySeq in to ArraySeq and ArraySeqBoolean would be better. ArraySeq would no longer do a prepRet and ArraySeqBoolean would

Comment by Kevin Downey [ 07/Jun/13 7:19 PM ]

ArraySeq actually already has specialized implementations, and interestingly the specialized boolean implementation doesn't call prepRet

Comment by Kevin Downey [ 07/Jun/13 7:35 PM ]

The ArraySeq split I proposed above will fail if you have an array of Objects that happen to be Booleans, it seems like the best bet would be something like preRet that did a instanceof Boolean check without the requirement of passing in a class

Comment by Brandon Bloom [ 15/Jul/13 11:43 AM ]

I traced the surrounding code & iirc, the result was always Object[], so the return value was always Object.class. I'm pretty sure that code wasn't actually doing anything useful here.

Comment by Brandon Bloom [ 17/Jul/13 4:54 PM ]

What does "triaged" mean in this context? Doesn't triage require some sort of decision or classification?

Comment by Andy Fingerhut [ 17/Jul/13 5:19 PM ]

See http://dev.clojure.org/display/community/JIRA+workflow for way more than the answer to your question (especially the flowchart in the "Workflow" section), but basically it means that a screener believes the ticket description represents an actual problem to be addressed, and they ask that Rich examine the ticket to see if he agrees.

Comment by Brandon Bloom [ 17/Jul/13 5:22 PM ]

Gotcha. Thanks.

Comment by Alex Miller [ 07/Oct/13 10:25 PM ]

The existing ArraySeq class has support for different two modes - as an array of Objects and as an array of primitives. In the Object case, this.oa will be set to the array and this.ct will be set to the component type of the array. From running a bunch of code, this.ct is not always Object - it is easy to find cases of Class and many other cases as well.

However, any kind of non-primitive array will cause this.oa to be set and this prevents the calls to prepRet from being called with ct in every method where this is done. All primitive arrays are now being handled via the switch in ArraySeq.createFromObject.

The only thing prepRet is effectively doing is returning canonical Boolean objects. It is possible to create an ArraySeq on a Boolean[]. However, even in this case, Boolean[] is an instanceof Object[] and the object path is triggered.

Thus, I agree that prepRet is never being called from ArraySeq and this path can be removed, which also allows us to remove this.ct and importantly the array.getClass().getComponentType() check. This also allows us to go further though and remove the oa field entirely and all of the prepRet calls.

I have attached an updated patch that makes this change. I also found (based on some test failures) that the ArraySeq_short was inexplicably missing so I added that in as well.

ArraySeq could be made even cleaner with the addition of another variant that specifically handled the case of a null array (ArraySeq_null). I have no idea if that would be worth doing and I have not done that here.

Comment by Brandon Bloom [ 08/Oct/13 9:43 AM ]

Hey Alex: Thanks for following through on the vestiges removal.

Comment by Rich Hickey [ 22/Nov/13 1:44 PM ]

please add perf tests that demonstrate benefit

Comment by Alex Miller [ 17/Dec/13 11:40 AM ]

Removed performance angle and focus on clean-up and gap-filling (ArraySeq_short) aspects instead.

Comment by Alex Miller [ 11/Jan/14 8:54 AM ]

Point to considered patch, this got lost at some point in the description.

Generated at Sat Oct 25 14:10:24 CDT 2014 using JIRA 4.4#649-r158309.