Clojure

Support transients in gvec

Details

  • Type: Enhancement
  • Status: Open
  • Priority: Minor
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Patch: Code
  • Approval: Triaged

Description

Vectors of primitives produced by vector-of do not support transients.
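
For illustration, code that must work with both vector flavours can check for transient support before using it. The sketch below assumes that transient support is signalled by clojure.lang.IEditableCollection (the interface clojure.core/into also checks); the helper names editable? and conj-all are hypothetical and not part of any patch on this ticket.

```clojure
;; Sketch only: `editable?` and `conj-all` are hypothetical helper names.
(defn editable?
  "True when coll can be passed to `transient`."
  [coll]
  (instance? clojure.lang.IEditableCollection coll))

(defn conj-all
  "Adds every x in xs to coll, going through a transient when coll
  supports one and falling back to persistent conj otherwise."
  [coll xs]
  (if (editable? coll)
    (persistent! (reduce conj! (transient coll) xs))
    (reduce conj coll xs)))

;; Works for a gvec whether or not it supports transients:
(conj-all (vector-of :long) (range 5))
;; Takes the transient path for an ordinary vector:
(conj-all [] (range 5))
```

On a build without transient support in gvec, the first call is expected to take the slower persistent path; with native support it would take the transient path without any change to the caller.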

core.rrb-vector implements transient support for vectors of primitives. Such transient-enabled vectors of primitives can be obtained in several ways (fv below is an alias for the clojure.core.rrb-vector namespace): (1) passing a gvec instance to fv/catvec (if RRB concatenation actually happens, which is not guaranteed) or to fv/subvec; (2) passing a gvec instance to fv/vec, which as of core.rrb-vector 0.0.11 simply rewraps the gvec tree in an RRB wrapper; (3) using fv/vector-of instead of clojure.core/vector-of. Native support in gvec would still be useful as part of an effort to make supported functionality consistent across vector flavours (see CLJ-787 in this connection); gvec is also simpler and still has (and is likely to maintain) a performance edge.

A port of core.rrb-vector's transient support to gvec is available here:

https://github.com/michalmarczyk/clojure/tree/transient-gvec

I'll bring it up to date with current master shortly.

See the clojure-dev thread for some benchmarks:

https://groups.google.com/d/msg/clojure-dev/9ozYI1e5SCM/BAIazVOkUmcJ

Activity

Michał Marczyk added a comment - - edited

Here's the current version of the patch (0001-CLJ-1416-transients-hash-caching-for-gvec-Object-met.patch). It includes a few additional changes – here's the commit message:

CLJ-1416: transients, hash caching for gvec, Object methods for gvec seqs

  • Adds transient support to gvec
  • Adds hash{eq,Code} caching to gvec and gvec seqs
  • Implements hashCode, equals, toString for gvec seqs

https://github.com/michalmarczyk/clojure/tree/transient-gvec-1.6

Michał Marczyk added a comment - - edited

Here's an updated patch with some additional interop-related improvements.

The new commit message:

CLJ-1416: transients, hash caching, interop improvements for gvec

  • Adds transient support to gvec
  • Adds hash{eq,Code} caching to gvec and gvec seqs
  • Implements hashCode, equals, toString for gvec seqs
  • Correctly implements iterator-related methods for gvec and gvec seqs
  • Introduces throw-unsupported and caching-hash (both marked private)
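
The definition of caching-hash is not shown in this ticket, so the following is only a sketch of the general hash-caching idea, not the patch's code. The type name CachedHashSeq, the factory cached-hash-seq, and the use of -1 as the "not yet computed" marker are all assumptions made for illustration.

```clojure
;; Sketch only: CachedHashSeq and cached-hash-seq are hypothetical names.
;; The hasheq is computed on first use and stored in a mutable int field.
(deftype CachedHashSeq [coll ^:unsynchronized-mutable ^int cached]
  clojure.lang.IHashEq
  (hasheq [_]
    ;; -1 marks "not computed yet". A collection whose real hash is -1
    ;; is simply rehashed on every call, which is correct but slower.
    (when (== cached -1)
      (set! cached (int (hash-ordered-coll coll))))
    cached))

(defn cached-hash-seq
  "Wraps an ordered collection so that its hasheq is computed at most once
  (except in the -1 corner case noted above)."
  [coll]
  (CachedHashSeq. coll (int -1)))

;; `hash` dispatches to hasheq for IHashEq instances.
(hash (cached-hash-seq [1 2 3]))
```

The field is unsynchronized because a racing thread would at worst recompute the same value, which is the usual trade-off for cached hashes of immutable data.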
Andy Fingerhut added a comment -

Patch 0002-CLJ-1416-transients-hash-caching-interop-improvement.patch dated Jul 5 2014 no longer applied cleanly to latest master after some commits were made to Clojure on Aug 29, 2014. It did apply cleanly before that day.

I have not checked how easy or difficult it might be to update this patch. See section "Updating Stale Patches" on this wiki page for some tips on updating patches: http://dev.clojure.org/display/community/Developing+Patches

Michał Marczyk added a comment -

Patch updated to apply cleanly to master.

Brandon Bloom added a comment -

Maybe this should be another ticket, but it would affect this patch, so I'll mention it here:

The ArrayManager interface is an incomplete abstraction. The original gvec code plus the new transients codepaths rely on System/arraycopy, rather than .arraycopy on the manager object. This means that it's impossible to create gvecs backed by non-JVM arrays. Or, in my case, to create a gvec of nibbles backed by an array of longs. See https://gist.github.com/brandonbloom/441a4b5712729dec7467
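
To make the limitation concrete, here is a rough sketch of nibble storage in a long array. It does not implement clojure.core.ArrayManager and is not taken from the linked gist; the function names nibble-get, nibble-set! and nibble-arraycopy! are hypothetical. The point is that positions are counted in nibbles (16 per long), so a copy has to go through the element accessors rather than System/arraycopy, which only understands whole array slots.

```clojure
;; Sketch only: hypothetical helpers, 16 four-bit values packed per long.
(defn nibble-get
  "Reads the nibble at logical index i."
  ^long [^longs arr ^long i]
  (let [word  (aget arr (unsigned-bit-shift-right i 4)) ; i / 16
        shift (bit-shift-left (bit-and i 15) 2)]        ; (i mod 16) * 4
    (bit-and (unsigned-bit-shift-right word shift) 0xF)))

(defn nibble-set!
  "Writes the low four bits of v at logical index i; returns arr."
  [^longs arr ^long i ^long v]
  (let [idx   (unsigned-bit-shift-right i 4)
        shift (bit-shift-left (bit-and i 15) 2)
        mask  (bit-not (bit-shift-left 0xF shift))
        word  (aget arr idx)]
    (aset arr idx (bit-or (bit-and word mask)
                          (bit-shift-left (bit-and v 0xF) shift)))
    arr))

(defn nibble-arraycopy!
  "Copies n nibbles from src to dest. Positions are nibble indices, so this
  cannot be delegated to System/arraycopy. Assumes src and dest do not overlap."
  [^longs src src-pos ^longs dest dest-pos n]
  (dotimes [k n]
    (nibble-set! dest (+ dest-pos k) (nibble-get src (+ src-pos k))))
  dest)
```

A manager-level copy operation along these lines is what the vector code would need to call in place of System/arraycopy for such a backing store to work.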

Brandon Bloom added a comment -

The current patch has a bug on line 762:

(let [node ^clojure.core.VecNode (.ensureEditable this node)

There is no such signature, only these:

(ensureEditable [this]
(ensureEditable [this node shift]
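
A possible reason this kind of mistake survives compilation: when no method of the requested arity exists, the Clojure compiler (as far as I understand it) emits a reflective call, and the failure only appears when that code path runs. The sketch below reproduces the situation with a hypothetical interface IEditor and type Editor; it is not the patch's code.

```clojure
;; Sketch only: IEditor, Editor and bad-call are hypothetical names.
(definterface IEditor
  (ensureEditable [])
  (ensureEditable [node shift]))

(deftype Editor []
  IEditor
  (ensureEditable [this] :root)
  (ensureEditable [this node shift] [:node node shift]))

;; Compiles even though there is no one-argument ensureEditable;
;; the mismatch only surfaces when the function is called.
(defn bad-call [^Editor e node]
  (.ensureEditable e node))

(defn bad-call-outcome
  "Returns :threw if the mismatched call fails at runtime."
  [e]
  (try
    (bad-call e :some-node)
    :returned
    (catch IllegalArgumentException _ :threw)))
```

Enabling *warn-on-reflection* while compiling should flag such calls earlier, and generative tools like collection-check exercise the code paths that trigger them.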

I discovered this problem using https://github.com/ztellman/collection-check

Michał Marczyk added a comment -

Thanks for the catch! Fixed patch attached. (There was in fact one more bug in editableArrayFor, also fixed in this version.)

Michał Marczyk added a comment -

As for gvecs of nibbles, could that be a separate ticket with patches building on top of this one?

On a separate note, core.rrb-vector could support vectors of nibbles as an extra feature (and adopt built-in gvec's representation if indeed the built-in gvec comes to support this feature at some point). Do you think that'd be useful?

Michał Marczyk added a comment -

Of course vectors of nibbles could be implemented today with a separate vector type wrapping a gvec of longs, but the implementation would be more involved. I wonder what kind of performance difference there would be between the wrapper approach and the "nibble AM" approach…
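
As a rough picture of the wrapper approach, the functions below treat an existing gvec of longs as a persistent sequence of nibbles. This is only a sketch: the names nibble-nth and nibble-assoc are hypothetical, and a real wrapper type would also need count, conj, seq and the other vector interfaces, which is where the extra implementation effort comes from.

```clojure
;; Sketch only: hypothetical helpers over (vector-of :long ...),
;; 16 four-bit values per long element.
(defn nibble-nth
  "Reads the nibble at logical index i from a gvec of longs."
  ^long [longs-vec ^long i]
  (let [word  (long (nth longs-vec (quot i 16)))
        shift (* 4 (rem i 16))]
    (bit-and (unsigned-bit-shift-right word shift) 0xF)))

(defn nibble-assoc
  "Returns a new gvec of longs with the nibble at logical index i set to
  the low four bits of v. The original vector is left unchanged."
  [longs-vec ^long i ^long v]
  (let [idx     (quot i 16)
        shift   (* 4 (rem i 16))
        word    (long (nth longs-vec idx))
        cleared (bit-and word (bit-not (bit-shift-left 0xF shift)))]
    (assoc longs-vec idx
           (bit-or cleared (bit-shift-left (bit-and v 0xF) shift)))))

;; Two longs give room for 32 nibbles.
(-> (vector-of :long 0 0)
    (nibble-assoc 3 0xC)
    (nibble-assoc 20 0x7))
```

Every nibble update here pays for a full persistent assoc on the underlying gvec, which is one place a performance difference against a dedicated array manager might show up.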

