Clojure

Add fast path in seq comparison for structurally sharing seqs

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Patch:
    Code and Test

Description

Currently comparing two non identical seqs requires iterating through both seqs comparing value by value, ignoring the possibility of seq `a` and `b` having the same (pointer-equal) rest.

The proposed patch adds a pointer equality check on the seq tails that can make the equality short-circuit if the test returns true, which is helpful when comparing large (or possibly infinite) seqs that share a common subseq.

After this patch, comparisons like

(let [x (range)] (= x (cons 0 (rest x))))
which currently don't terminate, return true in constant time.

Patch: CLJ-1679-v3.patch

  1. 0001-CLJ-1679-do-pointer-checks-in-seq-equality.patch
    21/Mar/15 2:02 PM
    2 kB
    Nicola Mometto
  2. CLJ-1679-v2.patch
    23/Mar/15 1:01 PM
    2 kB
    Michael Blume
  3. CLJ-1679-v3.patch
    23/Mar/15 1:24 PM
    2 kB
    Michael Blume

Activity

Hide
Michael Blume added a comment -

When this test fails (it fails on my master, but I've got a bunch of other development patches, I'm still figuring out where the conflict is), it fails by hanging forever. Maybe it'd be better to check equality in a future and time out the future?

Show
Michael Blume added a comment - When this test fails (it fails on my master, but I've got a bunch of other development patches, I'm still figuring out where the conflict is), it fails by hanging forever. Maybe it'd be better to check equality in a future and time out the future?
Hide
Michael Blume added a comment -

like so =)

Show
Michael Blume added a comment - like so =)
Hide
Nicola Mometto added a comment -

Makes sense, thanks for the updated patch

Show
Nicola Mometto added a comment - Makes sense, thanks for the updated patch
Hide
Michael Blume added a comment -

Hm, previous patch had a problem where the reporting logic still tries to force the sequence and OOMs, this patch prevents that.

Show
Michael Blume added a comment - Hm, previous patch had a problem where the reporting logic still tries to force the sequence and OOMs, this patch prevents that.
Hide
Michael Blume added a comment -

ok, looks like CLJ-1515, CLJ-1603, and this patch, all combine to fail together, though any two of them work fine.

Show
Michael Blume added a comment - ok, looks like CLJ-1515, CLJ-1603, and this patch, all combine to fail together, though any two of them work fine.
Hide
Michael Blume added a comment -

(And really there's nothing wrong with the source of this patch, it still works nicely to short-circuit = where there's structural sharing, it's just that the other two patches break structural sharing for ranges, so the test fails)

Show
Michael Blume added a comment - (And really there's nothing wrong with the source of this patch, it still works nicely to short-circuit = where there's structural sharing, it's just that the other two patches break structural sharing for ranges, so the test fails)
Hide
Nicola Mometto added a comment -

I see, I guess we'll have to change the test if the patches for those tickets get applied.

Show
Nicola Mometto added a comment - I see, I guess we'll have to change the test if the patches for those tickets get applied.

People

Vote (0)
Watch (0)

Dates

  • Created:
    Updated: