Clojure

LazySeq should utilize cached hash from its underlying seq.

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Environment:
    1.6.0 master SNAPSHOT
  • Patch:
    Code
  • Approval:
    Triaged

Description

Even if underlying seq contains a cached hash, LazySeq computes it every time.

user=> (def a (range 100000))
#'user/a
user=> (time (hash a))
"Elapsed time: 46.904273 msecs"
375952610
user=> (time (hash a)) ;; RECOMPUTE
"Elapsed time: 10.879098 msecs"
375952610
user=> (def b (seq a))
#'user/b
user=> (time (hash b))
"Elapsed time: 10.572522 msecs"
375952610
user=> (time (hash b)) ;; CACHED HASH
"Elapsed time: 0.024927 msecs"
375952610
user=> (def c (lazy-seq b))
#'user/c
user=> (time (hash c))
"Elapsed time: 12.207651 msecs"
375952610
user=> (time (hash c))
"Elapsed time: 10.995798 msecs"
375952610
  1. clj-1373.diff
    09/Mar/14 9:16 AM
    0.7 kB
    Jozef Wagner
  2. clj-1373-2.diff
    04/May/15 12:27 PM
    0.8 kB
    Jozef Wagner

Activity

Jozef Wagner made changes -
Field Original Value New Value
Attachment clj-1373.diff [ 12865 ]
Hide
Jozef Wagner added a comment -

Added patch which checks if underlying seq implements IHashEq and if yes, uses that hash instead of recomputing.

Show
Jozef Wagner added a comment - Added patch which checks if underlying seq implements IHashEq and if yes, uses that hash instead of recomputing.
Alex Miller made changes -
Approval Triaged [ 10120 ]
Labels collections performance
Andy Fingerhut made changes -
Patch Code [ 10001 ]
Alex Miller made changes -
Description Even if underlying seq contains a cached hash, LazySeq computes it every time.

{code}
user=> (def a (range 100000))
#'user/a
user=> (time (hash a))
"Elapsed time: 46.904273 msecs"
375952610
user=> (time (hash a)) ;; RECOMPUTE
"Elapsed time: 10.879098 msecs"
375952610
user=> (def b (seq a))
#'user/b
user=> (time (hash b))
"Elapsed time: 10.572522 msecs"
375952610
user=> (time (hash b)) ;; CACHED HASH
"Elapsed time: 0.024927 msecs"
375952610
user=> (def c (lazy-seq b))
#'user/c
user=> (time (hash c))
"Elapsed time: 12.207651 msecs"
375952610
user=> (time (hash c))
"Elapsed time: 10.995798 msecs"
375952610
{code}
Even if underlying seq contains a cached hash, LazySeq computes it every time.

{code}
user=> (def a (range 100000))
#'user/a
user=> (time (hash a))
"Elapsed time: 46.904273 msecs"
375952610
user=> (time (hash a)) ;; RECOMPUTE
"Elapsed time: 10.879098 msecs"
375952610
user=> (def b (seq a))
#'user/b
user=> (time (hash b))
"Elapsed time: 10.572522 msecs"
375952610
user=> (time (hash b)) ;; CACHED HASH
"Elapsed time: 0.024927 msecs"
375952610
user=> (def c (lazy-seq b))
#'user/c
user=> (time (hash c))
"Elapsed time: 12.207651 msecs"
375952610
user=> (time (hash c))
"Elapsed time: 10.995798 msecs"
375952610
{code}

*Screened by:* Alex Miller
Labels collections performance collections ft performance
Hide
Alex Miller added a comment -

In this patch, can you update the else case (the original code) to use s rather than this, so seq() is not re-called?

Show
Alex Miller added a comment - In this patch, can you update the else case (the original code) to use s rather than this, so seq() is not re-called?
Alex Miller made changes -
Description Even if underlying seq contains a cached hash, LazySeq computes it every time.

{code}
user=> (def a (range 100000))
#'user/a
user=> (time (hash a))
"Elapsed time: 46.904273 msecs"
375952610
user=> (time (hash a)) ;; RECOMPUTE
"Elapsed time: 10.879098 msecs"
375952610
user=> (def b (seq a))
#'user/b
user=> (time (hash b))
"Elapsed time: 10.572522 msecs"
375952610
user=> (time (hash b)) ;; CACHED HASH
"Elapsed time: 0.024927 msecs"
375952610
user=> (def c (lazy-seq b))
#'user/c
user=> (time (hash c))
"Elapsed time: 12.207651 msecs"
375952610
user=> (time (hash c))
"Elapsed time: 10.995798 msecs"
375952610
{code}

*Screened by:* Alex Miller
Even if underlying seq contains a cached hash, LazySeq computes it every time.

{code}
user=> (def a (range 100000))
#'user/a
user=> (time (hash a))
"Elapsed time: 46.904273 msecs"
375952610
user=> (time (hash a)) ;; RECOMPUTE
"Elapsed time: 10.879098 msecs"
375952610
user=> (def b (seq a))
#'user/b
user=> (time (hash b))
"Elapsed time: 10.572522 msecs"
375952610
user=> (time (hash b)) ;; CACHED HASH
"Elapsed time: 0.024927 msecs"
375952610
user=> (def c (lazy-seq b))
#'user/c
user=> (time (hash c))
"Elapsed time: 12.207651 msecs"
375952610
user=> (time (hash c))
"Elapsed time: 10.995798 msecs"
375952610
{code}

Labels collections ft performance collections performance
Jozef Wagner made changes -
Attachment clj-1373-2.diff [ 14121 ]
Hide
Jozef Wagner added a comment -

Added patch clj-1373-2.diff that reuses s for else case.

Show
Jozef Wagner added a comment - Added patch clj-1373-2.diff that reuses s for else case.

People

Vote (1)
Watch (2)

Dates

  • Created:
    Updated: