core.memoize

ttl functions sometimes return nil

Details

  • Type: Defect Defect
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None
  • Patch:
    Code and Test

Description

If I (memoize/ttl) a function, near the threshold time the result from calling the memoized function will sometimes be nil.

I've attached a patch that exploits the problem in the unit tests.

Reproducing in the REPL is easy. Just call memoize/ttl fn a bunch of times and watch for nil results.

user=> (require '[clojure.core.memoize :as m])
nil

user=> (def f (m/ttl (constantly 1)))
#'user/f

user=> (dotimes [n 1000000] (when (nil? (f)) (println "nil f")))
nil f
nil f
nil f
nil f
nil f
nil

The problem seems to be that when clojure.core.cache/through gets called, the item hasn't expired, but when clojure.core.cache/lookup happens in build-memoizer, the expiration has passed.

  1. CMEMOIZE-15-RETRY.diff
    04/Nov/14 9:49 AM
    3 kB
    Ryan Fowler
  2. fix-CMEMEMOIZE-15.diff
    01/Jul/14 8:22 PM
    2 kB
    Ryan Fowler
  3. ttl_bug.diff
    01/Jul/14 4:24 PM
    0.7 kB
    Ryan Fowler

Activity

Hide
Ryan Fowler added a comment -

I'm out of my depth here, but this patch seems to resolve my problem. It looks in the cache before & after the (through*) swap, picks the "after" value if non-nil and "before" value if the "after" value is nil.

Show
Ryan Fowler added a comment - I'm out of my depth here, but this patch seems to resolve my problem. It looks in the cache before & after the (through*) swap, picks the "after" value if non-nil and "before" value if the "after" value is nil.
Hide
Colin Jones added a comment - - edited

One alternative would be to repeat the

(swap! cache through* f args)
if the value is nil.

Something like

-            (and val @val)))
+            (if (nil? val)
+              (swap! cache through* f args)
+              @val)))

There are a few benefits: less code, only expiration forces you to pay for an additional cache lookup, and it seems like it might be preferable semantically (you don't return a known-to-be-expired value).

A downside is that you may do multiple cache writes if the ttl is low enough (not sure if you can do sub-millisecond ttls?) or the thread scheduling is wonky enough to make that happen. But retries can happen with swap! regardless, so I'm not sure this drawback is significant.

p.s. You should also make sure to check out http://dev.clojure.org/display/community/Developing+Patches and use git format-patch for your patch, since the current ones don't have a `git am`-friendly format.

Show
Colin Jones added a comment - - edited One alternative would be to repeat the
(swap! cache through* f args)
if the value is nil. Something like
-            (and val @val)))
+            (if (nil? val)
+              (swap! cache through* f args)
+              @val)))
There are a few benefits: less code, only expiration forces you to pay for an additional cache lookup, and it seems like it might be preferable semantically (you don't return a known-to-be-expired value). A downside is that you may do multiple cache writes if the ttl is low enough (not sure if you can do sub-millisecond ttls?) or the thread scheduling is wonky enough to make that happen. But retries can happen with swap! regardless, so I'm not sure this drawback is significant. p.s. You should also make sure to check out http://dev.clojure.org/display/community/Developing+Patches and use git format-patch for your patch, since the current ones don't have a `git am`-friendly format.
Hide
Christophe Grand added a comment -

Isn't the problem the TTL cache impl itself? CacheProtocol is purely functional but TTLCache is not (has? or lookup are time sensitive). Cache-invalidation on write (rather than on read) may be a solution that wouldn't require changing the protocol.

Show
Christophe Grand added a comment - Isn't the problem the TTL cache impl itself? CacheProtocol is purely functional but TTLCache is not (has? or lookup are time sensitive). Cache-invalidation on write (rather than on read) may be a solution that wouldn't require changing the protocol.
Hide
Ryan Fowler added a comment -

Thanks Guys.

Chris, I think you're on to something. If TTLCache/lookup doesn't filter out results based on time, things make a whole lot more sense and the TTLCache acts more like normal Clojure.

(miss) and (evict) are the only functions that actually remove stuff from the cache, so one of those functions needs to get called regularly anyway.

Show
Ryan Fowler added a comment - Thanks Guys. Chris, I think you're on to something. If TTLCache/lookup doesn't filter out results based on time, things make a whole lot more sense and the TTLCache acts more like normal Clojure. (miss) and (evict) are the only functions that actually remove stuff from the cache, so one of those functions needs to get called regularly anyway.
Hide
Colin Jones added a comment -

Maybe I'm just being dense, but I'd personally be surprised if an implementation of CacheProtocol's lookup fn gave me back a value that should be expired according to the TTL. Imagine if your TTL was 30 seconds, and you did a read 10 minutes after your last write - calling lookup and getting a 10-minute-old value seems problematic.

Show
Colin Jones added a comment - Maybe I'm just being dense, but I'd personally be surprised if an implementation of CacheProtocol's lookup fn gave me back a value that should be expired according to the TTL. Imagine if your TTL was 30 seconds, and you did a read 10 minutes after your last write - calling lookup and getting a 10-minute-old value seems problematic.
Hide
Ryan Fowler added a comment -

Implementation of Colin Jones' suggestion to retry swap!/lookup when a cache lookup indicates that the cache entry doesn't exist.

Show
Ryan Fowler added a comment - Implementation of Colin Jones' suggestion to retry swap!/lookup when a cache lookup indicates that the cache entry doesn't exist.
Hide
Arthur Maciejewicz added a comment - - edited

We ran into this issue as well, and the temporary fix if anyone else needs it, is to call the memoized function a second time, which results in a cache miss and returns a new value.

@Colin nil is an unexpected value because in the case of a cache miss, the key's value will be repopulated. However, in this scenario, has? returns both true and false in the same function call.

Show
Arthur Maciejewicz added a comment - - edited We ran into this issue as well, and the temporary fix if anyone else needs it, is to call the memoized function a second time, which results in a cache miss and returns a new value. @Colin nil is an unexpected value because in the case of a cache miss, the key's value will be repopulated. However, in this scenario, has? returns both true and false in the same function call.
Hide
Ryan Fowler added a comment - - edited

For what it's worth, we've been running with CMEMOIZE-15-RETRY.diff in production for well over a year without issue. It's a shame that this hasn't been fixed in a proper release of either core.memoize or core.cache.

Show
Ryan Fowler added a comment - - edited For what it's worth, we've been running with CMEMOIZE-15-RETRY.diff in production for well over a year without issue. It's a shame that this hasn't been fixed in a proper release of either core.memoize or core.cache.
Hide
Alex Miller added a comment -

Unless I'm missing something, it seems like this patch hides the issue rather than fixes it? That doesn't seem like the right "fix".

Show
Alex Miller added a comment - Unless I'm missing something, it seems like this patch hides the issue rather than fixes it? That doesn't seem like the right "fix".
Hide
Joshua Tilles added a comment -

I agree with Alex: the “retry if nil” strategy feels like a hack to me, although I think it’s the most appropriate workaround in the absence of an official fix.

Here’s an alternative approach:

diff --git a/src/main/clojure/clojure/core/memoize.clj b/src/main/clojure/clojure/core/memoize.clj
index 5b448eb..a18ebfe 100644
--- a/src/main/clojure/clojure/core/memoize.clj
+++ b/src/main/clojure/clojure/core/memoize.clj
@@ -159,7 +159,11 @@
        (with-meta
         (fn [& args]
           (let [cs  (swap! cache through* f args)
-                val (clojure.core.cache/lookup cs args)]
+                ;; Since `cache/through` (via `through*`) already
+                ;; handled hitting & missing, treat the resulting `cs`
+                ;; as a “snapshot” and sidestep any call to `cache/lookup`.
+                val (get (.cache (.cache ^PluggableMemoization cs))
+                         args)]
             ;; The assumption here is that if what we got
             ;; from the cache was non-nil, then we can dereference
             ;; it.  core.memo currently wraps all of its values in

With this change, all of the tests still pass, including the new test from the attached ttl_bug.diff.

However, it’s more of a proof-of-concept than a viable fix (hence its living in this comment instead of being attached as a patch) because it introduces reflection (the outer .cache) by relying on the implementation detail that all of the cache implementations defined with defcache in clojure.core.cache have a raw backing map in a field named cache.

Of course, the reflection could be resolved with a new protocol method. I don’t yet have an opinion on whether it would be “better” to expand clojure.core.cache/CacheProtocol or to define a new protocol, nor if such a new protocol should live in clojure.core.cache or clojure.core.memoize.

Show
Joshua Tilles added a comment - I agree with Alex: the “retry if nil” strategy feels like a hack to me, although I think it’s the most appropriate workaround in the absence of an official fix. Here’s an alternative approach:
diff --git a/src/main/clojure/clojure/core/memoize.clj b/src/main/clojure/clojure/core/memoize.clj
index 5b448eb..a18ebfe 100644
--- a/src/main/clojure/clojure/core/memoize.clj
+++ b/src/main/clojure/clojure/core/memoize.clj
@@ -159,7 +159,11 @@
        (with-meta
         (fn [& args]
           (let [cs  (swap! cache through* f args)
-                val (clojure.core.cache/lookup cs args)]
+                ;; Since `cache/through` (via `through*`) already
+                ;; handled hitting & missing, treat the resulting `cs`
+                ;; as a “snapshot” and sidestep any call to `cache/lookup`.
+                val (get (.cache (.cache ^PluggableMemoization cs))
+                         args)]
             ;; The assumption here is that if what we got
             ;; from the cache was non-nil, then we can dereference
             ;; it.  core.memo currently wraps all of its values in
With this change, all of the tests still pass, including the new test from the attached ttl_bug.diff. However, it’s more of a proof-of-concept than a viable fix (hence its living in this comment instead of being attached as a patch) because it introduces reflection (the outer .cache) by relying on the implementation detail that all of the cache implementations defined with defcache in clojure.core.cache have a raw backing map in a field named cache. Of course, the reflection could be resolved with a new protocol method. I don’t yet have an opinion on whether it would be “better” to expand clojure.core.cache/CacheProtocol or to define a new protocol, nor if such a new protocol should live in clojure.core.cache or clojure.core.memoize.
Hide
Joshua Tilles added a comment -

Possible names for the prospective protocol method:

  • raw-backing-map
  • backing-data
  • snapshot
    • I am aware that core.memoize already defines its own snapshot function, but I don’t think there would be any problem if the protocol method lived in core.cache.

And for the protocol itself (assuming the method isn’t just added to clojure.core.cache/CacheProtocol):

  • Transparence
  • Snapshottable
Show
Joshua Tilles added a comment - Possible names for the prospective protocol method:
  • raw-backing-map
  • backing-data
  • snapshot
    • I am aware that core.memoize already defines its own snapshot function, but I don’t think there would be any problem if the protocol method lived in core.cache.
And for the protocol itself (assuming the method isn’t just added to clojure.core.cache/CacheProtocol):
  • Transparence
  • Snapshottable
Hide
Joshua Tilles added a comment -

Alex Miller (or Fogus, of course) any thoughts on the strategy described above? Is it worth fleshing out?

Show
Joshua Tilles added a comment - Alex Miller (or Fogus, of course) any thoughts on the strategy described above? Is it worth fleshing out?

People

Vote (3)
Watch (3)

Dates

  • Created:
    Updated: