
[CMEMOIZE-24] Thread synchronized and unsynchronized spoofing Created: 22/Apr/18  Updated: 24/Apr/18

Status: Open
Project: core.memoize
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Leon Grapenthin Assignee: Sean Corfield
Resolution: Unresolved Votes: 0
Labels: None


Problem: Function A is memoized. Function B, among other things, computes the result of a valid call to A without ever invoking A. We want to exploit this in B to further cut down invocations of A. When we expect B to run faster than A, we also want callers of A to block while a corresponding B is running.

A desired operation for doing this in B would be
(spoof A args calc)

Where calc is a fn of zero args that produces a value equal to (apply A args). This lets the caller put exactly as much work into calc as they expect to run faster than A.

Spoof returns what calc returns, or, on a cache hit, what is cached for args. I.e., if an invocation (apply A args) happened before the call to (spoof A args calc), calc won't be invoked.

In cases where we already have the result of (A args) and don't want to be blocked by a concurrent invocation of (A args), a desired operation would be

(spoof-unsynchronized A args val)

It returns val immediately and updates the cache of A shortly thereafter. It doesn't have to block invocations of (A args), since the unsynchronized spoof is a very fast operation that is subject to a race with invocations of (A args) anyway. (So a swap! on the internal cache state should suffice.)
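A minimal sketch of such an unsynchronized spoof, assuming the cache lives in the memoized fn's metadata under :clojure.core.memoize/cache and that core.memoize stores values wrapped in IDeref (both points are confirmed later in this thread). The name spoof-unsynchronized is the proposed API, not anything core.memoize provides:

```clojure
(require '[clojure.core.cache :as cache])

;; Hypothetical sketch -- not part of core.memoize.
(defn spoof-unsynchronized
  [memoized-f args val]
  ;; core.memoize wraps cached values in IDeref, so store a delay.
  (swap! (:clojure.core.memoize/cache (meta memoized-f))
         cache/miss args (delay val))
  ;; Return val immediately; the swap! above deliberately races with
  ;; any concurrent invocation of the memoized fn.
  val)
```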

Comment by Sean Corfield [ 23/Apr/18 2:16 PM ]

This doesn't feel like something that should be part of the core.memoize library but it should be possible to build something like this on top of it (and it isn't right now, correct?).

Would the minimum change necessary to allow this be to expose the underlying cache (so you could invoke deref/swap! against the cache directly)?

Comment by Leon Grapenthin [ 23/Apr/18 3:02 PM ]

For the unsynchronized case, yes. For the synchronized (interesting) case, maybe with nasty hacking; I'd have to look at the code.

I disagree that this functionality should not be built in. It's quite general and allows very flexible, user-defined memoization implementations.

The design recommended above is stolen from Google Guava's Cache.


It allows memoization like this. You just call
(.get c [arg1] (fn [] "calc for [arg1]")) ;; notice how easy it is to implement a custom memoizer with just that facility!

The necessary logic that runs fn synchronized (once) is, in our case, implemented in core.memoize. So if core.memoize exposed a similar facility, I believe that would be a very generic API that allows much more to be built on top (custom memoizer implementations).

It also has .put, which handles the unsynchronized case.
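For reference, the two Guava calls in question, sketched via Clojure interop (assumes Guava is on the classpath; the keys and values here are made up for illustration). Clojure fns implement java.util.concurrent.Callable, so they can be passed directly to Cache.get:

```clojure
(import '(com.google.common.cache CacheBuilder))

;; A plain (non-loading) Guava Cache.
(def c (.. (CacheBuilder/newBuilder) (maximumSize 100) build))

;; Synchronized: the Callable runs at most once per key; concurrent
;; callers of .get for the same key block until the value is ready.
(.get c [1 2] (fn [] (+ 1 2)))

;; Unsynchronized: installs a value directly, racing with any
;; concurrent .get for the same key.
(.put c [3 4] 7)
```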

Comment by Sean Corfield [ 23/Apr/18 6:45 PM ]

Interesting. My next question was going to be "where did you get the idea for this?".

The underlying machinery in core.memoize works in a similar way, with a through* function that handles the thunk etc, but it's the build-memoizer function that encapsulates all of that, along with the PluggableMemoizer type.

I'll have to give this some thought. The cache itself is attached to the metadata for the function object returned from build-memoizer so, technically, it's all already exposed if you wanted to program this yourself.

Comment by Leon Grapenthin [ 23/Apr/18 7:40 PM ]

My understanding is that I would need to invoke the private through* with a patched (args-ignoring) fn.

Can't really call that exposed.

Comment by Leon Grapenthin [ 23/Apr/18 7:42 PM ]

And I'd also need to copy-paste the code following line 209 that handles a rare timing issue.

Comment by Leon Grapenthin [ 23/Apr/18 7:46 PM ]

Or did I misunderstand this, and you were suggesting what I'd do if I wanted to program a patch?

Comment by Sean Corfield [ 23/Apr/18 8:14 PM ]

I meant: the raw cache is right there in the metadata if you wanted to check if an argument list key existed and/or update the cache with a new value via the thunk you talk about above. You can invoke core.cache/through (or core.cache/through-cache more likely) directly on that cache, with a thunk (albeit of one argument, which you could just ignore):

(swap! (:clojure.core.memoize/cache (meta the-fn-var)) clojure.core.cache/through-cache [arg1] (fn [_] "calc for [arg1]"))

Any user of TTL cache has to potentially deal with the edge case of eviction during access (unless they don't care about getting nil back) so, yes, a spoof function that does the above swap! would potentially have to retry the thunk invocation – whether you also updated the cache at that point would be up to you (i.e., what behavior do you want for subsequent calls if a cache eviction occurred during your spoof call?).

The point is that spoof requires the actual cache to do its job. The memoized function only has access to that cache by virtue of being a closure, so spoof has to operate on the Var holding the memoized function.

Does that clarify what I'm suggesting?
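Putting the suggestion above together, a user-land spoof might look roughly like this. This is a hypothetical sketch, not library code; note that it does not reproduce through*'s locking, so calc can still race with a concurrent invocation of the memoized fn, and it includes the TTL-eviction fallback mentioned above:

```clojure
(require '[clojure.core.cache :as cache])

;; Hypothetical user-land spoof built on the cache atom exposed in
;; the memoized fn's metadata.
(defn spoof
  [memoized-f args calc]
  (let [cache-atom (:clojure.core.memoize/cache (meta memoized-f))
        ;; core.memoize stores IDeref-wrapped values, so wrap the
        ;; thunk's result in a delay; through-cache's value-fn takes
        ;; the item, which we ignore.
        c (swap! cache-atom cache/through-cache args
                 (fn [_] (delay (calc))))]
    ;; A TTL cache may evict between the swap! and the lookup;
    ;; fall back to calling the thunk again in that rare case.
    (if-let [v (cache/lookup c args)]
      @v
      (calc))))
```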

Comment by Leon Grapenthin [ 24/Apr/18 4:52 AM ]

Spoof would also want to use the internal through* for the locking facilities in d-lay. It might require minor adjustments to support a thunk.

The design recommended above intentionally takes a memoized function so that it can extract ::cache.

Regarding the TTL fix, I have to look at it again. Please let me know if you are interested in a patch with spoof/spoof-async...

Comment by Sean Corfield [ 24/Apr/18 12:07 PM ]

I hate the names but I'd consider a patch that exposed enough of the machinery to allow users to build their own versions outside the library more easily.

Comment by Leon Grapenthin [ 24/Apr/18 1:13 PM ]

I chose the names to fit the current design of the lib. Where Guava gives you powerful primitives and claims generic low-level names, this lib is written in a style that makes strong assumptions about what the user wants to do with it. It gives you a memoized function with a cache attached right away.

This is why I have chosen names that only describe why you would use them /additionally/. There might be a better word than "spoof" - I'm not a native speaker.

core.cache only exposes the data structures of caches, but it doesn't expose a proper reference type that caches calculations (instead of values), like Guava does. I suspect the core.cache authors omitted this because they thought Clojure's reference types would suffice. core.memoize implements such a reference type for itself as an internal. I'm unsure whether exposing this functionality, which has a good number of use cases outside of memoization, would be in the right place in this lib. Hence the naming and the wrapping around memoized functions, to clearly suggest that the functionality is meant to stay within the scope of memoization.

[CMEMOIZE-9] memo-swap! is misnamed Created: 23/Sep/13  Updated: 01/Mar/18

Status: Open
Project: core.memoize
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Mark Engelberg Assignee: Sean Corfield
Resolution: Unresolved Votes: 0
Labels: None


In order to be analogous to atoms, the current behavior of memo-swap! should be called memo-reset! since it overwrites the entire memo map. Similarly, there should be a function called memo-swap! that does an atomic swap operation on the memo cache so, for example, you can add a single value to the cache.

Comment by Sean Corfield [ 01/Mar/18 2:42 AM ]

Good point! Backward compatibility could be maintained by checking if (map? base) – and perhaps giving a deprecation warning? – but users would have to be very careful: they would be manipulating the cache directly, and core.memoize wraps raw values in IDeref instances, so a simple

(memo-swap! f assoc [42] 99)

would not work – it would need to be

(memo-swap! f assoc [42] (delay 99))

which would make this the only place in the API where that is really exposed.
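A sketch of what a swap-like memo-swap! could look like under this proposal. This is hypothetical (the name memo-swap!* is made up to avoid clashing with the existing memo-swap!), and the IDeref caveat above still applies: values added this way must be wrapped, e.g. in a delay:

```clojure
;; Hypothetical -- not the current core.memoize API.
(defn memo-swap!*
  "Atomically apply f (plus args) to the underlying cache of the
  memoized function memoized-f, analogous to clojure.core/swap!."
  [memoized-f f & args]
  (apply swap!
         (:clojure.core.memoize/cache (meta memoized-f))
         f args))

;; Adding a single entry; the value must be IDeref-wrapped:
;; (memo-swap!* memoized-fn assoc [42] (delay 99))
```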

Generated at Tue Jun 19 07:32:58 CDT 2018 using JIRA 4.4#649-r158309.