
Limit concurrency of Polly.Cache calling its delegate per corresponding key #657

Closed
mrmartan opened this issue Jun 22, 2019 · 40 comments

@mrmartan

mrmartan commented Jun 22, 2019

Summary: What are you wanting to achieve?

I want the cache policy execution to limit concurrency on its data loading delegate per given operation key.

Imagine a distributed cache in Redis implemented through the IAsyncCacheProvider. The cached keys have a set expiry/TTL in Redis. The system in question experiences thousands of calls per second. Now a key in Redis expires. There is a high probability that tens of requests come in (nearly) simultaneously, all requiring the same key (from Redis). It is not there, so the cache executes the underlying data provider once per request, i.e. needlessly, because all of those executions will yield the same value. Executing the data provider is expensive. Then all of those executions start setting the same key/value pair in Redis, where setting it once would be enough. This IMHO causes unnecessary load on both the cache (Redis) and the backing data store, or rather bursts of load whenever a key expires.

I would like the provider to be executed only once, and once it returns, all outstanding/awaiting cache requests would be fulfilled immediately, also setting the value into the cache only once.

Do you have any suggestions please? Is this possible with Polly.Cache? Or is this something you would have to implement?

I have tried implementing this using Bulkhead, unsuccessfully. I would have to have two of them - one for the cache provider and one for the data provider - and even then it just limits the concurrency and does not allow for fulfilling all executions in its queue at once.

As I am writing this I thought of using a double cache with an inner in-memory layer, but I can't think of how to reasonably tell when it's OK to evict from the inner cache. The problem with Bulkhead is that it does not work on a per-key basis, and maintaining a collection of them could get expensive quickly. I am also worried about overhead.
Something like Polly.Cache(Redis:IAsyncCacheProvider) -> Bulkhead -> Polly.Cache(Memory:ConcurrentDictionary) -> Bulkhead -> DataLoader

@reisenberger
Member

Courtesy reply: I have a few ideas about how this could be done, but am unlikely to be able to respond in detail before the middle of next week.

@reisenberger
Member

Some quick input to get you a response: I would suggest using the ConcurrentDictionary<TKey, Lazy<TValue>> pattern (reference; reference) around your DataLoader, to ensure that you only execute the underlying data-loading call once and can share the result of that call across concurrent callers. TKey can be the OperationKey.

You would want to use the ConcurrentDictionary to cache and share the Lazy across the burst of concurrent calls, but you would probably want to thread-safely evict that instance of the Lazy from the ConcurrentDictionary once it had returned a value (indicating that particular execution cycle on the Lazy had finished).
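
For illustration, a minimal sketch of that pattern (the class name, string result type and dataLoader delegate below are placeholders, not Polly API):

```csharp
// Sketch only: illustrative names, not Polly API.
using System;
using System.Collections.Concurrent;

public static class RequestCollapser
{
    // One Lazy per operation key; every concurrent caller for the same key shares it.
    private static readonly ConcurrentDictionary<string, Lazy<string>> InFlight =
        new ConcurrentDictionary<string, Lazy<string>>();

    public static string GetOrLoad(string operationKey, Func<string> dataLoader)
    {
        // GetOrAdd may invoke the factory more than once under a race, but only one
        // Lazy instance is stored and handed to every caller, so dataLoader's body
        // executes at most once per burst.
        Lazy<string> lazy = InFlight.GetOrAdd(operationKey, _ => new Lazy<string>(dataLoader));

        try
        {
            return lazy.Value;
        }
        finally
        {
            // Evict once a value (or exception) has been produced, so the next cache
            // expiry starts a fresh cycle. A real implementation would guard against
            // removing a newer Lazy added by a subsequent cycle.
            InFlight.TryRemove(operationKey, out _);
        }
    }
}
```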

@phatcher

@mrmartan I'd also have a look at https://github.com/aspnet/Extensions/issues/708, which discusses the fact that GetOrAddAsync on IMemoryCache/IDistributedCache is not atomic.

You can also look at my NuGet package (Meerkat.Caching), which addresses this; you could use it as the basis of your own code.

@reisenberger
Member

There are also LazyCache and DoubleCache; it might be possible to implement a Polly CacheProvider for them, or to take inspiration from them - as well as from @phatcher's example (thanks Paul!).

@reisenberger
Member

reisenberger commented Jun 24, 2019

@phatcher Aren't most of the Semaphore-based approaches in that thread (EDIT: https://github.com/aspnet/Extensions/issues/708) more-or-less just rewriting Lazy? (edit: I was wrong about this; it was a different approach to the same problem)

@phatcher

phatcher commented Jun 24, 2019

@reisenberger Possibly, I haven't had a look at the implementation of LazyCache - I came across #708 when I was trying to implement my service (as opposed to HttpRequest) level cache that we discussed a little while ago.

Lazy was a pure in-memory implementation, which is why I took an extension method approach against both IMemoryCache/IDistributedCache for mine. I still have to see how it behaves in a large scale environment e.g. Kubernetes with pods coming and going.

What it allows is for the cache key and the ISynchronizer key to differ, which is useful when services could duplicate each other's work but you can't control the order in which the services are invoked.

@mrmartan
Author

Thanks guys. I've gone with @phatcher's Synchronizer for now. I like how it builds on top of IMemoryCache - not the NuGet package though, as it drags in dependencies I don't want.

As for ConcurrentDictionary<TKey, Lazy<TValue>>: I would have to use something more like ConcurrentDictionary<TKey, Lazy<Task<TValue>>>
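
For illustration, a minimal async sketch along those lines (again, placeholder names rather than Polly API); wrapping the Task in a Lazy ensures the delegate itself is only started once per burst:

```csharp
// Sketch only: illustrative names, not Polly API.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class AsyncRequestCollapser
{
    private static readonly ConcurrentDictionary<string, Lazy<Task<string>>> InFlight =
        new ConcurrentDictionary<string, Lazy<Task<string>>>();

    public static async Task<string> GetOrLoadAsync(string operationKey, Func<Task<string>> dataLoader)
    {
        // The Lazy guarantees the dataLoader delegate is started at most once per burst;
        // every concurrent caller awaits the same Task.
        Lazy<Task<string>> lazy = InFlight.GetOrAdd(
            operationKey, _ => new Lazy<Task<string>>(dataLoader));

        try
        {
            return await lazy.Value.ConfigureAwait(false);
        }
        finally
        {
            // Evict after completion so a later cache miss triggers a fresh load.
            InFlight.TryRemove(operationKey, out _);
        }
    }
}
```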

@reisenberger
Member

@mrmartan Great, glad you have a solution!

Re:

I would have to use something more like ConcurrentDictionary<TKey, Lazy<Task<TValue>>>

Sure; I was trying to get you a quick response rather than leave you waiting, and I forgot your case was async 🙂.

@phatcher nice caching library, thanks for linking.

@mrmartan
Author

mrmartan commented Jun 25, 2019

I do. I was not using Polly.Cache in the first place, and the solution led me to a custom GetOrSet implementation. I have a host of other applications that use Polly.Caching, though. Can you imagine a way to extend Polly.Caching's behavior in the way discussed here? I.e. so that the value provider and the cache provider's set operation are executed once and atomically.

Once you have time.

@reisenberger
Member

Hi @mrmartan . Yes, it would be with the ConcurrentDictionary<TKey, Lazy<Task<TValue>>> approach.

@reisenberger
Member

Linking a related discussion: https://github.com/aspnet/Extensions/issues/708

@jjxtra

jjxtra commented Dec 21, 2019

Lazy works well if only one box is serving requests, but if you have more than one box, you will still have problems with multiple queries doing the same thing on different machines. In this case, locking on a Redis key is necessary.

@jjxtra

jjxtra commented Jan 3, 2020

So I think whatever solution happens here needs to happen at the policy level, not the caching level. A single policy on a single machine with a given key should only ever execute one at a time; any additional call that comes in while that policy is executing should either wait on a lock or simply be handed the currently executing Task for that policy. So I think this issue is bigger than just caching. If Polly already has a mechanism to do this per key at the policy level, then the caching issue is moot.

A note about distributed caching: if you have a large server farm - think dozens or hundreds of machines or more - it could also be useful to lock on a distributed cache key (Redis can do this), so that 100 machines don't run an expensive database query at the same time. The 99 machines that don't get the key lock would wait (probably with a timeout), and when the key unlocks they would execute their policies, most likely getting a distributed cache entry as a result. This would need to be thought out more; this is just a quick and dirty high-level idea.

There would need to be a way to check an in-memory cache as the very first step, before doing any distributed cache locks. Although a key lock is super fast, there is no point in taking it if the object is already in local RAM.
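
As a rough illustration of that distributed-lock idea using StackExchange.Redis's LockTake/LockRelease helpers (the key prefix, token and expiry are arbitrary choices, and real code would add the wait/timeout behaviour described above):

```csharp
// Sketch only: a per-key lock in Redis via StackExchange.Redis; key prefix,
// token and expiry are arbitrary illustrative choices.
using System;
using StackExchange.Redis;

public static class RedisKeyLockExample
{
    public static bool TryRunExclusively(IDatabase redis, string key, Action expensiveWork)
    {
        string lockKey = "lock:" + key;
        string token = Guid.NewGuid().ToString("N"); // identifies this lock owner
        TimeSpan expiry = TimeSpan.FromSeconds(30);  // auto-release if this process dies

        // Only one machine in the farm wins the lock for this key.
        if (!redis.LockTake(lockKey, token, expiry))
        {
            return false; // someone else holds it: wait/poll the cache instead (not shown)
        }

        try
        {
            expensiveWork();
            return true;
        }
        finally
        {
            redis.LockRelease(lockKey, token);
        }
    }
}
```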

@reisenberger
Member

reisenberger commented Jan 3, 2020

To recap and summarize two possible technical solutions:

a) Thread-safe Lazy-per-key: such as the ConcurrentDictionary<TKey, Lazy<TValue>> (and equivalent async) pattern proposed here. Re what @jjxtra said on #740 ("the GetOrCreateAsync pattern perhaps better applies at a higher level than caching"): yes, exactly. I had had the idea that this could become a ConcurrentDuplicateRequestCollapserPolicy of general use, regardless of whether (or at what inner level) the governed work includes a CachePolicy execution.

b) Thread-safe lock-per-key: @phatcher has a great demonstration of this in Meerkat.Caching (see foot of readme), as a lock-per-key via SemaphoreSlim. A custom Polly policy or native Polly policy could do similar: perhaps LockPerKeyPolicy or DuplicateRequestLockPolicy.

In both cases, the key would default to Context.OperationKey (or the policy could take an ICacheKeyStrategy), for similarity with and likely partnering with CachePolicy.

Both would be primitives, separate policies which could be composed with cache (or other) policies.
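
For a feel of option (b), here is a minimal lock-per-key sketch with SemaphoreSlim (names are illustrative, not the eventual policy API); it deliberately includes the explicit DisposeKey hand-back discussed below:

```csharp
// Sketch only: illustrative names, not the eventual policy API.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class KeyedSemaphoreLock
{
    // One SemaphoreSlim per key. Note the collection only grows unless keys are
    // explicitly disposed - the scoping/disposal question raised below.
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _locks =
        new ConcurrentDictionary<string, SemaphoreSlim>();

    public async Task<T> ExecuteAsync<T>(string key, Func<Task<T>> action)
    {
        SemaphoreSlim gate = _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync().ConfigureAwait(false);
        try
        {
            return await action().ConfigureAwait(false);
        }
        finally
        {
            gate.Release();
        }
    }

    // Hand eviction/disposal back to the caller, per the LockPerKeyPolicy.DisposeKey idea.
    public void DisposeKey(string key)
    {
        if (_locks.TryRemove(key, out SemaphoreSlim gate))
        {
            gate.Dispose();
        }
    }
}
```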


A key question for either solution is the scoping, ownership and disposal of the semaphores/Lazy instances:

The question-mark over (b) is that the collection of semaphores can grow over time. Short of background jobs to clean this up, the only sensible solution is to hand control of semaphore eviction-and-disposal back to the user: something like LockPerKeyPolicy.DisposeKey(string key).

Option (a) has no such problem: the Lazy can be (thread-safely) cleared from the ConcurrentDictionary after it has returned a value; all concurrent executions still holding the Lazy will quickly go out of scope, allowing the Lazy to be GCd. The cost is an allocation of a Lazy<T> (32 bytes?) per concurrent invocation set; presumably small in comparison, if the motivation/gain is to avoid the heavy cost of the duplicate guarded operation. But by not baking this into the CachePolicy by default, users could choose where to apply it.

EDIT: And for the caching use-case, assuming the policies are nested cachePolicy.Wrap(duplicateRequestCollapserPolicy), then the cost of the Lazy would only be paid when that cache item was first populated (or expired and repopulated); not every execution.


For comparison, Alastair Crabtree's LazyCache also addresses the same problem (it's a common caching problem) as a standalone project. This (afaiu) caches a Lazy<T> (or async equivalent). That is suitable for memory caches, but not particularly for serialized-to-json distributed caches, so not generally applicable for Polly.

@jjxtra

jjxtra commented Jan 4, 2020

I like option a. I think having some sort of wrapper policy as you suggest is the way to go; it makes this "opt in" and is easy to apply to any policy with a potentially expensive operation. The added benefit of not maintaining a dictionary of semaphores is also a plus.

A couple of considerations, some or all of which you may have addressed already :)

  • The GetOrCreateAsync call still needs to be locked, even with a concurrent dictionary. I've tested this, and it's possible for multiple factory methods for the same key to be executed when using a ConcurrentDictionary with Lazy. The last factory method wins, but during this condition different Lazy results are returned. The lock can be a spin lock or another super-fast locking mechanism; it only needs to be held long enough for the Lazy object to be created and put into the ConcurrentDictionary. One solution I've used is to make an array of 8192 ints and use Interlocked.CompareExchange on a hash of the key % 8192. If the lock is not acquired, just yield (Thread.Yield, or await Task.Yield() in async code) and try again. Getting a lock should take no more than a microsecond, or less with no yield call. The lock can then be released as soon as the Lazy has been returned from GetOrCreateAsync, and the Lazy can be removed from the ConcurrentDictionary upon policy completion. (See the sketch after this list.)
  • The duplicate request collapser should use an interface to provide the lock, so it can be per machine or per Redis key, etc. Something like IKeyLock : IDisposable {}. Just constructing with a key and disposing would be the lock/unlock mechanism. In the case of a crash with Redis, the key lock would have a timeout.
  • Have a way to use an in-memory cache first, before the wrapper policy. A policy flow like this: [ In-memory cache check ] -> [ Wrapper policy on key [ policy a ] -> [ policy b ], etc. ]. So don't require the wrapper policy to be an all-encompassing policy. I assume this would be the case, just wanted to be sure.
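
Here is a rough sketch of the striped lock described in the first bullet (the array size and the Task.Yield back-off are illustrative only):

```csharp
// Sketch only: the 8192-stripe size and Task.Yield back-off are illustrative.
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StripedKeyLock
{
    // 8192 lock "stripes"; 0 = free, 1 = taken. Keys that hash to the same bucket share a stripe.
    private static readonly int[] Stripes = new int[8192];

    public static async Task<IDisposable> AcquireAsync(string key)
    {
        int index = (key.GetHashCode() & 0x7fffffff) % Stripes.Length;

        // Spin until the stripe flips from 0 to 1, yielding between attempts.
        while (Interlocked.CompareExchange(ref Stripes[index], 1, 0) != 0)
        {
            await Task.Yield();
        }

        return new Releaser(index);
    }

    private sealed class Releaser : IDisposable
    {
        private readonly int _index;
        public Releaser(int index) => _index = index;

        // Releasing is a single interlocked write back to 0.
        public void Dispose() => Interlocked.Exchange(ref Stripes[_index], 0);
    }
}
```

Usage would look something like using (await StripedKeyLock.AcquireAsync(context.OperationKey)) { /* create or fetch the Lazy */ }.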

Thanks for being so open and willing to discuss this. I am in charge of a large back-end with some pretty heavy SQL queries and am pushing a transition to Polly for our services.

@jjxtra

jjxtra commented Jan 12, 2020

@reisenberger Have you started any work on this? Any concerns if I take a stab at this? I could create a fork and start doing some work and would love to get some early feedback...

@reisenberger
Member

@jjxtra Awesome if you want to work on this! I already had an early code sketch - will share in the next 24-48 hours

@jjxtra

jjxtra commented Jan 14, 2020

Looking forward to it, would love collaboration here in order to come up with the best solution.

@reisenberger
Member

Thanks @jjxtra ! Awesome to collaborate.

Just brain-dumped here (final commit) the early sketch I had. Basically a realisation of the pattern first proposed here; I think this is fairly close to your first bullet here too? (Thanks, great discussion!) But please feel free to elaborate or propose changes! 👍 to the idea of an interface to inject lock strategies.

That ^ sketch is just an engine for the heart of the sync implementation. The best way to get an idea of the scaffolding for a policy and its syntax is to look at how NoOpPolicy is implemented (or others). We also have a wiki article on this.

The async version would be fairly similar, with ConcurrentDictionary<string, Lazy<Task<TValue>>>, do you think?


(Note: I am mid-way through working on Polly v8; the sketch branches off that, and it is now the best place to branch from.)

Thanks again!

@jjxtra

jjxtra commented Jan 14, 2020

Should I fork the repo and then build off of your sketch branch? Initial ideas for key-lock providers would be an in-memory per-process lock and a distributed Redis lock. The Redis lock depends on the StackExchange.Redis library, so is it better off in the Contrib framework instead of the core framework?

@reisenberger
Member

reisenberger commented Jan 14, 2020

Should I fork the repo and then build off of your sketch branch?

Yes, branch from there now. The ongoing v800 branch may be in motion over the coming days, but any re-basing should be limited (I can pitch in).

👍 to ideas for lock providers.

  • Yes, any ILockProvider implementation with new dependencies will definitely become a Polly-Contrib repo.
  • We have a blank template repo here for non-policy Polly-Contribs, and one here for policy contribs, providing a head start (typical repo content, TFMs, ready-made build etc 🙂).
    • There are a couple of low-hanging-fruit issues in each to bring them up to date with the latest Polly.
  • Shout when you have proposed lock providers (or share initially as a GitHub gist or something - your call), then we can decide for each whether it lives in the main Polly or in a Polly-Contrib repo.

Thanks for contributing!

@jjxtra

jjxtra commented Jan 15, 2020

Here is my idea for the in-process per-key lock. It is designed under the assumption that most locks will be acquired and released without contention, which should be the most common path. Acquiring the lock costs computing a hash value from a string, a modulus call, and finally an Interlocked.CompareExchange call.

I had considered using Monitor.Enter and Monitor.Exit, but that had thread implications that concerned me, along with requiring an object per key, which would generate more garbage. I also considered SemaphoreSlim and ManualResetEvent, but again, more garbage and, worse, kernel transitions...

I also considered using SpinWait (https://github.com/microsoft/referencesource/blob/master/mscorlib/system/threading/SpinWait.cs), but it seemed not to be .NET Standard 1.1 compliant, and it is probably overkill for the super-tight lock that we want.

Under the assumption that this will be wrapping an expensive computation and/or network call, it should be a minuscule percentage of CPU and wait time.

I also made the lock provider interface async, so that any distributed locking mechanism could fit nicely in, but we could change this back if you think we don't need it.

Finally, I had considered just locking on the string itself, but this would only work if the string were static or const; it would not work across libraries, or if the string was generated at runtime.

Here is the gist; please give me any and all feedback: https://gist.github.com/jjxtra/f6116180b2ef5c1550e60567af506c2a

@reisenberger
Member

Thanks @jjxtra for this and for everything to take ConcurrentDuplicateRequestCollapser forward!

I also made the lock provider interface async, so that any distributed locking mechanism could fit nicely in, but we could change this back if you think we don't need it

Great question. I think we will need both (separate) ISyncLockProvider and IAsyncLockProvider interfaces:

  • The synchronous version of the policy will need to consume a synchronous ISyncLockProvider.
  • We should provide configuration syntax so that the asynchronous version of the policy can be configured with either an IAsyncLockProvider or ISyncLockProvider (it may be common to use an in-process lock still on async executions). When the async policy is configured with a sync ISyncLockProvider, we can internally use a lightweight ValueTask<IDisposable> wrapper.

With that in mind, I think we can probably move your ProcessLockProvider back to a sync implementation.
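
To make that concrete, one possible shape for the two interfaces and the sync-to-async adapter (these signatures are only a guess at what the final API might look like):

```csharp
// Sketch only: a guess at the interface shapes, not the final Polly/Polly-Contrib API.
using System;
using System.Threading;
using System.Threading.Tasks;

public interface ISyncLockProvider
{
    // Blocks until the lock for the key is held; disposing the result releases it.
    IDisposable AcquireLock(string key, CancellationToken cancellationToken);
}

public interface IAsyncLockProvider
{
    // Async acquisition, e.g. for a distributed (Redis) lock.
    ValueTask<IDisposable> AcquireLockAsync(string key, CancellationToken cancellationToken);
}

// Allows the async policy to consume an in-process sync lock; the ValueTask wrapper
// avoids a Task allocation on this synchronous path.
public sealed class SyncToAsyncLockProvider : IAsyncLockProvider
{
    private readonly ISyncLockProvider _inner;

    public SyncToAsyncLockProvider(ISyncLockProvider inner) => _inner = inner;

    public ValueTask<IDisposable> AcquireLockAsync(string key, CancellationToken cancellationToken)
        => new ValueTask<IDisposable>(_inner.AcquireLock(key, cancellationToken));
}
```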

I had considered using Monitor.Enter and Monitor.Exit, but that had thread implications that concerned me

With you: Monitor cannot be released on a different thread than the one that acquired it. For this policy implementation, we would only be locking over same-thread manipulation of the in-memory ConcurrentDictionary<,>, so Monitor would be OK (say so if you think I'm missing something). (But this is not to take away from your thread-safe implementation...)

considered on just locking the string itself, but this would only work if

Agreed 👍

Still reflecting on where a key-lock implementation like this might eventually live (Polly, Polly.Contrib?), but hopefully some thoughts ☝️ to take things forward!

@jjxtra

jjxtra commented Jan 22, 2020

Just a heads-up: I made some changes to the gist to fix the concurrency issue you mentioned, along with providing sync and async versions. Let me know if you think it's good enough to use or if you have any further suggested changes.

@jjxtra

jjxtra commented Jan 23, 2020

I think adding the per-process key lock to the main Polly framework would be ideal. It could be the default implementation for the new policy, which ensures that work for a given key only executes once at a time.

The distributed lock will have a dependency on StackExchange.Redis, so it probably belongs in the Contrib project...

@reisenberger
Member

Thanks @jjxtra , planning to come back to this shortly.

@reisenberger
Member

@jjxtra After benchmarking and considering complexity trade-offs, I suggest the default lock implementation should be a single lock per policy, using the standard .NET lock / Monitor (which the benchmarks show as suitably performant for the majority use case). However, a keyed-lock implementation would be a good candidate for Polly-Contrib!

Thanks again for contributing! Let me know if I can provide any other guidance on building out the policy infrastructure (previous notes here).

@jjxtra

jjxtra commented Jan 25, 2020

@reisenberger I ran some additional tests and was wondering if you wanted to run one last benchmark on your machine using my modified code. I put my findings in the gist. There are interesting results as CPU core count increases.

Gist link for convenience, since this thread has gotten quite long: https://gist.github.com/jjxtra/f6116180b2ef5c1550e60567af506c2a

@reisenberger
Member

@jjxtra I wanted to progress the main Policy implementation for this as well as the locking discussion; I took that up and it is done. I will push it in the coming days and then return to this discussion.

@jjxtra

jjxtra commented Feb 4, 2020

Sounds good!

@reisenberger
Member

In some spare weekend hours I opted to push out the main Policy implementation for this before switching my attention back to Polly v8. The policy implementation is complete (with concurrency-exercising unit tests); I added documentation just now and pushed it all here.

The policy has moved to Polly-Contrib at Polly.Contrib.DuplicateRequestCollapser. @jjxtra: just as you had aspects of the striped locking where supporting .NET Standard 1.x presented challenges, likewise the policy implementation wanted a dependency only available to .NET Standard 2.x. Moving the policy to Polly-Contrib allows us to make that compromise (we would not drop .NET Standard 1.x support from core Polly just for this policy).

I have released an early version to NuGet, just to get a version of this out there and published. But @jjxtra: this is just the start; it would be great to see this taken further with further locking implementations! @jjxtra I will send you an invitation to join Polly-Contrib, which gives you rights on that repo. Feel free to expand/change locking implementations if you want to take ownership of that contrib project. @mrmartan @phatcher If you are interested in contributing there too, I can invite you.

Welcome feedback on the early nuget release ( @mrmartan @phatcher @jjxtra / anyone) if you get to exercise it in your environments. Perhaps best over on Polly.Contrib.DuplicateRequestCollapser repo, given that is where this has ended up!

Thanks!

@jjxtra

jjxtra commented Feb 7, 2020 via email

@reisenberger
Member

Np @jjxtra .

Going to close this issue now that a first version of the policy has been published, but know @jjxtra that you plan to expand on this (thanks!). Thanks to everybody on this thread for the contributions in thinking!

Also linked out from the Polly CachePolicy wiki area to the new policy: https://github.com/App-vNext/Polly/wiki/Avoiding-cache-repopulation-request-storms

@jjxtra

jjxtra commented Feb 20, 2020

If you have a few seconds to look at my distributed lock implementation, I'd love any feedback. I am using this code in some personal projects, and I have no issue licensing the distributed lock bits under your license. It does require StackExchange.Redis to be brought in, but I think that is a fair trade...

https://github.com/Polly-Contrib/Polly.Contrib.DuplicateRequestCollapser/blob/master/src/Polly.Contrib.DuplicateRequestCollapser/StackexchangeRedisDistributedLockProvider.cs

@reisenberger
Member

Thanks @jjxtra . I'll aim to jump back into this thread in the coming days!

@reisenberger
Member

@jjxtra Good to have a Redis lock in the mix for the request collapser: thanks for bringing this forward!

I'd like to create a new repo for this: Polly.Contrib.DuplicateRequestCollapser.RedisDistributedLock. There may well be users of the DuplicateRequestCollapserPolicy who only need the in-memory lock, not Redis; we can avoid those users taking unneeded/unwanted dependencies by hosting the Redis lock in a different repo. I'll make the new repo, then let's move your Redis code there.

Can I recommend always using GitFlow on the Polly-Contribs? (i.e. don't push interim work straight to the master branch). It helps others coming to your Polly-Contrib repo if branches follow standard expectations. Pretty sure this is old hat, but in case not, I've written up the motivations in more detail (if nothing else, at least for others on future projects).

@reisenberger
Member

@jjxtra Repo is ready, with a template: Polly.Contrib.DuplicateRequestCollapser.RedisDistributedLock. Let's pull your Redis contribution into there!

@jjxtra

jjxtra commented Feb 25, 2020

Thanks for making the repo; I will do PRs moving forward :)

@coultonluke

Is this going to be made available as a NuGet package, or should we just copy the code into our projects?

@martincostello
Member

That's a question for the repo containing the code - see Polly-Contrib/Polly.Contrib.DuplicateRequestCollapser.RedisDistributedLock#4
