-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit concurrency of Polly.Cache calling its delegate per corresponding key #657
Comments
Courtesy reply: I have a collection of ideas how this could be done, but am unlikely to be able to respond in detail before the middle of next week. |
Some quick input to get you a response: I would suggest using the You would want to use the |
@mrmartan I'd also have a look at https://github.com/aspnet/Extensions/issues/708 which discusses that GetOrAddAsync on IMemoryCache/IDistributedCache it not atomic. You can also look at my NuGet package (Meerkat.Caching) which addresses this and use this as the basis of your own code. |
And there is also LazyCache and DoubleCache, which it might be possible to implement a Polly CacheProvider for, or take inspiration from - as well as @phatcher 's example (thanks Paul!). |
@phatcher Aren't most of the |
@reisenberger Possibly, I haven't had a look at the implementation of LazyCache - I came across #708 when I was trying to implement my service (as opposed to HttpRequest) level cache that we discussed a little while ago. Lazy was a pure in-memory implementation, which is why I took an extension method approach against both IMemoryCache/IDistributedCache for mine. I still have to see how it behaves in a large scale environment e.g. Kubernetes with pods coming and going. What it allows is that the cache key and ISynchronizer key different - which is useful when services could duplicate each other's work, but you can't control the order in which services are invoked. |
Thanks guys. I've went with @phatcher's Synchronizer for now. I like how it builds on top of IMemoryCache. Not the nuget though as it drags dependencies I don't want. As for |
@mrmartan Great, glad you have a solution! Re:
Sure, I was making an effort to get you a quick response rather than leave you waiting, I forgot your case was async 🙂 . @phatcher nice caching library, thanks for linking. |
I do. I was not using Polly.Cache in the first place and the solution lead me to custom Once you have time. |
Hi @mrmartan . Yes, it would be with the |
Linking a related discussion; https://github.com/aspnet/Extensions/issues/708 |
Lazy works well if only one box is serving requests, but if you have more than one box, you will still have problems with multiple queries doing the same thing on different machines. In this case, locking on a redis key is necessary. |
So I think whatever solution happens here needs to happen at the policy level, not the caching level. A single policy on a single machine with a given key should only ever execute one at a time, any additional call that comes in while that policy is executing should either lock or simply be returned the currently executing A note about distributed caching - if one has a large server farm, think dozens or hundreds of machines or more, it could also be useful to lock on a distributed cache key (redis can do this), that way 100 machines don't do an expensive database query at the same time. The 99 machines that don't get the key lock would wait (with timeout probably) and when the key unlocks, they would execute their policies, most likely getting a distributed cache entry as a result. This would need to be thought out more, this is just a quick and dirty high level idea. There would need to be a way to check an in memory cache as the very first step before doing any distributed cache locks as well. Although a key lock is super fast, no point in doing it if the object is in local RAM. |
To recap and summarize two possible technical solutions: a) Thread-safe b) Thread-safe lock-per-key: @phatcher has a great demonstration of this in Meerkat.Caching (see foot of readme), as a lock-per-key via In both cases, the key would default to Both would be primitives, separate policies which could be composed with cache (or other) policies. A key question for either solution is scoping, ownership and disposal of the semaphores/lazys: The question-mark over (b) is that the collection of Semaphores can grow over time. Avoiding background jobs to clean this up, the only sensible solution is to hand control of semaphore-eviction-and-disposal back to the user: something like Option (a) has no such problem: the EDIT: And for the caching use-case, assuming the policies are nested For comparison, Alastair Crabtree's LazyCache also addresses the same problem (it's a common caching problem) as a standalone project. This (afaiu) caches a |
I like option A couple of considerations, some or all of which you may have addressed already :)
Thanks for being so open and willing to discuss this, I am in charge of a large back-end with some pretty heavy sql queries and am pushing a transition to Polly for our services. |
@reisenberger Have you started any work on this? Any concerns if I take a stab at this? I could create a fork and start doing some work and would love to get some early feedback... |
@jjxtra Awesome if you want to work on this! I already had an early code sketch - will share in the next 24-48 hours |
Looking forward to it, would love collaboration here in order to come up with the best solution. |
Thanks @jjxtra ! Awesome to collaborate. Just brain-dumped here (final commit) the early sketch I had. Basically a realisation of the pattern first proposed here; think this is fairly close to your first bullet here too? (thanks, great discussion!). But please feel free to elaborate / make / propose changes! 👍 to the idea of interface to inject lock strategies. That ^ sketch is just an engine for the heart of the sync implementation. The best way to get an idea for the scaffolding for a policy and its syntax is look at how Async version fairly similar with (Note: I am mid working on Polly v8; the sketch branches off that and is best place now to branch from.) Thanks again! |
Should I fork the repo and then build off of your sketch branch then? Initial ideas for key-lock providers would be an in-memory per process lock, and a distributed redis lock. The redis lock depends on stack-exchange redis library, so is it better off in the contrib framework instead of the core framework? |
Yes, branch from there now. The on-going 👍 to ideas for lock providers.
Thanks for contributing! |
Here is my idea for the in-process per key lock. It is designed under the assumption that most locks will be acquired and released without contention, which should be the most common path. Acquiring the lock is the cost of computing a hash value from a string, a modulus call and finally an interlocked compare exchange call. I had considered using I also considered using Under the assumption that this will be wrapping an expensive computation and/or network call, it should be a minuscule percentage of cpu and wait time. I also made the lock provider interface async, so that any distributed locking mechanism could fit nicely in, but we could change this back if you think we don't need it. Finally, I had considered on just locking the string itself, but this would only work if the string was a static or const, and would not work cross libraries, or if the string was generated at runtime. Here is the gist, please give me any and all feedback: https://gist.github.com/jjxtra/f6116180b2ef5c1550e60567af506c2a |
Thanks @jjxtra for this and for everything to take
Great question. I think we will need both (separate)
With that in mind, I think we can probably move your
With you:
Agreed 👍 Still reflecting on where a key-lock implementation like this might eventually live ( |
Just a heads up I made some changes to the gist to fix the concurrency issue you mentioned, along with providing a sync and async version. Let me know if you think it's good enough to use or if you have any further suggested changes. |
I think adding the per-process key lock to the main Polly framework would be ideal. It could be the default implementation for the new policy that ensures a key only executes one at a time. The distributed lock will have dependencies on redis stackexchange, so probably it belongs in the contrib project... |
Thanks @jjxtra , planning to come back to this shortly. |
@jjxtra After benchmarking and considering complexity trade-offs, I suggest the default lock implementation should be a single lock per policy, using the standard .Net Thanks again for contributing! Let me know if I can provide any other guidance on building out the policy infrastructure (previous notes here). |
@reisenberger I ran some additional tests, was wondering if you wanted to run one last benchmark on your machine using my modified code. I put my findings in the gist link. Interesting results as cpu core count increases. Gist link for convenience since this thread has gotten quite long: https://gist.github.com/jjxtra/f6116180b2ef5c1550e60567af506c2a |
@jjxtra I wanted to progress the main Policy implementation for this as well as the locking discussion; I took that up and it is done. Will push that in the coming days and then revert to this discussion. |
Sounds good! |
In some spare weekend hours I opted to push out the main Policy implementation for this before switching my attention back to Polly v8. The policy implementation is complete (with concurrency-exercising unit tests), added doco just now, pushed all here. The policy has moved to Polly-Contrib at I have released an early version to nuget, just to get a version of this out there and published. But @jjxtra : this is just the start, would be great to see this taken further with further locking implementations! @jjxtra I will send you an invitation to join Polly-Contrib, which gives you rights on that repo. Feel free to expand/change locking implementations if you want to take ownership of that contrib project. @mrmartan @phatcher If you are interested too in contributing there I can invite you. Welcome feedback on the early nuget release ( @mrmartan @phatcher @jjxtra / anyone) if you get to exercise it in your environments. Perhaps best over on Thanks! |
I'll do a PR for it. Thanks so much for setting it up and doing all the
documentation, very thorough and impressive work!
…-- Jeff
On Thu, Feb 6, 2020 at 4:21 PM Dylan Reisenberger ***@***.***> wrote:
In some spare weekend hours I opted to push out the main Policy
implementation for this before switching my attention back to Polly v8. The
policy implementation is complete (with concurrency-exercising unit tests),
added doco just now, pushed all here
<https://github.com/Polly-Contrib/Polly.Contrib.DuplicateRequestCollapser>
.
The policy has moved to Polly-Contrib at
Polly.Contrib.DuplicateRequestCollapser
<https://github.com/Polly-Contrib/Polly.Contrib.DuplicateRequestCollapser>.
@jjxtra <https://github.com/jjxtra> : Just as you had aspects of the
striped-locking where supporting .Net Standard 1.x presented challenges,
likewise the policy implementation wanted a dependency only available to
.Net Standard 2.x. Moving the policy to Polly-Contrib allows us to make
that compromise (we would not drop .Net Standard 1.x support from core
Polly just for this policy).
I have released an early version to nuget
<https://www.nuget.org/packages/Polly.Contrib.DuplicateRequestCollapser/>,
just to get a version of this out there and published. But @jjxtra
<https://github.com/jjxtra> : this is just the start, would be great to
see this taken further with further locking implementations! @jjxtra
<https://github.com/jjxtra> I will send you an invitation to join
Polly-Contrib <https://github.com/Polly-Contrib/>, which gives you rights
on that repo. Feel free to expand/change locking implementations if you
want to take ownership of that contrib project. @mrmartan
<https://github.com/mrmartan> @phatcher <https://github.com/phatcher> If
you are interested too in contributing there I can invite you.
Welcome feedback on the early nuget release ( @mrmartan
<https://github.com/mrmartan> @phatcher <https://github.com/phatcher>
@jjxtra <https://github.com/jjxtra> / anyone) if you get to exercise it
in your environments. Perhaps best over on
Polly.Contrib.DuplicateRequestCollapser repo
<https://github.com/Polly-Contrib/Polly.Contrib.DuplicateRequestCollapser>,
given that is where this has ended up!
Thanks!
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#657?email_source=notifications&email_token=AAJWQGAIKVQM2P2THGJNACTRBSLPBA5CNFSM4H2XUPTKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELBE4FI#issuecomment-583159317>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAJWQGCWSARCWHILLK2MGYTRBSLPBANCNFSM4H2XUPTA>
.
|
Np @jjxtra . Going to close this issue now that a first version of the policy has been published, but know @jjxtra that you plan to expand on this (thanks!). Thanks to everybody on this thread for the contributions in thinking! Also linked out from the Polly CachePolicy wiki area to the new policy: https://github.com/App-vNext/Polly/wiki/Avoiding-cache-repopulation-request-storms |
If you have a few seconds to look at my distributed lock implementation, I'd love any feedback. I am using this code on some personal projects, I have no issue licensing the distributed lock bits under your license. It does require stack exchange redis to be brought in, but I think that is a fair trade... |
Thanks @jjxtra . I'll aim to jump back into this thread in the coming days! |
@jjxtra Good to have a Redis lock in the mix for the request collapser: thanks for bringing this forward! I'd like to create a new repo for this: Can I recommend always using GitFlow on the Polly-Contribs? (ie don't push interim work straight to the |
@jjxtra Repo is ready, with a template: Polly.Contrib.DuplicateRequestCollapser.RedisDistributedLock. Let's pull your Redis contribution into there! |
Thanks for making the repo, I will do pr moving forward :) |
Is this going to be made available as a nuget package or should we just copy the code into our projects? |
That's a question for the repo containing the code - see Polly-Contrib/Polly.Contrib.DuplicateRequestCollapser.RedisDistributedLock#4 |
Summary: What are you wanting to achieve?
I want the cache policy execution to limit concurrency on its data loading delegate per given operation key.
Imagine a disrtibuted cache in Redis implemented though the IAsyncCacheProvider. The cached keys have set expiry/TTL in Redis. The system in question experiences thousands of calls per second. Now the key in Redis expires. There is a high probability that tens of requests come in (nearly) simultaneously all requiring the same key (from Redis). It is not there and the cache executes the underlying data provider once per each request, ie. needlessly because all of those executions will yield the same value. Executing the data provider is expensive. Then all of those executions start setting the same key/value pair to Redis where setting it once would be enough. This IMHO causes unnessesary load on both the cache (Redis) and the backing data store. Or rather bursts of load whenever a key expires.
I would like the provider to be executed only once and once it returns all outstanding/awaiting cache request would be fulfilled immediately, also setting the value into the cache only once.
Do you have any suggestions please? Is this possible with Polly.Cache? Or is this something you would have to implement?
I have tried implementing this using Bulkhead unsuccessfuly. I would have to have two of them - one for the cache provider and one for the data provider - and even then it just limits the conncurency and does not allow for fullfiling all executions in its queue at once.
As I am writing this I thought of using double cache with an inner layer of memory cache but I can't think of how to reasonably tell when it's OK to evict from the inner cache. The problem with Bulkhead is that it does not work on a per key basis and maintaining a collection of them could get expensive quickly. I am also worried about overhead.
Something like Polly.Cache(Redis:IAsyncCacheProvider) -> Bulkhead -> Polly.Cache(Memory:ConcurrentDictionary) -> Bulkhead -> DataLoader
The text was updated successfully, but these errors were encountered: