Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GetOrCreate is not atomic which is a serious issue for a cache ( eg poor perf and memory issues) #36499

Closed
bklooste opened this issue Aug 14, 2018 · 28 comments
Labels
Milestone

Comments

@bklooste
Copy link

This blog provides the detail

https://tpodolak.com/blog/2017/12/13/asp-net-core-memorycache-getorcreate-calls-factory-method-multiple-times/

We see poor performance in 1 in 10 queries due to multiple invokes when not needed even under non high load conditions and have had to resort to a similar solution to that blog post which is not great .

This was closed with no good reason aspnet/Caching#359

@sebastienros
Copy link
Member

There are two distinctive concerns in the issue you are referencing:

  • The result that is added could be a different one which was returned
  • The lambda is called multiple times concurrently

Based on the blog post you wrote it's clear that you disagree with the fact that the lambda is not blocked from reentrancy "by default". But this behavior is by design (as you also mentioned), as it is for the ConcurrentDictionary.

The idea is that each scenario is different and some might not require a lock. You are free to lock the lambda if you think it's not a cheap operation and even two calls are worst. After all if the default implementation had to lock it would do it quite aggressively on the key, however you might want to minimize the locking by providing a better implementation, which you are showing adequately in your blog post.

And for this reason I can only expect this issue will be closed too.

@xied75
Copy link
Contributor

xied75 commented Oct 2, 2018

@sebastienros I'm here for the 2nd concern you mentioned, i.e. block re-entrance. Although I agree what you said that each scenario is different, that lock-free might be possible. But in my opinion a default implementation that prevents re-entry would be hugely useful for us lay persons, since, given we are "educated" enough to seek the help from a cache, therefore populating the cache must be something high cost, therefore by default to allow re-entry sounds contradictory.

@ghost
Copy link

ghost commented Nov 29, 2018

@sebastienros I'd argue that it doesn't make sense to have a lock-less cache-miss scenario. Let's consider a common scenario where the cache is used during retrieval of something which comes from an external source, I.e. a web request. This may or may not be deterministic, it doesn't really matter either way for this discussion. In this scenario, the caller for a given key has to wait for the web request and parallel web requests does not improve latency. (In-fact they will reduce performance due to the extra resources required)

The typical solution to this is the double-checked lock that was mentioned in the article tha @bklooste linked to. In this implementation a cache hit is typically lock-free and locks are only used during a refresh of the stored item for that key.

A better solution to the one in the article would be to use an "async/awaitable keyed lock", but this needs to be bounded so that you don't leak memory over time. This has the advantage of improving the scope of the locking to just the keys that are being refreshed, but the disadvantage that you effectively need a second level of cache (This time with an "async/awaitable lock" which operates globally across all of the keys) for the sole purpose of managing the keyed locks.

I'll demonstrate with some pseudo code: (Assume it's all async)

Item GetOrCreate(key, factory)
{
	if (memoryCache.TryGet(key, out item))
	{
		return item
	}

	using (await GetKeyedLock(key).Lock())
	{
		if (memoryCache.TryGet(key, out item))
		{
			return item
		}

		item = await factory(key)
		memoryCache.Set(key, item)

		return item;
	}
}

Lock GetKeyedLock(key)
{
	key = key + "¦lock";
	if (memoryCache.TryGet(key, out lock))
	{
		return lock
	}

	using (await Lock())
	{
		if (memoryCache.TryGet(key, out lock))
		{
			return lock
		}

		lock = new Lock()
		memoryCache.Set(key, lock)

		return lock;
	}
}

@aspnet-hello aspnet-hello transferred this issue from aspnet/Caching Dec 13, 2018
@mbirtwistle
Copy link

mbirtwistle commented Jan 15, 2019

I agree that a method called GetOrCreate sounds like it will work atomically. I also took 'Threadsafe' in the documentation as a confirmation that it was atomic. Otherwise what's the point in the method? If its not atomic I might as well make multiple Get / Create calls and know exactly what I have to deal with regarding locking. You really need to make this more clear in the documentation of GetOrCreate as it makes a fundamental difference in whether the Create Lambda is the actual expensive operation, or a cheap Lazy constructor whose .Value lambda is the expensive operation.

@justinzhong
Copy link

justinzhong commented Feb 11, 2019

I came across this issue as well and agree with @mbirtwistle the current state as it stands require manually placing locks to guarantee consistent output. I have created a simple demo that runs 100K concurrent requests on MemoryCache.GetOrCreate and ConcurrentDictionary.GetOrCreate using Lazy factory with side-by-side comparison that shows GetOrCreate is neither re-entrant nor protect its internal state, i.e. the payload of a never expiring cache is updated by making concurrent calls to GetOrCreate.

memorycache-vs-concurrentdictionary

The demo shows that GetOrCreate returns different instances of Lazy that yields different results, which somewhat defeats the purpose of moving the expensive operation to a Lazy factory. Please correct me if you see any mistakes in the way I constructed the demo, otherwise I hope the aspnet team would look at this and hopefully address this in the next release please.

Note: the demo code is extended from reading the related issue aspnet/Caching#359 by @amit-ezra

Edit: mentioning specs
.NET Core 2.2
Microsoft.Extensions.Caching.Memory 2.2.0.0

@bklooste
Copy link
Author

bklooste commented Feb 24, 2019

Its especially an issue with newer devs - the general expectation of how it works is not met - which results in investigations , tickets and reduces peoples opinion of Core . Caching is pretty important .

Many people have this issue now without even knowing about it.

@guylando
Copy link

+1

@BladeWise
Copy link
Contributor

BladeWise commented Mar 16, 2019

These are the extensions I am currently using to ensure that get-or-add and set operations are atomic.
I tested it with a modified version of the code from @justinzhong and it seems to behave as expected.
Synchronization is kept to a minimum, in case of a cache miss a global semaphores concurrent dictionary is accessed and a semaphore is acquired on a per-memorycache-key basis.

namespace Extensions
{
    using System;
    using System.Collections.Concurrent;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.Extensions.Caching.Memory;

    public static class MemoryCacheExtensions
    {
        private static readonly ConcurrentDictionary<int, SemaphoreSlim> _semaphores = new ConcurrentDictionary<int, SemaphoreSlim>();

        public static async Task<(T Value, bool Created)> GetOrCreateAtomicAsync<T>(this IMemoryCache memoryCache, object key, Func<ICacheEntry, T> factory)
        {
            if (memoryCache.TryGetValue(key, out var value))
                return ((T)value, false);

            var isOwner = false;
            var semaphoreKey = (memoryCache, key).GetHashCode();
            if (!_semaphores.TryGetValue(semaphoreKey, out var semaphore))
            {
                SemaphoreSlim createdSemaphore = null;
                semaphore = _semaphores.GetOrAdd(semaphoreKey, k => createdSemaphore = new SemaphoreSlim(1)); // Try to add the value, this is not atomic, so multiple semaphores could be created, but just one will be stored!

                if (createdSemaphore != semaphore)
                    createdSemaphore.Dispose(); // This semaphore was not the one that made it into the dictionary, will not be used!
                else
                    isOwner = true;
            }

            await semaphore.WaitAsync()
                           .ConfigureAwait(false); // Await the semaphore!
            try
            {
                if (!memoryCache.TryGetValue(key, out value))
                {
                    var entry = memoryCache.CreateEntry(key);
                    entry.SetValue(value = factory(entry));
                    entry.Dispose();
                    return ((T)value, true);
                }

                return ((T)value, false);
            }
            finally
            {
                if (isOwner)
                    _semaphores.TryRemove(semaphoreKey, out _);
                semaphore.Release();
            }
        }

        public static async Task<T> SetAtomicAsync<T>(this IMemoryCache memoryCache, object key, Func<ICacheEntry, T, T> factory)
        {
            var isOwner = false;
            var semaphoreKey = (memoryCache, key).GetHashCode();
            if (!_semaphores.TryGetValue(semaphoreKey, out var semaphore))
            {
                SemaphoreSlim createdSemaphore = null;
                semaphore = _semaphores.GetOrAdd(semaphoreKey, k => createdSemaphore = new SemaphoreSlim(1)); // Try to add the value, this is not atomic, so multiple semaphores could be created, but just one will be stored!

                if (createdSemaphore != semaphore)
                    createdSemaphore?.Dispose(); // This semaphore was not the one that made it into the dictionary, will not be used!
                else
                    isOwner = true;
            }

            await semaphore.WaitAsync()
                           .ConfigureAwait(false); // Await the semaphore!
            try
            {
                var currentValue = default(T);
                if (memoryCache.TryGetValue(key, out var v))
                    currentValue = (T)v;

                T newValue;
                var entry = memoryCache.CreateEntry(key);
                entry.SetValue(newValue = factory(entry, currentValue));
                entry.Dispose();

                return newValue;
            }
            finally
            {
                if (isOwner)
                    _semaphores.TryRemove(semaphoreKey, out _);
                semaphore.Release();
            }
        }
    }
}

@phatcher
Copy link

@BladeWise I think you can simplify slightly as although GetOrAdd is not thread safe, it does return a consistent result at least according to this blog post,

So I think

SemaphoreSlim createdSemaphore = null;
locks.GetOrAdd(key, k => createdSemaphore = new SemaphoreSlim(1)); // Try to add the value, this is not atomic, so multiple semaphores could be created, but just one will be stored!
semaphore = locks[key]; // Re-read the value from the dictionary, to ensure the used value is exactly the value stored in the dictionary!

can reduce to

SemaphoreSlim createdSemaphore = null;
semaphore = locks.GetOrAdd(key, k => createdSemaphore = new SemaphoreSlim(1)); // Try to add the value, this is not atomic, so multiple semaphores could be created, but just one will be stored!

You could go the whole hog and use a Lazy to create it which would ensure only one semaphore was ever created.

@BladeWise
Copy link
Contributor

@phatcher You are correct, moreover

semaphore = locks[key];

introduces a race condition, causing a key not found exception in case the key is created and removed by another thread, while the current thread performs the double check. So, just the GetOrAdd should be used as you suggested.

@Timovzl
Copy link

Timovzl commented May 21, 2019

The result that is added could be a different one which was returned

@sebastienros Are you sure that the returned value can differ from the value that ends up being added to the cache? aspnet/Caching#359 (comment) seems to contradict this:

and this would be the same with MemoryCache.AddOrGetExisting from .net framework. and by same I mean you would get the same value from all threads (even if lambda was called multiple times)

If the latter is true, then the behavior matches ConcurrentDictionary and could be deemed correct and expected.

In that case, we could easily wrap the values in a Lazy with the right LazyThreadSafetyMode if we truly want to prevent ever creating multiple values. And let's face it: creating multiple values simultaneously, with one winner, is generally acceptable. Either we lock and we wait for the initial creator to finish creating, or we perform the same operation simultaneously and end up using their value. Generally not worth fussing over unless the creation is excessively expensive or has side effects.

@PureKrome
Copy link
Contributor

👋 @anurse - G'Day! Any chance this issue could get some prioritisation for the next 3.x iteration please? It's catching a number of people out, whom I've been working with over the last few months after we stumbled across this issue 6 odd months ago.

@analogrelay
Copy link
Contributor

analogrelay commented Oct 1, 2019

I guess I'm interested in why this is catching people out when it's generally the common pattern for lazy multi-threaded initialization like this. ConcurrentDictionary.GetOrdAdd has the same approach.

Locking around user-provided delegates is generally a bad approach as it takes control away from the user and makes it easy to unintentionally deadlock. I suppose my initial observation is this: Is it better to unintentionally run the delegate multiple times (because the consumer didn't realize the delegate would be called multiple times) or better to unintentionally deadlock (because the consumer didn't realize they were under a lock and tried to re-access the cache). The cache is still atomic with regard to applying the result of the delegate to the cache. The user is always in a position to lock within their delegate to prevent multiple invocations of expensive code paths (or use something like Lazy<T> to do the locking for them).

Regardless of the outcome of that discussion, I don't expect a change would come in the 3.x milestone. This would be a significant behavioral change which would not be done in a 3.0.x patch. And the 3.1 release is pretty much fully booked at this point (and the bar for changes is quite high given the very short timescale, there's little chance for feedback). The earliest I'd expect a change here is 5.0.

@Timovzl
Copy link

Timovzl commented Oct 1, 2019

It's catching a number of people out

@PureKrome In https://github.com/aspnet/Extensions/issues/708#issuecomment-494393036 I mentioned why the current behavior (if indeed it matches ConcurrentDictionary) seems expected and usual to me.

Do you think this should behave differently from that? If so, what makes this case different, and what makes the alternative implementation more robust or more intuitive than the current one?

@bklooste
Copy link
Author

bklooste commented Oct 3, 2019

@Timovzl this whole page , the blog etc is about how its unusual and how it should behave , eg GetOrCreate being atomic. There are no disagreements on this and no its NOT like ConcurrentDictionary as that call is atomic and part of concurrrent dictionary not via 2 calls done by an extension method. It has the same api signature which is good but the behavior is different .

The only reason I can see it hasn't been done in the first place is because to do it means changing IMemoryCache which is breaking or have IMemoryCache2 : IMemoryCache which is ugly / require
new tests and means people dont want to fix it but 3 would be the right time for this maybe have IAtomicMemoryCache

@PureKrome
Copy link
Contributor

@anurse thanks heaps for the prompt reply and for popping back into this thread - really appreciate it!

I guess I'm interested in why this is catching people out when it's generally the common pattern for lazy multi-threaded initialization like this. ConcurrentDictionary.GetOrdAdd has the same approach.

Great question and I think this is the crux of the situation: it's the knowledge about how to use all of this, correctly. When "we" (friends, colleagues, etc) first all start out using this, we all incorrectly thought it was handling the delegate 'atomically' (is that the correct term, here?).

So for me, I'm not sure if this could be handled better with code or just documentation. Maybe the documentation is sufficient and I've not found it -or- it's been updated recently'ish and I've just not noticed it because I sorta know how to solve this problem, now.

BTW, the solutions you did post are what i've found out too. But it's easy to know what to do, when you've been given the answer. For me, this post by Andrew Lock really was the lightbulb moment for me to teach me what to do (Hint: it's to use Lazy<T>, just like you said). But this so wasn't obvious to me and others, for ages.

@analogrelay
Copy link
Contributor

Maybe the documentation is sufficient and I've not found it

To me this indicates the documentation isn't sufficient :). If it's not in a place that you can easily find it, there's a gap to be filled. The API reference docs are certainly lacking here.

So certainly some action to take here is to improve those docs.

@PureKrome
Copy link
Contributor

PureKrome commented Oct 3, 2019

Sweet! Also,

Maybe the documentation is sufficient and I've not found it

that's a bad typo on my behalf. I ment to say: "Maybe the documentation isn't sufficient and I've not found it ... "

Now I'll just chill until the awesome Docs team get some bandwidth to review this, then :)

EDIT: @ Docs team - would be lovely to have a number of examples/scenario's handled in the docs, if possible :) e.g. typical scenario with loading some database resource and caching it locally in memory forever (until app restarts) or in memory for a fixed amount of time (e.g. 1 hour) with invalidation, etc.

@mbirtwistle
Copy link

mbirtwistle commented Oct 3, 2019 via email

@bklooste
Copy link
Author

bklooste commented Oct 3, 2019

The work around is neither easy or obvious.. I bet like 10-100K devs are hit without knowing and it will take a long time of why do you have this "useless" Lazy or lock delegate there to spread around and then when it is fixed or a new memory cache comes along it needs to be reverted.

Documentation and a few official blogs can certainly help but its a pretty big workaround / smell IMHO.,

@Timovzl
Copy link

Timovzl commented Oct 6, 2019

My request way back in this thread was for the documentation to make absolutely clear that the initialiser will likely get called multiple times in parallel if your app is processing multiple requests during cache warm up. Once you know this, There is a simple workaround to make the cache return a Lazy with your initialiser as the Lazy delegate so the Lazy can implement the locking that ensures a single delegate callback occurs.

I concur. Lazy<T> is the go-to mechanism for these things, and a way to avoid duplicating thread-safety mechanisms all over the place. On top, we get support for both the usual case (multiple initial factory invocations) and the strict case (one and only one factory invocation).

The documentation should definitely mention the behavior, and for those unfamiliar with using Lazy<T> to guarantee one-time creation, that could even be added as a recommendation.

@bklooste
Copy link
Author

bklooste commented Oct 9, 2019

Lazy is not a "goto" for thread safety/ locking it is only for where you actual want to lazy load and in this case you don't either, its sort of a hack / extra level of indirection but it will work.

The question is if cache needs this then IT should to this behind the scenes. However as i said above the poor design decision now leaves a choice of convenience so its ok but not good.

IMemoryCache<Lazy<IDictionary<int,xyz>>> gets pretty unwieldy .. and i know many people would criticize this as overly complex / confusing.

Thats said its probably the best choice provided its documented / blogged unless the team wants to change IMemoryCache for another reason if so than the atomic op should be added and then we dont need Lazy.

@analogrelay analogrelay transferred this issue from dotnet/extensions May 15, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label May 15, 2020
@analogrelay analogrelay added this to the Future milestone May 15, 2020
@maryamariyan maryamariyan removed the untriaged New issue has not been triaged by the area owner label Jul 1, 2020
@alastairtree
Copy link

Another work around that implements the use of Lazy, and the locking required to stop a different version ending up in the cache is to use LazyCache, a library that is available on nuget I wrote to solve these issues.

It wraps MemoryCache with the additional stuff required to make the lambda atomic as requested in this issue. It is also likely to perform better than the nice code provided by @BladeWise because of some smarts in the way the locks are collated by cache key and do not use SemaphoreSlim. Hopefully it helps some people who might stumble on this issue when their cache is not working as expected.

@maryamariyan maryamariyan added the tenet-performance Performance related issue label Sep 30, 2020
@jodydonetti
Copy link
Contributor

On top of the great LazyCache, may I suggest taking a look at FusionCache ⚡🦥 as another option for a concurrent-factory-calls optimized cache?

It is also optionally multi-level (memory + distributed) and it has some other advanced features you may be interested in, like a fail-safe mechanism (see here) , soft/hard timeouts with background factory completion (see here) and more.

I just released it and, if you find it helpful, that would be great to know.

/shameless-plug

Also, any comment on it would be greatly appreciated!

@rmandvikar
Copy link

2c: I'd like to see this bug fixed, as not many devs realize this. It seems the stance is to not fix it and in which case, I'd like to throw in another impl here. /shameless-plug

The impl (implementation) (called CacheWithTaskAndLockPool in repo) caches the task itself instead of the value, but creates the task inside a lock which is fetched from a lock pool (it can be concurrent map even). Concurrent requests get the same task. The task is then awaited outside of the lock to get the value. A task, if it "errors" out (canceled or faulted), is evicted using an IChangeToken. It perfs the same when compared with an impl that uses LazyCache (authored by @alastairtree). See here and look for CacheWithTaskAndLockPool, and CacheWithLazyCache along with tests.

@maryamariyan
Copy link
Member

Closing as dupe of #36500 (comment)

@rmandvikar
Copy link

rmandvikar commented Apr 6, 2022

2c: I feel a snippet in the msdn article providing an example of using lazy of T would be helpful. There is a comment in here with many upvotes.

@ghost ghost locked as resolved and limited conversation to collaborators May 7, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests