Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute cache value on calling thread to fix thread starvation #13244

Conversation

gwenneg
Copy link
Member

@gwenneg gwenneg commented Nov 11, 2020

Partially fixes #13158.

Deadlock should no longer happen when the lock timeout feature from the cache extension is not used. In that specific case which is probably the most common way to use the extension, the cache value computation is now done synchronously from the calling thread.

Deadlock can still happen while using both the lock timeout feature and the context propagation extension.

This was tested using JMeter and the reproducer provided by @sivelli. I reproduced the issue locally with 1.9.2.Final and it's gone with this PR.

@ghost ghost added the area/cache label Nov 11, 2020
@gwenneg gwenneg changed the title Compute cache value on calling thread when lock timeout is disabled WIP: Compute cache value on calling thread when lock timeout is disabled Nov 12, 2020
@gwenneg
Copy link
Member Author

gwenneg commented Nov 12, 2020

I'm making this a WIP until tomorrow. The lock timeout enabled case can be treated the same way as discussed in #13158.

@gwenneg gwenneg force-pushed the issue-13158-cache-with-context-propagation-deadlock branch from c66e3f6 to c84660e Compare November 12, 2020 12:47
@ghost ghost added the area/documentation label Nov 12, 2020
@gwenneg gwenneg changed the title WIP: Compute cache value on calling thread when lock timeout is disabled Compute cache value on calling thread when lock timeout is disabled Nov 12, 2020
@gwenneg
Copy link
Member Author

gwenneg commented Nov 12, 2020

All cache values computations are now done on the calling thread in a synchronous way. This means the cache extension no longer needs context propagation, so I removed the ManagedExecutor from CaffeineCacheBuildRecorder.

I also removed some code that was used to prevent Caffeine from logging warning messages when an exception was raised during an async cache value computation. It was no longer needed since there's no async computation anymore.

This PR is ready for a review.

@gwenneg
Copy link
Member Author

gwenneg commented Nov 12, 2020

@gsmet Assuming the PR is merged with the current code, it may be worth mentioning in the release note that the cache extension no longer needs the context propagation extension.

@gwenneg gwenneg force-pushed the issue-13158-cache-with-context-propagation-deadlock branch from c84660e to 1f48f99 Compare November 12, 2020 20:03
@gwenneg
Copy link
Member Author

gwenneg commented Nov 12, 2020

Last minute push: since ManagedExecutor is gone, caches can now be initialized at ExecutionTime.STATIC_INIT instead of RUNTIME_INIT.

@gsmet
Copy link
Member

gsmet commented Nov 12, 2020

@gwenneg I wonder if we should backport this to 1.10? It looks safe enough and fixes an annoying issue.

WDYT?

@gwenneg
Copy link
Member Author

gwenneg commented Nov 12, 2020

@gsmet Yes, I think this should be backported not only because of the thread starvation fix but also because this new version should be much better in terms of performances.

@gwenneg gwenneg changed the title Compute cache value on calling thread when lock timeout is disabled Compute cache value on calling thread to fix thread starvation Nov 12, 2020
@gwenneg gwenneg force-pushed the issue-13158-cache-with-context-propagation-deadlock branch from 1f48f99 to 72ea4e8 Compare November 12, 2020 22:00
@gwenneg
Copy link
Member Author

gwenneg commented Nov 12, 2020

Commit message changed to better reflect the PR content.

Copy link
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround!

@gsmet gsmet merged commit f163e37 into quarkusio:master Nov 13, 2020
@ghost ghost added this to the 1.11 - master milestone Nov 13, 2020
@gwenneg gwenneg deleted the issue-13158-cache-with-context-propagation-deadlock branch November 13, 2020 11:27
@gwenneg
Copy link
Member Author

gwenneg commented Nov 13, 2020

Thanks @ben-manes for your help with this!

@ben-manes
Copy link

If the loader throws an exception, does it complete the future? It doesn’t appear to, which would cause the in-flight future to remain in the cache and block subsequent readers. I might be misreading but can you please verify?

@gwenneg
Copy link
Member Author

gwenneg commented Nov 29, 2020

There was indeed an issue with exception handling that was fixed in #13410.

@ben-manes
Copy link

oh! I forgot about that, thanks!

@gwenneg
Copy link
Member Author

gwenneg commented Nov 29, 2020

Thanks for alerting me here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deadlock using Caffeine cache and context propagation
3 participants