Add a cache for GatewayBackend to HaGatewayManager #501

Open: wants to merge 3 commits into base: main

@choiwaiyiu choiwaiyiu commented Oct 2, 2024

Description

Currently, Trino Gateway stops routing queries when the database is unavailable.

This PR adds a cache of GatewayBackend to the HaGatewayManager, ensuring that queries can still be routed even when the database is unavailable.

The cache is read-only, so update operations to GatewayBackend will still fail if the database is unavailable.
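The read path described above can be sketched roughly as follows. This is a minimal illustration using only the JDK, not the code in this PR: `FallbackBackendReader` and its `loader` are hypothetical names, and `String` stands in for `GatewayBackend`. Every read still goes to the database first; the cached snapshot is used only when that read fails.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Serves the last successfully loaded backend list when the database read fails. */
final class FallbackBackendReader
{
    // Hypothetical stand-in for the DAO read; assumed to throw a RuntimeException when the database is down.
    private final Supplier<List<String>> loader;
    // Last snapshot that was read successfully; null until the first successful read.
    private final AtomicReference<List<String>> lastKnown = new AtomicReference<>();

    FallbackBackendReader(Supplier<List<String>> loader)
    {
        this.loader = loader;
    }

    List<String> getBackends()
    {
        try {
            // Copy the result so later changes to the loader's list cannot alter the snapshot.
            List<String> fresh = List.copyOf(loader.get());
            lastKnown.set(fresh);
            return fresh;
        }
        catch (RuntimeException e) {
            List<String> stale = lastKnown.get();
            if (stale == null) {
                // Nothing has been cached yet, so there is nothing to route with: surface the failure.
                throw e;
            }
            return stale;
        }
    }
}
```

Write operations are deliberately absent from the sketch: they would go straight to the database and fail when it is unavailable, which matches the read-only behaviour described above.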

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.

cla-bot bot commented Oct 2, 2024

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Choi Wai Yiu.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check whether your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using: git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Member

@xkrogen xkrogen left a comment

I am supportive of adding caching at this layer, but it's important to note that this alone won't be sufficient for the GW to continue routing traffic when the DB is unavailable, since writes happen in-the-loop of new query submission, e.g. here:

queryHistoryManager.submitQueryDetail(queryDetail);

We need a more holistic strategy around how to handle DB unavailability which may include treating some writes as best-effort and skipping them in case of DB unavailability.
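One possible shape for a best-effort write is sketched below. It is only an illustration of the idea, not a proposal for the actual strategy: `BestEffortWriter` is a hypothetical helper, and it assumes the underlying write signals database unavailability with a `RuntimeException`.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Runs a non-critical write and swallows failures so the caller can keep routing. */
final class BestEffortWriter
{
    // Counts skipped writes so the data loss stays observable, for example as a metric.
    private final AtomicLong skippedWrites = new AtomicLong();

    /** Returns true if the write succeeded, false if it failed and was skipped. */
    boolean tryWrite(Runnable write)
    {
        try {
            write.run();
            return true;
        }
        catch (RuntimeException e) {
            // Assumption: a failed write is acceptable here and should not fail the query submission.
            skippedWrites.incrementAndGet();
            return false;
        }
    }

    long skippedWrites()
    {
        return skippedWrites.get();
    }
}
```

A call site could then wrap the write quoted above, e.g. `writer.tryWrite(() -> queryHistoryManager.submitQueryDetail(queryDetail));`. Which writes are safe to skip is exactly the open question raised in this comment, so the sketch leaves that decision to the caller.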

cc @surajkn

private final GatewayBackendDao dao;
private final AtomicReference<List<GatewayBackend>> cache = new AtomicReference<>();
Member

We should use something like a Guava/Caffeine cache here that has more proper cache management including cache expiration times. Remember that the same database will be shared by multiple GW instances. This code seems to assume that it can keep the cache in sync by simply invalidating the cache when update operations are made, but updates may be made out-of-band by other GW instances.
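To make the expiration idea concrete, here is a minimal JDK-only sketch. It is not a replacement for Guava or Caffeine and is not the code in this PR: `ExpiringSupplier` and its injectable `nanoClock` are hypothetical. Because the value is reloaded after a fixed time-to-live, updates made out-of-band by other GW instances become visible within that window without any explicit invalidation. Keeping the stale value when a reload fails is an extra assumption added to match the goal of this PR.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Memoizes a value for a fixed time-to-live and keeps the stale value if a reload fails. */
final class ExpiringSupplier<T>
        implements Supplier<T>
{
    private final Supplier<T> delegate;
    private final long ttlNanos;
    // Injectable clock (for example System::nanoTime) so that expiry can be tested deterministically.
    private final LongSupplier nanoClock;

    private T value;
    private boolean loaded;
    private long loadedAtNanos;

    ExpiringSupplier(Supplier<T> delegate, Duration ttl, LongSupplier nanoClock)
    {
        this.delegate = delegate;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized T get()
    {
        long now = nanoClock.getAsLong();
        if (loaded && now - loadedAtNanos < ttlNanos) {
            // Still within the time-to-live: serve the cached value without touching the database.
            return value;
        }
        try {
            value = delegate.get();
            loaded = true;
            loadedAtNanos = now;
        }
        catch (RuntimeException e) {
            if (!loaded) {
                // No value has ever been loaded, so there is nothing to fall back to.
                throw e;
            }
            // Reload failed: keep the stale value and retry on the next call.
        }
        return value;
    }
}
```

In production code the clock argument would typically be `System::nanoTime`. A library cache would add features this sketch lacks, such as size bounds, statistics, and asynchronous refresh.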

Author

@xkrogen
Thank you for your review. @oneonestar and I are currently working on addressing the issues in this PR based on your feedback and advice :)

@choiwaiyiu
Author

@xkrogen
Thanks for the review, and sorry for the late reply!
I’ve worked on fixing the cache inconsistency issue by changing how we use the cache.
As for the implementation details, Supplier might not be strictly necessary, but I added it to help with performance.

@choiwaiyiu
Author

cc: @oneonestar

2 participants