Node unable to persist log, but keeps being elected #614

Open
k-jingyang opened this issue Sep 10, 2024 · 1 comment
k-jingyang commented Sep 10, 2024

Hello,

Recently, we hit a case where the leader node had a bad persistent store:

This cycle made leadership unstable for the whole period. For us, it persisted for 10 minutes until another node was finally elected leader.

Are there recommendations or good practices for handling such cases, given that HashiCorp runs its own cloud offerings too?

Also, is there an optimisation that we could make here in the library? I understand there are some nuances to this.

  • Based on my understanding, the current guard against this is that heartbeat timeouts are randomised.
  • Given a cluster: node A (leader), node B, node C:
    • When node A demotes itself, the randomness gives node C a chance to time out earlier and become a candidate before node A does.
    • However, if node B doesn't time out, it will still think that node A is the leader and will always reject node C's vote request.
    • In such cases, node A has a natural advantage in winning elections. This is undesirable when node A has a persistent store issue.
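
To make the scenario above concrete, here is a minimal Go sketch of the vote-handling rule as I understand it. It is a simplified model, not the library's actual code: the `voter` type, its fields, and `grantsVote` are hypothetical names, and the sketch ignores the log up-to-date check and the "already voted this term" check.

```go
package main

import "fmt"

// voter is a hypothetical, simplified model of a follower's election state.
type voter struct {
	currentTerm uint64
	knownLeader string // empty once the follower's heartbeat timeout has fired
}

// grantsVote models the rule described above: a follower that still believes
// it has a leader rejects vote requests from any other candidate.
// Assumption: log comparison and prior-vote checks are left out for brevity.
func (v *voter) grantsVote(candidate string, term uint64) bool {
	if v.knownLeader != "" && v.knownLeader != candidate {
		return false // still following someone else
	}
	if term < v.currentTerm {
		return false // stale candidate
	}
	return true
}

func main() {
	// Node B has not timed out, so it still considers A the leader.
	b := &voter{currentTerm: 5, knownLeader: "A"}
	fmt.Println("B votes for C while A is the known leader:", b.grantsVote("C", 6))

	// Once B's heartbeat timeout fires, it forgets the leader and can vote for C.
	b.knownLeader = ""
	fmt.Println("B votes for C after timing out:", b.grantsVote("C", 6))
}
```

In this model node C can only win once node B has also timed out, which is why node A, whose requests B does not reject on these grounds, tends to come out ahead.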

k-jingyang commented

Hmmm, I realised that this issue has more to do with a quirk of our log store: only StoreLogs was failing, not the StableStore operations.
