Nodes could fail to bootstrap using the commit log bootstrapper due to topology changes #2129

Closed
justinjc opened this issue Jan 30, 2020 · 0 comments · Fixed by #2145

justinjc commented Jan 30, 2020

Consider this situation:

  1. Node X is responsible for shard 10.
  2. Node Y gets added to the cluster. As a result, Node X loses responsibility for shard 10.
  3. Node Z gets removed from the cluster. This triggers a bootstrap.
  4. Node X goes through its commit logs and encounters an entry for shard 10, but because it is no longer responsible for the shard, it fails the bootstrap.

This seems possible through the calls to NamespaceDataAccumulator.CheckoutSeriesWithoutLock in the commit log bootstrapper. That method ultimately calls shardAtWithRLock, which returns an error when given a bad shard input (here, a shard the node is no longer responsible for).

This situation should be rare: snapshots happen continuously, so commit logs should be cleaned up relatively regularly.
