storage: work around can't-swap-leaseholder #40363
Conversation
Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 1 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/replicate_queue.go, line 515 at r3 (raw file):
// likely to be the leaseholder), then this removal would fail. Instead, this
// method will attempt to transfer the lease away, and returns true to indicate
// to the caller that it should not pursue the current replication change further.
"because it is no longer the leaseholder"
pkg/storage/replicate_queue.go, line 739 at r3 (raw file):
// only, which should succeed, and the next time we touch this
// range, we will have one more replica and hopefully it will
// take the lease and remove the current leaseholder.
I'm surprised that this case doesn't hit an error when it calls maybeTransferLeaseAway. Could you mention what we expect to happen when you call that?
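As a minimal sketch of the add-only fallback that the quoted comment describes (under the assumption that the removal target is the sole, leaseholder replica), the toy below uses hypothetical names such as `rangeState` and `rebalanceOnce`; it is not the replicate queue's API.

```go
package main

import "fmt"

// rangeState is a toy model of a range: the stores holding a replica and the
// current leaseholder.
type rangeState struct {
	replicas    []int
	leaseholder int
}

// rebalanceOnce sketches the decision described above: attempt an atomic
// add+remove ("swap"), but if the removal target is the leaseholder and there
// is no other replica that could take the lease (replication factor one),
// fall back to adding the new replica only and requeue the range.
func rebalanceOnce(r *rangeState, addTarget, removeTarget int) (requeue bool) {
	if removeTarget == r.leaseholder && len(r.replicas) == 1 {
		r.replicas = append(r.replicas, addTarget)
		fmt.Printf("added s%d only; range is over-replicated, requeueing\n", addTarget)
		// Next time the range is touched, there is one more replica which
		// will hopefully take the lease and remove the current leaseholder.
		return true
	}
	// Otherwise the swap can be issued as a single atomic change.
	out := r.replicas[:0]
	for _, id := range r.replicas {
		if id != removeTarget {
			out = append(out, id)
		}
	}
	r.replicas = append(out, addTarget)
	fmt.Printf("swapped s%d for s%d atomically\n", removeTarget, addTarget)
	return false
}

func main() {
	r := &rangeState{replicas: []int{1}, leaseholder: 1}
	fmt.Println("requeue:", rebalanceOnce(r, 2, 1)) // replication factor one: add-only path
}
```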
There may be nothing to roll back, so don't log unconditionally.

Release note: None
This was showing up a lot in TestInitialPartitioning. If we're trying to remove something but nothing needs to be removed, that seems OK (though there is some question of why we're hitting this regularly).

Release note: None
As of cockroachdb#40284, the replicate queue was issuing swaps (atomic add+remove) during rebalancing. TestInitialPartitioning helpfully points out (once you flip atomic rebalancing on) that when the replication factor is one, there is no way to perform such an atomic swap because it will necessarily have to remove the leaseholder.

To work around this restriction (which, by the way, we dislike - see cockroachdb#40333), fall back to just adding a replica in this case without also removing one. In the next scanner cycle (which should happen immediately since we requeue the range) the range will be over-replicated and hopefully the lease will be transferred over and then the original leaseholder removed. I would be very doubtful that this all works, but it is how things worked until cockroachdb#40284, so this PR really just falls back to the previous behavior in cases where we can't do better.

Release note: None
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
bors r=nvanbenschoten
40363: storage: work around can't-swap-leaseholder r=nvanbenschoten a=tbg

As of #40284, the replicate queue was issuing swaps (atomic add+remove) during rebalancing. TestInitialPartitioning helpfully points out (once you flip atomic rebalancing on) that when the replication factor is one, there is no way to perform such an atomic swap because it will necessarily have to remove the leaseholder.

To work around this restriction (which, by the way, we dislike - see #40333), fall back to just adding a replica in this case without also removing one. In the next scanner cycle (which should happen immediately since we requeue the range) the range will be over-replicated and hopefully the lease will be transferred over and then the original leaseholder removed. I would be very doubtful that this all works, but it is how things worked until #40284, so this PR really just falls back to the previous behavior in cases where we can't do better.

Release note: None

Co-authored-by: Tobias Schottdorf <[email protected]>
Build succeeded
40370: storage: prepare for kv.atomic_replication_changes=true r=nvanbenschoten a=tbg

First three commits are #40363.

----

This PR enables atomic replication changes by default. But most of it is just dealing with the fallout of doing so:

1. we don't handle removal of multiple learners well at the moment. This will be fixed more holistically in #40268, but it's not worth waiting for that because it's easy for us to just avoid the problem.
2. tests that carry out splits become quite flaky because at the beginning of a split, we transition out of a joint config if we see one, and due to the initial upreplication we often do. If we lose the race against the replicate queue, the split catches an error for no good reason. I took this as an opportunity to refactor the descriptor comparisons and to make this specific case a noop, but making it easier to avoid this general class of conflict where it's avoidable in the future.

There are probably some more problems that will only become apparent over time, but it's quite simple to turn the cluster setting off again and to patch things up if we do.

Release note (general change): atomic replication changes are now enabled by default.

Co-authored-by: Tobias Schottdorf <[email protected]>
As of #40284, the replicate queue was issuing swaps (atomic add+remove)
during rebalancing. TestInitialPartitioning helpfully points out (once you
flip atomic rebalancing on) that when the replication factor is one, there
is no way to perform such an atomic swap because it will necessarily have
to remove the leaseholder.
To work around this restriction (which, by the way, we dislike - see
#40333), fall back to just adding a replica in this case without also
removing one. In the next scanner cycle (which should happen immediately
since we requeue the range) the range will be over-replicated and hopefully
the lease will be transferred over and then the original leaseholder
removed. I would be very doubtful that this all works, but it is how things
worked until #40284, so this PR really just falls back to the previous
behavior in cases where we can't do better.
Release note: None
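To make the intended sequence concrete, here is a toy end-to-end simulation of the passes described above: add-only, then lease transfer, then removal of the original leaseholder. Everything here is illustrative; `rng`, `scanPass`, and the store IDs are made up, and in the real system the final pass runs on the new leaseholder's store rather than in the same loop.

```go
package main

import "fmt"

type rng struct {
	replicas    map[int]bool // store ID -> holds a replica
	leaseholder int
}

// scanPass models one pass of the replicate queue over the range while it is
// being rebalanced from removeTarget to addTarget. It returns true when the
// range should be requeued for another pass.
func scanPass(r *rng, addTarget, removeTarget int) bool {
	switch {
	case removeTarget == r.leaseholder && len(r.replicas) == 1:
		// Pass 1: the swap would remove the sole replica, which holds the
		// lease, so fall back to adding the new replica only.
		r.replicas[addTarget] = true
		fmt.Println("pass 1: added replica, range over-replicated, requeue")
		return true
	case removeTarget == r.leaseholder:
		// Pass 2: still the leaseholder, so transfer the lease away rather
		// than removing ourselves, and stop pursuing the change here.
		for id := range r.replicas {
			if id != r.leaseholder {
				r.leaseholder = id
				break
			}
		}
		fmt.Printf("pass 2: transferred lease to s%d, requeue\n", r.leaseholder)
		return true
	default:
		// Pass 3 (on the new leaseholder): remove the over-replicated
		// original replica, restoring the target replication factor.
		delete(r.replicas, removeTarget)
		fmt.Printf("pass 3: removed s%d\n", removeTarget)
		return false
	}
}

func main() {
	r := &rng{replicas: map[int]bool{1: true}, leaseholder: 1}
	for requeue := true; requeue; {
		requeue = scanPass(r, 2, 1) // rebalance the range from s1 to s2
	}
	fmt.Printf("done: leaseholder=s%d, replicas=%v\n", r.leaseholder, r.replicas)
}
```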