-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: teach allocator to prescribe non-voter addition/removal #59389
kvserver: teach allocator to prescribe non-voter addition/removal #59389
Conversation
f1e4405
to
6ba4f15
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r2, 6 of 6 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvserver/allocator.go, line 52 at r3 (raw file):
// Priorities for various repair operations. // TODO DURING REVIEW: Do these priorities look sane? Note that these
This comment was helpful. I never knew that this was the only use of these priorities. Could you add to the comment to indicate this.
Now that I know that, I'm left wondering whether the magnitude of these priorities matter, or whether only their relative order matters. For instance, is it meaningful that addDecommissioningReplacementVoterPriority
has 5x the priority of removeDeadVoterPriority
? Would addDecommissioningReplacementVoterPriority float64 = 1001
have the exact same behavior?
pkg/kv/kvserver/allocator.go, line 52 at r3 (raw file):
// Priorities for various repair operations. // TODO DURING REVIEW: Do these priorities look sane? Note that these
Yes, these priorities make sense to me.
pkg/kv/kvserver/allocator.go, line 159 at r3 (raw file):
type allocatorError struct { constraints []zonepb.ConstraintsConjunction // TODO(aayush): !!! This might need to be augmented.
What did you decide here?
pkg/kv/kvserver/allocator.go, line 317 at r3 (raw file):
// available for up-replication. // // If `num_voters` aren't explicitly configured in the zone config, all replicas
"all num_replicas
replicas"
pkg/kv/kvserver/allocator.go, line 323 at r3 (raw file):
// this method should only be consulted after voting replicas have been // upreplicated / rebalanced off of dead/decommissioning nodes. func GetNeededNonVoters(numVoters, zoneConfigNumNonVoters, clusterNodes int) int {
nit: we have zoneConfigVoterCount
above and zoneConfigNumNonVoters
here. Let's be uniform.
pkg/kv/kvserver/allocator.go, line 411 at r3 (raw file):
quorum := computeQuorum(haveVoters) // TODO(aayush): When haveVoters < neededVoters but we don't have quorum to
Any idea why we don't already do this? The code structure makes me wonder if it's intentional.
pkg/kv/kvserver/allocator.go, line 425 at r3 (raw file):
} liveVoterReplicas, deadVoterReplicas := a.storePool.liveAndDeadReplicas(voterReplicas)
nit: should we rename this to liveVoters
and deadVoters
to fit the rest of the variable naming in this function?
pkg/kv/kvserver/allocator.go, line 460 at r3 (raw file):
// Non-voting replica addition. //liveNonVoters, deadNonVoters := a.storePool.liveAndDeadReplicas(nonVoterReplicas)
Do we need to handle the if haveNonVoters == neededNonVoters && len(deadNonVoters) > 0 {
case? In fact, do we need a AllocatorReplaceDeadNonVoter
action? Does that fall into your future change to rebalance non-voting replicas?
pkg/kv/kvserver/allocator.go, line 465 at r3 (raw file):
if haveNonVoters < neededNonVoters { priority := addMissingNonVoterPriority action := AllocatorAddNonVoter
Isn't this supposed to be a lower priority than AllocatorRemoveDeadVoter
? Will that work if we have this check before the if len(deadVoterReplicas) > 0 {
check?
db93279
to
d033de1
Compare
This TODO was added in 2016 and the surrounding structure of the method has entirely changed in the intervening period. It doesn't make sense anymore as we do indeed handle heterogenous sets of replica constraints now. Release note: None
This commit renames the existing `AllocatorAction`s and priorities to indicate that they pertain only to voting replicas. This is intended to pave the way for new non-voter specific allocator actions and priorities that follow in the subsequent commits in this PR. Release note: None
d033de1
to
4e9b42e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 52 at r3 (raw file):
I'm left wondering whether the magnitude of these priorities matter
The difference between addMissingVoterPriority
and addDecommissioningReplacementVoterPriority
makes sense to me because we modulate the priority of adding missing voters based on how far we are from our expected voter count. But for the rest I don't think the magnitude matters.
pkg/kv/kvserver/allocator.go, line 159 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What did you decide here?
Sorry, this should be addressed inside #59403. Removed from here.
pkg/kv/kvserver/allocator.go, line 317 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"all
num_replicas
replicas"
Done.
pkg/kv/kvserver/allocator.go, line 323 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: we have
zoneConfigVoterCount
above andzoneConfigNumNonVoters
here. Let's be uniform.
Done.
pkg/kv/kvserver/allocator.go, line 411 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Any idea why we don't already do this? The code structure makes me wonder if it's intentional.
Yeah it does seem intentional, but I don't really see how. It makes sense that it's never really come up before because the symptoms would be hard to discern.
pkg/kv/kvserver/allocator.go, line 425 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: should we rename this to
liveVoters
anddeadVoters
to fit the rest of the variable naming in this function?
Done.
pkg/kv/kvserver/allocator.go, line 460 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do we need to handle the
if haveNonVoters == neededNonVoters && len(deadNonVoters) > 0 {
case? In fact, do we need aAllocatorReplaceDeadNonVoter
action? Does that fall into your future change to rebalance non-voting replicas?
Yup, replacement/rebalancing is coming after. I did not mean to leave that thing commented in here though. Fixed.
pkg/kv/kvserver/allocator.go, line 465 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Isn't this supposed to be a lower priority than
AllocatorRemoveDeadVoter
? Will that work if we have this check before theif len(deadVoterReplicas) > 0 {
check?
Woops, yes. The whole thing should to be structured in exactly the order of the priorities. I added a comment to this effect, both inside computeAction
and above the definition of those priorities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvserver/allocator.go, line 465 at r3 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
Woops, yes. The whole thing should to be structured in exactly the order of the priorities. I added a comment to this effect, both inside
computeAction
and above the definition of those priorities.
👍
TFTR! I'm in no rush to merge this, I'll hold off until @andreimatei takes a took. |
Friendly bump. |
33b6521
to
026e90d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15 and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 319 at r7 (raw file):
// available for up-replication. // // If `num_voters` aren't explicitly configured in the zone config, all
it's unclear to me how this comment relates to this function. Is this essentially saying that if zoneConfigNonVoterCount
is zero, then this method returns 0? Consider removing it.
pkg/kv/kvserver/allocator.go, line 427 at r7 (raw file):
// TODO(aayush): When haveVoters < neededVoters but we don't have quorum to // actually execute the addition of a new replica, we should be returning a // ALlocatorRangeUnavailable.
ALl/All
This commit adds basic logic in the allocator to spit out actions that prescribe upreplication or downreplication of non-voting replicas based on zone configs. Note that this commit *does not* enable the `replicateQueue` or the `StoreRebalancer` to actually act on these newly added `AllocatorAction`s, not does it make the allocator smart enough to prescribe _rebalancing_ non-voting replicas. This commit also does not teach the allocator to rank non-voting replicas for addition or removal. Release note: None
026e90d
to
dd4ff5d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 319 at r7 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
it's unclear to me how this comment relates to this function. Is this essentially saying that if
zoneConfigNonVoterCount
is zero, then this method returns 0? Consider removing it.
Done.
pkg/kv/kvserver/allocator.go, line 427 at r7 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
ALl/All
Done.
Build succeeded: |
This PR adds basic logic in the allocator to spit out actions that
prescribe upreplication or downreplication of non-voting replicas based
on zone configs.
Note that this commit does not enable the
replicateQueue
or theStoreRebalancer
to actually act on these newly addedAllocatorAction
s, not does it make the allocator smart enough toprescribe rebalancing non-voting replicas. This commit also does not
teach the allocator to rank non-voting replicas for addition or removal.
Release note: None