storage: Consider which replica will be removed when adding a replica… #18364
Conversation
Hmm, I think we may need to take this a little further. As for testing, the replicate queue is unfortunately in need of some test love (#11987). If you could write one that covers this code, that would be great; otherwise some manual testing may be needed. The best tests for it currently are the allocator acceptance tests and running the

Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.

pkg/storage/replicate_queue.go, line 498 at r1 (raw file):
I think we'd be best off if we simulate the removal process as closely as possible, which would mean including the new replica in the

pkg/storage/replicate_queue.go, line 500 at r1 (raw file):
I'm a little concerned about how this will interact with #18425, since we aren't using the updated stats here. Do you have any thoughts on how to make this more accurately reflect the updated stats? I think we may be best off not doing anything about it, but I imagine this discrepancy will probably be a problem at some point.

pkg/storage/replicate_queue.go, line 505 at r1 (raw file):
If you don't mind me rewording this, I'd phrase it as "not rebalancing to candidate %+v because it would be removed immediately after being added".

Comments from Reviewable
@a-robinson, sorry, I couldn't log in to Reviewable.

If we already pass the new replica ID as

And what do you mean
Sorry, I'll try to stick to GitHub comments if Reviewable is inconvenient for you.
Because it makes a difference in what

The goal here is to simulate the actual removal process as accurately as possible. If the actual removal process will be acting on updated stats and this simulated removal process isn't, then the simulated process may be incorrect sometimes. Modifying the stats for the simulation in a way that's easy to roll back and doesn't affect anything else happening in the system may be difficult, though. What do you think?

I think the actual intent is just to simulate

That would be ideal, yes. It'll probably require a fair amount of refactoring, so use your judgment, but that would be best.

@a-robinson, Thank you for your review.
I'm sorry for forgetting about this for a while, @a6802739! We definitely still want this to get in.
Regarding testing, we have a few options of increasing difficulty:
- Do something like in storage: immediately update target store write stats after rebalance #18425, where we create a configuration that would previously have caused thrashing and test that RebalanceTarget will prefer not rebalancing over rebalancing to something that will immediately be removed.
- Create a test cluster with multiple nodes via something like multiTestContext or TestCluster, with localities configured in a way that would normally trigger this bug, and ensure that the rebalancing settles down. For an example configuration, see trouble adding a fourth node - will not become functional #19013. To summarize that issue, a cluster with 2 nodes in locality X, 1 node in locality Y, and 1 node in locality Z will exhibit the kind of thrashing that this PR should fix.
- Most complicated would be to randomly generate cluster/replica configurations to run through a unit test that runs RebalanceTarget, makes (or simulates) the requested change, and then runs RemoveTarget. The test would fail if any configuration chose to remove the replica that it just added (a rough sketch of this option follows below).
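To make the third option concrete, here is a minimal sketch of such a round-trip check. It deliberately uses a simplified stand-in interface rather than the real (*Allocator).RebalanceTarget / (*Allocator).RemoveTarget signatures; balancer, checkRebalanceRemoveRoundTrip, and the plain-int store IDs are all hypothetical names for illustration:

```go
package storage

import (
	"math/rand"
	"testing"
)

// balancer abstracts the two allocator calls the property test exercises.
// In the real code these would be (*Allocator).RebalanceTarget and
// (*Allocator).RemoveTarget; the signatures here are simplified stand-ins.
type balancer interface {
	// RebalanceTarget returns a store to rebalance to, or ok=false if no
	// rebalance is desirable for the given set of existing replicas.
	RebalanceTarget(existing []int) (target int, ok bool)
	// RemoveTarget returns the store whose replica would be removed.
	RemoveTarget(existing []int) int
}

// checkRebalanceRemoveRoundTrip generates random replica placements, asks for
// a rebalance target, pretends the new replica was added, and then asks which
// replica would be removed. The property under test: the removal never picks
// the replica that was just added.
func checkRebalanceRemoveRoundTrip(t *testing.T, b balancer, numStores, numTrials int) {
	rng := rand.New(rand.NewSource(1)) // fixed seed so failures are reproducible
	for i := 0; i < numTrials; i++ {
		// Pick three distinct stores to hold the range's current replicas.
		existing := rng.Perm(numStores)[:3]

		target, ok := b.RebalanceTarget(existing)
		if !ok {
			continue // no rebalance wanted for this configuration
		}
		// Simulate the up-replication, then ask what would be removed.
		removed := b.RemoveTarget(append(existing, target))
		if removed == target {
			t.Fatalf("trial %d: added a replica on store %d and would immediately remove it (existing=%v)",
				i, target, existing)
		}
	}
}
```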
pkg/storage/allocator.go (Outdated)

@@ -374,6 +374,22 @@ func (a *Allocator) AllocateTarget(
	}
}

func (a Allocator) simulationRemoveTarget(
Nit, but I'd change this to simulateRemoveTarget
Great. Done.
pkg/storage/allocator.go (Outdated)

	rangeInfo RangeInfo,
) (roachpb.ReplicaDescriptor, string, error) {
	// Update statics first
	a.storePool.updateLocalStoreAfterRebalance(targetStore, replica, roachpb.ADD_REPLICA)
Can you add the following comment here?
// TODO(a-robinson): This could theoretically interfere with decisions made by other goroutines, but as of October 2017 calls to the Allocator are mostly serialized by the ReplicateQueue (with the main exceptions being Scatter and the status server's allocator debug endpoint). Try to make this interfere less with other callers.
Sure, Done.
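For context, piecing together the fragments quoted in this review, the helper under discussion ends up looking roughly like the sketch below. This is a reconstruction, not the exact merged code: the ctx and constraints parameters, the defer-based rollback, and the trailing call into RemoveTarget are assumptions inferred from the return type and the stats-update call shown above (it passes rangeInfo rather than replica, matching the build failure noted later in the thread):

```go
// simulateRemoveTarget temporarily bumps the target store's stats as if the
// new replica had already been added, asks the allocator which replica it
// would then remove, and rolls the stats change back before returning.
func (a Allocator) simulateRemoveTarget(
	ctx context.Context,
	targetStore roachpb.StoreID,
	constraints config.Constraints,
	candidates []roachpb.ReplicaDescriptor,
	rangeInfo RangeInfo,
) (roachpb.ReplicaDescriptor, string, error) {
	// Update statistics first.
	// TODO(a-robinson): This could theoretically interfere with decisions made
	// by other goroutines, but as of October 2017 calls to the Allocator are
	// mostly serialized by the ReplicateQueue (with the main exceptions being
	// Scatter and the status server's allocator debug endpoint). Try to make
	// this interfere less with other callers.
	a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeInfo, roachpb.ADD_REPLICA)
	defer func() {
		// Roll the simulated addition back so later allocator decisions see
		// the real stats again.
		a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeInfo, roachpb.REMOVE_REPLICA)
	}()
	return a.RemoveTarget(ctx, constraints, candidates, rangeInfo)
}
```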
pkg/storage/allocator.go (Outdated)

	candidates []roachpb.ReplicaDescriptor,
	rangeInfo RangeInfo,
) (roachpb.ReplicaDescriptor, string, error) {
	// Update statics first
s/statics/statistics
My mistake. Done.
@@ -491,6 +511,33 @@ func (a Allocator) RebalanceTarget(
	if target == nil {
		return nil, ""
	}
	// We could make a simulation here to verify whether we'll remove the target we'll rebalance to.
Do you think we could avoid this code duplication with what's in replicate_queue.go and instead create a shared function they can both use? If they're separate like this, it seems likely that they'll accidentally diverge in the future.
Hmm, sorry, I couldn't quite understand what you mean here. I didn't make any change to replicate_queue.go. Would you please point out the code duplication with replicate_queue.go? Thank you very much.
I was referring to the code for the AllocatorRemove case in processOneChange, but on looking at it again it doesn't look like there's much that could be de-duped. Sorry for the confusion!
pkg/storage/allocator.go (Outdated)

	if newTarget == nil {
		return nil, ""
	}
	target = newTarget
I think we actually want this logic to happen in a loop. What if the newTarget is also going to be immediately removed?
Yeah, you are right. My mistake; done.
Reviewed 4 of 4 files at r2.

Comments from Reviewable
@a-robinson, thank you for the references about unit testing. Please have another look. Thank you very much.
@a-robinson, sorry, I don't know why the TeamCity build failed, and I didn't see any useful information in the TeamCity output.
I think that test failure in TeamCity is just a flake - #18554
pkg/storage/allocator.go (Outdated)

-	ctx context.Context, constraints config.Constraints, rangeInfo RangeInfo, filter storeFilter,
+	ctx context.Context,
+	constraints config.Constraints,
+	repl *Replica,
Passing in a Replica is pretty broad. It looks like we could shrink this down to just using the RangeInfo if we made simulateRemoveTarget take a RangeInfo instead of a Replica.
So we should guarantee that the duration for WritesPerSecond in RangeInfo exceeds MinStatsDuration?
Yes, I suppose so. Since this code only runs on the leaseholder, it pretty much always should be, anyway.
@@ -685,9 +694,164 @@ func TestAllocatorRebalance(t *testing.T) {
	}
}

// TestAllocatorRebalanceTarget could help us to verify whether we'll rebalance to a target that
// we'll immediately remove.
func TestAllocatorRebalanceTarget(t *testing.T) {
Can you confirm that this test passes with your change but fails without it?
Yes, I could confirm that.
pkg/storage/allocator_test.go (Outdated)

			},
		},
		{
			StoreID: 5,
Can you add a comment explaining these stores and the fake replica stats below? I'm not really sure why this fifth store is here or why the values are set the way that they are. And if I'm not sure, then future readers are even less likely to understand what you were thinking while creating the test.
Hmm, my original thinking was that without the fifth store there would be no store to rebalance to, so I just added a fifth store here. Without the simulation, store 2 would be chosen as the rebalance target, but the new replica would be removed immediately after the rebalance. So I just want to make sure that we will not choose store 2 as the target store.
pkg/storage/allocator_test.go (Outdated)

	sg.GossipStores(stores, t)

	st := a.storePool.st
	EnableStatsBasedRebalancing.Override(&st.SV, true)
Why? In 1.1, this problem occurs even if stats-based rebalancing is disabled. Testing the default setting seems more valuable than testing the non-default setting. Keeping this false would also let us remove the Capacity, Available, LogicalBytes, and WritesPerSecond fields from all the store descriptors above.
Hmm, yeah, that would be easier. Thank you very much.
pkg/storage/allocator_test.go (Outdated)

		testRangeInfo(replicas, firstRange),
		storeFilterThrottled,
	)
	if expected := roachpb.StoreID(5); result.StoreID != expected {
I don't think having a fifth store here is any better than just having 4 stores and verifying that we don't want to move a replica to the fourth.
Yeah, you are right. 4 stores will be enough.
Reviewed 3 of 3 files at r3.

Comments from Reviewable
Just a couple small comments left. Should be good to go after the last couple changes :)
pkg/storage/allocator.go (Outdated)

@@ -491,6 +516,37 @@ func (a Allocator) RebalanceTarget(
	if target == nil {
		return nil, ""
	}
	// We could make a simulation here to verify whether we'll remove the target we'll rebalance to.
	for len(candidates) >= 0 {
Sorry for not noticing this last time, but `for len(candidates) >= 0` is a pretty meaningless loop condition. Did you mean `for len(candidates) > 0`? Or to just make this an infinite loop?
Yeah, I just assumed the loop would exit via a break inside the body, so the condition didn't matter. I have changed it as you suggested.
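Putting the review comments together, the retry logic in RebalanceTarget ends up looking roughly like the sketch below, with the corrected `len(candidates) > 0` condition. The names follow the fragments quoted in this thread, while dropCandidate and pickNextTarget are hypothetical stand-ins for whatever selection logic the surrounding code actually uses:

```go
	// We could make a simulation here to verify whether we'll remove the
	// target we'll rebalance to.
	for len(candidates) > 0 {
		// Pretend the chosen target already holds a replica of the range.
		newReplica := roachpb.ReplicaDescriptor{
			NodeID:  target.Node.NodeID,
			StoreID: target.StoreID,
		}
		replicaCandidates := append(rangeInfo.Desc.Replicas, newReplica)

		// Ask the allocator which replica it would remove from that set,
		// with the target store's stats temporarily bumped (see the
		// simulateRemoveTarget sketch earlier in this thread).
		removeReplica, _, err := a.simulateRemoveTarget(
			ctx, target.StoreID, constraints, replicaCandidates, rangeInfo)
		if err != nil {
			return nil, ""
		}
		if removeReplica.StoreID != target.StoreID {
			// The new replica would survive; go ahead with this target.
			break
		}
		// The new replica would be removed right away; drop this candidate
		// and try the next-best one instead.
		candidates = dropCandidate(candidates, target.StoreID) // hypothetical helper
		newTarget := pickNextTarget(candidates)                // hypothetical helper
		if newTarget == nil {
			return nil, ""
		}
		target = newTarget
	}
```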
pkg/storage/allocator_test.go (Outdated)

	// We're going to manually mark stores dead in this test.
	stopper, g, _, a, _ := createTestAllocator( /* deterministic */ false)
	defer stopper.Stop(context.Background())
	stores := []*roachpb.StoreDescriptor{
Thanks for explaining in the code review, but a comment in the code explaining this configuration of stores would be very helpful to future maintainers.
Sorry for that. I have already added a comment about the configuration of stores.
Reviewed 1 of 1 files at r4.

Comments from Reviewable
LGTM, thanks!
Reviewed 5 of 5 files at r5.

Comments from Reviewable
It looks like it fails to build, though, according to TeamCity:
[TestStyle/TestErrCheck] style_test.go:819: err=exit status 2, stderr=/go/src/github.com/cockroachdb/cockroach/pkg/storage/store_pool_test.go:434:45: cannot use replica (variable of type *Replica) as RangeInfo value in argument to sp.updateLocalStoreAfterRebalance
@a-robinson, thank you very much. I have changed it already. Waiting for a successful TeamCity CI run. :)
… to improve balance
If the first target attempted was rejected due to the simulation claiming that it would be immediately removed, we would reuse the modified `rangeInfo.Desc.Replicas` that had the target added to it, messing with future iterations of the loop. Also, we weren't properly modifying the `candidates` slice, meaning that we could end up trying the same replica multiple times. I have a test for this, but it doesn't pass yet because the code in cockroachdb#18364 actually isn't quite sufficient for fixing cases like cockroachdb#20241. I'll send that out tomorrow once I have a fix done.

Release note: None
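A small illustration of the two slice-handling pitfalls that commit message describes. This is plain Go showing the general pattern rather than the actual follow-up diff; newReplica and target are placeholders, and candidates is assumed to be a slice whose elements expose a StoreID (the real type differs):

```go
// Issue 1: appending the simulated replica directly to the descriptor's slice
// can grow or alias rangeInfo.Desc.Replicas, so later loop iterations see a
// replica that was never actually added. Copy before appending instead:
replicaCandidates := append([]roachpb.ReplicaDescriptor(nil), rangeInfo.Desc.Replicas...)
replicaCandidates = append(replicaCandidates, newReplica)

// Issue 2: a rejected candidate has to actually be removed from `candidates`,
// otherwise the loop can keep retrying the same store. One way to filter it out:
filtered := candidates[:0:0] // zero-length, zero-capacity view: append allocates fresh storage
for _, c := range candidates {
	if c.StoreID != target.StoreID {
		filtered = append(filtered, c)
	}
}
candidates = filtered
```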
Fixes #17971. @a-robinson, thanks for your guidance. I just simulate (*Allocator).RemoveTarget before we actually add the replica during the rebalance. Still don't know how to add a test for this.