Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: Avoid replica thrashing when localities are different sizes #20752

Merged
merged 4 commits into from
Dec 18, 2017

Conversation

a-robinson
Copy link
Contributor

Fixes #20241

Release note (bug fix): avoid rebalance thrashing when localities have
very different numbers of nodes

This could use some careful review. A more comprehensive refactoring as described in #20751 would be nice, but this fixes the immediate problem as seen in #20241.

@a-robinson a-robinson requested review from BramGruneir, a6802739 and a team December 15, 2017 17:54
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@petermattis
Copy link
Collaborator

:lgtm:


Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 1 files at r3.
Review status: 0 of 4 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/storage/allocator.go, line 540 at r3 (raw file):

		replicaCandidates := desc.Replicas
		if raftStatus != nil && raftStatus.Progress != nil {

This could use a comment. When are raftStatus and raftStatus.Progress nil?


pkg/storage/allocator.go, line 582 at r4 (raw file):

}

func shouldRebalanceToFrom(

This name isn't terribly descriptive, but I can't think of a better name. A comment describing what it does would be helpful.


Comments from Reviewable

@bdarnell
Copy link
Contributor

:lgtm:


Reviewed 4 of 4 files at r4.
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


pkg/storage/allocator.go, line 582 at r4 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This name isn't terribly descriptive, but I can't think of a better name. A comment describing what it does would be helpful.

Pair might be clearer than ToFrom, which raised a parse error in my brain.


pkg/storage/allocator.go, line 591 at r4 (raw file):

) bool {
	if from.StoreID == to.store.StoreID {
		log.VEventf(ctx, 2, "not rebalancing to s%d because we'd immediately remove it: %s",

Shouldn't equal store IDs be caught earlier in the process? This is the "can't have two replicas per store (or even per node)" rule, not "we'd immediately remove it".


Comments from Reviewable

Copy link
Contributor

@a6802739 a6802739 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@a6802739 a6802739 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// balanceScore ties and it's a workable stop-gap on the way to something
// like #20751.
avgRangeCount := float64(c.rangeCount+o.rangeCount) / 2.0
overfullThreshold := math.Max(overfullRangeThreshold(options, avgRangeCount), avgRangeCount+1.5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain the definition of overfullThreshold here? Thank you very much.

@BramGruneir
Copy link
Member

:lgtm:


Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


Comments from Reviewable

Skipping the simulation when raftStatus.Progress is nil can make
for undesirable thrashing of replicas, as seen when testing cockroachdb#20241.
It's better to run the simulation without properly filtering replicas
than to not run it at all.

Release note: None
Fixes cockroachdb#20241

Release note (bug fix): avoid rebalance thrashing when localities have
very different numbers of nodes
@a-robinson
Copy link
Contributor Author

TFTRs!


Review status: 0 of 4 files reviewed at latest revision, 4 unresolved discussions.


pkg/storage/allocator.go, line 540 at r3 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This could use a comment. When are raftStatus and raftStatus.Progress nil?

I saw raftStatus.Progress being nil repeatedly when reproducing #20241. I never saw raftStatus being nil. I didn't dig deep into it at the time, but I'd assume it's because there was a leader-not-leaseholder situation. Do you think it's worth more investigation?

Added a comment, either way.


pkg/storage/allocator.go, line 582 at r4 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Pair might be clearer than ToFrom, which raised a parse error in my brain.

Changed to shouldRebalanceBetween and added a comment.


pkg/storage/allocator.go, line 591 at r4 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Shouldn't equal store IDs be caught earlier in the process? This is the "can't have two replicas per store (or even per node)" rule, not "we'd immediately remove it".

My bad variable naming confused you. The variable previously named from isn't necessarily an existing member of the range -- it's the replica that we think we'll remove after adding the new replica. This is here to prevent us from adding a replica that we would then just immediately remove.


pkg/storage/allocator_scorer.go, line 167 at r4 (raw file):

Previously, a6802739 (songhao) wrote…

Could you please explain the definition of overfullThreshold here? Thank you very much.

Added a comment.


Comments from Reviewable

@bdarnell
Copy link
Contributor

Review status: 0 of 4 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


pkg/storage/allocator.go, line 540 at r3 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

I saw raftStatus.Progress being nil repeatedly when reproducing #20241. I never saw raftStatus being nil. I didn't dig deep into it at the time, but I'd assume it's because there was a leader-not-leaseholder situation. Do you think it's worth more investigation?

Added a comment, either way.

Yes, that sounds like leader-not-leaseholder. Progress is only present on the leader.


Comments from Reviewable

@a-robinson a-robinson merged commit 30f60fc into cockroachdb:master Dec 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected behaviour in a split-brain scenario
6 participants