scheduler: add leader verify in balance-region scheduler #2966

Merged
8 commits merged into tikv:master on Sep 18, 2020

Contributor

@Yisaer Yisaer commented Sep 16, 2020

Signed-off-by: Song Gao [email protected]

What problem does this PR solve?

If a region has no leader during balance-region scheduling, it causes PD to panic and restart over and over.

What is changed and how it works?

Check the region's leader before balancing the region. If the region has no leader, balancing is skipped for that region.
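
For illustration only, the guard described above can be sketched as follows. This is a minimal sketch, not the code in this PR: `Peer`, `Region`, and `pickBalanceCandidates` are hypothetical stand-ins for PD's real types and scheduler loop, and only the `GetLeader() == nil` check mirrors the actual change.

```go
package main

// Peer and Region are simplified, hypothetical stand-ins for PD's
// metapb.Peer and core.RegionInfo; they are not the real types.
type Peer struct {
	StoreID uint64
}

type Region struct {
	ID     uint64
	Leader *Peer // nil when the region currently has no leader
}

// GetLeader mirrors the accessor name used in the PR's check.
func (r *Region) GetLeader() *Peer { return r.Leader }

// pickBalanceCandidates returns the IDs of regions that can be considered
// for balancing, and separately the IDs of regions skipped because they
// have no leader. Skipping avoids dereferencing a nil leader later on.
func pickBalanceCandidates(regions []*Region) (candidates, skipped []uint64) {
	for _, region := range regions {
		if region.GetLeader() == nil {
			skipped = append(skipped, region.ID)
			continue
		}
		candidates = append(candidates, region.ID)
	}
	return candidates, skipped
}
```

Returning the skipped IDs is a choice made for this sketch so that callers can log or count them; the PR itself logs the skipped region inside the scheduler loop.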

Check List

Tests

  • Unit test

Related changes

  • Need to cherry-pick to the release branch

Release note

  • Fix the bug that PD might panic if some regions have no leader when balance-region is enabled.

Signed-off-by: Song Gao <[email protected]>
Signed-off-by: Song Gao <[email protected]>
Signed-off-by: Song Gao <[email protected]>
@Yisaer Yisaer marked this pull request as ready for review September 16, 2020 09:03
@Yisaer Yisaer added needs-cherry-pick-release-4.0 The PR needs to cherry pick to release-4.0 branch. type/bugfix This PR fixes a bug. labels Sep 16, 2020
pkg/pointer/point.go (outdated, resolved)
Signed-off-by: Song Gao <[email protected]>
Member

@HunDunDM HunDunDM left a comment

rest LGTM

pkg/mock/mockcluster/mockcluster.go (outdated, resolved)
@@ -535,7 +540,7 @@ func (mc *Cluster) UpdateStoreStatus(id uint64) {
}

func (mc *Cluster) newMockRegionInfo(regionID uint64, leaderStoreID uint64, followerStoreIDs ...uint64) *core.RegionInfo {
- return mc.MockRegionInfo(regionID, leaderStoreID, followerStoreIDs, []uint64{}, nil)
+ return mc.MockRegionInfo(regionID, &leaderStoreID, followerStoreIDs, []uint64{}, nil)
Member

When does nil need to be passed in here? In addition, 0 can also mean that the store does not exist.

Contributor

I think checking for 0 is better.
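
As a rough sketch of the alternative suggested here, a mock helper could treat a leader store ID of 0 as "no leader" instead of taking a pointer. Everything below is hypothetical (`mockPeer`, `mockRegion`, `newMockRegion` are illustrative names, not PD's mockcluster API), and it assumes that 0 is never a valid store ID.

```go
package main

// mockPeer and mockRegion are hypothetical, simplified types used only
// for this illustration.
type mockPeer struct {
	ID      uint64
	StoreID uint64
}

type mockRegion struct {
	ID     uint64
	Peers  []*mockPeer
	Leader *mockPeer // nil means the region has no leader
}

// newMockRegion builds a region for tests. A leaderStoreID of 0 is treated
// as "no leader" (assumption: 0 is never a valid store ID), so callers can
// pass a plain uint64 rather than a pointer.
func newMockRegion(regionID, leaderStoreID uint64, followerStoreIDs ...uint64) *mockRegion {
	region := &mockRegion{ID: regionID}
	nextPeerID := uint64(1)
	if leaderStoreID != 0 {
		leader := &mockPeer{ID: nextPeerID, StoreID: leaderStoreID}
		nextPeerID++
		region.Peers = append(region.Peers, leader)
		region.Leader = leader
	}
	for _, storeID := range followerStoreIDs {
		region.Peers = append(region.Peers, &mockPeer{ID: nextPeerID, StoreID: storeID})
		nextPeerID++
	}
	return region
}
```

With this shape, a test can build a leaderless region with `newMockRegion(1, 0, 2, 3)` and keep existing call sites that pass a real store ID unchanged.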

pkg/pointer/pointer.go (outdated, resolved)
Signed-off-by: Song Gao <[email protected]>
Signed-off-by: Song Gao <[email protected]>
@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 17, 2020
Member

@HunDunDM HunDunDM left a comment

rest LGTM

@@ -583,8 +584,12 @@ func (mc *Cluster) MockRegionInfo(regionID uint64, leaderStoreID uint64,
EndKey: []byte(fmt.Sprintf("%20d", regionID+1)),
RegionEpoch: epoch,
}
- leader, _ := mc.AllocPeer(leaderStoreID)
- region.Peers = []*metapb.Peer{leader}
+ region.Peers = []*metapb.Peer{}
Member

Suggested change
region.Peers = []*metapb.Peer{}

Contributor Author

updated.

Contributor

@lhy1024 lhy1024 left a comment

the rest LGTM

@@ -173,6 +173,12 @@ func (s *balanceRegionScheduler) Schedule(cluster opt.Cluster) []*operator.Opera
schedulerCounter.WithLabelValues(s.GetName(), "region-hot").Inc()
continue
}
+ // Check region whether have leader
+ if region.GetLeader() == nil {
+ log.Debug("region have no leader", zap.String("scheduler", s.GetName()), zap.Uint64("region-id", region.GetID()))
Contributor

Should this be logged at warn level instead?

Contributor Author

updated.

Signed-off-by: Song Gao <[email protected]>
@ti-srebot ti-srebot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 18, 2020
Contributor Author

Yisaer commented Sep 18, 2020

/merge

@ti-srebot
Contributor

@Yisaer Oops! Auto merge is restricted to Committers of the SIG. See the corresponding SIG page for more information. Related SIG: scheduling (slack).

Contributor

lhy1024 commented Sep 18, 2020

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 18, 2020
@ti-srebot
Contributor

/run-all-tests

@ti-srebot ti-srebot merged commit 4a95bca into tikv:master Sep 18, 2020
ti-srebot pushed a commit to ti-srebot/pd that referenced this pull request Sep 18, 2020
@ti-srebot
Contributor

cherry pick to release-4.0 in PR #2994

lhy1024 added a commit that referenced this pull request Sep 21, 2020
* cherry pick #2966 to release-4.0

Signed-off-by: ti-srebot <[email protected]>

* fix conflict

Signed-off-by: Song Gao <[email protected]>

* fix test

Signed-off-by: Song Gao <[email protected]>

Co-authored-by: Song Gao <[email protected]>
Co-authored-by: lhy1024 <[email protected]>
JmPotato pushed a commit to JmPotato/pd that referenced this pull request Feb 5, 2024
JmPotato added a commit that referenced this pull request Feb 5, 2024
Labels
needs-cherry-pick-release-4.0 The PR needs to cherry pick to release-4.0 branch. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.
7 participants