*: fix scheduling can not immediately start after transfer leader #4875

Merged on May 17, 2022 (8 commits)

@rleungx rleungx commented Apr 29, 2022

Signed-off-by: Ryan Leung [email protected]

What problem does this PR solve?

Issue Number: Close #4769.

What is changed and how does it work?

This PR uses a flag to distinguish whether a region's heartbeat is new after the PD leader is transferred.
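
A minimal sketch of the idea, not the PR's actual code: it assumes the flag records whether the cached copy of a region was reported by a heartbeat, so that a region the new leader only knows from storage or from syncing still counts as new the first time it heartbeats. The names `regionInfo`, `regionCache`, `fromHeartbeat`, and `putFromHeartbeat` are hypothetical.

```go
package main

import "fmt"

// regionInfo is a hypothetical, pared-down stand-in for PD's region metadata.
type regionInfo struct {
	id            uint64
	fromHeartbeat bool // true only once a heartbeat has reported this region
}

// regionCache is a hypothetical in-memory cache keyed by region ID.
type regionCache map[uint64]*regionInfo

// putFromHeartbeat stores a region reported by a heartbeat and reports whether
// that heartbeat counts as new: either the region is unknown, or the cached
// copy did not come from a heartbeat (assumed here to mean it was loaded from
// storage or synced while this node was a follower).
func (c regionCache) putFromHeartbeat(id uint64) (isNew bool) {
	origin, ok := c[id]
	isNew = !ok || !origin.fromHeartbeat
	c[id] = &regionInfo{id: id, fromHeartbeat: true}
	return isNew
}

func main() {
	// Region 1 is already cached, but no heartbeat has been seen for it yet.
	cache := regionCache{1: &regionInfo{id: 1, fromHeartbeat: false}}

	fmt.Println(cache.putFromHeartbeat(1)) // true: first heartbeat after the transfer
	fmt.Println(cache.putFromHeartbeat(1)) // false: already reported by heartbeat
	fmt.Println(cache.putFromHeartbeat(2)) // true: region was not cached at all
}
```

Without such a flag, a preloaded region would never look new, so a readiness check that counts new heartbeats could stall until a timeout.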

Check List

Tests

  • Unit test

Release note

Fix the issue that scheduling cannot start immediately after the PD leader is transferred

ti-chi-bot commented Apr 29, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • AndreMouche
  • JmPotato

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added the following labels on Apr 29, 2022:
  • do-not-merge/needs-triage-completed
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
@ti-chi-bot ti-chi-bot requested review from JmPotato and lhy1024 April 29, 2022 02:59
@rleungx rleungx changed the title *: fix the issue that scheduling can not immediately start after transfe… *: fix scheduling can not immediately start after transfer leader Apr 29, 2022

codecov bot commented Apr 29, 2022

Codecov Report

Merging #4875 (54e8fc1) into master (562586c) will decrease coverage by 0.04%.
The diff coverage is 83.33%.

❗ Current head 54e8fc1 differs from pull request most recent head f297bf2. Consider uploading reports for the commit f297bf2 to get more accurate results

@@            Coverage Diff             @@
##           master    #4875      +/-   ##
==========================================
- Coverage   75.45%   75.40%   -0.05%     
==========================================
  Files         298      297       -1     
  Lines       29532    29506      -26     
==========================================
- Hits        22282    22250      -32     
- Misses       5318     5328      +10     
+ Partials     1932     1928       -4     
Flag Coverage Δ
unittests 75.40% <83.33%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
server/cluster/cluster.go 83.30% <33.33%> (-1.39%) ⬇️
server/region_syncer/client.go 85.82% <50.00%> (+0.10%) ⬆️
server/cluster/coordinator.go 72.74% <100.00%> (-0.92%) ⬇️
server/cluster/prepare_checker.go 100.00% <100.00%> (ø)
server/core/region.go 91.17% <100.00%> (+0.05%) ⬆️
server/core/region_option.go 82.85% <100.00%> (+0.33%) ⬆️
server/grpc_service.go 53.00% <100.00%> (ø)
server/schedulers/shuffle_hot_region.go 55.55% <0.00%> (-10.11%) ⬇️
server/id/id.go 76.19% <0.00%> (-4.77%) ⬇️
server/member/member.go 64.21% <0.00%> (-3.16%) ⬇️
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7088c95...f297bf2. Read the comment docs.

@rleungx rleungx marked this pull request as ready for review April 29, 2022 07:23
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 29, 2022

@JmPotato JmPotato left a comment

LGTM. BTW, do we need to triage this issue first to see which version it affects before merging it?

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 11, 2022
@rleungx rleungx requested a review from HunDunDM May 11, 2022 03:56
@ti-chi-bot ti-chi-bot removed the do-not-merge/needs-triage-completed label and added the following labels on May 16, 2022:
  • needs-cherry-pick-release-4.0: The PR needs to cherry pick to release-4.0 branch.
  • needs-cherry-pick-release-5.0: The PR needs to cherry pick to release-5.0 branch.
  • needs-cherry-pick-release-5.1: Type: Need cherry pick to release-5.1
  • needs-cherry-pick-release-5.2: Type: Need cherry pick to release-5.2
  • needs-cherry-pick-release-5.3: Type: Need cherry pick to release-5.3
  • needs-cherry-pick-release-5.4: Should cherry pick this PR to release-5.4 branch.
  • needs-cherry-pick-release-6.0: The PR needs to cherry pick to release-6.0 branch.

rleungx commented May 16, 2022

> LGTM. BTW, do we need to triage this issue first to see which version it affects before merging it?

Done

@rleungx rleungx removed the needs-cherry-pick-release-6.0 The PR needs to cherry pick to release-6.0 branch. label May 16, 2022
@@ -746,6 +746,11 @@ func (c *RaftCluster) processReportBuckets(buckets *metapb.Buckets) error {
return nil
}

// IsPrepared return if the prepare checker is ready.

Suggested change
- // IsPrepared return if the prepare checker is ready.
+ // IsPrepared returns true if the prepare checker is ready.

@@ -812,7 +817,7 @@ func (c *RaftCluster) processRegionHeartbeat(region *core.RegionInfo) error {
regionEventCounter.WithLabelValues("update_cache").Inc()
}

- if isNew {
+ if !c.IsPrepared() && isNew {

Could we check the condition in the function collect directly?
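
What the reviewer suggests could look roughly like the sketch below, where the prepared check moves into `collect` so the heartbeat path only passes `isNew`. This is an illustration under assumptions, not PD's implementation: the struct fields, the constructor, the 80% threshold, and the `collect` signature are all hypothetical, and locking is omitted.

```go
package main

import "fmt"

// prepareChecker is a hypothetical, simplified readiness tracker.
type prepareChecker struct {
	prepared bool
	total    int                 // number of regions the cluster is expected to have
	seen     map[uint64]struct{} // regions reported by a new heartbeat so far
}

func newPrepareChecker(total int) *prepareChecker {
	return &prepareChecker{total: total, seen: make(map[uint64]struct{})}
}

// collect records a region reported by a new heartbeat. The prepared check
// lives here, so callers do not need to consult the checker's state first.
func (p *prepareChecker) collect(regionID uint64, isNew bool) {
	if p.prepared || !isNew {
		return
	}
	p.seen[regionID] = struct{}{}
	// Assumed threshold for illustration: ready once 80% of regions reported.
	if len(p.seen)*10 >= p.total*8 {
		p.prepared = true
	}
}

func main() {
	checker := newPrepareChecker(5)
	for id := uint64(1); id <= 4; id++ {
		checker.collect(id, true)
	}
	fmt.Println(checker.prepared) // true: 4 of 5 regions have reported

	checker.collect(5, true)
	fmt.Println(len(checker.seen)) // 4: nothing is collected once prepared
}
```

Either placement gives the same behavior; keeping the check inside `collect` means every caller gets it for free.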

@@ -310,6 +311,9 @@ func (c *coordinator) runUntilStop() {
}

func (c *coordinator) run() {
failpoint.Inject("runSchedulerCheckInterval", func() {

Could we change the ticker instead of runSchedulerCheckInterval, so that we can keep runSchedulerCheckInterval as a constant?
For example, redefine the ticker after L317:

	failpoint.Inject("runSchedulerCheckInterval", func() {
		ticker = time.NewTicker(100 * time.Millisecond)
	})
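
A self-contained sketch of this suggestion, under stated assumptions: the failpoint injection is replaced by a plain boolean (`fastForTest`) so the example needs only the standard library, the 3-second value for the constant is assumed for illustration, and `newCheckTicker` is a hypothetical helper. It also stops the original ticker before replacing it, which the snippet above leaves implicit.

```go
package main

import (
	"fmt"
	"time"
)

// runSchedulerCheckInterval stays a constant; the value is assumed here.
const runSchedulerCheckInterval = 3 * time.Second

// newCheckTicker returns the ticker used by the run loop. fastForTest stands
// in for the failpoint: when set, the default ticker is swapped for a faster
// one instead of mutating the interval itself.
func newCheckTicker(fastForTest bool) *time.Ticker {
	ticker := time.NewTicker(runSchedulerCheckInterval)
	if fastForTest {
		ticker.Stop() // release the original ticker before replacing it
		ticker = time.NewTicker(100 * time.Millisecond)
	}
	return ticker
}

func main() {
	ticker := newCheckTicker(true)
	defer ticker.Stop()

	start := time.Now()
	<-ticker.C
	fmt.Println("tick arrived before the default interval:",
		time.Since(start) < runSchedulerCheckInterval)
}
```

If the run loop defers `ticker.Stop()`, the defer should be registered after the ticker has been swapped, since a deferred method call binds to the ticker value at the time of the `defer` statement.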

@ti-chi-bot

@rleungx: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit 429b492 into tikv:master May 17, 2022
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request May 17, 2022
@ti-chi-bot

In response to a cherrypick label: new pull request created: #4967.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request May 17, 2022
@ti-chi-bot

In response to a cherrypick label: new pull request created: #4969.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request May 17, 2022
@ti-chi-bot

In response to a cherrypick label: new pull request created: #4970.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request May 17, 2022
@ti-chi-bot

In response to a cherrypick label: new pull request created: #4971.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request May 17, 2022
@ti-chi-bot

In response to a cherrypick label: new pull request created: #4972.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request May 17, 2022
@ti-chi-bot

In response to a cherrypick label: new pull request created: #4973.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request May 17, 2022
@ti-chi-bot

In response to a cherrypick label: new pull request created: #4974.

ti-chi-bot added a commit that referenced this pull request Jun 11, 2022
) (#4969)

close #4769, ref #4875

Signed-off-by: ti-chi-bot <[email protected]>
Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: Ryan Leung <[email protected]>
ti-chi-bot added a commit that referenced this pull request Jun 14, 2022
) (#4974)

close #4769, ref #4875

Signed-off-by: ti-chi-bot <[email protected]>
Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: Ryan Leung <[email protected]>
ti-chi-bot added a commit that referenced this pull request Jun 22, 2022
) (#4970)

close #4769, ref #4875

Signed-off-by: ti-chi-bot <[email protected]>
Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: Ryan Leung <[email protected]>
ti-chi-bot added a commit that referenced this pull request Jul 5, 2022
) (#4967)

close #4769, ref #4875

Signed-off-by: ti-chi-bot <[email protected]>
Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: Ryan Leung <[email protected]>
ti-chi-bot added a commit that referenced this pull request Sep 20, 2022
) (#4973)

close #4769, ref #4875

Signed-off-by: ti-chi-bot <[email protected]>
Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: Ryan Leung <[email protected]>
Labels
  • needs-cherry-pick-release-4.0: The PR needs to cherry pick to release-4.0 branch.
  • needs-cherry-pick-release-5.0: The PR needs to cherry pick to release-5.0 branch.
  • needs-cherry-pick-release-5.1: Type: Need cherry pick to release-5.1
  • needs-cherry-pick-release-5.2: Type: Need cherry pick to release-5.2
  • needs-cherry-pick-release-5.3: Type: Need cherry pick to release-5.3
  • needs-cherry-pick-release-5.4: Should cherry pick this PR to release-5.4 branch.
  • needs-cherry-pick-release-6.0: The PR needs to cherry pick to release-6.0 branch.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
Development

Successfully merging this pull request may close these issues.

Scheduling is blocked for around 5 mins after transferring the PD leader
4 participants