Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add region cache state test & fix some issues of replica selector #910

Merged
merged 3 commits into from
Aug 7, 2023

Conversation

you06
Copy link
Contributor

@you06 you06 commented Jul 26, 2023

This PR add the selector replica test for stale read, cover the most possible errors, and fix the bug the test finds:

Fixes:

  • when fallback to followers from leader, stale-read flag should be removed.
  • in some cases, the fallback to follower not work.
  • when using stale read, allow async reload region when leader is exhausted, because leader may just be unavailable for a while, and the late update of region info won't affect stale read processing.

@you06 you06 force-pushed the region-cache-test branch 2 times, most recently from 9bbea85 to 533d7eb Compare August 2, 2023 05:34
@you06 you06 force-pushed the region-cache-test branch from e607df2 to 5efef85 Compare August 2, 2023 07:06
@you06 you06 marked this pull request as ready for review August 2, 2023 07:06
@you06 you06 changed the title Add region cache state test add region cache state test & fix some issues of replica selector Aug 2, 2023
@cfzjywxk
Copy link
Contributor

cfzjywxk commented Aug 4, 2023

/cc @crazycs520

// In stale-read, the request will fallback to leader after the local follower failure.
// If the leader is also unavailable, we can fallback to the follower and use replica-read flag again,
// The remote follower not tried yet, and the local follower can retry without stale-read flag.
if state.isStaleRead {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be printed in the above log so we know if tryFollower happens in current turn, so is the fallbackFromLeader state.

@@ -609,14 +615,30 @@ func (state *accessFollower) next(bo *retry.Backoffer, selector *replicaSelector
// If there is no candidate, fallback to the leader.
if selector.targetIdx < 0 {
leader := selector.replicas[state.leaderIdx]
leaderInvalid := leader.isEpochStale() || state.IsLeaderExhausted(leader)
leaderEpochStale := leader.isEpochStale()
leaderInvalid := leaderEpochStale || state.IsLeaderExhausted(leader)
Copy link
Contributor

@cfzjywxk cfzjywxk Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could refactor the implememtation of state.IsLeaderExhausted, cheking if the leader is exausted by the current states instead of leader.isExhausted(2).


resp, _, err := s.regionRequestSender.SendReqCtx(bo, req, regionLoc.Region, time.Second, tikvrpc.TiKV, ops...)
if !available {
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this err not checked if it's expected or not?

@disksing disksing merged commit 5968ce9 into tikv:tidb-6.5 Aug 7, 2023
you06 added a commit to you06/client-go that referenced this pull request Aug 14, 2023
you06 added a commit to you06/client-go that referenced this pull request Aug 14, 2023
you06 added a commit to you06/client-go that referenced this pull request Aug 14, 2023
you06 added a commit to you06/client-go that referenced this pull request Aug 15, 2023
you06 added a commit to you06/client-go that referenced this pull request Aug 15, 2023
cfzjywxk pushed a commit that referenced this pull request Sep 12, 2023
…) (#942)

* add region cache state test & fix some issues of replica selector (#910)

Signed-off-by: you06 <[email protected]>

remove duplicate code

Signed-off-by: you06 <[email protected]>

* remove comment

Signed-off-by: you06 <[email protected]>

* lint

Signed-off-by: you06 <[email protected]>

* fix flaky test

Signed-off-by: you06 <[email protected]>

---------

Signed-off-by: you06 <[email protected]>
cfzjywxk added a commit that referenced this pull request Sep 26, 2023
* reload region cache when store is resolved from invalid status (#843)

Signed-off-by: you06 <[email protected]>
Co-authored-by: disksing <[email protected]>

* fallback to follower when leader is busy (#916) (#923)

* fallback to follower when leader is busy

Signed-off-by: you06 <[email protected]>
Co-authored-by: cfzjywxk <[email protected]>
Co-authored-by: cfzjywxk <[email protected]>

* Resume max retry time check for stale read retry with leader option(#903) (#911)

* Resume max retry time check for stale read retry with leader option

Signed-off-by: cfzjywxk <[email protected]>

* add cancel

Signed-off-by: cfzjywxk <[email protected]>

---------

Signed-off-by: cfzjywxk <[email protected]>

* add region cache state test & fix some issues of replica selector (#910)

Signed-off-by: you06 <[email protected]>

remove duplicate code

Signed-off-by: you06 <[email protected]>

* enable workflow for tidb-7.1

Signed-off-by: you06 <[email protected]>

* update

Signed-off-by: you06 <[email protected]>

update

Signed-off-by: you06 <[email protected]>

fix test

Signed-off-by: you06 <[email protected]>

fix test

Signed-off-by: you06 <[email protected]>

* lint

Signed-off-by: you06 <[email protected]>

* lint

Signed-off-by: you06 <[email protected]>

* fix flaky test

Signed-off-by: you06 <[email protected]>

---------

Signed-off-by: you06 <[email protected]>
Signed-off-by: cfzjywxk <[email protected]>
Co-authored-by: disksing <[email protected]>
Co-authored-by: cfzjywxk <[email protected]>
Co-authored-by: cfzjywxk <[email protected]>
iosmanthus added a commit that referenced this pull request Dec 20, 2023
Co-authored-by: cfzjywxk <[email protected]>
Co-authored-by: cfzjywxk <[email protected]>
Co-authored-by: disksing <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: zzm <[email protected]>
Co-authored-by: husharp <[email protected]>
Co-authored-by: you06 <[email protected]>
Co-authored-by: buffer <[email protected]>
Co-authored-by: 3pointer <[email protected]>
Co-authored-by: buffer <[email protected]>
Co-authored-by: husharp <[email protected]>
Co-authored-by: crazycs520 <[email protected]>
Co-authored-by: Smilencer <[email protected]>
Co-authored-by: ShuNing <[email protected]>
Co-authored-by: zyguan <[email protected]>
Co-authored-by: Jack Yu <[email protected]>
Co-authored-by: Weizhen Wang <[email protected]>
Co-authored-by: lucasliang <[email protected]>
Co-authored-by: healthwaite <[email protected]>
Co-authored-by: xufei <[email protected]>
Co-authored-by: JmPotato <[email protected]>
Co-authored-by: ekexium <[email protected]>
Co-authored-by: 山岚 <[email protected]>
Co-authored-by: glorv <[email protected]>
Co-authored-by: Yongbo Jiang <[email protected]>
resolve locks interface for tidb gc_worker (#945)
fix some issues of replica selector (#910)  (#942)
fix some issues of replica selector (#910)
fix issue of configure kv timeout not work when disable batch client (#980)
fix batch-client wait too long and add some metrics (#973)
fix batch-client wait too long and add some metrics (#973)" (#984)
fix data race at the aggressiveLockingDirty (#913)
fix MinSafeTS might be set to MaxUint64 permanently (#994)
fix: fix invalid nil pointer when trying to record Store.SlownessStat. (#1017)
Fix batch client batchSendLoop panic (#1021)
fix request source tag unset (#1025)
Fix comment of `SuspendTime` (#1057)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants