Improve error handling when reading request and Region meta change in concurrency #1101

JaySon-Huang · 2020-09-17T06:08:56Z

What problem does this PR solve?

Issue Number: close #1095

Problem Summary:
If Region meta changes between the learner read and getting streams from storage, we cannot guarantee the correctness of the data read. We should retry those key ranges.
Before this PR, if super batch is enabled and this error occurs, the error is thrown directly to users, who then need to retry their queries. We should handle those retries inside TiFlash.

What is changed and how it works?

If we hit a RegionException after reading from storage, and super batch is enabled, then
- clear the streams read from local storage
- push those regions into region_retry and read them from remote storage later
Refine some code for FailPoints
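
The recovery path described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RegionID`, `ReadPlan`, `buildReadPlan` and the `read_local` callback are hypothetical stand-ins, and only the names `RegionException` and `region_retry` come from the description. The sketch assumes every requested region is pushed to `region_retry` on failure, and it simply rethrows when super batch is off, a path the description does not cover.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not TiFlash's real types.
using RegionID = std::uint64_t;

struct RegionException : std::runtime_error
{
    std::vector<RegionID> unavailable_regions;
    explicit RegionException(std::vector<RegionID> ids)
        : std::runtime_error("region meta changed"), unavailable_regions(std::move(ids))
    {
    }
};

struct ReadPlan
{
    std::vector<std::string> local_streams; // streams built from local storage
    std::vector<RegionID> region_retry;     // regions to read from remote storage later
};

// `read_local` is a hypothetical callback: it builds the local streams and throws
// RegionException if Region meta changed after the learner read.
template <typename ReadLocal>
ReadPlan buildReadPlan(const std::vector<RegionID> & regions, bool is_batch_cop, ReadLocal && read_local)
{
    ReadPlan plan;
    try
    {
        plan.local_streams = read_local(regions);
    }
    catch (const RegionException &)
    {
        if (!is_batch_cop)
            throw; // assumption: outside super batch the error still reaches the caller
        plan.local_streams.clear();  // drop anything read from local storage
        plan.region_retry = regions; // read these regions from remote storage later
    }
    return plan;
}
```

With a `read_local` that throws and `is_batch_cop` set to true, the returned plan has no local streams and lists every requested region in `region_retry`.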

Related changes

Need to cherry-pick to the release branch 4.0

Check List

Tests

Integration test

Side effects

Performance regression
- In this situation, it takes more time to read those key ranges from remote storage. Also, we don't apply filters on remote storage for now.

Release note

Improve error handling when reading request and Region meta change in concurrency

Signed-off-by: JaySon-Huang <[email protected]>

JaySon-Huang · 2020-09-17T07:10:12Z

/run-all-tests

windtalker · 2020-09-17T07:57:10Z

dbms/src/Flash/Coprocessor/DAGQueryBlockInterpreter.cpp

+        {
+            /// Recover from region exception when super batch is enable
+            if (dag.isBatchCop())
+            {


So even if only one region hits a region error, TiFlash still needs to read all the regions remotely?

I think we can optimize it.
On the first read of regions [1,2,...,10], if regions [1,2] fail validation, then read [3,4,...,10] from local storage again and push [1,2] to region_retry.
If the second local read fails again, then no matter how many regions fail, push [3,4,...,10] to region_retry.

What do you think about it? @windtalker

You mean adding an extra retry to handle this? Why can't we retry more than once?

I worry that too many retries will make the total processing time hard to control... We already retry during the learner read and when reading from remote storage.

OK, let's retry it with limited times.
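
The bounded retry discussed in this thread might look like the sketch below. It is only an illustration under stated assumptions: `RetryResult`, `readWithLimitedLocalRetry`, `max_local_attempts` and the `validate` callback are hypothetical names, and the thread does not show how the merged code implements the limit. Regions that fail validation go to `region_retry`, the rest are read from local storage again, and once the attempts run out every region still pending goes to `region_retry`.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

using RegionID = std::uint64_t; // hypothetical alias for illustration

struct RetryResult
{
    std::vector<RegionID> read_locally; // regions whose local read passed validation
    std::vector<RegionID> region_retry; // regions to read from remote storage later
};

// `validate` is a hypothetical callback: given the regions just read from local
// storage, it returns the subset whose Region meta changed during the read.
template <typename Validate>
RetryResult readWithLimitedLocalRetry(
    std::vector<RegionID> regions, std::size_t max_local_attempts, Validate && validate)
{
    RetryResult result;
    for (std::size_t attempt = 1; attempt <= max_local_attempts && !regions.empty(); ++attempt)
    {
        const std::vector<RegionID> failed_list = validate(regions);
        if (failed_list.empty())
        {
            // Every pending region validated: keep the local read.
            result.read_locally = std::move(regions);
            return result;
        }
        if (attempt == max_local_attempts)
            break; // out of attempts: everything still pending goes remote

        // Move only the failed regions to region_retry and retry the rest locally.
        const std::set<RegionID> failed(failed_list.begin(), failed_list.end());
        std::vector<RegionID> remaining;
        for (RegionID id : regions)
        {
            if (failed.count(id) != 0)
                result.region_retry.push_back(id);
            else
                remaining.push_back(id);
        }
        regions = std::move(remaining);
    }
    // Attempts exhausted (or none allowed): fall back to remote for what is left.
    result.region_retry.insert(result.region_retry.end(), regions.begin(), regions.end());
    return result;
}
```

With two attempts over regions 1 to 10, a first failure on [1,2] followed by any second failure leaves all ten regions in `region_retry`; if the second local read validates, regions 3 to 10 stay local and only [1,2] are read remotely. Capping the attempts keeps the added latency bounded, which is the concern raised above.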

Signed-off-by: JaySon-Huang <[email protected]>

JaySon-Huang · 2020-09-18T04:12:36Z

I will add some tests about retrying to read from local storage later.

Signed-off-by: JaySon-Huang <[email protected]>

JaySon-Huang · 2020-09-21T07:05:32Z

/run-all-tests

windtalker

LGTM

JaySon-Huang · 2020-09-21T07:38:04Z

/run-all-tests

JaySon-Huang · 2020-09-21T07:48:00Z

/run-all-tests

… concurrency (#1101) (#1109)
* Improve error handling when reading request and Region meta change in concurrency
* Add retry from local storage
* Move definitions of FailPoints into cpp file
Signed-off-by: JaySon-Huang <[email protected]>

JaySon-Huang added the type/enhancement and needs-cherry-pick-release-4.0 labels Sep 17, 2020

JaySon-Huang requested review from windtalker, flowbehappy and lidezhu September 17, 2020 06:08

JaySon-Huang self-assigned this Sep 17, 2020

JaySon-Huang changed the title ~~Refine retry when validation fail~~ Improve error handling when reading request and Region meta change in concurrency Sep 17, 2020

JaySon-Huang added 3 commits September 17, 2020 14:27

Make super batch more robust

53fd650

Signed-off-by: JaySon-Huang <[email protected]>

Move definitions of FailPoints into cpp file

c9385e2

Signed-off-by: JaySon-Huang <[email protected]>

Add some comments

fd082d8

Signed-off-by: JaySon-Huang <[email protected]>

JaySon-Huang force-pushed the refine_retry_when_validation_fail branch from f9fd60b to fd082d8 Compare September 17, 2020 06:28

windtalker reviewed Sep 17, 2020

JaySon-Huang added 2 commits September 17, 2020 16:06

Only enable when macro defined

7c25db3

Signed-off-by: JaySon-Huang <[email protected]>

Add retry from local storage

2b274dd

Signed-off-by: JaySon-Huang <[email protected]>

JaySon-Huang added 2 commits September 21, 2020 14:42

Add tests and fix bugs

edfa6be

Signed-off-by: JaySon-Huang <[email protected]>

Merge branch 'master' into refine_retry_when_validation_fail

501b023

windtalker approved these changes Sep 21, 2020

ti-srebot added the status/LGT1 label Sep 21, 2020

JaySon-Huang removed the needs-cherry-pick-release-4.0 label Sep 21, 2020

JaySon-Huang mentioned this pull request Sep 21, 2020

Improve error handling when reading request and Region meta change in concurrency (#1101) #1109

Merged

JaySon-Huang merged commit b6a151a into pingcap:master Sep 21, 2020

JaySon-Huang deleted the refine_retry_when_validation_fail branch September 21, 2020 08:58
