Adjust retry policy factor and cache invalidation #326

Merged: ilovesoup merged 13 commits into master from adjust_backoff_time on Apr 25, 2018


Novemser
Contributor

@Novemser Novemser commented Apr 24, 2018

  1. Increase the time gap between retry requests.
  2. Send stale-epoch invalidation information back to the driver.
  3. Improve the cache invalidation framework.
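
To illustrate the first item, here is a minimal sketch of how a backoff factor widens the gap between retries. It is not the code from this PR: the function name `backoff_delays` and the parameters `base_ms`, `factor`, and `max_ms` are hypothetical, and the values actually chosen in the patch are not stated in this conversation.

```python
def backoff_delays(base_ms, factor, max_ms, attempts):
    """Return the wait time in milliseconds before each retry attempt.

    Hypothetical helper: each delay is the previous one multiplied by
    `factor`, capped at `max_ms` so waits do not grow without bound.
    """
    delays = []
    delay = base_ms
    for _ in range(attempts):
        delays.append(min(delay, max_ms))
        delay *= factor
    return delays


if __name__ == "__main__":
    # A larger factor spreads retries further apart for the same base delay.
    print(backoff_delays(base_ms=100, factor=2, max_ms=1000, attempts=5))
    print(backoff_delays(base_ms=100, factor=3, max_ms=1000, attempts=5))
```

Raising the factor gives a struggling store more time to recover between attempts, at the cost of a longer worst-case wait before the task gives up.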

@sre-bot
Contributor

sre-bot commented Apr 24, 2018

Hi contributor, thanks for your PR.

This patch needs to be approved by one of the admins. They should reply with "/ok-to-test" to allow tests to run automatically for this PR.

@Novemser
Contributor Author

/run-all-tests

@Novemser Novemser changed the title from "Adjust retry policy" to "Adjust retry policy factor and cache invalidation" Apr 24, 2018
@Novemser
Contributor Author

/run-all-tests

@Novemser Novemser requested a review from ilovesoup April 24, 2018 20:17
@Novemser
Contributor Author

/run-all-tests

@Novemser
Contributor Author

/rebuild

@Novemser
Contributor Author

/run-all-tests

@Novemser
Contributor Author

PTAL @ilovesoup @birdstorm for the coming RC2.

@Novemser Novemser self-assigned this Apr 25, 2018
@Novemser
Contributor Author

Novemser commented Apr 25, 2018

Some test results on TPCH100-Q1 for this change:

| Test | Scenario                                                      | Run Time | Failed tasks on Spark UI | Job Succeeded | Encountered Errors                                                                 |
| ---- | ------------------------------------------------------------- | -------- | ------------------------ | ------------- | ---------------------------------------------------------------------------------- |
| 1    | Kill all TiKV instances during the query                      | 4.1m     | 3011/25                  | True          | - NotLeader (with both zero and non-zero store IDs)<br />- Request outdated        |
| 2    | TiKV instances recovered from test 1; rerun the same query    | 1.3m     | 3011/0                   | True          | - A few NotLeader<br />- A few stale commands                                      |
| 3    | Load data into TiKV, expecting regions to become stale        | /        | /                        | True          | - Region stale: drop cache, re-split, and retry                                    |
| 4    | Decrease TiKV's end-point processing capacity (2000 -> 10)    | 1.9m     | 3011/2                   | True          | - Server is busy: retry                                                            |
| 5    | Kill all TiKV instances, then all PD servers, during the query | 3.7m     | 3011/96                  | True          | - NotLeader (with both zero and non-zero store IDs)<br />- Server is busy          |
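
The "drop cache and retry" behavior seen in test 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not TiSpark's implementation: `StaleEpochError`, `RegionCache`, `read_with_retry`, and the `loader`/`send` callables are all hypothetical names, and the re-split step is left out for brevity.

```python
class StaleEpochError(Exception):
    """Hypothetical error raised when a request used outdated region info."""

    def __init__(self, region_id):
        super().__init__(f"stale epoch for region {region_id}")
        self.region_id = region_id


class RegionCache:
    """Hypothetical key -> (region_id, epoch) cache backed by a loader."""

    def __init__(self, loader):
        self._loader = loader  # stands in for a lookup against PD
        self._by_key = {}

    def get(self, key):
        # Load region info on a cache miss, then serve it from the cache.
        if key not in self._by_key:
            self._by_key[key] = self._loader(key)
        return self._by_key[key]

    def invalidate_region(self, region_id):
        # Drop every cached entry that points at the stale region.
        stale_keys = [k for k, (rid, _) in self._by_key.items() if rid == region_id]
        for k in stale_keys:
            del self._by_key[k]


def read_with_retry(cache, key, send, max_attempts=3):
    """Send a request; on a stale epoch, drop the cached region and retry."""
    for _ in range(max_attempts):
        region = cache.get(key)
        try:
            return send(key, region)
        except StaleEpochError as err:
            cache.invalidate_region(err.region_id)
    raise RuntimeError(f"gave up on key {key!r} after {max_attempts} attempts")
```

In this sketch the next `cache.get` after an invalidation reloads fresh region info, so the retry runs against the current epoch instead of repeating the same failing request.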

Contributor

@ilovesoup ilovesoup left a comment

LGTM

@ilovesoup ilovesoup merged commit 7cf6271 into master Apr 25, 2018
@Novemser Novemser deleted the adjust_backoff_time branch April 25, 2018 08:19
ilovesoup pushed a commit that referenced this pull request Apr 25, 2018
* [TISPARK-27]Fix KV error handler logic when encounter store zero store id problem. (#324)

* Refix stale epoch handling (#325)

* Merge master

* Adjust retry policy factor and cache invalidation (#326)

* Fix map single table (#319)

* [TISPARK-25]Improve downgrade logic and efficiency (#310)
wfxxh pushed a commit to wanfangdata/tispark that referenced this pull request Jun 30, 2023