resource_control: support calibrate resource #42165

Merged
merged 13 commits into from
Mar 17, 2023

Conversation

glorv
Contributor

@glorv glorv commented Mar 13, 2023

What problem does this PR solve?

Issue Number: ref #38825

Problem Summary:

What is changed and how it works?

This PR adds a new statement, `calibrate resource`, that estimates the total Request Unit (RU) capacity of the current cluster.
Because RU usage depends on how a workload consumes resources, the maximum RU differs from one workload to another. The maximum RU estimated by this PR is therefore based on a given workload, TPC-C; we may support other workloads (e.g. sysbench) in the future.

In general, the bottleneck of a cluster is one of TiDB CPU, TiKV CPU, or TiKV IO bandwidth. Currently, we can get the exact IO bandwidth, and for most workloads IO is unlikely to be the bottleneck. So here, we only consider TiDB CPU or TiKV CPU as the bottleneck.

For a specified workload, the consumption of each resource is linearly correlated with the others. This PR therefore uses pre-benchmarked data for each resource dimension to calculate the RU cost per TiKV CPU core. If TiKV CPU is the bottleneck, then Max RU = max_ru_per_1_kv_cpu * Total_TiKV_CPU; if TiDB CPU is the bottleneck, we reduce the total TiKV CPU by a certain portion.
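As a rough illustration of the formula above, here is a minimal Go sketch. The function name `estimateMaxRU` and the `tidbPerKVCPU` ratio are assumptions made for this example; the PR only says that the TiKV CPU total is decreased "with a certain portion" when TiDB is the bottleneck, so the scaling rule shown here is one plausible reading, not the PR's actual code.

```go
package main

import "fmt"

// estimateMaxRU is a hypothetical helper, not the function added by this PR.
//
//	ruPerKVCPU   - RU that one fully used TiKV core sustains for the benchmarked workload
//	totalKVCPU   - total TiKV CPU cores in the cluster
//	totalTiDBCPU - total TiDB CPU cores in the cluster
//	tidbPerKVCPU - assumed TiDB cores needed to keep one TiKV core busy
func estimateMaxRU(ruPerKVCPU, totalKVCPU, totalTiDBCPU, tidbPerKVCPU float64) float64 {
	usableKVCPU := totalKVCPU
	if tidbPerKVCPU > 0 {
		// TiKV cores that the available TiDB CPU can actually drive.
		drivable := totalTiDBCPU / tidbPerKVCPU
		if drivable < usableKVCPU {
			// TiDB CPU is the bottleneck: shrink the effective TiKV CPU.
			usableKVCPU = drivable
		}
	}
	// Max RU = max_ru_per_1_kv_cpu * (effective) Total_TiKV_CPU
	return ruPerKVCPU * usableKVCPU
}

func main() {
	// TiKV-bound cluster: all 24 TiKV cores count.
	fmt.Println(estimateMaxRU(1000, 24, 48, 1.5))
	// TiDB-bound cluster: 12 TiDB cores can only drive 8 TiKV cores.
	fmt.Println(estimateMaxRU(1000, 24, 12, 1.5))
}
```

All numbers in `main` are made up for the example and are not benchmark results.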

The PR calculates the RU cost of each resource dimension separately, so it can compute the total RU with a custom RU config, and the expected RU capacity reflects changes to the RU config.
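The per-dimension calculation might look like the following Go sketch. The baseline numbers are the ones visible in the review snippet further down this conversation (0.5 MiB read, 1 MiB written, 300 read requests, 1750 write requests per TiKV core). The type names, the function name, and the RU weights are assumptions chosen for illustration; they are not TiDB's actual config values, and other dimensions such as CPU time are left out for brevity.

```go
package main

import "fmt"

const miB = 1 << 20

// baseline is a hypothetical stand-in for the benchmarked usage of one TiKV core.
type baseline struct {
	readBytes     float64
	writeBytes    float64
	readReqCount  float64
	writeReqCount float64
}

// ruConfig holds assumed RU weights, one per resource dimension.
type ruConfig struct {
	readBaseCost   float64 // RU per read request
	writeBaseCost  float64 // RU per write request
	readBytesCost  float64 // RU per byte read
	writeBytesCost float64 // RU per byte written
}

// ruPerKVCore sums the RU contributed by each dimension separately,
// so a change to any single weight shows up in the total.
func ruPerKVCore(b baseline, c ruConfig) float64 {
	return b.readReqCount*c.readBaseCost +
		b.writeReqCount*c.writeBaseCost +
		b.readBytes*c.readBytesCost +
		b.writeBytes*c.writeBytesCost
}

func main() {
	tpcc := baseline{readBytes: miB / 2, writeBytes: miB, readReqCount: 300, writeReqCount: 1750}
	cfg := ruConfig{
		readBaseCost:   0.25,
		writeBaseCost:  1,
		readBytesCost:  1.0 / (64 * 1024),
		writeBytesCost: 1.0 / 1024,
	}
	fmt.Println(ruPerKVCore(tpcc, cfg))

	// Doubling one weight changes only that dimension's share of the total.
	cfg.writeBytesCost *= 2
	fmt.Println(ruPerKVCore(tpcc, cfg))
}
```

Keeping the dimensions separate means the estimate never has to be re-benchmarked when only the RU weights change.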

The current SQL UI is as follows (we may add more information in a future version):

mysql> calibrate resource;
+-------+
| QUOTA |
+-------+
| 68569 |
+-------+
1 row in set (0.18 sec)

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Support estimating the cluster's total request units with `calibrate resource`

@ti-chi-bot
Member

ti-chi-bot commented Mar 13, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • JmPotato
  • nolouch

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot
Member

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 13, 2023
@glorv glorv requested review from nolouch and BornChanger March 13, 2023 12:10
@glorv
Contributor Author

glorv commented Mar 13, 2023

@nolouch @BornChanger PTAL

@glorv
Contributor Author

glorv commented Mar 13, 2023

/test all

@glorv
Contributor Author

glorv commented Mar 15, 2023

/retest

@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Mar 16, 2023
@glorv glorv marked this pull request as ready for review March 16, 2023 04:06
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 16, 2023
@glorv glorv requested review from JmPotato and Connor1996 March 16, 2023 04:06
Member

@nolouch nolouch left a comment

Overall LGTM.

readBytes: units.MiB / 2, // 0.5MiB
writeBytes: units.MiB, // 1MiB
readReqCount: 300,
writeReqCount: 1750,
Member

Does it mean 1 core can provide 1750 requests here? Maybe add more comments.

Contributor Author

@glorv glorv Mar 16, 2023

Yes. It is based on benchmark results. I added a comment on the baseResourceCost struct.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 16, 2023
return err
}

workload := "tpcc"
Member

What about using a const or defined type?

Contributor Author

done
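For readers unfamiliar with the suggestion, a defined type with a constant could look like the sketch below. The names `workloadType`, `workloadTPCC`, and `parseWorkload` are hypothetical and are not taken from the merged code; the point is only that a typed constant replaces the bare `"tpcc"` string literal.

```go
package main

import "fmt"

// workloadType is a hypothetical defined type for supported benchmark workloads.
type workloadType string

// workloadTPCC replaces the bare "tpcc" string literal.
const workloadTPCC workloadType = "tpcc"

// parseWorkload validates a workload name and rejects anything unsupported.
func parseWorkload(name string) (workloadType, error) {
	switch w := workloadType(name); w {
	case workloadTPCC:
		return w, nil
	default:
		return "", fmt.Errorf("unsupported workload %q", name)
	}
}

func main() {
	w, err := parseWorkload("tpcc")
	fmt.Println(w, err)

	_, err = parseWorkload("sysbench")
	fmt.Println(err)
}
```

With a defined type, adding another workload later means adding one constant and one `case`, and the compiler catches accidental use of arbitrary strings where a workload is expected.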

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2023
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2023
@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Mar 17, 2023
for i, f := range fields {
	switch f.ColumnAsName.L {
	case "instance":
		//instanceIdx = i
Member

please clean it

Contributor

@tiancaiamao tiancaiamao left a comment

The result could be inaccurate.
It relies on many hypotheses, like the bottleneck being TiDB or TiKV CPU, the assumed workload, the performance on different hardware...

@glorv
Contributor Author

glorv commented Mar 17, 2023

> The result could be inaccurate. It relies on many hypotheses, like the bottleneck being TiDB or TiKV CPU, the assumed workload, the performance on different hardware...

Yes. This is a limitation of the current implementation. We plan to extend this command to estimate the RU capacity dynamically based on the user's workload, which should be more useful for users.

@glorv
Contributor Author

glorv commented Mar 17, 2023

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: a6d48fb

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 17, 2023
@ti-chi-bot ti-chi-bot merged commit 9632aa6 into pingcap:master Mar 17, 2023
@glorv glorv deleted the calibrate-resource branch December 15, 2023 08:39
Labels
release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

6 participants