Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: add more diagnosis rule to check some metrics exceed thresholds #14843

Merged
merged 8 commits into from
Feb 20, 2020

Conversation

crazycs520
Copy link
Contributor

Signed-off-by: crazycs [email protected]

What problem does this PR solve?

This PR is the continuation of PR #14801.

This PR adds a more threshold-check rule, which is used to check some metrics threshold value, like tso duration, get token duration, storage-write duration and so on. It will detect the following metrics tables in the current implementation:

  • pd_tso_wait_duration
  • tidb_get_token_duration
  • tidb_load_schema_duration
  • tikv_scheduler_command_duration
  • tikv_handle_snapshot_duration
  • tikv_storage_async_request_duration
  • tikv_storage_async_request_duration
  • tikv_engine_write_duration
  • tikv_engine_max_get_duration
  • tikv_engine_max_seek_duration
  • tikv_scheduler_pending_commands
  • tikv_block_index_cache_hit
  • tikv_block_filter_cache_hit
  • tikv_block_data_cache_hit

What is changed and how it works?

Get the metric-data and compare with the threshold.

Check List

Tests

  • Unit test

@crazycs520 crazycs520 added the sig/execution SIG execution label Feb 19, 2020
Copy link
Contributor

@lonng lonng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

@lonng lonng added this to the v4.0.0-beta.1 milestone Feb 19, 2020
@lonng lonng linked an issue Feb 19, 2020 that may be closed by this pull request
56 tasks
Signed-off-by: crazycs <[email protected]>
Signed-off-by: crazycs <[email protected]>
Copy link
Contributor

@lonng lonng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Deardrops Deardrops self-requested a review February 20, 2020 02:33
}
for _, row := range rows {
actual := fmt.Sprintf("%.2f", row.GetFloat64(1))
expect := ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expect := ""
expected := ""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@Deardrops Deardrops added the status/LGT1 Indicates that a PR has LGTM 1. label Feb 20, 2020
Signed-off-by: crazycs <[email protected]>
Copy link
Contributor

@Deardrops Deardrops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Deardrops Deardrops added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Feb 20, 2020
@crazycs520
Copy link
Contributor Author

/run-all-tests

@crazycs520 crazycs520 added the status/can-merge Indicates a PR has been approved by a committer. label Feb 20, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Feb 20, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Feb 20, 2020

@crazycs520 merge failed.

@crazycs520
Copy link
Contributor Author

/run-unit-test

@crazycs520 crazycs520 merged commit d0ad254 into pingcap:master Feb 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants