executor: add more diagnosis rule to check some metrics exceed thresholds #14843
…olds Signed-off-by: crazycs <[email protected]>
… into thresshold-check-2
Rest LGTM
LGTM
executor/inspection_result.go
Outdated
	}
	for _, row := range rows {
		actual := fmt.Sprintf("%.2f", row.GetFloat64(1))
		expect := ""
- expect := ""
+ expected := ""
done.
LGTM
/run-all-tests
/run-all-tests
@crazycs520 merge failed.
/run-unit-test
Signed-off-by: crazycs [email protected]
What problem does this PR solve?
This PR is a continuation of PR #14801.
This PR adds one more threshold-check rule, which checks whether certain metrics exceed their thresholds, such as tso duration, get token duration, storage-write duration, and so on. In the current implementation, it checks the following metrics tables:
pd_tso_wait_duration
tidb_get_token_duration
tidb_load_schema_duration
tikv_scheduler_command_duration
tikv_handle_snapshot_duration
tikv_storage_async_request_duration
tikv_storage_async_request_duration
tikv_engine_write_duration
tikv_engine_max_get_duration
tikv_engine_max_seek_duration
tikv_scheduler_pending_commands
tikv_block_index_cache_hit
tikv_block_filter_cache_hit
tikv_block_data_cache_hit
What is changed and how it works?
Get the metric data and compare it with the threshold.
Check List
Tests