Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: improve more metrics #1761

Merged
merged 4 commits into from
Sep 23, 2019
Merged

*: improve more metrics #1761

merged 4 commits into from
Sep 23, 2019

Conversation

rleungx
Copy link
Member

@rleungx rleungx commented Sep 17, 2019

What problem does this PR solve?

At present, it's hard to know the process status of making replicas for offline or down stores. We might not know why the operator is failed to be created.

What is changed and how it works?

This PR adds a new metrics named processStatusCounter to record the failure reason when creating the operator for the offline or down store. Also, this PR renames some labels to make them more clear.

Pros and cons:
Pros:

  1. We can know why the operator cannot be created and the kind of failure reasons.
  2. After schedule: add more dimension for filter metrics #1746, once we encounter "no-store", we can directly find some information from filter metrics.

Cons:

  1. It records some duplicated information with checkerCounter.

UPDATE:
According to the comments, reuse checkerCounter.

Check List

Tests

  • Unit test

@codecov-io
Copy link

codecov-io commented Sep 17, 2019

Codecov Report

Merging #1761 into master will increase coverage by 0.08%.
The diff coverage is 59.09%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1761      +/-   ##
==========================================
+ Coverage   77.19%   77.27%   +0.08%     
==========================================
  Files         163      163              
  Lines       16408    16417       +9     
==========================================
+ Hits        12666    12687      +21     
+ Misses       2687     2676      -11     
+ Partials     1055     1054       -1
Impacted Files Coverage Δ
server/cluster.go 84.54% <0%> (ø) ⬆️
server/schedulers/random_merge.go 67.5% <0%> (ø) ⬆️
server/schedulers/balance_leader.go 96.77% <100%> (+0.07%) ⬆️
server/schedulers/balance_region.go 84.79% <100%> (ø) ⬆️
server/schedulers/shuffle_region.go 72.88% <100%> (ø) ⬆️
server/statistics/store_collection.go 86.71% <100%> (ø) ⬆️
server/checker/replica_checker.go 78.82% <25%> (-3.5%) ⬇️
pkg/etcdutil/etcdutil.go 76.81% <0%> (-5.8%) ⬇️
server/kv/etcd_kv.go 70.12% <0%> (-2.6%) ⬇️
server/core/storage.go 76.08% <0%> (-1.45%) ⬇️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ca2265...4331b67. Read the comment docs.

@rleungx rleungx added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 17, 2019
@@ -23,8 +23,17 @@ var (
Name: "event_count",
Help: "Counter of checker events.",
}, []string{"type", "name"})

processStatusCounter = prometheus.NewGaugeVec(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use checkerCouter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkerCounter?

@rleungx rleungx changed the title checker: add metrics for offline and down store *: improve more metrics Sep 18, 2019
@rleungx rleungx removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 18, 2019
Signed-off-by: Ryan Leung <[email protected]>
@disksing disksing added the status/can-merge Indicates a PR has been approved by a committer. label Sep 23, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Sep 23, 2019

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Sep 23, 2019

@rleungx merge failed.

@rleungx
Copy link
Member Author

rleungx commented Sep 23, 2019

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Sep 23, 2019

/run-all-tests

@sre-bot sre-bot merged commit 5c648dc into tikv:master Sep 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/metrics Metrics. status/can-merge Indicates a PR has been approved by a committer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants