[first issue] Add error type in the NUM_INVALID_BLOCKS metric #9661

Closed
wacban opened this issue Oct 10, 2023 · 0 comments · Fixed by #10164
Labels
C-good-first-issue Category: issues that are self-contained and easy for newcomers to work on. Near Core T-core Team: issues relevant to the core team

Comments

wacban commented Oct 10, 2023

NUM_INVALID_BLOCKS is an important metric that tracks how many blocks were invalid. There are multiple reasons a block can be invalid, but currently the reason is not clear from the metric itself. This task is to add a tag to the metric containing the error type as a string. This can be achieved by converting the metric to a gauge vec.

It's important to add only the error type, not the full error message. The full message may contain hashes or other unique strings, which would create a separate time series for nearly every error. The error type alone still allows grouping and analyzing this metric sensibly.
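The difference between labelling by error type and labelling by full message can be sketched with the standard library alone. Everything named here is hypothetical: `BlockError` and its variants stand in for the real error type, and `InvalidBlocks` is a plain map standing in for a labelled gauge or counter vec, not the metrics API used in nearcore. The point is only that `label()` ignores the payload, so the number of distinct label values is bounded by the number of variants.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the block validation error type.
/// Variant names and payloads are illustrative assumptions.
#[derive(Debug)]
pub enum BlockError {
    ChunksMissing(Vec<String>),
    InvalidSignature { block_hash: String },
    Other(String),
}

impl BlockError {
    /// Returns a fixed label for the error type, ignoring any payload
    /// (hashes, messages) that would differ from one error to the next.
    pub fn label(&self) -> &'static str {
        match self {
            BlockError::ChunksMissing(_) => "chunks_missing",
            BlockError::InvalidSignature { .. } => "invalid_signature",
            BlockError::Other(_) => "other",
        }
    }
}

/// Minimal std-only stand-in for a metric with one "error" label.
#[derive(Default)]
pub struct InvalidBlocks {
    counts: HashMap<&'static str, u64>,
}

impl InvalidBlocks {
    /// Increments the value stored under the error's type label.
    pub fn record(&mut self, err: &BlockError) {
        *self.counts.entry(err.label()).or_insert(0) += 1;
    }

    /// Returns the current value for a label, or 0 if it was never recorded.
    pub fn get(&self, label: &str) -> u64 {
        self.counts.get(label).copied().unwrap_or(0)
    }

    /// Number of distinct label values seen, i.e. the number of time series.
    pub fn series(&self) -> usize {
        self.counts.len()
    }
}
```

In this sketch, two `InvalidSignature` errors with different block hashes land in the same `invalid_signature` series, whereas keying on the full message would have produced two.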

@wacban wacban added C-good-first-issue Category: issues that are self-contained and easy for newcomers to work on. T-core Team: issues relevant to the core team Near Core labels Oct 10, 2023
@jancionear jancionear self-assigned this Nov 10, 2023
jancionear added a commit to jancionear/nearcore that referenced this issue Nov 13, 2023
The metric `near_num_invalid_blocks` counts the number of invalid
blocks processed by neard. Until now there was no information about
why the blocks were invalid. Let's add a label that describes what
kind of error caused the block to be invalid. This will make it easier
to diagnose what's wrong when there are lots of invalid blocks.

The label is called "error". An example Prometheus report looks like this:
```
near_num_invalid_blocks{error="chunks_missing"} 1234
```
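As a rough illustration of how such a report line is put together, the following std-only sketch renders labelled samples as text. The function `render_samples` is hypothetical and is not how neard produces its metrics output. It assumes label values are simple identifiers and omits the escaping that the exposition format requires for backslashes, quotes, and newlines.

```rust
use std::collections::BTreeMap;

/// Renders one line per label value in the form `metric{label="value"} count`.
/// Assumes label values need no escaping (an assumption of this sketch).
pub fn render_samples(metric: &str, label: &str, counts: &BTreeMap<String, u64>) -> String {
    let mut out = String::new();
    // BTreeMap iterates in key order, so the output is deterministic.
    for (value, count) in counts {
        out.push_str(&format!("{}{{{}=\"{}\"}} {}\n", metric, label, value, count));
    }
    out
}
```

Calling it with the metric name `near_num_invalid_blocks`, the label name `error`, and a map containing `chunks_missing` mapped to `1234` yields the sample line shown above.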

Fixes: near#9661
github-merge-queue bot pushed a commit that referenced this issue Nov 16, 2023
…10164)

Fixes: #9661