-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catch exception when string sent as metric value #2293
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2293 +/- ##
===========================================
- Coverage 85.56% 63.49% -22.08%
===========================================
Files 184 36 -148
Lines 12508 1915 -10593
Branches 1319 333 -986
===========================================
- Hits 10703 1216 -9487
+ Misses 1427 617 -810
+ Partials 378 82 -296 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for avoiding false alerts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Though, don't we want to retain the behavior of checks failing? Bad input is a hard error and making it just a log entry will obfuscate this for support when metrics don't show up.
* Add try around value conversion
What does this PR do?
Supersedes - #2274
Catches when a metric is not a float and logs a warning to the status page. Includes test
Motivation
The current behavior is that when a non numeric value is submitted as a metric, the check will completely fail and stop running. As an example, if a single postgres custom query starts returning a string instead of a number, this will cause the whole postgres check to stop running, causing ALL monitors over postgres to trigger. Instead, this makes it clear for the reason a metric stopped being submitted and also lets other, possibly mission critical metrics continue to be submitted.
Performance:
Worst case, we previously throw an exception and the check stops working, now we throw an exception and the check continues. This will incur a performance impact, but with the benefit of continuing to receive available metrics
In the best case, we have an extra try around something we wouldn't otherwise have had:
returns:
And in the existing functionality:
returns:
Ultimately a 0.01 usec difference with the gain of getting all other available metrics.
Review checklist
no-changelog
label attachedAdditional Notes
This adheres to the similar case where, if a single endpoint isn't available for an application, we continue retrieving available metrics.