sql: Add telemetry counter when descriptor corruption error is encountered #61786
Comments
I've been giving this some thought. Are there any arguments against adding telemetry for all descriptor and namespace validation failures?
I would love that! Would it be a single counter for all descriptor validation failures?
I'm in the process of finding out how to best do this but I should have an answer for you today.
I eventually came up with a solution involving a small number of counters with the following dimensions:
Please correct me if I'm wrong, but I believe this addresses the original ask, and the change is not so large that it goes against the spirit of the stability period.
I'll let @ajwerner or @jordanlewis weigh in on the right dimensions. Initially, I would likely aggregate all descriptor corruption-related counters to view the trendline. So this proposal SGTM.
This commit adds `sql.schema.validation_errors.*` telemetry keys to descriptor validation errors. Fixes cockroachdb#61786. Release note: None
62546: catalog: add telemetry for descriptor validation errors r=postamar a=postamar
This commit adds `sql.schema.validation_errors.*` telemetry keys to descriptor validation errors.
Fixes #61786.
Release note: None
Co-authored-by: Marius Posta <[email protected]>
In 20.2, we added descriptor validation on write. In 21.1, we will start logging/creating Sentry reports for namespace validation errors.
Desired behavior
Whenever we encounter a corruption error, we should increment one or more telemetry counters. The number of counters should be determined by the level of granularity we have for corruption errors (e.g., descriptor corruption vs. namespace corruption).
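To make the desired behavior concrete, here is a minimal, self-contained sketch of one counter per corruption kind, plus an aggregate for viewing the overall trend. It is only an illustration under stated assumptions: the `Counters` type, the `CorruptionKind` values, and the key suffixes `descriptor` and `namespace` are hypothetical, and it does not use CockroachDB's actual telemetry package. Only the `sql.schema.validation_errors.` prefix comes from this thread.

```go
package main

import (
	"fmt"
	"sync"
)

// CorruptionKind is a hypothetical classification of corruption errors.
// The granularity (descriptor vs. namespace) follows the issue text.
type CorruptionKind string

const (
	DescriptorCorruption CorruptionKind = "descriptor"
	NamespaceCorruption  CorruptionKind = "namespace"
)

// keyPrefix mirrors the key family mentioned in this thread; the suffixes
// appended to it below are assumptions made for illustration.
const keyPrefix = "sql.schema.validation_errors."

// Counters is a hypothetical in-memory stand-in for a telemetry registry.
type Counters struct {
	mu     sync.Mutex
	counts map[string]int64
}

func NewCounters() *Counters {
	return &Counters{counts: make(map[string]int64)}
}

// Inc increments the counter for the given corruption kind.
func (c *Counters) Inc(kind CorruptionKind) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[keyPrefix+string(kind)]++
}

// Get returns the current value of the counter for one corruption kind.
func (c *Counters) Get(kind CorruptionKind) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[keyPrefix+string(kind)]
}

// Total sums all counters, which is enough to plot a single trendline.
func (c *Counters) Total() int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	var total int64
	for _, n := range c.counts {
		total += n
	}
	return total
}

func main() {
	c := NewCounters()

	// Simulate encountering corruption errors of different kinds.
	c.Inc(DescriptorCorruption)
	c.Inc(DescriptorCorruption)
	c.Inc(NamespaceCorruption)

	fmt.Println("descriptor:", c.Get(DescriptorCorruption))
	fmt.Println("namespace:", c.Get(NamespaceCorruption))
	fmt.Println("total:", c.Total())
}
```

Keeping one key per kind preserves the granularity asked for here, while summing across keys still gives a single number to watch over time.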