roachtest: string match for transient errors as a fallback #132702
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
The first two commits are part of #132795, as the newly added tests build off of it. The core logic behind this PR is in
It would be good to add a (mocked) roachtest using the `require` package so that we have end-to-end coverage. Otherwise, LGTM!
Good idea, added.
The `require` package is commonly used throughout roachtest to assert that no error occurred, e.g. `require.NoError(t, err)`. However, this function does not preserve the error object, which breaks our transient error flake detection. Since `require` is an upstream dependency, we cannot easily change this.

This change adds a fallback to our flake detection that string matches for the `TRANSIENT_ERROR` message we add. If the message is found, the error is marked as a flake to reduce noise.

However, we have seen other cases where the error object is not preserved but the code lives somewhere we can easily change. In those cases, we should ideally fix the code instead of resorting to this fallback. To make sure we still do that, the fallback also explicitly checks for a message that `require.NoError` prepends to all errors. If we find additional cases that require this fallback, we can review and add them on a case-by-case basis.
})

// Now test that if the transient error is not handled by the `require` package,
// but similarly lost due to casting to a string, the test runner *won't* mark
Nice!
LGTM
TFTR! bors r=srosenberg
Fixes: #131094
Epic: none
Release note: none