forked from cockroachdb/cockroach
132702: roachtest: string match for transient errors as a fallback r=srosenberg a=DarrylWong

The `require` package is commonly used throughout roachtest to assert that no error occurred, e.g. `require.NoError(t, err)`. However, this function does not preserve the error object, which causes our transient error flake detection to not work. Since `require` is an upstream dependency, we cannot easily change this.

This change adds a fallback to our flake detection that string matches for the `TRANSIENT_ERROR` message we add. If found, it marks the error as a flake to reduce noise.

However, we have seen other cases where the error object is not preserved but the code lives somewhere that is easy for us to change. In those cases, we should ideally fix the code instead of resorting to this fallback. To make sure we still do that, the fallback also explicitly checks for a message that `require.NoError` prepends to all errors. If we find additional cases that require this fallback, we can review and add them on a case-by-case basis.

Fixes: cockroachdb#131094
Epic: none
Release note: none
Co-authored-by: DarrylWong <[email protected]>
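The fallback described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code added by this commit: the helper name `transientFlakeFromMessage`, the regular expression, and the constant name are assumptions. Only the `TRANSIENT_ERROR(...)` marker and the `Received unexpected error:` prefix are taken from the testdata shown in this diff.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// requireNoErrorPrefix is the text that require.NoError puts in front of the
// stringified error, as seen in the expected output of the testdata.
const requireNoErrorPrefix = "Received unexpected error:"

// transientErrorRE captures the flake name from a message such as
// "TRANSIENT_ERROR(ssh_problem): oops". The pattern is an assumption.
var transientErrorRE = regexp.MustCompile(`TRANSIENT_ERROR\(([^)]+)\)`)

// transientFlakeFromMessage is a hypothetical helper. It reports a flake only
// when the message contains BOTH the require.NoError prefix and a
// TRANSIENT_ERROR marker, so other code paths that drop the error object are
// still surfaced and can be fixed at the source.
func transientFlakeFromMessage(msg string) (string, bool) {
	if !strings.Contains(msg, requireNoErrorPrefix) {
		return "", false
	}
	m := transientErrorRE.FindStringSubmatch(msg)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Error stringified by require.NoError: the fallback should match.
	viaRequire := "test github_test failed: Received unexpected error:\nTRANSIENT_ERROR(ssh_problem): oops"
	// Error stringified by some unknown code path: the fallback should not match.
	unknownCase := "TRANSIENT_ERROR(ssh_problem): oops"

	name, ok := transientFlakeFromMessage(viaRequire)
	fmt.Printf("require case: name=%q matched=%v\n", name, ok)

	name, ok = transientFlakeFromMessage(unknownCase)
	fmt.Printf("unknown case: name=%q matched=%v\n", name, ok)
}
```

The two inputs in `main` mirror the two testdata files in this diff: the `require-no-error-failed` case is expected to be treated as an infra flake, while the `lose-error-object` case is deliberately left unmatched.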
Showing 8 changed files with 268 additions and 0 deletions.
60 changes: 60 additions & 0 deletions
pkg/cmd/roachtest/testdata/github/lost_error_object_and_transient_error
@@ -0,0 +1,60 @@
# When a transient error is lost as a result of an unknown case
# casting it to a string, check that our fallback transient error handling
# *doesn't* catch it. This should be investigated to determine if it should
# be fixed to preserve the error object or added to the list of exceptions

add-failure name=(oops) type=(ssh-flake, lose-error-object)
----
ok

post
----
----
roachtest.github_test [failed]() on test_branch @ [test_SHA](). A Side-Eye cluster snapshot was captured on timeout: [https://app.side-eye.io/snapshots/1](https://app.side-eye.io/snapshots/1).


```
TRANSIENT_ERROR(ssh_problem): oops
```

Parameters:
- <code>ROACHTEST_arch=amd64</code>
- <code>ROACHTEST_cloud=gce</code>
- <code>ROACHTEST_coverageBuild=false</code>
- <code>ROACHTEST_cpu=4</code>
- <code>ROACHTEST_encrypted=false</code>
- <code>ROACHTEST_fs=ext4</code>
- <code>ROACHTEST_localSSD=true</code>
- <code>ROACHTEST_runtimeAssertionsBuild=false</code>
- <code>ROACHTEST_ssd=0</code>
<details><summary>Help</summary>
<p>


See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md)



See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)



See: [Grafana](https://go.crdb.dev/roachtest-grafana//github-test/1689957243000/1689957853000)

</p>
</details>
/cc @cockroachdb/unowned
<sub>

[This test on roachdash](https://roachdash.crdb.dev/?filter=status:open%20t:.*github_test.*&sort=title+created&display=lastcommented+project) | [Improve this report!](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/bazci/githubpost/issues)

</sub>

------
Labels:
- <code>O-roachtest</code>
- <code>C-test-failure</code>
- <code>release-blocker</code>
Rendered:https://github.com/cockroachdb/cockroach/issues/new?body=roachtest.github_test+%5Bfailed%5D%28%29+on+test_branch+%40+%5Btest_SHA%5D%28%29.+A+Side-Eye+cluster+snapshot+was+captured+on+timeout%3A+%5Bhttps%3A%2F%2Fapp.side-eye.io%2Fsnapshots%2F1%5D%28https%3A%2F%2Fapp.side-eye.io%2Fsnapshots%2F1%29.%0A%0A%0A%60%60%60%0ATRANSIENT_ERROR%28ssh_problem%29%3A+oops%0A%60%60%60%0A%0AParameters%3A%0A+-+%3Ccode%3EROACHTEST_arch%3Damd64%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_cloud%3Dgce%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_coverageBuild%3Dfalse%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_cpu%3D4%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_encrypted%3Dfalse%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_fs%3Dext4%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_localSSD%3Dtrue%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_runtimeAssertionsBuild%3Dfalse%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_ssd%3D0%3C%2Fcode%3E%0A%3Cdetails%3E%3Csummary%3EHelp%3C%2Fsummary%3E%0A%3Cp%3E%0A%0A%0ASee%3A+%5Broachtest+README%5D%28https%3A%2F%2Fgithub.com%2Fcockroachdb%2Fcockroach%2Fblob%2Fmaster%2Fpkg%2Fcmd%2Froachtest%2FREADME.md%29%0A%0A%0A%0ASee%3A+%5BHow+To+Investigate+%5C%28internal%5C%29%5D%28https%3A%2F%2Fcockroachlabs.atlassian.net%2Fl%2Fc%2FSSSBr8c7%29%0A%0A%0A%0ASee%3A+%5BGrafana%5D%28https%3A%2F%2Fgo.crdb.dev%2Froachtest-grafana%2F%2Fgithub-test%2F1689957243000%2F1689957853000%29%0A%0A%3C%2Fp%3E%0A%3C%2Fdetails%3E%0A%2Fcc+%40cockroachdb%2Funowned%0A%3Csub%3E%0A%0A%5BThis+test+on+roachdash%5D%28https%3A%2F%2Froachdash.crdb.dev%2F%3Ffilter%3Dstatus%3Aopen%2520t%3A.%2Agithub_test.%2A%26sort%3Dtitle%2Bcreated%26display%3Dlastcommented%2Bproject%29+%7C+%5BImprove+this+report%21%5D%28https%3A%2F%2Fgithub.com%2Fcockroachdb%2Fcockroach%2Ftree%2Fmaster%2Fpkg%2Fcmd%2Fbazci%2Fgithubpost%2Fissues%29%0A%0A%3C%2Fsub%3E%0A%0A------%0ALabels%3A%0A-+%3Ccode%3EO-roachtest%3C%2Fcode%3E%0A-+%3Ccode%3EC-test-failure%3C%2Fcode%3E%0A-+%3Ccode%3Erelease-blocker%3C%2Fcode%3E%0A&title=roachtest%3A+github_test+failed
----
----
60 changes: 60 additions & 0 deletions
pkg/cmd/roachtest/testdata/github/require_no_error_transient_error
@@ -0,0 +1,60 @@
# When a transient error is lost as a result of the require package
# casting it to a string, check that our fallback transient error handling
# still catches it.

add-failure name=(oops) type=(ssh-flake, require-no-error-failed)
----
ok

post
----
----
roachtest.ssh_problem [failed]() on test_branch @ [test_SHA](). A Side-Eye cluster snapshot was captured on timeout: [https://app.side-eye.io/snapshots/1](https://app.side-eye.io/snapshots/1).


```
test github_test failed: Received unexpected error:
TRANSIENT_ERROR(ssh_problem): oops
```

Parameters:
- <code>ROACHTEST_arch=amd64</code>
- <code>ROACHTEST_cloud=gce</code>
- <code>ROACHTEST_coverageBuild=false</code>
- <code>ROACHTEST_cpu=4</code>
- <code>ROACHTEST_encrypted=false</code>
- <code>ROACHTEST_fs=ext4</code>
- <code>ROACHTEST_localSSD=true</code>
- <code>ROACHTEST_runtimeAssertionsBuild=false</code>
- <code>ROACHTEST_ssd=0</code>
<details><summary>Help</summary>
<p>


See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md)



See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)



See: [Grafana](https://go.crdb.dev/roachtest-grafana//github-test/1689957243000/1689957853000)

</p>
</details>
/cc @cockroachdb/test-eng
<sub>

[This test on roachdash](https://roachdash.crdb.dev/?filter=status:open%20t:.*ssh_problem.*&sort=title+created&display=lastcommented+project) | [Improve this report!](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/bazci/githubpost/issues)

</sub>

------
Labels:
- <code>O-roachtest</code>
- <code>X-infra-flake</code>
- <code>T-testeng</code>
Rendered:https://github.com/cockroachdb/cockroach/issues/new?body=roachtest.ssh_problem+%5Bfailed%5D%28%29+on+test_branch+%40+%5Btest_SHA%5D%28%29.+A+Side-Eye+cluster+snapshot+was+captured+on+timeout%3A+%5Bhttps%3A%2F%2Fapp.side-eye.io%2Fsnapshots%2F1%5D%28https%3A%2F%2Fapp.side-eye.io%2Fsnapshots%2F1%29.%0A%0A%0A%60%60%60%0Atest+github_test+failed%3A+Received+unexpected+error%3A%0ATRANSIENT_ERROR%28ssh_problem%29%3A+oops%0A%60%60%60%0A%0AParameters%3A%0A+-+%3Ccode%3EROACHTEST_arch%3Damd64%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_cloud%3Dgce%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_coverageBuild%3Dfalse%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_cpu%3D4%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_encrypted%3Dfalse%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_fs%3Dext4%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_localSSD%3Dtrue%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_runtimeAssertionsBuild%3Dfalse%3C%2Fcode%3E%0A+-+%3Ccode%3EROACHTEST_ssd%3D0%3C%2Fcode%3E%0A%3Cdetails%3E%3Csummary%3EHelp%3C%2Fsummary%3E%0A%3Cp%3E%0A%0A%0ASee%3A+%5Broachtest+README%5D%28https%3A%2F%2Fgithub.com%2Fcockroachdb%2Fcockroach%2Fblob%2Fmaster%2Fpkg%2Fcmd%2Froachtest%2FREADME.md%29%0A%0A%0A%0ASee%3A+%5BHow+To+Investigate+%5C%28internal%5C%29%5D%28https%3A%2F%2Fcockroachlabs.atlassian.net%2Fl%2Fc%2FSSSBr8c7%29%0A%0A%0A%0ASee%3A+%5BGrafana%5D%28https%3A%2F%2Fgo.crdb.dev%2Froachtest-grafana%2F%2Fgithub-test%2F1689957243000%2F1689957853000%29%0A%0A%3C%2Fp%3E%0A%3C%2Fdetails%3E%0A%2Fcc+%40cockroachdb%2Ftest-eng%0A%3Csub%3E%0A%0A%5BThis+test+on+roachdash%5D%28https%3A%2F%2Froachdash.crdb.dev%2F%3Ffilter%3Dstatus%3Aopen%2520t%3A.%2Assh_problem.%2A%26sort%3Dtitle%2Bcreated%26display%3Dlastcommented%2Bproject%29+%7C+%5BImprove+this+report%21%5D%28https%3A%2F%2Fgithub.com%2Fcockroachdb%2Fcockroach%2Ftree%2Fmaster%2Fpkg%2Fcmd%2Fbazci%2Fgithubpost%2Fissues%29%0A%0A%3C%2Fsub%3E%0A%0A------%0ALabels%3A%0A-+%3Ccode%3EO-roachtest%3C%2Fcode%3E%0A-+%3Ccode%3EX-infra-flake%3C%2Fcode%3E%0A-+%3Ccode%3ET-testeng%3C%2Fcode%3E%0A&title=roachtest%3A+ssh_problem+failed
----
----