roachtest: backup-restore/mixed-version failed #104604
Labels
C-test-failure
Broken test (automatically or manually discovered).
O-roachtest
O-robot
Originated from a bot.
T-disaster-recovery
cockroach-teamcity added the branch-release-23.1, C-test-failure, O-roachtest, O-robot, release-blocker, and T-disaster-recovery labels on Jun 8, 2023
adityamaru removed the release-blocker and branch-release-23.1 labels on Jun 8, 2023
Looks like a dup of #104481, but we don't have dmesg.txts for some reason. Looking at heap profiles.
renatolabs added a commit to renatolabs/cockroach that referenced this issue on Jun 14, 2023
This commit updates the `backup-restore/mixed-version` roachtest to collect artifacts (cockroach logs and a debug.zip) when a restore fails in the last step of the test (when all backups taken are restored). In that step, we do not immediately fail the test when a restore fails; instead, we attempt to restore every backup and return a list of the errors found once the process is done. However, restoring cluster backups involves wiping the cluster, which also deletes the cockroach logs written up to that point. This makes it impossible to debug a restore failure that happened prior to a cluster restore.

After this commit, a restore failure in that test will cause a `restore_failure_N` directory to be created in the artifacts directory, containing the cockroach logs collected right after the failure, as well as a debug.zip created at the same time. This will make issues such as cockroachdb#104604 more actionable.

Epic: none
Release note: None
This is actually the relevant line, but it's hard to see in the midst of all those other "error messages". Once #104868 is merged, timeouts should become less likely, and if one does happen, the error messaging should be better. Closing as there's nothing to do here.
craig bot pushed a commit that referenced this issue on Jun 14, 2023

103967: build,bazel: upgrade to `rules_js` r=sjbarag a=rickystewart

The library which we were using, `rules_nodejs`, has known deficiencies:

1. The library has been "effectively deprecated" as of the [5.x branch](https://github.com/bazelbuild/rules_nodejs/tree/5.x);
2. the library is incompatible with things we need such as: cross-compilation, Bazel 6.0+, and remote execution;
3. and the library has bugs which we cannot fix, like a race condition which prevents builds from succeeding sporadically, requiring the dev to perform a `clean`.

Here we move to [rules_js](https://github.com/aspect-build/rules_js), the modern alternative.

Epic: none
Release note: None

104820: backupccl: adjust a test to run for secondary tenant codec too r=yuzefovich a=yuzefovich

Fixes: #82882.
Release note: None

104868: roachtest: collect failure artifacts when restore fails r=srosenberg a=renatolabs

This commit updates the `backup-restore/mixed-version` roachtest to collect artifacts (cockroach logs and a debug.zip) when a restore fails in the last step of the test (when all backups taken are restored). In that step, we do not immediately fail the test when a restore fails but instead attempt to restore every backup and return a list of errors found when the process is done. However, restoring cluster backups involves wiping the cluster which also deletes existing cockroach logs up to that point. This makes debugging a restore failure that happened prior to a cluster restore impossible.

After this commit, a restore failure in that test will cause a `restore_failure_N` directory to be created in the artifacts directory, including the cockroach logs collected right after the failure, as well as a debug.zip created at the same time. This will make issues such as #104604 more actionable.

Epic: none
Release note: None

104872: go.mod: bump Pebble to 32834aa62738 r=RaduBerinde a=RaduBerinde

32834aa6 objstorage: support heteorogeneous Storage backends
c75c4d65 db: wrap error when creating Reader with backing filenum
a8a7ebf5 db: Add Option to Filter SSTables

Release note: None
Epic: None

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
roachtest.backup-restore/mixed-version failed with artifacts on release-23.1 @ dcffb6a0a3f8ed7ab55b80d5a65d56be7a574f55:
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=true, ROACHTEST_ssd=0
Help
See: roachtest README
See: How To Investigate (internal)
Same failure on other branches
This test on roachdash | Improve this report!
Jira issue: CRDB-28624