-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: disallow restore of backups older than the minimum supported binary version #93804
Conversation
Lots of tests are going to break, I want CI to tell me what fixtures I need to regenerate. |
509f705
to
e74c6bf
Compare
I have one TODO to resolve 743d5e1#diff-87e52e674b19b4c9ef048ee88b0a6b5feec243b869bb266a5551826f6997b0c8R72 and ensure our restore roachtests are up to date with this policy, but this should be ready for a look. I will squash all the commits once I get an LGTM! I'm primarily looking for reviewers to tell me if I've deleted a test I shouldn't have. |
e74c6bf
to
d73aaec
Compare
…ed binary version As outlined in https://www.cockroachlabs.com/docs/v22.2/restoring-backups-across-versions.html, we do not support restoring backups older than one major version. This patch enforces this in restore code by ensuring that the manifest's version is greater than equal to the minimum supported binary version of the cluster. Release note (sql change): Backups older than one major version from the cluster's current active version are not restoreable as per the policy in https://www.cockroachlabs.com/docs/v22.2/restoring-backups-across-versions.html. Fixes: cockroachdb#94295
Interleaved tables were removed >1 major version in the past and so a backup restoreable in 23.1 cannot contain interleaved tables. Release note: None
The xDBRef and duplicate-db exceptional backups can no longer be restored since they were generated >1 major version in the past. This change does add a test inspired by the duplicate-db test to expand coverage. Release note: None
This change removes the test that was added as part of were correctly upgraded on restore to 20.x+ versions. Backups taken one major release ago will no longer have these old style fks and so this test serves better as a regular test on this branch. Release note: None
This change removes old cluster restore fixtures, and replaces them with a cluster backup from 22.2.0. It also removes a 20.1 fixture that tested restore of a zoneconfig privilege that was overwritten as the usage privilege in 20.2+ versions. This test has been migrated to a regular test instead. Release note: None
This change removes an old MR backup that was being restored and tested against. There were no instructions on how to regenerate the fixture in cockroachdb#69725 so it is being deleted. Release note: None
Prior to 22.2 the public schema was not backed by a descriptor. These fixtures tested that a restore of a pre 22.2 backup would remap the synthetic public schema to a descriptor backed public schema. Since the only restoreable backups on this branch will be from 22.2 onwards, we should never see a synthetic public schema in a restoreable backup. Release note: None
The backup fixtures in this test were generated on 20.2 where cockroachdb#62564 existed. Given this was >1 major version in the past we do not expect to see a reoccurrence of the bug in 22.2+ backups that are restoreable in this release. Release note: None
This fixture captured a bug in 20.2.7 that has since been fixed. This change removes the fixture and moves it to a regular regression test. Release note: None
These fixtures were generate on older version of Cockroach that are no longer restoreable. The test has been migrated to a regular datadriven test. Release note: None
This change regenerate the sequences backup fixture on 22.2.0. While there have been no changes to the representation of sequences between 22.2 and 23.1 it seemed useful to keep a cross version backup test of this nature. Release note: None
All users in 22.2+ clusters will have IDs and so will the backups taken on these clusters. Thus, there is no need to keep around the cross version test that tests the custom restore func that injects a user ID. Release note: None
This change regenerates the fixtures for testing restores of backups that have been taken before or after the backfill portion of a schemachange. Note these tests rely on the old schemachanger and do not increase our test coverage with the new declarative schema changer. Release note: None
This test was testing a bug that was discovered during backups in pre-22.2 versions cockroachdb#88042. Since this bug does not exist on 22.2+ versions this test will not encounter the expected corruption on backups that are restoreable in this cluster. Release note: None
This change adds an env variable that will be set by the job running restore roachtests until we have updated all their fixtures to be 22.2.0+. Release note: None
d73aaec
to
358c0bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't read all of the test code updates, but overall this seems reasonable if we still want to enforce this in code. Left one small question.
return errors.WithHint(errors.Newf("the backup is from a version older than our "+ | ||
"minimum restoreable version %s", minimumRestorableVersion), | ||
"refer to our documentation about restoring across versions: https://www.cockroachlabs.com/docs/v22.2/restoring-backups-across-versions.html") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the v.Major == 0 case imply we never recorded a version in the manifest? If so, does it definitely need to happen before the currentVersion.Less check since current version will always be greater? that would help keep the two related checks next to each other.
…mework The restore/nodeshutdown tests have been using a very old workload that will not be restorable when cockroachdb#93804 lands. This patch changes the restore/nodeshutdown workload to a 80GB tpce restore and moves the tests to run on aws instead of gcp. Release note: None Epic: None
Closing in favour of #98597. |
97138: ui: add error code to stmt and txn insights details pages r=gtr a=gtr Part of: #87785. Previously, the stmt and txn insights details pages did not show any further information for failed executions. This commit adds an "error code" column to the insights table for a failed execution in the stmt and txn insights details pages. Additionally, a "status" column was added to the stmt and txn workload insights tables which is either "Completed" or "Failed". Future work involves adding the error message string in addition to the error code but it needs to be redacted first. Additionally, the txn status is missing the implementation of a "Cancelled" status. Note to reviewers: only consider the second commit, as the first is required to get the txn status. - Loom [demo](https://www.loom.com/share/e82b97ff9f034d82b98640170eb54408). Release note (ui change): Adds error code column to the insights table for a failed execution in the stmt and txn insights details page. Adds status column to the stmt and txn workload insights tables. 98410: cluster-ui: tenants use sqlstats-supplied regions r=matthewtodd a=matthewtodd Fixes #98056. As of #95449, the SQL Activity pages in the DB Console can draw regions information directly from the sqlstats tables, rather than having to translate node IDs to regions on page load. Here, we make that switch, but for non-system tenants only, because: 1. The ephemeral nature of serverless nodes made this view-time mapping especially problematic in that context. (See further notes in #95449.) 2. The system-tenant views also include KV node IDs in a special Regions/Nodes column, which we are unable to recreate given the backend storage structure. (Future design work might suggest removing these node IDs altogether, for a unified UI.) # Screenshots! ## Statements, with and without regions filter <img width="1372" alt="statements" src="https://user-images.githubusercontent.com/5261/225033247-739df90a-9173-4aab-a666-a61a1ceeb579.png"> <img width="1372" alt="statements - filtered" src="https://user-images.githubusercontent.com/5261/225033271-1c0d0f82-3dd4-48ea-bdef-11f19af97a85.png"> ## Statement details <img width="1372" alt="statement details" src="https://user-images.githubusercontent.com/5261/225033338-6dff4a6e-a4a3-48c6-863a-84f1375b0a61.png"> ## Transactions, with and without regions filter <img width="1372" alt="transactions" src="https://user-images.githubusercontent.com/5261/225033366-65f44e95-3549-47cc-b0f2-67ad48a1a1fa.png"> <img width="1372" alt="transactions - filtered" src="https://user-images.githubusercontent.com/5261/225033391-50b9a2dc-e9a1-457b-84b1-837426eba35e.png"> ## Transaction details <img width="1372" alt="transaction details" src="https://user-images.githubusercontent.com/5261/225033505-3fdeceef-35dc-4e06-af25-ab4d0c53518f.png"> Release note: None 98464: jobs,upgrades: add migration to backfill job_info table r=dt a=adityamaru This change adds a migration and corresponding cluster version after which every job entry in the system.jobs table will have its Payload and Progress written to two rows in the system.job_info table. Informs: #97762 Release note: None 98510: backupccl: update restore/nodeshutdown tests to use new roachtest framework r=adityamaru a=msbutler The restore/nodeshutdown tests have been using a very old workload that will not be restorable when #93804 lands. This patch changes the restore/nodeshutdown workload to a 80GB tpce restore and moves the tests to run on aws instead of gcp. Release note: None Epic: None 98579: upgrade/upgrades: skip TestUpgradeSchemaChangerElements r=smg260 a=smg260 Refs: #98062 Reason: flaky test Generated by bin/skip-test. Release justification: non-production code changes Release note: None Epic: None Co-authored-by: gtr <[email protected]> Co-authored-by: Matthew Todd <[email protected]> Co-authored-by: adityamaru <[email protected]> Co-authored-by: Michael Butler <[email protected]> Co-authored-by: Miral Gadani <[email protected]>
As outlined in https://www.cockroachlabs.com/docs/v22.2/restoring-backups-across-versions.html,
we do not support restoring backups older than one major version. This
patch enforces this in restore code by ensuring that the manifest's version is greater
than equal to the minimum supported binary version of the cluster.
Release note (sql change): Backups older than one major version
from the cluster's current active version are not restoreable as per
the policy in https://www.cockroachlabs.com/docs/v22.2/restoring-backups-across-versions.html.
Fixes: #94295