backupccl: stop including OFFLINE tables in backup #42606
Wanted to call out that I added higher-level tests that pause a RESTORE and check that the restoring tables are not included in a backup. However, the logic in this test is currently skipped due to flakiness around managing job state (see #40639). Unit tests should sufficiently test this functionality for now though.
Also, this should be backported.
cc @aayushshah15 for impact on CDC.
I updated the PR to always filter out OFFLINE tables, since RESTORE verifies its targets against the descriptors stored in the backup, not the descriptors it will restore into. Also, CDC interacts with this here: https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/changefeedccl/changefeed_stmt.go#L198.
For CDC I think this change works. You shouldn't be able to run a changefeed on a table that is OFFLINE.
In 19.2, the OFFLINE table descriptor state was added. This state indicates that the table is generally not visible to users because it is being populated by the database (via an in-progress RESTORE or IMPORT). However, these tables are currently included in backups.

When a BACKUP is performed while a RESTORE or IMPORT is executing, some tables will be in the OFFLINE state. These should not be included in the BACKUP, because when they are restored they would appear to be in an intermediate and potentially inconsistent state.

Consider an OFFLINE table `bank.pause`. This table could be included in a backup either by naming the table directly: `BACKUP bank.pause TO ...` or via table expansion: `BACKUP bank.* TO ...`. In the first case, an error should be returned, as the table should not be directly visible to the user. In the latter case, OFFLINE tables should be silently ignored, since the OFFLINE table was not explicitly requested by the user. The remaining PUBLIC tables in the database should be expanded.

Note: this commit also changes the name resolution logic for changefeeds in the same way that it applies to backup.

To address this, the BACKUP and RESTORE name resolution logic was modified to support filtering OFFLINE tables.

Release note (bug fix): Stop including tables that are being restored or imported as valid targets in backups and changefeeds.
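For readers who want the two cases side by side, here is a minimal Go sketch of the intended behavior. It is not the code from this PR: `Table`, `TableState`, `errOffline`, and `resolveTargets` are hypothetical names made up for illustration, and the real change lives in the BACKUP/RESTORE name resolution logic. The sketch only assumes what the description states: an explicitly named OFFLINE table is an error, while a `db.*` expansion silently skips OFFLINE tables and keeps the PUBLIC ones.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// TableState is a hypothetical stand-in for the descriptor state.
type TableState int

const (
	Public TableState = iota
	Offline
)

// Table is a hypothetical, simplified table descriptor.
type Table struct {
	Database string
	Name     string
	State    TableState
}

// errOffline is a hypothetical sentinel error for explicitly named OFFLINE tables.
var errOffline = errors.New("table is offline")

// resolveTargets resolves a pattern of the form "db.table" or "db.*".
// Explicitly naming an OFFLINE table is an error; wildcard expansion
// silently skips OFFLINE tables and returns the remaining PUBLIC ones.
func resolveTargets(tables []Table, pattern string) ([]Table, error) {
	parts := strings.SplitN(pattern, ".", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return nil, fmt.Errorf("invalid target %q", pattern)
	}
	db, name := parts[0], parts[1]

	var matched []Table
	for _, t := range tables {
		if t.Database != db {
			continue
		}
		if name == "*" {
			// Expansion: the user did not ask for this table by name,
			// so an OFFLINE table is skipped without an error.
			if t.State == Offline {
				continue
			}
			matched = append(matched, t)
			continue
		}
		if t.Name == name {
			// Direct naming: an OFFLINE table must not be visible.
			if t.State == Offline {
				return nil, fmt.Errorf("%s.%s: %w", db, name, errOffline)
			}
			matched = append(matched, t)
		}
	}
	if name != "*" && len(matched) == 0 {
		return nil, fmt.Errorf("table %q does not exist", pattern)
	}
	return matched, nil
}

func main() {
	tables := []Table{
		{Database: "bank", Name: "accounts", State: Public},
		{Database: "bank", Name: "pause", State: Offline},
		{Database: "bank", Name: "txns", State: Public},
	}

	// Analogous to BACKUP bank.* TO ...
	expanded, err := resolveTargets(tables, "bank.*")
	fmt.Println(expanded, err)

	// Analogous to BACKUP bank.pause TO ...
	_, err = resolveTargets(tables, "bank.pause")
	fmt.Println(err)
}
```

Returning a wrapped sentinel error lets a caller distinguish "offline" from "does not exist" with `errors.Is`, though whether the two cases should be distinguishable to the user is a design choice this sketch does not settle.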
TFTRs!
42606: backupccl: stop including OFFLINE tables in backup r=aayushshah15,ajwerner,dt a=pbardea
Co-authored-by: Paul Bardea <[email protected]>
Build succeeded