-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: improve backup-restore roundtrip testing of multiple tenants #108745
Comments
@stevendanna, is there any update on this? We'd like to re-enable incremental backups. |
@andy-kimball Will have an update for you on this one by the end of the week. |
Following up here, any update? |
Apologies that this has sat for so long. I've added the tests outline in this issue here: I've run a stress run with these new nemeses in place for 2 days on a GCE worker. I still have this running and will keep an eye on it over this week. While simple, I've run it with past bugs re-inserted to ensure that this randomised approach would have caught the issue over time. We've hit one issue so far: This is in a default-off feature, but I am currently investigating to make sure that it isn't indicative of another problem. Note that correct tenant restores currently still requires that tenant IDs are never re-used. We believe that this is the case in CC with the exception of manual action. That will be the case until we either move to per-tenant backup schedules or remove the use of ClearRange in the tenant GC path. We likely should do both over time. I think we can re-enable incrementals for CC once we (1) merge and backport this test, and (2) I finish investigating the above failure. Both of which I expect to happen this week. NB: We are currently investigating a newly issue discovered the the CCT cluster related to protected timestamps. While not strictly related to incremental tenant backups, depending on what we find, it may also have an impact for incremental tenant backups. However, the investigation there is going well so we should know more shortly. NB2: The universe heard my optimism and immediately rewarded it with a test failure. I've updated this message to reflect that. I think this failure wasn't caught on my GCEWorker as I had disabled metamorphic testing when investigating another issue. |
113851: backupccl: add tenant operations to backup nemesis r=dt a=stevendanna This adds: - CREATE TENANT - DROP TENANT - ALTER TENANT .. RENAME - CREATE TENANT FROM REPLICATION To our tenant backup nemesis tests. Additionally, it modifies the test to occasionally use cluster backups over tenant-level backups. Fixes #108745 Release note: None 114876: streamingccl: allow CUTOVER TO LATEST before initial scan finishes r=stevendanna a=msbutler This patch allows the user to execute ALTER TENANT x COMPLETE REPLICATION TO LATEST before the initial scan completes. After this cmd, the cutover time is set to the replicated start time. Fixes: #114734 Epic: none Co-authored-by: Steven Danna <[email protected]> Co-authored-by: Michael Butler <[email protected]>
A recent escalation revealed that the following queries from the system tenant would break RESTORE's of tenants:
We ought to develop a unit test which runs the following system tenant workload:
include_all_virtual_clusters
arg at various points in this workloadA poor man's version of this test is in #108743
Jira issue: CRDB-30619
The text was updated successfully, but these errors were encountered: