bulkio: Add onfailure option to restore #55856

Closed
miretskiy opened this issue Oct 22, 2020 · 3 comments
Labels
C-wishlist A wishlist feature. E-starter Might be suitable for a starter project for new employees or team members. no-issue-activity T-disaster-recovery X-stale

Comments

@miretskiy
Contributor

miretskiy commented Oct 22, 2020

Add an onfailure option to the RESTORE command.
Intended usage:
RESTORE ... WITH onfailure='rollback': Roll back on failure. This is the default.
RESTORE ... WITH onfailure='pause': If something goes wrong, pause the job instead of rolling back.

Why: Restores generally take a long time to run. If something goes wrong (a transient error, running out of disk,
a failure to mark a table PUBLIC, etc.), we currently roll back everything. This may be too aggressive and undesirable:
often we have already made significant progress in the restore, and simply retrying might get us to final
success.

Epic CRDB-7909

Jira issue: CRDB-3623

@miretskiy miretskiy added the C-wishlist A wishlist feature. label Oct 22, 2020
@pbardea
Contributor

pbardea commented Oct 22, 2020

Thanks for filing!

I wonder if it would be useful to allow the user to control this flow for all jobs and make the change at the job level. I found #36887, which seems to imply that this would also be useful for CDC use cases.

@miretskiy
Contributor Author

We would definitely need to make job-level changes to support this (in a reasonable way).
For example, if the job went into the "paused" state (or maybe paused-because-of-error), we need to make sure
that we can still cancel + roll back. We also need to make sure that when the jobs framework invokes OnFailCancel, we can tell
the framework not to kill this job, but instead move it to the "paused" state.

Regardless, the option itself needs to be implemented at the statement level (i.e., added to import/restore/etc. in sql.y).
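
The job-level requirement above can be sketched as a small state machine. This is an illustration only, not the jobs framework's API: the `State` and `Event` types, their values, and `Next` are hypothetical, and `Paused` stands in for either "paused" or a dedicated paused-because-of-error state.

```go
package main

import "fmt"

// State and Event are hypothetical stand-ins for job states and triggers.
type State string
type Event string

const (
	Running   State = "running"
	Paused    State = "paused"    // or a dedicated paused-because-of-error state
	Reverting State = "reverting" // rollback in progress

	Failed   Event = "failed"
	Resumed  Event = "resumed"
	Canceled Event = "canceled"
)

// Next returns the state a job moves to when event e arrives in state s.
// pauseOnFailure models onfailure='pause'; when false, a failure rolls back.
func Next(s State, e Event, pauseOnFailure bool) (State, error) {
	switch {
	case s == Running && e == Failed && pauseOnFailure:
		return Paused, nil // do not kill the job; park it instead
	case s == Running && (e == Failed || e == Canceled):
		return Reverting, nil
	case s == Paused && e == Resumed:
		return Running, nil // retry from the progress made so far
	case s == Paused && e == Canceled:
		return Reverting, nil // a paused job can still cancel + roll back
	}
	return s, fmt.Errorf("invalid transition: %s on %s", s, e)
}

func main() {
	s := Running
	for _, e := range []Event{Failed, Resumed, Failed, Canceled} {
		next, err := Next(s, e, true)
		if err != nil {
			fmt.Println(err)
			return
		}
		fmt.Printf("%s --%s--> %s\n", s, e, next)
		s = next
	}
}
```

The two transitions out of `Paused` are the point: resuming retries the job, and canceling still reaches the rollback path.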

@shermanCRL shermanCRL added the E-starter Might be suitable for a starter project for new employees or team members. label Jul 22, 2021
@github-actions

github-actions bot commented Sep 6, 2023

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!


4 participants