[DocDB] After Restoring a table, some tasks in the YB-Master may hold references to stale TableInfo objects #14679

Closed
deeps1991 opened this issue Oct 27, 2022 · 3 comments
Labels: area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), priority/medium (Medium priority issue)

Comments

@deeps1991
Contributor

deeps1991 commented Oct 27, 2022

Jira Link: DB-4039

Description

While implementing the infrastructure for DDL atomicity (#13358), I saw that if a DDL rollback task is scheduled right before a restore operation, the restore changes the schema of the table underneath the task.
However, the scheduled rollback task still holds a reference to the TableInfo object for the old schema. The task has no way to deterministically synchronize with a restore operation, which can happen concurrently at any point. Because the schema can change underneath it, the rollback task may try to alter or drop the table based on the old schema state.

The DDL atomicity infrastructure is still WIP at this point. However, it would be good for the PITR infrastructure to be able to invalidate references to stale table schemas.
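To make the hazard concrete, here is a minimal sketch of one way a task could avoid acting on a stale TableInfo: it stores only the table ID and the schema version it expects, then looks the table up again when it runs. Everything here (`Catalog`, `RollbackTask`, `expected_schema_version`) is a hypothetical toy model, not the YB-Master API, and a real implementation would also need to hold the appropriate lock between the check and the action.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TableInfo:
    """Toy stand-in for the master's per-table metadata (hypothetical)."""
    table_id: str
    schema_version: int


class Catalog:
    """Toy table map; a restore is modelled as replacing the TableInfo."""

    def __init__(self) -> None:
        self._tables: Dict[str, TableInfo] = {}

    def put(self, info: TableInfo) -> None:
        self._tables[info.table_id] = info

    def get(self, table_id: str) -> Optional[TableInfo]:
        return self._tables.get(table_id)


@dataclass
class RollbackTask:
    """Holds an ID and an expected version instead of an object reference."""
    table_id: str
    expected_schema_version: int

    def run(self, catalog: Catalog) -> str:
        # Re-resolve the table at execution time rather than at scheduling time.
        current = catalog.get(self.table_id)
        if current is None:
            return "skipped: table no longer exists"
        if current.schema_version != self.expected_schema_version:
            return "skipped: schema changed since scheduling"
        return "rolled back"


catalog = Catalog()
catalog.put(TableInfo("t1", 7))
task = RollbackTask("t1", expected_schema_version=7)

# Simulate a restore that swaps in an older schema before the task runs.
catalog.put(TableInfo("t1", 4))
print(task.run(catalog))
```

In this model the task never dereferences an object that the restore may have replaced or deleted; it only notices that its assumptions no longer hold and gives up.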

@deeps1991 added the area/docdb (YugabyteDB core features) and status/awaiting-triage (Issue awaiting triage) labels on Oct 27, 2022
@yugabyte-ci added the kind/bug (This issue is a bug) and priority/medium (Medium priority issue) labels on Oct 27, 2022
@deeps1991
Contributor Author

@sanketkedia I created this ticket as per our discussion. Please let me know if further details are required.

@lingamsandeep
Contributor

@deeps1991 Did you already fix this as part of another change?

@deeps1991
Contributor Author

deeps1991 commented Oct 12, 2023

@lingamsandeep I did not, but it looks like the test I had commented out because of this issue now passes thanks to some other changes. I no longer see the segfault I was seeing before when DDL verification and restoration happen concurrently. That crash was caused by TableInfo objects being deleted while the verification process still held a pointer to them.
The DDL verification now fails cleanly with:

[m-1] I1012 17:45:19.019272 3832563 ysql_ddl_handler.cc:272] Alter transaction on 000033f5000030008000000000004005 failed, rolling back its schema changes
[m-1] W1012 17:45:19.019428 3832563 catalog_manager.cc:7056] Aborted (yb/master/sys_catalog-internal.h:122): An error occurred while updating sys-catalog tables entry: Aborted (yb/master/sys_catalog-internal.h:122): Trying to write data read before a restore was initiated.: Trying to write data read before a restore was initiated.
[m-1] W1012 17:45:19.019676 3832563 ysql_ddl_handler.cc:129] Transaction verification failed for table drop_test [id=000033f5000030008000000000004005]: Service unavailable (yb/master/catalog_manager-internal.h:65): Operation requested can only be executed on a leader master, but this master is no longer the leader: Aborted (yb/master/sys_catalog-internal.h:122): An error occurred while updating sys-catalog tables entry: Aborted (yb/master/sys_catalog-internal.h:122): Trying to write data read before a restore was initiated.: Trying to write data read before a restore was initiated. (master error 7)
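The "Trying to write data read before a restore was initiated" message suggests that writes are checked against some notion of when the data was read. I have not confirmed how the sys catalog implements this, so the following is only a toy sketch of that general idea under my own assumptions: a hypothetical `SysCatalog` bumps a restore epoch on every restore and rejects writes whose read epoch is older.

```python
from typing import Dict, Optional, Tuple


class StaleReadError(Exception):
    """Raised when a write is based on data read before a restore."""


class SysCatalog:
    """Toy model with a restore epoch (hypothetical, not the real class)."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}
        self._restore_epoch = 0

    def read(self, key: str) -> Tuple[Optional[str], int]:
        # Return the value together with the epoch it was read in.
        return self._entries.get(key), self._restore_epoch

    def write(self, key: str, value: str, read_epoch: int) -> None:
        # Reject writes derived from state that a restore has since replaced.
        if read_epoch != self._restore_epoch:
            raise StaleReadError(
                "Trying to write data read before a restore was initiated."
            )
        self._entries[key] = value

    def restore(self, snapshot: Dict[str, str]) -> None:
        # Replace all entries and invalidate every outstanding read.
        self._entries = dict(snapshot)
        self._restore_epoch += 1


catalog = SysCatalog()
catalog.write("drop_test", "schema v2", read_epoch=0)

_, epoch = catalog.read("drop_test")         # verification task reads state
catalog.restore({"drop_test": "schema v1"})  # restore runs concurrently

try:
    catalog.write("drop_test", "schema v3", read_epoch=epoch)
except StaleReadError as err:
    print(f"Aborted: {err}")
```

With a guard like this, the stale task fails with an error instead of overwriting restored state, which matches the shape of the log above.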

I think this issue can be closed.
