Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-19.2: backport fixes for schema change deadlocks #46904

Conversation

ajwerner
Copy link
Contributor

@ajwerner ajwerner commented Apr 1, 2020

Backport 1/1 commits from #46170.
Backport 1/1 commits from #46384.
Backport 1/1 commits from #46588.
Backport 1/1 commits from #46829.

/cc @cockroachdb/release


Backports commits to deal with deadlocks due to being unable to detect deadlocks during schema element lookup.

@ajwerner ajwerner requested a review from andreimatei April 1, 2020 23:13
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ajwerner
Copy link
Contributor Author

ajwerner commented Apr 1, 2020

@andreimatei is there an additional commit that needs to be backported to deal with restarts and high priority transactions?

andreimatei and others added 4 commits April 1, 2020 23:14
This patch makes reading the system.namespace table and reading table
descriptors in high-priority transactions. The transactions will push
locks owned by schema changes out of their way. The idea is that regular
SQL transactions reading schema don't want to wait for the transactions
performing DDL statements. Instead, those DDLs will be pushed and forced
to refresh.

Besides the benefit to regular transactions, this patch also prevents
deadlocks for the transactions performing DDL. Before this patch, the
final select in the following sequence would deadlock:

begin; savepoint s; create table t(x int); rollback to savepoint s;
select * from t;

The select is reading the namespace table, using a different txn from
its own. That read would block on the intent laid down by the prior
create. With this patch, the transaction effectively pushes itself, but
gets to otherwise run.

Fixes cockroachdb#24885

Release justification: Fix for an old bug that became more preminent
when we introduced SAVEPOINTs recently.

Release note: A rare bug causing transactions that have performed schema
changes to deadlock after they restart has been fixed.
This commit is a follow up to cockroachdb#46170. That PR dealt with the table resolution
self-deadlock and additionally ensures that lease acquisition is not blocked.
That commit did not deal with the fact that schema changes which mutate
databases can deadlock themselves on retry because a separate transaction
is used to resolve database descriptors during planning.

This patch makes reading the system.namespace table and reading database
descriptors in high-priority transactions. The transactions will push
locks owned by schema changes out of their way. The idea is that regular
SQL transactions reading schema don't want to wait for the transactions
performing DDL statements. Instead, those DDLs will be pushed and forced
to refresh.

Besides the benefit to regular transactions, this patch also prevents
deadlocks for the transactions performing DDL. Before this patch, the
final select in the following sequence would deadlock:

begin; savepoint s; create database d; rollback to savepoint s;
select * from d.t;

The select is reading the namespace table, using a different txn from
its own. That read would block on the intent laid down by the prior
create. With this patch, the transaction effectively pushes itself, but
gets to otherwise run.

Fixes cockroachdb#46224

Release justification: Fix for an old bug that became more preminent
when we introduced SAVEPOINTs recently.

Release note (bug fix): A rare bug causing transactions that have performed
schema changes to deadlock after they restart has been fixed.
Prepartion of certain queries requires performing reads against the database.
If the user has already laid down intents, these reads may become part of a
dependency cycle. Prior to this commit, these reads would be on a different
transaction and thus this cycle would not be detected by our deadlock detection
mechanism.

This change opts to use the user's transaction for planning if there is one
and thus will properly interact with deadlock detection.

Fixes cockroachdb#46447.

Release justification: fixes for high-priority or high-severity bugs in
existing functionality

Release note: None
In cockroachdb#46588 a bug was introduced when a retriable error was encountered while
using a new transaction for preparing. Prior to that commit, all error were
treated as not retriable. This was sort of a bummer. Retriable errors can
occur due to read within uncertainty. Before this PR, those retriable errors
would make their way to the client. Now we'll handle those retry errors
internally underneath `connExecutor.prepare`

Fixes cockroachdb#43251

Release note: None
@ajwerner ajwerner force-pushed the backport19.2-46170-46384-46588-46829 branch from 3525108 to 75b2070 Compare April 2, 2020 03:17
@knz knz mentioned this pull request Apr 3, 2020
25 tasks
@knz
Copy link
Contributor

knz commented Apr 3, 2020

nb: the 19.2.6 branch would like to be cut today #46946, so if there was a way to merge this soon that would be swell.

@andreimatei
Copy link
Contributor

@andreimatei is there an additional commit that needs to be backported to deal with restarts and high priority transactions?

Don't think so.

LGTM

@ajwerner ajwerner merged commit ee75989 into cockroachdb:release-19.2 Apr 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants