[YSQL] Deadlock found via DDL atomicity stress test #22882

Closed
1 task done
myang2021 opened this issue Jun 14, 2024 · 0 comments
Labels: 2024.1 Backport Required, area/ysql Yugabyte SQL (YSQL), kind/bug This issue is a bug, priority/medium Medium priority issue

myang2021 commented Jun 14, 2024

Jira Link: DB-11784

Description

Fix deadlock in DDL atomicity

While working on diff https://phorge.dev.yugabyte.com/D32682, I found a deadlock
that caused a test to be flaky. The deadlock involves three threads.

Thread A is executing `yb::RWCLock::UpgradeToCommitLock`:

```
void RWCLock::UpgradeToCommitLock() {
  lock_.lock();
  DCHECK(write_locked_);
#ifndef NDEBUG
  bool first_wait = true;
  while (reader_count_ > 0) {
    if (!no_readers_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
      std::ostringstream ss;
      ss << "Too long commit lock wait, num readers: " << reader_count_
         << ", current thread stack: " << GetStackTrace();
```

Thread A blocks because `reader_count_ > 0` while it holds the write lock:

```
  DCHECK(write_locked_);
```

Thread A is executing the workflow of an ALTER TABLE statement with the call chain:

```
yb::RWCLock::UpgradeToCommitLock
yb::CowObject<yb::master::PersistentTableInfo>::CommitMutation
yb::CowWriteLock<yb::master::PersistentTableInfo>::Commit
yb::master::CatalogManager::ClearYsqlDdlTxnState
yb::master::CatalogManager::HandleSuccessfulYsqlDdlTxn
yb::master::CatalogManager::YsqlDdlTxnCompleteCallbackInternal
```

From the test log, I see that the thread (Thread B) that caused `reader_count_ > 0` is
calling `GetTableSchemaInternal`:

Thread B:

```
Status CatalogManager::GetTableSchemaInternal(const GetTableSchemaRequestPB* req,
                                              GetTableSchemaResponsePB* resp,
                                              bool get_fully_applied_indexes) {
  VLOG(1) << "Servicing GetTableSchema request for " << req->ShortDebugString();

  // Lookup the table and verify if it exists.
  TRACE("Looking up table");
  scoped_refptr<TableInfo> table = VERIFY_RESULT(FindTable(req->table()));

  TRACE("Locking table");
  auto l = table->LockForRead();

......
  auto nsinfo = FindNamespaceById(table->namespace_id());
```

Thread B causes `reader_count_ > 0` via `table->LockForRead()`, and it is blocked
inside `FindNamespaceById` waiting for a shared lock on `mutex_` of `CatalogManager`:

```
Result<scoped_refptr<NamespaceInfo>> CatalogManager::FindNamespaceById(
    const NamespaceId& id) const {
  SharedLock lock(mutex_);
  return FindNamespaceByIdUnlocked(id);
}
```

The `mutex_` is held exclusively by Thread C in `CatalogManager::CreateTable`:

```
  {
    LockGuard lock(mutex_);
    auto ns_lock = ns->LockForRead();
......
        if (IsIndex(req)) {
          TRACE("Locking indexed table");
          indexed_table_write_lock = indexed_table->LockForWrite();
        }
......
  }
```

Note that Thread C exclusively holds `mutex_` via `LockGuard` and is blocked in
`LockForWrite`, which calls `RWCLock::WriteLock` and cannot proceed because
`write_locked_` is set (owned by Thread A):

```
void RWCLock::WriteLock() {
  ThreadRestrictions::AssertWaitAllowed();

  MutexLock l(lock_);
  // Wait for any other mutations to finish.
#ifndef NDEBUG
  bool first_wait = true;
  while (write_locked_) {
    if (!no_mutators_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
```

In summary, we have a deadlock cycle:

- Thread A (holds `write_locked_`) waits for Thread B to release `reader_count_`.
- Thread B (holds `reader_count_`) waits for Thread C to release `mutex_`.
- Thread C (holds `mutex_`) waits for Thread A to release `write_locked_`.
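
For illustration, here is a minimal, self-contained model of the same wait-for cycle using only standard C++ primitives. The names (`table_lock`, `catalog_mutex`, `no_readers`, `no_mutators`) are simplified stand-ins for the `RWCLock` and `CatalogManager` members; this is a sketch of the cycle, not YugabyteDB code, and running it is expected to hang with exactly the A -> B -> C -> A dependency described above.

```
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <shared_mutex>
#include <thread>

using namespace std::chrono_literals;

std::mutex table_lock;                 // models RWCLock::lock_
std::condition_variable no_readers;    // models RWCLock::no_readers_
std::condition_variable no_mutators;   // models RWCLock::no_mutators_
int reader_count = 0;                  // models RWCLock::reader_count_
bool write_locked = false;             // models RWCLock::write_locked_
std::shared_mutex catalog_mutex;       // models CatalogManager::mutex_

void ThreadA() {  // ALTER TABLE: WriteLock() already taken, now UpgradeToCommitLock()
  {
    std::lock_guard<std::mutex> l(table_lock);
    write_locked = true;               // holds the write side
  }
  std::this_thread::sleep_for(100ms);  // let Thread B take its read lock first
  std::unique_lock<std::mutex> l(table_lock);
  no_readers.wait(l, [] { return reader_count == 0; });   // waits for Thread B
}

void ThreadB() {  // GetTableSchemaInternal: LockForRead(), then FindNamespaceById()
  {
    std::lock_guard<std::mutex> l(table_lock);
    ++reader_count;                    // table->LockForRead()
  }
  std::this_thread::sleep_for(200ms);  // let Thread C take catalog_mutex first
  std::shared_lock<std::shared_mutex> ns(catalog_mutex);  // waits for Thread C
  {
    std::lock_guard<std::mutex> l(table_lock);
    --reader_count;                    // never reached
  }
  no_readers.notify_all();
}

void ThreadC() {  // CreateTable: LockGuard(mutex_), then indexed_table->LockForWrite()
  std::this_thread::sleep_for(150ms);
  std::unique_lock<std::shared_mutex> g(catalog_mutex);   // LockGuard lock(mutex_)
  std::unique_lock<std::mutex> l(table_lock);
  no_mutators.wait(l, [] { return !write_locked; });      // waits for Thread A
}

int main() {
  std::thread a(ThreadA), b(ThreadB), c(ThreadC);
  a.join(); b.join(); c.join();        // never returns: A -> B -> C -> A cycle
}
```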

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@myang2021 myang2021 added the area/ysql Yugabyte SQL (YSQL) label Jun 14, 2024
@myang2021 myang2021 self-assigned this Jun 14, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 14, 2024
myang2021 added a commit that referenced this issue Jun 18, 2024
Summary:
While working on https://phorge.dev.yugabyte.com/D32682, the test
PgDdlAtomicityColocatedStressTest.TestFailDdlRollback showed logs like the following:

```
[m-1] W0614 20:35:35.849383 26140 ysql_ddl_handler.cc:551] YsqlDdlTxnCompleteCallback failed: Corruption (yb/util/uuid.cc:312): Invalid length of binary data with TransactionId '': 0 (expected 16)
```

The relevant code is:
```
      const string pb_txn_id = table->LockForRead()->pb_transaction_id();
      return background_tasks_thread_pool_->SubmitFunc(
        [this, pb_txn_id, is_committed, epoch]() {
            WARN_NOT_OK(YsqlDdlTxnCompleteCallback(pb_txn_id, is_committed, epoch),
                        "YsqlDdlTxnCompleteCallback failed");
        }
      );
```

and
```
Status CatalogManager::YsqlDdlTxnCompleteCallback(const string& pb_txn_id,
                                                  bool is_committed,
                                                  const LeaderEpoch& epoch) {

  SleepFor(MonoDelta::FromMicroseconds(RandomUniformInt<int>(0,
    FLAGS_TEST_ysql_max_random_delay_before_ddl_verification_usecs)));

  auto txn = VERIFY_RESULT(FullyDecodeTransactionId(pb_txn_id));

```

It is possible that the `pb_txn_id` of `table` has already been cleared, in which case
`pb_txn_id` is an empty string. The error is reported when the empty string is
passed to `FullyDecodeTransactionId`.

I made a fix to not call `YsqlDdlTxnCompleteCallback` if all the tables in the
DDL transaction verifier state have had their `pb_txn_id` cleared. In other
words, we only call `YsqlDdlTxnCompleteCallback` when we can find a table
in the DDL transaction verifier state that still has a `pb_txn_id`, and that
`pb_txn_id` is the transaction id we pass to `YsqlDdlTxnCompleteCallback` to process
all the tables in this DDL transaction.
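
The following is a minimal sketch of that guard (illustrative only, not the literal patch; `tables_in_txn` is a hypothetical name for the collection of tables tracked by the verifier state):

```
// Illustrative sketch only: skip scheduling the callback when no table tracked
// by the DDL verifier state still carries a transaction id.
string pb_txn_id;
for (const auto& table : tables_in_txn) {
  auto txn_id = table->LockForRead()->pb_transaction_id();
  if (!txn_id.empty()) {
    pb_txn_id = txn_id;  // found a table whose pb_txn_id is not yet cleared
    break;
  }
}
if (pb_txn_id.empty()) {
  // Every table's pb_txn_id has already been cleared; there is nothing to verify.
  return Status::OK();
}
return background_tasks_thread_pool_->SubmitFunc(
    [this, pb_txn_id, is_committed, epoch]() {
      WARN_NOT_OK(YsqlDdlTxnCompleteCallback(pb_txn_id, is_committed, epoch),
                  "YsqlDdlTxnCompleteCallback failed");
    });
```
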
Jira: DB-11784

Test Plan:
Tested in the context of https://phorge.dev.yugabyte.com/D32682.

./yb_build.sh --sj --cxx-test pg_ddl_atomicity_stress-test --gtest_filter=PgDdlAtomicityColocatedStressTest.TestFailDdlRollback -n 20 --tp 2

Reviewers: fizaa

Reviewed By: fizaa

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D35879
karthik-ramanathan-3006 pushed a commit to karthik-ramanathan-3006/yugabyte-db that referenced this issue Jun 24, 2024
Summary:
While working on https://phorge.dev.yugabyte.com/D32682, the test
PgDdlAtomicityColocatedStressTest.TestFailDdlRollback showed logs like the following:

```
[m-1] W0614 20:35:35.849383 26140 ysql_ddl_handler.cc:551] YsqlDdlTxnCompleteCallback failed: Corruption (yb/util/uuid.cc:312): Invalid length of binary data with TransactionId '': 0 (expected 16)
```

The relevant code is:
```
      const string pb_txn_id = table->LockForRead()->pb_transaction_id();
      return background_tasks_thread_pool_->SubmitFunc(
        [this, pb_txn_id, is_committed, epoch]() {
            WARN_NOT_OK(YsqlDdlTxnCompleteCallback(pb_txn_id, is_committed, epoch),
                        "YsqlDdlTxnCompleteCallback failed");
        }
      );
```

and
```
Status CatalogManager::YsqlDdlTxnCompleteCallback(const string& pb_txn_id,
                                                  bool is_committed,
                                                  const LeaderEpoch& epoch) {

  SleepFor(MonoDelta::FromMicroseconds(RandomUniformInt<int>(0,
    FLAGS_TEST_ysql_max_random_delay_before_ddl_verification_usecs)));

  auto txn = VERIFY_RESULT(FullyDecodeTransactionId(pb_txn_id));

```

It is possible that the `pb_txn_id` of `table` has already been cleared, in which case
`pb_txn_id` is an empty string. The error is reported when the empty string is
passed to `FullyDecodeTransactionId`.

I made a fix to not call `YsqlDdlTxnCompleteCallback` if all the tables in the
DDL transaction verifier state have had their `pb_txn_id` cleared. In other
words, we only call `YsqlDdlTxnCompleteCallback` when we can find a table
in the DDL transaction verifier state that still has a `pb_txn_id`, and that
`pb_txn_id` is the transaction id we pass to `YsqlDdlTxnCompleteCallback` to process
all the tables in this DDL transaction.
Jira: DB-11784

Test Plan:
Tested in the context of https://phorge.dev.yugabyte.com/D32682.

./yb_build.sh --sj --cxx-test pg_ddl_atomicity_stress-test --gtest_filter=PgDdlAtomicityColocatedStressTest.TestFailDdlRollback -n 20 --tp 2

Reviewers: fizaa

Reviewed By: fizaa

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D35879
myang2021 added a commit that referenced this issue Jun 26, 2024
Summary:
While working on diff https://phorge.dev.yugabyte.com/D32682, I found a deadlock
that caused a test to be flaky. The deadlock involves three threads.

Thread A is executing yb::RWCLock::UpgradeToCommitLock:
```
void RWCLock::UpgradeToCommitLock() {
  lock_.lock();
  DCHECK(write_locked_);
#ifndef NDEBUG
  bool first_wait = true;
  while (reader_count_ > 0) {
    if (!no_readers_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
      std::ostringstream ss;
      ss << "Too long commit lock wait, num readers: " << reader_count_
         << ", current thread stack: " << GetStackTrace();
```

Thread A blocks because `reader_count_ > 0`, while it holds the write lock:

```
  DCHECK(write_locked_);
```

Thread A is executing the workflow of an ALTER TABLE statement with the call chain:

```
yb::RWCLock::UpgradeToCommitLock
yb::CowObject<yb::master::PersistentTableInfo>::CommitMutation
yb::CowWriteLock<yb::master::PersistentTableInfo>::Commit
yb::master::CatalogManager::ClearYsqlDdlTxnState
yb::master::CatalogManager::HandleSuccessfulYsqlDdlTxn
yb::master::CatalogManager::YsqlDdlTxnCompleteCallbackInternal

```
From the test log, I see that the thread (Thread B) that caused `reader_count_ > 0` is
calling `GetTableSchemaInternal`:

Thread B:

```
Status CatalogManager::GetTableSchemaInternal(const GetTableSchemaRequestPB* req,
                                              GetTableSchemaResponsePB* resp,
                                              bool get_fully_applied_indexes) {
  VLOG(1) << "Servicing GetTableSchema request for " << req->ShortDebugString();

  // Lookup the table and verify if it exists.
  TRACE("Looking up table");
  scoped_refptr<TableInfo> table = VERIFY_RESULT(FindTable(req->table()));

  TRACE("Locking table");
  auto l = table->LockForRead();

......
  auto nsinfo = FindNamespaceById(table->namespace_id());
```

Thread B causes `reader_count_ > 0` via `table->LockForRead()` and it is blocked
inside `FindNamespaceById` in order to get a shared lock on `mutex_` of `CatalogManager`:

```
Result<scoped_refptr<NamespaceInfo>> CatalogManager::FindNamespaceById(
    const NamespaceId& id) const {
  SharedLock lock(mutex_);
  return FindNamespaceByIdUnlocked(id);
}
```

The `mutex_` is held exclusively by Thread C from `CatalogManager::CreateTable`:
```
  {
    LockGuard lock(mutex_);
    auto ns_lock = ns->LockForRead();
......
        if (IsIndex(req)) {
          TRACE("Locking indexed table");
          indexed_table_write_lock = indexed_table->LockForWrite();
        }
......
  }
```

Note that Thread C exclusively holds `mutex_` via `LockGuard` and is blocked in
`LockForWrite`, which calls `RWCLock::WriteLock` and cannot proceed because
`write_locked_` is set (owned by Thread A).

```
void RWCLock::WriteLock() {
  ThreadRestrictions::AssertWaitAllowed();

  MutexLock l(lock_);
  // Wait for any other mutations to finish.
#ifndef NDEBUG
  bool first_wait = true;
  while (write_locked_) {
    if (!no_mutators_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
```

In summary, we have a deadlock cycle:
Thread A (holds write_locked_) waits for Thread B to release reader_count_
Thread B (holds reader_count_) waits for Thread C to release mutex_
Thread C (holds mutex_) waits for Thread A to release write_locked_

To break the deadlock cycle, I made changes in Thread C so that it releases
the exclusive lock on `mutex_` before calling `LockForWrite`. In addition,
it does not need to re-acquire `mutex_` after `LockForWrite`, because the rest
of the code in the relevant scope does not need to be protected by an exclusive
lock on `mutex_`.
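
A sketch of the principle behind this change, using standard primitives rather than the actual `CatalogManager` types (an assumption-laden illustration, not the literal patch to `CatalogManager::CreateTable`):

```
// Take the broad catalog mutex only for the lookups that need it, release it,
// and only then block on the per-table write lock.  This removes the
// "holds mutex_, waits for write_locked_" edge of the deadlock cycle.
#include <mutex>
#include <shared_mutex>

std::shared_mutex catalog_mutex;  // models CatalogManager::mutex_
std::mutex table_write_lock;      // simplified stand-in for the table's write side

void CreateTableSketch() {
  {
    std::unique_lock<std::shared_mutex> guard(catalog_mutex);  // LockGuard lock(mutex_)
    // ... namespace checks and table lookups that genuinely need mutex_ ...
  }  // catalog_mutex released before any potentially long wait

  // indexed_table->LockForWrite() equivalent: this may still wait on a concurrent
  // ALTER TABLE, but catalog readers are no longer blocked behind this thread,
  // so the committing thread can make progress and eventually release the lock.
  std::unique_lock<std::mutex> write_guard(table_write_lock);
  // ... proceed with the index setup that needs the table write lock ...
}
```
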
Jira: DB-11784

Test Plan: Tested in the context of https://phorge.dev.yugabyte.com/D32682.

Reviewers: hsunder, asrivastava, zdrudi

Reviewed By: asrivastava, zdrudi

Subscribers: zdrudi, ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D35843
jasonyb pushed a commit that referenced this issue Jun 27, 2024
Summary:
 411a32e [DB-11813] Rename ysql_conn_mgr_idle_or_pending_clients metric name
 c2e13ef [#15682] YSQL: Fix stack_is_too_deep function in ASAN
 ef31455 [PLAT-14188] Fixing upgrade disk availability check
 db6b1b7 [#23004] YCQL: Fix tserver crash due to NULL pointer dereference
 0ada80a [PLAT-14433] Use correct kubeconfig for edit provider validation
 eccbc10 [PLAT-14414] Enable Kubernetes provider validation by default
 199f679 [PLAT-14324]: Move all node agent based flags from BETA to INTERNAL in Provider Conf keys file
 86a865d [PLAT-14443] YBA Installer wait for ready time configurable.
 ac184a8 [#22882] YSQL: Fix deadlock in DDL atomicity
 a4218fb [Docs] Sort feature to tables (Where fulfills the criteria) (#22836)
 2f267ca [#22996] xCluster: Add SOURCE_UNREACHABLE and SYSTEM_ERROR enums

Skipped due to conflict:
dee7691 [#21534] docdb: Set owner correctly for cloned databases
34632ba [PLAT-14495] Set up the node_exporter for ynp
7c99ff9 [#22876][#22773] CDCSDK: Add new yb-admin command to remove user table from CDCSDK stream
4e9a81c [#22876][#22835][#22773] CDCSDK: Remove non-eligible tables for CDC from existing CDCSDK stream
f2e574e [#23013] xClusterDDLRepl: Allow table_ids for GetXClusterStreams

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: tfoucher, sanketh, jenkins-bot

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D36184
myang2021 added a commit that referenced this issue Jun 27, 2024
Summary:
While working on https://phorge.dev.yugabyte.com/D32682, the test
PgDdlAtomicityColocatedStressTest.TestFailDdlRollback showed logs like the following:

```
[m-1] W0614 20:35:35.849383 26140 ysql_ddl_handler.cc:551] YsqlDdlTxnCompleteCallback failed: Corruption (yb/util/uuid.cc:312): Invalid length of binary data with TransactionId '': 0 (expected 16)
```

The relevant code is:
```
      const string pb_txn_id = table->LockForRead()->pb_transaction_id();
      return background_tasks_thread_pool_->SubmitFunc(
        [this, pb_txn_id, is_committed, epoch]() {
            WARN_NOT_OK(YsqlDdlTxnCompleteCallback(pb_txn_id, is_committed, epoch),
                        "YsqlDdlTxnCompleteCallback failed");
        }
      );
```

and
```
Status CatalogManager::YsqlDdlTxnCompleteCallback(const string& pb_txn_id,
                                                  bool is_committed,
                                                  const LeaderEpoch& epoch) {

  SleepFor(MonoDelta::FromMicroseconds(RandomUniformInt<int>(0,
    FLAGS_TEST_ysql_max_random_delay_before_ddl_verification_usecs)));

  auto txn = VERIFY_RESULT(FullyDecodeTransactionId(pb_txn_id));

```

It is possible that the `pb_txn_id` of `table` has already been cleared, in which case
`pb_txn_id` is an empty string. The error is reported when the empty string is
passed to `FullyDecodeTransactionId`.

I made a fix to not call `YsqlDdlTxnCompleteCallback` if all the tables in the
DDL transaction verifier state have had their `pb_txn_id` cleared. In other
words, we only call `YsqlDdlTxnCompleteCallback` when we can find a table
in the DDL transaction verifier state that still has a `pb_txn_id`, and that
`pb_txn_id` is the transaction id we pass to `YsqlDdlTxnCompleteCallback` to process
all the tables in this DDL transaction.
Jira: DB-11784

Original commit: ec88a5d / D35879

Test Plan:
Tested in the context of https://phorge.dev.yugabyte.com/D32682.

./yb_build.sh --sj --cxx-test pg_ddl_atomicity_stress-test --gtest_filter=PgDdlAtomicityColocatedStressTest.TestFailDdlRollback -n 20 --tp 2

Reviewers: fizaa

Reviewed By: fizaa

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D36192
myang2021 added a commit that referenced this issue Jun 28, 2024
Summary:
While working on diff https://phorge.dev.yugabyte.com/D32682, I found a deadlock
that caused a test to be flaky. The deadlock involves three threads.

Thread A is executing yb::RWCLock::UpgradeToCommitLock:
```
void RWCLock::UpgradeToCommitLock() {
  lock_.lock();
  DCHECK(write_locked_);
  bool first_wait = true;
  while (reader_count_ > 0) {
    if (!no_readers_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
      std::ostringstream ss;
      ss << "Too long commit lock wait, num readers: " << reader_count_
         << ", current thread stack: " << GetStackTrace();
```

Thread A blocks because `reader_count_ > 0`, while it holds the write lock:

```
  DCHECK(write_locked_);
```

Thread A is executing the workflow of an ALTER TABLE statement with the call chain:

```
yb::RWCLock::UpgradeToCommitLock
yb::CowObject<yb::master::PersistentTableInfo>::CommitMutation
yb::CowWriteLock<yb::master::PersistentTableInfo>::Commit
yb::master::CatalogManager::ClearYsqlDdlTxnState
yb::master::CatalogManager::HandleSuccessfulYsqlDdlTxn
yb::master::CatalogManager::YsqlDdlTxnCompleteCallbackInternal

```
From the test log, I see that the thread (Thread B) that caused `reader_count_ > 0` is
calling `GetTableSchemaInternal`:

Thread B:

```
Status CatalogManager::GetTableSchemaInternal(const GetTableSchemaRequestPB* req,
                                              GetTableSchemaResponsePB* resp,
                                              bool get_fully_applied_indexes) {
  VLOG(1) << "Servicing GetTableSchema request for " << req->ShortDebugString();

  // Lookup the table and verify if it exists.
  TRACE("Looking up table");
  scoped_refptr<TableInfo> table = VERIFY_RESULT(FindTable(req->table()));

  TRACE("Locking table");
  auto l = table->LockForRead();

......
  auto nsinfo = FindNamespaceById(table->namespace_id());
```

Thread B causes `reader_count_ > 0` via `table->LockForRead()` and it is blocked
inside `FindNamespaceById` in order to get a shared lock on `mutex_` of `CatalogManager`:

```
Result<scoped_refptr<NamespaceInfo>> CatalogManager::FindNamespaceById(
    const NamespaceId& id) const {
  SharedLock lock(mutex_);
  return FindNamespaceByIdUnlocked(id);
}
```

The `mutex_` is held exclusively by Thread C from `CatalogManager::CreateTable`:
```
  {
    LockGuard lock(mutex_);
    auto ns_lock = ns->LockForRead();
......
        if (IsIndex(req)) {
          TRACE("Locking indexed table");
          indexed_table_write_lock = indexed_table->LockForWrite();
        }
......
  }
```

Note that Thread C exclusively holds `mutex_` via `LockGuard` and is blocked in
`LockForWrite`, which calls `RWCLock::WriteLock` and cannot proceed because
`write_locked_` is set (owned by Thread A).

```
void RWCLock::WriteLock() {
  ThreadRestrictions::AssertWaitAllowed();

  MutexLock l(lock_);
  // Wait for any other mutations to finish.
  bool first_wait = true;
  while (write_locked_) {
    if (!no_mutators_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
```

In summary, we have a deadlock cycle:
Thread A (holds write_locked_) waits for Thread B to release reader_count_
Thread B (holds reader_count_) waits for Thread C to release mutex_
Thread C (holds mutex_) waits for Thread A to release write_locked_

To break the deadlock cycle, I made changes in Thread C so that it releases
the exclusive lock on `mutex_` before calling `LockForWrite`. In addition,
it does not need to re-acquire `mutex_` after `LockForWrite`, because the rest
of the code in the relevant scope does not need to be protected by an exclusive
lock on `mutex_`.
Jira: DB-11784

Original commit: ac184a8 / D35843

Test Plan: Tested in the context of https://phorge.dev.yugabyte.com/D32682.

Reviewers: hsunder, asrivastava, zdrudi

Reviewed By: asrivastava

Subscribers: yql, ybase, zdrudi

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D36229