Skip to content

Commit

Permalink
[#22882] YSQL: Fix unexpected DDL atomicity log message
Browse files Browse the repository at this point in the history
Summary:
In working on https://phorge.dev.yugabyte.com/D32682, the test
PgDdlAtomicityColocatedStressTest.TestFailDdlRollback showed logs like the following:

```
[m-1] W0614 20:35:35.849383 26140 ysql_ddl_handler.cc:551] YsqlDdlTxnCompleteCallback failed: Corruption (yb/util/uuid.cc:312): Invalid length of binary data with TransactionId '': 0 (expected 16)
```

The relevant code are:
```
      const string pb_txn_id = table->LockForRead()->pb_transaction_id();
      return background_tasks_thread_pool_->SubmitFunc(
        [this, pb_txn_id, is_committed, epoch]() {
            WARN_NOT_OK(YsqlDdlTxnCompleteCallback(pb_txn_id, is_committed, epoch),
                        "YsqlDdlTxnCompleteCallback failed");
        }
      );
```

and
```
Status CatalogManager::YsqlDdlTxnCompleteCallback(const string& pb_txn_id,
                                                  bool is_committed,
                                                  const LeaderEpoch& epoch) {

  SleepFor(MonoDelta::FromMicroseconds(RandomUniformInt<int>(0,
    FLAGS_TEST_ysql_max_random_delay_before_ddl_verification_usecs)));

  auto txn = VERIFY_RESULT(FullyDecodeTransactionId(pb_txn_id));

```

It is possible that `pb_txn_id` of `table` is already cleared. In this case
`pb_txn_id` is an empty string. The error is reported when an empty string is
passed to `FullyDecodeTransactionId`.

I made a fix not to call `YsqlDdlTxnCompleteCallback` if all the tables in the
DDL transaction verifier state have their `pb_txn_id` already cleared. In other
words, we only call `YsqlDdlTxnCompleteCallback` when we can find a table
in the DDL transaction verifier state that has `pb_txn_id` and this `pb_txn_id`
is the transaction id that we pass to `YsqlDdlTxnCompleteCallback` to process
all the tables in this DDL transaction.
Jira: DB-11784

Test Plan:
Tested in the context of https://phorge.dev.yugabyte.com/D32682.

./yb_build.sh --sj --cxx-test pg_ddl_atomicity_stress-test --gtest_filter=PgDdlAtomicityColocatedStressTest.TestFailDdlRollback -n 20 --tp 2

Reviewers: fizaa

Reviewed By: fizaa

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D35879
  • Loading branch information
myang2021 committed Jun 18, 2024
1 parent 29916ed commit ec88a5d
Showing 1 changed file with 20 additions and 7 deletions.
27 changes: 20 additions & 7 deletions src/yb/master/ysql_ddl_handler.cc
Original file line number Diff line number Diff line change
Expand Up @@ -379,6 +379,7 @@ Status CatalogManager::YsqlDdlTxnAlterTableHelper(const YsqlTableDdlTxnState txn
table_pb.set_state_msg(
strings::Substitute("Alter table version=$0 ts=$1", table_pb.version(), LocalTimeAsString()));

VLOG(3) << "Clearing ysql_ddl_txn_verifier_state from table " << txn_data.table->id();
table_pb.clear_ysql_ddl_txn_verifier_state();
table_pb.clear_transaction();

Expand Down Expand Up @@ -500,20 +501,32 @@ Status CatalogManager::TriggerDdlVerificationIfNeeded(
return Status::OK();
}

table = verifier_state->tables.front();
if (verifier_state->txn_state != TxnState::kUnknown) {
// We already know whether this transaction is a success or a failure. We don't need to poll
// the transaction coordinator at this point. We can simply invoke post DDL verification
// directly.
const bool is_committed = verifier_state->txn_state == TxnState::kCommitted;
const string pb_txn_id = table->LockForRead()->pb_transaction_id();
return background_tasks_thread_pool_->SubmitFunc(
[this, pb_txn_id, is_committed, epoch]() {
WARN_NOT_OK(YsqlDdlTxnCompleteCallback(pb_txn_id, is_committed, epoch),
"YsqlDdlTxnCompleteCallback failed");
string pb_txn_id;
vector<TableId> table_ids;
for (const auto& table : verifier_state->tables) {
table_ids.push_back(table->id());
pb_txn_id = table->LockForRead()->pb_transaction_id();
if (pb_txn_id.empty()) {
continue;
}
);
return background_tasks_thread_pool_->SubmitFunc(
[this, table, pb_txn_id, is_committed, epoch]() {
WARN_NOT_OK(YsqlDdlTxnCompleteCallback(pb_txn_id, is_committed, epoch),
Format("YsqlDdlTxnCompleteCallback failed, table: ",
table->id()));
}
);
}
VLOG(3) << "All tables " << VectorToString(table_ids)
<< " in transaction " << txn << " have pb_txn_id cleared";
return Status::OK();
}
table = verifier_state->tables.front();
}

// Schedule transaction verification.
Expand Down

0 comments on commit ec88a5d

Please sign in to comment.