Skip to content

Commit

Permalink
[yugabyte#13358] YSQL: Part 1 - YSQL DDL Atomicity
Browse files Browse the repository at this point in the history
Summary:
Currently the metadata for YSQL Tables is stored in both the PG catalog and DocDB schema. Sometimes when a DDL transaction fails, we can hit an issue where these two sources of metadata go out of sync. This is because the PG catalog is modified using the DocDB transactions framework, but the DocDB schema is stored in the Kudu-inherited structures in the YB-Master sys catalog. Thus when a transaction is aborted, only the changes to the PG catalog rollback, but the changes made to the DocDB schema cannot be rolled-back.

This patch contains the first phase of changes required to fix this issue. We now enforce that only one DDL that modifies the DocDB schema can occur on a table. This simplifies the problem because, if a DDL transaction starts, the DocDB schema now only needs to maintain two versions of the schema - the current committed schema, and the uncommitted schema of the ongoing transaction.

This patch then extends the Transaction GC framework introduced in commit: 88e9d59 to also handle ALTER operations and DROP operations. When a DDL transaction modifies the DocDB schema, the transaction metadata, and the current schema of the table is stored in the SysTablesEntryPB. YB-Master monitors the state of the transaction. Once the transaction ends, YB-Master detects whether the transaction was successful or not by comparing the DocDB schema and PG catalog. If the transaction is aborted, YB-Master will rollback the changes made to the DocDB schema using the state present in the SysTablesEntryPB.

**Correctness**
Thus, the two schemas can be described to be "eventually consistent" because we can have periods of time where both the schemas do not match.
However this mostly not affect correctness because of the following two properties:

  - When inconsistent, the DocDB schema always has more columns/constraints than PG schema.
  - Clients always use the PG schema (which is guaranteed to not return uncommitted state) to prepare their read/write requests.

These two properties ensure that we can't have orphaned data or failed integrity checks or use DDL entities created by uncommitted
transactions because of DocDB schema being inconsistent.

**Feature Flag:** ysql_ddl_rollback_enabled
Note that this flag is shared across pg backends and YB-Master. This flag is used by the PG backends to determine whether or not to perform best-effort rollback for ALTER TABLE ADD column introduced in 5a6bfcd
Note that we do not support changing this flag //**during**// a DDL operation.
If this flag is changed to false after a DDL but before the rollback completes, then all future DDL operations will ignore the presence of YsqlTxnVerifierState. Alter operations in particular will clear the YsqlTxnVerifierState as well.

Notes:
  - Since this patch contains only the first phase of changes, this feature is disabled by default using the flag `ysql_ddl_rollback_enabled`
  - This patch only handles Create Table, Alter Table Add Column, Alter Table Add Primary Key, Alter Table Rename Table/Column and Drop Table
  -  This patch does not have support for databases, materialized views, indexes, colocated tables, PG catalog tables and tablegroups.
  - GUC `yb_test_fail_next_ddl` has been changed to fail any upcoming DDL operation, not just CREATE TABLE ddl operations.

Design Doc: https://docs.google.com/document/d/1YAUnSV9X7Gv2ZVpILUe15NbXbVwgFvpV8IRaj6wFMZo/edit?usp=sharing

Test Plan: ybd --cxx-test pg_ddl_atomicity-test

Reviewers: ena, nicolas, myang

Reviewed By: nicolas, myang

Subscribers: zyu, pjain, jmaley, hsunder, yguan, dmitry, ybase, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D18398
  • Loading branch information
deeps1991 authored and jayant07-yb committed Dec 7, 2022
1 parent e875de7 commit e44b832
Show file tree
Hide file tree
Showing 39 changed files with 1,940 additions and 443 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -441,7 +441,7 @@ public void creatingSystemRelsAfterFailure() throws Exception {
+ ")";

stmt.execute("SET yb_test_fail_next_ddl TO true");
runInvalidQuery(stmt, ddlSql, "DDL failed as requested");
runInvalidQuery(stmt, ddlSql, "Failed DDL operation as requested");

// Letting CatalogManagerBgTasks do the cleanup.
Thread.sleep(BuildTypeUtil.adjustTimeout(5000));
Expand Down
22 changes: 14 additions & 8 deletions src/postgres/src/backend/commands/tablecmds.c
Original file line number Diff line number Diff line change
Expand Up @@ -991,9 +991,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
InvalidOid /* matviewPgTableId */);
}

/* For testing purposes, user might ask us to fail a DLL. */
YBTestFailDdlIfRequested();

/*
* Open the new relation and acquire exclusive lock on it. This isn't
* really necessary for locking out other backends (since they can't see
Expand Down Expand Up @@ -3896,12 +3893,21 @@ ATController(AlterTableStmt *parsetree,
}
PG_CATCH();
{
/* Rollback the DocDB changes. */
ListCell *lc = NULL;
foreach(lc, rollbackHandles)
if (!*YBCGetGFlags()->ysql_ddl_rollback_enabled ||
rel->yb_table_properties->is_colocated)
{
YBCPgStatement handle = (YBCPgStatement) lfirst(lc);
YBCExecAlterTable(handle, RelationGetRelid(rel));
/*
* The new way of doing ddl rollback is disabled (or unsupported,
* as in the case of colocated tables) fall back to the old way of
* doing a best-effort rollback which may not always succeed
* (e.g., in case of network failure or PG crash).
*/
ListCell *lc = NULL;
foreach(lc, rollbackHandles)
{
YBCPgStatement handle = (YBCPgStatement) lfirst(lc);
YBCExecAlterTable(handle, RelationGetRelid(rel));
}
}
PG_RE_THROW();
}
Expand Down
27 changes: 21 additions & 6 deletions src/postgres/src/backend/commands/ybccmds.c
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,6 @@
#include "catalog/pg_class.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_database.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_tablespace.h"
#include "catalog/pg_type.h"
#include "catalog/pg_type_d.h"
Expand Down Expand Up @@ -697,15 +696,31 @@ YBCDropTable(Relation relation)
false, /* if_exists */
&handle),
&not_found);
const bool valid_handle = !not_found;
if (valid_handle)
if (not_found)
{
return;
}
/*
* YSQL DDL Rollback is not yet supported for colocated tables.
*/
if (*YBCGetGFlags()->ysql_ddl_rollback_enabled &&
!yb_props->is_colocated)
{
/*
* We cannot abort drop in DocDB so postpone the execution until
* the rest of the statement/txn is finished executing.
* The following issues a request to the YB-Master to drop the
* table once this transaction commits.
*/
YBSaveDdlHandle(handle);
HandleYBStatusIgnoreNotFound(YBCPgExecDropTable(handle),
&not_found);
return;
}
/*
* YSQL DDL Rollback is disabled/unsupported. This means DocDB will not
* rollback the drop if the transaction ends up failing. We cannot
* abort drop in DocDB so postpone the execution until the rest of the
* statement/txn finishes executing.
*/
YBSaveDdlHandle(handle);
}
}

Expand Down
8 changes: 4 additions & 4 deletions src/postgres/src/backend/utils/misc/guc.c
Original file line number Diff line number Diff line change
Expand Up @@ -2023,8 +2023,8 @@ static struct config_bool ConfigureNamesBool[] =

{
{"yb_test_system_catalogs_creation", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Relaxes some internal sanity checks for system catalogs to "
"allow creating them."),
gettext_noop("Relaxes some internal sanity checks for system "
"catalogs to allow creating them."),
NULL,
GUC_NOT_IN_SAMPLE
},
Expand All @@ -2035,8 +2035,8 @@ static struct config_bool ConfigureNamesBool[] =

{
{"yb_test_fail_next_ddl", PGC_USERSET, DEVELOPER_OPTIONS,
gettext_noop("When set, the next DDL (only CREATE TABLE for now) "
"will fail right after DocDB processes the actual database structure change."),
gettext_noop("When set, the next DDL will fail right before "
"commit."),
NULL,
GUC_NOT_IN_SAMPLE
},
Expand Down
13 changes: 5 additions & 8 deletions src/postgres/src/backend/utils/misc/pg_yb_utils.c
Original file line number Diff line number Diff line change
Expand Up @@ -1127,6 +1127,11 @@ YBDecrementDdlNestingLevel(bool is_catalog_version_increment,
--ddl_transaction_state.nesting_level;
if (ddl_transaction_state.nesting_level == 0)
{
if (yb_test_fail_next_ddl)
{
yb_test_fail_next_ddl = false;
elog(ERROR, "Failed DDL operation as requested");
}
if (GetCurrentMemoryContext() == ddl_transaction_state.mem_context)
MemoryContextSwitchTo(ddl_transaction_state.mem_context->parent);
/*
Expand Down Expand Up @@ -1638,14 +1643,6 @@ void YBCFillUniqueIndexNullAttribute(YBCPgYBTupleIdDescriptor* descr) {
last_attr->is_null = true;
}

void YBTestFailDdlIfRequested() {
if (!yb_test_fail_next_ddl)
return;

yb_test_fail_next_ddl = false;
elog(ERROR, "DDL failed as requested");
}

void
YbTestGucBlockWhileStrEqual(char **actual, const char *expected,
const char *msg)
Expand Down
2 changes: 0 additions & 2 deletions src/postgres/src/include/pg_yb_utils.h
Original file line number Diff line number Diff line change
Expand Up @@ -565,8 +565,6 @@ YbTableProperties YbTryGetTableProperties(Relation rel);
*/
bool YBIsSupportedLibcLocale(const char *localebuf);

void YBTestFailDdlIfRequested();

/* Spin wait while test guc var actual equals expected. */
extern void YbTestGucBlockWhileStrEqual(char **actual, const char *expected,
const char *msg);
Expand Down
2 changes: 1 addition & 1 deletion src/yb/client/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ endif()
# and therefore don't need to worry about export strictness)
ADD_YB_LIBRARY(yb_client_test_util
SRCS client-test-util.cc
DEPS gmock gtest yb_client)
DEPS gmock gtest yb_client yb_test_util)

ADD_YB_LIBRARY(ql-dml-test-base
SRCS ql-dml-test-base.cc txn-test-base.cc snapshot_test_util.cc
Expand Down
12 changes: 11 additions & 1 deletion src/yb/client/client-internal.cc
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,7 @@ DEFINE_test_flag(string, assert_tablet_server_select_is_in_zone, "",
DECLARE_int64(reset_master_leader_timeout_ms);

DECLARE_string(flagfile);
DECLARE_bool(ysql_ddl_rollback_enabled);

namespace yb {

Expand Down Expand Up @@ -558,7 +559,8 @@ Status YBClient::Data::DeleteTable(YBClient* client,
const bool is_index_table,
CoarseTimePoint deadline,
YBTableName* indexed_table_name,
bool wait) {
bool wait,
const TransactionMetadata *txn) {
DeleteTableRequestPB req;
DeleteTableResponsePB resp;
int attempts = 0;
Expand All @@ -569,6 +571,14 @@ Status YBClient::Data::DeleteTable(YBClient* client,
if (!table_id.empty()) {
req.mutable_table()->set_table_id(table_id);
}
if (FLAGS_ysql_ddl_rollback_enabled && txn) {
// If 'txn' is set, this means this delete operation should actually result in the
// deletion of table data only if this transaction is a success. Therefore ensure that
// 'wait' is not set, because it makes no sense to wait for the deletion to complete if we want
// to postpone the deletion until end of transaction.
DCHECK(!wait);
txn->ToPB(req.mutable_transaction());
}
req.set_is_index_table(is_index_table);
const Status status = SyncLeaderMasterRpc(
deadline, req, &resp, "DeleteTable", &master::MasterDdlProxy::DeleteTableAsync,
Expand Down
14 changes: 8 additions & 6 deletions src/yb/client/client-internal.h
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@
#include "yb/common/common_net.pb.h"
#include "yb/common/entity_ids.h"
#include "yb/common/index.h"
#include "yb/common/transaction.h"
#include "yb/common/wire_protocol.h"

#include "yb/master/master_fwd.h"
Expand Down Expand Up @@ -137,12 +138,13 @@ class YBClient::Data {

// Take one of table id or name.
Status DeleteTable(YBClient* client,
const YBTableName& table_name,
const std::string& table_id,
bool is_index_table,
CoarseTimePoint deadline,
YBTableName* indexed_table_name,
bool wait = true);
const YBTableName& table_name,
const std::string& table_id,
bool is_index_table,
CoarseTimePoint deadline,
YBTableName* indexed_table_name,
bool wait = true,
const TransactionMetadata *txn = nullptr);

Status IsDeleteTableInProgress(YBClient* client,
const std::string& table_id,
Expand Down
85 changes: 85 additions & 0 deletions src/yb/client/client-test-util.cc
Original file line number Diff line number Diff line change
Expand Up @@ -43,29 +43,64 @@
#include <boost/function.hpp>

#include "yb/client/client_fwd.h"
#include "yb/client/client.h"
#include "yb/client/error.h"
#include "yb/client/schema.h"
#include "yb/client/session.h"
#include "yb/client/table_handle.h"
#include "yb/client/yb_op.h"
#include "yb/client/yb_table_name.h"

#include "yb/common/common.pb.h"
#include "yb/common/ql_type.h"
#include "yb/common/ql_value.h"

#include "yb/master/master_types.pb.h"

#include "yb/util/backoff_waiter.h"
#include "yb/util/enums.h"
#include "yb/util/monotime.h"
#include "yb/util/status.h"
#include "yb/util/status_callback.h"
#include "yb/util/status_log.h"
#include "yb/util/strongly_typed_bool.h"
#include "yb/util/test_macros.h"
#include "yb/util/test_util.h"

using std::string;

namespace yb {
namespace client {

namespace {
void VerifyNamespace(client::YBClient *client,
const NamespaceName& db_name,
const bool exists,
const int timeout_secs) {
ASSERT_OK(LoggedWaitFor([&]() -> Result<bool> {
Result<bool> ret = client->NamespaceExists(db_name, YQLDatabase::YQL_DATABASE_PGSQL);
WARN_NOT_OK(ResultToStatus(ret), "NamespaceExists call failed");
return ret.ok() && ret.get() == exists;
}, MonoDelta::FromSeconds(timeout_secs),
Format("Verify Namespace $0 $1 exists", db_name, (exists ? "" : "not"))));
}

void VerifyTable(client::YBClient* client,
const string& database_name,
const string& table_name,
const int timeout_secs,
const bool exists) {
ASSERT_OK(LoggedWaitFor([&]() -> Result<bool> {
auto ret = client->TableExists(
client::YBTableName(YQL_DATABASE_PGSQL, database_name, table_name));
WARN_NOT_OK(ResultToStatus(ret), "TableExists call failed");
return ret.ok() && ret.get() == exists;
}, MonoDelta::FromSeconds(timeout_secs),
Format("Verify Table $0 $1 exists in database $2",
table_name, (exists ? "" : "not"), database_name)));
}
} // namespace

void LogSessionErrorsAndDie(const FlushStatus& flush_status) {
const auto& s = flush_status.status;
CHECK(!s.ok());
Expand Down Expand Up @@ -158,5 +193,55 @@ std::shared_ptr<YBqlReadOp> CreateReadOp(
return op;
}

Result<string> GetNamespaceIdByNamespaceName(YBClient* client,
const string& namespace_name) {
const auto namespaces = VERIFY_RESULT(client->ListNamespaces(YQL_DATABASE_PGSQL));
for (const auto& ns : namespaces) {
if (ns.name() == namespace_name) {
return ns.id();
}
}
return STATUS_SUBSTITUTE(NotFound, "The namespace $0 does not exist", namespace_name);
}

Result<string> GetTableIdByTableName(client::YBClient* client,
const string& namespace_name,
const string& table_name) {
const auto tables = VERIFY_RESULT(client->ListTables());
for (const auto& t : tables) {
if (t.namespace_name() == namespace_name && t.table_name() == table_name) {
return t.table_id();
}
}
return STATUS_SUBSTITUTE(NotFound, "The table $0 does not exist in namespace $1",
table_name, namespace_name);
}

void VerifyNamespaceExists(YBClient *client,
const NamespaceName& db_name,
const int timeout_secs) {
VerifyNamespace(client, db_name, true /* exists */, timeout_secs);
}

void VerifyNamespaceNotExists(YBClient *client,
const NamespaceName& db_name,
const int timeout_secs) {
VerifyNamespace(client, db_name, false /* exists */, timeout_secs);
}

void VerifyTableExists(YBClient* client,
const string& database_name,
const string& table_name,
const int timeout_secs) {
VerifyTable(client, database_name, table_name, timeout_secs, true /* exists */);
}

void VerifyTableNotExists(YBClient* client,
const string& database_name,
const string& table_name,
const int timeout_secs) {
VerifyTable(client, database_name, table_name, timeout_secs, false /* exists */);
}

} // namespace client
} // namespace yb
18 changes: 18 additions & 0 deletions src/yb/client/client-test-util.h
Original file line number Diff line number Diff line change
Expand Up @@ -74,5 +74,23 @@ YBSchema YBSchemaFromSchema(const Schema& schema);
std::shared_ptr<YBqlReadOp> CreateReadOp(
int32_t key, const TableHandle& table, const std::string& value_column);

Result<std::string> GetNamespaceIdByNamespaceName(
YBClient* client, const std::string& namespace_name);

Result<std::string> GetTableIdByTableName(
YBClient* client, const std::string& namespace_name, const std::string& table_name);

void VerifyNamespaceExists(YBClient *client, const NamespaceName& db_name, int timeout_secs = 20);

void VerifyNamespaceNotExists(
YBClient *client, const NamespaceName& db_name, int timeout_secs = 20);

void VerifyTableExists(
YBClient* client, const std::string& db_name, const std::string& table_name, int timeout_secs);

void VerifyTableNotExists(
YBClient* client, const std::string& db_name, const std::string& table_name, int timeout_secs);


} // namespace client
} // namespace yb
8 changes: 6 additions & 2 deletions src/yb/client/client.cc
Original file line number Diff line number Diff line change
Expand Up @@ -633,14 +633,18 @@ Status YBClient::DeleteTable(const YBTableName& table_name, bool wait) {
wait);
}

Status YBClient::DeleteTable(const string& table_id, bool wait, CoarseTimePoint deadline) {
Status YBClient::DeleteTable(const string& table_id,
bool wait,
const TransactionMetadata *txn,
CoarseTimePoint deadline) {
return data_->DeleteTable(this,
YBTableName(),
table_id,
false /* is_index_table */,
PatchAdminDeadline(deadline),
nullptr /* indexed_table_name */,
wait);
wait,
txn);
}

Status YBClient::DeleteIndexTable(const YBTableName& table_name,
Expand Down
Loading

0 comments on commit e44b832

Please sign in to comment.