…olerance for DDLs

Summary:
Maintain the YSQL catalog version in a (new) system table for new clusters.
The table schema is:
```
          Table "pg_catalog.pg_yb_catalog_version"
        Column         |  Type  | Collation | Nullable | Default
-----------------------+--------+-----------+----------+---------
 db_oid                | oid    |           | not null |
 current_version       | bigint |           | not null |
 last_breaking_version | bigint |           | not null |
Indexes:
    "pg_yb_catalog_version_db_oid_index" PRIMARY KEY,
```
This means we now:
1. Only modify the version number if the overall DDL statement (txn) committed successfully (fixing an important correctness bug with version management).
2. Only increment the version number (at most) once per change (minimizing version mismatch errors).
3. Explicitly distinguish safe vs unsafe changes, so we only need to force a refresh during a transaction if necessary (further minimizing version mismatch errors).

Additionally, adjust the order in which internal operations are executed, following this design doc:
https://github.com/yugabyte/yugabyte-db/blob/master/architecture/design/online-schema-migrations.md
The goal is to guarantee that under any failure scenario the cluster is left in a consistent state.

Also added a new (master) test gflag `TEST_ysql_catalog_write_rejection_percentage` that rejects writes to YSQL system catalog tables at the configured percentage (default `0`).

Future work:
1. Implement the upgrade path for old clusters to safely migrate to this new versioning method.
2. Maintain the version per database instead of one per cluster (the schema is already set up for that).
3. Once more schema changes are handled by the online schema migrations plan, adjust their version increment (to become non-breaking) to fully avoid the DDL mismatch errors.
4. While all failures should leave the cluster in a consistent state, some may leave behind orphan DocDB relations. These are harmless from a correctness perspective due to OID/UUID uniqueness, but we should have a simple (perhaps automated) way to clean them up.
5. Consider how to handle truncate (in particular when used as part of drop colocated table).

Test Plan:
1. Existing Jenkins tests
2. Added TestPgDdlFaultTolerance.java
3. Run sysbench with more than one thread.

Reviewers: neha, neil, alex, jason

Reviewed By: alex, jason

Subscribers: alex, vgopalan, mikhail, nicolas, yql, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D8729
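
Illustration (not part of the change itself): a minimal Python sketch of how the two version columns could interact. It assumes that every committed DDL bumps `current_version` once, that only unsafe (breaking) changes advance `last_breaking_version`, and that a session must refresh only when its cached version is older than `last_breaking_version`. The class and function names are hypothetical, and the comparison rule is an assumption; the summary above does not spell it out.

```python
from dataclasses import dataclass


@dataclass
class CatalogVersion:
    """Hypothetical in-memory stand-in for one pg_yb_catalog_version row."""
    current_version: int = 1
    last_breaking_version: int = 1

    def commit_ddl(self, breaking: bool) -> None:
        # Assumption: called only after the DDL transaction has committed,
        # so a failed DDL never changes the version.
        self.current_version += 1  # at most one bump per change
        if breaking:
            # Unsafe change: sessions with an older cache must refresh.
            self.last_breaking_version = self.current_version


def must_refresh(cached_version: int, row: CatalogVersion) -> bool:
    """Assumed rule: a refresh is forced only past a breaking change."""
    return cached_version < row.last_breaking_version


row = CatalogVersion()
cached = row.current_version       # a session caches the catalog at version 1

row.commit_ddl(breaking=False)     # safe change: version moves, no forced refresh
safe_needs_refresh = must_refresh(cached, row)

row.commit_ddl(breaking=True)      # unsafe change: stale sessions must refresh
breaking_needs_refresh = must_refresh(cached, row)
```

Under these assumptions, a session that cached the catalog before a safe change can keep running mid-transaction, while the same session is forced to refresh after a breaking change.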