…e from CDCSDK stream

Summary:
This diff introduces three new yb-admin commands required to remove a **user table** from a CDCSDK stream.

**`NOTE: All three commands are only meant to be used on CDC streams that are not associated with a replication slot.`**

**Command-1**: yb-admin command to disable dynamic table addition in a CDC stream. It only works when the new auto flag `cdcsdk_enable_dynamic_tables_disable_option` is set to true. **Note: after this command is executed, no dynamic tables (user/non-user) will be added to the CDC stream. Additionally, there is no option to re-enable dynamic table addition for the stream.**
```
yb-admin \
    -master_addresses <master-addresses> \
    disable_dynamic_table_addition_in_change_data_stream <stream_id>
```
The command works with a single stream_id.

**Command-2**: yb-admin command to remove a particular **user** table from the CDC stream metadata and to update the checkpoint of the corresponding state table entries to OpId max. Since the checkpoint is set to max, these entries will later be deleted from the cdc state table by a separate thread (UpdatePeersAndMetrics).
```
yb-admin \
    -master_addresses <master-addresses> \
    remove_user_table_from_change_data_stream <stream_id> <table_id>
```
The command works with a single stream_id and table_id.

**Command-3**: yb-admin command to validate the cdc state table entries for a particular stream. As part of validation, if the table of any cdc state table entry is not present in the CDC stream metadata, the checkpoint of that entry is updated to OpId max, and the entry is later deleted by a separate thread (UpdatePeersAndMetrics).
```
yb-admin \
    -master_addresses <master-addresses> \
    validate_and_sync_cdc_state_table_entries_on_change_data_stream <stream_id>
```
The command works with a single stream_id.
**Advisory for command-usage:**
General guidelines that need to be strictly followed while executing these commands:
- Ensure no DDLs are performed within 15 minutes before or after executing these commands.

These yb-admin commands are meant to be used when a user is only interested in polling a subset of the tables in the namespace. The user can therefore remove the extra tables that are not supposed to be polled from the CDC stream. To achieve this, the user needs to first execute command-1, followed by command-2 and command-3.

Example:
Starting state: 5 user tables (t1 to t5) in the CDC stream, including 4 extra tables that are not polled (t1, t2, t3, t4), plus 2 indexes (i1, i2).
Target state: only t5 and the 2 indexes (i1, i2) should be present in the CDC stream.
To reach the target state, we need to remove the 4 user tables (t1-t4) from the stream metadata, along with their state table entries.

**Perform the following steps to remove user tables from the CDC stream:**
# First, disable dynamic table addition using command-1.
# Confirm that dynamic table addition is disabled by running the `list_change_data_streams` yb-admin command. The output for that stream should contain the string `cdcsdk_disable_dynamic_table_addition: true`.
# Remove the table from the stream metadata and update its state table entries using command-2.
# Confirm that the table is removed from the stream metadata by re-running the `list_change_data_streams` command.
# Depending on when the user reads the cdc state table (via cqlsh), the state table entries corresponding to this table will either have their checkpoint updated to max or will already have been removed. Note that deleting state table entries might take some time, as it is done in a separate thread.
# Repeat steps 3-5 for every user table that needs to be removed.
# At the end, once all extra user tables are removed from the stream, execute command-3 as a sanity check to get rid of any cdc state entries that are still present in the state table although the corresponding table has been removed from the stream metadata.
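
The steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the `YB_ADMIN` and `MASTER_ADDRESSES` variables, the default master address, and the function names are illustrative and not part of this diff, while the yb-admin subcommands are the ones described in the summary. Steps 4 and 5 (checking the stream metadata and the cdc state table) are left as manual checks, because the exact output format is not described here.

```shell
#!/bin/sh
# Sketch only: YB_ADMIN, MASTER_ADDRESSES and the function names below are
# illustrative assumptions, not something introduced by this diff.
set -eu

YB_ADMIN="${YB_ADMIN:-yb-admin}"
MASTER_ADDRESSES="${MASTER_ADDRESSES:-127.0.0.1:7100}"

# Run one yb-admin subcommand against the configured masters.
yb_admin() {
  "$YB_ADMIN" -master_addresses "$MASTER_ADDRESSES" "$@"
}

# Usage: remove_user_tables <stream_id> <table_id>...
remove_user_tables() {
  stream_id=$1
  shift

  # Step 1 (command-1): disable dynamic table addition. This cannot be undone.
  yb_admin disable_dynamic_table_addition_in_change_data_stream "$stream_id"

  # Step 2: confirm that the listing reports the option as disabled.
  # Caveat: this matches anywhere in the output; when several streams exist,
  # inspect the entry for this stream_id by hand.
  streams=$(yb_admin list_change_data_streams)
  case "$streams" in
    *"cdcsdk_disable_dynamic_table_addition: true"*) ;;
    *)
      echo "dynamic table addition is not reported as disabled" >&2
      return 1
      ;;
  esac

  # Steps 3-6 (command-2): remove each unwanted user table in turn.
  # Steps 4 and 5 (verifying metadata and state table entries) stay manual.
  for table_id in "$@"; do
    yb_admin remove_user_table_from_change_data_stream "$stream_id" "$table_id"
  done

  # Step 7 (command-3): final sanity check for leftover state table entries.
  yb_admin validate_and_sync_cdc_state_table_entries_on_change_data_stream "$stream_id"
}
```

For the example above, the function would be called once with the stream id followed by the table ids of t1 to t4. Because the script runs with `set -eu`, it stops at the first yb-admin command that fails, so a failed removal is not followed by the validation step.
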
One scenario where cdc state table entries might be present even after the table is removed is when a tablet splits while the table is being removed from the stream metadata. In this case, the child tablets' entries get added to the cdc state table, and they are removed when command-3 is executed.

**Working**:
Command-1 internally calls the //DisableDynamicTableAdditionOnCDCSDKStream// RPC, which sets the optional field `cdcsdk_disable_dynamic_table_addition` in the stream metadata to true. This prevents any table that is not yet part of the CDC stream from being added to it.

Command-2 internally calls the //RemoveUserTableFromCDCSDKStream// RPC, which performs the following:
# Update the checkpoint of the tablet entries for the given table in the CDC state table to `OpId::Max()`. This is done to release the retention barriers on these tablets and to allow UpdatePeersAndMetrics to delete the state table entries.
# Remove the table from the CDC stream metadata and from //cdcsdk_tables_to_stream_map_//, and persist the updated metadata in the sys catalog.

Command-3 internally calls the //ValidateAndSyncCDCStateEntriesForCDCSDKStream// RPC, which updates the checkpoint to max for cdc state table entries whose table is not found in the CDC stream metadata.

**Upgrade/Rollback safety:**
//cdcsdk_disable_dynamic_table_addition// - added as a new optional field in the existing protos SysCDCStreamEntryPB and CDCStreamInfoPB. This field is protected and will only be read when the new auto flag `cdcsdk_enable_dynamic_tables_disable_option` is set.
Introduced request and response protos for the new RPCs:
- DisableDynamicTableAdditionOnCDCSDKStream: DisableDynamicTableAdditionOnCDCSDKStreamRequestPB, DisableDynamicTableAdditionOnCDCSDKStreamResponsePB
- RemoveUserTableFromCDCSDKStream: RemoveUserTableFromCDCSDKStreamRequestPB, RemoveUserTableFromCDCSDKStreamResponsePB
- ValidateAndSyncCDCStateEntriesForCDCSDKStream: ValidateAndSyncCDCStateEntriesForCDCSDKStreamRequestPB, ValidateAndSyncCDCStateEntriesForCDCSDKStreamResponsePB

Jira: DB-11778, DB-11676

Test Plan:
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestDisableOfDynamicTableAdditionOnNonConsistentSnapshotStream
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestDisableOfDynamicTableAdditionOnConsistentSnapshotStream
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestUserTableRemovalFromNonConsistentSnapshotCDCStream
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestUserTableRemovalFromConsistentSnapshotCDCStream
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestValidationAndSyncOfCDCStateEntriesAfterUserTableRemovalOnNonConsistentSnapshotStream
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestValidationAndSyncOfCDCStateEntriesAfterUserTableRemovalOnConsistentSnapshotStream

Reviewers: skumar, asrinivasan, stiwary

Reviewed By: asrinivasan, stiwary

Subscribers: ycdcxcluster, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D35870