[#22862] XCluster: Improving XCluster Index Base WAL Retention Policy
Summary:
**Background:** XCluster and CDCSDK rely on the tablet's `cdc_min_replicated_index` to retain the WAL segments that have not yet been replicated to the target. This value is updated by the background task in `CDCServiceImpl::UpdatePeersAndMetrics`, which is scheduled to run every 60 seconds. For each update:
* It first collects `cdc_min_replicated_index` and some CDCSDK checkpoints for each tablet from the CDC_State table.
* It then updates the value within each tablet's WAL and flushes the values to the tablet metadata file.

**Issue:**
Potential WAL Over-GC Risk:
* Currently, each tablet monitors its own `cdc_min_replicated_index`. If the tablet doesn't receive an update within 15 minutes, it drops its OpId index-based WAL retention for XCluster (introduced in D7873). This implies that if the update of an XCluster tablet is delayed by more than 15 minutes, WAL over-GC could happen. For example, in `CDCServiceImpl::UpdatePeersAndMetrics`, updating the CDC WAL retention barrier of all tablets might take more than 15 minutes.

Unnecessary Flush:
* Since `cdc_min_replicated_index` is already stored in the CDC_State table, the additional flush to the tablet metadata file is unnecessary. (We've observed that this caused increased disk IO in CE-509.)

**Alternative for Dropping XCluster WAL Retention Policy:** In `CDCServiceImpl::UpdatePeersAndMetrics`, after collecting tablet checkpoint info, CDCService will store the checkpoint info in an in-memory map instead of updating each tablet individually. Each tablet's WAL will pull its `cdc_min_replicated_index` from this map during WAL GC. If the tablet is not found in the map, CDCService will return the maximum OpId, indicating that the tablet has been removed from XCluster replication.

**New Gflag:** `xcluster_checkpoint_max_staleness_secs` (300 by default): The maximum interval in seconds that the xcluster checkpoint map can go without being refreshed. If the map is not refreshed within this interval, it is considered stale, and all WAL segments will be retained until the next refresh. Setting it to 0 will disable OpId-based WAL segment retention for XCluster. **If we want to disable it, it's recommended to also set the Gflag `log_min_seconds_to_retain` to a large value so that the WAL segments can still be retained by the time-based WAL segment retention.**

**TServer Restart Safety Guarantee:** This change removes the periodic flushing of tablet metadata that XCluster used to persist `cdc_min_replicated_index`. Consequently, tablets no longer rely on a persisted `cdc_min_replicated_index` to retain WAL segments across TServer restarts. This design ensures safety by leveraging the `xcluster_checkpoint_max_staleness_secs` mechanism: until CDCService refreshes the xcluster checkpoint map for the first time, the map is considered stale, which prevents premature WAL GC.

**Other Changes:** Modified the CDCServiceTest unit tests to verify a tablet's xcluster required minimum index by checking the xcluster checkpoint map cached in CDCService instead of relying on the tablet's log and metadata.

Jira: DB-11766

Test Plan:
CDCServiceTestFourServers.TestMinReplicatedIndexAfterTabletMove
CDCServiceTestDurableMinReplicatedIndex.TestLogCDCMinReplicatedIndexIsDurable
CdcTabletSplitITest.GetChangesOnSplitParentTablet

Reviewers: hsunder, mlillibridge, jhe, xCluster

Reviewed By: hsunder

Subscribers: ycdcxcluster, slingam, rthallam, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D36298
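
The checkpoint map and staleness rule described in the summary can be illustrated with a minimal C++ sketch. This is not the code from the commit: the class `XClusterCheckpointCache`, its methods, and the sentinel values `kRetainAll` and `kMaxIndex` are hypothetical names chosen for illustration, and plain `int64_t` indexes stand in for OpIds. It assumes that "retain all WAL segments" can be expressed as a required minimum index of 0, and that the disabled setting (staleness of 0) behaves like "no xcluster requirement".

```cpp
#include <chrono>
#include <cstdint>
#include <limits>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical sketch of an in-memory xcluster checkpoint map with a
// staleness bound. Names are illustrative, not taken from YugabyteDB.
class XClusterCheckpointCache {
 public:
  using Clock = std::chrono::steady_clock;
  using TabletId = std::string;

  // Assumption: index 0 means "every WAL segment is still required".
  static constexpr int64_t kRetainAll = 0;
  // Stand-in for "maximum OpId": xcluster requires no WAL segment.
  static constexpr int64_t kMaxIndex = std::numeric_limits<int64_t>::max();

  // max_staleness plays the role of xcluster_checkpoint_max_staleness_secs.
  explicit XClusterCheckpointCache(std::chrono::seconds max_staleness)
      : max_staleness_(max_staleness) {}

  // Called by the periodic background task: replaces the whole map at once
  // instead of pushing an update to every tablet individually.
  void Refresh(std::unordered_map<TabletId, int64_t> checkpoints,
               Clock::time_point now) {
    std::lock_guard<std::mutex> lock(mutex_);
    checkpoints_ = std::move(checkpoints);
    last_refresh_ = now;
    refreshed_ = true;
  }

  // Called during WAL GC: the lowest index xcluster still needs for a tablet.
  int64_t MinIndexToRetain(const TabletId& tablet_id,
                           Clock::time_point now) const {
    // Staleness of 0 disables index-based retention for xcluster.
    if (max_staleness_.count() == 0) {
      return kMaxIndex;
    }
    std::lock_guard<std::mutex> lock(mutex_);
    // Never refreshed (e.g. right after a restart) or refreshed too long
    // ago: the map is stale, so retain everything until the next refresh.
    if (!refreshed_ || now - last_refresh_ > max_staleness_) {
      return kRetainAll;
    }
    auto it = checkpoints_.find(tablet_id);
    if (it == checkpoints_.end()) {
      // Tablet is not part of xcluster replication any more.
      return kMaxIndex;
    }
    return it->second;
  }

 private:
  const std::chrono::seconds max_staleness_;
  mutable std::mutex mutex_;
  std::unordered_map<TabletId, int64_t> checkpoints_;
  Clock::time_point last_refresh_{};
  bool refreshed_ = false;
};
```

In this sketch the current time is passed in explicitly so the staleness rule can be exercised without waiting. A freshly constructed cache reports `kRetainAll` until the first `Refresh`, which mirrors the restart safety argument above: no WAL segment is released on the basis of checkpoint data that has not been loaded yet.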
1 parent 09d6e96, commit 69d4052
Showing 27 changed files with 539 additions and 298 deletions.