[CDCSDK] Intents are getting GCed after Tablet LEADER changes #13770

sureshdash2022-yb · 2022-08-25T04:55:32Z

The CDC service retains intents for a tablet based on the flag cdc_intent_retention_ms (default value: 4 hours). To track intent expiration per tablet, a separate thread, UpdatePeersAndMetrics, periodically does the following:

1. Reads the cdc_state table using the method PopulateTabletCheckPointInfo. Each row in cdc_state is uniquely identified by the (tablet_id, stream_id) pair, so the method calculates the minimum checkpoint among all active streams belonging to the same tablet_id, and the maximum remaining time among all of those streams by reading the tablet LEADER cache. The result is a map of type std::unordered_map<TabletId, TabletCDCCheckpointInfo>.
2. Passes that map to the method UpdateTabletPeersWithMinReplicatedIndex, which internally does the following:
   a. Picks each tablet_id from the map. If the current tablet is not a LEADER, it continues with the next tablet_id in the map.
   b. If the current tablet is a LEADER, it sends the minimum checkpoint and the remaining intent expiration time to the FOLLOWER tablets.

These two operations are not atomic. The maximum remaining time stored in the map during step 1 (the PopulateTabletCheckPointInfo call) is calculated from the tablet LEADER cache as it was at that moment. By the time step 2 (UpdateTabletPeersWithMinReplicatedIndex) runs, the same tablet_id may have become a FOLLOWER, or vice versa.
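As a rough illustration of the aggregation in step 1, here is a minimal sketch. It is not the actual YugabyteDB code: CdcStateRow, ActiveTimeCache, the struct fields, and PopulateCheckpointInfoSketch are all hypothetical names, and the way the remaining time is derived from a cached "last active" timestamp is an assumption made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

using TabletId = std::string;
using StreamId = std::string;

// Hypothetical, simplified view of one cdc_state row.
struct CdcStateRow {
  TabletId tablet_id;
  StreamId stream_id;
  int64_t checkpoint_index;
};

// Simplified stand-in; the real TabletCDCCheckpointInfo is not shown in this issue.
struct TabletCDCCheckpointInfo {
  int64_t min_checkpoint_index;
  int64_t max_remaining_ms;
};

// Hypothetical per-node cache: "tablet_id/stream_id" -> last active time in ms.
using ActiveTimeCache = std::unordered_map<std::string, int64_t>;

std::unordered_map<TabletId, TabletCDCCheckpointInfo> PopulateCheckpointInfoSketch(
    const std::vector<CdcStateRow>& rows, const ActiveTimeCache& cache,
    int64_t now_ms, int64_t retention_ms) {
  std::unordered_map<TabletId, TabletCDCCheckpointInfo> result;
  for (const CdcStateRow& row : rows) {
    // A stream that is missing from the local cache contributes 0 remaining time.
    int64_t remaining_ms = 0;
    auto cached = cache.find(row.tablet_id + "/" + row.stream_id);
    if (cached != cache.end()) {
      remaining_ms = std::max<int64_t>(0, retention_ms - (now_ms - cached->second));
    }
    auto it = result.find(row.tablet_id);
    if (it == result.end()) {
      result[row.tablet_id] = TabletCDCCheckpointInfo{row.checkpoint_index, remaining_ms};
    } else {
      // Minimum checkpoint and maximum remaining time across the tablet's streams.
      it->second.min_checkpoint_index =
          std::min(it->second.min_checkpoint_index, row.checkpoint_index);
      it->second.max_remaining_ms = std::max(it->second.max_remaining_ms, remaining_ms);
    }
  }
  return result;
}
```

Note that a node whose cache has no entry for the tablet produces a remaining time of 0 in this sketch, which is the value that matters in the failure scenario.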

In our scenario, we have a single stream (stream_1), a single tablet (tablet_1), and 3 tservers (TS1, TS2, TS3).

i) Initially the tablet LEADER is on TS1 and the client calls GetChanges, so the UpdatePeersAndMetrics thread is enabled on TS1.
ii) A LEADER switch to TS2 then happens, so the UpdatePeersAndMetrics thread becomes active on TS2.
iii) TS1, which is now a FOLLOWER, still has UpdatePeersAndMetrics activated, and it periodically performs the 2 steps described above.

In our FAILURE scenario, TS1, as a FOLLOWER, creates the map in the PopulateTabletCheckPointInfo call with the remaining active time set to 0, because it does not contain the tablet LEADER. Before the UpdateTabletPeersWithMinReplicatedIndex call, TS1 becomes the LEADER. Step 2 then treats the tablet as a LEADER and sends the remaining expiration time as 0, which causes all of the intents to be GCed.
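The race can be reproduced in a toy model. The sketch below is an assumption-laden simplification, not the real service code: TabletPeer, SnapshotRemaining, SendToFollowers, and RunTs1Iteration are hypothetical names, and the whole cdc_state read is reduced to a single cached value.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

enum class Role { kLeader, kFollower };

// Hypothetical, simplified model of one tserver's view of a single tablet.
struct TabletPeer {
  Role role;
  int64_t cached_remaining_ms;  // 0 when this peer does not track the stream
};

struct SentUpdate {
  std::string tablet_id;
  int64_t remaining_ms;
};

// Step 1: snapshot the remaining time from the local cache.
std::unordered_map<std::string, int64_t> SnapshotRemaining(const TabletPeer& peer) {
  std::unordered_map<std::string, int64_t> snapshot;
  snapshot["tablet_1"] = peer.cached_remaining_ms;
  return snapshot;
}

// Step 2: the role is checked here, after the snapshot was already taken.
std::vector<SentUpdate> SendToFollowers(
    const TabletPeer& peer, const std::unordered_map<std::string, int64_t>& snapshot) {
  std::vector<SentUpdate> sent;
  for (const auto& entry : snapshot) {
    if (peer.role != Role::kLeader) continue;  // followers send nothing
    sent.push_back({entry.first, entry.second});
  }
  return sent;
}

// One UpdatePeersAndMetrics-style iteration on TS1, which starts as a FOLLOWER.
std::vector<SentUpdate> RunTs1Iteration(bool becomes_leader_between_steps) {
  TabletPeer ts1{Role::kFollower, 0};
  auto snapshot = SnapshotRemaining(ts1);
  if (becomes_leader_between_steps) ts1.role = Role::kLeader;
  return SendToFollowers(ts1, snapshot);
}
```

In this model, RunTs1Iteration(false) sends nothing because TS1 stays a FOLLOWER, while RunTs1Iteration(true) sends a stale remaining time of 0 for tablet_1, which corresponds to the update that triggers the GC.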

Summary:
The CDC service retains intents for a tablet based on the flag cdc_intent_retention_ms (default value: 4 hours). To track intent expiration per tablet, a separate thread, UpdatePeersAndMetrics, periodically does the following:

1. Reads the cdc_state table using the method PopulateTabletCheckPointInfo. Each row in cdc_state is uniquely identified by the (tablet_id, stream_id) pair, so the method calculates the minimum checkpoint among all active streams belonging to the same tablet_id, and the maximum remaining time among all of those streams by reading the tablet LEADER cache. The result is a map of type std::unordered_map<TabletId, TabletCDCCheckpointInfo>.
2. Passes that map to the method UpdateTabletPeersWithMinReplicatedIndex, which internally does the following:
   a. Picks each tablet_id from the map. If the current tablet is not a LEADER, it continues with the next tablet_id in the map.
   b. If the current tablet is a LEADER, it sends the minimum checkpoint and the remaining intent expiration time to the FOLLOWER tablets.

These two operations are not atomic. The maximum remaining time stored in the map during step 1 (the PopulateTabletCheckPointInfo call) is calculated from the tablet LEADER cache as it was at that moment. By the time step 2 (UpdateTabletPeersWithMinReplicatedIndex) runs, the same tablet_id may have become a FOLLOWER, or vice versa.

In our scenario, we have a single stream (stream_1), a single tablet (tablet_1), and 3 tservers (TS1, TS2, TS3).

i) Initially the tablet LEADER is on TS1 and the client calls GetChanges, so the UpdatePeersAndMetrics thread is enabled on TS1.
ii) A LEADER switch to TS2 then happens, so the UpdatePeersAndMetrics thread becomes active on TS2.
iii) TS1, which is now a FOLLOWER, still has UpdatePeersAndMetrics activated, and it periodically performs the 2 steps described above.

In our FAILURE scenario, TS1, as a FOLLOWER, creates the map in the PopulateTabletCheckPointInfo call with the remaining active time set to 0, because it does not contain the tablet LEADER. Before the UpdateTabletPeersWithMinReplicatedIndex call, TS1 becomes the LEADER. Step 2 then treats the tablet as a LEADER and sends the remaining expiration time as 0, which causes all of the intents to be GCed.

To handle this scenario, step 1 (PopulateTabletCheckPointInfo) calculates the maximum active time from the tablet cache regardless of whether the tablet is a LEADER or a FOLLOWER, and the stream active time is updated in the FOLLOWERS' cache at regular intervals so that they stay in sync.

Test Plan: Running all the C and Java test cases

Reviewers: skumar, abharadwaj, srangavajjula

Reviewed By: abharadwaj, srangavajjula

Subscribers: ycdcxcluster

Differential Revision: https://phabricator.dev.yugabyte.com/D19149
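The idea behind the fix can be sketched in the same toy style. This is only an illustration under stated assumptions: the issue does not show how the patch propagates the stream active time to followers, so SyncActiveTime, RemainingMs, and RemainingSentAfterLeaderChange are hypothetical, and the timestamps are arbitrary.

```cpp
#include <algorithm>
#include <cstdint>

enum class Role { kLeader, kFollower };

// Hypothetical model: every peer, LEADER or FOLLOWER, caches the stream active time.
struct TabletPeer {
  Role role;
  int64_t last_active_time_ms;
};

// Periodic sync (assumed mechanism): the leader's active time reaches the follower.
void SyncActiveTime(const TabletPeer& leader, TabletPeer* follower) {
  follower->last_active_time_ms =
      std::max(follower->last_active_time_ms, leader.last_active_time_ms);
}

// Step 1 after the fix: remaining time comes from the local cache, whatever the role.
int64_t RemainingMs(const TabletPeer& peer, int64_t now_ms, int64_t retention_ms) {
  return std::max<int64_t>(0, retention_ms - (now_ms - peer.last_active_time_ms));
}

// Returns the remaining time TS1 would send after becoming LEADER mid-iteration.
int64_t RemainingSentAfterLeaderChange(bool sync_followers) {
  const int64_t retention_ms = 4LL * 60 * 60 * 1000;  // 4 hours, the flag's default
  const int64_t now_ms = retention_ms + 60000;        // arbitrary clock value
  TabletPeer ts2{Role::kLeader, now_ms - 1000};       // stream was active 1 s ago
  TabletPeer ts1{Role::kFollower, 0};                 // has never seen the stream active
  if (sync_followers) SyncActiveTime(ts2, &ts1);
  const int64_t snapshot = RemainingMs(ts1, now_ms, retention_ms);  // step 1 on TS1
  ts1.role = Role::kLeader;                           // leadership moves to TS1
  ts2.role = Role::kFollower;
  return snapshot;                                    // value sent in step 2
}
```

Without the sync, TS1 sends 0 after the leadership change, as in the failure scenario. With the follower cache kept in sync, it sends nearly the full retention window, so the intents are retained.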

…et LEADER changes

Summary: "Original commit: 8eaec8c/D19149"

Test Plan: Running all the C and Java test cases

Reviewers: abharadwaj, srangavajjula, skumar

Reviewed By: skumar

Subscribers: ycdcxcluster

Differential Revision: https://phabricator.dev.yugabyte.com/D19181

… LEADER changes

Summary: "Original commit: 8eaec8c/D19149"

Test Plan: Running all the C and Java test cases

Reviewers: srangavajjula, skumar, abharadwaj

Reviewed By: skumar, abharadwaj

Subscribers: ycdcxcluster

Differential Revision: https://phabricator.dev.yugabyte.com/D19182

… LEADER changes

Summary: "Original commit: 8eaec8c/D19149"

Test Plan: Running all the C and Java test cases

Reviewers: abharadwaj, srangavajjula, skumar

Reviewed By: skumar

Subscribers: ycdcxcluster

Differential Revision: https://phabricator.dev.yugabyte.com/D19188

sureshdash2022-yb self-assigned this Aug 25, 2022

sureshdash2022-yb added the priority/medium (Medium priority issue), 2.12 Backport Required, and 2.14 Backport Required labels Aug 25, 2022

sureshdash2022-yb closed this as completed Aug 25, 2022
