c2c: cutover is not resilient to node shutdown #103534
Labels
A-disaster-recovery
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-disaster-recovery
If the coordinator node issuing revert range requests during cutover fails, another node will not be able to resume the work. This occurs because the current implementation of cutover uses the job progress FractionCompleted oneof field. Unfortunately, regular ingestion uses the progress high_water field, which belongs to the same oneof, so when cutover begins, we overwrite the job progress's high water mark.
When another node then tries to resume cutover after the original coordinator dies, it cannot, because it attempts to read the high water mark, which is no longer readable.
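To illustrate the failure mode, here is a minimal standalone sketch of how setting one variant of a oneof discards the other. It models the oneof the way generated Go protobuf code typically does, with a single interface-typed field. All type and method names below (`JobProgress`, `HighWater`, `FractionCompleted`, `GetHighWater`) are hypothetical stand-ins for this example, not the actual job progress types.

```go
package main

import "fmt"

// isProgress models a protobuf oneof: one interface-typed field that can
// hold exactly one variant at a time. Hypothetical, for illustration only.
type isProgress interface{ isProgress() }

// HighWater stands in for the high water mark written during ingestion.
type HighWater struct{ WallTime int64 }

// FractionCompleted stands in for the fraction written during cutover.
type FractionCompleted struct{ Fraction float32 }

func (HighWater) isProgress()         {}
func (FractionCompleted) isProgress() {}

// JobProgress is a simplified, hypothetical job progress record.
type JobProgress struct {
	Progress isProgress
}

// GetHighWater returns the high water mark and whether it is currently set.
func (p *JobProgress) GetHighWater() (HighWater, bool) {
	hw, ok := p.Progress.(HighWater)
	return hw, ok
}

func main() {
	// Regular ingestion records a high water mark.
	p := &JobProgress{Progress: HighWater{WallTime: 100}}
	_, ok := p.GetHighWater()
	fmt.Println("high water readable during ingestion:", ok)

	// Cutover begins and records fraction completed in the same oneof,
	// which replaces the high water mark.
	p.Progress = FractionCompleted{Fraction: 0.25}
	_, ok = p.GetHighWater()
	fmt.Println("high water readable after cutover starts:", ok)
}
```

In this sketch, a node resuming after the second write has no high water mark to read, which mirrors the situation described above.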
This was seen in #103008 (comment)
Jira issue: CRDB-28065
Epic CRDB-25146