release-21.2: backupccl: breakup the txn that inserts stats during cluster restore #82049

adityamaru · 2022-05-29T21:52:47Z

Backport 1/1 commits from #75969.

/cc @cockroachdb/release

We have seen instances of restores with hundreds of tables getting
stuck while inserting the backed-up table stats into the system.table_stats
table on the restoring cluster. Previously, we would issue insert
statements for each table stat row in a single, long-running txn. If this
txn were to be retried a few times, we would observe intent buildup
on the system.table_stats ranges. Once these intents exceeded
max_intent_bytes on the cluster, every subsequent txn retry would fall
back to the much more expensive ranged intent resolution. The only
remedy at this point would be to delete the BACKUP-STATISTICS file from
the bucket where the backup resides, and restore the tables with no
stats, relying on the AUTO STATS job to rebuild them gradually.

This change "batches" the insertion of the table stats to prevent the
above situation.

Fixes: #69207

Release note: None

Release justification: low risk, high impact change that fixes a class of stuck RESTOREs.

adityamaru requested review from stevendanna and a team May 29, 2022 21:52

adityamaru force-pushed the backport21.2-75969 branch from 1472a7e to 701410c Compare May 29, 2022 23:18

stevendanna approved these changes May 30, 2022

adityamaru force-pushed the backport21.2-75969 branch from 701410c to 823d17d Compare May 30, 2022 14:17

adityamaru merged commit 9de15a1 into cockroachdb:release-21.2 May 30, 2022

adityamaru deleted the backport21.2-75969 branch May 30, 2022 14:56
