kvserver: timeout during export request on 16gb range #107519
Labels
A-disaster-recovery
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
db-cy-23
T-disaster-recovery
Describe the problem
In #104588 we're seeing a backup fail to back up a 16 GiB range. I've learned that ExportRequest reads 16 MiB worth of values; in this case it was possibly an incremental backup that didn't find anything new, so it never reached that limit and had to scan the entire 16 GiB, which is very expensive.
While we don't endorse, let alone support, 16 GiB ranges, it stands to reason that
backup should be able to back up ranges of any size, as ranges may sometimes grow
to that size without the operator being at fault.
Also, we are entertaining the idea of increasing the default range sizes significantly,
which will likely put this issue on the menu at least in some deployments.
So we should find a way to paginate on "bytes processed" rather than "bytes returned".
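To make the proposal concrete, here is a minimal sketch of the idea in Go. It is not the actual ExportRequest code: `kv`, `sampleRange`, `exportPage`, `exportAll`, and both limits are hypothetical names invented for illustration. The point it tries to show is that a scan bounded only by bytes returned can walk an entire range when nothing matches, whereas an additional bound on bytes processed forces a resume point after a fixed amount of work.

```go
package main

import "fmt"

// kv is a hypothetical stand-in for a versioned key-value pair.
type kv struct {
	key   string
	ts    int64 // write timestamp
	value []byte
}

// sampleRange builds n old entries (ts=1) where only the last one is new (ts=10).
// Each entry is 10 bytes: a 5-byte key plus a 5-byte value.
func sampleRange(n int) []kv {
	data := make([]kv, n)
	for i := range data {
		data[i] = kv{key: fmt.Sprintf("k%04d", i), ts: 1, value: []byte("vvvvv")}
	}
	data[n-1].ts = 10
	return data
}

// exportPage scans data from index start and collects entries with ts > since.
// It stops once the returned bytes reach maxReturned OR the scanned bytes reach
// maxProcessed, and reports the index to resume from (len(data) when done).
func exportPage(data []kv, start int, since int64, maxReturned, maxProcessed int) (out []kv, resume int) {
	returned, processed := 0, 0
	for i := start; i < len(data); i++ {
		size := len(data[i].key) + len(data[i].value)
		processed += size
		if data[i].ts > since {
			out = append(out, data[i])
			returned += size
		}
		if returned >= maxReturned || processed >= maxProcessed {
			return out, i + 1
		}
	}
	return out, len(data)
}

// exportAll drives exportPage until the whole range is covered and returns
// the exported entries together with the number of pages (requests) used.
func exportAll(data []kv, since int64, maxReturned, maxProcessed int) ([]kv, int) {
	var all []kv
	pages := 0
	for pos := 0; pos < len(data); {
		var page []kv
		page, pos = exportPage(data, pos, since, maxReturned, maxProcessed)
		all = append(all, page...)
		pages++
	}
	return all, pages
}

func main() {
	data := sampleRange(1000)
	// Effectively no processed-bytes bound: one long scan over the whole range.
	_, unbounded := exportAll(data, 5, 100, 1<<30)
	// Processed-bytes bound of 1000 bytes: the same work is split into short pages.
	_, bounded := exportAll(data, 5, 100, 1000)
	fmt.Println("pages without processed bound:", unbounded)
	fmt.Println("pages with processed bound:", bounded)
}
```

In this toy setup both runs export the same single new entry, but the bounded run splits the scan into several short requests instead of one long one, so each request does a predictable amount of work and is less likely to run into a timeout.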
To Reproduce
Presumably doing what the linked roachtest does to get the large range and then
trying to back up the table will reproduce it.
Related
#103879 is about a similar issue when sending snapshots.
Jira issue: CRDB-30090