backupccl: move openSSTs call into the restore workers #98906
Merged
+26
−49
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Currently we see openSSTs being a bottleneck during restore when using more than 4 workers.
This patch moves the openSSTs call into the worker itself, so that this work can be parallelized. This is needed for later PR which will increase the number of workers.
Also, this change simplifies the code a bit and makes it easier to implement #93324, because in that PR we want to produce a few partial SSTs that need to be processed serially. Before this patch it wasn't trivial to make sure that the N workers will not process those partial SSTs in the wrong order, and with this patch each worker will process a single mergedSST, and therefore can serialize the partial SSTs created from that mergedSST.
Tested by running a roachtest (4 nodes, 8 cores) with and without this change. The fixed version was faster: 80MB/s/node vs 60 but some of it is noise, we do expect a perf improvement when using more workers and other params tuned, which is the next step.
Informs: #98015
Epic: CRDB-20916
Release note: None