backupccl: checkpoint restore progress with a span frontier #97862
Conversation
First commit is getting reviewed in #97800. With better checkpointing, the restore with pause test completed in half the time, in about 15 minutes.
Force-pushed from 662dec6 to 9544f54.
Force-pushed from 9544f54 to 18ee093.
A heads up that I'm reworking this. Hold off on review for a moment!
Force-pushed from 18ee093 to 5fc70cb.
Force-pushed from 5fc70cb to 3582a82.
@adityamaru @rhu713 RFAL!
Force-pushed from 3582a82 to 5bb1f8a.
LGTM!
```go
if f.useFrontierCheckpointing {
	return f.findToDoSpans(requiredSpan)
}
if requiredSpan.EndKey.Compare(f.highWaterMark) < 0 {
```
I think this can be <= 0 since EndKey of span is exclusive?
Huh, yeah, it can be <= 0. In other words, if `requiredSpan.EndKey.Compare(f.highWaterMark) == 0`, the last key in the required span was persisted. This is a small bug in 22.2 and earlier.
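To make the exclusivity argument concrete, here is a minimal, self-contained sketch (hypothetical `span` type and `fullyPersisted` helper, not the actual restore code) of why an exclusive EndKey makes `<=` the correct comparison against the high-water mark:

```go
package main

import (
	"bytes"
	"fmt"
)

// span mimics a key span with an inclusive Key and an exclusive EndKey.
// It is a stand-in for illustration, not roachpb.Span or the restore code.
type span struct {
	Key, EndKey []byte
}

// fullyPersisted reports whether every key in s lies below the checkpoint
// high-water mark. Because EndKey is exclusive, a span whose EndKey equals
// the high-water mark is also fully covered, hence <= rather than <.
func fullyPersisted(s span, highWater []byte) bool {
	return bytes.Compare(s.EndKey, highWater) <= 0
}

func main() {
	hw := []byte("c")
	// [a, c): the last key restored is just below "c", so the span is done.
	fmt.Println(fullyPersisted(span{Key: []byte("a"), EndKey: []byte("c")}, hw)) // true
	// [b, d): keys at and above "c" have not been persisted yet.
	fmt.Println(fullyPersisted(span{Key: []byte("b"), EndKey: []byte("d")}, hw)) // false
}
```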
Force-pushed from 5bb1f8a to 1b81c23.
Force-pushed from 1b81c23 to bd917b3.
Previously, after a node restart or pause event, a resumed restore would have to redo a significant amount of work due to our naive progress checkpointing procedure described in cockroachdb#87843. This patch upgrades our checkpointing procedure to use a span frontier, significantly reducing wasted work. Here's the basic idea:

1. Create a span frontier behind a lock on the coordinator.
2. When the coordinator receives a progress update about a span that was ingested, forward that span's timestamp in the frontier to the hardcoded `completedSpanTime`.
3. The frontier is periodically persisted to the job record, every 15 seconds or after 5% progress. Note that only the first N spans are persisted, to bound the amount of data written to the jobs table. The bound is set by the new restore.frontier_checkpoint_max_bytes private setting.
4. If the job is resumed after some progress was persisted, the generative split and scatter processor will skip the completed spans in the persisted frontier. Note that if part of a requiredSpan was complete, only the subset that is incomplete will get processed into a restoreSpanEntry.

As part of this work, many of the function signatures in the restore codebase were refactored to prevent a large number of input parameters. Further, much of the restore checkpointing code was moved to the separate restore_progress.go file.

Fixes cockroachdb#87843

Release note: None
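The sketch below is a rough, self-contained illustration of steps 1-3 above. All names are hypothetical; the actual implementation forwards spans in util/span.Frontier with hlc timestamps and persists them to the job record rather than using this simplified structure.

```go
package main

import (
	"fmt"
	"sync"
)

// completedSpan is a hypothetical stand-in for a restored key span recorded
// in the checkpoint.
type completedSpan struct {
	Key, EndKey string
}

// progressFrontier sketches steps 1-3: a lock-guarded record of ingested
// spans on the coordinator, flushed periodically with a size bound.
type progressFrontier struct {
	mu        sync.Mutex
	completed []completedSpan
	maxBytes  int // plays the role of restore.frontier_checkpoint_max_bytes
}

// forward records a progress update for an ingested span (step 2).
func (f *progressFrontier) forward(sp completedSpan) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.completed = append(f.completed, sp)
}

// checkpoint returns the prefix of completed spans that fits under maxBytes,
// bounding how much checkpoint data is written to the jobs table (step 3).
func (f *progressFrontier) checkpoint() []completedSpan {
	f.mu.Lock()
	defer f.mu.Unlock()
	var out []completedSpan
	size := 0
	for _, sp := range f.completed {
		size += len(sp.Key) + len(sp.EndKey)
		if size > f.maxBytes {
			break
		}
		out = append(out, sp)
	}
	return out
}

func main() {
	f := &progressFrontier{maxBytes: 64}
	f.forward(completedSpan{Key: "a", EndKey: "c"})
	f.forward(completedSpan{Key: "c", EndKey: "f"})
	// On resume, a processor would skip any required span already covered
	// by these checkpointed spans (step 4).
	fmt.Println(f.checkpoint())
}
```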
Force-pushed from bd917b3 to 2ea46c8.
TFTRs!! bors r=adityamaru, rhu713
Test run indicates all is well.
Build succeeded.
PR cockroachdb#97862 introduced a subtle bug which allowed the new restore checkpointing policy to take effect before the 23_1 migrations occurred. This patch ensures the new policy only takes effect after all migrations occur.

Release note: None

Epic: None
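A hedged sketch of the gating idea follows. The types and helper are hypothetical; the real patch checks a cluster-version key through the cluster settings rather than this simplified comparison.

```go
package main

import "fmt"

// clusterVersion is a simplified stand-in for CockroachDB's cluster version
// handle.
type clusterVersion struct{ major, minor int }

func (v clusterVersion) atLeast(o clusterVersion) bool {
	return v.major > o.major || (v.major == o.major && v.minor >= o.minor)
}

// useFrontierCheckpointing gates the new checkpointing policy on the version
// that signals all 23.1 migrations have run, so a mixed-version cluster keeps
// using the legacy high-water-mark checkpoint until then.
func useFrontierCheckpointing(active, gate clusterVersion) bool {
	return active.atLeast(gate)
}

func main() {
	gate := clusterVersion{23, 1}
	fmt.Println(useFrontierCheckpointing(clusterVersion{22, 2}, gate)) // false: legacy checkpointing
	fmt.Println(useFrontierCheckpointing(clusterVersion{23, 1}, gate)) // true: span-frontier checkpointing
}
```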
98368: multiregionccl,server: use cached sqlliveness.Reader, deflake ColdStartLatencyTest r=ajwerner a=ajwerner

#### multitenantccl: deflake ColdStartLatencyTest

This test was flaky because the closed timestamp sometimes did not lead far enough for global tables due to overload, and because a cached liveness reader was not being used in distsql. The former was fixed in previous commits. The latter is fixed here.

Fixes: #96334

#### sql: use CachedReader for sqlinstance and the sql builtins

The CachedReader won't block, which is good in multi-region clusters. It does mean that in some cases it will state that a session is alive when it most certainly is not. Currently, nobody needs synchronous semantics. This is a major part of fixing TestColdStartLatency, as distsql planning would sometimes block; the idea that query physical planning may need to wait for a cross-region RPC is unacceptable.

#### sqlliveness: re-arrange APIs to clarify when the API was blocking

By "default" the implementation of Reader was blocking, and there was a method to get a handle to a non-blocking CachedReader(). This asymmetry does not aid understandability. The non-blocking reader came later, but it is generally the more desirable interface.

Release note: None

98519: storage: More CheckSSTConflicts fixes r=erikgrinaker a=itsbilal

A few additional fixes around CheckSSTConflicts, stats calculations, and Next()ing logic, caught by kvnemesis. Hopefully the last of its kind. Also re-enable kvnemesis testing for range keys in AddSSTable, reverting #98475.

Fixes #94141. Fixes #98473. Informs #94876.

Epic: none

Release note: None

98567: backupccl: use correct version gate for restore checkpointing r=adityamaru a=msbutler

PR #97862 introduced a subtle bug which allowed the new restore checkpointing policy to take effect before the 23_1 migrations occurred. This patch ensures the new policy only takes effect after all migrations occur.

Release note: None

Epic: None

Co-authored-by: ajwerner <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
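As a side note on the sqlliveness change batched above, here is a generic sketch of the non-blocking cached-reader trade-off that the 98368 description discusses. All names are hypothetical; this is not the actual sqlliveness.Reader or CachedReader API.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedReader answers liveness checks from a local cache immediately,
// accepting that it may briefly report a dead session as alive, instead of
// blocking on a remote lookup the way a synchronous reader would.
type cachedReader struct {
	mu    sync.RWMutex
	alive map[string]bool
}

// IsAlive never blocks on a cross-region RPC; a background refresher (not
// shown) keeps the cache current.
func (r *cachedReader) IsAlive(sessionID string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.alive[sessionID]
}

func main() {
	r := &cachedReader{alive: map[string]bool{"session-1": true}}
	// Physical planning can consult the cache without waiting.
	fmt.Println(r.IsAlive("session-1")) // true
	fmt.Println(r.IsAlive("session-2")) // false (or stale, until the next refresh)
}
```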