
backupccl: checkpoint restore progress with a span frontier #97862

Merged
Merged 1 commit into cockroachdb:master from butler-restore-checkpoint on Mar 13, 2023

Conversation

msbutler
Collaborator

@msbutler msbutler commented Mar 1, 2023

Previously, after a node restart or pause event, a resumed restore would have to
redo a significant amount of work due to our naive progress checkpointing
procedure described in #87843.

This patch upgrades our checkpointing procedure to use a span frontier,
significantly reducing wasted work. Here's the basic idea (a rough sketch
follows the list):

  1. Create a span frontier behind a lock on the coordinator.
  2. When the coordinator receives a progress update about a span that was
    ingested, forward that span's timestamp in the frontier to the hardcoded
    completedSpanTime.
  3. The frontier is periodically persisted to the job record, every 15 seconds
    or after 5% progress. Note that only the first N spans are persisted, to bound
    the amount of data written to the jobs table. The bound is set by the new
    restore.frontier_checkpoint_max_bytes private setting.
  4. If the job is resumed after some progress was persisted, the generative
    split and scatter processor will skip the completed spans in the persisted
    frontier. Note that if part of a requiredSpan was already complete, only the
    incomplete subset will be processed into a restoreSpanEntry.
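
Below is a minimal, self-contained Go sketch of this idea. It deliberately does not use CockroachDB's internal span.Frontier, jobs, or settings APIs; the frontier, the byte bound (standing in for restore.frontier_checkpoint_max_bytes), and every name in the snippet are simplified, hypothetical stand-ins for illustration only.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"sync"
)

// span is a half-open key range [Key, EndKey). All names here are hypothetical.
type span struct {
	Key, EndKey []byte
}

// checkpointFrontier tracks which required spans have been fully ingested.
// The real coordinator forwards timestamps in a span frontier behind a lock;
// this sketch simply records completed spans and merges adjacent ones.
type checkpointFrontier struct {
	mu        sync.Mutex
	completed []span // sorted, non-overlapping
}

// markCompleted records that sp was ingested (step 2 above).
func (f *checkpointFrontier) markCompleted(sp span) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.completed = append(f.completed, sp)
	sort.Slice(f.completed, func(i, j int) bool {
		return bytes.Compare(f.completed[i].Key, f.completed[j].Key) < 0
	})
	// Merge overlapping or adjacent completed spans.
	merged := f.completed[:1]
	for _, s := range f.completed[1:] {
		last := &merged[len(merged)-1]
		if bytes.Compare(s.Key, last.EndKey) <= 0 {
			if bytes.Compare(s.EndKey, last.EndKey) > 0 {
				last.EndKey = s.EndKey
			}
			continue
		}
		merged = append(merged, s)
	}
	f.completed = merged
}

// checkpoint returns the completed spans to persist to the job record,
// truncated so the encoded size stays under maxBytes (step 3).
func (f *checkpointFrontier) checkpoint(maxBytes int) []span {
	f.mu.Lock()
	defer f.mu.Unlock()
	var out []span
	size := 0
	for _, s := range f.completed {
		size += len(s.Key) + len(s.EndKey)
		if size > maxBytes {
			break
		}
		out = append(out, s)
	}
	return out
}

func main() {
	f := &checkpointFrontier{}
	f.markCompleted(span{Key: []byte("a"), EndKey: []byte("c")})
	f.markCompleted(span{Key: []byte("c"), EndKey: []byte("f")})
	for _, s := range f.checkpoint(1 << 20) {
		fmt.Printf("persisted checkpoint span: [%s, %s)\n", s.Key, s.EndKey) // [a, f)
	}
}
```

In the actual patch, forwarding a span to completedSpanTime and periodically persisting the frontier to the job record play the roles of markCompleted and checkpoint here.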

As part of this work, many of the function signatures in the restore codebase
were refactored to avoid a large number of input parameters. Further, much of
the restore checkpointing code was moved into a separate file,
restore_progress.go.

Fixes #87843

Release note: None

@msbutler msbutler self-assigned this Mar 1, 2023
@cockroach-teamcity
Member

This change is Reviewable

@msbutler
Collaborator Author

msbutler commented Mar 1, 2023

First commit is being reviewed in #97800.

With better checkpointing, the restore-with-pause test completed in half the time, about 15 minutes.

@msbutler msbutler force-pushed the butler-restore-checkpoint branch 7 times, most recently from 662dec6 to 9544f54 Compare March 2, 2023 14:08
@msbutler msbutler requested review from rhu713 and adityamaru March 2, 2023 14:08
@msbutler msbutler marked this pull request as ready for review March 2, 2023 14:09
@msbutler msbutler requested review from a team as code owners March 2, 2023 14:09
@msbutler msbutler requested review from herkolategan, smg260 and rharding6373 and removed request for a team, herkolategan, smg260 and rharding6373 March 2, 2023 14:09
pkg/util/span/frontier.go (resolved review thread on an outdated diff)
@msbutler msbutler force-pushed the butler-restore-checkpoint branch from 9544f54 to 18ee093 Compare March 2, 2023 18:05
@msbutler
Collaborator Author

msbutler commented Mar 3, 2023

A heads up that I'm reworking this. Hold off on review for a moment!

@msbutler msbutler force-pushed the butler-restore-checkpoint branch from 18ee093 to 5fc70cb Compare March 6, 2023 14:36
@msbutler msbutler force-pushed the butler-restore-checkpoint branch from 5fc70cb to 3582a82 Compare March 6, 2023 17:12
@msbutler
Collaborator Author

msbutler commented Mar 6, 2023

@adityamaru @rhu713 RFAL!

@msbutler msbutler requested a review from dt March 6, 2023 17:16
@msbutler msbutler force-pushed the butler-restore-checkpoint branch from 3582a82 to 5bb1f8a Compare March 7, 2023 22:55
Contributor

@rhu713 rhu713 left a comment

LGTM!

if f.useFrontierCheckpointing {
	return f.findToDoSpans(requiredSpan)
}
if requiredSpan.EndKey.Compare(f.highWaterMark) < 0 {
Contributor

I think this can be <= 0, since the EndKey of a span is exclusive?

Collaborator Author

Huh, yeah, it can be <= 0. In other words, if requiredSpan.EndKey.Compare(f.highWaterMark) == 0, the last key in the required span was already persisted. This is a small bug in 22.2 and earlier.
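
To illustrate the off-by-one discussed in this thread with a tiny, hypothetical example (plain Go, not the restore code): when the persisted high-water mark equals the span's exclusive EndKey, every key the span can contain has already been persisted, so a strict < check misses the fully-completed case.

```go
package main

import (
	"bytes"
	"fmt"
)

func main() {
	// The required span is the half-open range [a, c); its EndKey "c" is exclusive.
	requiredEndKey := []byte("c")
	highWaterMark := []byte("c") // everything below "c" has already been persisted

	// A strict "< 0" check treats the span as incomplete even though every key
	// it can contain ("a" up to, but not including, "c") is below the mark.
	fmt.Println(bytes.Compare(requiredEndKey, highWaterMark) < 0)  // false
	fmt.Println(bytes.Compare(requiredEndKey, highWaterMark) <= 0) // true
}
```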

pkg/ccl/backupccl/restore_progress.go (resolved review threads)
pkg/ccl/backupccl/restore_span_covering.go (resolved review threads)
@msbutler msbutler force-pushed the butler-restore-checkpoint branch from 5bb1f8a to 1b81c23 Compare March 13, 2023 13:38
@adityamaru adityamaru self-requested a review March 13, 2023 13:52
@msbutler msbutler force-pushed the butler-restore-checkpoint branch from 1b81c23 to bd917b3 Compare March 13, 2023 14:16
@msbutler msbutler force-pushed the butler-restore-checkpoint branch from bd917b3 to 2ea46c8 Compare March 13, 2023 16:22
@msbutler
Collaborator Author

TFTRs!!

bors r=adityamaru, rhu713

@msbutler
Collaborator Author

Test run indicates all is well.

@craig
Contributor

craig bot commented Mar 13, 2023

Build succeeded:

@craig craig bot merged commit e71ac0f into cockroachdb:master Mar 13, 2023
@msbutler msbutler deleted the butler-restore-checkpoint branch March 13, 2023 18:19
msbutler added a commit to msbutler/cockroach that referenced this pull request Mar 14, 2023
PR cockroachdb#97862 introduced a subtle bug which allowed the new restore checkpointing
policy to take effect before the 23_1 migrations occurred. This patch ensures
the new policy only takes effect after all migrations occur.

Release note: None

Epic: None
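
For context, the pattern being applied is a cluster-version gate: the new checkpoint format is only written once every node is on a version that can read it. Below is a simplified, self-contained sketch of that idea; the type, gate value, and function names are hypothetical and do not reflect CockroachDB's actual clusterversion API.

```go
package main

import "fmt"

// clusterVersion is a simplified stand-in for the active cluster version.
type clusterVersion struct{ major, minor int }

func (v clusterVersion) atLeast(gate clusterVersion) bool {
	return v.major > gate.major || (v.major == gate.major && v.minor >= gate.minor)
}

// frontierCheckpointingGate is the (hypothetical) version at which every node
// understands the frontier-based checkpoint format.
var frontierCheckpointingGate = clusterVersion{major: 23, minor: 1}

// useFrontierCheckpointing returns true only once the whole cluster has been
// migrated, so older nodes never see a checkpoint format they cannot read.
func useFrontierCheckpointing(active clusterVersion) bool {
	return active.atLeast(frontierCheckpointingGate)
}

func main() {
	fmt.Println(useFrontierCheckpointing(clusterVersion{22, 2})) // false: legacy high-water checkpoint
	fmt.Println(useFrontierCheckpointing(clusterVersion{23, 1})) // true: span-frontier checkpoint
}
```
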
craig bot pushed a commit that referenced this pull request Mar 14, 2023
98368: multiregionccl,server: use cached sqlliveness.Reader, deflake ColdStartLatencyTest r=ajwerner a=ajwerner

#### multitenantccl: deflake ColdStartLatencyTest
This test was flaky due to the closed timestamp sometimes not leading far
enough for global tables due to overload, and due to a cached liveness reader
not being used in distsql. The former was fixed in previous commits. The
latter is fixed here.

Fixes: #96334


#### sql: use CachedReader for uses with sqlinstance and the sql builtins
The CachedReader won't block, which in multi-region clusters is good. It will
mean that in some cases, it'll state that a session is alive when it most
certainly is not. Currently, nobody needs synchronous semantics.

This is a major part of fixing TestColdStartLatency, as distsql planning would
sometimes block. That's not acceptable: query physical planning should not
need to wait on a cross-region RPC.

#### sqlliveness: re-arrange APIs to clarify when the API was blocking

By "default" the implementation of Reader was blocking and there was a method
to get a handle to a non-blocking CachedReader(). This asymmetry does not aid
understandability. The non-blocking reader came later, but it is generally the
more desirable interface.

Release note: None
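
As a rough illustration of the blocking-versus-cached distinction described above (a hypothetical sketch, not the sqlliveness API): the cached reader answers from local state immediately and refreshes asynchronously, so it can briefly report an expired session as alive, which is the trade-off the commit message accepts.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Reader reports whether a SQL session is alive. Hypothetical names throughout.
type Reader interface {
	IsAlive(sessionID string) bool
}

// blockingReader consults the authoritative store, which may require a
// (potentially cross-region) lookup.
type blockingReader struct{}

func (blockingReader) IsAlive(sessionID string) bool {
	time.Sleep(50 * time.Millisecond) // stand-in for a remote lookup
	return false
}

// cachedReader answers immediately from a local cache refreshed in the
// background; it may report a session as alive after it has actually expired.
type cachedReader struct {
	mu    sync.Mutex
	cache map[string]bool
	src   Reader
}

func (c *cachedReader) IsAlive(sessionID string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	alive, ok := c.cache[sessionID]
	if !ok {
		// Unknown sessions are optimistically treated as alive and refreshed
		// asynchronously, so the caller never blocks on the lookup.
		c.cache[sessionID] = true
		go c.refresh(sessionID)
		return true
	}
	return alive
}

func (c *cachedReader) refresh(sessionID string) {
	alive := c.src.IsAlive(sessionID) // the slow, authoritative answer
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cache[sessionID] = alive
}

func main() {
	r := &cachedReader{cache: map[string]bool{}, src: blockingReader{}}
	fmt.Println(r.IsAlive("session-1")) // true immediately (possibly stale)
	time.Sleep(100 * time.Millisecond)
	fmt.Println(r.IsAlive("session-1")) // false once the refresh lands
}
```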

98519: storage: More CheckSSTConflicts fixes r=erikgrinaker a=itsbilal

A few additional fixes around CheckSSTConflicts, stats calculations, and Next()ing logic, caught by kvnemesis. Hopefully the last of their kind.

Also re-enable kvnemesis testing for range keys in AddSSTable, reverting #98475.

Fixes #94141.
Fixes #98473.
Informs #94876.

Epic: none

Release note: None

98567: backupccl: use correct version gate for restore checkpointing r=adityamaru a=msbutler

PR #97862 introduced a subtle bug which allowed the new restore checkpointing policy to take effect before the 23_1 migrations occurred. This patch ensures the new policy only takes effect after all migrations occur.

Release note: None

Epic: None

Co-authored-by: ajwerner <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Successfully merging this pull request may close these issues.

backupccl: add finer grain checkpointing during restore