Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: handle range keys in readAsOfIterator #84214

Merged
merged 1 commit into from
Jul 21, 2022

Conversation

msbutler
Copy link
Collaborator

@msbutler msbutler commented Jul 11, 2022

Previously, the readAsOfIterator used in RESTORE could not handle range keys.
This PR implements the new SimpleMVCCIterator methods that handle range keys.
Further, this patch ensures the readAsOfIterator skips over point keys
shadowed by range keys at or below the caller's specified asOf timestamp.

Next, Backup needs to be tought about RangeKeys.

Informs #71155

Release note: none

@msbutler msbutler self-assigned this Jul 11, 2022
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@msbutler msbutler force-pushed the butler-mvcc-restore-aost branch from c3a0cd3 to 8370850 Compare July 15, 2022 15:59
@msbutler msbutler requested review from erikgrinaker and dt July 15, 2022 15:59
@msbutler msbutler marked this pull request as ready for review July 15, 2022 15:59
@msbutler msbutler requested a review from a team as a code owner July 15, 2022 15:59
@msbutler msbutler force-pushed the butler-mvcc-restore-aost branch from 8370850 to 2328c7d Compare July 15, 2022 21:08
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! A few minor nits which you can ignore at will.

pkg/storage/read_as_of_iterator.go Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Show resolved Hide resolved
pkg/storage/read_as_of_iterator.go Outdated Show resolved Hide resolved
@msbutler msbutler force-pushed the butler-mvcc-restore-aost branch from 2328c7d to dc78849 Compare July 19, 2022 20:06
Previously, the readAsOfIterator used in RESTORE could not handle range keys.
This PR implements the new SimpleMVCCIterator methods that handle range keys.
Further, this patch ensures the  readAsOfIterator skips over point keys
shadowed by range keys  at or below the caller's specified asOf timestamp.

Next, Backup needs to be tought about RangeKeys.

Informs cockroachdb#71155

Release note: none
@msbutler msbutler force-pushed the butler-mvcc-restore-aost branch from dc78849 to 341a77f Compare July 20, 2022 15:22
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case there was any doubt, this should be good to go. 🚀

@msbutler
Copy link
Collaborator Author

going to wait to merge until #84666 lands.

@msbutler
Copy link
Collaborator Author

TFTR!

bors r=erikgrinaker

@craig
Copy link
Contributor

craig bot commented Jul 21, 2022

Build failed (retrying...):

@craig craig bot merged commit 79edfce into cockroachdb:master Jul 21, 2022
@craig
Copy link
Contributor

craig bot commented Jul 21, 2022

Build succeeded:

@msbutler msbutler deleted the butler-mvcc-restore-aost branch July 21, 2022 19:53
msbutler added a commit to msbutler/cockroach that referenced this pull request Jul 25, 2022
Previously BACKUP would not back up range tombstones. With this patch, BACKUPs
with revision_history will backup range tombstones. Non-revision history backups
are not affected by this diff because MVCCExportToSST filters all tombstones
out of the backup already.

Specifically, this patch replaces the iterators used in the backup_processor
with the pebbleIterator, which has baked in range key support.

At this point a backup with range keys is restorable, thanks to cockroachdb#84214. Note
that the restore codebase still touches iterators that are not range key aware.
This is not a problem because restored data does not have range keys, nor do
the empty ranges restore dumps data into. These iterators (e.g. in SSTBatcher
and in CheckSSTConflicts) will be updated when cockroachdb#70428 gets fixed.

Fixes cockroachdb#71155

Release note: none
msbutler added a commit to msbutler/cockroach that referenced this pull request Jul 26, 2022
Previously BACKUP would not back up range tombstones. With this patch, BACKUPs
with revision_history will backup range tombstones. Non-revision history backups
are not affected by this diff because MVCCExportToSST filters all tombstones
out of the backup already.

Specifically, this patch replaces the iterators used in the backup_processor
with the pebbleIterator, which has baked in range key support. This refactor
introduces a 5% regression in backup runtime, even when the backup has no range
keys, though cockroachdb#83051 hopes to address this gap. See details below on the
benchmark experiment.

At this point a backup with range keys is restorable, thanks to cockroachdb#84214. Note
that the restore codebase still touches iterators that are not range key aware.
This is not a problem because restored data does not have range keys, nor do
the empty ranges restore dumps data into. These iterators (e.g. in SSTBatcher
and in CheckSSTConflicts) will be updated when cockroachdb#70428 gets fixed.

Fixes cockroachdb#71155

Release note: none

To benchmark this diff, the following commands were used on the following sha
a5ccdc3, with and without this commit, over
three trials:
```
roachprod create -n 5 --gce-machine-type=n2-standard-16 $CLUSTER
roachprod put $CLUSTER [build] cockroach

roachprod wipe $CLUSTER; roachprod start $CLUSTER;
roachprod run $CLUSTER:1 -- "./cockroach workload init bank --rows 1000000000"
roachprod sql $CLUSTER:1 -- -e "BACKUP INTO 'gs://somebucket/michael-rangkey?AUTH=implicit'"
```

The backup on the control binary took on average 478 seconds with a stdev of 13
seconds, while the backup with the treatment binary took on average 499 seconds
with stddev of 8 seconds.
craig bot pushed a commit that referenced this pull request Jul 27, 2022
84875: backupccl: handle range keys in BACKUP r=erikgrinaker a=msbutler

Previously BACKUP would not back up range tombstones. With this patch, BACKUPs
with revision_history will backup range tombstones. Non-revision history backups
are not affected by this diff because MVCCExportToSST filters all tombstones
out of the backup already.

Specifically, this patch replaces the iterators used in the backup_processor
with the pebbleIterator, which has baked in range key support. This refactor
introduces a 5% regression in backup runtime, even when the backup has no range
keys, though #83051 hopes to address this gap. See details below on the
benchmark experiment.

At this point a backup with range keys is restorable, thanks to #84214. Note
that the restore codebase still touches iterators that are not range key aware.
This is not a problem because restored data does not have range keys, nor do
the empty ranges restore dumps data into. These iterators (e.g. in SSTBatcher
and in CheckSSTConflicts) will be updated when #70428 gets fixed.

Fixes #71155

Release note: none

To benchmark this diff, the following commands were used on the following sha
a5ccdc3, with and without this commit, over
three trials:
```
roachprod create -n 5 --gce-machine-type=n2-standard-16 $CLUSTER
roachprod put $CLUSTER [build] cockroach

roachprod wipe $CLUSTER; roachprod start $CLUSTER;
roachprod run $CLUSTER:1 -- "./cockroach workload init bank --rows 1000000000"
roachprod sql $CLUSTER:1 -- -e "BACKUP INTO 'gs://somebucket/michael-rangkey?AUTH=implicit'"
```

The backup on the control binary took on average 478 seconds with a stdev of 13
seconds, while the backup with the treatment binary took on average 499 seconds
with stddev of 8 seconds.

84883: kvserver: add server-side transaction retry metrics r=arulajmani a=arulajmani

This patch adds a few new metrics to track successful/failed
server-side transaction retries. Specifically, whenever we attempt
to retry a read or write batch or run into a read within uncertainty
interval error, we increment specific counters indicating if the
retry was successful or not.

Release note: None

85074: upgrades: add checkpointing for `raftAppliedIndexTermMigration` r=irfansharif a=erikgrinaker

Forward-port of #84909, for posterity.

----

The `raftAppliedIndexTermMigration` upgrade migration could be
unreliable. It iterates over all ranges and runs a `Migrate` request
which must be applied on all replicas. However, if any ranges merge or
replicas are unavailable, the migration fails and starts over from the
beginning. In large clusters with many ranges, this meant that it might
never complete.

This patch makes the upgrade more robust, by retrying each `Migrate`
request 5 times, and checkpointing the progress after every fifth batch
(1000 ranges), allowing resumption on failure. At some point this should
be made part of the migration infrastructure.

NB: This fix was initially submitted for 22.1, and even though the
migration will be removed for 22.2, it is forward-ported for posterity.

Release note: None

85086: eval: stop ignoring all ResolveOIDFromOID errors r=ajwerner a=rafiss

fixes #84448

The decision about whether an error is safe to ignore is made at the
place where the error is created/returned. This way, the callers don't
need to be aware of any new error codes that the implementation may
start returning in the future.

Release note (bug fix): Fixed incorrect error handling that could cause
casts to OID types to fail in some cases.

Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants