Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl,spanconfig,kvserver: ExportRequest noops on ephemeral ranges #75451

Merged
merged 1 commit into from
Feb 14, 2022

Conversation

adityamaru
Copy link
Contributor

@adityamaru adityamaru commented Jan 24, 2022

This change is the first of two changes that gets us to the goal of backup
ignoring certain table row data, and not holding up GC on these ranges.

This change does a few things:

  • It sets up the transport of the exclude_data_from_backup bit set on a
    table descriptor, to the span configuration applied in KV.

  • It teaches ExportRequest on a range marked as excluded to return
    an empty ExportResponse. In this way, a backup processor will receive no row
    data to backup up for an ephemeral table.

  • A follow up change will also teach the SQLTranslator
    to not populate the protected timestamp field on the SpanConfig for such
    tables. This way, a long running backup will not hold up GC on such high-churn
    tables. With no protection on such ranges, it is possible that an
    ExportRequest targetting the range has a StartTime
    below the range's GCThreshold. To avoid the returned BatchTimestampBeforeGCError
    from failing the backup we decorate the the error with information about the
    range being excluded from backup and handle the error in the backup processor.

Informs: #73536

Release note (sql change): BACKUP of a table marked with exclude_data_from_backup
via ALTER TABLE ... SET (exclude_data_from_backup = true) will no longer backup
that table's row data. The backup will continue to backup the table's descriptor
and related metadata, and so on restore we will end up with an empty version of
the backed up table.

@adityamaru adityamaru requested review from dt, irfansharif and a team January 24, 2022 17:58
@adityamaru adityamaru requested a review from a team as a code owner January 24, 2022 17:58
@adityamaru adityamaru requested a review from a team January 24, 2022 17:58
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@adityamaru
Copy link
Contributor Author

first two commits are from #75295.

@adityamaru
Copy link
Contributor Author

Please holdoff on reviewing this. I'll ping the PR once it is RFAL 🙂

@adityamaru
Copy link
Contributor Author

@dt @irfansharif reworked this a bit with the new syntax, and with better tests. This is RFAL now!

@adityamaru
Copy link
Contributor Author

friendly ping!

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but you're the expert on the backupccl tests.

pkg/roachpb/span_config.proto Outdated Show resolved Hide resolved
pkg/roachpb/errors.proto Outdated Show resolved Hide resolved
pkg/kv/kvserver/replica.go Outdated Show resolved Hide resolved
// backup, it is safe to ignore the error. It is likely that the
// table has been configured with a low GC TTL, and so the data
// the backup is targeting has already been gc'ed.
if batchTimestampBeforeGCError.ExcludeDataFromBackup {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting for posterity: was a bit surprised to see this bubble up through the roachpb.BatchTimestampBeforeGCError but it seems the least invasive way given the GC threshold checks happen before command eval.

return startKey
}

getStoreAndReplica := func() (*kvserver.Store, *kvserver.Replica) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should some of these helpers be moved outside the test? We're re-using them in a few places (both in this PR, and IIRC other tests in the package).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved all these shared helpers to a utils file.

return startKey
}

getStoreAndReplica := func() (*kvserver.Store, *kvserver.Replica) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

pkg/ccl/backupccl/backup_test.go Show resolved Hide resolved
@adityamaru adityamaru force-pushed the add-pts-ephemeral branch 5 times, most recently from 52472d2 to e4da251 Compare February 9, 2022 21:14
@adityamaru
Copy link
Contributor Author

adityamaru commented Feb 9, 2022

The increased diffs are just un-exporting some test utility methods that lint was complaining about. I consolidated helpers_test.go into testutils.go and so I think lint started complaining about exported fields without comments. There was no need for these to be exported since their use is limited to backupccl.

@adityamaru adityamaru force-pushed the add-pts-ephemeral branch 3 times, most recently from 7c466e7 to 7eb6024 Compare February 12, 2022 22:25
@adityamaru
Copy link
Contributor Author

TFTR!

bors r=irfansharif,dt

@craig
Copy link
Contributor

craig bot commented Feb 13, 2022

Build failed:

@adityamaru
Copy link
Contributor Author

bors r-

… from backup

This change is the first of two changes that gets us to the goal of backup
ignoring certain table row data, and not holding up GC on these ranges.

This change does a few things:

- It sets up the transport of the exclude_data_from_backup bit set on a
table descriptor, to the span configuration applied in KV.

- It teaches ExportRequest on a range marked as excluded to return
an empty ExportResponse. In this way, a backup processor will receive no row
data to backup up for an ephemeral table.

- A follow up change will also teach the SQLTranslator
to not populate the protected timestamp field on the SpanConfig for such
tables. This way, a long running backup will not hold up GC on such high-churn
tables. With no protection on such ranges, it is possible that an
ExportRequest targetting the range has a StartTime
below the range's GCThreshold. To avoid the returned BatchTimestampBeforeGCError
from failing the backup we decorate the the error with information about the
range being excluded from backup and handle the error in the backup processor.

Informs: cockroachdb#73536

Release note (sql change): BACKUP of a table marked with `exclude_data_from_backup`
via `ALTER TABLE ... SET (exclude_data_from_backup = true)` will no longer backup
that table's row data. The backup will continue to backup the table's descriptor
and related metadata, and so on restore we will end up with an empty version of
the backed up table.
@adityamaru
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Feb 14, 2022
75451: backupccl,spanconfig,kvserver: ExportRequest noops on ephemeral ranges r=adityamaru a=adityamaru

This change is the first of two changes that gets us to the goal of backup
ignoring certain table row data, and not holding up GC on these ranges.

This change does a few things:

- It sets up the transport of the exclude_data_from_backup bit set on a
table descriptor, to the span configuration applied in KV.

- It teaches ExportRequest on a range marked as excluded to return
an empty ExportResponse. In this way, a backup processor will receive no row
data to backup up for an ephemeral table.

- A follow up change will also teach the SQLTranslator
to not populate the protected timestamp field on the SpanConfig for such
tables. This way, a long running backup will not hold up GC on such high-churn
tables. With no protection on such ranges, it is possible that an
ExportRequest targetting the range has a StartTime
below the range's GCThreshold. To avoid the returned BatchTimestampBeforeGCError
from failing the backup we decorate the the error with information about the
range being excluded from backup and handle the error in the backup processor.

Informs: #73536

Release note (sql change): BACKUP of a table marked with `exclude_data_from_backup`
via `ALTER TABLE ... SET (exclude_data_from_backup = true)` will no longer backup
that table's row data. The backup will continue to backup the table's descriptor
and related metadata, and so on restore we will end up with an empty version of
the backed up table.

76168: ui,server: Table stats collection details added to Database pages r=ericharmeling a=ericharmeling

Closes: #67510

Release justification: low-risk, high benefit changes to
existing functionality

Release note (ui change):
- Added status of automatic statistics collection to the Database
and Database table pages on the DB Console.
- Added timestamp of last statistics collection to the Database
details and Database table pages on the DB Console.

Here's a video of the changes in the UI:

https://user-images.githubusercontent.com/27286675/153521703-9ed19ed0-5fe9-4cb6-a537-1d227aeb0a0b.mov




76444: bazel: upgrade to bazel 5.0.0 r=rail a=rickystewart

Closes #75796.

Release note: None

Co-authored-by: Aditya Maru <[email protected]>
Co-authored-by: Eric Harmeling <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
@craig
Copy link
Contributor

craig bot commented Feb 14, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 14, 2022

Build succeeded:

@craig craig bot merged commit 07a3683 into cockroachdb:master Feb 14, 2022
@adityamaru adityamaru deleted the add-pts-ephemeral branch February 14, 2022 21:50
adityamaru added a commit to adityamaru/cockroach that referenced this pull request Feb 20, 2022
…_backup` to SpanConfig

This change is a follow up to cockroachdb#75451 which taught ExportRequests
to noop on ranges marked as `exclude_data_from_backup`. This change
adds a field to the ProtectionPolicy in the SpanConfig. This field is set to
true if the ProtectionPolicy can be ignored if the span it applies to
has been marked to be excluded from backups i.e. the applied SpanConfig
has `exclude_data_from_backup = true`.

This field is currently only set to true when a protected timestamp record
has been written by a backup job. This is to ensure that ProtectionPolicies
written by non-backup users (CDC, streaming) on spans marked as
`exclude_data_from_backup` are still respected when making GC decisions on
the span. The work to teach the GC queue about this field will be done in
a final follow up.

Informs: cockroachdb#73536

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this pull request Feb 24, 2022
…backup to SpanConfig

This change is a follow up to cockroachdb#75451 which taught ExportRequests
to noop on ranges marked as exclude_data_from_backup. This change

This diff does two things:

- It adds an `ignore_if_excluded_from_backup` bit to ptpb.Target that is set
on PTS records written by backup schedules and jobs.

- It adds an `ignore_if_excluded_from_backup` bit to the ProtectionPolicy that
is shipped to KV as part of the SpanConfig.

In a follow up PR, this bit on the SpanConfig will be used in conjunction with
`exclude_data_from_backup` to decide whether or not to ignore the ProtectionPolicy
when making GC decisions on a span. All other consumers of PTS records will
default to setting this bit to false, and so their ProtectionPolicies will always
influence GC even if `exclude_data_from_backup` is set to true.

Informs: cockroachdb#73536

Release note: None
craig bot pushed a commit that referenced this pull request Feb 25, 2022
76831: spanconfigsqltranslator,jobsprotectedts: add `ignore_if_excluded_from_backup` to SpanConfig r=dt,irfansharif a=adityamaru

This change is a follow up to #75451 which taught ExportRequests
to noop on ranges marked as exclude_data_from_backup. This change

This diff does two things:

- It adds an `ignore_if_excluded_from_backup` bit to ptpb.Target that is set
on PTS records written by backup schedules and jobs.

- It adds an `ignore_if_excluded_from_backup` bit to the ProtectionPolicy that
is shipped to KV as part of the SpanConfig.

In a follow up PR, this bit on the SpanConfig will be used in conjunction with
`exclude_data_from_backup` to decide whether or not to ignore the ProtectionPolicy
when making GC decisions on a span. All other consumers of PTS records will
default to setting this bit to false, and so their ProtectionPolicies will always
influence GC even if `exclude_data_from_backup` is set to true.

Informs: #73536

Release note: None

Co-authored-by: Aditya Maru <[email protected]>
maryliag pushed a commit to maryliag/cockroach that referenced this pull request Feb 28, 2022
…backup to SpanConfig

This change is a follow up to cockroachdb#75451 which taught ExportRequests
to noop on ranges marked as exclude_data_from_backup. This change

This diff does two things:

- It adds an `ignore_if_excluded_from_backup` bit to ptpb.Target that is set
on PTS records written by backup schedules and jobs.

- It adds an `ignore_if_excluded_from_backup` bit to the ProtectionPolicy that
is shipped to KV as part of the SpanConfig.

In a follow up PR, this bit on the SpanConfig will be used in conjunction with
`exclude_data_from_backup` to decide whether or not to ignore the ProtectionPolicy
when making GC decisions on a span. All other consumers of PTS records will
default to setting this bit to false, and so their ProtectionPolicies will always
influence GC even if `exclude_data_from_backup` is set to true.

Informs: cockroachdb#73536

Release note: None
RajivTS pushed a commit to RajivTS/cockroach that referenced this pull request Mar 6, 2022
…backup to SpanConfig

This change is a follow up to cockroachdb#75451 which taught ExportRequests
to noop on ranges marked as exclude_data_from_backup. This change

This diff does two things:

- It adds an `ignore_if_excluded_from_backup` bit to ptpb.Target that is set
on PTS records written by backup schedules and jobs.

- It adds an `ignore_if_excluded_from_backup` bit to the ProtectionPolicy that
is shipped to KV as part of the SpanConfig.

In a follow up PR, this bit on the SpanConfig will be used in conjunction with
`exclude_data_from_backup` to decide whether or not to ignore the ProtectionPolicy
when making GC decisions on a span. All other consumers of PTS records will
default to setting this bit to false, and so their ProtectionPolicies will always
influence GC even if `exclude_data_from_backup` is set to true.

Informs: cockroachdb#73536

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants