{backup,changefeed,streaming}ccl: start populating pts target #74248

Merged: 1 commit merged into cockroachdb:master from populate-pts-record-target on Jan 7, 2022

Conversation

adityamaru
Contributor

@adityamaru adityamaru commented Dec 23, 2021

This change touches all jobs that create a protected timestamp
record before calling Protect. Previously, the created record
would contain the spans that the record was going to protect.

With this change, the record will also populate the Target
field on ptpb.Record with the object it is going to protect.
The Target field is a proto message defined in ptpb.Target
and can be one of:

  • Cluster
  • Tenants
  • Schema object (database or table)

For backups, this target field is determined based on the targets
passed in by the user via the BACKUP <target> query.

For changefeeds, this target is the set of tables on which the
changefeed is being started, plus the system.descriptors table.

For the streaming job, this target is the tenant that is being
streamed.
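
As a rough illustration of the three target kinds listed above, here is a minimal Go sketch. The `Target` struct, the `TenantID` and `DescID` types, and the constructors are hand-written stand-ins defined only for this example; they are not the generated `ptpb` code, and the real message layout may differ.

```go
package main

import "fmt"

// TenantID and DescID are simplified stand-ins (assumptions for this sketch)
// for the tenant and descriptor ID types used by the real code.
type TenantID uint64
type DescID uint32

// Target is a hypothetical analogue of ptpb.Target: exactly one of the
// three fields is expected to be set.
type Target struct {
	Cluster       bool
	Tenants       []TenantID
	SchemaObjects []DescID
}

// The constructors below are defined locally for illustration only.
func MakeClusterTarget() *Target                   { return &Target{Cluster: true} }
func MakeTenantsTarget(ids []TenantID) *Target     { return &Target{Tenants: ids} }
func MakeSchemaObjectsTarget(ids []DescID) *Target { return &Target{SchemaObjects: ids} }

// Describe reports which kind of object the target protects.
func (t *Target) Describe() string {
	switch {
	case t.Cluster:
		return "cluster"
	case len(t.Tenants) > 0:
		return fmt.Sprintf("tenants %v", t.Tenants)
	case len(t.SchemaObjects) > 0:
		return fmt.Sprintf("schema objects %v", t.SchemaObjects)
	default:
		return "empty"
	}
}

func main() {
	// A full-cluster backup, a tenant stream, and a changefeed on two tables.
	fmt.Println(MakeClusterTarget().Describe())
	fmt.Println(MakeTenantsTarget([]TenantID{10}).Describe())
	fmt.Println(MakeSchemaObjectsTarget([]DescID{52, 53}).Describe())
}
```

In this sketch a backup would pick one of the three constructors based on the user's BACKUP target, a changefeed would use the schema objects constructor, and the streaming job would use the tenants constructor.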

This change does not touch any test files that create
a raw ptpb.Record for testing purposes. That will come in a follow-up
PR where we actually teach Protect to validate and make use
of this Target field. A test for how backup chooses its target
will also come in a follow-up PR once Protect actually writes
the encoded protobuf target field to the underlying system table.

Informs: #73727

Release note: None

@adityamaru adityamaru force-pushed the populate-pts-record-target branch 4 times, most recently from c7dde69 to 3d7425b Compare December 28, 2021 17:42
@adityamaru
Contributor Author

First two commits from #74281.

@adityamaru
Contributor Author

This is RFAL! I have a test for the backup target selection in a follow-up PR over in c7dde69. I also moved the ptpb.Target message outside the ptpb.Record since in a future PR we need to encode this protobuf to bytes before we write to the target column in system.pts_records.

@adityamaru adityamaru marked this pull request as ready for review January 5, 2022 19:02
@adityamaru adityamaru requested a review from a team January 5, 2022 19:02
@adityamaru adityamaru requested review from a team as code owners January 5, 2022 19:02
@adityamaru adityamaru requested review from miretskiy and removed request for a team January 5, 2022 19:02
@@ -64,12 +66,26 @@ func createProtectedTimestampRecord(
 	progress.ProtectedTimestampRecord = uuid.MakeV4()
 	log.VEventf(ctx, 2, "creating protected timestamp %v at %v",
 		progress.ProtectedTimestampRecord, resolved)
-	spansToProtect := makeSpansToProtect(codec, targets)
+	deprecatedSpansToProtect := makeSpansToProtect(codec, targets)
Contributor

we need both targets and spans?

Contributor Author

Until we run the migration in #74281, we want to continue protecting spans so that jobs started in a mixed-version state do not fail. Once the migration is complete, the Protect method in the protected timestamp Storage interface will stop writing these passed-in spans to the underlying system.pts_records table (this will come as a follow-up PR). So all jobs started after the migration might pass in a record with the spans field set, but this will not be persisted.

The idea is that GC for 22.1 will continue to respect both spans protected by the old subsystem, and targets protected by the new subsystem since this simplifies the migration in a mixed version state. In 22.2 with some elbow grease, we should be able to stop populating the spans field in the record entirely.
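
To make the mixed-version behavior described above concrete, here is a small Go sketch. Everything in it is hypothetical: `Record`, `persisted`, and `protect` are simplified stand-ins, not the real `ptpb.Record` or the Storage interface, and the sketch assumes the target is only persisted once the migration has completed, which this thread does not spell out.

```go
package main

import "fmt"

// Record is a simplified stand-in for ptpb.Record that carries both fields,
// as jobs do after this change.
type Record struct {
	DeprecatedSpans []string // spans protected by the old subsystem
	Target          string   // object protected by the new subsystem
}

// persisted models the row a hypothetical Protect would write.
type persisted struct {
	Spans  []string
	Target string
}

// protect models the planned behavior: before the migration the passed-in
// spans are written; afterwards they are dropped and (by assumption in this
// sketch) only the target is written.
func protect(r Record, migrationComplete bool) persisted {
	if !migrationComplete {
		return persisted{Spans: append([]string(nil), r.DeprecatedSpans...)}
	}
	return persisted{Target: r.Target}
}

func main() {
	r := Record{DeprecatedSpans: []string{"/Table/52"}, Target: "table 52"}
	fmt.Printf("%+v\n", protect(r, false))
	fmt.Printf("%+v\n", protect(r, true))
}
```

Callers populate both fields in either state, which is why the job code in this PR keeps computing the deprecated spans alongside the new target.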

@@ -89,28 +89,6 @@ message Metadata {

 // Record corresponds to a protected timestamp.
 message Record {
-  message SchemaObjectsTarget {
Contributor

is it safe to change proto like this? we have serialized records stored in protected keys, don't we?

Contributor Author

These fields were added only a couple of days ago in #74211; since this is unreleased code, I think it is safe?

Contributor

If nobody generated and serialized those, then it's fine (i'm thinking roachtests).

Contributor

@miretskiy miretskiy left a comment

:lgtm: from changefeed side.

Contributor

@irfansharif irfansharif left a comment

LGTM as an outsider to backup/changefeed/streaming.

@adityamaru adityamaru requested a review from gh-casper January 5, 2022 20:37
@adityamaru
Contributor Author

@gh-casper adding you for the streaming diff, thanks!

@miretskiy
Contributor

miretskiy commented Jan 5, 2022 via email

@adityamaru adityamaru removed request for a team and gh-casper January 6, 2022 02:23
@adityamaru adityamaru force-pushed the populate-pts-record-target branch from ed08564 to 2528c38 Compare January 6, 2022 02:37
@adityamaru
Contributor Author

@dt friendly ping for the backup stuff, I'll be adding a test in a follow-up PR once we start persisting these targets in the system tables.

	for _, tenant := range backupManifest.Tenants {
		tenantID = append(tenantID, roachpb.MakeTenantID(tenant.ID))
	}
	return ptpb.MakeTenantsTarget(tenantID)
Member

Currently BACKUP syntax doesn't allow backing up table 245 and tenant 123 together, but that is a valid backup job. It looks like that is not possible with ptpb.Target, however? i.e. the tenants target and the schema object targets are mutually exclusive?

Not blocking, since it's unreachable with current syntax, just wondering.

Contributor Author

It's not possible today, yeah. In the future we could switch the Record to hold a repeated Target instead of a single target to accommodate this.
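
One possible shape for that future change, as a rough Go sketch: the `Record`, `Target`, and `TargetKind` types below are hypothetical and hand-written for this example; they are not the `ptpb` definitions and nothing like this is implemented in this PR.

```go
package main

import "fmt"

// TargetKind and Target are hypothetical stand-ins for ptpb.Target.
type TargetKind int

const (
	TenantsTarget TargetKind = iota
	SchemaObjectsTarget
)

type Target struct {
	Kind TargetKind
	IDs  []uint64
}

// Record holds a list of targets rather than a single one, so one record
// can protect both schema objects and tenants.
type Record struct {
	Targets []Target
}

// Protects reports whether any target of the given kind contains id.
func (r Record) Protects(kind TargetKind, id uint64) bool {
	for _, t := range r.Targets {
		if t.Kind != kind {
			continue
		}
		for _, got := range t.IDs {
			if got == id {
				return true
			}
		}
	}
	return false
}

func main() {
	// The example from the review comment: table 245 and tenant 123.
	rec := Record{Targets: []Target{
		{Kind: SchemaObjectsTarget, IDs: []uint64{245}},
		{Kind: TenantsTarget, IDs: []uint64{123}},
	}}
	fmt.Println(rec.Protects(SchemaObjectsTarget, 245))
	fmt.Println(rec.Protects(TenantsTarget, 123))
	fmt.Println(rec.Protects(TenantsTarget, 245))
}
```

With a single Target per record, the same backup would need either two records or a wider target kind, which is the limitation raised in the comment above.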

@adityamaru
Contributor Author

TFTR!

bors r+

@craig
Contributor

craig bot commented Jan 7, 2022

Build succeeded:

@craig craig bot merged commit d191f64 into cockroachdb:master Jan 7, 2022
@adityamaru adityamaru deleted the populate-pts-record-target branch January 7, 2022 03:53