Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl: periodic pts record updates #76605

Merged
merged 1 commit into from
Feb 18, 2022

Conversation

samiskin
Copy link
Contributor

@samiskin samiskin commented Feb 15, 2022

Previously changefeeds only laid down protected timestamp records to
protect against either an ongoing backfill or the changefeed lagging
behind. This is insufficient in cases such as if the gcttl is very
short, recurring errors retry the changefeed for too long, or in
upcoming work to enable serverless to shut down idle changefeeds.

This PR removes the manual PTS protection on backfills and begins an
async routine on the changeFrontier that updates the protected timestamp
record to the current highwater mark.

Fixes #76247

Release note (enterprise change): changefeeds running on tables with a
low gcttl will function more reliably due to protected timestamps being
maintained for the changefeed targets at the resolved timestamp of the
changefeed. The frequency at which the protected timestamp is updated
to the resolved timestamp can be configured through the
changefeed.protect_timestamp_interval cluster setting. If the
changefeed lags too far behind such that storage of old data becomes an
issue, cancelling the changefeed will release the protected timestamps
and allow garbage collection to resume. If
protect_data_from_gc_on_pause is unset, pausing the changefeed will
release the existing protected timestamp record.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@samiskin samiskin force-pushed the changefeed-auto-pts branch 6 times, most recently from 0f87fba to 519906f Compare February 16, 2022 14:42
@samiskin samiskin requested a review from miretskiy February 16, 2022 14:44
@samiskin samiskin marked this pull request as ready for review February 16, 2022 14:44
@samiskin samiskin requested a review from a team as a code owner February 16, 2022 14:44
@miretskiy miretskiy self-requested a review February 16, 2022 15:58
Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very nice.

Reviewed 1 of 5 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @samiskin)


-- commits, line 14 at r1:
This definitely needs a release note. It's an internal change, and probably nobody cares about this
internal implementation detail; but it's a release note worty.
At the very least mentioning that changefeeds can run reliably on tables w/ low ttl.
Definitely need to mention the setting to control the frequency of such updates because this frequency effectively
overrides gcttl set on a table.


pkg/ccl/changefeedccl/changefeed.go, line 46 at r1 (raw file):

}

func shouldProtectTimestamps(codec keys.SQLCodec) bool {

pls add a todo to remove this restriction once tenant based pts is in...


pkg/ccl/changefeedccl/changefeed_processors.go, line 1172 at r1 (raw file):

	}

	if err := cf.flowCtx.Stopper().RunAsyncTask(ctx, "changefeed-pts-manager", func(ctx context.Context) {

not sure if we need this. Why not do this while trying to checkpoint job progress?

@samiskin samiskin force-pushed the changefeed-auto-pts branch 2 times, most recently from b60c090 to b12ddd9 Compare February 17, 2022 14:42
Copy link
Contributor Author

@samiskin samiskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)


-- commits, line 14 at r1:

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

This definitely needs a release note. It's an internal change, and probably nobody cares about this
internal implementation detail; but it's a release note worty.
At the very least mentioning that changefeeds can run reliably on tables w/ low ttl.
Definitely need to mention the setting to control the frequency of such updates because this frequency effectively
overrides gcttl set on a table.

Done.


pkg/ccl/changefeedccl/changefeed.go, line 46 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

pls add a todo to remove this restriction once tenant based pts is in...

Done.


pkg/ccl/changefeedccl/changefeed_processors.go, line 1172 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

not sure if we need this. Why not do this while trying to checkpoint job progress?

I just had it this way since you mentioned it being a separate loop on a timer initially. I wasn't sure whether it'd be considered more or less complex to either have a piece of functionality be more distinct from other functionality versus the timing of events being more deterministic where recurring events happen in lockstep with resolved events.

I do like that keeping it with checkpoints keeps table changes in one place and having it there lets it share the "min checkpoint frequency" option. I made the changes to move it to resolved 👍.

@samiskin samiskin force-pushed the changefeed-auto-pts branch from b12ddd9 to acd1cc2 Compare February 17, 2022 15:32
Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @samiskin)


-- commits, line 16 at r3:
i don't think it's a bug fix -- it's an enterprise change.

@miretskiy miretskiy self-requested a review February 17, 2022 16:34
@samiskin samiskin force-pushed the changefeed-auto-pts branch from acd1cc2 to 0b5e2b0 Compare February 17, 2022 17:16
Previously changefeeds only laid down protected timestamp records to
protect against either an ongoing backfill or the changefeed lagging
behind.  This is insufficient in cases such as if the gcttl is very
short, recurring errors retry the changefeed for too long, or in
upcoming work to enable serverless to shut down idle changefeeds.

This PR removes the manual PTS protection on backfills and begins an
async routine on the changeFrontier that updates the protected timestamp
record to the current highwater mark.

Fixes cockroachdb#76247

Release note (enterprise change): changefeeds running on tables with a
low gcttl will function more reliably due to protected timestamps being
maintained for the changefeed targets at the resolved timestamp of the
changefeed.  The frequency at which the protected timestamp is updated
to the resolved timestamp can be configured through the
`changefeed.protect_timestamp_interval` cluster setting. If the
changefeed lags too far behind such that storage of old data becomes an
issue, cancelling the changefeed will release the protected timestamps
and allow garbage collection to resume. If
`protect_data_from_gc_on_pause` is unset, pausing the changefeed will
release the existing protected timestamp record.
@samiskin samiskin force-pushed the changefeed-auto-pts branch from 0b5e2b0 to 54203d6 Compare February 17, 2022 18:28
@samiskin
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 17, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 18, 2022

Build succeeded:

@@ -46,32 +43,30 @@ func emitResolvedTimestamp(
return nil
}

func shouldProtectTimestamps(codec keys.SQLCodec) bool {
// TODO(smiskin): Remove this restriction once tenant based pts are enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we TODO this @samiskin @adityamaru, given #73727 (comment) ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

changefeeds: Implement trailing PTS handling
4 participants