
streamingccl: tighten replication timestamp semantics #92788

Merged
1 commit merged into cockroachdb:master from start-time-fixup on Dec 13, 2022

Conversation

adityamaru
Contributor

Previously, each partition would reach out to the source cluster and pick its own timestamp from which it would start ingesting MVCC versions. This timestamp was used by the rangefeed set up by the partition to run its initial scan. Eventually, all the partitions would replicate up to a certain timestamp and cause the frontier to be bumped, but it was possible for different partitions to begin ingesting at different timestamps.

This change makes it so that during replication planning, when we create the producer job on the source cluster, we return a timestamp along with the StreamID. This becomes the timestamp at which each ingestion partition sets up the initial scan of its rangefeed, and consequently the initial timestamp at which all data is ingested. We stash this timestamp in the replication job details and never update its value. On future resumptions of the replication job, if there is a progress high water, we do not run an initial rangefeed scan but instead start the rangefeed from the previous progress high water.
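
For illustration, a minimal Go sketch of the planning-time contract described above. This is not the code in this PR; the names producerSpec, ReplicationStartTime, and startReplicationStream are assumptions for this sketch.

package producer

import "github.com/cockroachdb/cockroach/pkg/util/hlc"

// producerSpec is an illustrative stand-in for what the producer job hands
// back to the destination cluster at planning time; the real message and
// field names in the PR may differ.
type producerSpec struct {
	StreamID             uint64
	ReplicationStartTime hlc.Timestamp
}

// startReplicationStream sketches the contract: the source cluster picks a
// single timestamp when the producer job is created and returns it alongside
// the StreamID, so every ingestion partition begins its initial scan at the
// same MVCC timestamp. The destination stashes this timestamp in the
// replication job details and never updates it.
func startReplicationStream(clock *hlc.Clock, streamID uint64) producerSpec {
	return producerSpec{
		StreamID:             streamID,
		ReplicationStartTime: clock.Now(),
	}
}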

The motivation for this change was to know the lower bound, on both the source and destination clusters, of the MVCC versions that have been streamed. This is necessary to bound the fingerprinting on both clusters to ensure a match.

Release note: None

Fixes: #92742

@cockroach-teamcity
Member

This change is Reviewable

@adityamaru adityamaru marked this pull request as ready for review December 1, 2022 16:02
@adityamaru adityamaru requested review from a team as code owners December 1, 2022 16:02
@adityamaru adityamaru requested a review from a team December 1, 2022 16:02
@adityamaru adityamaru requested a review from a team as a code owner December 1, 2022 16:02
@adityamaru adityamaru requested review from benbardin, stevendanna, lidorcarmel and a team and removed request for a team and benbardin December 1, 2022 16:02
Contributor

@lidorcarmel lidorcarmel left a comment

lgtm

@adityamaru adityamaru force-pushed the start-time-fixup branch 3 times, most recently from 4a364f5 to 8fafe9f on December 11, 2022 16:34
@adityamaru adityamaru requested a review from a team December 11, 2022 20:36
@adityamaru
Contributor Author

Both failures are flakes:

TFTR!

bors r=lidorcarmel

@craig
Contributor

craig bot commented Dec 12, 2022

Build failed:

@adityamaru
Contributor Author

Failed on TestClusterRestoreFailCleanup for a seemingly unrelated reason. Investigating.

// start ingesting data in the replication job. This timestamp is empty unless
// the replication job resumes after a progress checkpoint has been recorded.
// While it is empty we use the InitialScanTimestamp described below.
optional util.hlc.Timestamp previous_high_water_timestamp = 2 [(gogoproto.nullable) = false];
Collaborator

I like the new name.
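
For context, a minimal Go sketch of how an ingestion partition could pick its rangefeed start point from this field together with the InitialScanTimestamp mentioned in the comment; the helper rangefeedStart is invented for illustration and is not the ingestion code in this PR.

package ingest

import "github.com/cockroachdb/cockroach/pkg/util/hlc"

// rangefeedStart decides where a partition's rangefeed begins. If a previous
// high water has been recorded, resume from it and skip the initial scan;
// otherwise run the initial scan at the planning-time initial scan timestamp.
func rangefeedStart(previousHighWater, initialScanTS hlc.Timestamp) (start hlc.Timestamp, withInitialScan bool) {
	if !previousHighWater.IsEmpty() {
		return previousHighWater, false
	}
	return initialScanTS, true
}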

@adityamaru
Contributor Author

bors retry

@craig
Contributor

craig bot commented Dec 12, 2022

Build failed:

@adityamaru
Contributor Author

adityamaru commented Dec 12, 2022

Now it's TestComposeGSS; third time is the charm. I'll file something for TestComposeGSS.

bors retry

@craig
Contributor

craig bot commented Dec 12, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Dec 12, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Dec 12, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Dec 13, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Dec 13, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Dec 13, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Dec 13, 2022

Build succeeded:

@craig craig bot merged commit bdfde49 into cockroachdb:master Dec 13, 2022
@adityamaru adityamaru deleted the start-time-fixup branch December 13, 2022 14:46
Successfully merging this pull request may close these issues.

c2c: pick a uniform start timestamp across partitions when creating a tenant replication stream