Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdc: Add highwater timestamp to crdb_internal.jobs #28156

Merged
merged 1 commit into from
Aug 12, 2018

Conversation

mrtracy
Copy link
Contributor

@mrtracy mrtracy commented Aug 1, 2018

Adds column high_water_timestamp to the crdb_internal.jobs table, and
thus also to the output of SHOW JOBS. This column is stored internally
as a HLC timestamp, but is output to the user as a DECIMAL in the same
fashion as the cluster_logical_timestamp() built-in. This column will
be nil for existing jobs, which use fraction_completed to represent
their progress.

Release note: Added an additional column to crdb_internal.jobs and SHOW
JOBS to represent the "high water timestamp" of Changefeed jobs. This is
used to represent the progress of changefeeds, as an alternative to the
fraction_completed column which is used by existing jobs.

@mrtracy mrtracy requested review from danhhz and a team August 1, 2018 17:33
@mrtracy mrtracy requested a review from a team as a code owner August 1, 2018 17:33
@mrtracy mrtracy requested review from a team August 1, 2018 17:33
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@danhhz danhhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for cluster_logical_timestamp says "This function is reserved for testing purposes by CockroachDB developers and its definition may change without prior notice" so I'm a little hesitant to add it to SHOW JOBS. That said, I think we're pretty committed at this point to making this timestamp as decimal format work as an input to AS OF SYSTEM TIME. @bdarnell thoughts?

:lgtm: otherwise

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/show_test.go, line 822 at r1 (raw file):

		out.details = in.details

		// Shift 10 decimals to the right, so that the logical

eh? what's doing this shift?

@mrtracy mrtracy force-pushed the mtracy/system_jobs_highwater branch from 0376e94 to 69848d5 Compare August 1, 2018 17:54
Copy link
Contributor Author

@mrtracy mrtracy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/show_test.go, line 822 at r1 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

eh? what's doing this shift?

Ah, that's left over from before I found tree.DecimalToHLC(). Removed.

@mrtracy mrtracy force-pushed the mtracy/system_jobs_highwater branch from 69848d5 to 80a17ef Compare August 1, 2018 18:01
Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this field? Is it for end users (for monitoring that your changefeed jobs aren't falling too far behind, for example), or for our internal checkpointing purposes? Do we expect users to issue AS OF SYSTEM TIME queries using this value?

For monitoring purposes, I think it would be better to expose a TIMESTAMP column for ease of understanding. For checkpointing, I'd prefer a binary blob (encoded protobuf?) to give us flexibility to store other kinds of checkpoints (maybe including other kinds of jobs) without committing to implementation details (we might even want two separate fields, a TIMESTAMP and a blob, if we want to serve both needs). But if we expect this value to make its way back into AOST queries, decimal might be the right choice.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@mrtracy mrtracy force-pushed the mtracy/system_jobs_highwater branch from 80a17ef to 14ca338 Compare August 1, 2018 21:40
@mrtracy
Copy link
Contributor Author

mrtracy commented Aug 6, 2018

After talking with @danhhz, I believe we are going to remain with the DECIMAL notation; usage in AOST queries seems likely.

@mrtracy mrtracy force-pushed the mtracy/system_jobs_highwater branch from 14ca338 to b76e953 Compare August 9, 2018 21:22
@mrtracy mrtracy requested a review from a team August 9, 2018 21:22
@mrtracy mrtracy force-pushed the mtracy/system_jobs_highwater branch from b76e953 to 77fea65 Compare August 9, 2018 21:24
@mrtracy
Copy link
Contributor Author

mrtracy commented Aug 9, 2018

@danhhz I found an error in the Admin API Jobs method that resulted from this change and fixed it; would you mind taking another look at this new code?

Copy link
Contributor

@danhhz danhhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Adds column high_water_timestamp to the crdb_internal.jobs table, and
thus also to the output of SHOW JOBS. This column is stored internally
as a HLC timestamp, but is output to the user as a DECIMAL in the same
fashion as the `cluster_logical_timestamp()` built-in. This column will
be nil for existing jobs, which use fraction_completed to represent
their progress.

The Admin API Jobs method has also been modified to handle the
possibly-null output of fraction_completed.

Release note (general change): Added an additional column to
crdb_internal.jobs and SHOW JOBS to represent the "high water timestamp"
of Changefeed jobs. This is used to represent the progress of
changefeeds, as an alternative to the fraction_completed column which is
used by existing jobs.

Release note: None
@mrtracy mrtracy force-pushed the mtracy/system_jobs_highwater branch from 77fea65 to 9f0c6da Compare August 12, 2018 03:59
@mrtracy
Copy link
Contributor Author

mrtracy commented Aug 12, 2018

bors r+

craig bot pushed a commit that referenced this pull request Aug 12, 2018
28156: cdc: Add highwater timestamp to crdb_internal.jobs r=mrtracy a=mrtracy

Adds column high_water_timestamp to the crdb_internal.jobs table, and
thus also to the output of SHOW JOBS. This column is stored internally
as a HLC timestamp, but is output to the user as a DECIMAL in the same
fashion as the `cluster_logical_timestamp()` built-in. This column will
be nil for existing jobs, which use fraction_completed to represent
their progress.

Release note: Added an additional column to crdb_internal.jobs and SHOW
JOBS to represent the "high water timestamp" of Changefeed jobs. This is
used to represent the progress of changefeeds, as an alternative to the
fraction_completed column which is used by existing jobs.

Co-authored-by: Matt Tracy <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 12, 2018

Build succeeded

@craig craig bot merged commit 9f0c6da into cockroachdb:master Aug 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants