Resume CHANGEFEED from last committed cursor timestamp #65573

Closed
GlennFawcett-doordash opened this issue May 21, 2021 · 5 comments
Labels
A-cdc Change Data Capture C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-community Originated from the community T-cdc X-blathers-triaged blathers was able to find an owner

Comments


GlennFawcett-doordash commented May 21, 2021

Is your feature request related to a problem? Please describe.

A CHANGEFEED can fail for multiple reasons: problems internal to CRDB, the Kafka endpoint, and so on. After the CHANGEFEED fails, you have to work out from the logs exactly when it failed, convert that time to the cluster_logical_timestamp epoch format, and restart from that point.
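
For illustration, the manual conversion described above might look like the following Python sketch. It assumes the failure time recovered from the logs is a timezone-aware wall-clock time, and that the cursor takes the decimal form that cluster_logical_timestamp() returns (nanoseconds since the Unix epoch, followed by a 10-digit logical counter). The helper names, the table name, and the Kafka URI are hypothetical and not part of CRDB.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_cursor(ts: datetime, logical: int = 0) -> str:
    """Format a wall-clock time as '<nanoseconds>.<10-digit logical counter>'.

    Integer arithmetic is used throughout so that no precision is lost,
    which would happen if the nanosecond count passed through a float.
    """
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    delta = ts - EPOCH
    nanos = (delta.days * 86_400 + delta.seconds) * 1_000_000_000
    nanos += delta.microseconds * 1_000
    return f"{nanos}.{logical:010d}"


def resubmit_statement(table: str, sink_uri: str, cursor: str) -> str:
    """Build the statement to resubmit by hand (hypothetical helper)."""
    return (
        f"CREATE CHANGEFEED FOR TABLE {table} "
        f"INTO '{sink_uri}' WITH cursor = '{cursor}'"
    )


if __name__ == "__main__":
    # Assumed failure time read from the logs; table and URI are placeholders.
    failed_at = datetime(2021, 5, 21, 0, 0, 0, tzinfo=timezone.utc)
    print(resubmit_statement("mytable", "kafka://host:9092", to_cursor(failed_at)))
```

Every step here is manual and easy to get wrong (time zone, units, the logical suffix), which is the motivation for the request below.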

Describe the solution you'd like

I would like CRDB to store the last committed timestamp and provide a simple way to RESUME from that point. You can easily PAUSE and RESUME jobs, but if the job fails you must resubmit it. If the changefeed were created as an object in the database instead of a job, we could do something like:

RESUME CHANGEFEED <mychangefeed>

And it would simply pick up where it left off.

Epic CRDB-2397

@GlennFawcett-doordash GlennFawcett-doordash added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label May 21, 2021

blathers-crl bot commented May 21, 2021

Hello, I am Blathers. I am here to help you get the issue triaged.

I have CC'd a few people who may be able to assist you:

  • @cockroachdb/cdc (found keywords: CHANGEFEED,Kafka)

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community Slack channel and ask on #cockroachdb.
  • Try to find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels May 21, 2021
@shermanCRL shermanCRL added the A-cdc Change Data Capture label May 24, 2021
@blathers-crl blathers-crl bot added the T-cdc label May 24, 2021
@shermanCRL shermanCRL added A-cdc Change Data Capture and removed A-cdc Change Data Capture labels May 24, 2021
Contributor

amruss commented May 24, 2021

Note: Bulk IO is also considering something similar. We probably want to do something more generic, like RESUME FAILED JOB _job_id_ (RESUME JOB FROM FAILURE, RESUME FAILED JOB, RESURRECT JOB, etc.), and spawn a new job id with the same parameters / settings.

For now, we would want to error out for jobs that aren't changefeed jobs.
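
A rough model of those semantics, as a Python sketch rather than anything from the CRDB codebase: the `Job` record, its field names, and the ID allocator are all hypothetical. It shows only the two rules stated above: the resumed job gets a new id while keeping the same parameters, and non-changefeed jobs are rejected.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Optional

# Hypothetical job ID allocator; real job IDs are assigned by the cluster.
_next_job_id = count(1000)


@dataclass(frozen=True)
class Job:
    job_id: int
    job_type: str                     # e.g. "CHANGEFEED", "BACKUP"
    status: str                       # e.g. "running", "failed"
    statement: str                    # original parameters / settings
    high_water: Optional[str] = None  # last committed timestamp, if known


def resume_failed_job(job: Job) -> Job:
    """Spawn a new job that copies a failed changefeed job's settings."""
    if job.job_type != "CHANGEFEED":
        raise ValueError(f"job {job.job_id} is not a changefeed job")
    if job.status != "failed":
        raise ValueError(f"job {job.job_id} has not failed")
    # Same statement and high-water mark, fresh id, back in the running state.
    return replace(job, job_id=next(_next_job_id), status="running")
```

Carrying `high_water` over to the new job is what would let it pick up from the last committed timestamp instead of requiring a hand-computed cursor.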

@amruss
Contributor

amruss commented Jul 23, 2021

After talking with the team, we will likely want to do this instead

@miretskiy
Contributor

@amruss this can probably be closed? We have issue #36887 and a few others that we're working on.

@spiffyy99
Contributor

Going to go ahead and close this; we have a fix here: #68176


6 participants