Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl/kvevent: implement parallel event consumer #86902

Closed
jayshrivastava opened this issue Aug 25, 2022 · 1 comment · Fixed by #87994
Closed

changefeedccl/kvevent: implement parallel event consumer #86902

jayshrivastava opened this issue Aug 25, 2022 · 1 comment · Fixed by #87994
Assignees
Labels
A-cdc Change Data Capture C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) stability-period-v22.2 T-cdc

Comments

@jayshrivastava
Copy link
Contributor

jayshrivastava commented Aug 25, 2022

Implement parallel kv event consumers which parallelize the encoding and emitting of KV events to external sinks.

Since we consume (encode and emit) an event in a single thread on each node we can only process events so fast. Adding parallelism here will help ensure that we are not the bottleneck. That is encoding and emitting events should keep up with events coming in from rangefeeds.

WIP PR: https://github.com/jayshrivastava/cockroach/tree/nprocs-3

Some issues to be addressed are:

  • Error reporting: Errors currently do not get reported back to the change aggregator because we encode/emit in the background.
    • Errors should not be out of order (ie. we should not emit an KV event if the preceeding on caused an error).
    • Errors should not violate ordering in terms of restarts/checkpoints. We should flush all events which are queued to be emitted before checkpointing.
    • Multiple tests expect synchronous errors. We should add a knob which makes the parallel consumer synchronous by default for these tests.
  • Per-key ordering: This is currently not maintained by the PR. We should used a fixed # of workers and shard by key to fix this.
  • Race conditions with ALTER CHANGEFEED. Some tests fail because we modify options when altering a changefeed while the options are being used by the parallel consumer in the background. This should be fixed by returning copies of data in certain spots instead of pointers.
  • Race conditions with the buffered sink and changeAggregator.changedRowBuf

Jira issue: CRDB-18996

Epic CRDB-24463

@jayshrivastava jayshrivastava added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-cdc Change Data Capture labels Aug 25, 2022
@jayshrivastava jayshrivastava self-assigned this Aug 25, 2022
@blathers-crl blathers-crl bot added the T-cdc label Aug 25, 2022
@blathers-crl
Copy link

blathers-crl bot commented Aug 25, 2022

cc @cockroachdb/cdc

@amruss amruss added the stability-period Intended to be worked on during a stability period. Use with the Milestone field to specify version. label Aug 31, 2022
@craig craig bot closed this as completed in c55586b Sep 27, 2022
@exalate-issue-sync exalate-issue-sync bot added stability-period-v22.2 and removed stability-period Intended to be worked on during a stability period. Use with the Milestone field to specify version. labels Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cdc Change Data Capture C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) stability-period-v22.2 T-cdc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants