kvserver: rangefeed txn pusher barrier may cause resolved timestamp stalls #119536

Closed
erikgrinaker opened this issue Feb 22, 2024 · 0 comments · Fixed by #119512
Labels
A-kv-rangefeed Rangefeed infrastructure, server+client C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team

In #117612, we added a barrier command to flush the rangefeed's Raft pipeline when pushing aborted transactions, as a fix for #104309.

The barrier command is marked isUnsplittable to prevent it from spanning range boundaries. Unfortunately, the DistSender enforces this constraint based on its range cache, which can be stale. Moreover, the DistSender never attempts to refresh its range cache when it rejects a request for this reason.

If a rangefeed runs on a follower replica after a range merge, the local DistSender's range cache may be stale and still contain the pre-merge range descriptors. A barrier command spanning the entire merged range will then be rejected by the DistSender on every attempt, until some other request happens to trigger a cache refresh. This can prevent the rangefeed's resolved timestamp (and thus checkpoints) from advancing, which in turn prevents the changefeed's frontier (or watermark) from advancing. The following error is logged:

pushing old intents failed: range barrier failed, range split
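To illustrate the failure mode, here is a minimal toy model in Go. It is not the actual DistSender code: the `span`, `rangeCache`, `sendUnsplittable`, and `refresh` names are hypothetical and exist only for this sketch. It assumes the unsplittable check consults only the cached descriptors and leaves the cache untouched on failure, as described above.

```go
package main

import "errors"

// span is a half-open key interval [start, end). Hypothetical type for this sketch.
type span struct{ start, end string }

// contains reports whether s fully covers o.
func (s span) contains(o span) bool {
	return s.start <= o.start && o.end <= s.end
}

// errRangeSplit stands in for the "range barrier failed, range split" error.
var errRangeSplit = errors.New("range barrier failed, range split")

// rangeCache is a toy model of a sender's cached range descriptors.
type rangeCache struct{ descs []span }

// sendUnsplittable accepts the request only if a single cached descriptor
// covers it. On failure it does NOT refresh the cache, which models the
// behavior described in this issue.
func (c *rangeCache) sendUnsplittable(req span) error {
	for _, d := range c.descs {
		if d.contains(req) {
			return nil
		}
	}
	return errRangeSplit
}

// refresh replaces the cached descriptors with the current ones, modeling
// a cache refresh triggered by some other request.
func (c *rangeCache) refresh(actual []span) {
	c.descs = append([]span(nil), actual...)
}
```

In this model, a cache that still holds the pre-merge descriptors `[a, m)` and `[m, z)` rejects a barrier over the merged range `[a, z)` on every attempt, and only accepts it once `refresh` is called with the merged descriptor.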

Rangefeed events will still be emitted as usual, and garbage collection will be prevented by CDC protected timestamps, allowing the rangefeed to recover if it is restarted.

Seen in #119333.
