kvserver: fix 'observed raft log position' assertion #107412
Conversation
Fixes cockroachdb#107336. Fixes cockroachdb#106123. Fixes cockroachdb#107156. Fixes cockroachdb#106589. It's possible to hit this assertion under --stress --race when the proposing replica is starved enough for raft ticks that it loses leadership right as it steps proposals through raft. We're relying on undocumented API semantics in the etcd raft library, whereby it mutates stepped entries with the term+index they'll end up at. But that only applies when the entries are stepped through a leader. Simply relax this assertion instead. Release note: None
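For intuition only, here is a minimal Go sketch of what "relaxing" such an assertion looks like. The names (`logPosition`, `checkObservedLogPosition`, `isLeader`) are hypothetical and not the actual kvserver code; the point is that the observed term+index is only guaranteed to match when the entries were stepped through a leader, so a mismatch is tolerated otherwise.

```go
package main

import "fmt"

// logPosition is a hypothetical (term, index) pair used only for this sketch.
type logPosition struct {
	term, index uint64
}

// checkObservedLogPosition sketches the relaxed assertion: the observed
// position must match the expected one only while the proposer is still the
// raft leader, since etcd/raft stamps entries with their final term+index
// only on the leader's step path.
func checkObservedLogPosition(isLeader bool, observed, expected logPosition) error {
	if observed == expected {
		return nil
	}
	if !isLeader {
		// Leadership was lost mid-proposal; the entries were never stamped
		// by a leader, so the mismatch is expected rather than a bug.
		return nil
	}
	return fmt.Errorf("observed raft log position %+v != expected %+v", observed, expected)
}

func main() {
	// Replica lost leadership: the stale observation is tolerated.
	fmt.Println(checkObservedLogPosition(false, logPosition{}, logPosition{term: 5, index: 42}))
}
```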
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is dev-inf.
can you also get a quick look from someone in repl-team? I have limited understanding of etcd/raft implementation details.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @irfansharif)
I think this is fine, but I'll have a closer look tomorrow.
This seems OK at a glance.
This means that AC is effectively disabled when the leader and leaseholder aren't colocated. That should be rare enough that we can disregard it, at least initially.
How do we handle leader changes and such here? Let's say the leader proposes 1000 entries, and deducts tokens for them, but is unable to commit the entries. A new leader is elected which replaces the old leader's uncommitted tail. Will we return all of the tokens in that case? Otherwise, if the old leader reacquires leadership, can it deadlock because it will never commit the log entries AC is waiting for? Conversely, if we reset them, will flapping leadership effectively reset AC? We expect to see such flapping precisely under overload, if leaders are unable to heartbeat in time.
Thanks for looking! bors r+
Yes. We’ll err on the side of over-admission to avoid token leaks/write stalls.
Yes, for IO/flow tokens, not for CPU slots. I suspect CPU overload comes into play more when we're unable to tick raft groups in time. Perhaps also if IO latencies are severely degraded (>10ms p99s under IOPS/bandwidth saturation), but AC doesn't monitor IO latencies directly today.
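To make the "err on the side of over-admission" point concrete, a hypothetical Go sketch (not the actual flow-control code; `tokenTracker`, `deduct`, and `onLeadershipLostOrTailReplaced` are made-up names): when leadership is lost or the uncommitted tail is replaced, all outstanding deductions are returned so tokens can't leak and stall writes.

```go
package main

import "fmt"

// tokenTracker is a hypothetical sketch of per-proposal flow-token tracking.
type tokenTracker struct {
	available int64            // tokens left in the bucket
	deducted  map[uint64]int64 // raft index -> tokens deducted for that entry
}

func (t *tokenTracker) deduct(index uint64, tokens int64) {
	t.available -= tokens
	t.deducted[index] = tokens
}

// onLeadershipLostOrTailReplaced returns every outstanding deduction. Erring
// on the side of over-admission avoids token leaks (and the write stalls they
// would cause) if the old uncommitted tail never commits.
func (t *tokenTracker) onLeadershipLostOrTailReplaced() {
	for idx, tokens := range t.deducted {
		t.available += tokens
		delete(t.deducted, idx)
	}
}

func main() {
	tr := &tokenTracker{available: 1 << 20, deducted: map[uint64]int64{}}
	tr.deduct(100, 4096)
	tr.deduct(101, 8192)
	tr.onLeadershipLostOrTailReplaced()
	fmt.Println(tr.available) // back to 1048576: nothing leaked
}
```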
Build succeeded.