-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: don't leak raft application tracing spans on or after ErrRemoved #54140
kv: don't leak raft application tracing spans on or after ErrRemoved #54140
Conversation
Release justification: low risk, high benefit changes to existing functionality.
Fixes cockroachdb#53677. This commit ensures that we properly finish tracing spans of Raft commands that throw ErrRemoved errors in ApplySideEffects. I was originally intending to do something more dramatic and make `replicaStateMachine.ApplySideEffects` responsible for acknowledging proposers in all cases, but doing so turned out to be pretty invasive so I was concerned that it would be harder to backport to v20.2 and to v20.1. I may revisit that in the future. Release justification: low risk, high benefit changes to existing functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 8 of 8 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/kv/kvserver/apply/cmd.go, line 208 at r3 (raw file):
// In the following three functions, fn is written with ctx as a 2nd param // because callers wants to bind it to methods that has Commands (or variants)
“callers wants”
This comment is generally hard to parse
a4ce989
to
e9f5a61
Compare
bors r+ |
bors r- |
Canceled. |
Extension of cockroachdb#53677. This commit ensures that we properly finish tracing spans of Raft commands that follow a command that throws a ErrRemoved. Before this commit, these commands would be abandoned and would never be finished. The effects of this are theoretically even worse than those fixed in the previous commit because these leaked commands could be locally proposed, so we may be abandoning a local proposer indefinitely. It's not clear that we ever saw an instance of this. It seems rare for a local proposal to end up in the same CommittedEntries batch as a command that removes a replica because of the lease requirements, but it doesn't seem impossible, especially of the local proposal was a RequestLease request. Release justification: low risk, high benefit changes to existing functionality.
e9f5a61
to
e73de8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r+
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/kv/kvserver/apply/cmd.go, line 208 at r3 (raw file):
Previously, ajwerner wrote…
“callers wants”
This comment is generally hard to parse
Done. Cleaned it up a bit too.
Build succeeded: |
Fixes #53677.
This change ensures that we properly finish tracing spans of Raft commands that throw ErrRemoved errors in ApplySideEffects. It then ensures that we properly finish tracing spans of Raft commands that follow a command that throws an
ErrRemoved
. Before this, these commands would be abandoned and would never be finished. The effects of this are theoretically even worse than those fixed in the previous commit because these leaked commands could be locally proposed, so we may be abandoning a local proposer indefinitely.It's not clear that we ever saw an instance of this. It seems rare for a local proposal to end up in the same CommittedEntries batch as a command that removes a replica because of the lease requirements, but it doesn't seem impossible, especially of the local proposal was a RequestLease request.
I was originally intending to do something more dramatic and make
replicaStateMachine.ApplySideEffects
responsible for acknowledging proposers in all cases, but doing so turned out to be pretty invasive so I was concerned that it would be harder to backport to v20.2 and to v20.1. I may revisit that in the future.Release justification: low risk, high benefit changes to existing functionality.