kvserver: quota pool does not consider node liveness #84202
Comments
cc @cockroachdb/replication
As seen in #84943, we should get rid of the |
The proposal quota release procedure checked node connection health for every node that appears active after replica activity checks. This is expensive, and previously caused issues like cockroachdb#84943.

This change replaces the node connection health check with the cheaper NodeLiveness check.

Fixes cockroachdb#84202

Release note: None
After more discussion with @erikgrinaker, an edge case came up. If we check liveness instead of We think it's okay to just remove the |
85565: kvserver: don't check ConnHealth when releasing proposal quota r=erikgrinaker a=pavelkalinnikov

The proposal quota release procedure checked node connection health for every node that appears active after replica activity checks. This is expensive, and previously caused issues like #84943.

This change removes the ConnHealth check, because other checks, such as isFollowerActiveSince and paused replicas, provide sufficient protection from various kinds of overloads.

Touches #84202

Release note: None

Co-authored-by: Pavel Kalinnikov <[email protected]>
This issue is obsolete if we disable/remove the quota pool, x-ref #106063
The replica quota pool does not take node liveness into account. Instead, it only considers the RPC status:
cockroach/pkg/kv/kvserver/replica_proposal_quota.go
Lines 167 to 170 in d35cf75
We should check the liveness record instead of, or in addition to, the RPC status.
Jira issue: CRDB-17522
Epic CRDB-39898