-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: add MaxInflightMsgBytes to raft #90314
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Comments
tbg
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv-replication
labels
Oct 20, 2022
cc @cockroachdb/replication |
The current prototype is in etcd-io/etcd#14624. |
The
|
craig bot
pushed a commit
that referenced
this issue
Jan 16, 2023
94692: kvserver: use MaxInflightBytes raft config option r=erikgrinaker,tbg a=pavelkalinnikov This change starts using the raft's `MaxInflightBytes` option which limits the inflight total byte size of the log entries sent via `MsgApp`. The default limit is set to a conservatively large value, to not change the current behaviour for now. Part of #90314 Epic: none Release note (ops change): Added COCKROACH_RAFT_MAX_INFLIGHT_BYTES env variable which helps strictly limiting inflight traffic from a Raft leader to followers, particularly in situations when many large messages are sent and significantly exceed COCKROACH_RAFT_MAX_SIZE_PER_MSG * COCKROACH_RAFT_MAX_INFLIGHT_MSGS which is a softer limit. Co-authored-by: Pavel Kalinnikov <[email protected]>
The option is implemented and plumbed into CRDB. The default is a generous 256MB which should not disrupt existing workflows. We may revisit this setting in the future. UPD: it is 32 MiB now [#101437]. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Is your feature request related to a problem? Please describe.
raft.RawNode
has two optionsMaxSizePerMsg
,MaxInflightMsgs
:cockroach/pkg/kv/kvserver/store.go
Lines 284 to 285 in 0c9a5da
We set them to 32kb 1 and 1282, respectively.
Consider a slow follower that the leader is replicating a stream of 16mb AddSST entries to. The follower will, for this one range, have to hold up to
128*16m = 2G
worth of entries in memory. If this happens on even a handful of ranges, an OOM is likely to follow; we see this happen in practice3.Note that the quota pool (I think defaulting to 8mb right now) can sort of inhibit this undesirable behavior in the steady state. But once a node falls out of the quota pool, for example since it was recently restarted or something like that (I assume slow nodes frequently fall out of the quota pool, since we haven't really seen range write stalls caused by them, though that doesn't mean they don't happen - see #79756) there's no stopping the raft leader from catching up the follower in this unintended overly aggressive way.
Describe the solution you'd like
RawNode
should have options that allow us to specify the max inflight bytes. We could set this to, say, 64MiB to say that no more than 64MiB should be outstanding at any point in time. (This is an etcd/raft change).In an ideal world, we'd also be able to budget how much can be outstanding across raft instances, but this is a much harder problem. For example, there may be 1000 ranges for which the leader is sending us 16mb AddSSTs. We would need to have these leaders coordinate (unlikely) or the follower would need to reject messages (this being essentially 3).
Describe alternatives you've considered
Additional context
Jira issue: CRDB-20690
Footnotes
https://github.com/cockroachdb/cockroach/blob/2c3916be699e01165193d148e6caef7cb79ad110/pkg/base/config.go#L136-L137 ↩
https://github.com/cockroachdb/cockroach/blob/2c3916be699e01165193d148e6caef7cb79ad110/pkg/base/config.go#L145-L148 ↩
https://github.com/cockroachdb/cockroach/issues/79752 ↩ ↩2
The text was updated successfully, but these errors were encountered: