-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
api: add socket buffer limit field in ClientTrafficPolicy and BackendTrafficPolicy #3724
Conversation
Signed-off-by: Dingkang Li <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3724 +/- ##
==========================================
+ Coverage 67.63% 67.66% +0.03%
==========================================
Files 184 184
Lines 22546 22546
==========================================
+ Hits 15248 15255 +7
+ Misses 6214 6210 -4
+ Partials 1084 1081 -3 ☔ View full report in Codecov by Sentry. |
// +kubebuilder:validation:XValidation:rule="type(self) == string ? self.matches(r\"^[1-9]+[0-9]*([EPTGMK]i|[EPTGMk])?$\") : type(self) == int",message="socketBufferLimit must be of the format \"^[1-9]+[0-9]*([EPTGMK]i|[EPTGMk])?$\"" | ||
// +optional | ||
// +notImplementedHide | ||
SocketBufferLimit *resource.Quantity `json:"socketBufferLimit,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- What's the default value if not specified? Is it the operation system default? Do we need to set a sane min/max value to avoid misconfiguration?
- Are there any relationships between Connection Buffer Limit and Socket Buffer Limit? For example, do they need to be in a special ratio?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If unspecified, Envoy will use system buffer size, like the sample below, ordered as
min
default
max
, this applies to all sockets:
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
User can set max
buffer size for individual socket via setsockopt
function on SO_SNDBUF
or SO_RCVBUF
option, limited by
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
No socket options to set min
buffer size for individual socket.
- Socket Buffer is streaming channel between two TCP/IP stacks, as kernel space buffer. Connection Buffer is streaming channel (maybe non-streaming) between two processes, as user-space buffer. IMO they don't need to be in special ratio.
https://man7.org/linux/man-pages/man7/tcp.7.html
https://man7.org/linux/man-pages/man7/socket.7.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation @aoledk This API LGTM.
Could you please add the some of the above explanation "user-space vs kerne-space buffer" to the comments of the API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aoledk when would a user specify a different value for connection buffet and socket buffer ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arkodg Anytime. If user wants to control memory consumption of TCP/IP stack, they can tune socket buffer; if they want to control memory consumption of application process, connection buffer can be tuned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @liorokman. Do you think that it's a reasonable requirement to have different socket and user-space buffer sizes? One thing that come to mind here is that users may want to restrict socket-level buffer size as a security measure, since most validations (IP allowlists, MTLS, JWTs, ... ) occur at a later user-space stage. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SO_SNDBUF and SO_RCVBUF might be tuned to allow the kernel to store a little more data while it waits for userspace to catch up. I think it's not generally a good idea to have these buffers be too large, definitely not at the same scale as the user-space buffer.
Setting the socket buffers too small would trigger transmission failures (for small SO_SNDBUF
values), or cause the TCP stack to force the remote side to retransmit data (for small SO_RCVBUF
values). Setting these sockets to a too big value is wasteful, and could impact the number of connections that the entire host can support.
I think these buffer sizes are usually tuned to allow user-space a little more time to react before accepting the data, so it does make sense to allow for different socket and user-space buffer sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @liorokman and @aoledk would be great if we could convert these recommendations into API doc strings so users can better understand when to tweak this knob
Signed-off-by: Dingkang Li <[email protected]>
Signed-off-by: Dingkang Li <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks.
Signed-off-by: Dingkang Li <[email protected]>
Signed-off-by: Dingkang Li <[email protected]>
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/retest |
1 similar comment
/retest |
Signed-off-by: Dingkang Li <[email protected]>
/retest |
/retest |
What this PR does / why we need it:
Allow user to configure socket buffer limit for both incoming socket and outgoing socket to control memory consumed by Envoy, especially useful for user case of long-lived L4 proxy for massive connections.
It's different from existing connection buffer limit api which is used to control user space memory consumption, and this new api is used to control kernel memory consumption.
Which issue(s) this PR fixes:
Fixes #2670