Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: add socket buffer limit field in ClientTrafficPolicy and BackendTrafficPolicy #3724

Merged
merged 20 commits into from
Aug 6, 2024

Conversation

aoledk
Copy link
Contributor

@aoledk aoledk commented Jul 1, 2024

What this PR does / why we need it:

Allow user to configure socket buffer limit for both incoming socket and outgoing socket to control memory consumed by Envoy, especially useful for user case of long-lived L4 proxy for massive connections.

It's different from existing connection buffer limit api which is used to control user space memory consumption, and this new api is used to control kernel memory consumption.

Which issue(s) this PR fixes:

Fixes #2670

@aoledk aoledk requested a review from a team as a code owner July 1, 2024 06:20
Copy link

codecov bot commented Jul 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.66%. Comparing base (194a7ea) to head (da342fe).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3724      +/-   ##
==========================================
+ Coverage   67.63%   67.66%   +0.03%     
==========================================
  Files         184      184              
  Lines       22546    22546              
==========================================
+ Hits        15248    15255       +7     
+ Misses       6214     6210       -4     
+ Partials     1084     1081       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

// +kubebuilder:validation:XValidation:rule="type(self) == string ? self.matches(r\"^[1-9]+[0-9]*([EPTGMK]i|[EPTGMk])?$\") : type(self) == int",message="socketBufferLimit must be of the format \"^[1-9]+[0-9]*([EPTGMK]i|[EPTGMk])?$\""
// +optional
// +notImplementedHide
SocketBufferLimit *resource.Quantity `json:"socketBufferLimit,omitempty"`
Copy link
Member

@zhaohuabing zhaohuabing Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What's the default value if not specified? Is it the operation system default? Do we need to set a sane min/max value to avoid misconfiguration?
  2. Are there any relationships between Connection Buffer Limit and Socket Buffer Limit? For example, do they need to be in a special ratio?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If unspecified, Envoy will use system buffer size, like the sample below, ordered as min default max, this applies to all sockets:
net.ipv4.tcp_rmem = 4096	12582912	16777216
net.ipv4.tcp_wmem = 4096	12582912	16777216

User can set max buffer size for individual socket via setsockopt function on SO_SNDBUF or SO_RCVBUF option, limited by

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

No socket options to set min buffer size for individual socket.

  1. Socket Buffer is streaming channel between two TCP/IP stacks, as kernel space buffer. Connection Buffer is streaming channel (maybe non-streaming) between two processes, as user-space buffer. IMO they don't need to be in special ratio.

https://man7.org/linux/man-pages/man7/tcp.7.html
https://man7.org/linux/man-pages/man7/socket.7.html

Copy link
Member

@zhaohuabing zhaohuabing Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation @aoledk This API LGTM.

Could you please add the some of the above explanation "user-space vs kerne-space buffer" to the comments of the API?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aoledk when would a user specify a different value for connection buffet and socket buffer ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arkodg Anytime. If user wants to control memory consumption of TCP/IP stack, they can tune socket buffer; if they want to control memory consumption of application process, connection buffer can be tuned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC @liorokman. Do you think that it's a reasonable requirement to have different socket and user-space buffer sizes? One thing that come to mind here is that users may want to restrict socket-level buffer size as a security measure, since most validations (IP allowlists, MTLS, JWTs, ... ) occur at a later user-space stage. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SO_SNDBUF and SO_RCVBUF might be tuned to allow the kernel to store a little more data while it waits for userspace to catch up. I think it's not generally a good idea to have these buffers be too large, definitely not at the same scale as the user-space buffer.

Setting the socket buffers too small would trigger transmission failures (for small SO_SNDBUF values), or cause the TCP stack to force the remote side to retransmit data (for small SO_RCVBUF values). Setting these sockets to a too big value is wasteful, and could impact the number of connections that the entire host can support.

I think these buffer sizes are usually tuned to allow user-space a little more time to react before accepting the data, so it does make sense to allow for different socket and user-space buffer sizes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @liorokman and @aoledk would be great if we could convert these recommendations into API doc strings so users can better understand when to tweak this knob

@aoledk aoledk requested a review from zhaohuabing July 23, 2024 10:53
zhaohuabing
zhaohuabing previously approved these changes Jul 23, 2024
Copy link
Member

@zhaohuabing zhaohuabing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks.

@aoledk
Copy link
Contributor Author

aoledk commented Jul 26, 2024

/retest

@aoledk aoledk requested a review from zhaohuabing July 26, 2024 08:54
zhaohuabing
zhaohuabing previously approved these changes Jul 26, 2024
shawnh2
shawnh2 previously approved these changes Aug 3, 2024
Copy link
Contributor

@shawnh2 shawnh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aoledk
Copy link
Contributor Author

aoledk commented Aug 5, 2024

/retest

1 similar comment
@aoledk
Copy link
Contributor Author

aoledk commented Aug 5, 2024

/retest

Signed-off-by: Dingkang Li <[email protected]>
@aoledk aoledk dismissed stale reviews from shawnh2 and zhaohuabing via d98507f August 5, 2024 09:03
@aoledk
Copy link
Contributor Author

aoledk commented Aug 5, 2024

/retest

@aoledk
Copy link
Contributor Author

aoledk commented Aug 6, 2024

/retest

@zirain zirain merged commit 3497600 into envoyproxy:main Aug 6, 2024
26 of 27 checks passed
@aoledk aoledk deleted the socket-buffer-api branch August 6, 2024 06:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Socket options SO_SNDBUF and SO_RCVBUF support
7 participants