KubeGateway currently applies rate limiting locally in each instance. This requires no additional dependencies and is simple to implement, but it has several issues:
The quotas are inaccurate. Each gateway instance enforces only its own quota. Long-lived HTTP/2 connections between the client and the gateway can concentrate requests on a few gateway instances, so a client may be throttled well below the total configured rate limiting quota.
The precision of rate limiting thresholds is poor. When the number of gateway instances is scaled up, the total rate limiting quota across all instances increases, so the threshold for each instance has to be readjusted. For requests with small rate limiting thresholds, such as "list", it is difficult to limit the flow precisely.
The round-robin load balancing strategy cannot guarantee that requests are strictly balanced across backend apiserver instances. Even a slight imbalance in expensive requests such as "full list" can put significant pressure on an apiserver.
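The first two issues can be illustrated with a little arithmetic. The sketch below assumes the operator derives each instance's quota by dividing a desired cluster-wide limit evenly across instances; the function names and the numbers (100 QPS, 4 and 8 instances) are hypothetical and are not taken from KubeGateway.

```go
package main

import "fmt"

// perInstanceQuota splits a desired cluster-wide limit evenly across
// gateway instances (an assumed way of choosing the local quota).
func perInstanceQuota(totalQPS float64, instances int) float64 {
	return totalQPS / float64(instances)
}

// effectiveQuota is the most a client can use when its long-lived
// connections reach only `reached` of the gateway instances.
func effectiveQuota(totalQPS float64, instances, reached int) float64 {
	return perInstanceQuota(totalQPS, instances) * float64(reached)
}

// aggregateQuota is the real cluster-wide limit when every instance
// keeps enforcing the same local quota.
func aggregateQuota(localQPS float64, instances int) float64 {
	return localQPS * float64(instances)
}

func main() {
	const total = 100.0 // hypothetical cluster-wide target in QPS

	// A client pinned to 1 of 4 instances sees only a quarter of the quota.
	fmt.Println("pinned client quota:", effectiveQuota(total, 4, 1))

	// Scaling from 4 to 8 instances without retuning doubles the real limit.
	local := perInstanceQuota(total, 4)
	fmt.Println("aggregate after scale-up:", aggregateQuota(local, 8))
}
```

With small thresholds the same division leaves each instance with a fractional or very low quota, which is why requests like "list" are hard to limit precisely under the local scheme.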
To address these issues, we can integrate a global rate limiting center and implement a global rate limiting strategy. The gateway would support both local rate limiting and integration with the rate limiting center. When the rate limiting center is unavailable, local rate limiting serves as the fallback, so the rate limiting center is only a weak dependency of KubeGateway. While a data center is being built out, local rate limiting is used first, and the gateway is integrated with the rate limiting center once that has been deployed.
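A minimal sketch of the weak-dependency behavior described above, using only the Go standard library. The `Limiter` interface, `fallbackLimiter`, `countLimiter`, and `unavailableCenter` are hypothetical names introduced for illustration; they are not KubeGateway APIs, and the toy counter stands in for a real token bucket.

```go
package main

import (
	"errors"
	"fmt"
)

// Limiter decides whether one request identified by key may proceed.
// A non-nil error means the limiter itself could not answer.
type Limiter interface {
	Allow(key string) (bool, error)
}

// fallbackLimiter consults the global center first and falls back to the
// local limiter when the center is not configured or returns an error.
type fallbackLimiter struct {
	global Limiter // may be nil before the center is deployed
	local  Limiter
}

func (f fallbackLimiter) Allow(key string) (bool, error) {
	if f.global != nil {
		if ok, err := f.global.Allow(key); err == nil {
			return ok, nil // the center answered; its decision is final
		}
		// Center unavailable: treat it as a weak dependency and fall through.
	}
	return f.local.Allow(key)
}

// countLimiter is a toy limiter that admits a fixed number of requests.
type countLimiter struct{ remaining int }

func (c *countLimiter) Allow(string) (bool, error) {
	if c.remaining <= 0 {
		return false, nil
	}
	c.remaining--
	return true, nil
}

// unavailableCenter simulates a rate limiting center that cannot be reached.
type unavailableCenter struct{}

func (unavailableCenter) Allow(string) (bool, error) {
	return false, errors.New("rate limiting center unavailable")
}

func main() {
	l := fallbackLimiter{global: unavailableCenter{}, local: &countLimiter{remaining: 2}}
	for i := 0; i < 3; i++ {
		ok, _ := l.Allow("list")
		fmt.Println("request", i, "allowed:", ok)
	}
}
```

In this sketch a denial from the center is respected and does not trigger the fallback; only an error does. Whether to fail open or fall back on other conditions, such as timeouts, is a design choice the proposal leaves open.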