Fix least request lb not fair #29873
Signed-off-by: Leonardo da Mata <[email protected]>
Signed-off-by: Leonardo da Mata <[email protected]>
Signed-off-by: Leonardo da Mata <[email protected]>
Hi @barroca, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already.
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to
Signed-off-by: Leonardo da Mata <[email protected]>
0c4f25f to 7711181
/assign
Thanks for your contribution. I have added some comments.
At a minimum, a release note is needed to describe what this PR changes or fixes. You can add a new changelog entry in this file: https://github.com/envoyproxy/envoy/blob/main/changelogs/current.yaml
  EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
  EXPECT_CALL(random_, random()).WillOnce(Return(9999));
  EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

// Host weight is 100.
{
  EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
  EXPECT_CALL(random_, random()).WillOnce(Return(9999));
  EXPECT_EQ(hostSet().healthy_hosts_[0], lb_.chooseHost(nullptr));
}

HostVector empty;
{
  hostSet().runCallbacks(empty, empty);
  EXPECT_CALL(random_, random()).WillOnce(Return(0)).WillOnce(Return(2)).WillOnce(Return(3));
  EXPECT_CALL(random_, random()).WillOnce(Return(9999));
I don't understand why these changes are necessary 🤔
The value 9999 itself is not significant, but the number of calls to random() has changed because a full scan is now performed instead.
api/envoy/extensions/load_balancing_policies/least_request/v3/least_request.proto
// The number of random healthy hosts from which the host with the fewest active requests will
// be chosen. Defaults to 2 so that we perform two-choice selection if the field is not set.
This PR changes the behavior of the LB. The comment needs to state that if choice_count is larger than or equal to the number of hosts, a full scan will be used.
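The behavior described in this comment can be sketched as follows. This is a minimal, unweighted illustration and not the code from the PR: the `Host` struct, the `chooseHost` free function, and the injected `random` callback are all hypothetical names chosen for the example, and the real load balancer also handles host weights.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for an upstream host; only the in-flight count matters here.
struct Host {
  uint32_t active_requests;
};

// Returns the index of the chosen host.
// `random` is an injected random source (an assumption for this sketch).
std::size_t chooseHost(const std::vector<Host>& hosts, uint32_t choice_count,
                       bool enable_full_scan, const std::function<uint64_t()>& random) {
  assert(!hosts.empty());
  std::size_t best = 0;
  if (enable_full_scan || choice_count >= hosts.size()) {
    // Full scan: inspect every host and make no random draws at all.
    for (std::size_t i = 1; i < hosts.size(); ++i) {
      if (hosts[i].active_requests < hosts[best].active_requests) {
        best = i;
      }
    }
    return best;
  }
  // N-choice selection: draw candidates at random (with replacement) and keep the least loaded.
  best = random() % hosts.size();
  for (uint32_t i = 1; i < choice_count; ++i) {
    const std::size_t candidate = random() % hosts.size();
    if (hosts[candidate].active_requests < hosts[best].active_requests) {
      best = candidate;
    }
  }
  return best;
}
```

Because the full-scan branch never calls `random`, a test that mocks the random source sees fewer calls once that branch is taken, which is consistent with the expectation changes discussed in the test file above.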
This behavior change should be harmless to end users and may bring this LB's behavior closer to its name, least request.
So I think a changelog entry should be enough, rather than a runtime guard.
But this still needs a check from @envoyproxy/runtime-guard-changes
/wait
…number of choices is larger than the size. Signed-off-by: Leonardo da Mata <[email protected]>
/lgtm api defer to @wbpcode
: absl::nullopt) {
: absl::nullopt),
enable_full_scan_(
    PROTOBUF_GET_WRAPPED_OR_DEFAULT(least_request_config.ref(), enable_full_scan, false)) {
I'm not 100% sure how to do this in the code. When I removed the field from here, I had problems compiling the load_balancer_impl.* files. I would appreciate help with this.
Just keep enable_full_scan_ always false when the legacy API is used. I think that will resolve the compile problem.
/wait
Signed-off-by: Leonardo da Mata <[email protected]>
This LGTM overall. Please check the CI and add a release note, thanks.
@@ -749,6 +752,7 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
  double active_request_bias_{};

  const absl::optional<Runtime::Double> active_request_bias_runtime_;
  const bool enable_full_scan_;
nit: const bool enable_full_scan_{};
/wait
Signed-off-by: Leonardo da Mata <[email protected]>
LGTM. Thanks.
Signed-off-by: Leonardo da Mata <[email protected]>
I'm not sure whether I need someone else to approve, since "envoyproxy/api-shepherds must approve for any API change" is still pending.
Signed-off-by: Leonardo da Mata <[email protected]>
Perhaps @lizan needs to take a look again?
LGTM. Thanks.
…citly for least request lb (envoyproxy#30794)"
This reverts commit e93e556.
Revert "Fix least request lb not fair (envoyproxy#29873)"
This reverts commit 3ea2bc4.
restore api
Signed-off-by: Kuat Yessenov <[email protected]>
fix merge
Signed-off-by: Kuat Yessenov <[email protected]>
We are reverting this change because we noticed a significant impact on clusters with a small number of hosts: the deterministic loop makes every client pick the same hosts. I think we need a better understanding of the impact, with a load simulation, before this change can be re-applied. Please consider adding some independent sampling to the algorithm and we can do a more thorough review again.
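To illustrate the concern about the deterministic loop, here is a small sketch under stated assumptions. It assumes each client tracks only its own in-flight requests (so every client's view starts at zero) and that the scan breaks ties in favour of the lowest index. The function names are hypothetical and this is not the reverted code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Deterministic full scan: ties are always resolved in favour of the lowest index.
std::size_t fullScanFromZero(const std::vector<uint32_t>& active_requests) {
  assert(!active_requests.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < active_requests.size(); ++i) {
    if (active_requests[i] < active_requests[best]) {
      best = i;
    }
  }
  return best;
}

// Every client sends one request. Each client sees only its own in-flight requests,
// so its local view is all zeros (an assumption of this sketch).
// Returns how many requests each host actually receives.
std::vector<uint32_t> simulateFirstRequests(std::size_t num_clients, std::size_t num_hosts) {
  std::vector<uint32_t> received(num_hosts, 0);
  for (std::size_t c = 0; c < num_clients; ++c) {
    const std::vector<uint32_t> local_view(num_hosts, 0);
    ++received[fullScanFromZero(local_view)];
  }
  return received;
}
```

With 100 clients and 3 hosts, this model sends all 100 first requests to host 0, which is the kind of herding the comment describes.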
I understand that changing the behaviour would have impacted people, but it is not clear to me that adding an option to enable full scan would have a bad impact, since it explicitly selects the host with the fewest requests. I'm wondering if we can add it back? Thanks @kyessenov and @wbpcode for fixing this. I wasn't aware that I had introduced breaking issues. :)
The API will be kept and a new implementation is still welcome. @tonya11en is working on a simulated system to ensure we can get a more reasonable implementation in the future. Thanks for your contribution, and we are sorry that we need to revert it. We know it took you a lot of time. :( Hope we can bring it back soon.
If the patch is limited to only the “full scan” flag, it should do what you want without affecting the existing selection behavior.
I noticed the "revert PR" hasn't been merged yet. Instead of reverting, is it sufficient to instead add a patch that starts the full scan at a random index (as suggested by @ggreenway and @tomwans here)? Or are there additional concerns that need to be addressed regarding full scan mode?
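The random-start idea raised in this comment could look roughly like the sketch below. It is only an illustration: `fullScanFromRandomStart` and its `random_value` parameter are hypothetical, and the follow-up PR may implement it differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Full scan that begins at a random offset and wraps around, so that ties between
// equally loaded hosts are no longer always resolved in favour of index 0.
// `random_value` is any value drawn by the caller from its random source.
std::size_t fullScanFromRandomStart(const std::vector<uint32_t>& active_requests,
                                    uint64_t random_value) {
  assert(!active_requests.empty());
  const std::size_t n = active_requests.size();
  const std::size_t start = random_value % n;
  std::size_t best = start;
  for (std::size_t step = 1; step < n; ++step) {
    const std::size_t i = (start + step) % n;
    if (active_requests[i] < active_requests[best]) {
      best = i;
    }
  }
  return best;
}
```

When all hosts are equally loaded, the result is simply the random start index, so independent clients spread out. Note that when only some hosts tie for the minimum, the first one encountered after the start wins, so the tie-break is not perfectly uniform.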
Hello, I have this PR open that changes the behaviour to start from a random index: #31146
* Add new idea for selecting hosts among those not selected yet.
Signed-off-by: Leonardo da Mata <[email protected]>
* Change how we choose full table scan
Signed-off-by: Leonardo da Mata <[email protected]>
* Remove cout
Signed-off-by: Leonardo da Mata <[email protected]>
* Fix Tests for load_balancer_impl_test
Signed-off-by: Leonardo da Mata <[email protected]>
* Fix format and make sure full scan happens only when selected or the number of choices is larger than the size.
Signed-off-by: Leonardo da Mata <[email protected]>
* Enable new option on extensions api only
Signed-off-by: Leonardo da Mata <[email protected]>
* Fix Integration tests.
Signed-off-by: Leonardo da Mata <[email protected]>
* Add release notes for full scan in least request LB.
Signed-off-by: Leonardo da Mata <[email protected]>
* Fix ref for release note.
Signed-off-by: Leonardo da Mata <[email protected]>
* Fix release notes
Signed-off-by: Leonardo da Mata <[email protected]>
* Update release note
Signed-off-by: Leonardo da Mata <[email protected]>
---------
Signed-off-by: Leonardo da Mata <[email protected]>
Signed-off-by: Leonardo da Mata <[email protected]>
Co-authored-by: Leonardo da Mata <[email protected]>
Commit Message: Fix the Least Request LB random pick so that already chosen hosts are removed from the random selection, which eliminates the chance of selecting the same host twice when dealing with a small number of hosts. Also, when the number of choices is greater than or equal to the number of hosts, use a full scan for the least-used host instead.
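One way to remove already chosen hosts from the random draw is a partial Fisher-Yates shuffle over host indices. The sketch below is an assumption about how this could be done, not the PR's code; `chooseWithoutReplacement` and the injected `random` callback are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Picks `choice_count` distinct hosts at random and returns the index of the one
// with the fewest active requests. Requires 1 <= choice_count <= number of hosts.
std::size_t chooseWithoutReplacement(const std::vector<uint32_t>& active_requests,
                                     std::size_t choice_count,
                                     const std::function<uint64_t()>& random) {
  assert(!active_requests.empty());
  const std::size_t n = active_requests.size();
  assert(choice_count >= 1 && choice_count <= n);

  // indices[0..k) holds the hosts already drawn; indices[k..n) holds the remaining ones.
  std::vector<std::size_t> indices(n);
  std::iota(indices.begin(), indices.end(), 0);

  std::size_t best = n;  // Sentinel meaning "no candidate yet".
  for (std::size_t k = 0; k < choice_count; ++k) {
    // Draw only among hosts that have not been chosen yet, then move the pick to slot k.
    const std::size_t pick = k + random() % (n - k);
    std::swap(indices[k], indices[pick]);
    const std::size_t candidate = indices[k];
    if (best == n || active_requests[candidate] < active_requests[best]) {
      best = candidate;
    }
  }
  return best;
}
```

This version allocates an index vector on every call, which keeps the sketch simple but would likely be too costly on a hot path. Once `choice_count` equals the number of hosts it degenerates into a full scan, which fits the second half of the commit message.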
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
This release changes the default behaviour of the Least Request load balancer so that it does a full scan when the number of choices is greater than or equal to the number of hosts. It also adds a new option to the envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest configuration to always do a full scan. Doing a full scan instead of making random choices reduces the chance of selecting a host that does not have the fewest requests when the number of hosts is small.
Platform Specific Features:
[Optional Runtime guard:]
Fixes #11004
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]