
Introduce Least Request LB active request bias config #11252

Merged: 27 commits merged into envoyproxy:master from gkleiman's alternative-lr-weights branch on Jul 15, 2020

Conversation

@gkleiman (Contributor) commented May 19, 2020

Commit Message:
Introduce Least Request LB active request bias config

Additional Description:
This patch adds a runtime-configurable active request bias to the Least Request Load Balancer. The bias only takes effect when not all host weights are equal: the larger it is, the more aggressively active requests lower the effective weight of a host.

When the bias is set to 0.0 and not all host weights are equal, the Least Request LB behaves like the Round Robin LB. This is very useful for squeeze/redline tests and fixes the issues described in #10602.

Risk Level: S
Testing: Unit test. Validated on Lyft infra that this approach fixes the issues described in #10602.

@mattklein123 (Member) left a comment

LGTM with small comments for a test. Thank you!

/wait

@@ -81,6 +81,7 @@ constexpr const char* runtime_features[] = {
constexpr const char* disabled_runtime_features[] = {
// Sentinel and test flag.
"envoy.reloadable_features.test_feature_false",
"envoy.reloadable_features.alternative_least_request_weights",

If we end up doing a real PR for this, it should be a real config option vs. one of these. I would just use a RuntimeFeatureFlag inside the LB config proto.

Comment on lines 460 to 461
alternativeWeights_ = Runtime::runtimeFeatureEnabled(
"envoy.reloadable_features.alternative_least_request_weights");

For perf reasons I would not re-snap this on every choice. I would just do it in refresh below. I think this is fine.
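
A minimal self-contained sketch of the pattern being suggested, using illustrative stand-in types rather than the real Envoy classes: the flag is snapped once per refresh, and the pick path only reads the cached member.

```cpp
#include <cstdint>

// Illustrative stand-ins for the Envoy host types (not the real classes).
struct HostStats { uint64_t rq_active = 0; };
struct Host { uint32_t weight = 1; HostStats stats; };

class LeastRequestLb {
public:
  // Snap the runtime flag once per host-set refresh instead of on every pick.
  void refresh(bool alternative_weights_flag) { alternative_weights_ = alternative_weights_flag; }

  // The per-request pick path only reads the cached member; no runtime lookup here.
  double hostWeight(const Host& host) const {
    return alternative_weights_
               ? static_cast<double>(host.weight)
               : static_cast<double>(host.weight) / static_cast<double>(host.stats.rq_active + 1);
  }

private:
  bool alternative_weights_{false};
};
```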

const uint32_t choice_count_;
bool alternativeWeights_;

alternative_weights_

wgallagher previously approved these changes May 19, 2020
@tonya11en (Member) left a comment

I recognize I have the unpopular opinion here, but this change seems like a hack to work around a lack of operational agility. I think a better approach would be to just add some coefficient (k) that makes the weights look something like:

new_weight = (configured_weight / outstanding_requests^k)

If k=1, things behave as they do today: the weights are normalized by dividing them by the outstanding requests.
If k=0, you get the behavior in this PR and the outstanding requests don't affect the configured weights.

You also have the freedom to configure anything in between and even make the outstanding requests have a more aggressive impact on the weights if you so choose. This would be trivial to implement, can be runtime configurable, and seems less hacky. WDYT?
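
A minimal sketch of the proposed adjustment (plain C++; the function name is illustrative, and the merged patch adds 1 to the outstanding request count before exponentiation, as the diff further down shows):

```cpp
#include <cmath>
#include <cstdint>

// Effective weight of a host given its configured weight, its outstanding
// request count, and the coefficient k (later called the "active request bias").
double effectiveWeight(uint32_t configured_weight, uint64_t outstanding_requests, double k) {
  return static_cast<double>(configured_weight) /
         std::pow(static_cast<double>(outstanding_requests + 1), k);
}

// For a host with weight 2 and 3 outstanding requests:
//   k = 1.0 -> 2 / 4  = 0.5   (today's behavior)
//   k = 0.0 -> 2 / 1  = 2.0   (outstanding requests ignored; weighted round robin)
//   k = 2.0 -> 2 / 16 = 0.125 (outstanding requests penalized more aggressively)
```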

const uint32_t choice_count_;
bool alternativeWeights_;

Also explain what this means exactly in a comment. It's non-obvious from the name.

@mattklein123 (Member)

> I think a better approach would be to just add some coefficient (k) that makes the weights look something like:

+1 I like this idea. Let's do that!

Signed-off-by: Gastón Kleiman <[email protected]>
@gkleiman (Contributor, Author)

> I think a better approach would be to just add some coefficient (k) that makes the weights look something like:
>
> +1 I like this idea. Let's do that!

Good idea! I implemented this with a global runtime. If things work fine in the experiment, I'll make it a real config option inside LeastRequestLbConfig.

@mattklein123 (Member) left a comment

Looks good to test. If the patch looks good I like this idea for a general config flag. Great idea @tonya11en!

/wait

Comment on lines 484 to 485
ENVOY_LOG(debug, "cluster={} address={} active_requests_exponent={} weight={}",
host.cluster().name(), host.address()->asString(), active_requests_exponent_, weight);

definitely trace. I would also probably remove this in the final patch.

Comment on lines 481 to 482
const double weight = static_cast<double>(host.weight()) /
std::pow(host.stats().rq_active_.value() + 1, active_requests_exponent_);

For perf reasons I would probably check if 1.0 and then branch and avoid the pow call.
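
A minimal sketch of that branch (illustrative only; the exact shape in the merged patch may differ):

```cpp
#include <cmath>

// Skip the relatively expensive std::pow call in the common case where the
// bias/exponent is exactly 1.0 (the default behavior).
double applyActiveRequestBias(double active_requests_plus_one, double active_request_bias) {
  return active_request_bias == 1.0
             ? active_requests_plus_one
             : std::pow(active_requests_plus_one, active_request_bias);
}
```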

Signed-off-by: Gastón Kleiman <[email protected]>
@mattklein123 (Member) left a comment

/wait

#include "common/upstream/edf_scheduler.h"

namespace Envoy {
namespace Upstream {

static const std::string RuntimeLeastRequestsActiveRequestsExponent =

absl::string_view to avoid the static initialization order fiasco.
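
A minimal sketch of what the suggestion amounts to (the key string below is a placeholder, not the actual runtime key from the patch):

```cpp
#include "absl/strings/string_view.h"

// A constexpr absl::string_view has no runtime constructor or destructor, so it
// avoids the static initialization order fiasco that a namespace-scope
// static std::string can trigger.
constexpr absl::string_view RuntimeLeastRequestsActiveRequestsExponent =
    "upstream.example_active_requests_exponent_key";  // placeholder value
```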

stale bot commented May 29, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

stale bot added the stale label on May 29, 2020
EXPECT_CALL(random_, random()).WillRepeatedly(Return(0));

// We should see 2:1 ratio for hosts[1] to hosts[0], regardless of the active request count.
hostSet().healthy_hosts_[1]->stats().rq_active_.set(1);

I'm surprised this works. Are the stats affecting the outstanding requests perceived by the LB when the exponent is non-zero?


stale bot removed the stale label on Jun 1, 2020
stale bot commented Jun 9, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

stale bot added the stale label on Jun 9, 2020
@tonya11en (Member)

This was just verified to have the desired effect in our infra, so I'm resetting stalebot.

stale bot removed the stale label on Jun 9, 2020
@jmarantz (Contributor) commented Jul 2, 2020

/wait

Use a plain double instead of a runtime double to store the per-cluster
active request bias.

Note: The goal of this commit is to evaluate the memory overhead of this
approach. A commit with the Least Request LB changes might follow if we
deem the memory overhead of this approach acceptable.

Signed-off-by: Gastón Kleiman <[email protected]>
@jmarantz (Contributor) commented Jul 6, 2020

Per above I'm OK merging what you have temporarily. But I think what we want here is a RuntimeDouble for the process, not one for each cluster. I'm not 100% sure of the best data structure to hold that RuntimeDouble.

@gkleiman (Contributor, Author) commented Jul 7, 2020

> Per above I'm OK merging what you have temporarily. But I think what we want here is a RuntimeDouble for the process, not one for each cluster. I'm not 100% sure of the best data structure to hold that RuntimeDouble.

@jmarantz 👍 I restored the functional changes and merged master — PerWorkerStatsAndBalancing/IPv4 failed the CI coverage run, but I did over 1000 local runs and they all succeeded (bazel test test/integration:stats_integration_test --test_filter="*PerWorkerStatsAndBalancing*" --runs_per_test=1000).

@gkleiman (Contributor, Author) commented Jul 8, 2020

/retest

Looks like Azure Pipelines is degraded and the Linux coverage job failed with the following error:

We stopped hearing from agent i-04515e37cd973b99f. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error. For more information, see: https://go.microsoft.com/fwlink/?linkid=846610

@repokitteh-read-only

🐴 hold your horses - no failures detected, yet.

🐱

Caused by: a #11252 (comment) was created by @gkleiman.


@jmarantz (Contributor) commented Jul 9, 2020

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jmarantz (Contributor) commented Jul 9, 2020

Summarizing DM discussion on Slack, we are going to move the RuntimeDouble out of the cluster configuration and into bootstrap. The problem with my earlier suggestion to merge this is that it creates an API at the cluster level that will be hard to remove. @gkleiman indicated he didn't think it would be too bad to drive the config decision from the bootstrap proto.

@gkleiman (Contributor, Author) commented Jul 10, 2020

I've changed the place/way where the active request bias is stored a few times already since I began working on this, so I would love to reach consensus between Lyft & external maintainers before changing the implementation again.

Here is a summary of Lyft's requirements, the findings about the extra per-cluster memory footprint, and the possible approaches. I hope this makes the conversation a bit easier. It turned out to be a bit long for a PR comment, so I would also be happy to move it to a Google Doc so people can comment on it.

Lyft Requirements

In order to improve Envoy's load balancing behavior during squeeze tests, we would like to make it possible for the least request load balancer to ignore the active request count when not all host weights are equal.

We would configure this change globally for all Envoy instances in our service mesh. Being able to override the active request bias for a particular cluster would be nice to have but is currently not required.

Findings

  1. The stats memory integration tests report that with the current approach (approach 1 below) the per-cluster memory usage increases by 256 bytes. That would be 2.56 MB for an Envoy instance with 10k clusters.
  2. The memory tests don’t create any LBs, so the extra memory usage is attributable to the new field in LeastRequestLbConfig and not to the changes in the LB implementation. Note that even though RuntimeDouble contains a runtime key, the memory tests don’t create any instances of the new field, so the actual memory usage for users of this feature that specify runtime keys would be higher than reported by the test.
  3. When adding three fields to LeastRequestLbConfig the memory tests still report 256 bytes of extra per-cluster memory usage. This hints that the extra memory usage might be related to alignment/padding or other optimizations.
  4. As far as I could tell from looking at the code and using a debugger, no instances of LeastRequestLbConfig should be created by the tests. It is still unclear why more memory is used when we define a field but don't create any instances of the message containing it.

Feedback

@tonya11en and @mattklein123 initially proposed the current approach: a per-cluster RuntimeDouble in LeastRequestLbConfig (approach 1).

@jmarantz suggested making the active request bias a global setting instead of a per-cluster setting (approach 4).

@alyssawilk suggested using scalars for changing per-cluster values via CDS rather than using runtime overrides (approach 2).

Approaches

1. Per-Cluster - RuntimeDouble field in CDS LeastRequestLbConfig (Current implementation)

Pro: Easy to implement. Most flexible approach.
Con: Largest memory usage of all approaches (need to store a runtime key per load balancer). Dependency on a v3 xDS server.

2. Per-Cluster - Scalar field in CDS LeastRequestLbConfig

Pro: Smaller memory footprint than using RuntimeDouble, since there is no need to store runtime keys (although the memory tests still report 256 bytes of extra memory). Very easy to implement.
Con: Dependency on a v3 xDS server. Little flexibility; not using runtime means that any changes have to be made via an xDS server.

3. Global Runtime - Hard-coded runtime key (Original implementation)

We could define a global runtime key that can be used to globally configure the active request bias used by all least request LBs, e.g., upstream.least_request_load_balancer.active_request_bias.

Pro: No additional memory footprint. Very easy to implement. Easy to adopt, no dependency on a v3 xDS server.
Con: Little flexibility; using a hard-coded runtime key makes it harder to use different values for different Envoy instances that share the same runtime layers.

4. Global Runtime - Move the RuntimeDouble out of the cluster configuration and into bootstrap

We could move the RuntimeDouble to the bootstrap config message, e.g., we could move it to Bootstrap.ClusterManager.LbConfig.LeastRequestLbConfig.active_request_bias.

To avoid the additional per-cluster memory usage, we could add a Runtime::Double instance owned by the cluster manager and figure out how to share it among load balancers on different threads.

Pro: Small memory footprint. Easy to adopt, no dependency on a v3 xDS server.
Con: Intermediate flexibility. No way of configuring per-cluster values, but it’s possible to use different runtime keys for different Envoy instances. Implementation is a bit more complex than the other approaches.
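
For reference, a rough self-contained model of the per-cluster RuntimeDouble semantics used by approach 1 (the approach that was ultimately merged); the type and field names here are illustrative, not the merged code:

```cpp
#include <optional>

// Rough model of a per-cluster RuntimeDouble: a hard-coded default value that a
// runtime layer may override via the configured runtime key. Names are illustrative.
struct ActiveRequestBiasConfig {
  double default_value = 1.0;              // 1.0 preserves the current least-request behavior
  std::optional<double> runtime_override;  // value looked up under the configured runtime key

  // The load balancer can snap this once per host-set refresh rather than per pick.
  double resolve() const { return runtime_override ? *runtime_override : default_value; }
};
```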

@jmarantz (Contributor)

Thanks for the detailed note.

It's clear we'll have to spend some memory to get the per-cluster configurability for this feature. I'm fine if that's the consensus. The amount of memory we are spending was surprising, which I believe is due to Envoy's usage of protobufs as retained data structures. I'm not sure how big a task it would be to move away from protobufs as a retained data structure and use them only for transport, but that could help get that memory back and perhaps then some. That would obviously be outside the scope of this PR. Perhaps someone could own the task of evaluating that optimization as a follow-up.

In the scheme of things, 2.5 MB of memory is obviously not a lot for a proxy meant to handle 10k clusters, so I'm OK merging this in its current form. I had pushed back earlier because I thought the per-cluster config might not be strictly needed and would be hard to remove once added. It sounds like it would be useful at least.

@mattklein123 (Member)

I think we should move this PR forward as-is with the per-cluster increase. If you want to leave some TODOs we can do that but I think the proto issues are out of scope for this PR?

@jmarantz (Contributor)

agreed; happy to see it merged and iterate from there.

@gkleiman (Contributor, Author)

@mattklein123 IpVersionsClientType/VhdsIntegrationTest.VhdsOnDemandUpdateHttpConnectionCloses/0 failed the tsan run due to a data race that doesn't seem to be related to this PR.

@alyssawilk (Contributor) commented Jul 14, 2020 via email

@jmarantz merged commit 9f7d448 into envoyproxy:master on Jul 15, 2020
KBaichoo pushed a commit to KBaichoo/envoy that referenced this pull request Jul 30, 2020
* Add support for making Least Requests LB behave like Round Robin in weighted hosts case

Signed-off-by: Gastón Kleiman <[email protected]>

* Address feedback

Signed-off-by: Gastón Kleiman <[email protected]>

* Perf/logging improvements

Signed-off-by: Gastón Kleiman <[email protected]>

* Address feedback and cleanup BUILD file

Signed-off-by: Gastón Kleiman <[email protected]>

* Make active requests exponent configurable via CDS/runtime

Signed-off-by: Gastón Kleiman <[email protected]>

* Address feedback

Signed-off-by: Gastón Kleiman <[email protected]>

* Validate log message

Signed-off-by: Gastón Kleiman <[email protected]>

* Update cluster memory test golden values

Signed-off-by: Gastón Kleiman <[email protected]>

* Fix method name

Signed-off-by: Gastón Kleiman <[email protected]>

* Explicitly initialize active_request_bias_

Signed-off-by: Gastón Kleiman <[email protected]>

* Try to make clang-tidy happy

Signed-off-by: Gastón Kleiman <[email protected]>

* Use unique_ptr instead of optional

Signed-off-by: Gastón Kleiman <[email protected]>

* Update stats integration test

Signed-off-by: Gastón Kleiman <[email protected]>

* Check whether memory footprint is reduced without LB changes

Signed-off-by: Gastón Kleiman <[email protected]>

* Use plain double for active request bias

Use a plain double instead of a runtime double to store the per-cluster
active request bias.

Note: The goal of this commit is to evaluate the memory overhead of this
approach. A commit with the Least Request LB changes might follow if we
deem the memory overhead of this approach acceptable.

Signed-off-by: Gastón Kleiman <[email protected]>

* Revert back to approved implementation using RuntimeDouble

Signed-off-by: Gastón Kleiman <[email protected]>

* Add extra fields to CDS cluster proto to check memory usage

Signed-off-by: Gastón Kleiman <[email protected]>

* Revert "Add extra fields to CDS cluster proto to check memory usage"

This reverts commit a6a285d.

Signed-off-by: Gastón Kleiman <[email protected]>

* Add changelog entry

Signed-off-by: Gastón Kleiman <[email protected]>
Signed-off-by: Kevin Baichoo <[email protected]>
scheler pushed a commit to scheler/envoy that referenced this pull request Aug 4, 2020
@gkleiman deleted the alternative-lr-weights branch on August 26, 2020