Enable LoadBalancing Extensions #15827

Closed
Changes from 2 commits
4 changes: 1 addition & 3 deletions api/envoy/config/cluster/v3/cluster.proto
@@ -109,7 +109,6 @@ message Cluster {
// this option or not.
CLUSTER_PROVIDED = 6;

// [#not-implemented-hide:] Use the new :ref:`load_balancing_policy
Contributor

Please also remove the [#not-implemented-hide:] annotation on the load_balancing_policy field (line 971) and the LoadBalancingPolicy message (line 1036).

// <envoy_api_field_config.cluster.v3.Cluster.load_balancing_policy>` field to determine the LB policy.
// [#next-major-version: In the v3 API, we should consider deprecating the lb_policy field
// and instead using the new load_balancing_policy field as the one and only mechanism for
@@ -718,8 +717,7 @@ message Cluster {

// The :ref:`load balancer type <arch_overview_load_balancing_types>` to use
// when picking a host in the cluster.
-// [#comment:TODO: Remove enum constraint :ref:`LOAD_BALANCING_POLICY_CONFIG<envoy_api_enum_value_config.cluster.v3.Cluster.LbPolicy.LOAD_BALANCING_POLICY_CONFIG>` when implemented.]
-LbPolicy lb_policy = 6 [(validate.rules).enum = {defined_only: true not_in: 7}];
+LbPolicy lb_policy = 6 [(validate.rules).enum = {defined_only: true}];

// Setting this is required for specifying members of
// :ref:`STATIC<envoy_api_enum_value_config.cluster.v3.Cluster.DiscoveryType.STATIC>`,
4 changes: 1 addition & 3 deletions api/envoy/config/cluster/v4alpha/cluster.proto

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 1 addition & 3 deletions generated_api_shadow/envoy/config/cluster/v3/cluster.proto

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

28 changes: 28 additions & 0 deletions include/envoy/upstream/load_balancer.h
@@ -111,6 +111,34 @@ class LoadBalancer {

using LoadBalancerPtr = std::unique_ptr<LoadBalancer>;

/**
* Context passed to load balancer factory to access server resources.
*/
class LoadBalancerFactoryContext {
public:
virtual ~LoadBalancerFactoryContext() = default;

/**
* @return ProtobufMessage::ValidationVisitor& validation visitor for load balancer configuration
* messages.
*/
virtual ProtobufMessage::ValidationVisitor& messageValidationVisitor() PURE;
};

class TypedLoadBalancerFactory : public Config::UntypedFactory {
public:
~TypedLoadBalancerFactory() override = default;

virtual LoadBalancerPtr
Contributor

Before we actually implement this, I think we should spend some time thinking about what the LB policy API will actually look like, since this is an area where we need to consider semantic consistency between gRPC and Envoy, at least in terms of what gets expressed in the xDS API.

There are some significant differences between the existing LB policy APIs in Envoy and gRPC. In particular, Envoy's LB policy is given a set of hosts that are already health-checked, and it picks a host synchronously without any knowledge of what connection(s) may or may not already exist in the individual hosts. In contrast, gRPC's LB policy is given a list of addresses, and it is responsible for creating and managing connections to those addresses; its API allows it to tell the channel to queue a request until the LB policy has a connection capable of sending it on.

CC @pianiststickman @htuch
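
To make the contrast described above concrete, here is a minimal C++ sketch. Every name in it (`Host`, `pickSync`, `pickOrQueue`, `PickStatus`, `PickResult`) is a hypothetical stand-in and not an Envoy or gRPC API. It only illustrates the difference between a synchronous pick over hosts that were health-checked elsewhere and a pick that may tell the caller to queue the request until a connection is ready.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in type; neither Envoy nor gRPC defines it.
struct Host {
  int id;
  bool healthy;   // Envoy-style: health is decided before the LB sees the host.
  bool connected; // gRPC-style: the policy itself tracks connection state.
};

// Envoy-style: pick synchronously (round robin here) from hosts that were
// already health-checked, with no knowledge of existing connections.
inline std::optional<int> pickSync(const std::vector<Host>& hosts, std::size_t& cursor) {
  for (std::size_t i = 0; i < hosts.size(); ++i) {
    const Host& host = hosts[(cursor + i) % hosts.size()];
    if (host.healthy) {
      cursor = (cursor + i + 1) % hosts.size();
      return host.id;
    }
  }
  return std::nullopt; // No healthy host is available.
}

// gRPC-style: the policy owns connections and may ask the caller to queue.
enum class PickStatus { Complete, Queue };

struct PickResult {
  PickStatus status;
  int host_id; // Meaningful only when status == PickStatus::Complete.
};

inline PickResult pickOrQueue(const std::vector<Host>& addresses) {
  for (const Host& address : addresses) {
    if (address.connected) {
      return {PickStatus::Complete, address.id};
    }
  }
  // No connection is ready yet: the caller should hold the request and retry.
  return {PickStatus::Queue, -1};
}
```

An xDS-level policy description that has to serve both models would presumably need to avoid assuming either return shape, which seems to be the semantic-consistency concern raised here.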

Member

I think it would be a useful exercise to try to understand how the existing LBs in Envoy could be migrated to extensions if we wanted to do this in the future. That would tell us whether this interface is sufficient to express what LB policies might want.

Member

I think it's worthwhile to think about the future of the LB interface, but I would prefer to not over-complicate this PR, as it is addressing a longstanding issue of not being able to extend the existing Envoy definition of what a load balancer is. Given that the configuration for this is opaque, I don't think we need to conflate any future load balancing changes with this PR which is really just allowing an extension point on top of an existing interface?

Member

That's fine. I just wouldn't consider #5598 fully resolved until we can at least state how we could migrate the static load balancers across to this new interface. Having looked at the ClusterEntry constructor, I think it might be sufficient here, but probably worth some comments in the PR or issue.

Member

> I just wouldn't consider #5598 fully resolved until we can at least state how we could migrate the static load balancers across to this new interface. Having looked at the ClusterEntry constructor, I think it might be sufficient here, but probably worth some comments in the PR or issue.

I think there are 2 different issues here:

  1. @markdroth's concerns about the future of load balancing. I think this is a very important convo but IMO it is out of scope for this change.
  2. @htuch's concern about whether this interface is sufficient to migrate all existing LBs. IMO this is required for this change. Not to actually do the migration, but please check all existing LB constructors and make sure we are passing sufficient info into the create function such that they all would work. (Bonus points for actually doing the migration.)

create(const envoy::config::cluster::v3::LoadBalancingPolicy::Policy& policy,
LoadBalancerType load_balancer_type, LoadBalancerFactoryContext& context,
const PrioritySet& priority_set, const PrioritySet* local_priority_set,
ClusterStats& cluster_stats, Runtime::Loader& loader, Random::RandomGenerator& random,
const envoy::config::cluster::v3::Cluster::CommonLbConfig& common_config) PURE;

std::string category() const override { return "envoy.load_balancers"; }
};

/**
* Factory for load balancers.
*/
3 changes: 2 additions & 1 deletion include/envoy/upstream/load_balancer_type.h
@@ -22,7 +22,8 @@ enum class LoadBalancerType {
RingHash,
OriginalDst,
Maglev,
-ClusterProvided
+ClusterProvided,
+LoadBalancingPolicyConfig
};

/**
6 changes: 6 additions & 0 deletions include/envoy/upstream/upstream.h
@@ -787,6 +787,12 @@ class ClusterInfo {
return std::dynamic_pointer_cast<const Derived>(extensionProtocolOptions(name));
}

/**
* @return const envoy::config::cluster::v3::LoadBalancingPolicy& the load balancing policy
* configuration for this cluster.
*/
virtual const envoy::config::cluster::v3::LoadBalancingPolicy& loadBalancingPolicy() const PURE;

/**
* @return const envoy::config::cluster::v3::Cluster::CommonLbConfig& the common configuration for
* all load balancers for this cluster.
2 changes: 2 additions & 0 deletions source/common/upstream/BUILD
@@ -202,6 +202,8 @@ envoy_cc_library(
"//include/envoy/upstream:load_balancer_interface",
"//include/envoy/upstream:upstream_interface",
"//source/common/common:assert_lib",
"//source/common/config:utility_lib",
"//source/common/config:well_known_names",
"//source/common/protobuf:utility_lib",
"//source/common/runtime:runtime_protos_lib",
"@envoy_api//envoy/config/cluster/v3:pkg_cc_proto",
27 changes: 26 additions & 1 deletion source/common/upstream/cluster_manager_impl.cc
@@ -253,7 +253,7 @@ ClusterManagerImpl::ClusterManagerImpl(
ProtobufMessage::ValidationContext& validation_context, Api::Api& api,
Http::Context& http_context, Grpc::Context& grpc_context, Router::Context& router_context)
: factory_(factory), runtime_(runtime), stats_(stats), tls_(tls),
-random_(api.randomGenerator()),
+random_(api.randomGenerator()), validation_context_(validation_context),
bind_config_(bootstrap.cluster_manager().upstream_bind_config()), local_info_(local_info),
cm_stats_(generateStats(stats)),
init_helper_(*this, [this](ClusterManagerCluster& cluster) { onClusterInit(cluster); }),
@@ -1348,6 +1348,31 @@ ClusterManagerImpl::ThreadLocalClusterManagerImpl::ClusterEntry::ClusterEntry(
parent.parent_.random_, cluster->lbConfig());
break;
}
case LoadBalancerType::LoadBalancingPolicyConfig: {
Member

This occurs off of the main thread, so this won't work since we need to verify the correctness of the configuration prior to when we get here. The way I would recommend doing this is processing the configuration on the main thread next to the other LBs, and storing a factory that we know will work, and then using that factory in the appropriate process in worker context.
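
One possible reading of this recommendation, sketched with simplified stand-in types: resolve and validate on the main thread with a lookup that throws, keep the resulting creator, and let workers only instantiate from it. `Registry`, `ClusterConfig`, `makeWorkerLb`, and `FirstHostLb` are hypothetical names for illustration; Envoy's actual factory registry and thread-local machinery are not modeled here.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; these are not Envoy's types.
struct LoadBalancer {
  virtual ~LoadBalancer() = default;
  virtual int chooseHost() = 0;
};
using LoadBalancerPtr = std::unique_ptr<LoadBalancer>;
using LbCreator = std::function<LoadBalancerPtr()>;

struct FirstHostLb : LoadBalancer {
  int chooseHost() override { return 0; }
};

class Registry {
public:
  void add(const std::string& name, LbCreator creator) { creators_[name] = std::move(creator); }

  // Throwing lookup: intended for the main thread, while config is validated.
  const LbCreator& getOrThrow(const std::string& name) const {
    auto it = creators_.find(name);
    if (it == creators_.end()) {
      throw std::invalid_argument("no load balancer registered for '" + name + "'");
    }
    return it->second;
  }

private:
  std::map<std::string, LbCreator> creators_;
};

// Main thread: construction fails here if the policy name is unknown, so a
// bad config is rejected before any worker sees it.
struct ClusterConfig {
  ClusterConfig(const Registry& registry, const std::string& policy_name)
      : lb_creator(registry.getOrThrow(policy_name)) {}
  LbCreator lb_creator;
};

// Worker thread: only instantiate; no lookup or validation happens here.
inline LoadBalancerPtr makeWorkerLb(const ClusterConfig& config) { return config.lb_creator(); }
```

With this split, the worker-side path has no failure mode tied to configuration, which is the property the comment asks for.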

Member

There already is a lb_factory_. I noticed this is used for the consistent LBs that are built on the main thread, I wonder if this pattern somehow can be reused.

Author

This commit (5e8ce8c) attempts to mimic lb_factory_ more closely: getFactory is now pulled in to the main thread. I had a hard time coming up with a scenario where I could trigger a config error in the non-main threads, and I can't find where this verification comes from, so this is not much more than a shot in the dark.

I'll continue investigating this, and any extra pointers may speed me up!

ASSERT(lb_factory_ == nullptr);
for (const auto& policy : cluster->loadBalancingPolicy().policies()) {
LoadBalancerFactoryContextImpl context(
parent_.parent_.validation_context_.staticValidationVisitor());
TypedLoadBalancerFactory* factory =
Registry::FactoryRegistry<TypedLoadBalancerFactory>::getFactory(policy.name());

if (factory == nullptr) {
ENVOY_LOG(warn, fmt::format("Didn't find a registered implementation for name: '{}'",
policy.name()));
continue;
Member

Once we fix the above comment you can use the throwing version of the factory lookup functions.

}

lb_ = factory->create(policy, cluster->lbType(), context, priority_set_,
parent_.local_priority_set_, cluster->stats(),
parent.parent_.runtime_, parent.parent_.random_, cluster->lbConfig());
break;
}
if (lb_ == nullptr) {
ENVOY_LOG(critical, "Didn't find any registered implementation for load_balancing_policy.");
ASSERT(lb_ != nullptr);
}
Comment on lines +1370 to +1373
Member

I would just make this RELEASE_ASSERT as this shouldn't happen.

break;
}
case LoadBalancerType::ClusterProvided:
case LoadBalancerType::RingHash:
case LoadBalancerType::Maglev:
1 change: 1 addition & 0 deletions source/common/upstream/cluster_manager_impl.h
@@ -582,6 +582,7 @@ class ClusterManagerImpl : public ClusterManager, Logger::Loggable<Logger::Id::u

protected:
ClusterMap active_clusters_;
ProtobufMessage::ValidationContext& validation_context_;

private:
ClusterMap warming_clusters_;
2 changes: 2 additions & 0 deletions source/common/upstream/load_balancer_impl.cc
@@ -10,9 +10,11 @@
#include "envoy/upstream/upstream.h"

#include "common/common/assert.h"
#include "common/config/well_known_names.h"
#include "common/protobuf/utility.h"

#include "absl/container/fixed_array.h"
#include "absl/types/optional.h"

namespace Envoy {
namespace Upstream {
46 changes: 46 additions & 0 deletions source/common/upstream/load_balancer_impl.h
@@ -14,6 +14,7 @@
#include "envoy/upstream/load_balancer.h"
#include "envoy/upstream/upstream.h"

#include "common/config/utility.h"
#include "common/protobuf/utility.h"
#include "common/runtime/runtime_protos.h"
#include "common/upstream/edf_scheduler.h"
@@ -24,6 +25,51 @@ namespace Upstream {
// Priority levels and localities are considered overprovisioned with this factor.
static constexpr uint32_t kDefaultOverProvisioningFactor = 140;

template <class ConfigProto>
class ConfigurableTypedLoadBalancerFactory : public TypedLoadBalancerFactory {
virtual ProtobufTypes::MessagePtr createEmptyConfigProto() {
return std::make_unique<ConfigProto>();
}

LoadBalancerPtr
create(const envoy::config::cluster::v3::LoadBalancingPolicy::Policy& policy,
LoadBalancerType load_balancer_type, LoadBalancerFactoryContext& context,
const PrioritySet& priority_set, const PrioritySet* local_priority_set,
ClusterStats& cluster_stats, Runtime::Loader& loader, Random::RandomGenerator& random,
const envoy::config::cluster::v3::Cluster::CommonLbConfig& common_config) override {
ProtobufTypes::MessagePtr config = createEmptyConfigProto();

Envoy::Config::Utility::translateOpaqueConfig(policy.typed_config(),
ProtobufWkt::Struct::default_instance(),
context.messageValidationVisitor(), *config);

return createLoadBalancerWithConfig(load_balancer_type, priority_set, local_priority_set,
cluster_stats, loader, random, common_config,
MessageUtil::downcastAndValidate<const ConfigProto&>(
*config, context.messageValidationVisitor()));
}

virtual LoadBalancerPtr
createLoadBalancerWithConfig(LoadBalancerType, const PrioritySet&, const PrioritySet*,
ClusterStats&, Runtime::Loader&, Random::RandomGenerator&,
const envoy::config::cluster::v3::Cluster::CommonLbConfig&,
const ConfigProto&) PURE;
};

class LoadBalancerFactoryContextImpl : public LoadBalancerFactoryContext {

public:
LoadBalancerFactoryContextImpl(ProtobufMessage::ValidationVisitor& validation_visitor)
: validation_visitor_(validation_visitor) {}

ProtobufMessage::ValidationVisitor& messageValidationVisitor() override {
return validation_visitor_;
}

private:
ProtobufMessage::ValidationVisitor& validation_visitor_;
};

/**
* Base class for all LB implementations.
*/
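
The shape of `ConfigurableTypedLoadBalancerFactory` in the hunk above (a base class unpacks and validates the opaque config, then hands a typed config to a subclass hook) can be sketched without Envoy dependencies. This is only an illustration under simplifying assumptions: `OpaqueConfig`, `TypedFactory`, `FixedHostConfig`, `FixedHostLb`, and `FixedHostFactory` are hypothetical names, and a plain struct stands in for the protobuf `typed_config`.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; real code would use Envoy's protobuf and LB types.
struct OpaqueConfig { // Plays the role of policy.typed_config().
  std::string type;
  int value;
};

struct LoadBalancer {
  virtual ~LoadBalancer() = default;
  virtual int chooseHost() = 0;
};
using LoadBalancerPtr = std::unique_ptr<LoadBalancer>;

// Plays the role of an extension's ConfigProto, including its validation.
struct FixedHostConfig {
  int host_index;
  static FixedHostConfig parse(const OpaqueConfig& opaque) {
    if (opaque.type != "example.FixedHostConfig" || opaque.value < 0) {
      throw std::invalid_argument("invalid FixedHostConfig");
    }
    return FixedHostConfig{opaque.value};
  }
};

// Base class: translate the opaque config once, then call the typed hook.
template <class Config> class TypedFactory {
public:
  virtual ~TypedFactory() = default;
  LoadBalancerPtr create(const OpaqueConfig& opaque) {
    return createWithConfig(Config::parse(opaque));
  }

protected:
  virtual LoadBalancerPtr createWithConfig(const Config& config) = 0;
};

struct FixedHostLb : LoadBalancer {
  explicit FixedHostLb(int index) : index_(index) {}
  int chooseHost() override { return index_; }
  int index_;
};

// An extension only implements the typed hook.
class FixedHostFactory : public TypedFactory<FixedHostConfig> {
protected:
  LoadBalancerPtr createWithConfig(const FixedHostConfig& config) override {
    return std::make_unique<FixedHostLb>(config.host_index);
  }
};
```

The design choice this illustrates is that extension authors never touch the opaque message directly; a malformed or mistyped config is rejected in one place before any load balancer is built.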
1 change: 1 addition & 0 deletions source/common/upstream/subset_lb.cc
@@ -779,6 +779,7 @@ SubsetLoadBalancer::PrioritySubsetImpl::PrioritySubsetImpl(const SubsetLoadBalan
lb_ = thread_aware_lb_->factory()->create();
break;

case LoadBalancerType::LoadBalancingPolicyConfig:
case LoadBalancerType::OriginalDst:
case LoadBalancerType::ClusterProvided:
// LoadBalancerType::OriginalDst is blocked in the factory. LoadBalancerType::ClusterProvided
4 changes: 4 additions & 0 deletions source/common/upstream/upstream_impl.cc
@@ -775,6 +775,7 @@ ClusterInfoImpl::ClusterInfoImpl(
added_via_api_(added_via_api),
lb_subset_(LoadBalancerSubsetInfoImpl(config.lb_subset_config())),
metadata_(config.metadata()), typed_metadata_(config.metadata()),
load_balancing_policy_(config.load_balancing_policy()),
common_lb_config_(config.common_lb_config()),
cluster_socket_options_(parseClusterSocketOptions(config, bind_config)),
drain_connections_on_host_removal_(config.ignore_health_on_host_removal()),
@@ -830,6 +831,9 @@ ClusterInfoImpl::ClusterInfoImpl(

lb_type_ = LoadBalancerType::ClusterProvided;
break;
case envoy::config::cluster::v3::Cluster::LOAD_BALANCING_POLICY_CONFIG:
lb_type_ = LoadBalancerType::LoadBalancingPolicyConfig;
break;
default:
NOT_REACHED_GCOVR_EXCL_LINE;
}
4 changes: 4 additions & 0 deletions source/common/upstream/upstream_impl.h
@@ -542,6 +542,9 @@ class ClusterInfoImpl : public ClusterInfo,

// Upstream::ClusterInfo
bool addedViaApi() const override { return added_via_api_; }
const envoy::config::cluster::v3::LoadBalancingPolicy& loadBalancingPolicy() const override {
return load_balancing_policy_;
}
const envoy::config::cluster::v3::Cluster::CommonLbConfig& lbConfig() const override {
return common_lb_config_;
}
@@ -719,6 +722,7 @@ class ClusterInfoImpl : public ClusterInfo,
LoadBalancerSubsetInfoImpl lb_subset_;
const envoy::config::core::v3::Metadata metadata_;
Envoy::Config::TypedMetadataImpl<ClusterTypedMetadataFactory> typed_metadata_;
const envoy::config::cluster::v3::LoadBalancingPolicy load_balancing_policy_;
const envoy::config::cluster::v3::Cluster::CommonLbConfig common_lb_config_;
const Network::ConnectionSocket::OptionsSharedPtr cluster_socket_options_;
const bool drain_connections_on_host_removal_;
1 change: 1 addition & 0 deletions test/mocks/upstream/cluster_info.cc
@@ -102,6 +102,7 @@ MockClusterInfo::MockClusterInfo()
ON_CALL(*this, lbMaglevConfig()).WillByDefault(ReturnRef(lb_maglev_config_));
ON_CALL(*this, lbOriginalDstConfig()).WillByDefault(ReturnRef(lb_original_dst_config_));
ON_CALL(*this, upstreamConfig()).WillByDefault(ReturnRef(upstream_config_));
ON_CALL(*this, loadBalancingPolicy()).WillByDefault(ReturnRef(load_balancing_policy_));
ON_CALL(*this, lbConfig()).WillByDefault(ReturnRef(lb_config_));
ON_CALL(*this, clusterSocketOptions()).WillByDefault(ReturnRef(cluster_socket_options_));
ON_CALL(*this, metadata()).WillByDefault(ReturnRef(metadata_));
3 changes: 3 additions & 0 deletions test/mocks/upstream/cluster_info.h
@@ -104,6 +104,8 @@ class MockClusterInfo : public ClusterInfo {
(const));
MOCK_METHOD(ProtocolOptionsConfigConstSharedPtr, extensionProtocolOptions, (const std::string&),
(const));
MOCK_METHOD(const envoy::config::cluster::v3::LoadBalancingPolicy&, loadBalancingPolicy, (),
(const));
MOCK_METHOD(const envoy::config::cluster::v3::Cluster::CommonLbConfig&, lbConfig, (), (const));
MOCK_METHOD(LoadBalancerType, lbType, (), (const));
MOCK_METHOD(envoy::config::cluster::v3::Cluster::DiscoveryType, type, (), (const));
@@ -190,6 +192,7 @@ class MockClusterInfo : public ClusterInfo {
absl::optional<envoy::config::cluster::v3::Cluster::OriginalDstLbConfig> lb_original_dst_config_;
absl::optional<envoy::config::core::v3::TypedExtensionConfig> upstream_config_;
Network::ConnectionSocket::OptionsSharedPtr cluster_socket_options_;
envoy::config::cluster::v3::LoadBalancingPolicy load_balancing_policy_;
envoy::config::cluster::v3::Cluster::CommonLbConfig lb_config_;
envoy::config::core::v3::Metadata metadata_;
std::unique_ptr<Envoy::Config::TypedMetadata> typed_metadata_;