Enable LoadBalancing Extensions #15827
Changes from 2 commits
@@ -111,6 +111,34 @@ class LoadBalancer {

```diff
 using LoadBalancerPtr = std::unique_ptr<LoadBalancer>;

+/**
+ * Context passed to load balancer factory to access server resources.
+ */
+class LoadBalancerFactoryContext {
+public:
+  virtual ~LoadBalancerFactoryContext() = default;
+
+  /**
+   * @return ProtobufMessage::ValidationVisitor& validation visitor for filter configuration
+   * messages.
+   */
+  virtual ProtobufMessage::ValidationVisitor& messageValidationVisitor() PURE;
+};
+
+class TypedLoadBalancerFactory : public Config::UntypedFactory {
+public:
+  ~TypedLoadBalancerFactory() override = default;
+
+  virtual LoadBalancerPtr
+  create(const envoy::config::cluster::v3::LoadBalancingPolicy::Policy& policy,
+         LoadBalancerType load_balancer_type, LoadBalancerFactoryContext& context,
+         const PrioritySet& priority_set, const PrioritySet* local_priority_set,
+         ClusterStats& cluster_stats, Runtime::Loader& loader, Random::RandomGenerator& random,
+         const envoy::config::cluster::v3::Cluster::CommonLbConfig& common_config) PURE;
+
+  std::string category() const override { return "envoy.load_balancers"; }
+};
+
 /**
  * Factory for load balancers.
  */
```

Review thread on `TypedLoadBalancerFactory::create`:

> Before we actually implement this, I think we should spend some time thinking about what the LB policy API will actually look like, since this is an area where we need to consider semantic consistency between gRPC and Envoy, at least in terms of what gets expressed in the xDS API. There are some significant differences between the existing LB policy APIs in Envoy and gRPC. In particular, Envoy's LB policy is given a set of hosts that are already health-checked, and it picks a host synchronously without any knowledge of what connection(s) may or may not already exist to the individual hosts. In contrast, gRPC's LB policy is given a list of addresses, and it is responsible for creating and managing connections to those addresses; its API allows it to tell the channel to queue a request until the LB policy has a connection capable of sending it on.

> I think it would be a useful exercise to try to understand how the existing LBs in Envoy could be migrated to extensions if we wanted to do this in the future. That would tell us whether this interface is sufficient to express what LB policies might want.

> I think it's worthwhile to think about the future of the LB interface, but I would prefer not to over-complicate this PR, as it addresses a longstanding issue: not being able to extend the existing Envoy definition of what a load balancer is. Given that the configuration for this is opaque, I don't think we need to conflate any future load-balancing changes with this PR, which is really just allowing an extension point on top of an existing interface.

> That's fine. I just wouldn't consider #5598 fully resolved until we can at least state how we could migrate the static load balancers across to this new interface. Having looked at the …

> I think there are 2 different issues here: …
@@ -253,7 +253,7 @@ ClusterManagerImpl::ClusterManagerImpl(

```diff
     ProtobufMessage::ValidationContext& validation_context, Api::Api& api,
     Http::Context& http_context, Grpc::Context& grpc_context, Router::Context& router_context)
     : factory_(factory), runtime_(runtime), stats_(stats), tls_(tls),
-      random_(api.randomGenerator()),
+      random_(api.randomGenerator()), validation_context_(validation_context),
       bind_config_(bootstrap.cluster_manager().upstream_bind_config()), local_info_(local_info),
       cm_stats_(generateStats(stats)),
       init_helper_(*this, [this](ClusterManagerCluster& cluster) { onClusterInit(cluster); }),
```

@@ -1348,6 +1348,31 @@ ClusterManagerImpl::ThreadLocalClusterManagerImpl::ClusterEntry::ClusterEntry(

```diff
         parent.parent_.random_, cluster->lbConfig());
     break;
   }
+  case LoadBalancerType::LoadBalancingPolicyConfig: {
+    ASSERT(lb_factory_ == nullptr);
+    for (const auto& policy : cluster->loadBalancingPolicy().policies()) {
+      LoadBalancerFactoryContextImpl context(
+          parent_.parent_.validation_context_.staticValidationVisitor());
+      TypedLoadBalancerFactory* factory =
+          Registry::FactoryRegistry<TypedLoadBalancerFactory>::getFactory(policy.name());
+
+      if (factory == nullptr) {
+        ENVOY_LOG(warn, fmt::format("Didn't find a registered implementation for name: '{}'",
+                                    policy.name()));
+        continue;
+      }
+
+      lb_ = factory->create(policy, cluster->lbType(), context, priority_set_,
+                            parent_.local_priority_set_, cluster->stats(),
+                            parent.parent_.runtime_, parent.parent_.random_, cluster->lbConfig());
+      break;
+    }
+    if (lb_ == nullptr) {
+      ENVOY_LOG(critical, "Didn't find any registered implementation for load_balancing_policy.");
+      ASSERT(lb_ != nullptr);
+    }
+    break;
+  }
   case LoadBalancerType::ClusterProvided:
   case LoadBalancerType::RingHash:
   case LoadBalancerType::Maglev:
```

Review thread on the new `LoadBalancingPolicyConfig` case:

> This occurs off of the main thread, so this won't work, since we need to verify the correctness of the configuration prior to when we get here. The way I would recommend doing this is to process the configuration on the main thread next to the other LBs, store a factory that we know will work, and then use that factory in the appropriate place in worker context.

> There already is a …

> This commit (5e8ce8c) attempts to mimic `lb_factory_` more closely: `getFactory` is now pulled into the main thread. I had a hard time coming up with a scenario where I could trigger a config error in the non-main threads, and I can't find where this verification comes from, so this is not much more than a shot in the dark. I'll continue investigating this, and any extra pointers may speed me up!

Comment on the `factory == nullptr` check:

> Once we fix the above comment, you can use the throwing version of the factory lookup functions.

Comment on lines +1370 to +1373 (the `lb_ == nullptr` check):

> I would just make this a RELEASE_ASSERT, as this shouldn't happen.
Final review comment:

> Please also remove the `[#not-implemented-hide:]` annotation on the `load_balancing_policy` field (line 971) and the `LoadBalancingPolicy` message (line 1036).