TCP proxy API: Weighted clusters and cluster_from_sni #4444

rshriram · 2018-09-17T20:43:37Z

API changes to support weighted clusters (#4430).

In addition, add an option to select the upstream cluster
based on the SNI value. This will be used for cross cluster routing in Istio,
where traffic from one cluster lands on the gateway Envoy in another
cluster, carrying the cluster name in the SNI field. The gateway envoy
would then forward the traffic to the appropriate upstream cluster.
(In some sense, its pretty similar to the cluster_header field in HTTP Route).

PR with implementation will follow shortly. Wanted to get the API out so as to
not conflict with Venil's PR (#4430) for
weighted cluster implementation.

Signed-off-by: Shriram Rajagopalan [email protected]

Signed-off-by: Shriram Rajagopalan <[email protected]>

rshriram · 2018-09-17T20:45:51Z

cc @PiotrSikora & @louiscryan (per offline discussion).

FYI @venilnoronha this is the weighted cluster spec I mentioned in your PR.

There is some scope for reuse of the WeightedClusters struct, but its nestled in HTTP route, with a bunch of HTTP specific things like add/remove request headers.

Signed-off-by: Shriram Rajagopalan <[email protected]>

ggreenway · 2018-09-17T20:52:34Z

api/envoy/config/filter/network/tcp_proxy/v2/tcp_proxy.proto

+    // handshake. If the SNI is empty (as is the case for plaintext TCP
+    // connections) or the referenced cluster does not exist, Envoy will
+    // terminate the connection.
+    google.protobuf.BoolValue use_sni_as_cluster = 10;


I think we should do this with a more generic approach. We had a discussion last week about allowing per-connection metadata, which filters could set and consume. I think we should have tcp_proxy consume something that is a cluster override value, and write a separate filter that sets this value based on SNI. Then we can support arbitrary routing of this type without putting it all hard-coded into tcp_proxy.

Also, in #4357, it was suggested to disallow cluster names with '.' in them, which would prevent a cluster from being named with the SNI value.

The per connection metadata - is this same as the request info stuff that went into HTTP conn. manager?

Discussion was here: #4331

Ah yes. But setting cluster from another filter means the tcp proxy can no longer be the terminating filter in the network stack. We need to have http style router as the final filter. And it’s unclear how it would work with things like redis.

I don’t understand. How will the filter set the cluster that tcp proxy should use to route to upstream? The metadata stuff works for logging purposes. Not for changing the route that tcp proxy is supposed to take.

Take a look at https://github.com/envoyproxy/envoy/blob/master/include/envoy/request_info/filter_state.h, and the discussion in #4331 and #4100. It allows setting arbitrary data to communicate between filters.

For this case, we would say that a specific metadata value, maybe "TcpProxy.cluster", will contain a cluster name to route to. And if it's not present, use the one in the TcpProxy config. So than any earlier filter could set "TcpProxy.cluster" to an appropriate value.

Yes I know about that request info stuff.

But I would much rather opt for more explicit semantics than an implicit “if not set, use the cluster in tcp proxy”. It makes it hard to debug and the configuration at first sight doesn’t tell you what mode Envoy is using.

Put another way, we might as well make the cluster field optional so that whatever filter I add earlier could set the cluster, allowing me to leave the cluster field in tcp proxy to be empty. Even then, it feels a bit awkward.

Yeah, I agree it should be allowed to leave the cluster empty.

What I'm trying to avoid is everyone putting their own special routing rules hard-coded into tcp_proxy. Especially when it adds opaque requirements on other parts of the configuration (like cluster names must match SNI values).

Another approach we could think about is adding a hook for a lua script to run and select the cluster to use.

I am okay with the separate filter. Just don't want to implicitly pull and override a user specified config in tcp proxy.

Signed-off-by: Shriram Rajagopalan <[email protected]>

venilnoronha · 2018-09-17T22:50:32Z

api/envoy/config/filter/network/tcp_proxy/v2/tcp_proxy.proto

+    // Specifies the total weight across all clusters. The sum of all cluster
+    // weights must equal this value, which must be greater than 0. Defaults
+    // to 100.
+    google.protobuf.UInt32Value total_weight = 3 [(validate.rules).uint32.gte = 1];


Do we need total_weight, or can it be computed from the clusters definition?

I think this is to allow people to blackhole a certain portion of traffic. I copied this verbatim from http so that we can maintain the same semantics for http thrift and tcp

Got it.

Just FYI, it's determined from the clusters definition for thrift currently:

envoy/api/envoy/config/filter/network/thrift_proxy/v2alpha1/route.proto

Lines 106 to 108 in c32aed9

// weight. The sum of weights across all entries in the clusters array determines the total

// weight.

google.protobuf.UInt32Value weight = 2 [(validate.rules).uint32.gte = 1];

Oh bummer. Why have different weighted clusters! I will leave it upto you.

I think it makes sense to allow for blackholing.

But this is documented as "the sum of all cluster weights must equal this value". So there's no use for blackholing then, right? It seems like for blackholing, you can create a cluster with no hosts (and even name it "blackhole"), and give it a specific weight.

That feels like a better approach to me. Thanks.

venilnoronha · 2018-09-17T23:25:59Z

api/envoy/config/filter/network/tcp_proxy/v2/tcp_proxy.proto

+      string name = 1 [(validate.rules).string.min_bytes = 1];
+
+      // An integer between 0 and :ref:`total_weight
+      // <envoy_api_field_route.WeightedCluster.total_weight>`. The choice of an upstream


nit: envoy_api_field_route.WeightedCluster.total_weight -> envoy_api_field_config.filter.network.tcp_proxy.v2.TcpProxy.WeightedCluster.total_weight.

venilnoronha · 2018-09-17T23:26:29Z

api/envoy/config/filter/network/tcp_proxy/v2/tcp_proxy.proto

+    // Specifies the total weight across all clusters. The sum of all cluster
+    // weights must equal this value, which must be greater than 0. Defaults
+    // to 100.
+    google.protobuf.UInt32Value total_weight = 3 [(validate.rules).uint32.gte = 1];


nit: total_weight = 3 -> total_weight = 2.

lizan · 2018-09-18T07:09:09Z

api/envoy/config/filter/network/tcp_proxy/v2/tcp_proxy.proto

+    // handshake. If the SNI is empty (as is the case for plaintext TCP
+    // connections) or the referenced cluster does not exist, Envoy will
+    // terminate the connection.
+    google.protobuf.BoolValue use_sni_as_cluster = 10;


Let's use server_names instead of sni here, to be consistent with what FilterChainMatch has, and put SNI only where it is TLS specific.

Also I suggest put a strong warning here that this is pretty dangerous when used without specifying server_names in FilterChainMatch and/or RBAC filter in front of it. It will allow client to request arbitrary cluster configured in the Envoy (e.g., Secret Discovery Service).

Sure. See Greg’s comments.

Also, this behavior is already present today with the cluster_header option in http and also the original dst ip from the headers. So we might want to define a separate thing that marks one or more clusters as private or not addressable from filters

rshriram · 2018-09-18T17:46:22Z

Alright. I am closing this, as we decided to make the cluster name optional in a TCP proxy (in fact, its already optional!, looking at the code). Will try to do the DynamicMetadata (i.e FilterState) that sets the cluster and use it in TCP proxy..

Adding weighted cluster and sni cluster to tcp proxy

36be75c

Signed-off-by: Shriram Rajagopalan <[email protected]>

missing import

0758294

Signed-off-by: Shriram Rajagopalan <[email protected]>

ggreenway requested changes Sep 17, 2018

View reviewed changes

ggreenway self-assigned this Sep 17, 2018

missing type

f3f8d87

Signed-off-by: Shriram Rajagopalan <[email protected]>

venilnoronha reviewed Sep 17, 2018

View reviewed changes

lizan reviewed Sep 18, 2018

View reviewed changes

rshriram closed this Sep 18, 2018

lizan mentioned this pull request Sep 18, 2018

rbac: support match based on destination cluster #4455

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TCP proxy API: Weighted clusters and cluster_from_sni #4444

TCP proxy API: Weighted clusters and cluster_from_sni #4444

rshriram commented Sep 17, 2018

rshriram commented Sep 17, 2018

ggreenway Sep 17, 2018

ggreenway Sep 17, 2018

rshriram Sep 17, 2018

ggreenway Sep 17, 2018

rshriram Sep 17, 2018

rshriram Sep 17, 2018

ggreenway Sep 17, 2018

rshriram Sep 17, 2018

ggreenway Sep 18, 2018

rshriram Sep 18, 2018

venilnoronha Sep 17, 2018

rshriram Sep 17, 2018

venilnoronha Sep 17, 2018 •

edited

Loading

rshriram Sep 17, 2018

venilnoronha Sep 17, 2018

ggreenway Sep 17, 2018

venilnoronha Sep 17, 2018

venilnoronha Sep 17, 2018

venilnoronha Sep 17, 2018

lizan Sep 18, 2018

rshriram Sep 18, 2018

rshriram commented Sep 18, 2018

	// weight. The sum of weights across all entries in the clusters array determines the total
	// weight.
	google.protobuf.UInt32Value weight = 2 [(validate.rules).uint32.gte = 1];

TCP proxy API: Weighted clusters and cluster_from_sni #4444

TCP proxy API: Weighted clusters and cluster_from_sni #4444

Conversation

rshriram commented Sep 17, 2018

rshriram commented Sep 17, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

venilnoronha Sep 17, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rshriram commented Sep 18, 2018

venilnoronha Sep 17, 2018 •

edited

Loading