Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for configurable precedence of FilterChainMatch rules. #3411

Closed
PiotrSikora opened this issue May 16, 2018 · 18 comments · Fixed by #20110
Closed

Add support for configurable precedence of FilterChainMatch rules. #3411

PiotrSikora opened this issue May 16, 2018 · 18 comments · Fixed by #20110
Labels
enhancement Feature requests. Not bugs or questions. help wanted Needs help!

Comments

@PiotrSikora
Copy link
Contributor

From discussion in #3217 (comment):

@htuch

What about when we have CIDR and SNI, what order do we resolve with? In general, it would be helpful to clarify where we want to go in the end state of the API for all the match criteria and precedence.

@ggreenway

I've been thinking about this a fair bit. I think the precedence should be configurable. I can think of a few ways that might be useful; I'm not sure yet what the right answer is.

One option is to let the user provide a list for the field precedence order. For instance, [SNI, source-ip, transport protocol]. Another possible matching type is the entire list of matches as an ordered list, with first-match-wins. I'm sure there's other options as well. I would be surprised if we can come up with a one-size-fits-all matching algorithm with this many potential criteria.

I think this can all be solved in a later PR.

@htuch

Sure, I can see this being a useful thing to add later. So, for now, as long as we have precedence for the fields that are actually implemented, and a TODO for the rest I'd be happy.

@PiotrSikora

As mentioned earlier (I think?), the destination IP would take precedence over SNI, over transport protocol. I didn't consider source IPs.

I could see having filter_chain_match_precedence: ["server_names", "source_ranges", "transport_protocol"] as a way of controlling precedence, but it doesn't seem that this is something required by most of the use cases, so I'm going to let someone that has an use case for it work on this.

This is mostly a place-holder to gauge interest in configurable precedence of the FilterChainMatch rules, with no immediate plans for adding it.

@mattklein123 mattklein123 added enhancement Feature requests. Not bugs or questions. help wanted Needs help! labels May 16, 2018
@PiotrSikora
Copy link
Contributor Author

PiotrSikora commented May 17, 2018

Another thing that @lizan pointed out is that in routes we're using "first match wins" approach, i.e. the order of filter chain blocks would matter.

With current implementation, following configuration:

filter_chains:                                                                                                                                                                                                                                          
- filter_chain_match:                                                                                                                                                                                                                                   
    # empty                                                                                                                                                                                                                                             
- filter_chain_match:                                                                                                                                                                                                                                   
    sni_domains: ["www.example.com", "server1.example.com"]                                                                                                                                                                                             
    transport_protocol: "tls"                                                                                                                                                                                                                           
    application_protocols: ["dummy", "h2"]

would match 2nd filter chain if connection matched all requirements, but per @lizan it should match 1st filter chain, since it matches all of its requirements as well, and it's configured first.

Thoughts? @mattklein123 @ggreenway @htuch @lizan

@lizan
Copy link
Member

lizan commented May 17, 2018

That is what actually @ggreenway said:

Another possible matching type is the entire list of matches as an ordered list, with first-match-wins.

@mattklein123
Copy link
Member

In general I think first match wins is going to be the simplest to understand for most people, and when we implement other rules like IP/CIDR matching, it may be the only way it makes sense.

However, I know that there are use cases in which we want to optimize and do hash lookups (e.g., large number of SNI matches). My feeling on this is that the documented behavior should be FIFO, but if envoy can detect that a hash lookup is correct, it can switch to that (depending on the match conditions across all matches).

@lizan
Copy link
Member

lizan commented May 18, 2018

Perhaps first-match-wins with LRU cache?

@PiotrSikora
Copy link
Contributor Author

@mattklein123 I think the "simplest to understand" is going to be based on the prior experience with other software, e.g. for me (and probably many others) the most obvious choice is the-most-specific-match-wins, since that's what NGINX does for the server name matching, and I would be extremely surprised if the filter chain match for sni_domains: example.com doesn't get selected for a client with example.com SNI, because it's specified after a more generic filter chain that also matches.

Having said that, I recognize that it's inconsistent with how the matching for HTTP routing rules is done, which is definitely a valid concern, and if most of the maintainers agree that first-match-wins is a better approach, then I can work on the change... But please note that it isn't going to scale well.

first-match-wins: @mattklein123 @lizan
most-specific-match-wins: @PiotrSikora

I'd like to hear from @alyssawilk @htuch and @ggreenway (and anyone else who has opinion on this, really) before making any changes.

@mattklein123
Copy link
Member

But please note that it isn't going to scale well.

As I said before, I think this can be fixed by intelligent detection of when a set of matching criteria can be optimized into a hash lookup. But, personally, I would treat that as the common case. Anyway, I do think that given the other match types we have planned, FIFO with AND semantics (like router) is going to be the easiest for people to understand given other Envoy semantics. But, also agreed, let's hear from other folks.

@ggreenway
Copy link
Contributor

I prefer the faster hash-lookup for SNI with most-specific-wins.

If we mostly do first-match-wins, but we try to detect cases where we can use a different algorithm, I think we'll run into problems where a small config change causes a massive performance degradation, and that can be confusing for users.

@mattklein123
Copy link
Member

How will this work for IP/CIDR matching? Will it be a hash lookup/most-specific followed by a FIFO match of CIDRs?

@PiotrSikora
Copy link
Contributor Author

A series of hash-lookups with wider and wider netmask, i.e.:

find(127.0.0.1/32) -> find(127.0.0.1/31) -> find(127.0.0.1/30) -> ...

with a limit on the widest netmask configured, and perhaps with an optimization to only try the netmasks that were configured.

@ggreenway
Copy link
Contributor

Couldn't this be done with the LcTrie implementation we have?

@PiotrSikora
Copy link
Contributor Author

...or that :) I wasn't aware we have LcTrie already.

@mattklein123
Copy link
Member

My question was less about how to do the lookup for IP but more of whether the type of nested hash lookups we are doing is going to be confusing from a configuration standpoint for users. I guess this gets to @htuch question about allowing the precedence rules to be configurable so the nesting order is dynamic. The reason I think it's confusing, FWIW, is that the configuration is specified essentially as a list of matching rules, and I think most users would interpret that as a set of AND conditions that if they match would win. They might not immediately realize that some of the rules are more important than others.

I don't know if this is going to be confusing, but with the various options we have already laid out (https://github.com/envoyproxy/envoy/blob/master/api/envoy/api/v2/listener/listener.proto#L69) it seems like it might be.

Anyhow, I guess as long as we have good examples and very clear documentation the nested hash approach (with maybe configurable precedence in the future) is OK with me.

Can we potentially commit to doing some good examples in the docs in addition to the SNI ones we already have in the FAQ?

@PiotrSikora
Copy link
Contributor Author

PiotrSikora commented May 18, 2018

Yeah, I'll add more examples and the docs, regardless of the decision here.

I think most users would interpret that as a set of AND conditions that if they match would win. They might not immediately realize that some of the rules are more important than others.

It's still a set of AND conditions, i.e. ALL rules must match in order for any filter chain to be selected.

Example:

filter_chains:
- filter_chain_match:
    # empty
- filter_chain_match:
    sni_domains: "www.example.com"
    application_protocols: "h2"

connection("", "h2") would select 1st filter chain in both cases.

connection("www.example.com", "h2") would select 2nd filter chain with the current implementation, but 1st filter chain with first-match-wins.

connection("www.example.com", "") wouldn't select any filter chain with the current implementation, since we have a match for example.com server name, however rest of the less important criteria for the filter chain doesn't match (fallback for example.com can be easily added with another filter_chain_match without application_protocols), but 1st filter chain with first-match-wins.

For the configuration above (with a fallback for non-HTTP/2 traffic for example.com) to result in the same filter chain selection with the first-match-wins, one would need to write it rather carefully, i.e.

filter_chains:
- filter_chain_match:
    sni_domains: "www.example.com"
    application_protocols: "h2"
- filter_chain_match:
    sni_domains: "www.example.com"
- filter_chain_match:
    # empty

However, even with only 2 matching rules, this is already much more complex routing than usually performed on the same port, and if we assume that this feature is going to be used mostly for SNI routing, then I don't see any advantage of the first-match-wins approach (and it might be simply confusing, if you look at the first config in this comment).

@mattklein123
Copy link
Member

@PiotrSikora for the relatively simple rules we are going to have (compared to HTTP routing for example) you have convinced me that the precedence system makes sense with nested hash/trie/whatever lookups, so I think my vote has switched over. We can always make precedence configurable later as @ggreenway and @htuch suggested and as long as wen have good examples I think it's probably fine for most (all?) uses cases.

@lizan
Copy link
Member

lizan commented May 19, 2018

I'm fine with most-specific-match-wins as long as we have clear field precedence order, either pre-defined or user-configured, and good documentation.

In @PiotrSikora 's example I still think that:

connection("www.example.com", "") wouldn't select any filter chain with the current implementation, since we have a match for example.com server name, however rest of the less important criteria for the filter chain doesn't match (fallback for example.com can be easily added with another filter_chain_match without application_protocols), but 1st filter chain with first-match-wins.

should be able to fallback to 1st filter chain (empty, thus catchall) without adding another filter_chain_match without application_protocols.

@mattklein123
Copy link
Member

should be able to fallback to 1st filter chain (empty, thus catchall) without adding another filter_chain_match without application_protocols

Catch-alls are some of the corner cases I'm concerned about. I think if we have a set of really good examples we should be able to iterate together on the best final behavior, though I think we all now agree we can do this largely within the confines of the existing implementation.

@htuch
Copy link
Member

htuch commented May 21, 2018

I'd just point out that in addition to the above discussion, first-match-wins is not suitable for extremely large CIDR matching scenarios as it is O(n) in the number of filter chains, e.g. O(100k) of filter chains, each matching on different CIDR. We have some internal scenarios that need this level of scale. So, whatever we do, we should provide the ability to do a classical most-specific match solution for CIDR.

FWIW, the first-match-wins semantics of route configurations also need some thought for a similar scaling concern.

I think that providing a sensible precedence ordering initially based on most-specific match and usual use cases, and the ability to later add flexible ordering sounds good.

@mattklein123
Copy link
Member

FWIW, the first-match-wins semantics of route configurations also need some thought for a similar scaling concern.

@htuch lets open a different issue on that. The complexity of the existing routing rules are going to make it almost impossible to build a hashing implementation (especially due to regex). I think we can do some interesting things with HTTP routing along the lines of having an optional new type of route configuration that is designed for a most specific wins hashing implementation and by definition implements a much smaller set of matching conditions. It could even be a layered implementation where once a match is found an L2 FIFO match using the existing routing rules can optionally take place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests. Not bugs or questions. help wanted Needs help!
Projects
None yet
5 participants