
Http2: Need connection policy for maximum streams per upstream connection #7403

Closed
conqerAtapple opened this issue Jun 26, 2019 · 24 comments · Fixed by #9668
Labels: area/http, enhancement, help wanted

@conqerAtapple
Contributor

conqerAtapple commented Jun 26, 2019

Issue Template

Title: Http2: Need connection policy to allow and control multiple upstream connections.

Description:

This is related to #7217.

The current default upstream connection scheme doesn't allow you to set a maximum number of streams per connection. This causes upstream load-balancing issues, since some connections can carry many more streams than others. For example, when the upstream is a multiprocess server with a shared socket, only a few workers end up receiving all the streams.

A connection policy with configurable parameters to manage the maximum number of streams per upstream connection is needed to solve this problem.

Some implementation approaches:

Essentially, we need to do the following (a rough sketch follows the list):

  • Enqueue the stream if the current client stream count exceeds the configured maximum
  • Create a new connection
  • Process the enqueued stream/request in the context of the new connection.
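
A minimal sketch of that flow, using hypothetical names rather than Envoy's actual connection pool classes:

```cpp
// Hypothetical sketch of the enqueue/create/dispatch flow described above.
// Names (Http2PoolSketch, UpstreamConn, PendingStream) are illustrative only.
#include <cstdint>
#include <deque>
#include <memory>
#include <vector>

struct PendingStream { /* request context to attach later */ };

struct UpstreamConn {
  uint32_t active_streams = 0;
};

class Http2PoolSketch {
public:
  explicit Http2PoolSketch(uint32_t max_streams_per_conn)
      : max_streams_per_conn_(max_streams_per_conn) {}

  void newStream(PendingStream stream) {
    UpstreamConn* conn = findConnWithCapacity();
    if (conn == nullptr) {
      // 1. Enqueue the stream: every existing connection is at the limit.
      pending_.push_back(std::move(stream));
      // 2. Create a new connection; dispatch happens once it is connected.
      conns_.push_back(std::make_unique<UpstreamConn>());
      return;
    }
    bind(stream, *conn);
  }

  // 3. Called when a newly created connection finishes connecting:
  //    process enqueued streams in the context of the new connection.
  void onConnectionReady(UpstreamConn& conn) {
    while (!pending_.empty() && conn.active_streams < max_streams_per_conn_) {
      bind(pending_.front(), conn);
      pending_.pop_front();
    }
  }

private:
  UpstreamConn* findConnWithCapacity() {
    for (auto& c : conns_) {
      if (c->active_streams < max_streams_per_conn_) return c.get();
    }
    return nullptr;
  }

  void bind(PendingStream&, UpstreamConn& conn) { conn.active_streams++; }

  const uint32_t max_streams_per_conn_;
  std::vector<std::unique_ptr<UpstreamConn>> conns_;
  std::deque<PendingStream> pending_;
};
```
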
@conqerAtapple changed the title from "Http2: Need connection policy for maximum upstream streams per connection" to "Http2: Need connection policy for maximum streams per upstream connection" on Jun 26, 2019
@mattklein123 added the enhancement (Feature requests. Not bugs or questions.) and help wanted (Needs help!) labels on Jun 26, 2019
@mattklein123
Member

@conqerAtapple I think what you are asking for is to allow the HTTP/2 connection pool to use more than 1 connection, and also to limit the maximum requests per connection, right? If so, this is something that we have talked about off and on for a long time (cc @alyssawilk @PiotrSikora).

My feeling is to add this capability to the HTTP/2 connection pool code that exists today. We can have a configuration that specifies some policy in terms of when to create a new connection in the pool. Perhaps this is the number of requests already on a connection, the maximum requests per connection, etc., and then we can bind new requests to a new HTTP/2 pool connection.

I think there are some details that need to be worked out here in terms of what knobs we want to provide the user for controlling the connection expansion (and of course connections should be capped via standard circuit breakers).
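
Purely for illustration, the knobs under discussion could be grouped into a small policy struct along these lines (hypothetical names, not actual Envoy configuration fields):

```cpp
// Hypothetical policy knobs for deciding when the HTTP/2 pool opens a new
// upstream connection. These names are illustrative, not Envoy config fields.
#include <cstdint>

struct Http2PoolPolicySketch {
  // Create a new connection once every existing one carries this many
  // concurrent streams.
  uint32_t max_concurrent_streams_per_connection = 100;
  // Stop reusing a connection after it has served this many total requests
  // (0 means unlimited); standard circuit breakers still cap total connections.
  uint64_t max_requests_per_connection = 0;
};

// Decision helper: should the pool spin up another connection?
inline bool shouldCreateConnection(const Http2PoolPolicySketch& policy,
                                   uint32_t streams_on_least_loaded_conn) {
  return streams_on_least_loaded_conn >= policy.max_concurrent_streams_per_connection;
}
```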

@conqerAtapple if this is what you want can you potentially rephrase the title/description of the issue and then propose a more detailed solution after looking at both the HTTP/1 and HTTP/2 connection pool as a reference?

@alyssawilk @PiotrSikora any further thoughts or comments on this?

@alyssawilk
Contributor

Yeah, I'd definitely love to see the H2 connection pool handle multiple upstream connections; it's been on our wish list for a long time as well.

cc @antoniovicente, who is interested in this feature as well.

@conqerAtapple
Contributor Author

conqerAtapple commented Jun 26, 2019

@mattklein123 @alyssawilk Although I have a general idea of how this needs to be done, I will initially need some pointers on where to look in the code.
IMO, the general flow should be as follows (a rough sketch follows the list):

  • Have a list of active connections (an active list)
  • When an active connection exceeds the configured policy's maximum allowed streams, create a new active connection and move the overflowing connection to a drain list.
  • On stream close, drain its connection (in the drain list)
  • If a draining connection has no active streams, close it.
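
A rough sketch of that active/drain lifecycle, using hypothetical types (not the actual pool code):

```cpp
// Hypothetical sketch of the active/drain list lifecycle described above.
#include <cstdint>
#include <list>
#include <memory>

struct Conn {
  uint32_t active_streams = 0;
  void close() { /* tear down the upstream connection */ }
};

class DrainAwarePoolSketch {
public:
  explicit DrainAwarePoolSketch(uint32_t max_streams) : max_streams_(max_streams) {}

  // Called whenever a stream is bound to a connection in the active list.
  void onStreamBound(std::list<std::unique_ptr<Conn>>::iterator it) {
    (*it)->active_streams++;
    if ((*it)->active_streams >= max_streams_) {
      // Overflowing connection stops taking new streams: move it to the
      // drain list; the caller creates a replacement active connection.
      drain_.splice(drain_.end(), active_, it);
    }
  }

  // Called when a stream on a draining connection completes.
  void onStreamClosed(std::list<std::unique_ptr<Conn>>::iterator it) {
    (*it)->active_streams--;
    if ((*it)->active_streams == 0) {
      (*it)->close();  // No active streams left: close and drop the connection.
      drain_.erase(it);
    }
  }

private:
  const uint32_t max_streams_;
  std::list<std::unique_ptr<Conn>> active_;
  std::list<std::unique_ptr<Conn>> drain_;
};
```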

@mattklein123
Member

@conqerAtapple effectively you are going to make the HTTP/2 connection pool look more like the HTTP/1 connection pool, where you will have a list of active connections, and using "some algorithm" you will decide which connection to use. As I suggested above, I think we want this algorithm to probably be tunable and allow the user to specify the target max simultaneous requests per connection before a new connection is spun up. There is definitely more thought that needs to go into how this will work from an algorithmic perspective, so my suggestion is to think about this for a bit and propose a rough algorithm with what tuning knobs we want, then we can talk about the code changes. I can help with this also, but not until Monday. I'm sure @alyssawilk will have lots of good thoughts here also and I would like her to help with the policy/knob design.

@conqerAtapple
Contributor Author

@mattklein123 @alyssawilk @antoniovicente

I have started a document to go over the proposals and design details.

https://docs.google.com/document/d/1gzY1y8CdbvZZ_82y9Fb6kNXAlzOBC0VIeiuMdn_xCRs/edit#heading=h.zd70r91epk53

@mattklein123
Member

@conqerAtapple I left a few small comments but in general this is exactly what I was hoping for. I think this will be great so if you can implement that would be fantastic. @alyssawilk @antoniovicente and I can help review. Feel free to post a partial PR if you want a sanity check before working on all the tests.

@conqerAtapple
Contributor Author

@mattklein123 @alyssawilk @antoniovicente Thanks for reviewing the doc. I will update it and add some more details this week. I can start on the initial implementation next week.

@mattklein123 mattklein123 modified the milestones: 1.11.0, 1.12.0 Jul 1, 2019
@conqerAtapple
Contributor Author

@mattklein123 @alyssawilk I had a question before I start the implementation: it looks like ProdClusterManagerFactory::allocateConnPool has the ability to fetch an HTTP/1 connection pool if the upstream.use_http2 feature is disabled. Isn't this what we would use if we just cared about 1 request/stream per upstream connection?

@mattklein123
Member

Isn't this what we would use if we just cared about 1 request/stream per upstream connection?

I think we are talking about 2 things here. If you want to use HTTP/1.1 just use HTTP/1.1. :)

@conqerAtapple
Contributor Author

Yes, looking at the code in detail, the connection pool for HTTP/1 is coupled with its codec. I thought/was hoping that I could use the HTTP/1 connection pool policy with the HTTP/2 codec. So I guess it won't work. Thanks!

@wallyatgithub

Hi,

Just want to make sure, will this enhancement also be applicable to Original destination cluster?

Thanks a lot,
Wally

@mattklein123
Member

Just want to make sure, will this enhancement also be applicable to Original destination cluster?

Yes it should be once implemented.

@oschaaf
Member

oschaaf commented Nov 19, 2019

Cross-linking an experimental approach / PoC from Nighthawk for those who would like to have a look-see: envoyproxy/nighthawk#200

That adds a relatively simple new pool implementation, which round-robins over multiple vanilla H/2 pools under the hood to leverage more than one active connection. In Nighthawk the number of pools equals the configured number of --connections, so the semantics of that option are on par with H1.
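
The sketch below illustrates the round-robin idea with hypothetical types (`ConnPool` stands in for a vanilla H/2 pool); it is not the actual Nighthawk code:

```cpp
// Simplified sketch: round-robin over N independent H/2 pools so that the
// --connections option behaves like it does for HTTP/1. Hypothetical types.
#include <cstddef>
#include <memory>
#include <vector>

class ConnPool {  // stand-in for a vanilla HTTP/2 connection pool
public:
  virtual ~ConnPool() = default;
  virtual void newStream(/* decoder, callbacks, ... */) = 0;
};

class RoundRobinPoolSketch : public ConnPool {
public:
  explicit RoundRobinPoolSketch(std::vector<std::unique_ptr<ConnPool>> pools)
      : pools_(std::move(pools)) {}

  void newStream(/* decoder, callbacks, ... */) override {
    // Each request goes to the next pool; each pool maintains one connection,
    // so N pools yield up to N active upstream connections.
    pools_[next_]->newStream();
    next_ = (next_ + 1) % pools_.size();
  }

private:
  std::vector<std::unique_ptr<ConnPool>> pools_;
  size_t next_ = 0;
};
```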

@wallyatgithub

Just want to make sure, will this enhancement also be applicable to Original destination cluster?

Yes it should be once implemented.

Thanks a lot for the confirmation.
Is this still under active development? It seems the pull request was closed a month ago.

@snowp
Contributor

snowp commented Dec 14, 2019

FYI I'm kinda playing around with refactoring the HTTP conn pools to share more code between HTTP/1 and HTTP/2, so hopefully that will make some incremental improvements that make supporting this feature easier.

@ggreenway
Contributor

I have a PR to polish up and submit in the next few weeks, continuing where @conqerAtapple left off. @snowp we should sync up on whether these will conflict.

@mattklein123
Member

Hi @ggreenway, welcome back! :) I was planning on working on this also, so I'm happy to see this much interest. It might be worth syncing up at some point on plans along with @snowp.

@snowp
Contributor

snowp commented Dec 14, 2019

I was just starting to pull things up into the base class and to introduce a base class for the ActiveStream implementations. My intention was to have the base class manage the ready/busy lists, perhaps also introducing additional lists as necessary.

Looking at the PR, it seems to be fairly compatible in that it attempts to use a similar set of lists to track the various states, but I imagine it would cause quite a few merge conflicts, so it might make sense for me to hold off on the refactors.

Here's a WIP commit that does some of the safer refactors: snowp@a38c215
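
For illustration, such a shared base might roughly take the following shape (hypothetical names; this is not the linked WIP commit):

```cpp
// Hypothetical shape of a shared conn pool base that owns the ready/busy
// lists, with protocol-specific subclasses deciding client capacity.
#include <cstdint>
#include <list>
#include <memory>

class ActiveClientBase {
public:
  virtual ~ActiveClientBase() = default;
  // HTTP/1 returns 1; HTTP/2 returns the configured max concurrent streams.
  virtual uint32_t maxConcurrentStreams() const = 0;
  uint32_t active_streams_ = 0;
};

class ConnPoolImplBase {
protected:
  using ClientPtr = std::unique_ptr<ActiveClientBase>;

  // Move a client between the ready and busy lists based on its capacity.
  void reclassify(std::list<ClientPtr>& from, std::list<ClientPtr>::iterator it) {
    const bool busy = (*it)->active_streams_ >= (*it)->maxConcurrentStreams();
    std::list<ClientPtr>& to = busy ? busy_clients_ : ready_clients_;
    if (&from != &to) {
      to.splice(to.end(), from, it);
    }
  }

  std::list<ClientPtr> ready_clients_;
  std::list<ClientPtr> busy_clients_;
};
```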

@wallyatgithub

Hi,

Should we really pull the ready/busy clients up into the base? A busy client should be HTTP/1.x specific, since HTTP/1.x does not support concurrent requests/streams; that is not applicable to HTTP/2, so we would not have a busy client for H2.

@snowp
Contributor

snowp commented Dec 20, 2019

In my mind you'd treat an H2 client as "busy" when you reach the threshold for maximum streams per upstream connection described in this issue. This means that the handling for HTTP/1.1 busy connections just becomes a special case (max streams = 1) of the HTTP/2 behavior.
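
A minimal sketch of that single capacity check (hypothetical helper, not actual pool code):

```cpp
#include <cstdint>

// Hypothetical helper: a client is "busy" once it hits its concurrent-stream
// limit. For HTTP/1.1 the limit is simply 1, so the existing busy handling
// falls out as a special case of the HTTP/2 behavior.
inline bool isBusy(uint32_t active_streams, uint32_t max_concurrent_streams) {
  return active_streams >= max_concurrent_streams;
}

// Usage: isBusy(n, 1) for HTTP/1.1, isBusy(n, configured_max) for HTTP/2.
```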

@mattklein123
Member

+1 to what @snowp said.

@wallyatgithub

OK, I get it: here max streams means max concurrent streams, not max total streams.

@ggreenway
Contributor

Right; there's max_requests_per_connection on the cluster, and http2_protocol_options.max_concurrent_streams in the H2 options. Although, per the above description, I'd say the H2 client is "busy" if it has reached either of those thresholds. Basically, anytime it cannot take any more requests, it is "busy".

@wallyatgithub

I think if max_requests_per_connection is reached, the connection should be moved to draining, not busy.
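
Putting the last few comments together, a hypothetical classification might look like this (max_requests_per_connection and max_concurrent_streams refer to the cluster and H2 options mentioned above):

```cpp
#include <cstdint>

enum class ClientState { Ready, Busy, Draining };

// Hypothetical classification combining the two limits discussed above:
//  - at the concurrent-stream limit -> Busy (capacity frees up when streams end)
//  - at the lifetime request limit  -> Draining (never reused; closed once idle)
inline ClientState classify(uint32_t active_streams, uint64_t total_requests,
                            uint32_t max_concurrent_streams,
                            uint64_t max_requests_per_connection) {
  if (max_requests_per_connection != 0 && total_requests >= max_requests_per_connection) {
    return ClientState::Draining;
  }
  if (active_streams >= max_concurrent_streams) {
    return ClientState::Busy;
  }
  return ClientState::Ready;
}
```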
