upstream: Implement retry concurrency budgets #9069
Q: Right now we can set a retry policy on either the virtual host or the route, with the route overriding the virtual host. I don't feel strongly about this, but I'm wondering if it would be a better experience to allow the retry budget to be set independently from the policy. In this case, if the user sets a per-route policy, they will have to duplicate the budget across every policy because I don't think anything is merged from the virtual host. WDYT? @snowp?
It would be nice to have a cluster-level retry budget config and have each route optionally override it. It might get confusing to reason about where retry overflows are coming from, since we're using the retry_overflow counter for both the max_retries circuit breaker and the retry budget.
I can start adding this top-level budget to a different branch and defer to you as to whether it should go in this patch or a subsequent one along with runtime knobs and the solution to the HTTP/1.1 request tracking problem below.
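For illustration, the override behavior described above could be sketched roughly as follows. This is only a sketch under assumptions: `RetryBudget` and `effectiveBudget` are hypothetical names, not Envoy's actual types or API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical budget settings; field names are illustrative only.
struct RetryBudget {
  double budget_percent;          // share of current concurrency allowed as retries
  uint32_t min_retry_concurrency; // floor on concurrent retries
};

// A route-level budget, when present, replaces the cluster-level default.
// Nothing is merged field by field in this sketch.
RetryBudget effectiveBudget(const RetryBudget& cluster_default,
                            const std::optional<RetryBudget>& route_override) {
  return route_override.value_or(cluster_default);
}
```

With a scheme like this, routes that don't care about the budget would not need to repeat it in every policy.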
Do you think this is going to be problematic at Lyft, or do you think we can just configure this universally? I'm concerned that it's going to cause problems for our own usage right away.
I'm pretty confident this can just be configured universally.
With the config you have now how can we configure this universally?
This seems a bit iffy to me: today this kind of works because H/1.1 and H/2 each make use of either max_connections OR max_requests, but there are requested features that might introduce max_connections to H/2 as well (#7403). Envoy also doesn't reclaim H/1.1 connections until it needs to, so I think using max_connections to get a sense for the current concurrency isn't perfect.
I think ideally we'd be counting the number of requests for HTTP/1.1 and just use max_requests + max_pending, but I don't think we can add that enforcement safely at this point (since users might have bumped max_connections but not max_requests). Maybe just note this limitation in the docs for now?
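To make the idea concrete, here is a minimal sketch of a budget computed from a request count (active plus pending) rather than from connection counts. All names here (`ConcurrencySnapshot`, `retryLimit`, `canRetry`, `budget_percent`, `min_retry_concurrency`) are assumptions for illustration and are not taken from the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical per-cluster counters; not Envoy's actual resource manager API.
struct ConcurrencySnapshot {
  uint64_t active_requests;  // requests currently in flight
  uint64_t pending_requests; // requests queued waiting for a connection
  uint64_t active_retries;   // retries currently in flight
};

// Number of concurrent retries the budget permits: a percentage of the
// current concurrency, but never below the configured floor.
uint64_t retryLimit(const ConcurrencySnapshot& s, double budget_percent,
                    uint64_t min_retry_concurrency) {
  const uint64_t concurrency = s.active_requests + s.pending_requests;
  const uint64_t from_budget =
      static_cast<uint64_t>(concurrency * budget_percent / 100.0);
  return std::max(from_budget, min_retry_concurrency);
}

// A new retry is admitted only while active retries are under the limit.
bool canRetry(const ConcurrencySnapshot& s, double budget_percent,
              uint64_t min_retry_concurrency) {
  return s.active_retries < retryLimit(s, budget_percent, min_retry_concurrency);
}
```

The caveat above still applies: if the HTTP/1.1 path only tracks connections, the `active_requests` input would be an approximation.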
Noted in the docs. Let me know if the latest patch communicates this well enough.