-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
router: set correct timeout for egress->ingress envoys #8051
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this is a good start
source/common/router/router.cc
Outdated
expected_timeout = timeout.global_timeout_.count(); | ||
} | ||
// todo(nezdolik) Check if order is correct, add tests. | ||
// Check if there is timeout set by egress envoy. If present, use that instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably need this to be guarded by a config flag as this could potentially break existing deployments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snowp, you mean by runtime guarding or by simple bool config property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the past I think we've been doing xds config for these kind of changes, but I guess this case could be considered a bug fix. @mattklein123 do you have any thoughts around what kind of feature guarding we'll need for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My quick thought is this should probably be a full on config option in the router?
test/common/router/router_test.cc
Outdated
{"x-envoy-expected-rq-timeout-ms", "10"}}; | ||
FilterUtility::TimeoutData timeout = | ||
FilterUtility::finalTimeout(route, headers, true, false, false); | ||
EXPECT_EQ(std::chrono::milliseconds(10), timeout.global_timeout_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the route timeout and the expected-rq-timeout-ms is the same in this test case, this line doesn't really verify that we're honoring the expected-rq-timeout-ms. You probably want to make these values different so that you can verify that we're capping the global timeout by the incoming deadline
source/common/router/router.cc
Outdated
if (absl::SimpleAtoi(header_timeout_entry->value().getStringView(), &header_timeout)) { | ||
timeout.global_timeout_ = std::chrono::milliseconds(header_timeout); | ||
Http::HeaderEntry* header_expected_timeout_entry = | ||
request_headers.EnvoyExpectedRequestTimeoutMs(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be worthwhile to add an integration test to verify that we actually have this header at this point due to all the possible header sanitization that happens at various points during request processing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks
added config guard, may need to fix tests later. |
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
still working on this (was on vacation). |
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
a3a1551
to
be4d596
Compare
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
/retest |
🔨 rebuilding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good overall, just a few comments
@@ -63,4 +63,10 @@ message Router { | |||
"x-envoy-retry-on" | |||
] | |||
}]; | |||
|
|||
// If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "will first check if the `x-envoy"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also use *x-envoy-expected-timeout-ms*
formatting for headers.
source/common/router/router.cc
Outdated
Http::HeaderEntry* header_expected_timeout_entry = | ||
request_headers.EnvoyExpectedRequestTimeoutMs(); | ||
if (header_expected_timeout_entry) { | ||
// This will prevent from overriding `x-envoy-expected-rq-timeout-ms` header. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this? won't this be set based on the global_timeout_ later on which should match the value we're extracting from the expected-rq-timeout-ms header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snowp this was protection against this code path:
if (insert_envoy_expected_request_timeout_ms && expected_timeout > 0) {
request_headers.insertEnvoyExpectedRequestTimeoutMs().value(expected_timeout);
}
We do indeed derive timeout and put it into a separate data structure (with global timeout), so there is not a big gain from setting insert_envoy_expected_request_timeout_ms
to false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed my mind, there is a gain, so that we use same value in timeout.global_timeout (derived from x-envoy-expected-timeout-ms
) and observe same value in header x-envoy-expected-timeout-ms
by not overriding it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think we want the expected header to reflect the timeout used by the router, which is affected by more than just those two headers. If you look further down in that function you'll see that we'll use the per try timeout instead of the global timeout if it's set. It seems like we want the expected timeout header to always reflect the timeout enforced by the router for the outgoing request, and just use the incoming expected timeout header to infer the global timeout. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snowp it does, thanks for clarification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will adjust the test as well
// If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present | ||
// and use it's value as timeout to upstream cluster. If header is not present or | ||
// `respect_expected_rq_timeout` is set to false, envoy will derive timeout value from | ||
// `x-envoy-upstream-rq-timeout-ms` header. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont think this is completely correct, as there might not be a rq-timeout header in which case the route specified timeout is used. Might be better to avoid listing what the behavior is without this flag to help ensure that this comment doesn't have to be updated whenever the timeout decision logic changes
ASSERT_TRUE(upstream_request_->waitForEndStream(*dispatcher_)); | ||
|
||
// Trigger global timeout, populated from `x-envoy-expected-rq-timeout-ms` header. | ||
timeSystem().sleep(std::chrono::milliseconds(501)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the upstream-rq-timeout header is greater than the expected-rq-timeout, you can't tell based on this test that the timeout is due to upstream-rq-timeout. I suggest making upstream-rq-timeout 300ms instead so that we know we're timing out before the expected-rq-timeout timer would hit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snowp made x-envoy-upstream-rq-timeout-ms
smaller than x-envoy-expected-rq-timeout-ms
@@ -63,4 +63,10 @@ message Router { | |||
"x-envoy-retry-on" | |||
] | |||
}]; | |||
|
|||
// If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also use *x-envoy-expected-timeout-ms*
formatting for headers.
|
||
// If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present | ||
// and use it's value as timeout to upstream cluster. If header is not present or | ||
// `respect_expected_rq_timeout` is set to false, envoy will derive timeout value from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you :ref:
internal link to the relevant field in the API docs?
|
||
// If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present | ||
// and use it's value as timeout to upstream cluster. If header is not present or | ||
// `respect_expected_rq_timeout` is set to false, envoy will derive timeout value from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s/envoy/Envoy/g
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done, this LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great work. Just a few small comments. Great feature!
/wait
source/common/router/router.cc
Outdated
if (header_expected_timeout_entry) { | ||
if (absl::SimpleAtoi(header_expected_timeout_entry->value().getStringView(), | ||
&header_timeout)) { | ||
timeout.global_timeout_ = std::chrono::milliseconds(header_timeout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you lift this part out into a helper which takes the header, etc.? It's repeated 3 times in this function and it would help readability.
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
Signed-off-by: Kateryna Nezdolii <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks!
Signed-off-by: Kateryna Nezdolii <[email protected]>
hi All, Attaching my config below. static_resources: |
@Sooryaa-A please create a dedicated issue in envoy repo |
Signed-off-by: Kateryna Nezdolii [email protected]
Description: In case of egress->ingress envoy setup, ingress envoy currently does not respect
x-envoy-expected-rq-timeout-ms
header, set by egress envoy and overrides the header with it's own timeout value. This change makes ingress envoy to respectx-envoy-expected-rq-timeout-ms
header value, if it's present in request.Risk Level: Low
Testing: Added unit and integration test to make sure header is not sanitised and not ignored.
Docs Changes: Updated API v2 docs
Release Notes: Updated version history
Fixes #7358