upstream: avoid reset after end_stream #14106
@@ -11,6 +11,7 @@
#include "common/http/header_map_impl.h"
#include "common/http/headers.h"
#include "common/http/message_impl.h"
#include "common/http/status.h"
#include "common/network/transport_socket_options_impl.h"
#include "common/router/router.h"

@@ -45,6 +46,11 @@ void TcpUpstream::encodeData(Buffer::Instance& data, bool end_stream) {

Envoy::Http::Status TcpUpstream::encodeHeaders(const Envoy::Http::RequestHeaderMap&,
                                               bool end_stream) {
  if (!upstream_request_) {
    // TODO(snowp): Should this return something else in this case?
Review comment: Yeah, seems like an error case to me.
    return Envoy::Http::okStatus();
  }

  // Headers should only happen once, so use this opportunity to add the proxy
  // proto header, if configured.
  ASSERT(upstream_request_->routeEntry().connectConfig().has_value());

@@ -86,7 +92,16 @@ void TcpUpstream::resetStream() {
}

void TcpUpstream::onUpstreamData(Buffer::Instance& data, bool end_stream) {
  if (!upstream_request_) {
    return;
  }

  upstream_request_->decodeData(data, end_stream);
  // This ensures that if we get a reset after end_stream we won't propagate two
  // "end streams" to the upstream_request_.
  if (end_stream) {
Review comment: Once we have upstream filters, will it be possible to get an upstream "end stream" before we send headers? I think it could be the case, so I think we're going to want to do null pointer checks for upstream_request_ in a few other places. WDYT?
Review comment: Yeah, good point. I'll go over all the functions.
    upstream_request_ = nullptr;
  }
}

void TcpUpstream::onEvent(Network::ConnectionEvent event) {

Review comment:
Drive by: @snowp @alyssawilk did you check carefully that the destructor of the various upstreams isn't referencing anything that can go away? We have had subtle bugs in this area and in general the use of deferred delete can be a little precarious.
Also, question: is this change defensive or is there a real case that causes this? It seems like we shouldn't be calling reset at all after we receive end stream? Why doesn't end_stream cause upstream_ to get nulled out, etc.?
Review comment:
The original motivation for this PR was to fix an issue I ran into in #13095 where a reset would follow an end_stream=true callback and both of these events would be propagated to the router/UpstreamRequest. This ran into an issue when resetting the upstream_request_ as part of the router, hence the deferred deletions. It does feel like there's something weird going on here, but this seemed to be the easiest fix to what was happening on the other PR. Happy to discuss more (and even revert if we think this is the wrong way to handle it).
The remaining changes here are defensive, in anticipation of the upstream FM changes. I don't have a clear sense of when this would be a problem, but maybe @alyssawilk does.
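The failure mode described in this comment (a reset arriving after an end_stream=true callback, with both reaching the UpstreamRequest) can be illustrated with a minimal, self-contained sketch. This is not the Envoy code: `FakeUpstreamRequest`, `TcpUpstreamSketch`, `onRemoteClose`, and `terminalEventsAfterEndStreamThenClose` are hypothetical names introduced only for illustration, and the sketch assumes the reset is triggered by a connection close.

```cpp
#include <cassert>

// Hypothetical stand-in for the router's UpstreamRequest. It only counts
// terminal events so the effect of the guard is observable.
struct FakeUpstreamRequest {
  int end_streams = 0;
  int resets = 0;

  void decodeData(bool end_stream) {
    if (end_stream) {
      ++end_streams;
    }
  }
  void onResetStream() { ++resets; }
};

// Hypothetical, simplified analogue of TcpUpstream (not the Envoy class).
class TcpUpstreamSketch {
public:
  explicit TcpUpstreamSketch(FakeUpstreamRequest* request) : upstream_request_(request) {}

  void onUpstreamData(bool end_stream) {
    if (upstream_request_ == nullptr) {
      return;
    }
    upstream_request_->decodeData(end_stream);
    if (end_stream) {
      // Drop the pointer so a later reset is not forwarded as a second
      // terminal event.
      upstream_request_ = nullptr;
    }
  }

  // Assumption: models a connection close that would otherwise be
  // translated into a stream reset.
  void onRemoteClose() {
    if (upstream_request_ == nullptr) {
      return;
    }
    upstream_request_->onResetStream();
  }

private:
  FakeUpstreamRequest* upstream_request_;
};

// Returns how many terminal events the request saw when a close follows
// an end_stream.
int terminalEventsAfterEndStreamThenClose() {
  FakeUpstreamRequest request;
  TcpUpstreamSketch upstream(&request);
  upstream.onUpstreamData(/*end_stream=*/true);
  upstream.onRemoteClose();
  return request.end_streams + request.resets;
}
```

With the null-out in place the request sees exactly one terminal event. Whether this guard is the right fix, as opposed to finding the state that is not being reset, is the open question in this thread.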
Review comment:
re: deferred delete, it's mainly just making sure the upstream can be deleted after the router part. Is there any reference in the destructor that might cause a timing issue? For example timers getting cancelled, etc. This can be somewhat subtle and needs to be audited.
re: the change itself, it does seem to me like this is not the correct fix, so we can at least open a tracking issue to try to figure out what state is not getting reset properly?
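The ordering concern with deferred deletion can be sketched as follows. This is a hedged illustration under stated assumptions, not Envoy's dispatcher: `DeferredDeletable`, `DeferredDeleteQueue`, `UpstreamSketch`, and `destructionOrder` are hypothetical and defined here only to show that a deferred object's destructor runs after the code that queued it has moved on, so it must not touch anything that may already be gone.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base type for objects whose destruction is postponed.
struct DeferredDeletable {
  virtual ~DeferredDeletable() = default;
};

// Hypothetical queue: objects are destroyed when clear() runs, not when
// they are handed over.
class DeferredDeleteQueue {
public:
  void deferredDelete(std::unique_ptr<DeferredDeletable> object) {
    pending_.push_back(std::move(object));
  }
  void clear() { pending_.clear(); }

private:
  std::vector<std::unique_ptr<DeferredDeletable>> pending_;
};

// Hypothetical upstream whose destructor only records that it ran. The log
// it writes to is owned by the caller and outlives the queue.
class UpstreamSketch : public DeferredDeletable {
public:
  explicit UpstreamSketch(std::vector<std::string>& log) : log_(log) {}
  ~UpstreamSketch() override { log_.push_back("upstream destroyed"); }

private:
  std::vector<std::string>& log_;
};

// Records the order in which the router-side cleanup and the upstream
// destructor happen.
std::vector<std::string> destructionOrder() {
  std::vector<std::string> log;
  DeferredDeleteQueue queue;
  queue.deferredDelete(std::make_unique<UpstreamSketch>(log));
  // Stands in for the router releasing its own state first.
  log.push_back("router state released");
  // The upstream destructor runs only now, after the router-side cleanup.
  queue.clear();
  return log;
}
```

Any reference the destructor follows (timers, callbacks, parent pointers) has to remain valid until the queue is cleared, which is what the audit requested here would check.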
Review comment:
Looking through both the HTTP and TCP upstreams, neither of them has a dtor that does anything but a null check, and the only thing that would be destructed as part of it is the `ConnectionData` handle. The `TcpConnectionData` seems to indicate that it's resilient to deferred deletion ordering, so I suspect this is fine.
I definitely had not done my due diligence here, so thanks for pointing this out.
I'll file an issue and try to dig a bit deeper if I can find the time.