upstream: avoid reset after end_stream #14106

Merged
merged 8 commits into from
Dec 1, 2020
Changes from 5 commits
4 changes: 2 additions & 2 deletions include/envoy/router/router.h
@@ -14,6 +14,7 @@
#include "envoy/config/core/v3/base.pb.h"
#include "envoy/config/route/v3/route_components.pb.h"
#include "envoy/config/typed_metadata.h"
#include "envoy/event/deferred_deletable.h"
#include "envoy/http/codec.h"
#include "envoy/http/codes.h"
#include "envoy/http/conn_pool.h"
@@ -1256,9 +1257,8 @@ class GenericConnectionPoolCallbacks {
*
* It is similar logically to RequestEncoder, only without the getStream interface.
*/
class GenericUpstream {
class GenericUpstream : public Event::DeferredDeletable {
public:
virtual ~GenericUpstream() = default;
/**
* Encode a data frame.
* @param data supplies the data to encode. The data may be moved by the encoder.
2 changes: 1 addition & 1 deletion source/common/router/upstream_request.cc
@@ -480,8 +480,8 @@ void UpstreamRequest::clearRequestEncoder() {
// Before clearing the encoder, unsubscribe from callbacks.
if (upstream_) {
parent_.callbacks()->removeDownstreamWatermarkCallbacks(downstream_watermark_manager_);
parent_.callbacks()->dispatcher().deferredDelete(std::move(upstream_));
Member

Drive by: @snowp @alyssawilk did you check carefully that the destructors of the various upstreams don't reference anything that can go away? We have had subtle bugs in this area, and in general the use of deferred delete can be a little precarious.

Also, question: is this change defensive or is there a real case that causes this? It seems like we shouldn't be calling reset at all after we receive end stream? Why doesn't end_stream cause upstream_ to get nulled out, etc.?

Contributor Author

The original motivation for this PR was to fix an issue I ran into in #13095, where a reset would follow an end_stream=true callback and both events would be propagated to the router/UpstreamRequest. This caused a problem when resetting the upstream_request_ as part of the router, hence the deferred deletion. It does feel like there's something weird going on here, but this seemed to be the easiest fix for what was happening on the other PR. Happy to discuss more (and even revert if we think this is the wrong way to handle it).

The remaining changes here are defensive, made in anticipation of the upstream FM changes. I don't have a clear sense of when this would be a problem; maybe @alyssawilk does.

Member

re: deferred delete, it's mainly just making sure the upstream can be deleted after the router part. Is there any reference in the destructor that might cause a timing issue? For example timers getting cancelled, etc. This can be somewhat subtle and needs to be audited.

re: the change itself, it does seem to me like this is not the correct fix, so can we at least open a tracking issue to try to figure out what state is not getting reset properly?

Contributor Author

Looking through both the HTTP and TCP upstreams, neither of them has a dtor that does anything but a null check, and the only thing that would be destructed as part of it is the ConnectionData handle. The TcpConnectionData comment seems to indicate that it's resilient to deferred deletion ordering, so I suspect this is fine:

    ~TcpConnectionData() override {
      // Generally it is the case that TcpConnectionData will be destroyed before the
      // ActiveTcpClient. Because ordering on the deferred delete list is not guaranteed in the
      // case of a disconnect, make sure parent_ is valid before doing clean-up.
      if (parent_) {
        parent_->clearCallbacks();
      }
    }

I definitely had not done my due diligence here, so thanks for pointing this out.

I'll file an issue and try to dig a bit deeper if I can find the time.
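
To make the ordering concern in this thread concrete, here is a minimal sketch of deferred deletion with a parent null check in the destructor. It is not Envoy code: `ToyDispatcher`, `ToyParent`, `ToyUpstream`, and `detach()` are hypothetical stand-ins invented for illustration, and the real dispatcher and connection classes may well behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a deferred-deletable base class.
struct DeferredDeletable {
  virtual ~DeferredDeletable() = default;
};

// Hypothetical stand-in for a dispatcher's deferred-delete list.
class ToyDispatcher {
public:
  // Takes ownership now; the object is destroyed on the next clear.
  void deferredDelete(std::unique_ptr<DeferredDeletable> obj) {
    to_delete_.push_back(std::move(obj));
  }
  void clearDeferredDeleteList() { to_delete_.clear(); }
  std::size_t pending() const { return to_delete_.size(); }

private:
  std::vector<std::unique_ptr<DeferredDeletable>> to_delete_;
};

// Plays the role of the longer-lived owner (for example a client connection).
struct ToyParent {
  int callbacks_cleared = 0;
  void clearCallbacks() { ++callbacks_cleared; }
};

// Plays the role of an upstream whose destructor only touches its parent
// if the parent pointer is still set.
class ToyUpstream : public DeferredDeletable {
public:
  explicit ToyUpstream(ToyParent* parent) : parent_(parent) {}
  ~ToyUpstream() override {
    if (parent_ != nullptr) {
      parent_->clearCallbacks();
    }
  }
  // Called when the parent goes away first (assumed hook, not a real API).
  void detach() { parent_ = nullptr; }

private:
  ToyParent* parent_;
};

// Small demonstration: destruction is postponed until the list is cleared.
inline void demoDeferredDelete() {
  ToyDispatcher dispatcher;
  ToyParent parent;
  dispatcher.deferredDelete(std::make_unique<ToyUpstream>(&parent));
  assert(parent.callbacks_cleared == 0); // not destroyed yet
  dispatcher.clearDeferredDeleteList();
  assert(parent.callbacks_cleared == 1); // destructor ran on clear
}
```

The point of the null check is that the destructor stays safe whichever object goes away first, which is the property the audit in this thread is looking for.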

}
upstream_.reset();
}

void UpstreamRequest::DownstreamWatermarkManager::onAboveWriteBufferHighWatermark() {
15 changes: 15 additions & 0 deletions source/extensions/upstreams/http/tcp/upstream_request.cc
@@ -11,6 +11,7 @@
#include "common/http/header_map_impl.h"
#include "common/http/headers.h"
#include "common/http/message_impl.h"
#include "common/http/status.h"
#include "common/network/transport_socket_options_impl.h"
#include "common/router/router.h"

@@ -45,6 +46,11 @@ void TcpUpstream::encodeData(Buffer::Instance& data, bool end_stream) {

Envoy::Http::Status TcpUpstream::encodeHeaders(const Envoy::Http::RequestHeaderMap&,
bool end_stream) {
if (!upstream_request_) {
// TODO(snowp): Should this return something else in this case?
Contributor

yeah, seems like an error case to me.
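
One possible shape for treating the null case as an error rather than success is sketched below. `SketchStatus`, `SketchUpstreamRequest`, and `SketchTcpUpstream` are hypothetical types made up for this example; this is not `Envoy::Http::Status`, and which error the real code should return is left open here.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical minimal status type; this is NOT Envoy::Http::Status.
class SketchStatus {
public:
  static SketchStatus ok() { return SketchStatus(true, ""); }
  static SketchStatus error(std::string message) {
    return SketchStatus(false, std::move(message));
  }
  bool isOk() const { return ok_; }
  const std::string& message() const { return message_; }

private:
  SketchStatus(bool ok, std::string message) : ok_(ok), message_(std::move(message)) {}
  bool ok_;
  std::string message_;
};

// Hypothetical placeholder for the router-side request object.
struct SketchUpstreamRequest {};

class SketchTcpUpstream {
public:
  explicit SketchTcpUpstream(SketchUpstreamRequest* request) : upstream_request_(request) {}

  // Reports an error instead of success when the request is already gone.
  SketchStatus encodeHeaders(bool /*end_stream*/) {
    if (upstream_request_ == nullptr) {
      return SketchStatus::error("upstream request is gone");
    }
    return SketchStatus::ok();
  }

private:
  SketchUpstreamRequest* upstream_request_;
};

// Small demonstration of both paths.
inline void demoEncodeHeaders() {
  SketchUpstreamRequest request;
  assert(SketchTcpUpstream(&request).encodeHeaders(false).isOk());
  assert(!SketchTcpUpstream(nullptr).encodeHeaders(false).isOk());
}
```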

return Envoy::Http::okStatus();
}

// Headers should only happen once, so use this opportunity to add the proxy
// proto header, if configured.
ASSERT(upstream_request_->routeEntry().connectConfig().has_value());
@@ -86,7 +92,16 @@ void TcpUpstream::resetStream() {
}

void TcpUpstream::onUpstreamData(Buffer::Instance& data, bool end_stream) {
if (!upstream_request_) {
return;
}

upstream_request_->decodeData(data, end_stream);
// This ensures that if we get a reset after end_stream we won't propagate two
// "end streams" to the upstream_request_.
if (end_stream) {
Contributor

Once we have upstream filters, will it be possible to get an upstream "end stream" before we send headers? I think it could be the case, so I think we're going to want to do null pointer checks for upstream_request_ in a few other places, WDYT?

Contributor Author

Yeah, good point. I'll go over all the functions.

upstream_request_ = nullptr;
}
}
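
The guard in `onUpstreamData` can be illustrated in isolation. The sketch below is an assumption-laden toy, not the Envoy implementation: `RecordingRequest`, `GuardedUpstream`, and `onRemoteClose()` are hypothetical names, and the buffer argument is dropped for brevity. It shows that once the request pointer is cleared on end_stream, a later reset is no longer forwarded.

```cpp
#include <cassert>

// Hypothetical request object that just counts the events it receives.
struct RecordingRequest {
  int end_streams_seen = 0;
  int resets_seen = 0;
  void decodeData(bool end_stream) {
    if (end_stream) {
      ++end_streams_seen;
    }
  }
  void onResetStream() { ++resets_seen; }
};

// Hypothetical upstream that stops forwarding events after end_stream.
class GuardedUpstream {
public:
  explicit GuardedUpstream(RecordingRequest* request) : upstream_request_(request) {}

  void onUpstreamData(bool end_stream) {
    if (upstream_request_ == nullptr) {
      return; // already finished; drop late data
    }
    upstream_request_->decodeData(end_stream);
    if (end_stream) {
      // Forget the request so a later reset is not propagated as a second "end".
      upstream_request_ = nullptr;
    }
  }

  // Assumed hook standing in for a connection-close event.
  void onRemoteClose() {
    if (upstream_request_ == nullptr) {
      return; // stream already ended cleanly
    }
    upstream_request_->onResetStream();
  }

private:
  RecordingRequest* upstream_request_;
};

// Small demonstration: end_stream followed by a close yields exactly one terminal event.
inline void demoResetAfterEndStream() {
  RecordingRequest request;
  GuardedUpstream upstream(&request);
  upstream.onUpstreamData(/*end_stream=*/true);
  upstream.onRemoteClose();
  assert(request.end_streams_seen == 1);
  assert(request.resets_seen == 0);
}
```

The same null check would need to appear in every entry point that dereferences the pointer, which is the audit suggested in the comment above.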

void TcpUpstream::onEvent(Network::ConnectionEvent event) {