Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

network: delayed conn close #4382

Merged
merged 21 commits into from
Sep 24, 2018

Conversation

AndresGuedez
Copy link
Contributor

network: Support delayed close of downstream conns (#2929)

Mitigate client read/close race issues on downstream HTTP connections by adding a new connection
close type 'FlushWriteAndDelay'. This new close type flushes the write buffer on a connection but
does not immediately close after emptying the buffer (unlike ConnectionCloseType::FlushWrite).

A timer has been added to track delayed closes for both 'FlushWrite' and 'FlushWriteAndDelay'. Upon
triggering, the socket will be closed and the connection will be cleaned up.

Delayed close processing can be disabled by setting the newly added HCM 'delayed_close_timeout'
config option to 0.

Risk Level: Medium (changes common case behavior for closing of downstream HTTP connections)
Testing: Unit tests and integration tests added.
Fixes #2929.

Signed-off-by: Andres Guedez [email protected]

Mitigate client read/close race issues on downstream HTTP connections by adding a new connection
close type 'FlushWriteAndDelay'. This new close type flushes the write buffer on a connection but
does not immediately close after emptying the buffer (unlike ConnectionCloseType::FlushWrite).

A timer has been added to track delayed closes for both 'FlushWrite' and 'FlushWriteAndDelay'. Upon
triggering, the socket will be closed and the connection will be cleaned up.

Delayed close processing can be disabled by setting the newly added HCM 'delayed_close_timeout'
config option to 0.

Risk Level: Medium (changes common case behavior for closing of downstream HTTP connections)
Testing: Unit tests and integration tests added.
Fixes envoyproxy#2929.

Signed-off-by: Andres Guedez <[email protected]>
HTTP/2 integration tests added for delayed close processing.

Signed-off-by: Andres Guedez <[email protected]>
@AndresGuedez AndresGuedez changed the title Delayed conn close network: delayed conn close Sep 10, 2018
Signed-off-by: Andres Guedez <[email protected]>
Increase the timeout values for the common case (no timeout trigerred)
tests to prevent test flakiness.

Signed-off-by: Andres Guedez <[email protected]>
Copy link
Contributor

@dnoe dnoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for fixing this! Mostly style nits.

//
// A value of 0 will completely disable delayed close processing, and the downstream connection's
// socket will be closed immediately after the write flush is completed.
google.protobuf.Duration delayed_close_timeout = 25 [(gogoproto.stdduration) = true];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a default value? What happens if delayed_close_timeout is not set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a default timeout value of 1000 ms. I've amended the comment to note this.

@@ -63,7 +63,8 @@ class ConnectionCallbacks {
*/
enum class ConnectionCloseType {
FlushWrite, // Flush pending write data before raising ConnectionEvent::LocalClose
NoFlush // Do not flush any pending data and immediately raise ConnectionEvent::LocalClose
NoFlush, // Do not flush any pending data and immediately raise ConnectionEvent::LocalClose
FlushWriteAndDelay
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a similar comment to the above two lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

close_with_flush_ = true;
RELEASE_ASSERT(type == ConnectionCloseType::FlushWrite ||
type == ConnectionCloseType::FlushWriteAndDelay,
"");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale behind making this a RELEASE_ASSERT? What's the practical implication if it was an ASSERT and happened in production?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using the RELEASE_ASSERT to provide runtime protection against bugs introduced in this logic, since they could lead to issues such as resource leaks and/or memory corruption.

I just re-read the style guide and it does mention that ASSERT should be the common case for this type of check, so I can certainly change it if that's the convention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention is generally to use ASSERT for this, if you feel strongly about RELEASE_ASSERT I'll defer to senior maintainer for the decision.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to ASSERT() per style guidelines.

type == ConnectionCloseType::FlushWriteAndDelay,
"");
delayed_close_ = true;
bool delayed_close_timeout_set = delayedCloseTimeout().count();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to see the explicit boolean expression here, ie, delayedCloseTimeout().count() > 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


// Force a closeSocket() after the write buffer is flushed if the close_type calls for it or if
// no delayed close timeout is set.
close_after_flush_ = type == ConnectionCloseType::FlushWrite || !delayed_close_timeout_set;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit, but I think this expression is more readable if you reverse the two halves of the or, since it separates the = and == more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

std::unique_ptr<Network::ConnectionImpl> server_connection(new Network::ConnectionImpl(
dispatcher, std::make_unique<ConnectionSocketImpl>(0, nullptr, nullptr),
TransportSocketPtr(transport_socket), true));
server_connection->setDelayedCloseTimeout(std::chrono::milliseconds(500));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we safe from flakes here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The timer's callback is manually invoked at the end of the test. The 'setDelayedCloseTimeout()' call is required to enable creation of the timer, but the actual value does not matter (as long as it's >0). I've added a clarifying comment.

TEST_P(ConnectionImplTest, FlushWriteAndDelayCloseTimerTriggerTest) {
setUpBasicConnection();
connect();
server_connection_->setDelayedCloseTimeout(std::chrono::milliseconds(500));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about flakes here. How do we know 500ms is the right number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is safe. Since the client connection does not issue a close() during this test, the timer always triggers. I've added a clarifying comment and reduced the timeout value since there is no need to wait so long.

@@ -33,6 +33,7 @@ using testing::InSequence;
using testing::Invoke;
using testing::InvokeWithoutArgs;
using testing::Return;
using testing::ReturnNew;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

version_);

connection.run();
EXPECT_TRUE(response.find("SETTINGS expected") != std::string::npos);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the form EXPECT_THAT(response, HasSubstr("SETTINGS expected")) more readable when trying to understand how the test works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the matcher. Done.

version_);

connection.run();
EXPECT_TRUE(response.find("SETTINGS expected") != std::string::npos);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, the HasSubstr matcher will make this read nicely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@AndresGuedez
Copy link
Contributor Author

The CircleCI mac failure is due to the lack of early connection close detection in macOS (see #4299). Per the conversation in that issue, I am disabling the failing test on macOS.

// The delayed close timeout is for downstream connections managed by the HTTP connection manager.
// It is defined as a grace period after connection close processing has been locally initiated
// during which Envoy will flush the write buffers for the connection and await the peer to close
// the connection (i.e., issue a TCP FIN/RST).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case flush the write buffer and delay close, we should issue TCP FIN but not RST, the expected behavior is issue FIN after flush writing, and RST on timeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've clarified this comment since this is referring to the client shutting down the connection and Envoy receiving the FIN/RST.

With regards to Envoy issuing FIN vs RST, the behavior is for a shutdown(_, SHUT_WR) to be issued when the flush write is successfully completed. If the delayed close timeout triggers, a close() is issued on the socket, which will trigger FIN or RST depending on the state of the socket/connection.

What benefit are we trying to get out of always issuing RST on timeout? It will likely lead to faster reclamation of resources which is nice, but I think Envoy's behavior regarding FIN vs RST should be applied uniformly across timeout/error conditions and therefore should be tracked as a new issue.

close_with_flush_ = true;
RELEASE_ASSERT(type == ConnectionCloseType::FlushWrite ||
type == ConnectionCloseType::FlushWriteAndDelay,
"");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention is generally to use ASSERT for this, if you feel strongly about RELEASE_ASSERT I'll defer to senior maintainer for the decision.

@@ -10,7 +10,6 @@ jobs:
resource_class: xlarge
working_directory: /source
environment:
BAZEL_REMOTE_CACHE: https://storage.googleapis.com/envoy-circleci-bazel-cache/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's plan to revert these changes before this gets merged. If we do disable this for all of Envoy, I'd like to do it in a separate PR and consult with maintainers. I think you can leave it in for now with this comment in place so senior maintainer gets a heads up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return new Buffer::WatermarkBuffer(below_low, above_high);
}));

auto timer = new Event::MockTimer(&dispatcher);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That explanation makes sense. I get nervous when I see bare new calls, especially without a comment. I think this test might be doing the opposite of the pattern I usually see, which is to have a unique_ptr here in the test body which is std::moved into the right place to be returned by Envoy::MockBufferFactory::create() but only after you steal a bare pointer out of the unique_ptr to use when setting up any the gmock expectations. This is a clear case of dangerous cheating the unique_ptr, but it's ok in this limited test case example since we know the unique pointer scope lives as long as the body of this test. I think it leaves the object life cycle a bit clearer defined than if you don't use unique_ptr here.

If this doesn't work in your case, please add a comment explaining the object lifecycle at the bare news. Happy to chat more about this in person tomorrow too if it helps.

dnoe
dnoe previously approved these changes Sep 12, 2018
Copy link
Contributor

@dnoe dnoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks. Tagging @zuercher for senior review.

@AndresGuedez
Copy link
Contributor Author

@dnoe -- thanks for the review!

Copy link
Member

@zuercher zuercher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on.

@@ -86,6 +88,8 @@ class Connection : public Event::DeferredDeletable, public FilterManager {
Stats::Gauge& write_current_;
// Counter* as this is an optional counter. Bind errors will not be tracked if this is nullptr.
Stats::Counter* bind_errors_;
// Optional counter. Delayed close semantics are only used by HTTP connections.
Stats::Counter* delayed_close_timeouts_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the reference to HTTP, since this isn't necessarily HTTP-specific. I think the comment can just mirror the one for bind_errors_.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed the delayed close timeout is not inherently HTTP specific, but the intent of the comment is to point out that as currently implemented it only affects HTTP connections, since the H{1,2} codecs are the only callers of setDelayedCloseTimeout() on a connection to enable it.

I'm fine removing this however if you don't think the note/clarification is required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My feeling is that it's just a comment we'll have to delete (or will be incorrect) when someone inevitably enables this for a different purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Cleaned up the comment.


/**
* Set the timeout for delayed connection close()s.
* This is only used for downstream connections processing HTTP/1 and HTTP/2.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, remove this reference (or at least the "only" qualifier).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document (probably on close) the relationship between this timeout value and the various close options. I had assumed that ConnectionCloseType::FlushWrite behavior was meant to be unchanged, but really it's going to wait up to the delayed close timeout, if set, to complete its flush before closing.

source/common/network/connection_impl.cc Show resolved Hide resolved
@@ -535,6 +559,13 @@ bool ConnectionImpl::bothSidesHalfClosed() {
return read_end_stream_ && write_end_stream_ && write_buffer_->length() == 0;
}

void ConnectionImpl::onDelayedCloseTimeout() {
ENVOY_CONN_LOG(debug, "triggered delayed close", *this);
ASSERT(connection_stats_->delayed_close_timeouts_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer that you test if delayed_close_timeouts_ is null (similar to how bind_errors_ optionality is handled). Or else, ASSERT the pointer is non-null when the delayed close timeout is configured.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (delayed_close_timer_) {
// It's ok to disable even if the timer has already fired.
delayed_close_timer_->disableTimer();
delayed_close_timer_.reset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: don't need to reset -- it's about to be destroyed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
};
} // namespace Network
}; // namespace Envoy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: undo these -- looks like fix format got confused at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tools/check_format.py fails when this is reverted -- 'FixNamespaceComments' appears to be required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I downloaded this file and it seems to pass format checks locally without the stray namespace comments. Has it failed CI's format check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted comments. I was looking at the wrong lines... Thanks.

@@ -149,18 +149,36 @@ IntegrationCodecClient::startRequest(const Http::HeaderMap& headers) {
return {encoder, std::move(response)};
}

void IntegrationCodecClient::waitForDisconnect() {
bool IntegrationCodecClient::waitForDisconnect(uint32_t time_to_wait) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use std::chrono::millisconds here to match the other integration tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

connection_->dispatcher().run(Event::Dispatcher::RunType::Block);
if (wait_timer_triggered && !disconnected_) {
return false;
}
EXPECT_TRUE(disconnected_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should disable the timer if it hasn't fired.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


response->waitForEndStream();
// The delayed close timeout should trigger since client is not closing the connection.
EXPECT_TRUE(codec_client_->waitForDisconnect(750));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you given this the runs_per_test treatment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've done extensive flake testing and did not hit any issues. I've adjusted the timeouts though to make it even more resistant since the CI environment is considerably more flake prone than my machine.

Copy link
Member

@zuercher zuercher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. One small thing, but otherwise looks good to me.

@@ -128,6 +141,7 @@ void ConnectionImpl::close(ConnectionCloseType type) {
// Create and activate a timer which will immediately close the connection if triggered.
// A config value of 0 disables the timeout.
if (delayed_close_timeout_set) {
ASSERT(connection_stats_->delayed_close_timeouts_ != nullptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant in setDelayedCloseTimeout.I don't want someone to enable this feature on a connection and then only later get an assertion about this stat.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connection_stats_ is set via setConnectionStats(), so asserting on the counter != nullptr in setDelayedCloseTimeout() will introduce an ordering dependency (which is not currently met). I think it would be preferable to keep these two functions independent of each other.

Unfortunately, initialization of ConnectionImpls is spread out between codec creation (allocation and call to setDelayedCloseTimeout) and ConnectionManagerImpl::initializeReadFilterCallbacks() where setConnectionStats() is called. Therefore, adding an initFinished() function where the ASSERT could be safely issued will require some work.

I can mark this with a TODO so we can move it into the appropriate place when such a function is added. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just make setting connection_state_->delayed_close_timeouts_ optional? (E.g. don't assert, just don't increment if it's null?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done.

Signed-off-by: Andres Guedez <[email protected]>
zuercher
zuercher previously approved these changes Sep 13, 2018
Copy link
Member

@zuercher zuercher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@AndresGuedez
Copy link
Contributor Author

@zuercher -- Thanks for reviewing!

@AndresGuedez
Copy link
Contributor Author

This failed with a timeout in the SDS integration test. Can you check if your branch has c4211b3 (which was supposed to fix the timeout).

I don't have that commit in my branch; I'll merge and kick off new CI runs.

zuercher
zuercher previously approved these changes Sep 19, 2018
ggreenway
ggreenway previously approved these changes Sep 19, 2018
@ggreenway
Copy link
Contributor

@mattklein123 Please take a look.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. Love seeing this implemented and the flush timeout leak fixed also. A few small questions/comments.

Http1Settings settings,
std::chrono::milliseconds delayed_close_timeout)
: ConnectionImpl(connection, HTTP_REQUEST), callbacks_(callbacks), codec_settings_(settings) {
connection_.setDelayedCloseTimeout(delayed_close_timeout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning for doing this set in the codec? It seems like something that is unrelated to the codec itself and should be done by higher layer code? Same question for the HTTP/2 codec below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not related to the codec directly but this particular delayed_close_timeout is HTTP only since it is managed via the HCM configuration. This seemed to be the most appropriate place to set the timeout for HTTP connections just after they are constructed.

Since the timeout value is intended to be network filter specific, I wouldn't expect any layer higher than the corresponding filter to issue the call to setDelayedCloseTimeout().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it should be done in the filter. Why not do it in the HCM filter and avoid codec modifications?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed it's best to contain it within the HCM. I revisited the code and moved setting the timeout to the HCM config.


// No need to continue if a FlushWrite/FlushWriteAndDelay has already been issued and there is a
// pending delayed close.
if (delayed_close_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this happen in practice? Mostly just wondering about the flow that causes this. Perhaps flesh out the comment with some examples of cases where this happens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I initially tried an ASSERT() which triggered in multiple tests during a //test/... run. I just tracked down a failure during an integration_test and added it to the comment. Based on that example and code inspection, this conditional can evaluate true under fairly standard use cases such as when Envoy initiates the close() sequence during codec processing.

}

delayed_close_ = true;
bool delayed_close_timeout_set = delayedCloseTimeout().count() > 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@AndresGuedez AndresGuedez dismissed stale reviews from ggreenway and zuercher via 9d86332 September 21, 2018 20:50
Instead, constrain setting the delayed close timeout to the HCM config,
which owns it.

Signed-off-by: Andres Guedez <[email protected]>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, one small change and I think you also need a master merge.

@@ -301,6 +302,7 @@ Http::ServerConnectionPtr
HttpConnectionManagerConfig::createCodec(Network::Connection& connection,
const Buffer::Instance& data,
Http::ServerConnectionCallbacks& callbacks) {
connection.setDelayedCloseTimeout(delayed_close_timeout_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would recommend setting this in ConnectionManagerImpl::initializeReadFilterCallbacks. It's earlier than here and also near where we setup some other connection level stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Can you merge master and we can 🚢? Thank you!

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, sorry, one more doc thing.

//
// A value of 0 will completely disable delayed close processing, and the downstream connection's
// socket will be closed immediately after the write flush is completed.
google.protobuf.Duration delayed_close_timeout = 26 [(gogoproto.stdduration) = true];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I forget this from before. Can you add this to the release notes also with a link to this field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@mattklein123 mattklein123 merged commit aa9478f into envoyproxy:master Sep 24, 2018
@AndresGuedez
Copy link
Contributor Author

@mattklein123 Thanks for the review!

junr03 pushed a commit that referenced this pull request Oct 1, 2018
This reverts commit aa9478f.

Signed-off-by: Jose Nino <[email protected]>
junr03 added a commit that referenced this pull request Oct 2, 2018
This PR reverts #4382. When deploying at Lyft we noticed crashes on here where we might be derefencing the connection_stats_ pointer after the point has been reset.

Note: this PR keeps the changes to the API made in the original PR but tags the field as not implemented. This is what we have done in the past for reverts that involve changes that change the API.

Signed-off-by: Jose Nino <[email protected]>
AndresGuedez added a commit to AndresGuedez/envoy that referenced this pull request Oct 2, 2018
Re-enable the changes reverted in 9d32e5c, which were originally
merged as part of envoyproxy#4382.

Signed-off-by: Andres Guedez <[email protected]>
AndresGuedez added a commit to AndresGuedez/envoy that referenced this pull request Oct 2, 2018
Fixes a segfault introduced in envoyproxy#4382 due to a connection tear down race condition when the delayed
close timer triggers after connection state has been reset via closeSocket().

Signed-off-by: Andres Guedez <[email protected]>
mattklein123 pushed a commit that referenced this pull request Oct 4, 2018
Re-enable the changes reverted in 9d32e5c, which were originally merged as part of #4382.

Signed-off-by: Andres Guedez <[email protected]>
aa-stripe pushed a commit to aa-stripe/envoy that referenced this pull request Oct 11, 2018
This PR reverts envoyproxy#4382. When deploying at Lyft we noticed crashes on here where we might be derefencing the connection_stats_ pointer after the point has been reset.

Note: this PR keeps the changes to the API made in the original PR but tags the field as not implemented. This is what we have done in the past for reverts that involve changes that change the API.

Signed-off-by: Jose Nino <[email protected]>
Signed-off-by: Aaltan Ahmad <[email protected]>
aa-stripe pushed a commit to aa-stripe/envoy that referenced this pull request Oct 11, 2018
…ix (envoyproxy#4587)

Re-enable the changes reverted in 9d32e5c, which were originally merged as part of envoyproxy#4382.

Signed-off-by: Andres Guedez <[email protected]>
Signed-off-by: Aaltan Ahmad <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

http: not proxying 413 correctly
7 participants