[conn_pool] fix use after free in H/1 connection pool #14220

Merged on Dec 8, 2020 (17 commits)

Conversation


@asraa asraa commented Nov 30, 2020

Signed-off-by: Asra Ali [email protected]

Commit Message: Fixes a use after free (thanks @yanavlasov and @antoniovicente) that occurs when the dispatcher tries to run conn_pool->onUpstreamReady() after the connection pool has been destroyed. Reverts to a schedulable callback, per https://github.com/envoyproxy/envoy/pull/13867/files

Risk Level: Medium
Testing: Added test. This test fails with the use after free at head:

 
+TEST_F(Http1ConnPoolImplTest, RegressionTest) {
+  InSequence s;
+
+  NiceMock<MockResponseDecoder> outer_decoder;
+  ConnPoolCallbacks callbacks;
+  conn_pool_->expectClientCreate();
+  Http::ConnectionPool::Cancellable* handle = conn_pool_->newStream(outer_decoder, callbacks);
+  EXPECT_NE(nullptr, handle);
+
+  NiceMock<MockRequestEncoder> request_encoder;
+  ResponseDecoder* inner_decoder;
+  EXPECT_CALL(*conn_pool_->test_clients_[0].codec_, newStream(_))
+      .WillOnce(DoAll(SaveArgAddress(&inner_decoder), ReturnRef(request_encoder)));
+  EXPECT_CALL(callbacks.pool_ready_, ready());
+  EXPECT_CALL(*conn_pool_->test_clients_[0].connect_timer_, disableTimer());
+  conn_pool_->test_clients_[0].connection_->raiseEvent(Network::ConnectionEvent::Connected);
+
+  EXPECT_TRUE(
+      callbacks.outer_encoder_
+          ->encodeHeaders(TestRequestHeaderMapImpl{{":path", "/"}, {":method", "GET"}}, true)
+          .ok());
+
+  Event::PostCb post_cb;
+  EXPECT_CALL(conn_pool_->mock_dispatcher_, post(_)).WillOnce(SaveArg<0>(&post_cb));
+
+  dispatcher_.deferredDelete(std::move(conn_pool_));
+
+  inner_decoder->decodeHeaders(
+      ResponseHeaderMapPtr{new TestResponseHeaderMapImpl{{":status", "200"}}}, true);
+
+  // Clear deferred delete list and trigger post callback
+  dispatcher_.clearDeferredDeleteList();
+  post_cb();
+
+  conn_pool_->destructAllConnections();
+}

This test had to be modified now that there is a schedulable callback. What I don't like about the fix/test is that, after the fix, the callback is never scheduled: scheduling is guarded by hasPendingStreams(), which ends up false in this test, so once I add that if condition there is no potential for a use after free. I don't know whether this is exactly what happens in the production crashes. I'm trying to understand how to make this test better.
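
For readers following along, here is a minimal, self-contained sketch of the two ideas in the description above: a callback object owned by the pool that unregisters itself when the pool is destroyed, and a hasPendingStreams() guard in front of scheduling. It does not use Envoy's classes. ToyDispatcher, ToySchedulableCallback, ToyPool and runScenario are hypothetical names invented for illustration, and the real change may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy event loop (an assumption, not Envoy's dispatcher). It only stores
// pointers to tasks that are owned by someone else.
class ToyDispatcher {
public:
  using Task = std::function<void()>;

  void enqueue(const Task* task) { queue_.push_back(task); }

  // Drop every queued reference to `task` (no-op if it is not queued).
  void remove(const Task* task) {
    queue_.erase(std::remove(queue_.begin(), queue_.end(), task), queue_.end());
  }

  // Run queued tasks in FIFO order until the queue is empty.
  void run() {
    while (!queue_.empty()) {
      const Task* task = queue_.front();
      queue_.erase(queue_.begin());
      (*task)();
    }
  }

private:
  std::vector<const Task*> queue_;
};

// Callback handle owned by the object that scheduled it. Destroying the
// handle removes the task from the dispatcher, so it can never run late.
class ToySchedulableCallback {
public:
  ToySchedulableCallback(ToyDispatcher& dispatcher, ToyDispatcher::Task task)
      : dispatcher_(dispatcher), task_(std::move(task)) {}
  ToySchedulableCallback(const ToySchedulableCallback&) = delete;
  ~ToySchedulableCallback() { dispatcher_.remove(&task_); }

  // Queue the task once, even if schedule() is called repeatedly.
  void schedule() {
    dispatcher_.remove(&task_);
    dispatcher_.enqueue(&task_);
  }

private:
  ToyDispatcher& dispatcher_;
  ToyDispatcher::Task task_;
};

class ToyPool {
public:
  ToyPool(ToyDispatcher& dispatcher, int& ready_calls)
      : ready_calls_(ready_calls),
        upstream_ready_cb_(dispatcher, [this] { onUpstreamReady(); }) {}

  void addPendingStream() { ++pending_streams_; }

  // Only schedule work when there is something to attach to the connection.
  void onResponseComplete() {
    if (hasPendingStreams()) {
      upstream_ready_cb_.schedule();
    }
  }

private:
  bool hasPendingStreams() const { return pending_streams_ > 0; }

  void onUpstreamReady() {
    assert(pending_streams_ > 0);
    --pending_streams_;
    ++ready_calls_;
  }

  int& ready_calls_;
  int pending_streams_{0};
  // Declared last so it is destroyed first, before the state it touches.
  ToySchedulableCallback upstream_ready_cb_;
};

// Returns how many times onUpstreamReady() ran for the given scenario.
int runScenario(bool add_pending_stream, bool destroy_pool_before_run) {
  ToyDispatcher dispatcher;
  int ready_calls = 0;
  auto pool = std::make_unique<ToyPool>(dispatcher, ready_calls);
  if (add_pending_stream) {
    pool->addPendingStream();
  }
  pool->onResponseComplete();
  if (destroy_pool_before_run) {
    pool.reset(); // cancels the scheduled callback
  }
  dispatcher.run();
  return ready_calls;
}
```

In this toy, runScenario(true, true) runs nothing because destroying the pool cancels the callback, runScenario(true, false) runs onUpstreamReady() once, and runScenario(false, false) never schedules anything. The last case is the situation described above, where the guard alone keeps the test from exercising the callback.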

Signed-off-by: Asra Ali <[email protected]>

asraa commented Nov 30, 2020

Going to close this until I figure out more test failures.

@asraa asraa closed this Nov 30, 2020
This reverts commit b48aee6.

Signed-off-by: Asra Ali <[email protected]>
Signed-off-by: Asra Ali <[email protected]>

@antoniovicente antoniovicente left a comment

Thanks for taking on the debugging and fixing of this crash.

source/common/conn_pool/conn_pool_base.cc (outdated, resolved)
test/common/http/http1/conn_pool_test.cc (resolved)
test/common/http/http1/conn_pool_test.cc (outdated, resolved)
test/common/http/http1/conn_pool_test.cc (outdated, resolved)
test/common/http/http1/conn_pool_test.cc (resolved)
source/common/http/http1/conn_pool.cc (resolved)
asraa added 5 commits December 2, 2020 09:06
Signed-off-by: Asra Ali <[email protected]>
Signed-off-by: Asra Ali <[email protected]>
Signed-off-by: Asra Ali <[email protected]>
Signed-off-by: Asra Ali <[email protected]>

asraa commented Dec 2, 2020

Working on reproducing the test failure in opt mode (doesn't reproduce locally with bazel)

asraa added 2 commits December 2, 2020 16:00
Signed-off-by: Asra Ali <[email protected]>
Signed-off-by: Asra Ali <[email protected]>
@antoniovicente

The new test case is crashing on the gcc build and possibly others. Please take a look.


asraa commented Dec 3, 2020

This seems to be an issue with clearing a deferred delete list that contains a connection pool. When the conn pool is destroyed, it destroys its clients, which calls clear deferred delete again, and then it recurses.

This wouldn't happen with a real dispatcher, so I'm working on modifying clearDeferredDeleteList.
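
As an illustration of the recursion described above, the following self-contained sketch shows one generic way to make a clear operation safe to re-enter: a flag turns the nested call into a no-op, and the outer loop keeps draining until nothing new was queued. This is not the change made in this PR and not Envoy's dispatcher code. ToyDeferredDeleter, ToyClient, ToyPoolWithClient and clearPoolWithClient are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Deletable {
  virtual ~Deletable() = default;
};

// Toy deferred-delete list (an assumption, not Envoy's implementation).
class ToyDeferredDeleter {
public:
  void deferredDelete(std::unique_ptr<Deletable> obj) {
    to_delete_.push_back(std::move(obj));
  }

  void clearDeferredDeleteList() {
    if (clearing_) {
      // Re-entered from a destructor: the outer loop will pick up new entries.
      return;
    }
    clearing_ = true;
    while (!to_delete_.empty()) {
      // Move the current batch aside so destructors can safely queue more.
      std::vector<std::unique_ptr<Deletable>> batch;
      batch.swap(to_delete_);
      batch.clear(); // destructors run here
      ++passes_;
    }
    clearing_ = false;
  }

  int passes() const { return passes_; }

private:
  std::vector<std::unique_ptr<Deletable>> to_delete_;
  bool clearing_{false};
  int passes_{0};
};

class ToyClient : public Deletable {
public:
  explicit ToyClient(int& destroyed) : destroyed_(destroyed) {}
  ~ToyClient() override { ++destroyed_; }

private:
  int& destroyed_;
};

// Mimics a pool whose destructor defers deletion of its client and then
// asks for the list to be cleared again.
class ToyPoolWithClient : public Deletable {
public:
  ToyPoolWithClient(ToyDeferredDeleter& deleter, int& destroyed)
      : deleter_(deleter), destroyed_(destroyed),
        client_(std::make_unique<ToyClient>(destroyed)) {}

  ~ToyPoolWithClient() override {
    deleter_.deferredDelete(std::move(client_));
    deleter_.clearDeferredDeleteList(); // nested call, returns immediately
    ++destroyed_;
  }

private:
  ToyDeferredDeleter& deleter_;
  int& destroyed_;
  std::unique_ptr<Deletable> client_;
};

// Returns {objects destroyed, drain passes} after clearing a list that
// initially holds one pool.
std::pair<int, int> clearPoolWithClient() {
  int destroyed = 0;
  ToyDeferredDeleter deleter;
  deleter.deferredDelete(std::make_unique<ToyPoolWithClient>(deleter, destroyed));
  deleter.clearDeferredDeleteList();
  return {destroyed, deleter.passes()};
}
```

In this toy, clearPoolWithClient() returns {2, 2}: the pool is destroyed in the first pass, and the client it queued from its destructor is destroyed in a second pass rather than through unbounded recursion.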

asraa added 3 commits December 4, 2020 14:16
Signed-off-by: Asra Ali <[email protected]>
Signed-off-by: Asra Ali <[email protected]>
antoniovicente previously approved these changes Dec 4, 2020
Signed-off-by: Asra Ali <[email protected]>
MockDestructSchedulableCallback* upstream_ready_cb)
: ConnPoolImplForTest(dispatcher, cluster, random_generator, upstream_ready_cb) {}

~ConnPoolImplNoDestructForTest() override {}

@antoniovicente antoniovicente Dec 5, 2020

You could remove this empty destructor override.

At the very least change to:
~ConnPoolImplNoDestructForTest() override = default;

Contributor Author

I ended up getting rid of it; I dealt with the client destruction in the test instead of in this special conn pool.

Signed-off-by: Asra Ali <[email protected]>
antoniovicente previously approved these changes Dec 7, 2020
source/common/conn_pool/conn_pool_base.h (outdated, resolved)
Signed-off-by: Asra Ali <[email protected]>
@@ -253,6 +250,9 @@ class ConnPoolImplBase : protected Logger::Loggable<Logger::Id::pool> {
// The number of streams that can be immediately dispatched
// if all CONNECTING connections become connected.
uint32_t connecting_stream_capacity_{0};

void onUpstreamReady();
Contributor

nit: The usual ordering of elements in the private section, according to the style guide, is:

  • classes/structs
  • functions
  • data members
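
To make the ordering in the nit above concrete, here is a tiny sketch of a private section laid out as classes/structs, then functions, then data members. OrderedPoolSketch and its members are hypothetical and are not taken from conn_pool_base.h.

```cpp
#include <cassert>
#include <cstdint>

class OrderedPoolSketch {
public:
  void markReady() { onUpstreamReady(); }
  uint32_t readyCount() const { return stats_.ready_count; }

private:
  // 1. Classes/structs.
  struct Stats {
    uint32_t ready_count{0};
  };

  // 2. Functions.
  void onUpstreamReady() { ++stats_.ready_count; }

  // 3. Data members.
  Stats stats_;
};
```

Applied to the hunk above, this ordering would place the onUpstreamReady() declaration ahead of data members such as connecting_stream_capacity_.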

@yanavlasov yanavlasov merged commit 657f3c1 into envoyproxy:master Dec 8, 2020
mpuncel added a commit to mpuncel/envoy that referenced this pull request Dec 8, 2020
* master: (41 commits)
  event: Remove a source of non-determinism by always running deferred deletion before post callbacks (envoyproxy#14293)
  Fix TSAN bug in integration test (envoyproxy#14327)
  tracing: Add hostname to Zipkin config.  (envoyproxy#14186) (envoyproxy#14187)
  [conn_pool] fix use after free in H/1 connection pool (envoyproxy#14220)
  lua: update deprecated lua_open to luaL_newstate (envoyproxy#14297)
  extension: use bool_flag to control extension link (envoyproxy#14240)
  stats: Factor out creation of cluster-stats StatNames from creation of the stats, to save CPU during xDS updates (envoyproxy#14028)
  test: add scaled timer integration test (envoyproxy#14290)
  [Win32 Signals] Add term and ctrl-c signal handlers (envoyproxy#13954)
  config: v2 transport API fatal-by-default. (envoyproxy#14223)
  matcher: fix UB bug caused by dereferencing a bad optional (envoyproxy#14271)
  test: putting fake upstream config in a struct (envoyproxy#14266)
  wasm: use Bazel rules from Proxy-Wasm Rust SDK. (envoyproxy#14292)
  docs: fix typo (envoyproxy#14237)
  dependencies: allowlist CVE-2018-21270 to prevent false positives. (envoyproxy#14294)
  typo in redis doc (envoyproxy#14248)
  access_loggers: removed redundant dep (envoyproxy#14274)
  fix http2 flaky test (envoyproxy#14261)
  test: disable flaky xds_integration_test. (envoyproxy#14287)
  http: add functionality to configure kill header in KillRequest proto (envoyproxy#14288)
  ...

Signed-off-by: Michael Puncel <[email protected]>