Resolve Consul connection leaks by using SimpleURIConnectionPool #1657

Merged 62 commits into 1.6.x on Mar 16, 2023

miklish (Collaborator) commented Mar 16, 2023

SimpleConnectionPool

SimpleConnectionPool is a simple, natively mockable, threadsafe, resilient, high-performance connection pool intended to replace com.networknt.client.http.Http2ClientConnectionPool.

Use of this connection pool fully resolves issue 1656 (#1656) by replacing the use of Http2ClientConnection in com.networknt.consul.client.ConsulClientImpl.lookupHealthService() with SimpleURIConnectionPool.

Focus on Stability and Testability

The primary goal of SimpleConnectionPool is operational stability and reliability. Optimizations were considered only after stability and resilience were confirmed.

Easy testability was vital to confirm the correct behaviour of the connection pool under a wide array of common and corner-case conditions. To this end, the connection pool was developed with mockability built-in.

Avoidance of Consul Connection Bug

The following Consul connection-leak bug was reported in August 2020, but has yet to be resolved.

When a Consul client cancels a blocking HTTP query, the TCP connection is not closed correctly by the server. The TCP connection stays in the FIN_WAIT-2 state until its tcp_fin_timeout expires. FIN_WAIT-2 means that the client's FIN has been acknowledged and the client is still waiting for the server to send its own FIN.

When a lot of blocking queries are cancelled and retried at the same time, the server's http_max_conns_per_client limit can be hit, after which further client queries will fail.

Bug Ticket URL: hashicorp/consul#8524

SimpleConnectionPool avoids this bug entirely by never closing connections that are in use. While this can cause connection expiry times to not be precisely honoured, it also ensures that Consul blocking queries are never canceled before they complete, thereby avoiding the conditions that manifest this bug.

Multi-Layer Architecture

The connection pool was developed using a multi-layer architecture.

Level 1: Routing to URI connection pools

Threads that need a connection to a URI must 'borrow' a connection from the pool and then restore that connection to the pool when they are done with it.

A SimpleConnectionPool routes these URI-connection requests to a SimpleURIConnectionPool for that URI. If a SimpleURIConnectionPool does not exist for that URI, the SimpleConnectionPool will create one.

The SimpleConnectionPool needs only minimal thread synchronization since the SimpleURIConnectionPools are already fully thread safe. A benefit of this is increased opportunity for concurrency. For example, N threads can request connections to N distinct URIs concurrently.
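
The routing described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: UriPoolSketch and RoutingPoolSketch are hypothetical names, and the use of ConcurrentHashMap.computeIfAbsent is an assumption about one way to keep synchronization at the routing layer small.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a per-URI pool; the real SimpleURIConnectionPool does far more. */
class UriPoolSketch {
    final URI uri;
    UriPoolSketch(URI uri) { this.uri = uri; }
}

/** Hypothetical router: maps each URI to its own pool, creating the pool on first use. */
class RoutingPoolSketch {
    private final Map<URI, UriPoolSketch> pools = new ConcurrentHashMap<>();

    UriPoolSketch poolFor(URI uri) {
        // computeIfAbsent is atomic per key, so two threads asking for the same
        // new URI receive the same pool instance
        return pools.computeIfAbsent(uri, UriPoolSketch::new);
    }

    int poolCount() { return pools.size(); }
}
```

Because the per-URI pools are assumed to be threadsafe on their own, the router only has to guarantee that each URI ends up with exactly one pool.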

Level 2: URI connection pools

A SimpleURIConnectionPool manages connections to a single URI. Each connection it manages can be used by one or more threads concurrently. A SimpleURIConnectionPool:

  • Enforces a configurable maximum number of connections
  • Closes connections after a configurable 'expiry' time, but ensures that it never closes connections that are in use
  • Safely manages connection resources (if client behaves as expected)
  • Safely closes leaked connections that may be created by connection-creation callback threads after a connection-creation timeout has occurred in the parent thread
  • Is threadsafe

Also, note that SimpleURIConnectionPools can be used independently of SimpleConnectionPool. This means that any code that only needs to connect to a single URI can use a SimpleURIConnectionPool directly; it does not need to borrow connections from a SimpleConnectionPool (which is only needed for code that connects to multiple distinct URIs).

The SimpleURIConnectionPool needs very little information about connections to manage the pool. It only needs to know all of the following boolean properties of the connection:

  • is the connection borrowed
  • is the connection borrowable
  • is the connection expired
  • is the connection closed
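
To illustrate how little a pool needs beyond these four properties, here is a minimal sketch of a cleanup pass and a borrow selection written only in terms of them. HolderView, FakeHolder and PoolSweepSketch are hypothetical names invented for this example; the real SimpleURIConnectionPool logic is more involved.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical view of the four properties listed above, plus a close operation. */
interface HolderView {
    boolean borrowed();
    boolean borrowable();
    boolean expired();
    boolean closed();
    void close();
}

/** Simple mutable holder used only to exercise the sketch. */
class FakeHolder implements HolderView {
    boolean borrowed, borrowable, expired, closed;
    FakeHolder(boolean borrowed, boolean borrowable, boolean expired) {
        this.borrowed = borrowed; this.borrowable = borrowable; this.expired = expired;
    }
    public boolean borrowed() { return borrowed; }
    public boolean borrowable() { return borrowable; }
    public boolean expired() { return expired; }
    public boolean closed() { return closed; }
    public void close() { closed = true; borrowable = false; }
}

class PoolSweepSketch {
    /** Drops closed holders; closes and drops holders that are expired but not in use. */
    static void sweep(List<HolderView> holders) {
        // an explicit Iterator allows removal while iterating
        Iterator<HolderView> it = holders.iterator();
        while (it.hasNext()) {
            HolderView h = it.next();
            if (h.closed()) {
                it.remove();
            } else if (h.expired() && !h.borrowed()) {
                h.close();
                it.remove();
            }
            // expired but still borrowed: left alone until it is restored
        }
    }

    /** Returns the first holder that can still be borrowed, or null if none can. */
    static HolderView pickBorrowable(List<HolderView> holders) {
        for (HolderView h : holders) {
            if (!h.closed() && !h.expired() && h.borrowable()) return h;
        }
        return null;
    }
}
```

Note that an expired connection that is still borrowed is never closed by the sweep, which mirrors the guarantee that in-use connections are left alone.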

Level 3: Connection Holders

SimpleConnectionHolder objects translate the state of connections into the smaller set of states relevant to the SimpleURIConnectionPool (namely borrowed, borrowable, expired, closed).

SimpleConnectionHolders have a 1-1 relationship with connections, meaning that every connection in the pool is contained by a single SimpleConnectionHolder, and every SimpleConnectionHolder contains exactly one connection. In the docs, we therefore often refer to connections and their connection holders interchangeably.

SimpleConnectionHolders also keep track of whether a connection is currently in use or not through the use of ConnectionToken objects that client threads must 'borrow' in order to get a connection, and must 'restore' when they are done with the connection.

When a SimpleConnectionHolder is created, it also creates the connection it holds. It does this by using a SimpleConnectionMaker factory object that is passed to it in its constructor.

The connection state exposed by the SimpleConnectionHolder adheres to the following state diagram.

State diagram for a connection

              |
             \/
    [ NOT_BORROWED_VALID ]  --(borrow)-->  [ BORROWED_VALID ]
              |             <-(restore)--          |
              |                                    |
           (expire)                             (expire)
              |                                    |
             \/                                   \/
    [ NOT_BORROWED_EXPIRED ] <-(restore)-- [ BORROWED_EXPIRED ]
             |
          (close) (*)
             |
            \/
        [ CLOSED ]
 

(*) NOTE

A connection can be closed explicitly by the connection pool, or it can be unexpectedly closed at any time by the OS. If it is closed unexpectedly by the OS, then the state can jump directly to CLOSED regardless of what state it is currently in.
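
The state diagram can be read as a function of three facts: whether the connection is closed, whether its expiry time has passed, and whether any borrow tokens are outstanding. The sketch below models that reading. HolderStateSketch and its methods are hypothetical, time is passed in explicitly to keep the example deterministic, and allowing several outstanding tokens at once is an assumption that models a multiplexable connection.

```java
import java.util.HashSet;
import java.util.Set;

class HolderStateSketch {
    enum State { NOT_BORROWED_VALID, BORROWED_VALID, NOT_BORROWED_EXPIRED, BORROWED_EXPIRED, CLOSED }

    /** Opaque handle given to a borrower; only its identity matters. */
    static final class Token { }

    private final Set<Token> outstanding = new HashSet<>();
    private final long expireAtMillis;
    private boolean closed = false;

    HolderStateSketch(long createdAtMillis, long lifetimeMillis) {
        this.expireAtMillis = createdAtMillis + lifetimeMillis;
    }

    /** Derives the diagram state from the closed flag, the clock, and the token set. */
    synchronized State state(long nowMillis) {
        if (closed) return State.CLOSED;
        boolean expired = nowMillis >= expireAtMillis;
        boolean borrowed = !outstanding.isEmpty();
        if (expired) return borrowed ? State.BORROWED_EXPIRED : State.NOT_BORROWED_EXPIRED;
        return borrowed ? State.BORROWED_VALID : State.NOT_BORROWED_VALID;
    }

    /** Hands out a token while the connection is valid; expired or closed connections refuse. */
    synchronized Token borrow(long nowMillis) {
        State s = state(nowMillis);
        if (s != State.NOT_BORROWED_VALID && s != State.BORROWED_VALID)
            throw new IllegalStateException("cannot borrow in state " + s);
        Token t = new Token();
        outstanding.add(t);
        return t;
    }

    synchronized void restore(Token t) { outstanding.remove(t); }

    /** Closes only from NOT_BORROWED_EXPIRED; returns whether the connection was closed. */
    synchronized boolean closeIfIdleAndExpired(long nowMillis) {
        if (state(nowMillis) != State.NOT_BORROWED_EXPIRED) return false;
        closed = true;
        return true;
    }
}
```

In this sketch the only path to an explicit close runs through NOT_BORROWED_EXPIRED, so a borrowed connection stays open until its last token is restored.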

Level 4: SimpleConnections

Objects that implement SimpleConnection interfaces must wrap a physical 'raw' connection. For example, the SimpleClientConnection class implements SimpleConnection and wraps Undertow's ClientConnection.

SimpleConnectionHolders only deal with SimpleConnection objects--they never deal directly with 'raw' connections (such as Undertow's ClientConnection).

Mockability

Since SimpleConnectionHolder objects use a SimpleConnectionMaker to create their connections, and since the connections that SimpleConnectionMakers create are SimpleConnections, the SimpleConnectionPool, SimpleURIConnectionPool, and SimpleConnectionHolder can manage the pool without any knowledge of the raw connections they are managing.

This means connections can be fully mocked by simply implementing SimpleConnection and creating a SimpleConnectionMaker that instantiates these mock connections.

Mock Test Harness

SimpleConnectionPool comes with a test harness that can be used to easily test mock connections. For example, a test of how the connection pool handles a connection that randomly closes can be built as follows:

1. Develop the mock connection

import java.util.concurrent.ThreadLocalRandom;

/**
 * This mock connection simulates a multiplexable connection that has a 5% chance of closing
 * every time isOpen() is called.
 */
public class MockRandomlyClosingConnection implements SimpleConnection {
    // volatile so that a close seen by one borrower thread is visible to the others
    private volatile boolean closed = false;
    private final boolean isHttp2;
    // fake local address with a random port in the range [2^15 - 1, 2^16 - 1)
    private final String MOCK_ADDRESS = "MOCK_HOST_IP:" + ThreadLocalRandom.current().nextInt((1 << 15) - 1, (1 << 16) - 1);

    public MockRandomlyClosingConnection(boolean isHttp2) { this.isHttp2 = isHttp2; }

    @Override public boolean isOpen() {
        // nextInt(20) returns 0 with probability 1/20, i.e. 5%
        if (ThreadLocalRandom.current().nextInt(20) == 0)
            closed = true;
        return !closed;
    }
    @Override public Object getRawConnection() {
        throw new RuntimeException("Mock connection has no raw connection");
    }
    @Override public boolean isMultiplexingSupported() { return isHttp2; }
    @Override public String getLocalAddress() { return MOCK_ADDRESS; }
    @Override public void safeClose() { closed = true; }
}

2. Test how the connection pool handles the connection

public class TestRandomlyClosingConnection {
    public static void main(String[] args) {
        new TestRunner()
            // set connection properties
            .setConnectionPoolSize(100)
            .setSimpleConnectionClass(MockRandomlyClosingConnection.class)
            .setCreateConnectionTimeout(5)
            .setConnectionExpireTime(5)
            .setHttp2(true)

            // configure borrower-thread properties
            .setNumBorrowerThreads(8)
            .setBorrowerThreadStartJitter(3)
            .setBorrowTimeLength(5)
            .setBorrowTimeLengthJitter(5)
            .setWaitTimeBeforeReborrow(2)
            .setWaitTimeBeforeReborrowJitter(2)

            // execute test
            .setTestLength(10*60)
            .executeTest();
    }
}

Plugability

SimpleConnectionPool requires very little information about a connection in order to manage that connection. This lends itself to supporting many different networking APIs. If one can implement a SimpleConnection and SimpleConnectionMaker using a networking API, then connections created using that API can be managed by SimpleConnectionPool.

ClientConnections created using Undertow's libraries can be wrapped this way. See the simplepool.undertow package to see how support for Undertow was implemented.
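
As a rough illustration of wrapping a different networking API, the sketch below adapts a plain java.net.Socket. SimpleConnectionSketch is a local stand-in that only mirrors the methods overridden by the mock connection shown earlier; it is not the real SimpleConnection interface, and SocketConnectionSketch is a hypothetical class that is not part of this PR.

```java
import java.io.IOException;
import java.net.Socket;

/** Local stand-in mirroring the methods the mock connection in this description overrides. */
interface SimpleConnectionSketch {
    boolean isOpen();
    Object getRawConnection();
    boolean isMultiplexingSupported();
    String getLocalAddress();
    void safeClose();
}

/** Hypothetical adapter: lets a pool manage a java.net.Socket without knowing about sockets. */
class SocketConnectionSketch implements SimpleConnectionSketch {
    private final Socket socket;

    SocketConnectionSketch(Socket socket) { this.socket = socket; }

    // open means the socket was connected at some point and has not been closed since
    @Override public boolean isOpen() { return socket.isConnected() && !socket.isClosed(); }

    @Override public Object getRawConnection() { return socket; }

    // assumption: a raw TCP socket is treated as serving one borrower at a time
    @Override public boolean isMultiplexingSupported() { return false; }

    // yields the string "null" if the socket is not yet bound
    @Override public String getLocalAddress() { return String.valueOf(socket.getLocalSocketAddress()); }

    @Override public void safeClose() {
        try {
            socket.close();
        } catch (IOException ignored) {
            // closing is best-effort; the pool only needs the connection gone
        }
    }
}
```

A matching connection maker would open the socket and return the wrapper, in the same way the simplepool.undertow package is described as doing for Undertow's ClientConnection.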

How to safely use the SimpleURIConnectionPool

The following code snippet demonstrates the recommended way to structure code that borrows a connection...

SimpleConnectionHolder.ConnectionToken borrowToken = null;
try {
    borrowToken = pool.borrow(createConnectionTimeout, isHttp2);
    ClientConnection connection = (ClientConnection) borrowToken.getRawConnection();

    // Use connection...
    
} finally {
    // restore token
    pool.restore(borrowToken);
}    

...and this demonstrates the recommended way to borrow connections in a loop...

while(true) {
    SimpleConnectionHolder.ConnectionToken borrowToken = null;
    try {
        borrowToken = pool.borrow(createConnectionTimeout, isHttp2);
        ClientConnection connection = (ClientConnection) borrowToken.getRawConnection();

        // Use connection...
        
    } finally {
        // restore token
        pool.restore(borrowToken);
    }    
}

miklish added 30 commits March 14, 2023 00:38
…SimpleConnectionMaker to create connections) to make testing easier
… to SimpleConnection 2. added example mock connection and mock connection maker
…-synchronized since they are only used in this class and only called directly or transitively by synchronized methods
…y used in this class and only called directly or transitively by synchronized methods
…ing Iterators (to avoid ConcurrentModificationException)
…nnection' in SimpleConnectionHolder. Also, removed redundant log message SimpleConnectionHolder.safeClose()
@miklish miklish changed the title Simplepool1645 Resolve Consul connection leaks by using SimpleURIConnectionPool Mar 16, 2023
GavinChenYan (Contributor) left a comment

Looks great. Thanks

@stevehu stevehu merged commit ea57b06 into 1.6.x Mar 16, 2023
stevehu (Contributor) commented Mar 16, 2023

Thanks, @miklish @GavinChenYan @AkashWorkGit for your help.
