
WebsocketClient creates more connections than needed #4904

Closed
michaelkwan opened this issue May 22, 2020 · 16 comments · Fixed by #4911

@michaelkwan

Jetty version
9.4.28
Java version
1.8.0.242 adopt openjdk
OS type/version
mac
Description
I noticed that the WebSocket client creates more connections than needed when multiple connect() calls are made concurrently.

When I run the following, the established connection count goes up to 64 (I expected 50):

watch -n 0.5 "netstat -an -p tcp | awk '{ print \$4, \$5, \$6 }' | grep ESTABLISHED | grep 174.129.224 | wc -l"

import java.net.URI;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.util.component.LifeCycle;
import org.eclipse.jetty.util.ssl.SslContextFactory;
import org.eclipse.jetty.websocket.client.ClientUpgradeRequest;
import org.eclipse.jetty.websocket.client.WebSocketClient;

public class SimpleSecureEchoClient
{
  public static void main(String[] args)
  {
    String destUri = "wss://echo.websocket.org";
    if (args.length > 0)
    {
      destUri = args[0];
    }

    SslContextFactory ssl = new SslContextFactory.Client();
    ssl.addExcludeProtocols("tls/1.3");
    ssl.setExcludeCipherSuites(); // websocket.org only uses WEAK cipher suites
    HttpClient http = new HttpClient(ssl);
    WebSocketClient client = new WebSocketClient(http);
    ExecutorService executorService = Executors.newFixedThreadPool(50);

    try
    {
      http.start();
      client.start();

      URI echoUri = new URI(destUri);


      for (int i = 0; i < 50; i++) {
        executorService.submit(() -> connect(client, echoUri));
      }

    }
    catch (Throwable t)
    {
      t.printStackTrace();
    }
    finally
    {
      executorService.shutdown();
      try {
        executorService.awaitTermination(10, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
      stop(http);
      stop(client);
    }
  }

  private static void connect(final WebSocketClient client, final URI echoUri) {
    try {
      ClientUpgradeRequest request = new ClientUpgradeRequest();
      SimpleEchoSocket socket = new SimpleEchoSocket();
      request.setHeader("Origin", "https://websocket.org/");
      client.connect(socket, echoUri, request);
      System.out.printf("Connecting to : %s%n", echoUri);
      // wait for closed socket connection.
      socket.awaitClose(5, TimeUnit.SECONDS);
    } catch (Throwable t) {
      t.printStackTrace();
    }
  }

  private static void stop(LifeCycle lifeCycle)
  {
    try
    {
      lifeCycle.stop();
    }
    catch (Exception e)
    {
      e.printStackTrace();
    }
  }
}
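
For reference, SimpleEchoSocket is used above but not shown. A minimal sketch of what it could look like, modeled on the annotated echo socket from the Jetty examples (the actual class used in this test may differ):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.eclipse.jetty.websocket.api.Session;
import org.eclipse.jetty.websocket.api.annotations.OnWebSocketClose;
import org.eclipse.jetty.websocket.api.annotations.OnWebSocketConnect;
import org.eclipse.jetty.websocket.api.annotations.OnWebSocketMessage;
import org.eclipse.jetty.websocket.api.annotations.WebSocket;

@WebSocket(maxTextMessageSize = 64 * 1024)
public class SimpleEchoSocket
{
  private final CountDownLatch closeLatch = new CountDownLatch(1);

  // Block until the socket is closed or the timeout elapses.
  public boolean awaitClose(int duration, TimeUnit unit) throws InterruptedException
  {
    return closeLatch.await(duration, unit);
  }

  @OnWebSocketConnect
  public void onConnect(Session session)
  {
    System.out.printf("Got connect: %s%n", session);
  }

  @OnWebSocketMessage
  public void onMessage(String msg)
  {
    System.out.printf("Got message: %s%n", msg);
  }

  @OnWebSocketClose
  public void onClose(int statusCode, String reason)
  {
    System.out.printf("Connection closed: %d - %s%n", statusCode, reason);
    closeLatch.countDown();
  }
}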
@michaelkwan
Author

michaelkwan commented May 22, 2020

I enabled debug logging; what is interesting is that there are 50 of these logs:
2020-05-22 12:51:37.524:DBUG:oejc.AbstractConnectionPool:pool-1-thread-21: newConnection 2/64 connections 2/-1 pending
and 14 of these:
2020-05-22 12:51:37.986:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-89: newConnection 51/64 connections 14/-1 pending

If I change it to only 1 connect() call, I do not see these logs:
2020-05-22 12:51:37.986:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-89: newConnection 51/64 connections 14/-1 pending

@michaelkwan
Author

When I increase the max connections per destination to, say, 100:

    http.setMaxConnectionsPerDestination(100);

I see 100 established connections even though I only make 50 WebSocket client connect() calls.

@sbordet
Contributor

sbordet commented May 25, 2020

ssl.addExcludeProtocols("tls/1.3");

The correct string is TLSv1.3 (https://docs.oracle.com/en/java/javase/11/docs/specs/security/standard-names.html#sslcontext-algorithms).

@michaelkwan I'm not sure, but I wouldn't be surprised if wss://echo.websocket.org throttles clients to prevent random people from opening N connections to it in a tight loop or concurrently 😉
The extra connections may be due to the fact that they are opened by the client but then closed by the server because of throttling.

Can you please verify that?

If the problem reproduces with fewer connections (say 10), please run it and attach the DEBUG logs.
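
As a side note, a minimal sketch of one way to turn on that DEBUG output from code, assuming the default StdErrLog backend is in use (no SLF4J binding on the classpath), which matches the DBUG lines posted above; the equivalent -D command line flags work as well:

public class DebugLoggingSetup
{
  public static void main(String[] args) throws Exception
  {
    // Must run before the first Jetty class initializes its logger.
    System.setProperty("org.eclipse.jetty.util.log.class", "org.eclipse.jetty.util.log.StdErrLog");
    System.setProperty("org.eclipse.jetty.LEVEL", "DEBUG");

    // ... then run the reproducer above with 10 concurrent connect() calls
    // and capture stderr to a file to attach here.
  }
}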

@michaelkwan
Author

test.log

Attached is the test run with 10 concurrent connect() calls. I see 19 established connections, and 9 of them are not cleaned up immediately after the run exits. There is no error or exception to indicate any problem.

I also ran it against my own Go echo server; this does not seem to happen with such a low count. However, the fact that there is no error/exception makes it easy to think nothing is wrong (why would new connections be created when you have perfectly good established connections?)

2020-05-26 09:33:47.030:DBUG:oejc.AbstractConnectionPool:pool-1-thread-5: newConnection 1/1024 connections 1/-1 pending
2020-05-26 09:33:47.030:DBUG:oejc.AbstractConnectionPool:pool-1-thread-8: newConnection 2/1024 connections 2/-1 pending
2020-05-26 09:33:47.030:DBUG:oejc.AbstractConnectionPool:pool-1-thread-10: newConnection 3/1024 connections 3/-1 pending
2020-05-26 09:33:47.031:DBUG:oejc.AbstractConnectionPool:pool-1-thread-4: newConnection 4/1024 connections 4/-1 pending
2020-05-26 09:33:47.031:DBUG:oejc.AbstractConnectionPool:pool-1-thread-1: newConnection 6/1024 connections 6/-1 pending
2020-05-26 09:33:47.031:DBUG:oejc.AbstractConnectionPool:pool-1-thread-2: newConnection 5/1024 connections 5/-1 pending
2020-05-26 09:33:47.032:DBUG:oejc.AbstractConnectionPool:pool-1-thread-3: newConnection 7/1024 connections 7/-1 pending
2020-05-26 09:33:47.032:DBUG:oejc.AbstractConnectionPool:pool-1-thread-6: newConnection 9/1024 connections 9/-1 pending
2020-05-26 09:33:47.032:DBUG:oejc.AbstractConnectionPool:pool-1-thread-9: newConnection 10/1024 connections 10/-1 pending
2020-05-26 09:33:47.032:DBUG:oejc.AbstractConnectionPool:pool-1-thread-7: newConnection 8/1024 connections 8/-1 pending
2020-05-26 09:33:47.393:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-40: newConnection 11/1024 connections 9/-1 pending
2020-05-26 09:33:47.393:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-39: newConnection 12/1024 connections 10/-1 pending
2020-05-26 09:33:47.422:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-45: newConnection 13/1024 connections 10/-1 pending
2020-05-26 09:33:47.448:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-39: newConnection 14/1024 connections 10/-1 pending
2020-05-26 09:33:47.476:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-40: newConnection 15/1024 connections 9/-1 pending
2020-05-26 09:33:47.476:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-39: newConnection 16/1024 connections 10/-1 pending
2020-05-26 09:33:47.486:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-37: newConnection 17/1024 connections 10/-1 pending
2020-05-26 09:33:47.507:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-39: newConnection 15/1024 connections 10/-1 pending
2020-05-26 09:33:47.509:DBUG:oejc.AbstractConnectionPool:HttpClient@61e4705b-39: newConnection 16/1024 connections 10/-1 pending

@michaelkwan
Author

I did additional tests running docker containers on my machine:
1.  tested against ws://echo.websocket.org (10 concurrent threads, 19 established connections)

2.  tested against the docker version of the Kaazing gateway (which simply echoes): reproducible (100 concurrent threads, 119 established connections) (https://hub.docker.com/r/kaazing/gateway)

3.  tested against my own Go websocket server (100 concurrent threads, 100 established connections); re-tested with 200 threads, 277 established connections (https://github.com/gorilla/websocket)

None of them show any exception, stack trace, or error in their debug logs.

@michaelkwan
Author

Also, this happens with both ws and wss endpoints.

@joakime
Contributor

joakime commented May 26, 2020

(why would new connections be created when you have perfectly good established connections?)

A WebSocket connection is an HTTP connection that was upgraded to the WebSocket protocol.
It cannot be reused after it is closed.

@sbordet
Contributor

sbordet commented May 26, 2020

@michaelkwan thanks for the logs, it's indeed a bug.

Since opening connections is slow, we open 2 but still have 8 queued requests waiting for their connection.
When one of the 2 opened connections finishes its upgrade, it looks at the request queue, sees it's not empty, and asks the pool for another connection.

This is handled differently in DuplexConnectionPool vs MultiplexConnectionPool (see acquire()).

DuplexConnectionPool should look at the destination queue size and pass that number to tryCreate().

If you want to confirm this, use MultiplexConnectionPool in this way:

HttpClientTransportOverHTTP transport = new HttpClientTransportOverHTTP();
transport.setConnectionPoolFactory(destination -> new MultiplexConnectionPool(destination, 64, destination, 1));

HttpClient httpClient = new HttpClient(transport);
...

With the snippet above, I expect it to never open more connections than necessary.

Let us know if that's correct.
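
A sketch of how that suggestion could be wired into the reproducer above (the class name and surrounding setup are illustrative, not from this issue):

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.client.MultiplexConnectionPool;
import org.eclipse.jetty.client.http.HttpClientTransportOverHTTP;
import org.eclipse.jetty.util.ssl.SslContextFactory;
import org.eclipse.jetty.websocket.client.WebSocketClient;

public class MultiplexPoolEchoClient
{
  public static void main(String[] args) throws Exception
  {
    SslContextFactory.Client ssl = new SslContextFactory.Client();

    HttpClientTransportOverHTTP transport = new HttpClientTransportOverHTTP();
    // At most 64 connections per destination; maxMultiplex=1 because HTTP/1.1
    // carries one upgrade request per connection.
    transport.setConnectionPoolFactory(destination ->
        new MultiplexConnectionPool(destination, 64, destination, 1));

    HttpClient http = new HttpClient(transport, ssl);
    WebSocketClient client = new WebSocketClient(http);

    http.start();
    client.start();
    // ... issue the concurrent connect() calls as in the reproducer above ...
    client.stop();
    http.stop();
  }
}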

sbordet closed this as completed May 26, 2020
sbordet reopened this May 26, 2020
@michaelkwan
Author

@sbordet I gave it a try against the docker Kaazing gateway; however, it still creates more connections than the number of upgrade requests. See the attached logs.
test.log

sbordet added a commit that referenced this issue May 26, 2020
Fixed connection pool's `acquire()` methods to correctly take into account the number of queued requests.

Also fixed a collateral bug in `BufferingResponseListener` - wrong calculation of the max content length.

Restored `ConnectionPoolTest` that was disabled in #2540, cleaned it up, and let it run for hours without failures.

Signed-off-by: Simone Bordet <[email protected]>
@sbordet
Contributor

sbordet commented May 26, 2020

@michaelkwan can you try PR #4911? I believe it'll fix the issue.

@michaelkwan
Author

@sbordet I gave it a try, but it didn't fully work: a lower thread count seems promising, but when I increased the concurrent connects to 500 I saw 528 established connections. See the attached logs.

test.log

Also, when I stop/shut down the HttpClient, those extra connections remain in TIME_WAIT state even after the process has completed.

@joakime
Contributor

joakime commented May 27, 2020

Is there a reason upgraded connections are even in the Connection Pool?

@sbordet
Contributor

sbordet commented May 27, 2020

@joakime upgraded connections are not in the pool.
This is strictly an HttpClient issue, not particularly related to WebSocket.

sbordet added a commit that referenced this issue May 28, 2020
More fixes to the connection pool logic.
Now the connection creation is conditional, triggered by
explicit send() or failures.
The connection creation is not triggered _after_ a send(),
where we aggressively send more queued requests - or
in release(), where we send queued requests after a previous
one was completed.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue May 29, 2020
More fixes to the connection pool logic.
Now the connection close/removal aggressively sends more
requests triggering the connection creation.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue May 29, 2020
sbordet added a commit that referenced this issue Jun 1, 2020
Updates after review: added javadocs.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 1, 2020
Updates after review.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 1, 2020
Updates after review.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 1, 2020
Updates after review.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jun 1, 2020
Updates after review.

Signed-off-by: Simone Bordet <[email protected]>
@sbordet
Contributor

sbordet commented Jun 1, 2020

@michaelkwan so the issue was more complex than expected.

We have modified the code to try and minimize the number of connections created in case of concurrent requests.

However, to be absolutely accurate (i.e. spawn 500 threads and expect no more than 500 connections) we would need to grab a coarse lock, and we decided against that for performance reasons.
Furthermore, this behavior depends on the connection pool implementation - some pools are more eager to reuse connections, others more eager to create new ones.
The few additional connections that may be created will eventually be closed by the idle timeout.

To constrain the number of connections precisely, you can still use maxConnectionsPerDestination and be sure that no more connections than that will be created.

We have made a best effort to be more conservative and create fewer connections than before, but a few extra may still be created.

You should see improvements for your use case. We would appreciate feedback if you can test the latest code.
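
A short sketch of that workaround, applied to the reproducer above (the cap and timeout values are illustrative):

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.util.ssl.SslContextFactory;
import org.eclipse.jetty.websocket.client.WebSocketClient;

public class CappedConnectionsClient
{
  public static void main(String[] args) throws Exception
  {
    HttpClient http = new HttpClient(new SslContextFactory.Client());
    http.setMaxConnectionsPerDestination(50); // hard cap: the pool never exceeds this per destination
    http.setIdleTimeout(30_000);              // surplus connections are closed after 30 s of idleness

    WebSocketClient client = new WebSocketClient(http);
    http.start();
    client.start();
    // ... issue the connect() calls as in the reproducer above ...
    client.stop();
    http.stop();
  }
}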

sbordet closed this as completed in 0ae2fff Jun 1, 2020
sbordet added a commit that referenced this issue Jun 1, 2020
@michaelkwan
Author

michaelkwan commented Jun 16, 2020

@sbordet Question: with the latest fix, do you still suggest using MultiplexConnectionPool instead of DuplexConnectionPool for the WebSocket client?

@sbordet
Contributor

sbordet commented Jun 16, 2020

@michaelkwan no, stick with the default, i.e. the DuplexConnectionPool.

sbordet added a commit that referenced this issue Jul 9, 2020
Fixed MultiplexConnectionPool.acquire() to use the new boolean
parameter to decide whether or not to create a new connection.

This fixes ConnectionPoolTest instability.

Signed-off-by: Simone Bordet <[email protected]>
sbordet added a commit that referenced this issue Jul 9, 2020
Fixed MaxConcurrentStreamsTest - it was always broken.
The problem was that the call to super.onSettings(...) was done
_after_ sending the request, so the connection pool was still
configured with the default maxMultiplex=1024.

Also fixed AbstractConnectionPool to avoid a second call to
activate() if we are not trying to create a new connection.

Signed-off-by: Simone Bordet <[email protected]>