WebSocketListener performance tests

Val edited this page Apr 30, 2015 · 17 revisions

Performance has improved by around 15% since I wrote this article. I will try to find time to update these test results.

These tests aimed to find bottlenecks and memory leaks in the application, and to ensure that the component does not cause excessive overhead or blocking.

The testing environment

  • Intel Core i7 3720QM, 4 cores (8 logical processors), 8 GB RAM, Windows 7 Professional x64.
  • Intel Xeon X3430 2.40GHz, 4 cores, 5 GB RAM, Windows Server 2012 x64.
  • Linksys WRT54GL used as 100Mbps switch.

The tested application is the Echo server included in the samples. It runs on the Windows 7 box.

The client application is a simple console application that uses System.Net.WebSockets.ClientWebSocket to create multiple connections. It runs on the Windows Server 2012 box.

...but... why is a server OS running as client, and a potential client OS running as server?

System.Net.WebSockets.ClientWebSocket does not work on Windows 7, and I have no short- or medium-term plans to install Windows 8 on my laptop.

Test structure

The tests show these performance counters:

  • Messages In/sec (red): messages read per second. It is defined in the EchoServer application.
  • Messages Out/sec (blue): messages sent per second. It is defined in the EchoServer application.
  • Connected (green): number of clients connected.
  • % Processor Time (yellow): processor time for the EchoServer process. It is the sum of the % time in all processors. The computer hosting the server has 8 logical processors.
  • Working Set (pink): Memory working set for the EchoServer process.
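Because the % Processor Time counter is summed over all logical processors, a reading can exceed 100%. A minimal sketch of converting such a reading into a share of the whole machine follows. The function name and the sample readings are hypothetical; only the count of 8 logical processors comes from the test environment.

```python
LOGICAL_PROCESSORS = 8  # the server box, as stated in the test environment


def normalized_cpu_percent(summed_percent: float,
                           logical_processors: int = LOGICAL_PROCESSORS) -> float:
    """Convert a per-process reading summed over all processors
    into a percentage of the machine's total CPU capacity."""
    return summed_percent / logical_processors


if __name__ == "__main__":
    # Hypothetical readings, not values taken from the screenshots.
    for reading in (100.0, 200.0, 800.0):
        print(f"{reading:.0f}% summed = {normalized_cpu_percent(reading):.1f}% of the machine")
```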

In the screenshots, the "Messages Out/sec" line is not visible because the "Messages In/sec" line overlaps it. The lines should be as flat as possible, indicating that resource consumption and throughput are stable.

The client connects to the server and starts a loop that sends a message, waits for the result and loops again. The server just echoes the input back to the client.
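The send, wait, repeat pattern described above can be sketched as follows. This is not the actual test client, which is a .NET console application using ClientWebSocket. It is a small Python illustration in which in-memory asyncio queues stand in for the WebSocket connection, and every name in it is hypothetical.

```python
import asyncio


async def echo_server(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Stand-in for the echo server: send every message straight back."""
    while True:
        message = await inbox.get()
        await outbox.put(message)


async def client_loop(payload: bytes, iterations: int) -> int:
    """Closed loop: send one message, wait for its echo, then send the next."""
    to_server: asyncio.Queue = asyncio.Queue()
    from_server: asyncio.Queue = asyncio.Queue()
    server = asyncio.create_task(echo_server(to_server, from_server))
    echoed = 0
    try:
        for _ in range(iterations):
            await to_server.put(payload)
            reply = await from_server.get()  # nothing is sent until the echo arrives
            if reply == payload:
                echoed += 1
    finally:
        server.cancel()
        await asyncio.gather(server, return_exceptions=True)
    return echoed


async def run_clients(clients: int, payload_size: int, iterations: int) -> int:
    """Run several closed-loop clients concurrently and count echoed messages."""
    payload = b"x" * payload_size
    results = await asyncio.gather(
        *(client_loop(payload, iterations) for _ in range(clients))
    )
    return sum(results)


def run_demo(clients: int = 10, payload_size: int = 275, iterations: int = 100) -> int:
    return asyncio.run(run_clients(clients, payload_size, iterations))


if __name__ == "__main__":
    print(run_demo(), "messages echoed")
```

Since each client has at most one message in flight, the total throughput is bounded by the number of clients divided by the round-trip time.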

... wait a second... shouldn't this be tested the other way around? Since 'push' is the most attractive feature of WebSocket, shouldn't the server send data and the client echo it?

It doesn't really matter. The idea is to have the server do as little work as possible so that the component's throughput is measured correctly.

Test 1: 275 Bytes Messages and 1000 concurrent clients

This test lasted 30 minutes, using 1000 concurrent clients sending small messages of 275 bytes each as fast as they could (each client sends and then waits for the reply before sending again). The average throughput was around 30,188 messages/second, or about 30 messages/second per client in each direction. Increasing the number of clients any further caused timeouts. If you need more processing power than this, you should load balance.
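As a quick check of the per-client figure, the arithmetic can be written out as below. The round-trip estimate is my own inference from the closed-loop design (one message in flight per client), not a measured value, and the function names are hypothetical.

```python
def per_client_rate(total_msgs_per_sec: float, clients: int) -> float:
    """Messages per second handled for each client, in one direction."""
    return total_msgs_per_sec / clients


def implied_round_trip_ms(rate_per_client: float) -> float:
    """With one message in flight per client, each send/echo cycle
    takes 1 / rate seconds on average."""
    return 1000.0 / rate_per_client


rate = per_client_rate(30_188, 1000)
print(f"{rate:.2f} messages/second per client")             # 30.19
print(f"{implied_round_trip_ms(rate):.1f} ms per round trip")  # 33.1
```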

Test 2: 1024 Bytes Messages and 4000 concurrent clients

These tests lasted one hour each, using 4000 clients sending messages of 1024 bytes. The traffic took up almost 100% of my 100Mbps bandwidth:

Normal WebSocket (WS://)

With ws:// (i.e., without TLS), it provided a throughput of around 11,140 messages/second up and 11,140 messages/second down in total. That makes about 2.79 messages/second in each direction per client.
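The claim that this traffic nearly fills the 100Mbps link can be checked with a short calculation. This sketch counts message payload only; WebSocket framing and TCP/IP and Ethernet headers add to it, so the real figure on the wire is somewhat higher. The function name is hypothetical.

```python
def payload_mbps(msgs_per_sec: float, message_bytes: int) -> float:
    """Payload bandwidth in one direction, in megabits per second (10^6 bits)."""
    return msgs_per_sec * message_bytes * 8 / 1_000_000


ws = payload_mbps(11_140, 1024)
print(f"{ws:.1f} Mbps of payload in each direction")  # 91.3
```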

Secure WebSocket (WSS://)

With wss:// (i.e., using TLS), it provided a throughput of around 10,640 messages/second up and 10,640 messages/second down in total. That makes 2.66 messages/second in each direction per client. So there is a small processing delay when using TLS with this message length.
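To put a number on that difference, the relative drop in throughput between the two runs can be computed from the figures above. The helper name is hypothetical, and this compares only the two totals reported here.

```python
def relative_drop(baseline: float, measured: float) -> float:
    """Fraction by which `measured` falls short of `baseline`."""
    return (baseline - measured) / baseline


drop = relative_drop(11_140, 10_640)
print(f"TLS throughput is {drop:.1%} lower")  # 4.5%
```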

Conclusion

The component seems light and does not create much overhead. Still, if you have to handle this amount of traffic, I recommend load balancing it across two or more servers.

Take a look at the configuration parameters, and use testing to find your best configuration. I observed that:

  • Small and massive messages have more impact on secure WebSocket connections.
  • The parallel negotiations are very helpful when working with TLS.
  • If the server goes down with 5000 clients connected, the reconnection is going to be an ugly process.