
Please provide real benchmark data and server information. #4

Closed
abacaj opened this issue Nov 24, 2015 · 11 comments

abacaj commented Nov 24, 2015

The claim of 1M concurrent connections is a pretty big one. Please provide the following:

  • What machine was used to handle 1M connections? E.g. m3.2xlarge (8 CPUs, 30 GB memory).
    • To put it into perspective, node.js can handle 800k connections on an m3.2xlarge.
  • Are these just ping/pong connections? If so, then the actual throughput/rps is much lower than 1 million.
    • G-WAN + Go can handle an average of 784,113 RPS (at least according to their homepage).
  • What was the average latency for handling 1M concurrent connections?
  • Were there any bottlenecks? E.g. is it just hardware that is holding this library back from achieving more throughput?

Thank you.

@abacaj
Author

abacaj commented Nov 24, 2015

My initial tests show a lot of failures: with only 100 concurrent users at 5 req/sec, throughput drops by 8% (unacceptable) and siege fails.

It seems to average ~1800 req/sec, which is only 4x better than net/http, not 10x :)

Any ideas? Perhaps provide some sample code for me to test with.

In my sample code I am using err := server.ListenAndServe(":8000")

[screenshot: siege output, 2015-11-24]

@valyala
Owner

valyala commented Nov 28, 2015

What machine was used to handle 1m connections?

1M concurrent connections with 100K rps were achieved in production, not in a test environment. The server had the following configuration:

  • 8xCPU Intel(R) Xeon(R) CPU E5-1630 v3
  • 64GB RAM
  • 1Gbit network

Are these just ping/pong connections?

Long-lived keep-alive connections are established by video clients all over the world. Clients periodically send event requests to the server over these connections. The server pushes event data to the db and sends back just a transparent pixel. Every client sends an event every 10 seconds on average.
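
For illustration only, a rough sketch of such an event endpoint in fasthttp (the event fields, queue size, and response payload are placeholders, not the production code; in production the response is the transparent pixel described above):

```go
package main

import (
	"log"

	"github.com/valyala/fasthttp"
)

// event is a placeholder for whatever a client reports.
type event struct {
	clientID string
	payload  []byte
}

// A buffered channel decouples the request path from the db writers,
// so a slow insert never blocks a client connection.
var events = make(chan event, 100000)

// Any small static payload works for the sketch; the production setup
// described above returns a 1x1 transparent GIF instead.
var pixel = []byte("ok")

func eventHandler(ctx *fasthttp.RequestCtx) {
	ev := event{
		clientID: string(ctx.QueryArgs().Peek("id")),
		// Copy the body: fasthttp reuses ctx buffers after the handler returns.
		payload: append([]byte(nil), ctx.PostBody()...),
	}
	select {
	case events <- ev:
	default:
		// Queue full: drop (or count) rather than stall the connection.
	}
	ctx.SetContentType("image/gif")
	ctx.Write(pixel)
}

func main() {
	go func() {
		for ev := range events {
			_ = ev // push the event to the database here
		}
	}()
	s := &fasthttp.Server{Handler: eventHandler}
	log.Fatal(s.ListenAndServe(":8080"))
}
```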

What was the average latency for handling 1m concurrent connections?

Less than 100ms from the client side.

Were there any bottlenecks? E.g. is it just hardware that is holding this library back from achieving more throughput?

The main bottleneck was the 1Gbit network, so we moved to 10Gbit :)
Also, the db (postgres) could handle only 100K inserts per second over a single db connection, so now we push event data over multiple db connections (sketched below).

We have now moved to a 32-CPU, 128GB RAM, 10Gbit server. Preliminary results show that the server can handle over 500K rps. Unfortunately, we have no 5M concurrent clients yet for testing such a load :(
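
A rough sketch of that fan-out, assuming database/sql with the lib/pq driver and a hypothetical events(payload) table (the real setup likely batches inserts or uses COPY; this only shows spreading the load over several connections):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // postgres driver; chosen here for illustration
)

// spreadInserts fans event rows out over several db connections instead of
// serializing everything through one. nWriters concurrent writers means up
// to nWriters connections in use (database/sql pools them).
func spreadInserts(db *sql.DB, rows <-chan []byte, nWriters int) {
	db.SetMaxOpenConns(nWriters)
	for i := 0; i < nWriters; i++ {
		go func() {
			for payload := range rows {
				if _, err := db.Exec(`INSERT INTO events (payload) VALUES ($1)`, payload); err != nil {
					log.Printf("insert failed: %v", err)
				}
			}
		}()
	}
}

func main() {
	// Connection string and schema are placeholders.
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/events?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	rows := make(chan []byte, 100000)
	spreadInserts(db, rows, 8) // 8 writers ≈ 8 db connections; tune for the db
	// ... HTTP handlers feed `rows` with event payloads ...
	select {}
}
```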

Any idea? Perhaps provide some sample code for me to test with.

The rps seems too low for both net/http and fasthttp. Maybe your request handler is too heavy. See sample code from the pull request to TechEmpower benchmarks.
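
Not the benchmark code itself, but a handler in that spirit is close to a no-op; something like the following keeps the measurement on the HTTP stack rather than on the handler:

```go
package main

import (
	"log"

	"github.com/valyala/fasthttp"
)

// Near-no-op handler: no db calls, no heavy allocations, so siege/wrk
// measures the HTTP layer rather than application work.
func plaintextHandler(ctx *fasthttp.RequestCtx) {
	ctx.SetContentType("text/plain")
	ctx.WriteString("Hello, World!")
}

func main() {
	// Same pattern as the snippet quoted earlier in this issue, listening on :8000.
	if err := fasthttp.ListenAndServe(":8000", plaintextHandler); err != nil {
		log.Fatalf("error in ListenAndServe: %v", err)
	}
}
```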

@valyala
Owner

valyala commented Nov 28, 2015

FYI, the server process ate 10GB of RAM when serving 1M concurrent connections, i.e. ~10KB per connection, including the memory required for pushing event data to the db, memory fragmentation, and GC overhead.

@abacaj
Author

abacaj commented Nov 28, 2015

thanks for the example, it's useful 👍

@rkravchik

My initial tests show a lot of failures: with only 100 concurrent users at 5 req/sec, throughput drops by 8% (unacceptable) and siege fails.
[screenshot: siege output, 2015-11-24]

@abacaj, perhaps those errors are caused by a small pool and unlimited dialing (see golang/go#6785).
@valyala, do you plan to limit max in-flight dialing?

@erikdubbelboer
Collaborator

@rkravchik judging from the first lines of your screenshot, you aren't using keep-alive connections and your system/user ran out of available file descriptors or address:port combinations. If you're going to do a benchmark, please do it properly.

@rkravchik

rkravchik commented Nov 13, 2018

@erikdubbelboer it was a reply to the second post. Address your message to the proper recipient.
Moreover, if you read the issue in the golang repo, you'll find why the system may run out of descriptors due to improper behaviour.

@erikdubbelboer
Collaborator

@rkravchik I'm so sorry, apparently I wasn't awake yet and didn't notice you were quoting a previous comment.

You can use Client.MaxConnsPerHost to limit the max in-flight dialing.
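
A minimal sketch of that knob on the client side (the limit value and URL are arbitrary):

```go
package main

import (
	"log"

	"github.com/valyala/fasthttp"
)

func main() {
	c := &fasthttp.Client{
		// Hard cap on connections (and therefore dials) per host. With the
		// default settings, requests fail fast with fasthttp.ErrNoFreeConns
		// once all of them are busy, instead of dialing more.
		MaxConnsPerHost: 512,
	}

	statusCode, body, err := c.Get(nil, "http://localhost:8000/")
	if err != nil {
		log.Fatalf("request failed: %v", err)
	}
	log.Printf("status=%d, body=%d bytes", statusCode, len(body))
}
```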

@rkravchik

rkravchik commented Nov 14, 2018

@erikdubbelboer net/http's Transport also has these knobs:
MaxIdleConns int
MaxIdleConnsPerHost int
But under some circumstances (described in golang/go#6785) there can be many dials in flight, which is why Go 1.11 added one more knob: MaxConnsPerHost.

As far as I can see in the code, Client.MaxConnsPerHost is a hard limit that causes an ErrNoFreeConns error, so there is no way to hit the problem that exists in net/http below Go 1.11. Am I right?
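
For comparison, the net/http side of this (values are arbitrary; MaxConnsPerHost requires Go 1.11+, and net/http queues requests at that cap rather than returning an error):

```go
package main

import (
	"log"
	"net/http"
	"time"
)

// newClient shows the three Transport knobs mentioned above. The idle limits
// alone do not bound a burst of new dials; MaxConnsPerHost (Go 1.11+) does.
func newClient() *http.Client {
	return &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			MaxIdleConns:        100,
			MaxIdleConnsPerHost: 10,
			MaxConnsPerHost:     100, // Go 1.11+
		},
	}
}

func main() {
	resp, err := newClient().Get("http://localhost:8000/")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println("status:", resp.Status)
}
```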

@erikdubbelboer
Collaborator

@rkravchik

@erikdubbelboer thank you for your patience.
