Prefork w/reuse port is NOT really faster than multi-threaded #23

nanoant · 2015-12-21T23:17:38Z

Hello, I am the author of the dumb & simple WebFrameworkBenchmark. It isn't as complete as TechEmpower's, but its purpose is just to test the framework (router) overhead.

According to your "Performance optimization tips for multi-core systems", using prefork with SO_REUSEPORT is the preferred way to scale on a multi-core system.

However, I get completely opposite numbers: with prefork I get worse results (~440k req/s) than with the simpler multi-threaded version (~480k req/s).

You can find the source for my benchmark at:

https://github.com/nanoant/WebFrameworkBenchmark/blob/master/benchmarks/go-fasthttp/helloworldserver.go

Another important observation is that the performance increase compared to net/http is around 1.8x, nowhere close to the claimed 4x-10x. Thoughts?

Therefore I humbly ask you to provide some solid benchmark examples where we can see the performance differences.

mathvav · 2015-12-22T02:22:31Z

Look at issue #4. There the author elaborates on his benchmarks and gives some of the background behind them.

nanoant · 2015-12-22T10:49:36Z

@Annonomus-Penguin I have seen that, but my doubts are about something else. The first is the recommendation to prefer SO_REUSEPORT with prefork over the normal multi-threaded setup; there is nothing about that in #4. The second is the claim that fasthttp is 4x-10x faster than net/http; again, I cannot see the proof (an example benchmark) for this claim.

Just to be clear, I am not denying that this solution is fast; it is VERY FAST. It is 4th in my benchmarks, losing only to some heavily optimized Java and 2 native C solutions. So I think it is great.

valyala · 2015-12-24T18:39:54Z

@nanoant, first of all, thanks for the benchmark.

> According to your "Performance optimization tips for multi-core systems", using prefork with SO_REUSEPORT is the preferred way to scale on a multi-core system.
> However, I get completely opposite numbers: with prefork I get worse results (~440k req/s) than with the simpler multi-threaded version (~480k req/s).

Prefork with SO_REUSEPORT scales perfectly when a lot of concurrent client connections (at least thousands) are established over a real network (preferably 10Gbit with per-CPU hardware packet queues). See #4 for details. When a small number of concurrent client connections are established over localhost, SO_REUSEPORT usually doesn't give any performance gain (as in your case).
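For readers unfamiliar with the option, here is a minimal sketch of what SO_REUSEPORT allows: two listeners bound to the same address and port, with the kernel distributing incoming connections between them. This is not fasthttp's own code; `listenReusePort` and the `soReusePort` constant are names made up for this illustration, and the hard-coded value is an assumption that holds on Linux for common architectures such as amd64 and arm64. In a prefork setup each child process would open its own listener like this.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// soReusePort is the value of SO_REUSEPORT on Linux for common
// architectures (amd64, arm64). Assumption for this sketch only;
// real code would take the constant from golang.org/x/sys/unix.
const soReusePort = 0xf

// listenReusePort (hypothetical helper) opens a TCP listener with
// SO_REUSEPORT set before bind, so several listeners can share addr.
func listenReusePort(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, soReusePort, 1)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp4", addr)
}

func main() {
	// The first listener picks a free port on localhost.
	first, err := listenReusePort("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer first.Close()

	// The second listener binds to the very same address and port.
	second, err := listenReusePort(first.Addr().String())
	if err != nil {
		fmt.Println("second listen failed:", err)
		return
	}
	defer second.Close()

	fmt.Println("two listeners share", first.Addr())
}
```

Without the socket option, the second `Listen` call would fail with "address already in use".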

> Another important observation is that the performance increase compared to net/http is around 1.8x, nowhere close to the claimed 4x-10x. Thoughts?

The 10x performance gain is achieved in synthetic benchmarks; see the server benchmark results on the main page. These benchmarks completely skip the network API provided by the operating system and test bare server implementation performance. In real life, network API overhead limits the gain: fasthttp is up to 3x faster than net/http (when the client uses HTTP pipelining).

As for the source code:

Replace

io.WriteString(ctx, "Hello World")

with

ctx.Write(helloWorldBytes)

where helloWorldBytes must be initialized in global scope as:

var helloWorldBytes = []byte("Hello World")

This may improve benchmark performance, since it eliminates io.WriteString overhead.
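A self-contained sketch of the difference, using only the standard library: when the destination writer has no `WriteString` method, `io.WriteString` falls back to `w.Write([]byte(s))`, which may copy the string on every call, whereas writing a preallocated byte slice does not need that conversion. `plainWriter`, `writeViaString` and `writeViaBytes` are hypothetical names for this illustration, not fasthttp types, and whether the copy actually allocates depends on the compiler.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// plainWriter is a hypothetical writer that exposes only Write, so
// io.WriteString cannot take its WriteString fast path.
type plainWriter struct{ buf bytes.Buffer }

func (w *plainWriter) Write(p []byte) (int, error) { return w.buf.Write(p) }

// helloWorldBytes is created once in global scope and reused.
var helloWorldBytes = []byte("Hello World")

// writeViaString converts the string to a byte slice inside
// io.WriteString on each call.
func writeViaString(w io.Writer) error {
	_, err := io.WriteString(w, "Hello World")
	return err
}

// writeViaBytes passes the preallocated slice straight to Write.
func writeViaBytes(w io.Writer) error {
	_, err := w.Write(helloWorldBytes)
	return err
}

func main() {
	a, b := &plainWriter{}, &plainWriter{}
	if err := writeViaString(a); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	if err := writeViaBytes(b); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	// Both paths produce identical output; only the per-call cost differs.
	fmt.Println(a.buf.String() == b.buf.String())
}
```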

valyala · 2015-12-24T18:56:58Z

Just added an io.WriteString performance booster at 80105c1. @nanoant, run go get -u github.com/valyala/fasthttp and re-run your tests.

valyala · 2016-01-08T14:55:47Z

Closing this issue, since this is not a bug.

valyala added the question label Jan 8, 2016

valyala closed this as completed Jan 8, 2016
