General performance figures and optimisations #214

Closed
3 tasks done
emcfarlane opened this issue Jul 19, 2018 · 5 comments
@emcfarlane

General question on performance figures for confluent-kafka-go and how it compares to sarama. I am running Go benchmarks locally using the channel-based Producer/Consumer and am getting results around 2x worse than sarama. Settings for consumer:

	"group.id":                        groupID,
	"default.topic.config":            kafka.ConfigMap{"auto.offset.reset": "earliest"},
	"go.events.channel.enable":        true,
	"go.application.rebalance.enable": true,
	"batch.num.messages":              100000,

Settings for producer:

	"compression.codec":       "snappy",
	"request.required.acks":   -1,
	"socket.keepalive.enable": true,
	"queue.buffering.max.ms":  2,
	"batch.num.messages":      100000,

Setting "queue.buffering.max.ms" had the largest effect, dropping the write time for 100k events from 20s to 1s. How can I improve this?

I am not able to reproduce this gist's results:
https://gist.github.com/savaki/a19dcc1e72cb5d621118fbee1db4e61f

Checklist

Please provide the following information:

  • confluent-kafka-go and librdkafka version (LibraryVersion()): v0.11.4
  • Apache Kafka broker version: v1.1.0
  • Operating system: MacOS
@rnpridgeon

I don't know too much about the benchmark from the gist, but it appears to measure consumer performance exclusively. From the description you provided, however, it sounds like you are measuring end-to-end throughput.

Either way, I would expect increasing queue.buffering.max.ms (aka linger.ms) to yield the best results for producer throughput. The idea is that sacrificing a bit of latency lets you accumulate a larger batch over which to amortize the cost of compression. This of course assumes that your producer is not reaching batch.num.messages and/or message.max.bytes (the size limit for a given message set) before queue.buffering.max.ms elapses.
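To make the linger/batch trade-off concrete, here is a rough back-of-the-envelope sketch. It is a simplified model, not librdkafka's actual scheduling: it assumes a steady arrival rate on a single partition queue, ignores message.max.bytes, and the function name `estimateBatch` and the 50,000 msg/s rate are made up for illustration.

```go
package main

import "fmt"

// estimateBatch is a hypothetical helper (not part of any Kafka client).
// Under the simplifying assumption of a steady arrival rate on one partition
// queue, it estimates how many messages end up in a batch and which limit
// closes the batch first. It ignores message.max.bytes entirely.
func estimateBatch(ratePerSec, lingerMs, batchNumMessages int) (int, string) {
	// Messages that arrive while the producer lingers.
	arrived := ratePerSec * lingerMs / 1000
	if arrived >= batchNumMessages {
		// The count limit is reached before the linger time elapses,
		// so raising the linger time further would not grow the batch.
		return batchNumMessages, "batch.num.messages"
	}
	return arrived, "queue.buffering.max.ms"
}

func main() {
	const assumedRate = 50000 // messages per second, an assumed figure
	for _, lingerMs := range []int{2, 100, 5000} {
		n, trigger := estimateBatch(assumedRate, lingerMs, 100000)
		fmt.Printf("linger=%dms: about %d messages per batch (closed by %s)\n",
			lingerMs, n, trigger)
	}
}
```

In this model, the settings from the issue (2 ms linger, batch.num.messages of 100000) would produce batches of only about 100 messages at the assumed rate, which is why a longer linger time could help compression and throughput at the cost of latency.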

Presumably there are some start-up costs associated with this test as well that the measurement may be including. I would try disabling api.version.request as well as setting broker.version.fallback to the target broker version, 1.1.0 based on the checklist you provided.
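As a sketch of that suggestion, the two properties could be layered on top of the producer settings from the issue. A plain `map[string]interface{}` stands in for `kafka.ConfigMap` here so the snippet runs without the client library; the helper name `pinBrokerVersion` is hypothetical.

```go
package main

import "fmt"

// pinBrokerVersion is a hypothetical helper. It returns a copy of base with
// the API version handshake disabled and the broker version pinned, as
// suggested in the comment above. The input map is left unmodified.
func pinBrokerVersion(base map[string]interface{}, brokerVersion string) map[string]interface{} {
	out := make(map[string]interface{}, len(base)+2)
	for k, v := range base {
		out[k] = v
	}
	out["api.version.request"] = false
	out["broker.version.fallback"] = brokerVersion
	return out
}

func main() {
	// Producer settings copied from the issue description.
	producer := map[string]interface{}{
		"compression.codec":       "snappy",
		"request.required.acks":   -1,
		"socket.keepalive.enable": true,
		"queue.buffering.max.ms":  2,
		"batch.num.messages":      100000,
	}
	cfg := pinBrokerVersion(producer, "1.1.0")
	fmt.Println(cfg["api.version.request"], cfg["broker.version.fallback"])
}
```

Whether this removes a measurable start-up cost would need to be checked by timing the benchmark with and without the two properties.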

@mhowlett
Contributor

mhowlett commented Jul 27, 2018

I did a similar test a while back and saw a similar differential: https://gist.github.com/mhowlett/e9491aad29817aeda6003c3404874b35

The primary reason to go with the confluent client is reliability. librdkafka is very widely used and tested and this go client leverages that to provide the core functionality (i.e. all the bits that are most likely to be buggy). It's not that hard to write a kafka client, but the interaction with the cluster is quite involved and it is hard to write one that handles all the error scenarios well.

Update: actually, produce throughput was similar. You should check out that gist.

@OneCricketeer

Maybe time to update benchmarks with librdkafka 1.0 release?

@devdinu

devdinu commented Sep 13, 2019

We've noticed that the latency between produce time and consume time from Kafka using the confluent-kafka-go client is high, greater than 5 seconds. We suspect the library's config, since it batches and optimises internally, which doesn't suit our low-latency needs.

Having benchmarks with documented configurations (for high throughput, low latency, reliability) would be helpful. @edenhill

@edenhill
Contributor

edenhill commented Oct 7, 2019

The librdkafka docs are a good starting point: https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance
