Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: make rebalance the right way #369

Closed
wants to merge 1 commit into from

Conversation

springuper
Copy link

@springuper springuper commented May 11, 2016

fix #90.

After debugging in deep, I find out that in the following case FailedToRebalanceConsumerError definitely occurs:

  1. consumer A and B startup in the same time
  2. consumer A registers itself to ZK, and then begins to rebalance
  3. consumer B registers itself to ZK, and then begins to rebalance after a very short time
  4. consumer A rebalance is done, and obtains all partitions, let's assume partition 1 and partition 2
  5. consumer B rebalance and finds that partition 2 should be it's own, but it's already obtained by consumer A, then FailedToRebalanceConsumerError occurs

To solve this problem, I change rebalance with pending status and always listen to consumersChanged event. The purpose is to make rebalance once again when there are other consumers registered to ZK in the same time rebalance is continuing.

var path = '/consumers/' + groupId + '/ids';
this.client.getChildren(
path,
function () {
if (!that.closed) {
that.listConsumers(groupId);
if (!self.closed && watch) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid unnecessary nesting watchers, referring to Zookeeper.prototype.topicExists

}
}

// Wait for the consumer to be ready
this.on('registered', function () {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consumersChanged will trigger whenever init method completes, so there is no necessary to listen to registered event.

@saurako
Copy link

saurako commented May 23, 2016

👍 .. I'm waiting for this PR to be merged so we can have multiple instances of the consumer with the same group ID work properly.

@springuper
Copy link
Author

@saurako, the maintainers are too slow to review this PR. So, I have published a forked module 'kafka-node2', https://www.npmjs.com/package/kafka-node2, you can try it now.

@saurako
Copy link

saurako commented May 26, 2016

@springuper awesome! I'll definitely try it out. Will keep you posted on how that goes. Thanks!

@hyperlink
Copy link
Collaborator

I will try to review this soon. Sorry for the wait!

@nitsnwits
Copy link

@springuper thanks for the PR.
I am facing a similar issue and I'm interested in this update.

@panlilu
Copy link

panlilu commented May 31, 2016

@springuper This PR is awesome! Works like a charm and Save my day.

this.client.zk.registerConsumer(this.options.groupId, this.id, this.payloads, function (err) {
var zk = this.client.zk;
var groupId = this.options.groupId;
this.client.zk.registerConsumer(groupId, this.id, this.payloads, function (err) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have used zk here or this.client.zk again in the next statement for consistency

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I just noticed how zk is probably a bad var name choice as it conflicts with previous global var zk = require('./zookeeper') assignment statement and might be just confusing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like zk is actually being used outside the context of the client so we could delete line 12 and 13

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have updated this PR according to your useful advices.

@hyperlink
Copy link
Collaborator

@springuper can you rebase your branch with upstream/master and add some tests? 👍

@springuper
Copy link
Author

@hyperlink rebase done, and fix some conflicts. I found no tests currently about HighLevelConsumer, and there are some challenges about how to stub zk and kafka. I will try some time later.

@hyperlink
Copy link
Collaborator

@springuper feel free to add proxyquire and use sinon to help stub those.

@pcothenet
Copy link

Hi, any update?

@hyperlink
Copy link
Collaborator

@pcothenet @springuper will need to rebase with upstream master since I've changed the coding style. And ideally I would also like to see some tests to accompany this PR too.

@lezi1022
Copy link

still not fixed in the latest version(0.5.1)

@hyperlink
Copy link
Collaborator

@lezi1022 I'm planning on incorporating rebalance fix in 0.5.3.

@springuper
Copy link
Author

please refer to #423.
Since I have no time currently, and #423 includes all bugfix code and unit tests, this PR is now closed.

@springuper springuper closed this Aug 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Keep on getting FailedToRebalanceConsumerError: Exception: NODE_EXISTS[-110]
9 participants