Unable to commit offset during rebalancing but messages are consumed and processed #2118
As I remember it, as long as the consumer is a member of the group and doesn't disconnect, the generation ID shouldn't get incremented until the (re-)JoinGroup request has been sent. How many topic partitions do you have assigned to the consumer? Each one will map to its own partition consumer, which provides the `Messages()` channel:

```go
// Messages returns the read channel for the messages that are returned by
// the broker. The messages channel will be closed when a new rebalance cycle
// is due. You must finish processing and mark offsets within
// Config.Consumer.Group.Session.Timeout before the topic/partition is eventually
// re-assigned to another group member.
```

So with your config you should have 20 seconds from the start of the rebalance to drain and complete processing of all the messages from your assigned partitions.

It would be useful if you could also add a log statement to your …
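The lifecycle described above can be sketched with plain channels (a minimal, self-contained illustration; no Sarama types are used — `drainAfterRebalance` and its channel are hypothetical stand-ins for `claim.Messages()` and `session.MarkOffset`):

```go
package main

import "fmt"

// drainAfterRebalance simulates the behaviour described above: the
// partition consumer closes its messages channel when a rebalance cycle
// is due, but already-fetched messages remain readable until drained.
func drainAfterRebalance() int {
	messages := make(chan int, 3) // stand-in for claim.Messages()
	messages <- 82704
	messages <- 82705
	messages <- 82706
	close(messages) // rebalance is due: the channel is closed

	lastMarked := -1
	// range keeps delivering the buffered messages even after close; this
	// drain must finish within Config.Consumer.Group.Session.Timeout.
	for offset := range messages {
		lastMarked = offset // stand-in for session.MarkOffset(...)
	}
	return lastMarked
}

func main() {
	fmt.Println(drainAfterRebalance()) // prints 82706
}
```

The point of the sketch: closing the channel does not discard buffered messages, so a loop that only ranges over the channel keeps processing until the buffer is empty, which is why the drain window matters.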
@dnwe Thanks for looking into this.

Logs from consumer 2 that joined the group:

```
2022/01/23 12:54:36.091344 consumer.go:824: consumer/broker/41 added subscription to testtopic/1
```

Consumer 1 logs around the same time:

```
2022/01/23 12:54:36.206173 consumer.go:284: - INFO Processing offset 84635 from partition 0 testtopic/0/84635 testvenkatshipper::others::EVENTTYPE::1156
```

As you can see in the above logs, both consumer 1 and consumer 2 received the message at offset 91343 from partition 1. By the time consumer 2 became active, consumer 1 started throwing "The provided member is not known in the current generation" errors. This error is thrown when `session.Commit()` is attempted. But `claim.Messages()` should not have received messages from this partition at all. I continued to see the error in the logs for almost 20 seconds. The partitions are not revoked immediately when a new consumer joins, even though commits failed and the other consumer had already started processing messages from the same partition. The topic I am using has 3 partitions, and consumers are added automatically based on lag.

Another issue that I see: when a 3rd consumer is added, the 2nd consumer rebalances and is assigned partitions 1 and 0, and the third consumer partition 2. The first consumer still has not got any partitions, and then a rebalance happens again. It looks like the consumers are not notified immediately of the rebalance and of partition assignment to another consumer, because of which multiple rebalances happen. I could be doing something wrong, I am not sure yet; looking for help.
@dnwe, below is what happened. No. of partitions in my topic: 3. Events:

My overridden configurations:

```
config.Consumer.Group.Session.Timeout = 20 * time.Second
```
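For reference, the overridden values can be applied to a fresh `sarama.Config` like this (a minimal sketch; only the fields listed in the Configuration section below are set, everything else stays at Sarama's defaults):

```go
import (
	"time"

	"github.com/Shopify/sarama"
)

func newConsumerConfig() *sarama.Config {
	config := sarama.NewConfig()
	// 20s window to finish processing and mark offsets once a rebalance starts.
	config.Consumer.Group.Session.Timeout = 20 * time.Second
	config.Consumer.Group.Heartbeat.Interval = 6 * time.Second
	config.Consumer.MaxProcessingTime = 500 * time.Millisecond
	config.Consumer.Offsets.AutoCommit.Enable = false // offsets are committed manually
	config.Consumer.Return.Errors = true              // surface consumer errors on Errors()
	return config
}
```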
After debugging further, it turns out that the messages channel will always hold messages as long as there are messages available in the topic and other consumer instances are available. This caused continuous rebalances because the consumers were out of sync. I used `session.Context().Done()` to detect when the session context was closed; this happens when there is a rebalance, so the loop exits and the consumer re-joins the group. Code sample for those struggling with similar issues (the old code only ranged over `claim.Messages()`; the updated code also watches the session context):

```go
func (handler *ConsumerGroupHandler) ConsumeClaim(session sarama.ConsumerGroupSession, claim sarama.ConsumerGroupClaim) error {
	for {
		select {
		case message, ok := <-claim.Messages():
			if !ok {
				return nil // channel closed: rebalance cycle is due
			}
			// process the message, then mark its offset
			session.MarkMessage(message, "")
		case <-session.Context().Done():
			// Updated code: a rebalance is in progress, stop consuming
			// so the consumer can re-join the group.
			return nil
		}
	}
}
```
Issue: cloudevents#817
Issue Explanation: IBM/sarama#2118
Fix reference: https://github.com/Shopify/sarama/blob/5e2c2ef0e429f895c86152189f625bfdad7d3452/examples/consumergroup/main.go#L177
Signed-off-by: nbajaj90 <[email protected]>
Versions
Sarama - v1.30.0
Kafka - 2.2.1.
Go- go1.16.6
Configuration
config.Consumer.Group.Session.Timeout = 20 * time.Second
config.Consumer.Group.Heartbeat.Interval = 6 * time.Second
config.Consumer.MaxProcessingTime = 500 * time.Millisecond
config.Consumer.Offsets.AutoCommit.Enable = false
config.Consumer.Return.Errors = true
Logs
When filing an issue please provide logs from Sarama and Kafka if at all possible. You can set `sarama.Logger` to a `log.Logger` to capture Sarama debug output.

Logs:
2022/01/23 11:52:55.686008 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server: The provided member is not known in the current generation.
2022/01/23 11:52:55.981009 consumer.go:284: - INFO Processing offset 82704 from partition 2 testtopic/2/82704 testtopic::others::EVENTTYPE::1014
2022/01/23 11:52:55.981206 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server: The provided member is not known in the current generation.
2022/01/23 11:52:56.274162 consumer.go:284: - INFO Processing offset 82705 from partition 2 testtopic/2/82705 testtopic::others::EVENTTYPE::1130
2022/01/23 11:52:56.274059 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server: The provided member is not known in the current generation.
2022/01/23 11:52:56.572360 consumer.go:284: - INFO Processing offset 82706 from partition 2 testtopic/2/82706 testtopic::others::EVENTTYPE::1099
2022/01/23 11:52:56.572538 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server: The provided member is not known in the current generation.
2022/01/23 11:52:56.877602 consumer.go:284: - INFO Processing offset 82707 from partition 2 testtopic/2/82707 testtopic::others::EVENTTYPE::1133
2022/01/23 11:52:56.877766 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server: The provided member is not known in the current generation.
2022/01/23 11:52:57.182187 consumer.go:284: - INFO Processing offset 82708 from partition 2 testtopic/2/82708 testtopic::others::EVENTTYPE::1038
2022/01/23 11:52:57.182367 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server: The provided member is not known in the current generation.
2022/01/23 11:52:57.477362 consumer.go:284: - INFO Processing offset 82709 from partition 2 testtopic/2/82709 testtopic::others::EVENTTYPE::1055
2022/01/23 11:52:57.477511 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server: The provided member is not known in the current generation.
2022/01/23 11:52:57.781924 consumer.go:149: - ERROR Consumer Errors: kafka: error while consuming testtopic/2: kafka server
Problem Description
Why does `claim.Messages()` continue to return messages when there is a rebalance? How can this be handled to prevent processing messages again?
@dnwe could you please help here? Thanks.