
KafkaConsumer should continue to poll while waiting for buffer #4023

Merged
4 commits merged into opensearch-project:main on Feb 16, 2024

Conversation

kkondaka
Collaborator

Description

KafkaConsumer should continue to poll while waiting for buffer.

KafkaConsumer waits in a busy loop for the buffer to become available if it fails to get buffer space. The consumer thread may stay in this busy loop for a long time, which can result in the Kafka server not receiving any heartbeats from the consumer.

To avoid this, the consumer must keep calling poll() while waiting for the buffer to become available. But since poll() fetches new records from the server, the consumer should call pause() before polling until the buffer becomes available, and call resume() once it gets the buffer and resumes normal processing.
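
A minimal sketch of this flow, assuming only a Kafka consumer and a buffer-add operation that throws while the buffer is unavailable (names here are illustrative, not the PR's exact code):

import java.time.Duration;
import org.apache.kafka.clients.consumer.Consumer;

// Illustrative sketch only: addToBuffer.run() stands in for bufferAccumulator.add(record)
// and throws while the buffer is unavailable.
final class PausePollResumeSketch {
    interface ThrowingRunnable { void run() throws Exception; }

    static void writeWithBackpressure(final Consumer<String, ?> consumer,
                                      final ThrowingRunnable addToBuffer) {
        boolean paused = false;
        while (true) {
            try {
                addToBuffer.run();                           // e.g. bufferAccumulator.add(record)
                break;
            } catch (final Exception e) {
                if (!paused) {
                    paused = true;
                    consumer.pause(consumer.assignment());   // stop fetching new records
                }
                // poll() keeps the consumer alive within max.poll.interval.ms; with all
                // assigned partitions paused it should return no records.
                consumer.poll(Duration.ofMillis(100));
            }
        }
        if (paused) {
            consumer.resume(consumer.assignment());          // resume normal processing
        }
    }
}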

Issues Resolved

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • [X] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -424,16 +426,25 @@ private void processRecord(final AcknowledgementSet acknowledgementSet, final Re
bufferAccumulator.add(record);
Member

What happens to this record that was caught in the exception? Is it lost?

Member

I see, the consumer retries in a while loop.

Collaborator Author

No. The record is eventually put in the buffer because we are in an infinite loop here.

@@ -424,16 +426,25 @@ private void processRecord(final AcknowledgementSet acknowledgementSet, final Re
bufferAccumulator.add(record);
break;
} catch (Exception e) {
if (!paused) {
paused = true;
Member

Should the Kafka consumer be paused for all exceptions?

Collaborator Author

In all exception cases we keep trying forever to put the record in the buffer, right? That's why we pause for all exceptions.

@@ -424,16 +426,25 @@ private void processRecord(final AcknowledgementSet acknowledgementSet, final Re
bufferAccumulator.add(record);
break;
} catch (Exception e) {
if (!paused) {
Contributor

I would suggest pausing only when we are approaching the max.poll.interval.ms limit, which we can also increase to 10 minutes or more. I have seen this exception many times in scale testing, but the buffer usually flushes after a few retries and the poll timeout does not expire. Pause/resume may have some performance impact; from what I have read, it flushes queued messages and stops fetching further records from the broker, which is really not necessary if it's a momentary blip.

Collaborator Author

Sure, we can pause only after a few retries of getting the needed buffer space; that should handle the momentary-blip case.
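
A rough sketch of that adjustment (fragment only; topicConfig, paused, and consumer are the fields already used by the surrounding class, and the constants mirror the change that eventually landed in this PR):

// Only pause after roughly half of max.poll.interval.ms has been spent retrying,
// so a momentary buffer blip never triggers pause/resume at all.
final int retrySleepTimeMs = 100;
final long maxRetries = topicConfig.getMaxPollInterval().toMillis() / (2 * retrySleepTimeMs);
long numRetries = 0;

// inside the buffer-write retry loop:
if (!paused && numRetries++ > maxRetries) {
    paused = true;
    consumer.pause(consumer.assignment());
}
try {
    Thread.sleep(retrySleepTimeMs);
} catch (InterruptedException ignored) { }   // a shorter sleep is harmless here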

if (e instanceof SizeOverflowException) {
topicMetrics.getNumberOfBufferSizeOverflows().increment();
} else {
LOG.debug("Error while adding record to buffer, retrying ", e);
}
try {
Thread.sleep(100);
consumeRecords();
Contributor

Instead of consumeRecords(), should we call consumer.poll() here directly? If it returns records, then we assert and maybe restart the consumer, because poll() returning records means there is a bug somewhere. Calling consumeRecords() will re-enter this function if poll() returns messages.

Collaborator Author

That's fair. I can just do poll() and assert on records returned.
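
A sketch of how that could look (illustrative; doPoll() wraps consumer.poll() as in the PR, while lastReadOffset() is a hypothetical helper used only to show the intent):

// While paused, poll directly instead of re-entering consumeRecords(). If records
// come back despite every assigned partition being paused, something is wrong, so
// log it and rewind those partitions to the last read position.
if (paused) {
    final ConsumerRecords<String, ?> records = doPoll();
    if (records.count() > 0) {
        LOG.warn("Unexpected records received while the consumer is paused. " +
                 "Resetting the partitions to retry from the last read offset");
        // e.g. records.partitions().forEach(p -> consumer.seek(p, lastReadOffset(p)));
    }
}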

} catch (Exception ex) {} // ignore the exception; it only means the thread slept for a shorter time
}
}
if (paused) {
consumer.resume(consumer.assignment());
Contributor
@hshardeesi Jan 27, 2024

I think there may be an issue here. We may have to call resume() on exactly the same partitions that we paused. For example, let's say the consumer initially had 4 partitions that were paused. While paused, 2 got revoked, and after reassignment we resumed only the 2 partitions still assigned. Now, when the same 2 partitions are assigned back, will they remain paused, since we never called resume() on them? It would be good to test.

Collaborator Author

If the partitions have moved out of the consumer, I do not think they will be in the paused state. I do not think we can resume partitions that are not currently owned by the consumer. But I will try to test it.

Collaborator Author

I have tested this scenario. When the partitions are assigned back, they are not paused. I think once the partitions are revoked, they are removed from the consumer's assignment. When they are assigned back, they are assigned like any other new partition. There is no stale state from the previous assignment.

@@ -520,6 +534,9 @@ public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
ownedPartitionsEpoch.remove(topicPartition);
partitionCommitTrackerMap.remove(topicPartition.partition());
}
if (paused) {
Contributor

Is it really required in partition revocation?

Collaborator Author

Probably not. I thought about it before adding this code, and I do not see any downside to doing it. Let me know if you can think of any side effects.
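
For context, a sketch of the pattern being discussed; the body of the guarded block is an assumption on my part, not necessarily the PR's exact code:

// Assumed sketch: when partitions are revoked while the consumer is paused, resume
// the revoked partitions so no pause state lingers across the rebalance. They are
// still part of the assignment inside this callback, so resume() is legal here.
@Override
public void onPartitionsRevoked(final Collection<TopicPartition> partitions) {
    // ... existing cleanup of owned-partition epochs and commit trackers ...
    if (paused) {
        consumer.resume(partitions);
    }
}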

Contributor
@hshardeesi left a comment

A few test scenarios to cover:
Create a topic with 4 partitions.
Have a producer send messages to the topic at a constant rate (1k/sec).

Test 1: Run one consumer and pause all partitions, then bring up another consumer that will take 2 partitions. Data should be consumed by the new consumer from those 2 partitions. Shut down the second consumer; its partitions should get assigned back to the paused consumer and remain paused.

Test 2: Pause all partitions, bring up another consumer that will take 2 partitions, then unpause the first consumer. All partitions should be read by the 2 consumers. Shut down the second consumer; the first consumer should get all partitions and consume from all of them after reassignment.

Test 3: Run two consumers reading from 2 partitions each. Pause the first consumer, then scale up by adding 4 more partitions. The paused consumer should pause the new partitions as well. Shut down and bring up the second consumer a couple of times, then unpause the first consumer; all partitions should be read.

@kkondaka
Collaborator Author

Tested case 1. And it works as expected.

@kkondaka
Collaborator Author

Tested test case 2. And it worked as expected.

Question about 3: What do you mean by "scale up partitions by increasing the number of partitions"? AFAIK, Kafka partitions cannot be increased after creating the topic.

Signed-off-by: Krishna Kondaka <[email protected]>
if (paused) {
ConsumerRecords<String, T> records = doPoll();
if (records.count() > 0) {
LOG.debug("Unexpected records received while the consumer is paused. Resetting the paritions to retry from last read pointer");
Contributor

Do we need to make it INFO so that it shows up in the logs?

Member

It seems like a WARN level to me, but you probably have more context.

@@ -411,29 +418,51 @@ private <T> Record<Event> getRecord(ConsumerRecord<String, T> consumerRecord, in
return new Record<Event>(event);
}

private void processRecord(final AcknowledgementSet acknowledgementSet, final Record<Event> record) {
private <T> void processRecord(final AcknowledgementSet acknowledgementSet, final Record<Event> record) {
Member

If you change the code below to use ? (see my other comment), this type parameter T becomes unnecessary and you can remove it.

Thread.sleep(100);
Thread.sleep(retrySleepTimeMs);
if (paused) {
ConsumerRecords<String, T> records = doPoll();
Member

The code does not really do anything with the value of T, so you can have the following:

ConsumerRecords<String, ?> records = doPoll();

long numRetries = 0;
final int retrySleepTimeMs = 100;
// Do not pause until half the poll interval time has expired
final long maxRetries = topicConfig.getMaxPollInterval().toMillis() / (2 * retrySleepTimeMs);
Member

These two variables (maxRetries and retrySleepTimeMs) can be moved elsewhere in the class to clarify that they are not so dynamic.

Perhaps make maxRetries a field, give it a clearer name (maxRetriesOnException) and set it in the constructor:

maxRetriesOnException = topicConfig.getMaxPollInterval().toMillis() / (2 * retrySleepTimeMs);

You can make a static field for the other

private static final int RETRY_ON_EXCEPTION_SLEEP_MS = 100;
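
Putting the suggestion together (a sketch; the constructor shown is a placeholder for the existing consumer class's constructor):

// Suggested refactor: a static constant for the sleep time and a field computed once
// from the topic configuration in the constructor.
private static final int RETRY_ON_EXCEPTION_SLEEP_MS = 100;
private final long maxRetriesOnException;

public KafkaCustomConsumer(/* existing constructor arguments */) {
    // ... existing initialization ...
    this.maxRetriesOnException =
            topicConfig.getMaxPollInterval().toMillis() / (2 * RETRY_ON_EXCEPTION_SLEEP_MS);
}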

while (true) {
try {
bufferAccumulator.add(record);
break;
} catch (Exception e) {
if (!paused && numRetries++ > maxRetries) {
paused = true;
consumer.pause(consumer.assignment());
Member

This is very critical logic. We should have a unit test to verify that we call pause() under these conditions.

Collaborator Author

Added a new test case to test pause/resume.
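
A hedged sketch of what such a test could look like (Mockito-style; objectUnderTest, bufferAccumulator, topicConfig, and kafkaConsumer are assumed mocks/fixtures, not necessarily the names used in the PR's actual test):

import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.anyCollection;
import static org.mockito.Mockito.atLeastOnce;
import static org.mockito.Mockito.doThrow;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.time.Duration;
import org.junit.jupiter.api.Test;

// Sketch only: fail the buffer add more times than the retry budget allows, then
// verify the consumer was paused and later resumed.
@Test
void pausesAndResumesWhenBufferStaysFull() throws Exception {
    // a small max poll interval keeps the retry budget tiny: 400 / (2 * 100) = 2 retries
    when(topicConfig.getMaxPollInterval()).thenReturn(Duration.ofMillis(400));
    doThrow(new RuntimeException("buffer full"),
            new RuntimeException("buffer full"),
            new RuntimeException("buffer full"),
            new RuntimeException("buffer full"))
            .doNothing()
            .when(bufferAccumulator).add(any());

    objectUnderTest.consumeRecords();   // assumed entry point that drives processRecord()

    verify(kafkaConsumer, atLeastOnce()).pause(anyCollection());
    verify(kafkaConsumer).resume(anyCollection());
}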

Signed-off-by: Krishna Kondaka <[email protected]>
@kkondaka kkondaka merged commit c0776ef into opensearch-project:main Feb 16, 2024
46 of 50 checks passed
@kkondaka kkondaka deleted the kafka-buf-overflow-fix branch May 13, 2024 05:52