
KAFKA-17439: Make polling for new records an explicit action/event in the new consumer #17035

Merged: 27 commits merged into apache:trunk on Oct 28, 2024

Conversation

@kirktrue (Collaborator) commented Aug 28, 2024

Updated the FetchRequestManager to only create and enqueue fetch requests when signaled to do so by a FetchEvent.

The application thread and the background thread each contain logic that is performed when there is buffered data from a previous fetch. There's a race condition because the presence of buffered data could change between the two threads' respective checks. Right now the window for the race condition to occur is wide open; this change aims to make the window ajar.

In the ClassicKafkaConsumer, the application thread explicitly issues fetch requests (via the Fetcher class) at specific points in the Consumer.poll() cycle. Prior to this change, the AsyncKafkaConsumer would issue fetch requests independently of the user calling Consumer.poll(); the fetches would happen nearly continuously as soon as any assigned partition was fetchable. With this change, the AsyncKafkaConsumer introduces a FetchEvent that signals to the background thread that a fetch request should be issued. The specific points where this is done in the Consumer.poll() cycle of the AsyncKafkaConsumer now match the ClassicKafkaConsumer. In short: this makes the AsyncKafkaConsumer behave nearly identically to the ClassicKafkaConsumer in this regard.
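A minimal sketch of that signaling idea (the names FetchSignal, eventQueue, and backgroundPollOnce below are illustrative stand-ins, not the actual Kafka classes; the real pieces are FetchEvent, the application event queue, and FetchRequestManager):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative stand-ins only; not the actual consumer internals.
final class FetchSignal {
    final CompletableFuture<Void> requestsCreated = new CompletableFuture<>();
}

final class FetchSignalingSketch {
    private final BlockingQueue<FetchSignal> eventQueue = new LinkedBlockingQueue<>();

    // Application thread: called at specific points in poll(); enqueues a signal
    // instead of letting the background thread fetch continuously.
    CompletableFuture<Void> sendFetches() {
        FetchSignal signal = new FetchSignal();
        eventQueue.add(signal);
        return signal.requestsCreated;
    }

    // Background thread: only creates fetch requests when a signal has arrived.
    void backgroundPollOnce() {
        FetchSignal signal = eventQueue.poll();
        if (signal == null)
            return;                      // no signal, so no new fetch requests
        createAndEnqueueFetchRequests(); // real code: FetchRequestManager.poll()
        signal.requestsCreated.complete(null);
    }

    private void createAndEnqueueFetchRequests() { /* build FetchRequests here */ }
}
```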

As mentioned above, this change does not completely solve the problem related to fetch session eviction. Exactly how to shut the race condition window completely is outside the scope of this change.

See KAFKA-17182.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@kirktrue added the consumer and KIP-848 (The Next Generation of the Consumer Rebalance Protocol) labels on Aug 29, 2024
@AndrewJSchofield (Member) left a comment:

Thanks for the PR. I'm quite surprised to see how little code needed to change to make the fetching explicit. A couple of points.

First, I think it better not to overload the PollEvent because that's already used in the share consumer.

Second, it seems to me that there is still the potential for over-fetching, and this will still cause churn of the fetch session cache.

In the case where the consumer is only fetching a single partition, I think it works pretty well. The set of fetchable partitions will be empty if there's buffered data, and contain the only partition in the fetch session if there is not. So, you'll only send a Fetch request when there's a need for more data and the fetch session will not churn.

In the case where the consumer is fetching more than one partition on a particular node, if a subset of the partitions is fetchable, then the fetch session will be modified by sending a Fetch request and that seems to have the potential for a lot of churn.

Of course, all of this code is in common between the legacy consumer and the async consumer. The async consumer is still very keen on fetching so I don't properly grasp why this PR would make the fetch session behaviour better.

@kirktrue (Collaborator, Author) commented Sep 3, 2024

Hi @AndrewJSchofield!

Thanks for the review 👍

> First, I think it better not to overload the PollEvent because that's already used in the share consumer.

Agreed. I've introduced a FetchEvent so that the two separate mechanisms won't step on each other's toes.

> Second, it seems to me that there is still the potential for over-fetching, and this will still cause churn of the fetch session cache.

Agreed. This change aims to lessen the churn; preventing it completely is a future task.

> In the case where the consumer is only fetching a single partition, I think it works pretty well. The set of fetchable partitions will be empty if there's buffered data, and contain the only partition in the fetch session if there is not. So, you'll only send a Fetch request when there's a need for more data and the fetch session will not churn.

Correct.

> In the case where the consumer is fetching more than one partition on a particular node, if a subset of the partitions is fetchable, then the fetch session will be modified by sending a Fetch request and that seems to have the potential for a lot of churn.

Correct again!

Any partition with buffered data at the point where the fetch request is being generated will be marked as "removed" from the broker's fetch session cache. That's the crux of the problem 😞

Something that I tend to lose sight of is that it's not a foregone conclusion that a fetch session will be evicted when it has partitions removed. Of course, removing partitions does increase the session's eligibility for eviction if the broker hosting it is resource-constrained and invokes the eviction process.
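A rough illustration of that bookkeeping (hypothetical types; the real logic lives in the client's fetch session handling and the broker's fetch session cache): partitions that still hold buffered data are left out of the next fetch, and their absence is what reads as a removal from the incremental session.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the actual FetchSessionHandler logic.
final class SessionDiffSketch {
    static Set<String> partitionsRemovedFromSession(Set<String> sessionPartitions,
                                                    Set<String> partitionsWithBufferedData) {
        // Fetchable = partitions in the session that have no buffered data yet.
        Set<String> nextFetch = new HashSet<>(sessionPartitions);
        nextFetch.removeAll(partitionsWithBufferedData);

        // Anything in the session but absent from the next fetch looks "removed".
        Set<String> removed = new HashSet<>(sessionPartitions);
        removed.removeAll(nextFetch);
        return removed;
    }

    public static void main(String[] args) {
        Set<String> session = Set.of("topic-0", "topic-1", "topic-2");
        Set<String> buffered = Set.of("topic-1");
        // Prints [topic-1]: the partition with buffered data churns the session.
        System.out.println(partitionsRemovedFromSession(session, buffered));
    }
}
```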

> Of course, all of this code is in common between the legacy consumer and the async consumer.

I'm not sure I follow. This code is all specific to the AsyncKafkaConsumer. While the ClassicKafkaConsumer has a similar race condition, it is 2-4 orders of magnitude less likely to happen.

> The async consumer is still very keen on fetching so I don't properly grasp why this PR would make the fetch session behaviour better.

Yep—the design of the AsyncKafkaConsumer fetching continuously in the background makes it very keen to cause this problem. With this change, the application thread now signals when to fetch, which results in the background thread creating and issuing the fetch requests much less often.

Thanks!

@kirktrue added the ctr (Consumer Threading Refactor, KIP-848) label on Sep 3, 2024
@kirktrue (Collaborator, Author) commented Sep 3, 2024

@AndrewJSchofield, et al.—it can be helpful to compare the flow of ClassicKafkaConsumer.poll() and AsyncKafkaConsumer.poll(), specifically how each invokes fetching. Note that the sendFetches() method name, as well as when it is invoked, comes from ClassicKafkaConsumer.poll(). So this really makes the new consumer act much more like the old one.
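For orientation, here is a heavily simplified sketch of that shared poll() shape (the method names mirror the real ones, but the bodies are placeholders, not the actual source):

```java
import java.time.Duration;
import java.util.List;

// Hypothetical, heavily simplified skeleton of the poll() cycle.
final class PollFlowSketch {
    List<String> poll(Duration timeout) {
        updateAssignmentMetadataIfNeeded();
        List<String> fetch = pollForFetches(timeout);
        if (!fetch.isEmpty()) {
            sendFetches();   // pipeline the next fetch before handing records back
            return fetch;
        }
        return List.of();
    }

    private List<String> pollForFetches(Duration timeout) {
        List<String> buffered = collectFetch();
        if (!buffered.isEmpty())
            return buffered;          // serve buffered data without a new request
        sendFetches();                // explicit request creation (the FetchEvent path)
        waitForResponses(timeout);    // classic: client.poll(); async: background I/O
        return collectFetch();
    }

    private void updateAssignmentMetadataIfNeeded() { }
    private void sendFetches() { }
    private void waitForResponses(Duration timeout) { }
    private List<String> collectFetch() { return List.of(); }
}
```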

@kirktrue force-pushed the KAFKA-17439-poll-explicitly branch 2 times, most recently from 9527e8f to a8567c9 on September 5, 2024 19:09

Commit "… the new consumer": Updated the FetchRequestManager to only create and enqueue fetch requests when signaled to do so by a FetchEvent.

@kirktrue force-pushed the KAFKA-17439-poll-explicitly branch from bb7efc1 to e984638 on September 5, 2024 19:10
@AndrewJSchofield (Member) left a comment:

Thanks for the updates and for the explanation of the mechanism.

I think it would be appropriate to test the mechanism of the pending fetch request future, in its various permutations, in FetchRequestManagerTest.

@lianetm (Member) left a comment:

Hey @kirktrue, thanks for the updates, some comments...

@lianetm (Member) left a comment:

Thanks for the updates!

@kirktrue (Collaborator, Author) commented:

@lianetm—tests are passing and all comments have been addressed. Can you make another review pass? Thanks!

@lianetm (Member) left a comment:

Thanks for the updates @kirktrue! Took another pass and left some comments for consideration.

prepareFetchRequests(),
this::handleFetchSuccess,
this::handleFetchFailure
);
);
pendingFetchRequestFuture.complete(null);
Reviewer comment (Member):

do we need to complete this future also on pollOnClose? there may be a pendingFetchRequestFuture there that won't be completed (not that I'm seeing how leaving that future uncompleted on close will cause a problem but seems safer to complete it, consistently with how we do it here after pollInternal)

@kirktrue (Collaborator, Author) replied:

I've moved the Future-handling code to pollInternal() for consistency. LMK what you think.
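A minimal sketch of that consistency point, with hypothetical names mirroring poll(), pollOnClose(), and pollInternal(): both poll paths funnel through pollInternal(), so a pending future is never left uncompleted on close.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; not the actual FetchRequestManager code.
final class PollInternalSketch {
    private CompletableFuture<Void> pendingFetchRequestFuture;

    void poll()        { pollInternal(false); }
    void pollOnClose() { pollInternal(true);  }

    private void pollInternal(boolean closing) {
        // ... prepare and enqueue fetch requests (or the final fetches when closing) ...
        CompletableFuture<Void> pending = pendingFetchRequestFuture;
        if (pending != null) {
            pending.complete(null);   // never leave a waiting caller blocked
            pendingFetchRequestFuture = null;
        }
    }
}
```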

@@ -1520,6 +1523,9 @@ private Fetch<K, V> pollForFetches(Timer timer) {
return fetch;
}

// send any new fetches (won't resend pending fetches)
sendFetches(timer);
Reviewer comment (Member):

The actual poll now happens in here (addAndGet that will complete when the background has had one run, called fetchMgr.poll), so should the log line on ln 1538 "Polling for fetches with timeout..." be right before this?

@kirktrue (Collaborator, Author) replied:

We're not polling for the incoming responses in sendFetches(), just enqueuing the outgoing requests. This mimics the ClassicKafkaConsumer in that the requests are enqueued in its sendFetches(), but then, toward the bottom of pollForFetches(), client.poll() is invoked to wait for the results of the fetch requests.

Reviewer comment (Member):

Well, sendFetches blocks until the CreateFetchRequestsEvent completes, and that only happens on fetchMgr.poll.

So when sendFetches completes we did poll the manager, right? (And depending on timing, maybe the client.poll happened too, since it runs in the background right after polling all the managers.) That's why the log for "Polling for fetches" made sense to me before the sendFetches, but am I missing another poll happening after the log line maybe? (where it is now)

@kirktrue (Collaborator, Author) replied:

The two ConsumerDelegate implementations work differently:

  • AsyncKafkaConsumer: FetchRequestManager.poll() will complete the event's Future on the background thread before it exits, i.e. before the thread starts the network I/O. Completing the Future starts the application thread racing toward logging that message and the background thread racing toward starting network I/O. I'll admit—I haven't dug through the code to surmise the relative costs of each thread's work before either crosses its finish line.
  • ClassicKafkaConsumer: Fetcher.sendFetchesInternal() calls ConsumerNetworkClient.send() to enqueue the request, but then it calls NetworkClient.wakeup(). Since the same ConsumerNetworkClient instance used by the consumer is also used by AbstractCoordinator.HeartbeatThread, it's technically possible that the heartbeat thread's run() method could start network I/O when it calls NetworkClient.pollNoWakeup(). Granted, that's a race that the application thread is much more likely to win given that the heartbeat thread runs much less frequently.

Here are some points to consider:

  • The definition of the term "poll" as used in the log is open to interpretation. The term "poll" is everywhere, making its meaning ambiguous at any given point of use 😢
  • I agree there is a race condition (for both consumers, but more likely for the new consumer) that could result in the log message being emitted after the network I/O has commenced
  • For this to pose a problem to users, there need to be other log entries that we're racing with, right? We're trying to avoid the condition where the user is confused/misled because the entries in the log are emitted in a non-deterministic order.
  • The log line in question is only output at level TRACE, which I assume is very rare for users to enable.

Given the above, I'm of the opinion that it's an exercise in hair splitting to alter the logging. However, I could also just change it which would have been way less effort than researching, thinking, and composing this response 🤣

If we leave the log line as it is, what would the effect be for the user?

Reviewer comment (Member):

I surely didn't intend for you to put up that long response he he, sorry. It's not about the log line per se, it's about the alignment on where the poll happens. The classic consumer logs "Polling for records", then calls client.poll; here we do sendFetches (which triggers the client.poll asynchronously in the background thread, because it blocks until we poll the fetch manager), then log "Polling for fetches...".

That's the diff I saw and just wanted to understand/align on where the poll happens: once we trigger sendFetches (blocking), the client.poll will happen in the background anytime, not controlled by the app thread. Agreed? If so I'm ok with leaving the log unchanged, understanding it could come out after the client.poll happened.

@kirktrue (Collaborator, Author) replied:

> That's the diff I saw and just wanted to understand/align on where the poll happens: once we trigger sendFetches (blocking), the client.poll will happen in the background anytime, not controlled by the app thread. Agreed?

Agreed—the background thread is going to move from calling each of the RequestManagers' poll() methods to the NetworkClient.poll() method without the intervention of the application thread.
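A simplified sketch of that ordering in the background thread's run loop (hypothetical interfaces; not the literal background-thread code):

```java
import java.util.List;

// Hypothetical run-loop sketch: the background thread polls each request manager
// (which may create and enqueue fetch requests) and then moves straight into
// network I/O, without waiting for the application thread.
final class BackgroundLoopSketch {
    interface RequestManager { void poll(long currentTimeMs); }
    interface NetworkClientDelegate { void poll(long timeoutMs, long currentTimeMs); }

    private final List<RequestManager> requestManagers;
    private final NetworkClientDelegate networkClient;

    BackgroundLoopSketch(List<RequestManager> requestManagers, NetworkClientDelegate networkClient) {
        this.requestManagers = requestManagers;
        this.networkClient = networkClient;
    }

    void runOnce() {
        long now = System.currentTimeMillis();
        for (RequestManager manager : requestManagers)
            manager.poll(now);        // e.g. fetch, heartbeat, offset commit managers
        networkClient.poll(100, now); // network I/O happens immediately afterwards
    }
}
```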

> If so I'm ok with leaving the log unchanged, understanding it could come out after the client.poll happened.

Thanks!

@@ -707,6 +708,8 @@ public ConsumerRecords<K, V> poll(final Duration timeout) {
updateAssignmentMetadataIfNeeded(timer);
final Fetch<K, V> fetch = pollForFetches(timer);
if (!fetch.isEmpty()) {
sendFetches(timer);
Reviewer comment (Member):

at this point we may already have records in hand to return (consumed position updated), so we should be very careful not to throw any error here. But this sendFetches could throw an interrupted exception because of the addAndGet, right?

Shouldn't we just do a best effort to pipeline the next requests using add instead of addAndGet? It would achieve what we want, removing the risk of errors, and it would actually align better with what the classic does on this sendFetches + transmitSends:

* Poll for network IO in best-effort only trying to transmit the ready-to-send request
* Do not check any pending requests or metadata errors so that no exception should ever
* be thrown, also no wakeups be triggered and no interrupted exception either.
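A hedged sketch of the difference being suggested (hypothetical event queue and method names): addAndGet blocks on the event's future and can therefore surface an InterruptedException, while add is fire-and-forget, best-effort pipelining.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the two submission styles discussed above.
final class EventSubmissionSketch {
    static final class CreateFetchRequestsEvent {
        final CompletableFuture<Void> future = new CompletableFuture<>();
    }

    private final BlockingQueue<CreateFetchRequestsEvent> queue = new LinkedBlockingQueue<>();

    // Blocking style ("addAndGet"): waits for the background thread, so it can
    // throw if interrupted, which is risky once records are already in hand.
    void addAndGet(CreateFetchRequestsEvent event) throws Exception {
        queue.add(event);
        event.future.get();   // may throw InterruptedException / ExecutionException
    }

    // Fire-and-forget style ("add"): best-effort pipelining of the next fetch;
    // no exception propagates back into poll() after records were returned.
    void add(CreateFetchRequestsEvent event) {
        queue.add(event);
    }
}
```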

@kirktrue (Collaborator, Author) replied:

Done.

@lianetm (Member) left a comment:

Thanks for the updates @kirktrue! Just one nit left, almost there.

*
* <ul>
* <li>
* The method will wait for confirmation of the request creation before continuing.
Reviewer comment (Member):

This is not true now for prefetching, which uses .add instead of .addAndGet; should we remove this line?

@kirktrue (Collaborator, Author) replied:

Good catch. Reworded to state that it will not wait for confirmation.

@lianetm (Member) left a comment:

Thanks for all the updates @kirktrue! LGTM.

@lianetm merged commit 9e42475 into apache:trunk on Oct 28, 2024
6 checks passed
@kirktrue deleted the KAFKA-17439-poll-explicitly branch on October 28, 2024 23:50
abhishekgiri23 pushed a commit to abhishekgiri23/kafka that referenced this pull request Nov 2, 2024
chiacyu pushed a commit to chiacyu/kafka that referenced this pull request Nov 30, 2024
tedyu pushed a commit to tedyu/kafka that referenced this pull request Jan 6, 2025
Labels: ci-approved, clients, consumer, ctr Consumer Threading Refactor (KIP-848), KIP-848 The Next Generation of the Consumer Rebalance Protocol