When the producer or client is normally closed, data will be lost #12195
…data in the client buffer is not flushed, and data may be lost
Can you review it? @eolivelli |
This method is |
@lordcheng10 Thanks for your contribution. For this PR, do we need to update docs? (The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks) |
Thank you for your attention! The cleanup logic can only run after the flush has completed. If the flushAsync method is used instead of the flush method, there is no guarantee that all the data has been sent to the server before the cleanup logic runs. |
Okay, once this discussion is complete, I will update the docs. |
I mean that when the |
You are right, I will try to modify it |
@Shoothzj As we just discussed, I replaced flush with flushAsync and resubmitted the code. |
LGTM @eolivelli, @codelipenghui, @BewareMyPower, @sijie, @hangc0276, @merlimat - PTAL, thanks. |
IMO it's the right behavior. |
I think it's correct to flush everything on a (graceful) close, though I would not characterize it as "data lost" since the send futures will not be successful. We should add testing for the behavior. |
I understand that the current behaviour may be surprising for users. We must ensure that the Javadoc is clear about the behaviour. I suggest you start a discussion on [email protected] in order to reach a bigger audience. |
I agree with @eolivelli . IMO It should be implemented as a new API like |
The flushing on close was already the pre-existing behavior, though that got lost at some point. |
@merlimat I'm afraid not, in current
It's not as easy as you might think. We need to handle more corner cases if you added the flush semantics to First, we cannot assume pending messages are sent quickly. If your buffer memory is large enough, it might take a long time to close. Assume you have 100000 pending messages and, within the timeout, only 20000 messages are persisted. What will you do then?
At any case, when you choose Assuming now for (int i = 0; i < N; i++) {
producer.sendAsync("message-i");
}
producer.close(); What if |
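The timeout scenario raised above (many pending messages, only some acknowledged in time) can be sketched on the caller side with plain `CompletableFuture`s. This is a minimal sketch under assumptions: `BoundedFlushSketch` and `awaitAcks` are hypothetical names, and the futures stand in for whatever `sendAsync` returns; it is not how the Pulsar client handles close.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: wait a bounded time for send futures, then report progress.
class BoundedFlushSketch {

    /** Waits up to timeoutMillis for all sends; returns how many completed successfully. */
    static int awaitAcks(List<CompletableFuture<Void>> sends, long timeoutMillis) {
        try {
            CompletableFuture.allOf(sends.toArray(new CompletableFuture[0]))
                    .get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Timed out, or at least one send failed: fall through and count successes.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        int acked = 0;
        for (CompletableFuture<Void> send : sends) {
            if (send.isDone() && !send.isCompletedExceptionally()) {
                acked++;
            }
        }
        return acked;
    }

    public static void main(String[] args) {
        List<CompletableFuture<Void>> sends = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            CompletableFuture<Void> send = new CompletableFuture<>();
            if (i < 2) {
                send.complete(null); // pretend only the first two were persisted
            }
            sends.add(send);
        }
        int acked = awaitAcks(sends, 50);
        System.out.println(acked + " of " + sends.size() + " acknowledged before the timeout");
    }
}
```

The caller still has to decide what to do with the unacknowledged remainder (retry, log, or fail), which is the open question in the comment above.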
pulsar/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/Producer.java Lines 178 to 186 in 0460472
Given the Javadoc for
If we add a timeout to the |
As you describe, I created a PIP: #12216 @eolivelli |
I also found the problem. I think it's an inconsistency between the JavaDocs and the actual implementation. But the definition of pending write request is ambiguous. Should it be the inflight The description of
The description of
|
@lordcheng10 If you're going to submit a PIP, please follow the PIP template. But before submitting a PIP, it's better to send an email to [email protected] to start a discussion so that more people know the context. |
I've started a discussion, see https://lists.apache.org/thread.html/r8bfcb7ab28612d94d441ff5eadd996413346f0780b6f7b3484aaf7dc%40%3Cdev.pulsar.apache.org%3E |
I believe this will resolve #11780. |
OK, I modified it again according to the PIP template. |
/pulsarbot run-failure-checks |
/pulsarbot run-failure-checks |
I know we're still discussing whether close implies flush on the mailing list, but I wanted to take a closer look at the actual implementation in this PR. If we do end up adding the flush, we'll need to update the Javadoc for close and closeAsync on the Producer interface.
flushAsync().thenRun(() -> {
    final State currentState = getAndUpdateState(state -> {
        if (state == State.Closed) {
            return state;
        }
        return State.Closing;
    });
I think we should set the state of the producer to Closing before triggering the flush. Otherwise, a message could be added to batchMessageContainer after the flush and before the state is set to Closing.
you are right!
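The ordering concern in this thread can be illustrated with a small single-threaded model. It is only a sketch under assumptions: `ToyProducer`, its `batch` list, and the `racingSender` hook are hypothetical and merely mimic the roles of the producer state and batchMessageContainer; the real ProducerImpl is more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Toy model of "mark Closing before flushing"; not Pulsar's ProducerImpl.
class CloseOrderingSketch {
    enum State { Ready, Closing, Closed }

    static class ToyProducer {
        private final AtomicReference<State> state = new AtomicReference<>(State.Ready);
        private final List<String> batch = new ArrayList<>();     // stands in for the batch container
        private final List<String> delivered = new ArrayList<>();

        /** Accepts a message only while the producer is Ready. */
        boolean send(String message) {
            if (state.get() != State.Ready) {
                return false;
            }
            batch.add(message);
            return true;
        }

        CompletableFuture<Void> flushAsync() {
            delivered.addAll(batch);
            batch.clear();
            return CompletableFuture.completedFuture(null);
        }

        /** racingSender simulates a send that arrives right after the flush finishes. */
        CompletableFuture<Void> closeAsync(boolean markClosingFirst, Runnable racingSender) {
            if (markClosingFirst) {
                state.set(State.Closing);   // reject new sends before flushing
            }
            return flushAsync().thenRun(() -> {
                racingSender.run();
                state.set(State.Closed);
            });
        }

        int stranded() { return batch.size(); }
    }

    /** Returns how many messages are left unsent in the batch after close. */
    static int strandedAfterClose(boolean markClosingFirst) {
        ToyProducer producer = new ToyProducer();
        producer.send("m1");
        producer.closeAsync(markClosingFirst, () -> producer.send("late")).join();
        return producer.stranded();
    }

    public static void main(String[] args) {
        System.out.println("flush, then change state: stranded = " + strandedAfterClose(false));
        System.out.println("Closing first, then flush: stranded = " + strandedAfterClose(true));
    }
}
```

In this model the late message is accepted and stranded when the state changes only after the flush, and rejected up front when the state is set to Closing first, so the sender at least learns that the message was not taken.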
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
/pulsarbot run-failure-checks |
@lordcheng10 feel free to ping me if you need a doc review. |
The PR has had no activity for 30 days; marking it with the Stale label. |
Motivation
In the following example, the data will be lost (it cannot be consumed):
` try {
PulsarClient client = PulsarClient.builder().serviceUrl("pulsar://127.0.0.1:6650").build();
The data is lost because the producer does not send what is still in the client buffer when it is closed.
The correct approach is to flush the data first when closing.
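The difference between closing with and without a preceding flush can be sketched with a toy model. Everything here is an assumption for illustration: `ToyProducer` is a hypothetical stand-in with a client-side buffer, and failing the pending futures on close follows the observation in the discussion that the send futures will not be successful; it is not the Pulsar client's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Toy model only: ToyProducer is hypothetical, not Pulsar's ProducerImpl.
class FlushBeforeCloseSketch {

    static class ToyProducer {
        private final List<CompletableFuture<Void>> pending = new ArrayList<>(); // buffered, unsent
        private int delivered;   // messages that reached the (imaginary) broker
        private boolean closed;

        CompletableFuture<Void> sendAsync(String message) {
            CompletableFuture<Void> ack = new CompletableFuture<>();
            if (closed) {
                ack.completeExceptionally(new IllegalStateException("producer already closed"));
            } else {
                pending.add(ack);   // only buffered; nothing has been sent yet
            }
            return ack;
        }

        /** Delivers everything in the buffer and completes the matching futures. */
        void flush() {
            for (CompletableFuture<Void> ack : pending) {
                delivered++;
                ack.complete(null);
            }
            pending.clear();
        }

        /** Closes without flushing: whatever is still buffered fails. */
        void close() {
            closed = true;
            for (CompletableFuture<Void> ack : pending) {
                ack.completeExceptionally(new IllegalStateException("closed before the message was sent"));
            }
            pending.clear();
        }

        int delivered() { return delivered; }
    }

    /** Sends n messages, optionally flushes, closes, and returns how many sends failed. */
    static long failedSends(int n, boolean flushFirst) {
        ToyProducer producer = new ToyProducer();
        List<CompletableFuture<Void>> acks = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            acks.add(producer.sendAsync("message-" + i));
        }
        if (flushFirst) {
            producer.flush();
        }
        producer.close();
        return acks.stream().filter(CompletableFuture::isCompletedExceptionally).count();
    }

    public static void main(String[] args) {
        System.out.println("failed without flush: " + failedSends(3, false));
        System.out.println("failed with flush:    " + failedSends(3, true));
    }
}
```

In this model all three sends fail when close is called directly and none fail when flush runs first. Either way, a caller that keeps the futures returned by sendAsync can detect which messages were never delivered.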