Fix seek chunked messages #205

RobertIndie · 2022-03-02T10:33:13Z

Motivation

This is the implementation of apache/pulsar#12402.

Currently, when we send a chunked message, the producer returns the message ID of the last chunk. This causes problems. For example, when we seek to this message ID, the consumer starts consuming from the position of the last chunk, mistakenly concludes that the earlier chunks are lost, and skips the current message. With an inclusive seek, the consumer may therefore skip the first message, which is incorrect behavior.

Here is some simple code (in Java) that demonstrates the problem:

var msgId = producer.send(...); // e.g. returns 0:1:-1
var otherMsg = producer.send(...); // returns 0:2:-1
consumer.seek(msgId); // inclusive seek
var receiveMsgId = consumer.receive().getMessageId(); // it may skip the first message and return something like 0:2:-1
Assert.assertEquals(msgId, receiveMsgId); // fails

For more context, please see PIP-107.
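To make the failure mode concrete, here is a minimal, self-contained simulation of the scenario described above. It is only a sketch under stated assumptions: `Chunk`, `ChunkSeekDemo`, `receiveFrom`, and `sampleTopic` are hypothetical names invented for illustration, and the reassembly logic is a simplified model of a chunk-aware consumer, not the actual Pulsar client code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of one chunk stored in a topic.
final class Chunk {
    final long entryId;     // position of this chunk in the topic
    final String uuid;      // identifies the logical message this chunk belongs to
    final int chunkId;      // index of this chunk within the message
    final int totalChunks;  // number of chunks in the message
    final String payload;

    Chunk(long entryId, String uuid, int chunkId, int totalChunks, String payload) {
        this.entryId = entryId;
        this.uuid = uuid;
        this.chunkId = chunkId;
        this.totalChunks = totalChunks;
        this.payload = payload;
    }
}

final class ChunkSeekDemo {
    // Reads the topic from startEntryId (inclusive) and reassembles messages.
    // A chunk that arrives without its predecessors is dropped (assumed behavior).
    static List<String> receiveFrom(List<Chunk> topic, long startEntryId) {
        List<String> received = new ArrayList<>();
        String currentUuid = null;
        int nextChunkId = 0;
        StringBuilder buffer = new StringBuilder();
        for (Chunk c : topic) {
            if (c.entryId < startEntryId) {
                continue; // before the seek position
            }
            if (c.chunkId == 0) {
                // First chunk of a message: start a fresh reassembly buffer.
                currentUuid = c.uuid;
                nextChunkId = 0;
                buffer.setLength(0);
            }
            if (!c.uuid.equals(currentUuid) || c.chunkId != nextChunkId) {
                // Earlier chunks were never seen, so the message is skipped.
                currentUuid = null;
                continue;
            }
            buffer.append(c.payload);
            nextChunkId++;
            if (nextChunkId == c.totalChunks) {
                received.add(buffer.toString());
                currentUuid = null;
            }
        }
        return received;
    }

    // First message is split into chunks at entries 0 and 1; second message is entry 2.
    static List<Chunk> sampleTopic() {
        return Arrays.asList(
            new Chunk(0, "msg-a", 0, 2, "he"),
            new Chunk(1, "msg-a", 1, 2, "llo"),
            new Chunk(2, "msg-b", 0, 1, "world"));
    }

    public static void main(String[] args) {
        // Seeking to the last chunk (entry 1) loses the first message.
        System.out.println(receiveFrom(sampleTopic(), 1)); // prints [world]
        // Seeking to the first chunk (entry 0) delivers both messages.
        System.out.println(receiveFrom(sampleTopic(), 0)); // prints [hello, world]
    }
}
```

In this model, seeking to entry 1 corresponds to seeking with the message ID of the last chunk (0:1:-1 in the example above), while seeking to entry 0 corresponds to seeking with the message ID of the first chunk.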

I also found that the F# client already stores all chunk message IDs in MessageIds.chunkMessageIds. We can use this field to implement the ChunkMessageId feature as in the Java client.

There is still work left in this PR to serialize the ChunkMessageId. To be consistent with the behavior of the Java client, when we serialize, deserialize, or compare message IDs, the comparison of chunkMessageIds only needs to compare the message ID of the first chunk if the message is chunked, as shown below:

match m.ChunkMessageIds, this.ChunkMessageIds with
| Some mchunkMessageIds, Some thisChunkMessageIds when mchunkMessageIds.Length > 0 && thisChunkMessageIds.Length > 0 ->
    // For a chunked message, compare the message id of the first chunk.
    mchunkMessageIds.[0] = thisChunkMessageIds.[0]
| _, _ -> true
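For readers more familiar with Java, the same rule can be sketched as follows. This is an illustrative sketch only: `EntryId`, `ChunkIdComparison`, and `firstChunksMatch` are hypothetical names, an empty list stands in for a non-chunked message, and the comparison of the remaining message ID fields is assumed to happen elsewhere.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified position type; not a real client class.
final class EntryId {
    final long ledgerId;
    final long entryId;

    EntryId(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EntryId)) {
            return false;
        }
        EntryId other = (EntryId) o;
        return ledgerId == other.ledgerId && entryId == other.entryId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(ledgerId) * 31 + Long.hashCode(entryId);
    }
}

final class ChunkIdComparison {
    // If both sides carry chunk ids, only the first chunk decides the result.
    // In every other case the chunk ids do not affect equality.
    static boolean firstChunksMatch(List<EntryId> a, List<EntryId> b) {
        if (!a.isEmpty() && !b.isEmpty()) {
            return a.get(0).equals(b.get(0));
        }
        return true;
    }

    public static void main(String[] args) {
        List<EntryId> chunked = Arrays.asList(new EntryId(0, 0), new EntryId(0, 1));
        List<EntryId> sameStart = Arrays.asList(new EntryId(0, 0), new EntryId(0, 5));
        List<EntryId> otherStart = Arrays.asList(new EntryId(0, 3), new EntryId(0, 4));
        List<EntryId> notChunked = Collections.emptyList();

        System.out.println(firstChunksMatch(chunked, sameStart));  // prints true
        System.out.println(firstChunksMatch(chunked, otherStart)); // prints false
        System.out.println(firstChunksMatch(chunked, notChunked)); // prints true
    }
}
```

Comparing only the first chunk keeps the check cheap and matches the position that an inclusive seek needs to start from.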

We need to update the pulsar proto file before proceeding with the rest of the work. What is the correct way to generate the code for the proto? I found that the code I generated using protoc is very different from the existing generated code. Are the parameters not set correctly?

Update: The serialization for the chunk message id is added. This PR is ready for review.

Modification

Fix consumer inclusive seek for chunked messages.
Add a comparison of the first chunk message ID to MessageId.

Lanayx · 2022-03-02T12:06:33Z

The code is generated using this site: https://protogen.marcgravell.com/. You'll also need to change the generated access modifiers from public to internal.

Lanayx · 2022-03-02T12:18:21Z

tests/IntegrationTests/Chunks.fs

+                            |> Async.AwaitTask
+                            |> Async.RunSynchronously]


Can you please rewrite it similar to this?

I didn't find a good way to rewrite it. I got some compile errors when I tried. Could you give me some guidance?

fix seek chunked messages

ed52431

Lanayx reviewed Mar 2, 2022


RobertIndie and others added 2 commits March 9, 2022 09:54

Update pulsar proto and add serialization

edc77a4

Refactoring

0614b34

Lanayx approved these changes Mar 9, 2022

Lanayx merged commit df64c9d into fsprojects:develop Mar 9, 2022
