Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Deserialized BatchMessageIdImpl cannot be used for acknowledgment #19030

Open
2 tasks done
BewareMyPower opened this issue Dec 22, 2022 · 2 comments · May be fixed by BewareMyPower/pulsar#15 or #19031
Open
2 tasks done

[Bug] Deserialized BatchMessageIdImpl cannot be used for acknowledgment #19030

BewareMyPower opened this issue Dec 22, 2022 · 2 comments · May be fixed by BewareMyPower/pulsar#15 or #19031
Assignees
Labels
Stale type/bug The PR fixed a bug or issue reported a bug

Comments

@BewareMyPower
Copy link
Contributor

Search before asking

  • I searched in the issues and found nothing similar.

Version

OS: Ubuntu 20.04
Pulsar: master (41edd2e)

Minimal reproduce step

Add a unit test that does these things:

  1. Create a producer to send N messages in the same batch.
  2. Create a consumer to receive these messages and store the MessageId objects, which share the same ledger id, entry id, batch size, only the batch indexes are different.
  3. Convert these MessageId objects by a serialization (MessageId#toByteArray) and a deserialization (MessageId#fromByteArray).
  4. Acknowledge these MessageId objects.
  5. Restart the consumer, it still receives the 1st message.
    @Test
    public void testSerialization() throws Exception {
        var topic = "test-serialization-origin";
        @Cleanup var producer = pulsarClient.newProducer(Schema.INT32)
                .topic(topic)
                .batchingMaxMessages(100)
                .batchingMaxPublishDelay(1, TimeUnit.DAYS)
                .create();
        @Cleanup var consumer = pulsarClient.newConsumer(Schema.INT32)
                .topic(topic)
                .subscriptionName("sub")
                .isAckReceiptEnabled(true)
                .subscribe();

        final var numMessages = 10;
        for (int i = 0; i < numMessages; i++) {
            producer.sendAsync(i);
        }
        producer.flush();
        final var msgIds = new ArrayList<MessageId>();
        for (int i = 0; i < numMessages; i++) {
            msgIds.add(consumer.receive().getMessageId());
        }
        for (int i = 1; i < numMessages; i++) {
            final var lhs = (BatchMessageIdImpl) msgIds.get(0);
            final var rhs = (BatchMessageIdImpl) msgIds.get(i);
            assertEquals(lhs.getLedgerId(), rhs.getLedgerId());
            assertEquals(lhs.getEntryId(), rhs.getEntryId());
            assertEquals(lhs.getBatchSize(), rhs.getBatchSize());
            assertEquals(lhs.getBatchSize(), numMessages);
        }

        var deserializedMsgIds = new ArrayList<MessageId>();
        for (var msgId : msgIds) {
            var deserialized = MessageId.fromByteArray(msgId.toByteArray());
            assertTrue(deserialized instanceof BatchMessageIdImpl);
            deserializedMsgIds.add(deserialized);
        }
        for (var msgId : deserializedMsgIds) {
            consumer.acknowledge(msgId);
        }
        consumer.close();

        consumer = pulsarClient.newConsumer(Schema.INT32)
                .topic(topic)
                .subscriptionName("sub")
                .isAckReceiptEnabled(true)
                .subscribe();
        final var msg = consumer.receive(3, TimeUnit.SECONDS);
        assertNotNull(msg);
        assertEquals(msg.getValue(), 0);
    }

What did you expect to see?

The restarted consumer should receive nothing.

What did you see instead?

The restarted consumer received the 1st message.

Anything else?

The root cause is from #1424, which make all MessageId instances in the same batch share the same BatchMessageAcker object. However, when MessageId instances are created from a deserialization. It's impossible to make them share the same BatchMessageAcker.

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@BewareMyPower BewareMyPower added the type/bug The PR fixed a bug or issue reported a bug label Dec 22, 2022
@BewareMyPower BewareMyPower self-assigned this Dec 22, 2022
@BewareMyPower
Copy link
Contributor Author

The deserialization logic in #9855 is meaningless.

if (idData.hasBatchSize()) {
messageId = new BatchMessageIdImpl(idData.getLedgerId(), idData.getEntryId(), idData.getPartition(),
idData.getBatchIndex(), idData.getBatchSize(), BatchMessageAcker.newAcker(idData.getBatchSize()));
} else {
messageId = new BatchMessageIdImpl(idData.getLedgerId(), idData.getEntryId(), idData.getPartition(),
idData.getBatchIndex());
}

Creating a independent BatchMessageAcker helps nothing.

BewareMyPower added a commit to BewareMyPower/pulsar that referenced this issue Dec 22, 2022
Fixes apache#19030

### Motivation

When a `BatchMessageIdImpl` is created from a deserialization, the
`BatchMessageAcker` object cannot be shared by all instances in the same
batch, which leads to an acknowledgment failure when batch index ACK is
disabled (by default).

### Modifications

Maintain a map from the `(ledger id, entry id)` pair to the
`BatchMessageAcker` in `ConsumerImpl`. If the `BatchMessageIdImpl`
doesn't carry a valid `BatchMessageAcker`, create and cache a
`BatchMessageAcker` instance and remove it when all messages in the
batch are acknowledged.

It requires a change in `MessageIdImpl#fromByteArray` that a
`BatchMessageAckerDisabled` will be created to indicate there is no
shared acker.

To avoid making code more complicated, this patch refactors the existing
code that many logics about consumer are moved from the ACK tracker to
the consumer. It also removes the `AckType` parameter when acknowledging
a list of messages.
BewareMyPower added a commit to BewareMyPower/pulsar that referenced this issue Dec 22, 2022
Fixes apache#19030

### Motivation

When a `BatchMessageIdImpl` is created from a deserialization, the
`BatchMessageAcker` object cannot be shared by all instances in the same
batch, which leads to an acknowledgment failure when batch index ACK is
disabled (by default).

### Modifications

Maintain a map from the `(ledger id, entry id)` pair to the
`BatchMessageAcker` in `ConsumerImpl`. If the `BatchMessageIdImpl`
doesn't carry a valid `BatchMessageAcker`, create and cache a
`BatchMessageAcker` instance and remove it when all messages in the
batch are acknowledged.

It requires a change in `MessageIdImpl#fromByteArray` that a
`BatchMessageAckerDisabled` will be created to indicate there is no
shared acker.

To avoid making code more complicated, this patch refactors the existing
code that many logics about consumer are moved from the ACK tracker to
the consumer. It also removes the `AckType` parameter when acknowledging
a list of messages.
BewareMyPower added a commit to BewareMyPower/pulsar that referenced this issue Dec 23, 2022
Fixes apache#19030

### Motivation

When a `BatchMessageIdImpl` is created from a deserialization, the
`BatchMessageAcker` object cannot be shared by all instances in the same
batch, which leads to an acknowledgment failure when batch index ACK is
disabled (by default).

### Modifications

Maintain a map from the `(ledger id, entry id)` pair to the
`BatchMessageAcker` in `ConsumerImpl`. If the `BatchMessageIdImpl`
doesn't carry a valid `BatchMessageAcker`, create and cache a
`BatchMessageAcker` instance and remove it when all messages in the
batch are acknowledged.

It requires a change in `MessageIdImpl#fromByteArray` that a
`BatchMessageAckerDisabled` will be created to indicate there is no
shared acker.

To avoid making code more complicated, this patch refactors the existing
code that many logics about consumer are moved from the ACK tracker to
the consumer. It also removes the `AckType` parameter when acknowledging
a list of messages.
@github-actions
Copy link

The issue had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Jan 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stale type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
1 participant