Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix segmentation fault when sending messages after receiving an error #326

Merged

Conversation

BewareMyPower
Copy link
Contributor

Fixes #325

Motivation

#317 introduces a bug that might cause segmentation fault when sending messages after receiving an error, see
#325 (comment) for the detailed explanation.

Modifications

When calling asyncWrite, capture the shared_ptr instead of the weak_ptr to extend the lifetime of the socket_ or tlsSocket_ field in ClientConnection. Since the lifetime is extended, in some callbacks, check isClosed() before other logic.

Add a ChunkDedupTest to reproduce this issue based on Pulsar 3.1.0. Run the test for 10 times to ensure it won't crash after this patch.

Fixes apache#325

### Motivation

apache#317 introduces a bug
that might cause segmentation fault when sending messages after
receiving an error, see
apache#325 (comment)
for the detailed explanation.

### Modifications

When calling `asyncWrite`, capture the `shared_ptr` instead of the
`weak_ptr` to extend the lifetime of the `socket_` or `tlsSocket_` field
in `ClientConnection`. Since the lifetime is extended, in some
callbacks, check `isClosed()` before other logic.

Add a `ChunkDedupTest` to reproduce this issue based on Pulsar 3.1.0.
Run the test for 10 times to ensure it won't crash after this patch.
@BewareMyPower BewareMyPower added the bug Something isn't working label Oct 8, 2023
@BewareMyPower BewareMyPower added this to the 3.4.0 milestone Oct 8, 2023
@BewareMyPower BewareMyPower self-assigned this Oct 8, 2023
BewareMyPower added a commit to streamnative/pulsar-client-cpp that referenced this pull request Oct 8, 2023
…apache#326)

Fixes apache#325

### Motivation

apache#317 introduces a bug
that might cause segmentation fault when sending messages after
receiving an error, see
apache#325 (comment)
for the detailed explanation.

### Modifications

When calling `asyncWrite`, capture the `shared_ptr` instead of the
`weak_ptr` to extend the lifetime of the `socket_` or `tlsSocket_` field
in `ClientConnection`. Since the lifetime is extended, in some
callbacks, check `isClosed()` before other logic.

Add a `ChunkDedupTest` to reproduce this issue based on Pulsar 3.1.0.
Run the test for 10 times to ensure it won't crash after this patch.
driver: bridge
services:
standalone:
image: apachepulsar/pulsar:3.1.0 # Ensure https://github.com/apache/pulsar/pull/20948 is not included
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to add a comment like "Don't change the pulsar version here.". This test could only work before 3.1.0, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@BewareMyPower BewareMyPower merged commit 5c77648 into apache:main Oct 9, 2023
11 checks passed
@BewareMyPower BewareMyPower deleted the bewaremypower/fix-crash-reconnection branch October 9, 2023 03:07
BewareMyPower added a commit to BewareMyPower/pulsar-client-cpp that referenced this pull request Oct 19, 2023
### Motivation

apache#326 fixes the possible
segmentation fault caused by async_write, but it could still crash when
triggering the callback of async_receive while the socket is destroyed.
See apache#324 (comment)

### Modifications

Capture the `shared_ptr` in `asyncReceive`.
BewareMyPower added a commit that referenced this pull request Oct 20, 2023
### Motivation

#326 fixes the possible
segmentation fault caused by async_write, but it could still crash when
triggering the callback of async_receive while the socket is destroyed.
See #324 (comment)

### Modifications

Capture the `shared_ptr` in `asyncReceive`.
BewareMyPower added a commit to streamnative/pulsar-client-cpp that referenced this pull request Oct 20, 2023
### Motivation

apache#326 fixes the possible
segmentation fault caused by async_write, but it could still crash when
triggering the callback of async_receive while the socket is destroyed.
See apache#324 (comment)

### Modifications

Capture the `shared_ptr` in `asyncReceive`.

(cherry picked from commit a0f2d32)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Segmentation fault when sending messages after receiving an error
3 participants