[fix][broker] Fix wrong double-checked locking for readOnActiveConsumerTask in dispatcher #22279
### Motivation

Access to `readOnActiveConsumerTask` is not thread safe.

```java
if (readOnActiveConsumerTask != null) {
    return;
}
readOnActiveConsumerTask = topic.getBrokerService().executor().schedule(() -> {/* ... */});
```

The following interleaving is possible:

| Steps | Thread 1 | Thread 2 |
| :- | :- | :- |
| 1 | Read `readOnActiveConsumerTask`: null | |
| 2 | Call `schedule` | |
| 3 | | Read `readOnActiveConsumerTask`: null |
| 4 | | Call `schedule` |
| 5 | Write the result of `schedule` to `readOnActiveConsumerTask` | |
| 6 | | Write the result of `schedule` to `readOnActiveConsumerTask` |

In that case `schedule()` is called twice and only one result (from either step 5 or step 6) remains assigned to `readOnActiveConsumerTask`.

### Modifications

Apply double-checked locking when `readOnActiveConsumerTask` is null, so that the task cannot be scheduled concurrently by two threads.
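As an illustration of the modification described above, here is a minimal sketch of double-checked locking around a scheduled task. It is not the code from this PR: the class `DispatcherSketch`, the `scheduleCalls` counter, and the `delayMs` parameter are hypothetical names introduced only for the example, and the sketch never clears the field after the task runs.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the dispatcher; not Pulsar's actual class.
class DispatcherSketch {
    private final ScheduledExecutorService executor;
    // Counts how often schedule() was really invoked (illustration only).
    private final AtomicInteger scheduleCalls = new AtomicInteger();
    // volatile so that the unlocked first check observes the latest write.
    private volatile ScheduledFuture<?> readOnActiveConsumerTask;

    DispatcherSketch(ScheduledExecutorService executor) {
        this.executor = executor;
    }

    void scheduleReadOnActiveConsumer(long delayMs) {
        // First check, without the lock: cheap exit when a task already exists.
        if (readOnActiveConsumerTask != null) {
            return;
        }
        synchronized (this) {
            // Second check, under the lock: another thread may have scheduled
            // the task between the first check and lock acquisition.
            if (readOnActiveConsumerTask != null) {
                return;
            }
            scheduleCalls.incrementAndGet();
            readOnActiveConsumerTask = executor.schedule(
                    () -> { /* ... placeholder for the read logic ... */ },
                    delayMs, TimeUnit.MILLISECONDS);
        }
    }

    int scheduleCalls() {
        return scheduleCalls.get();
    }
}
```

With this shape, the check and the write in steps 1 to 6 of the table happen atomically with respect to other callers, so a second caller either sees the non-null field or waits for the lock and then sees it.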
LGTM. Good catch, @BewareMyPower. I wonder whether the same bug pattern exists in other locations in the Pulsar code base?
@BewareMyPower Please also check that possible concurrent calls to the `scheduleReadOnActiveConsumer` method aren't causing issues.
Need to check the `scheduleReadOnActiveConsumer` method.
@lhotari
Yes, there is a chance that
However, it's still safe because in Lines 343 to 349 in 442595e
BTW, yes, the same bug pattern is used in many places.
Good job and awesome explanations @BewareMyPower
@lhotari This PR was merged too quickly, without a 2nd review. After revisiting the code, I found that the double-checked locking is unnecessary.
The original motivation was the IDE warning: it's true that this is a non-atomic operation on the volatile field 'readOnActiveConsumerTask'. Accessing a volatile field this way is a commonly seen error, so the IDE gives such warnings. However, we don't need to make
I opened another PR (#22285) that improves the code quality and removes the unnecessary atomicity. That PR also reverts the changes of this PR. PTAL again.
@BewareMyPower We don't have a policy that there should be a 2nd review. This PR didn't contain any risky changes and wasn't harmful at all and that's why I merged. It's better to keep on moving instead of optimizing for perfect PRs. The resolution that you have done is a good one: opening another PR with follow-up changes.
@BewareMyPower I agree that generalizing thread safety to a certain style in the Pulsar code base won't be very useful. However, there are real problems where thread safety has been ignored and fixing the issues isn't prioritized. For example, the topic-level policies contain real problems; see #21303.
…erTask in dispatcher (#22279)
…erTask in dispatcher (apache#22279) (cherry picked from commit 4e0c145) (cherry picked from commit e2070a8)