Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Ensure previous delayed index be removed from snapshotSegmentLastIndexTable & Make load operate asynchronous #20086

Merged
merged 9 commits into from
Apr 16, 2023

Conversation

coderzc
Copy link
Member

@coderzc coderzc commented Apr 13, 2023

PIP: #16763

Motivation

If the future of load segment timeout but, In fact load segment is successful then we need to ensure the previous delayed index is removed from snapshotSegmentLastIndexTable, otherwise, tracker will load the next segment even though we don't need it yet.

Modifications

  • Ensure the previous delayed index is removed from snapshotSegmentLastIndexTable when succeeding to load the segment
  • Make load operate asynchronous to avoid block thread

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@coderzc coderzc requested a review from mattisonchao April 13, 2023 04:23
@coderzc coderzc added area/broker type/bug The PR fixed a bug or issue reported a bug labels Apr 13, 2023
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 13, 2023
@coderzc coderzc self-assigned this Apr 13, 2023
@coderzc coderzc changed the title [fix][broker] Ensure previous delayed index be removed from snapshotSegmentLastIndexTable [fix][broker] Ensure previous delayed index be removed from snapshotSegmentLastIndexTable & Make load operate asynchronous Apr 13, 2023
@coderzc coderzc closed this Apr 13, 2023
@coderzc coderzc reopened this Apr 13, 2023
@coderzc coderzc force-pushed the fix_bucket_delayed_stcuk branch from 4a75481 to 0bf5554 Compare April 13, 2023 15:45
@coderzc coderzc force-pushed the fix_bucket_delayed_stcuk branch from 0bf5554 to 56f4902 Compare April 14, 2023 03:24
@coderzc coderzc modified the milestone: 3.0.0 Apr 14, 2023
@coderzc coderzc force-pushed the fix_bucket_delayed_stcuk branch from 7ed8dba to 84691ea Compare April 14, 2023 05:00
@codecov-commenter
Copy link

codecov-commenter commented Apr 14, 2023

Codecov Report

Attention: Patch coverage is 82.05128% with 7 lines in your changes missing coverage. Please review.

Project coverage is 72.91%. Comparing base (a332fea) to head (0239f4d).
Report is 1673 commits behind head on master.

Files with missing lines Patch % Lines
...r/delayed/bucket/BucketDelayedDeliveryTracker.java 82.05% 5 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #20086       +/-   ##
=============================================
+ Coverage     35.58%   72.91%   +37.33%     
- Complexity    12423    31908    +19485     
=============================================
  Files          1691     1868      +177     
  Lines        128771   138317     +9546     
  Branches      14044    15214     +1170     
=============================================
+ Hits          45822   100860    +55038     
+ Misses        76931    29449    -47482     
- Partials       6018     8008     +1990     
Flag Coverage Δ
inttests 24.20% <0.00%> (?)
systests 24.87% <0.00%> (-0.03%) ⬇️
unittests 72.22% <82.05%> (+39.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...sistent/PersistentDispatcherMultipleConsumers.java 76.00% <ø> (+23.60%) ⬆️
...r/delayed/bucket/BucketDelayedDeliveryTracker.java 83.91% <82.05%> (+83.91%) ⬆️

... and 1440 files with indirect coverage changes

bucket.asyncLoadNextBucketSnapshotEntry().thenAccept(indexList -> {
long loadStartTime = System.currentTimeMillis();
stats.recordTriggerEvent(BucketDelayedMessageIndexStats.Type.load);
CompletableFuture<Void> loadFuture = pendingLoad = bucket.asyncLoadNextBucketSnapshotEntry()
Copy link
Contributor

@lifepuzzlefun lifepuzzlefun Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this line split into multi line ? var a = b = c seems not usual : - )

@@ -527,17 +528,22 @@ protected long nextDeliveryTime() {
}

@Override
public synchronized long getNumberOfDelayedMessages() {
public long getNumberOfDelayedMessages() {
return numberDelayedMessages;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need an volatile on field numberDelayedMessage if we access without lock.

return this.lastMutableBucket.getBufferMemoryUsage() + sharedBucketPriorityQueue.bytesCapacity();
}

@Override
public synchronized NavigableSet<PositionImpl> getScheduledMessages(int maxMessages) {
if (!checkPendingOpDone()) {
log.info("[{}] Skip getScheduledMessages to wait for bucket snapshot load finish.", dispatcher.getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method is on the hot path. we need change this log level to debug to avoid print logs frequently.

return this.lastMutableBucket.getBufferMemoryUsage() + sharedBucketPriorityQueue.bytesCapacity();
}

@Override
public synchronized NavigableSet<PositionImpl> getScheduledMessages(int maxMessages) {
if (!checkPendingOpDone()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the loading task only triggered we there is not any scheduled message left ? it seems dispatcher have to wait for the loading task finished before any scheduled message on time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the delivery of the message will be delayed. I think we can optimize this part in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. hope to see PIP-195 in 3.0.0 : - )

Copy link
Contributor

@lifepuzzlefun lifepuzzlefun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ~

@coderzc coderzc merged commit c4aec66 into apache:master Apr 16, 2023
coderzc added a commit that referenced this pull request Apr 17, 2023
…egmentLastIndexTable & Make load operate asynchronous (#20086)

(cherry picked from commit c4aec66)
@cbornet
Copy link
Contributor

cbornet commented Apr 24, 2023

This has been cherry-picked to branch-3.0 despite the code freeze.
Do you feel strongly that this needs to be in the 3.0 release ? Otherwise we should revert this and delay it 3.0.1/3.1.

@coderzc
Copy link
Member Author

coderzc commented Apr 24, 2023

This has been cherry-picked to branch-3.0 despite the code freeze. Do you feel strongly that this needs to be in the 3.0 release ? Otherwise we should revert this and delay it 3.0.1/3.1.

This is a severe issue for PIP-195 so it needs into 3.0, thanks~

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker doc-not-needed Your PR changes do not impact docs ready-to-test type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants