Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Skip to persist cursor info if it failed by cursor closed #23615

Merged

Conversation

poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented Nov 20, 2024

Motivation

We encountered an issue as follows

time closing topic -> closing curosr in-progress cursor mark deleting
1 Mark the cursor as closing
2 Persistent cursor metadata into the cursor ledger 54781383
3 Start to close the cursor ledger 54781383
4 Mark the ledger 54781383 is closed in memory
5 Try to write metadata to the ledger 54781383, but the ledger was marked closed
6 Try to write metadata to ZK after failed writing BK
7 Start to write the ledger metadata of 54781383 to ZK
8 The ZK server closed the connection due to the pulsar trying to write a large value
9 Write to ZK failed due to the connection being lost
10 retries and keeps triggering a closing of the ZK connection
11 The retries will fail because the connection is unstable

I think we need at least 3 fixes

  • (What this current PR attempts to do) Do not continue executing the in-progress cursor mark deletion after the cursor is closed.
  • (I will push another PR to do it) Add a mechanism that limits the data size that will persist with ZK.
  • (I will push another PR to do it) Stop retries if such an issue occurs.

Modifications

  • Skip to persist cursor info if it failed by cursor closed
  • TODO: I will write a test tomorrow.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

@poorbarcode poorbarcode added type/bug The PR fixed a bug or issue reported a bug release/3.3.3 release/3.0.8 release/4.0.1 labels Nov 20, 2024
@poorbarcode poorbarcode added this to the 4.1.0 milestone Nov 20, 2024
@poorbarcode poorbarcode self-assigned this Nov 20, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 20, 2024
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari lhotari closed this Nov 29, 2024
@lhotari lhotari reopened this Nov 29, 2024
@lhotari lhotari changed the title [fix] [broker] Skip to persist cursor info if it failed by cursor closed [fix][broker] Skip to persist cursor info if it failed by cursor closed Nov 29, 2024
@dao-jun
Copy link
Member

dao-jun commented Dec 11, 2024

close reopen to retrigger the ci

@dao-jun dao-jun closed this Dec 11, 2024
@dao-jun dao-jun reopened this Dec 11, 2024
@codecov-commenter
Copy link

codecov-commenter commented Dec 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.15%. Comparing base (bbc6224) to head (02da3bd).
Report is 813 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #23615      +/-   ##
============================================
+ Coverage     73.57%   74.15%   +0.58%     
+ Complexity    32624    32145     -479     
============================================
  Files          1877     1853      -24     
  Lines        139502   143376    +3874     
  Branches      15299    16277     +978     
============================================
+ Hits         102638   106322    +3684     
+ Misses        28908    28682     -226     
- Partials       7956     8372     +416     
Flag Coverage Δ
inttests 26.72% <25.00%> (+2.14%) ⬆️
systests 23.10% <25.00%> (-1.22%) ⬇️
unittests 73.68% <100.00%> (+0.84%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...che/bookkeeper/mledger/impl/ManagedCursorImpl.java 79.78% <100.00%> (+0.48%) ⬆️

... and 1021 files with indirect coverage changes

@dao-jun
Copy link
Member

dao-jun commented Dec 11, 2024

@poorbarcode pls resolve the ci failure

@poorbarcode
Copy link
Contributor Author

@poorbarcode pls resolve the ci failure

Sure

@poorbarcode poorbarcode merged commit 9850605 into apache:master Dec 31, 2024
52 checks passed
lhotari pushed a commit that referenced this pull request Jan 2, 2025
lhotari pushed a commit that referenced this pull request Jan 2, 2025
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Jan 3, 2025
…ed (apache#23615)

(cherry picked from commit 9850605)
(cherry picked from commit 86eb2a8)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Jan 3, 2025
…ed (apache#23615)

(cherry picked from commit 9850605)
(cherry picked from commit 86eb2a8)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants