Fix raft appendlog deadlock #3141
Good job, generally LGTM. Have you tried this PR in your tests? Maybe we shouldn't merge it into 2.6.0.
I also think this should not go into 2.6.0. Let's hold it for a while; I'll do more testing. I tried this PR locally: append log no longer blocks, but a simple match query takes more than 2s to complete after we stopped the stresser, so I think there may be another problem.
OK, contact me on DingTalk if there is any problem.
👌
Good job! Thx a lot~
Good Job, LGTM
Codecov Report
@@ Coverage Diff @@
## master #3141 +/- ##
==========================================
+ Coverage 84.98% 85.17% +0.19%
==========================================
Files 1289 1294 +5
Lines 117763 118265 +502
==========================================
+ Hits 100076 100731 +655
+ Misses 17687 17534 -153
This fixes #3140.
In `bool RaftPart::checkAppendLogResult(AppendLogResult res)` we can see that it releases `logsLock_` before setting `replicatingLogs_` to false. During this window, logs can still be appended to `logs_` until it overflows and `bufferOverFlow_` is set to true:
nebula/src/kvstore/raftex/RaftPart.cpp
Lines 1860 to 1874 in d0fb27a
This is disastrous, because we can see from `folly::Future<AppendLogResult> RaftPart::appendLogAsync()` that once `bufferOverFlow_` is set to true, `appendLogAsync()` returns an error directly. Meanwhile, no other task is consuming `logs_` and resetting `bufferOverFlow_` to false, which means the whole Raft append-log process is blocked:
nebula/src/kvstore/raftex/RaftPart.cpp
Lines 585 to 650 in d0fb27a
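The interleaving described above can be replayed with a small single-threaded model. This is only a sketch and not the code in RaftPart.cpp: `ToyRaftPart`, `cleanupUnderLock`, `clearReplicatingFlag`, `completeRound`, `stuckAfterFailedRound` and the batch limit of 4 are hypothetical names and values chosen for illustration; only `logs_`, `replicatingLogs_` and `bufferOverFlow_` come from the description. It assumes that an in-flight replication round is the only thing that drains `logs_` and resets `bufferOverFlow_`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class AppendResult { kOk, kBufferOverflow };

// Hypothetical toy model; locking is represented by the order of calls.
class ToyRaftPart {
 public:
  explicit ToyRaftPart(std::size_t maxBatch) : maxBatch_(maxBatch) {}

  AppendResult appendLog(const std::string& log) {
    if (bufferOverFlow_) {
      return AppendResult::kBufferOverflow;  // fail fast, nothing is buffered
    }
    if (logs_.size() >= maxBatch_) {
      bufferOverFlow_ = true;
      return AppendResult::kBufferOverflow;
    }
    logs_.push_back(log);
    if (!replicatingLogs_) {
      // No round in flight: start one and hand it the buffered logs.
      replicatingLogs_ = true;
      logs_.clear();
    }
    return AppendResult::kOk;
  }

  // Failed round, part 1: the work assumed to happen while the lock is held.
  void cleanupUnderLock() {
    logs_.clear();
    bufferOverFlow_ = false;
  }

  // Failed round, part 2: marking the round as finished.
  void clearReplicatingFlag() { replicatingLogs_ = false; }

  // An in-flight round completes (assumed behaviour): it either picks up
  // the buffered logs and resets the overflow flag, or goes idle.
  void completeRound() {
    if (!replicatingLogs_) {
      return;  // no round in flight, so nothing drains logs_
    }
    if (logs_.empty()) {
      replicatingLogs_ = false;
    } else {
      logs_.clear();
      bufferOverFlow_ = false;
    }
  }

 private:
  std::size_t maxBatch_;
  std::vector<std::string> logs_;
  bool replicatingLogs_ = false;
  bool bufferOverFlow_ = false;
};

// Replays one failed round followed by a burst of writers.
// Returns true if a later append is still rejected.
bool stuckAfterFailedRound(bool clearFlagUnderLock) {
  ToyRaftPart part(4);
  part.appendLog("first");  // starts a round

  part.cleanupUnderLock();  // the round fails
  if (clearFlagUnderLock) {
    part.clearReplicatingFlag();  // flag cleared before the lock is released
  }
  for (int i = 0; i < 8; ++i) {
    part.appendLog("writer");  // writers arriving right after the unlock
  }
  if (!clearFlagUnderLock) {
    part.clearReplicatingFlag();  // flag cleared only after the window
  }

  part.completeRound();  // lets any in-flight round make progress
  return part.appendLog("later") == AppendResult::kBufferOverflow;
}
```

In this model, `stuckAfterFailedRound(false)` returns true: the buffer overflows while `replicatingLogs_` is still true, and once the flag is cleared there is no round left to reset `bufferOverFlow_`. With `stuckAfterFailedRound(true)` the first writer after the failure starts a new round, which later drains the buffer, so the final append succeeds. Clearing the flag while the lock is held is just one ordering that closes the window in the model.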