-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28140 AbstractWALProvider may miss the WAL which is under creat… #5455
Conversation
…ion in getWALs method
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
The failed UTs are not related. Not sure why TestZooKeeper can fail with time out... Saw it several times recently... |
numRemoteWALUnderCreation++; | ||
} finally { | ||
numRemoteWALUnderCreationLock.unlock(); | ||
} | ||
initWAL(wal); | ||
peerId2WAL.put(peerId, Optional.of(wal)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These code modifications look good to me, but I am not familiar with AbstractWALProvider
. I have a question: What happens if a 'remoteWAL' is being constructed, but getWALs()
doesn't return it? (The 'remoteWAL' being constructed has not been put into peerId2WAL
yet.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see the analysis in this PR.
It will cause problem for replication. We rely on whether a WAL file is beingWritten to determine whether we can move on to the next file. If we miss a WAL here but the WAL file has been put into the replication queue, we could get beingWritten == false but actually the file is beingWritten, which causes unexpected behavior...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
…ion in getWALs method (#5455) Signed-off-by: GeorryHuang <[email protected]> Signed-off-by: Xiaolin Ha <[email protected]> Signed-off-by: Wellington Chevreuil <[email protected]> (cherry picked from commit 391dfda)
…ion in getWALs method (apache#5455) Signed-off-by: GeorryHuang <[email protected]> Signed-off-by: Xiaolin Ha <[email protected]> Signed-off-by: Wellington Chevreuil <[email protected]>
…ion in getWALs method