HBASE-28114 Replication log reader should not simply quit when queue is empty
Apache9 committed Sep 29, 2023
1 parent 4bc7d47 commit c702eb5
Showing 1 changed file with 26 additions and 2 deletions.
@@ -334,14 +334,38 @@ private HasNext tryAdvanceEntry() {
boolean beingWritten = pair.getSecond();
LOG.trace("Reading WAL {}; result={}, currently open for write={}", this.currentPath, state,
beingWritten);
// Note that the implementation must guarantee that, when beingWritten is true, there is at
// least one more WAL file in the logQueue before we poll out the current one. Otherwise we may
// accidentally quit the shipper thread while a newly arrived WAL file has not yet been
// enqueued into the logQueue.
switch (state) {
case NORMAL:
// everything is fine, just return
return HasNext.YES;
case EOF_WITH_TRAILER:
// we have reached the trailer, which means this WAL file has been closed cleanly and we
// have finished reading it successfully.
if (beingWritten) {
// This usually happens because a log roll occurred immediately after we called
// getLogFileSizeIfBeingWritten, and the rolling thread has already finished writing the WAL
// trailer. It is still possible that the log roller thread has not finished enqueuing the
// new WAL file to the log queue yet, so we must first check the logQueue to see whether a
// new WAL file follows the current one. If so, we can dequeue the current one and issue a
// RETRY_IMMEDIATELY. Otherwise we should just issue a RETRY and let the upper layer retry
// later: if we dequeued now and the new WAL file still had not been enqueued when the upper
// layer retried, it would find nothing in the replication queue and quit, which is incorrect
// and would halt replication forever.
PriorityBlockingQueue<Path> queue = logQueue.getQueue(walGroupId);
if (queue.size() <= 1) {
return HasNext.RETRY;
}
// otherwise, fall through to execute the same logic as beingWritten == false, as we have
// the new WAL file in logQueue already.
}
// If beingWritten is false, the log roll has fully finished and the new WAL file has already
// been enqueued into the replication queue, because we hold the rollWriterLock when calling
// getLogFileSizeIfBeingWritten. So just move to the next WAL file and let the upper layer
// start reading it.
dequeueCurrentLog();
return HasNext.RETRY_IMMEDIATELY;
case EOF_AND_RESET:
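The EOF_WITH_TRAILER branch above can be hard to follow through the comments alone, so here is a minimal standalone sketch of the decision it makes. It is not the HBase code: the class name EofWithTrailerSketch, the method onEofWithTrailer, and the use of String in place of Path are assumptions made for illustration, and the call to queue.poll() only stands in for dequeueCurrentLog().

```java
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical, simplified model of the EOF_WITH_TRAILER decision in the hunk above.
public class EofWithTrailerSketch {
  // Mirrors the three outcomes named in the diff.
  enum HasNext { YES, RETRY, RETRY_IMMEDIATELY }

  // Decide what to do once the trailer of the current WAL has been read.
  // The queue holds the current WAL at its head plus any WALs enqueued after it.
  static HasNext onEofWithTrailer(boolean beingWritten, PriorityBlockingQueue<String> queue) {
    if (beingWritten && queue.size() <= 1) {
      // The successor WAL may not be enqueued yet. Keep the current WAL in the
      // queue so the caller retries later instead of seeing an empty queue.
      return HasNext.RETRY;
    }
    // A successor is known to exist (or the roll has fully finished):
    // drop the current WAL and move on right away.
    queue.poll(); // stand-in for dequeueCurrentLog()
    return HasNext.RETRY_IMMEDIATELY;
  }

  public static void main(String[] args) {
    PriorityBlockingQueue<String> queue = new PriorityBlockingQueue<>();
    queue.add("wal.1");
    // Only the current WAL is queued while it is reported as being written.
    System.out.println(onEofWithTrailer(true, queue)); // prints RETRY
    // The log roller has now enqueued the successor.
    queue.add("wal.2");
    System.out.println(onEofWithTrailer(true, queue)); // prints RETRY_IMMEDIATELY
    System.out.println(queue.peek()); // prints wal.2
  }
}
```

The point of the sketch is that the queue is never emptied while a log roll may still be in flight, which is the condition the commit message describes as causing the reader to quit.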
