-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ALLUXIO-2408] Fix potential journal checkpoint failure #4306
Conversation
Merged build finished. Test PASSed. |
Test PASSed. |
Running checks...
All checks passed! |
Merged build finished. Test PASSed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaudiber Thanks for this fix! I left a few comments.
ufs.delete(path, false); | ||
} | ||
} catch (IOException e) { | ||
throw Throwables.propagate(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which exceptions will be propagated? Technically, even if you check for exists
before you delete
, there is no guarantee it will still exist at the time of the delete
. Therefore, there is currently no way to do this atomically.
So, when should this method not throw an error, and when should it? What is the purpose/goal for this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The javadoc for UnderFileSystem just says @throws IOException if a non-Alluxio error occurs
for both exists
and delete
. This method is never expected to throw an exception.
I don't think the lack of atomicity is a problem. The contract is that if the file path exists, it gets deleted. If it gets deleted between the exists and the delete, the delete will return false and the result is the same (no more file)
@@ -118,6 +118,8 @@ public void start(boolean isLeader) throws IOException { | |||
* concurrent access to the master during these phases. | |||
*/ | |||
|
|||
mJournalWriter.recoverCheckpoint(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be commented on in the previous comments about the steps/phases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/** Absolute path to the backup checkpoint file. */ | ||
private final String mBackupCheckpointPath; | ||
/** Absolute path to the temporary backup checkpoint file. */ | ||
private final String mTempBackupCheckpointPath; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hrmmm, maybe there should be a javadoc somewhere that describes this process. There is a temp, backup, and tempbackup file, which can be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added more detailed javadocs for each of these
public synchronized void recoverCheckpoint() { | ||
try { | ||
if (mUfs.exists(mBackupCheckpointPath)) { | ||
if (mUfs.exists(mJournal.getCheckpointFilePath())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should mJournal.getCheckpointFilePath()
be mCheckpointPath
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks
deleteCompletedLogs(); | ||
mUfs.delete(mBackupCheckpointPath, false); | ||
} else { | ||
// We must have crashed before writing the checkpoint file, restore the checkpoint from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure when this would happen, so if there is a javadoc for this class that describes the checkpointing process, that would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added more docs here
@@ -179,16 +213,15 @@ private synchronized OutputStream openCurrentLog() throws IOException { | |||
*/ | |||
private synchronized void deleteCompletedLogs() throws IOException { | |||
LOG.info("Deleting all completed log files..."); | |||
// Loop over all complete logs starting from the beginning. | |||
// TODO(gpang): should the deletes start from the end? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this is still relevant? I think I put in this comment since we do not delete completed logs "atomically". However, with your change, it seems like the deletion of completed logs can look atomic now? if something bad happens during the deletion, it will continue the deletion on the next startup, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method finds the logs by seeing how high it can count before the log of that number doesn't exist. If we have 10 logs and only delete log #1, the next time we try to delete the logs we won't delete logs 2-10 because we see that log 1 doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahhh, I see. I guess the TODO was correct. Thanks for fixing it.
@@ -274,16 +307,23 @@ public synchronized void close() throws IOException { | |||
mOutputStream.close(); | |||
|
|||
LOG.info("Successfully created tmp checkpoint file: {}", mTempCheckpointPath); | |||
mUfs.delete(mJournal.getCheckpointFilePath(), false); | |||
// TODO(gpang): the real checkpoint should not be overwritten here, but after all operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this!
Thanks for the review @gpang, I think it's much better now that I've added more comments/javadoc. All of your comments should be addressed now, PTAL |
Merged build finished. Test PASSed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaudiber Thanks! Left a few comments, and looks good.
private final String mBackupCheckpointPath; | ||
/** | ||
* Absolute path to the temporary backup checkpoint file. This path is used as an intermediate | ||
* rename step when backing up mCheckpointPath to mBackupCheckpointPath. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: use {@link }
for mCheckpointPath
and mBackupCheckpointPath
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
UnderFileSystemUtils.deleteIfExists(mUfs, mTempBackupCheckpointPath); | ||
UnderFileSystemUtils.deleteIfExists(mUfs, mBackupCheckpointPath); | ||
// Rename in two steps so that we never have identical mCheckpointPath and | ||
// mBackupCheckpointPath. This is a concern since UFS may implement rename as copy + delete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an invariant we want to maintain. Maybe this invariant should be explicitly mentioned in a class javadoc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more docs and centralized the code in one place to make it easier to see all the invariants
public synchronized void recoverCheckpoint() { | ||
try { | ||
if (mUfs.exists(mTempBackupCheckpointPath)) { | ||
// If mCheckpointPath exists, step 2 must have implemented rename as copy + delete, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is assuming the invariant that both dont exist at the same time. Do we want to precondition that, or is that potentially unsafe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only assumption is that all three files don't exist. I've added a Precondition for this
Merged build finished. Test FAILed. |
Test FAILed. Build result: FAILURE[...truncated 1887 lines...][JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/shell/pom.xml to org.alluxio/alluxio-shell/1.4.0-SNAPSHOT/alluxio-shell-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/oss/pom.xml to org.alluxio/alluxio-underfs-oss/1.4.0-SNAPSHOT/alluxio-underfs-oss-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/oss/target/alluxio-underfs-oss-1.4.0-SNAPSHOT.jar to org.alluxio/alluxio-underfs-oss/1.4.0-SNAPSHOT/alluxio-underfs-oss-1.4.0-SNAPSHOT.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/oss/target/alluxio-underfs-oss-1.4.0-SNAPSHOT-sources.jar to org.alluxio/alluxio-underfs-oss/1.4.0-SNAPSHOT/alluxio-underfs-oss-1.4.0-SNAPSHOT-sources.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/keyvalue/client/pom.xml to org.alluxio/alluxio-keyvalue-client/1.4.0-SNAPSHOT/alluxio-keyvalue-client-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/hdfs/pom.xml to org.alluxio/alluxio-underfs-hdfs/1.4.0-SNAPSHOT/alluxio-underfs-hdfs-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/hdfs/target/alluxio-underfs-hdfs-1.4.0-SNAPSHOT.jar to org.alluxio/alluxio-underfs-hdfs/1.4.0-SNAPSHOT/alluxio-underfs-hdfs-1.4.0-SNAPSHOT.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/hdfs/target/alluxio-underfs-hdfs-1.4.0-SNAPSHOT-sources.jar to org.alluxio/alluxio-underfs-hdfs/1.4.0-SNAPSHOT/alluxio-underfs-hdfs-1.4.0-SNAPSHOT-sources.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/swift/pom.xml to org.alluxio/alluxio-underfs-swift/1.4.0-SNAPSHOT/alluxio-underfs-swift-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/swift/target/alluxio-underfs-swift-1.4.0-SNAPSHOT.jar to org.alluxio/alluxio-underfs-swift/1.4.0-SNAPSHOT/alluxio-underfs-swift-1.4.0-SNAPSHOT.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/swift/target/alluxio-underfs-swift-1.4.0-SNAPSHOT-sources.jar to org.alluxio/alluxio-underfs-swift/1.4.0-SNAPSHOT/alluxio-underfs-swift-1.4.0-SNAPSHOT-sources.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/s3a/pom.xml to org.alluxio/alluxio-underfs-s3a/1.4.0-SNAPSHOT/alluxio-underfs-s3a-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/s3a/target/alluxio-underfs-s3a-1.4.0-SNAPSHOT.jar to org.alluxio/alluxio-underfs-s3a/1.4.0-SNAPSHOT/alluxio-underfs-s3a-1.4.0-SNAPSHOT.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/s3a/target/alluxio-underfs-s3a-1.4.0-SNAPSHOT-sources.jar to org.alluxio/alluxio-underfs-s3a/1.4.0-SNAPSHOT/alluxio-underfs-s3a-1.4.0-SNAPSHOT-sources.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/examples/pom.xml to org.alluxio/alluxio-examples/1.4.0-SNAPSHOT/alluxio-examples-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/core/server/pom.xml to org.alluxio/alluxio-core-server/1.4.0-SNAPSHOT/alluxio-core-server-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/keyvalue/common/pom.xml to org.alluxio/alluxio-keyvalue-common/1.4.0-SNAPSHOT/alluxio-keyvalue-common-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/pom.xml to org.alluxio/alluxio-underfs/1.4.0-SNAPSHOT/alluxio-underfs-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/glusterfs/pom.xml to org.alluxio/alluxio-underfs-glusterfs/1.4.0-SNAPSHOT/alluxio-underfs-glusterfs-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/glusterfs/target/alluxio-underfs-glusterfs-1.4.0-SNAPSHOT.jar to org.alluxio/alluxio-underfs-glusterfs/1.4.0-SNAPSHOT/alluxio-underfs-glusterfs-1.4.0-SNAPSHOT.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/underfs/glusterfs/target/alluxio-underfs-glusterfs-1.4.0-SNAPSHOT-sources.jar to org.alluxio/alluxio-underfs-glusterfs/1.4.0-SNAPSHOT/alluxio-underfs-glusterfs-1.4.0-SNAPSHOT-sources.jar[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/integration/pom.xml to org.alluxio/alluxio-integration/1.4.0-SNAPSHOT/alluxio-integration-1.4.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/workspace/Alluxio-Pull-Request-Builder/keyvalue/pom.xml to org.alluxio/alluxio-keyvalue/1.4.0-SNAPSHOT/alluxio-keyvalue-1.4.0-SNAPSHOT.pomchannel stoppedTest FAILed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaudiber some minor comments
* Creates a new instance of {@link CheckpointManager}. | ||
* | ||
* @param ufs the under file system holding the journal | ||
* @param the directory for the journal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: the javadoc params seem out of date with the actual parameters of the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
// Rename in two steps so that we never have identical mCheckpointPath and | ||
// mBackupCheckpointPath. This is a concern since UFS may implement rename as copy + delete. | ||
mUfs.rename(mCheckpointPath, mTempBackupCheckpointPath); | ||
mUfs.rename(mTempBackupCheckpointPath, mBackupCheckpointPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: maybe add a log message after this that the backup of the checkpoint was completed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
public void recoverCheckpoint() { | ||
try { | ||
Preconditions.checkState( | ||
!(mUfs.exists(mCheckpointPath) && mUfs.exists(mTempBackupCheckpointPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be duplicate exists
calls for these files? Maybe add a TODO to remove the duplicate calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we cannot guarantee atomicity between exists & next operation, I think we should just cache the values for the duration of the recovery, unless we expect an update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Merged build finished. Test PASSed. |
Test PASSed. |
public void recoverCheckpoint() { | ||
try { | ||
Preconditions.checkState( | ||
!(mUfs.exists(mCheckpointPath) && mUfs.exists(mTempBackupCheckpointPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we cannot guarantee atomicity between exists & next operation, I think we should just cache the values for the duration of the recovery, unless we expect an update.
/** | ||
* Recovers the checkpoint file in case the master crashed while updating it previously. | ||
*/ | ||
public void recoverCheckpoint() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you document the expected state after this method is called? ie.
CheckpointPath is always valid, CheckpointPath + editlogs -> up to date master state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@calvinjia @gpang All comments addressed, PTAL |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merged build finished. Test PASSed. |
Test PASSed. |
Merged build finished. Test PASSed. |
Test PASSed. |
https://alluxio.atlassian.net/browse/ALLUXIO-2408