-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-15046: Get rid of unnecessary fsyncs inside UnifiedLog.lock to stabilize performance #14242
Changes from 6 commits
184b031
b3d53e3
136b5a9
165ec5e
8691c87
7e40de4
6be4d05
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -72,18 +72,20 @@ public CheckpointFile(File file, | |
tempPath = Paths.get(absolutePath + ".tmp"); | ||
} | ||
|
||
public void write(Collection<T> entries) throws IOException { | ||
public void write(Collection<T> entries, boolean sync) throws IOException { | ||
synchronized (lock) { | ||
// write to temp file and then swap with the existing file | ||
try (FileOutputStream fileOutputStream = new FileOutputStream(tempPath.toFile()); | ||
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(fileOutputStream, StandardCharsets.UTF_8))) { | ||
CheckpointWriteBuffer<T> checkpointWriteBuffer = new CheckpointWriteBuffer<>(writer, version, formatter); | ||
checkpointWriteBuffer.write(entries); | ||
writer.flush(); | ||
fileOutputStream.getFD().sync(); | ||
if (sync) { | ||
fileOutputStream.getFD().sync(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ocadaruma : I realized a potential issue with this change. The issue is that if sync is false, we don't force a flush to disk. However, the OS could flush partial content of the leader epoch file. If the broker has a hard failure, the leader epoch file could be corrupted. In the recovery path, since we always expect the leader epoch file to be well-formed, a corrupted leader epoch file will fail the recovery. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @junrao Hmm, that's true. Thanks for pointing out. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, @ocadaruma ! |
||
} | ||
} | ||
|
||
Utils.atomicMoveWithFallback(tempPath, absolutePath); | ||
Utils.atomicMoveWithFallback(tempPath, absolutePath, sync); | ||
} | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to add a test to verify that the recovery point is only advanced after the producer state has been flushed to disk?