-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[KAFKA-8522] Implementing proposal as outlined in KIP-534 #7600
[KAFKA-8522] Implementing proposal as outlined in KIP-534 #7600
Conversation
@junrao @hachikuji Have finished the initial ground work. Will be glad to get some comments. |
Retest this please. |
@junrao @hachikuji Fixed the issue. Ready for review. |
@hachikuji @guozhangwang Have any comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ConcurrencyPractitioner : Thanks for the PR. Looks good overall. Left a few comments below.
@@ -643,7 +643,8 @@ private[log] class Cleaner(val id: Int, | |||
maxLogMessageSize: Int, | |||
transactionMetadata: CleanedTransactionMetadata, | |||
lastRecordsOfActiveProducers: Map[Long, LastRecord], | |||
stats: CleanerStats): Unit = { | |||
stats: CleanerStats, | |||
tombstoneRetentionMs: Long): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add the javadoc for tombstoneRetentionMs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
@@ -431,6 +431,16 @@ public LegacyRecord outerRecord() { | |||
return record; | |||
} | |||
|
|||
@Override | |||
public long deleteHorizonMs() { | |||
return -1L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use RecordBatch.NO_TIMESTAMP here too?
public long deleteHorizonMs() { | ||
if (isDeleteHorizonSet()) { | ||
return firstTimestamp(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for {} for single line statement.
return attributes; | ||
} | ||
|
||
public static void writeEmptyHeader(ByteBuffer buffer, | ||
byte magic, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing indentation is correct.
public long deleteHorizonMs() { | ||
if (isDeleteHorizonSet()) { | ||
return super.loadFullBatch().deleteHorizonMs(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for {} for single line statement.
if (firstTimestamp == null) { | ||
if (isDeleteHorizonSet()) { | ||
firstTimestamp = deleteHorizonMs; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for {} for single line statement.
val isRetainedValue = record.hasValue || retainDeletes | ||
val isRetainedValue = record.hasValue || | ||
(!batch.isDeleteHorizonSet() || | ||
time.milliseconds() < batch.deleteHorizonMs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should use the new condition only when the record batch is of the latest message format. Otherwise, we could continue to use the old condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I ran tests, and spotbug main reported dodgy code when I checked if the version was less than the most recent one. Apparently, by this point, we already know that the batch is the latest version (v_2) anyways, so we don't need to check for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, how do we know at this point that batch is of V2? It seems that it could be any version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, will push this to git then. Can't find exactly where it is checked.
|
||
// check that the control batch has been emptied of records | ||
// if not, then we do not set a delete horizon until that is true | ||
if (batch.isControlBatch() && transactionMetadata.onControlBatchRead(batch)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we already called transactionMetadata.onControlBatchRead(batch) in shouldDiscardBatch(). So, we probably don't want to call it again on the same batch. Instead, we want to pass along the return value to retrieveDeleteHorizon().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, working on this. Will address this soon.
if (writeOriginalBatch) { | ||
// we check if the delete horizon should be set to a new value | ||
// in which case, we need to reset the base timestamp and overwrite the timestamp deltas | ||
if (writeOriginalBatch && !shouldSetDeleteHorizon) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only need to write the DeleteHorizonTime if the batch contains a tombstone or a control record. This increases the chance for more optimized writeOriginalBatch case to happen.
* ------------------------------------------------------------------------------------------------- | ||
* | Unused (6-15) | Control (5) | Transactional (4) | Timestamp Type (3) | Compression Type (0-2) | | ||
* ------------------------------------------------------------------------------------------------- | ||
* --------------------------------------------------------------------------------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add some doc on how Delete Horizon flag is being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
batch.writeTo(bufferOutputStream); | ||
filterResult.updateRetainedBatchMetadata(batch, retainedRecords.size(), false); | ||
} else { | ||
MemoryRecordsBuilder builder = buildRetainedRecordsInto(batch, retainedRecords, bufferOutputStream); | ||
MemoryRecordsBuilder builder = buildRetainedRecordsInto(batch, retainedRecords, bufferOutputStream, firstTimestamp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably only want to set the DeleteHorizonTime if the batch contains tombstone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I missed that. Should've also checked in the else part of the code as well.
@@ -159,6 +159,7 @@ private static FilterResult filterTo(TopicPartition partition, Iterable<MutableR | |||
for (MutableRecordBatch batch : batches) { | |||
long maxOffset = -1L; | |||
BatchRetention batchRetention = filter.checkBatchRetention(batch); | |||
final long firstTimestamp = filter.retrieveDeleteHorizon(batch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
firstTimestamp => deleteHorizonMs?
boolean writeOriginalBatch = true; | ||
boolean shouldSetDeleteHorizon = !batch.isDeleteHorizonSet() && firstTimestamp != RecordBatch.NO_TIMESTAMP; | ||
boolean containsTombstones = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
containsTombstones => containsTombstonesOrMarker?
@junrao All comments have been addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ConcurrencyPractitioner : Thanks for the updated patch. A few more comments below. Also, could we add a test for the new logic?
} | ||
|
||
@Override | ||
public boolean isDeleteHorizonSet() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isDeleteHorizonSet => deleteHorizonSet ?
@@ -557,6 +577,16 @@ public long baseOffset() { | |||
return loadFullBatch().baseOffset(); | |||
} | |||
|
|||
@Override | |||
public long deleteHorizonMs() { | |||
return loadFullBatch().deleteHorizonMs(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a legacy record, deleteHorizonMs is never going to be set. It seems we can avoid loading the full batch? Ditto in isDeleteHorizonSet below.
@Override | ||
public long deleteHorizonMs() { | ||
if (isDeleteHorizonSet()) | ||
return super.loadFullBatch().deleteHorizonMs(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, why do we need to call loadFullBatch()? Could we just call loadBatchHeader()?
@@ -159,6 +159,7 @@ private static FilterResult filterTo(TopicPartition partition, Iterable<MutableR | |||
for (MutableRecordBatch batch : batches) { | |||
long maxOffset = -1L; | |||
BatchRetention batchRetention = filter.checkBatchRetention(batch); | |||
long deleteHorizonMs = filter.retrieveDeleteHorizon(batch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it seems that we can just get deleteHorizonMs from batch directly, instead of through filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I get what you are saying. Just want to point out that this method is still necessary when the delete horizon is not set.
@@ -785,7 +801,14 @@ private[log] class Cleaner(val id: Int, | |||
* 2) The message doesn't has value but it can't be deleted now. | |||
*/ | |||
val latestOffsetForKey = record.offset() >= foundOffset | |||
val isRetainedValue = record.hasValue || retainDeletes | |||
val isLatestVersion = batch.magic() < RecordBatch.MAGIC_VALUE_V2 | |||
val shouldRetainDeletes = isLatestVersion match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For boolean value, we can just use if/else instead of match/case.
} else { | ||
val canDiscardBatch = transactionMetadata.onBatchRead(batch) | ||
canDiscardBatch | ||
(canDiscardBatch, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a control batch is empty, we need to check whether the batch should be retained based on deleteHorizonMs similar to what's in shouldRetainRecord. We can ignore retainTxnMarkers since a control batch is always in V2.
Also, it's a bit awkward to have to return 2 values. Maybe it's simpler to just return canDiscardControlBatch and let the caller decide how the batch should be retained.
boolean writeOriginalBatch = true; | ||
boolean shouldSetDeleteHorizon = !batch.isDeleteHorizonSet() && deleteHorizonMs != RecordBatch.NO_TIMESTAMP; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, a more intuitive check is probably !batch.isDeleteHorizonSet() && messageFormat >= V2.
@@ -250,6 +304,7 @@ private static MemoryRecordsBuilder buildRetainedRecordsInto(RecordBatch origina | |||
originalBatch.compressionType(), timestampType, baseOffset, logAppendTime, originalBatch.producerId(), | |||
originalBatch.producerEpoch(), originalBatch.baseSequence(), originalBatch.isTransactional(), | |||
originalBatch.isControlBatch(), originalBatch.partitionLeaderEpoch(), bufferOutputStream.limit()); | |||
builder.setDeleteHorizonMs(deleteHorizonMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding a new setDeleteHorizonMs() method, would it be better to just add another constructor for MemoryRecordsBuilder that includes DeleteHorizonMs? This way, it's more consistent with how DeleteHorizonMs is set in another classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junrao That wouldn't work. Checkstyle prohibits any constructor which have more than 13 parameters. Adding a new one would make it 14. I think that we should do a builder pattern instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could probably just bump up the number of parameter limit in the checkstyle file to 14.
Print statements are temporary. |
Cool, so I fixed the majority of the tests. |
Retest this please. |
@junrao Feel free to leave some comments. I fixed all the failing tests, just need to add some new ones. Will do so today. |
Retest this please. |
Migrating to #7884 . |
Trying to resolve long running issues we have with transactional marker and tombstone retention.