Core: Fixed certain operations failing to add new data files during retries #9230
Conversation
Assert.assertTrue("Should commit same new manifest", new File(newManifest.path()).exists()); | ||
Assert.assertTrue( | ||
"Should commit the same new manifest", | ||
metadata.currentSnapshot().allManifests(FILE_IO).contains(newManifest)); |
This isn't actually testing that the manifest is reused across commit attempts, because it came from `apply`, which was called after the latest failure (without a cleanup operation). Instead of manually exercising this path, why not use a transaction here and fail just once to validate?
Fixed this to not refresh the manifest. Can you clarify what you mean about failing just once? It's significant for this bug report that the entire `commitTransaction` call fails; the behavior with a single failure and internal retries is different.
```java
Assertions.assertThatThrownBy(append::commit)
    .isInstanceOf(CommitFailedException.class)
    .hasMessage("Injected failure");
```
Looks like this is likely failing the style checks. You can run `./gradlew spotlessApply` to fix it.
Force-pushed from f89a103 to aa9bee7.
@ConeyLiu could you also please take a look at this?
```diff
@@ -895,7 +895,7 @@ private void cleanUncommittedAppends(Set<ManifestFile> committed) {
       }
     }

-    this.cachedNewDataManifests = committedNewDataManifests;
+    this.cachedNewDataManifests = null;
```
Rather than setting this to `null`, I think a better approach would be to handle this exactly like it's done for `cachedNewDeleteManifests`, which is initialized as a linked list and then iterated over while files are removed.
I did a quick check with

```diff
--- a/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
+++ b/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
@@ -94,7 +94,7 @@ abstract class MergingSnapshotProducer<ThisT> extends SnapshotProducer<ThisT> {
   private PartitionSpec dataSpec;

   // cache new data manifests after writing
-  private List<ManifestFile> cachedNewDataManifests = null;
+  private final List<ManifestFile> cachedNewDataManifests = Lists.newLinkedList();
   private boolean hasNewDataFiles = false;

   // cache new manifests for delete files
@@ -885,17 +885,15 @@ abstract class MergingSnapshotProducer<ThisT> extends SnapshotProducer<ThisT> {
   }

   private void cleanUncommittedAppends(Set<ManifestFile> committed) {
-    if (cachedNewDataManifests != null) {
-      List<ManifestFile> committedNewDataManifests = Lists.newArrayList();
-      for (ManifestFile manifest : cachedNewDataManifests) {
-        if (committed.contains(manifest)) {
-          committedNewDataManifests.add(manifest);
-        } else {
-          deleteFile(manifest.path());
+    if (!cachedNewDataManifests.isEmpty()) {
+      ListIterator<ManifestFile> dataManifestsIterator = cachedNewDataManifests.listIterator();
+      while (dataManifestsIterator.hasNext()) {
+        ManifestFile dataManifest = dataManifestsIterator.next();
+        if (!committed.contains(dataManifest)) {
+          deleteFile(dataManifest.path());
+          dataManifestsIterator.remove();
         }
       }
-
-      this.cachedNewDataManifests = null;
     }

     ListIterator<ManifestFile> deleteManifestsIterator = cachedNewDeleteManifests.listIterator();
@@ -950,12 +948,12 @@ abstract class MergingSnapshotProducer<ThisT> extends SnapshotProducer<ThisT> {
   }

   private List<ManifestFile> newDataFilesAsManifests() {
-    if (hasNewDataFiles && cachedNewDataManifests != null) {
+    if (hasNewDataFiles && !cachedNewDataManifests.isEmpty()) {
       cachedNewDataManifests.forEach(file -> deleteFile(file.path()));
-      cachedNewDataManifests = null;
+      cachedNewDataManifests.clear();
     }

-    if (cachedNewDataManifests == null) {
+    if (cachedNewDataManifests.isEmpty()) {
       try {
         RollingManifestWriter<DataFile> writer = newRollingManifestWriter(dataSpec());
         try {
@@ -968,7 +966,7 @@ abstract class MergingSnapshotProducer<ThisT> extends SnapshotProducer<ThisT> {
           writer.close();
         }

-        this.cachedNewDataManifests = writer.toManifestFiles();
+        this.cachedNewDataManifests.addAll(writer.toManifestFiles());
```

and that seemed to work. We would probably want to use the same approach in `FastAppend` (but I haven't checked that part for `FastAppend`).
@nastra I think this solution is much better and more solid. We should not set `cachedNewDataManifests` to null directly. Consider the following case:

1. `Transaction transaction = table.newTransaction();`
2. `transaction.newAppend().appendFile(...).commit(); // generates a new manifest file`
3. A subsequent transaction operation fails.
4. `transaction.commitTransaction(); // fails to commit`

With the changes in this PR, the new manifest files generated by step 2 will not be deleted in step 4. I think the failing UTs verify this; a compact version of the scenario is sketched below.
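As a test, that sequence might look roughly like this (a sketch, assuming Iceberg's test fixtures: a `TestTables`-backed `table` whose ops expose the `failCommits` helper, plus the usual `FILE_A` constant):

```java
@Test
public void testManifestCleanupOnFailedTransaction() {
  Transaction transaction = table.newTransaction();
  transaction.newAppend().appendFile(FILE_A).commit(); // writes a new manifest file

  // inject enough failures that commitTransaction() exhausts its retries
  table.ops().failCommits(10);

  Assertions.assertThatThrownBy(transaction::commitTransaction)
      .isInstanceOf(CommitFailedException.class);

  // the manifest written by the append must be cleaned up at this point;
  // nulling cachedNewDataManifests before cleanup runs would leak that file
}
```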
Makes sense. I pushed a new commit that uses the linked list like your patch.
@nastra Wouldn't it be safer to set the list to empty at the end of this function? If this list isn't emptied, it can lead to the same bug, since the code will think it doesn't need to produce manifests in the `newDataFilesAsManifests` method.

I don't think this is an issue right now, because I expect all commits to either fail or succeed, so all files should be deleted, but maybe for safety we should just clear it? A sketch follows below.

@ConeyLiu If I understand correctly, the files are only cleared once the transaction/operation fully fails, so it should clear everything at the end, right?
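For concreteness, the defensive variant being proposed would look roughly like this (a sketch of the idea under discussion, not code from the PR):

```java
private void cleanUncommittedAppends(Set<ManifestFile> committed) {
  Iterator<ManifestFile> iterator = cachedNewDataManifests.iterator();
  while (iterator.hasNext()) {
    ManifestFile manifest = iterator.next();
    if (!committed.contains(manifest)) {
      // delete manifests that did not make it into the committed snapshot
      deleteFile(manifest.path());
      iterator.remove();
    }
  }
  // the extra safety step: always empty the cache so the next
  // newDataFilesAsManifests() call rewrites manifests from scratch
  cachedNewDataManifests.clear();
}
```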
```diff
@@ -313,6 +313,37 @@ public void testRecoveryWithoutManifestList() {
         metadata.currentSnapshot().allManifests(FILE_IO).contains(newManifest));
   }

+  @Test
+  public void testRecoveryWithTransaction() {
```
This test doesn't fail. I'm guessing you were planning to update `FastAppend`?
I had a test and fix for `FastAppend` that failed in my first commit; it only fails when not using a transaction. Based on Ryan's comments in Slack I removed that for now (though it's still a bug, since it can break the table), but I still added a new test to cover the transaction case that wasn't covered before.
Force-pushed from 83fe828 to 2f5a847.
During manual retries (calling commit again on a failed change) the operation would partially succeed. For example, `Rewrite` may write only the deletes and not the appends. This commit fixes the issue and adds some tests that fail without the fix.
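The manual-retry pattern being described looks roughly like this (a sketch; `FILE_A_REWRITTEN` is a hypothetical rewritten data file, and whether such a retry is valid API usage is part of the discussion further down):

```java
RewriteFiles rewrite =
    table.newRewrite()
        .deleteFile(FILE_A)          // remove the original data file
        .addFile(FILE_A_REWRITTEN);  // hypothetical replacement file
try {
  rewrite.commit();
} catch (CommitFailedException e) {
  // before this fix, calling commit() again could re-apply the deletes while
  // silently dropping the newly added data files from the retried snapshot
  rewrite.commit();
}
```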
Force-pushed from 2f5a847 to 2650e8f.
Based on further Slack discussions I re-added the `FastAppend` fix. A couple of notes: …
Force-pushed from f92bef9 to d909eea.
Fixed an attempt to delete an already-deleted file during cleanup. This would cause the cleanup to fail early and skip the deletion of all subsequent files.
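One way to make cleanup tolerant of files that are already gone (a sketch of the general technique, not necessarily the exact patch):

```java
// Ignore missing files instead of letting the first failure abort the loop
// and leak every manifest that still needed deleting.
for (ManifestFile manifest : uncommittedManifests) {  // hypothetical local list
  try {
    deleteFile(manifest.path());
  } catch (NotFoundException e) {
    // already removed by an earlier cleanup attempt; keep going
  }
}
```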
Force-pushed from d909eea to 551b5c6.
This is a similar issue to the bugs with manual retries in `SnapshotProducer`, but here the transaction was caching a `TableMetadata` that points to deleted files after cleanup. This change forces the transaction to re-apply the changes, ensuring a new, valid `TableMetadata` is used.
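Roughly, the idea is the following (hypothetical helper name; only an illustration of re-applying rather than reusing cached metadata):

```java
// On retry, rebuild the metadata from the pending updates instead of
// re-committing a cached TableMetadata whose manifests were deleted.
TableMetadata base = ops.refresh();
TableMetadata retried = applyPendingUpdates(base); // hypothetical helper that
                                                   // re-runs each update's apply()
ops.commit(base, retried);
```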
Force-pushed from 551b5c6 to 4170bfd.
@jasonf20, sorry for being unclear in my reply earlier, but I don't think that the tests in this PR reproduce the error. The tests here reproduce similar errors, but do it by manually calling … After looking at the problem more, I was able to produce a test case that used both the transaction and append APIs correctly and lost data. Here's the test:

```java
@Test
public void testTransactionRecommit() {
  // update table settings to merge when there are 3 manifests
  table.updateProperties().set(TableProperties.MANIFEST_MIN_MERGE_COUNT, "3").commit();

  // create manifests so that the next commit will trigger a merge
  table.newFastAppend().appendFile(FILE_A).commit();
  table.newFastAppend().appendFile(FILE_B).commit();

  // start a transaction with appended files that will merge
  Transaction transaction = Transactions.newTransaction(table.name(), table.ops());

  AppendFiles append = transaction.newAppend().appendFile(FILE_D);
  Snapshot pending = append.apply();

  Assert.assertEquals(
      "Should produce 1 pending merged manifest", 1, pending.allManifests(table.io()).size());

  // because a merge happened, the appended manifest is deleted by the append operation
  append.commit();

  // concurrently commit FILE_C without a transaction to cause the previous append to retry
  table.newAppend().appendFile(FILE_C).commit();

  Assert.assertEquals(
      "Should produce 1 committed merged manifest",
      1,
      table.currentSnapshot().allManifests(table.io()).size());

  transaction.commitTransaction();

  Set<String> paths =
      Sets.newHashSet(
          Iterables.transform(
              table.newScan().planFiles(), task -> task.file().path().toString()));
  Set<String> expectedPaths =
      Sets.newHashSet(
          FILE_A.path().toString(),
          FILE_B.path().toString(),
          FILE_C.path().toString(),
          FILE_D.path().toString());

  Assert.assertEquals("Should contain all committed files", expectedPaths, paths);
  Assert.assertEquals(
      "Should produce 2 manifests",
      2,
      table.currentSnapshot().allManifests(table.io()).size());
}
```

The problem happens when two concurrent commits both compact the latest manifests. When that happens, the new data file manifest is removed by the operation cleanup logic because it was not committed to the transaction. Then, when the transaction needs to re-apply the change, the reuse logic notices that the appended file manifests have already been written and reuses them. However, the list is no longer correct because it was filtered.

There are a few things to take away from this. First, this doesn't affect fast appends because that operation never merges manifests. However, it's still incorrect to filter the added manifests list, so we should apply the same fix to …

```diff
diff --git a/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java b/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
index e06a491098..bde92daa4a 100644
--- a/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
+++ b/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
@@ -879,16 +879,20 @@ abstract class MergingSnapshotProducer<ThisT> extends SnapshotProducer<ThisT> {

   private void cleanUncommittedAppends(Set<ManifestFile> committed) {
     if (cachedNewDataManifests != null) {
+      boolean hasDelete = false;
       List<ManifestFile> committedNewDataManifests = Lists.newArrayList();
       for (ManifestFile manifest : cachedNewDataManifests) {
         if (committed.contains(manifest)) {
           committedNewDataManifests.add(manifest);
         } else {
           deleteFile(manifest.path());
+          hasDelete = true;
         }
       }

-      this.cachedNewDataManifests = committedNewDataManifests;
+      if (hasDelete) {
+        cachedNewDataManifests = null;
+      }
     }

     ListIterator<ManifestFile> deleteManifestsIterator = cachedNewDeleteManifests.listIterator();
```

This is going to be better for transactions than always setting the cached manifests to null, because the manifests are usually in the committed list. We also need to check whether similar logic is needed for …
Hi @rdblue. Thanks for the info. Good to know about the manifest compaction conflict case; I was looking for a way the list could be partially cleared, and this answers that. I have pushed another commit returning to my original fix, with the `hasDelete` check from your patch. Some more notes/questions: …

What do you think?
To which test case are you referring? I didn't think any of them were valid uses of the API, because commit was called multiple times and the behavior is undefined.

What is the purpose of …? I think that we should focus on fixing the bug and then discuss cases where …
To the test cases, I added to …

When …

Alright, I pushed a commit that removes … Let me know if this is good and I can squash all the commits when ready to merge.
Thanks, @jasonf20! The current version looks great. I'll commit it when tests are passing. For the issues caused by calling …
…ifest merges (#9230) (#9337) Co-authored-by: Jason <[email protected]>
Since PR #6335, `FastAppend` and subclasses of `MergingSnapshotProducer` will skip newly added data files during retries. This happens because the cached value is set to an empty list instead of null; if commit is called again, the logic in `newDataFilesAsManifests` is skipped and no data files are returned. This commit fixes the issue and adds some tests that fail without the fix.
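In simplified form, the failure mode is the null check below (a sketch condensed from `MergingSnapshotProducer`, with `writeManifestsForNewDataFiles` standing in for the real manifest-writing code):

```java
private List<ManifestFile> newDataFilesAsManifests() {
  // after a failed commit, cleanup left this as an empty (non-null) list,
  // so the null check below wrongly concludes the manifests already exist
  if (cachedNewDataManifests == null) {
    this.cachedNewDataManifests = writeManifestsForNewDataFiles(); // condensed helper
  }
  return cachedNewDataManifests; // returns [] and the new data files are lost
}
```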
See #9222 for an earlier discussion.
Initially, I thought this bug should be solved with validation, but upon further discussion, I am re-submitting this with tests.
Fixes #9227