Iceberg: support row-level delete and update #8565
Conversation
@jackye1995 can you please add a product test that would assert compatibility between Trino and Spark? (Line 1911 in 1364fbb)
If I wanted to try this out, I'd need to create an Iceberg table adhering to the Iceberg Format Specification V2, since you are proposing using delete snapshots, right? And should we bump Iceberg to 0.12 (that version has the final V2 spec)?
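For context on the question above: opting a table into the V2 spec is done through the `format-version` table property, which can be passed to `Catalog.createTable` in the Iceberg API. A minimal stdlib-only sketch of just that property (the class name and map are illustrative, not Iceberg code):

```java
import java.util.Map;

// Hypothetical sketch: "format-version" is the Iceberg table property that
// opts a table into the V2 spec (finalized in Iceberg 0.12), which is what
// enables row-level position/equality deletes.
public class FormatVersionExample
{
    public static Map<String, String> v2TableProperties()
    {
        // In practice this map would be passed to Catalog.createTable(...)
        return Map.of("format-version", "2");
    }

    public static void main(String[] args)
    {
        System.out.println(v2TableProperties().get("format-version")); // prints 2
    }
}
```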
cc @alexjo2144
@@ -112,7 +112,7 @@ private RowBlock(int startOffset, int positionCount, @Nullable boolean[] rowIsNull,
     }

     @Override
-    protected Block[] getRawFieldBlocks()
+    public Block[] getRawFieldBlocks()
Wonder why this is needed, and whether this is actually used correctly.
public static IcebergColumnHandle createUpdateRowIdColumnHandle(Schema tableSchema, TypeManager typeManager)
{
    return create(required(ROW_ID_COLUMN_INDEX, ROW_ID_COLUMN_NAME, DeleteSchemaUtil.posDeleteSchema(tableSchema).asStruct()), typeManager);
Is it used for deletes only, or for updates as well?
serializeToBytes(table.schema()),
serializeToBytes(table.spec()),
I think there was already an idea to add the schema to IcebergTableHandle, and it was rejected (?) for some reason.
@phd3, do you remember?
}
else {
    Schema posDeleteSchema = DeleteSchemaUtil.posDeleteSchema(table.getSchema());
    ConnectorPageSink posDeleteSink = new IcebergPageSink(
Suggestion: rename to positionalDeletesSink.
private final List<IcebergColumnHandle> allTableColumns;
private final List<IcebergColumnHandle> updateColumns;
private final ConnectorPageSource source;
private final ConnectorPageSink posDeleteSink;
Suggestion: rename to positionalDeletesSink.
        FileContent.POSITION_DELETES,
        maxOpenPartitions);

ConnectorPageSink updateRowSink = new IcebergPageSink(
Suggestion: rename to updatedDataSink.
private final List<IcebergColumnHandle> updateColumns;
private final ConnectorPageSource source;
private final ConnectorPageSink posDeleteSink;
private final ConnectorPageSink updateRowSink;
Suggestion: rename to updatedDataSink.
}

Block[] updatedRows = new Block[allTableColumns.size()];
Block[] oldRows = ((RowBlock) rowIdBlock.getRawFieldBlocks()[2]).getRawFieldBlocks();
The cast to RowBlock isn't entirely correct. See #9354; perhaps we should use ColumnarRow here.
cc @djsstarburst
resultBlocks[i] = RowBlock.fromFieldBlocks(pageSize, Optional.empty(), rowIdComponentBlocks);
}
else {
    resultBlocks[i] = sourcePage.getBlock(allTableColumns.indexOf(columnHandle));
The indexOf use here looks quadratic, and we seem to be doing this for every page.
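The usual fix for this pattern is to precompute a column-to-channel map once at construction time, so each page pays O(1) per lookup instead of a linear `indexOf` scan. A stdlib-only sketch (not Trino code; `String` stands in for `IcebergColumnHandle`):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: replace repeated List.indexOf calls, which make block
// assembly O(columns^2) per page, with a map built once up front.
public class ChannelIndex
{
    private final Map<String, Integer> channelByColumn = new HashMap<>();

    public ChannelIndex(List<String> allTableColumns)
    {
        for (int i = 0; i < allTableColumns.size(); i++) {
            channelByColumn.put(allTableColumns.get(i), i);
        }
    }

    // O(1) per lookup instead of O(columns) with indexOf
    public int channelOf(String column)
    {
        return channelByColumn.get(column);
    }

    public static void main(String[] args)
    {
        ChannelIndex index = new ChannelIndex(List.of("id", "name", "ts"));
        System.out.println(index.channelOf("ts")); // prints 2
    }
}
```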
Closing in favor of #10075.
This PR adds support for writing Iceberg position deletes. Similar to #8534, I first present our working internal implementation backported to Trino; some parts might not work because of internal differences, but once we agree on the general approach I will make the fixes and add unit tests.

There is also a missing piece that has to be added after #8534 is merged, so that the IcebergPageSource has the ability to retain the row position channel and pass it to the updatable page source.

A few key points:
- The row ID column has type ROW(string file_path, long pos, row(table schema)), which matches Iceberg's position delete file schema.
- Delete and update share the beginXXX and finishXXX operation implementation. The only difference is that update writes new data files after writing the delete files, because update in Iceberg is modelled as delete + insert.

This is a bare-minimum backport. I left some inline TODOs, and there are many optimizations we can make after the base version is checked in; I tried to keep this as simple as possible to avoid too many disagreements around optimization-related changes. Please let me know if this looks good or not, thanks!
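To make the delete + insert model concrete: a position delete file records (file_path, pos) pairs, a reader skips any data row whose position is marked deleted, and an update then appends the new row versions as ordinary data files. A stdlib-only sketch of that read-side semantics (the class, record, and method names are illustrative, not Trino or Iceberg APIs):

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of position delete semantics, not real Iceberg code.
public class PositionDeleteSketch
{
    // Mirrors the (file_path, pos) prefix of the row ID shape described above
    record PositionDelete(String filePath, long pos) {}

    record DataRow(String filePath, long pos, String value) {}

    // A reader applies position deletes by skipping every data row whose
    // (file_path, pos) appears in a delete file.
    static List<DataRow> applyDeletes(List<DataRow> rows, Set<PositionDelete> deletes)
    {
        return rows.stream()
                .filter(row -> !deletes.contains(new PositionDelete(row.filePath(), row.pos())))
                .toList();
    }

    public static void main(String[] args)
    {
        List<DataRow> rows = List.of(
                new DataRow("data-1.parquet", 0, "old"),
                new DataRow("data-1.parquet", 1, "kept"));
        // "Update" of row 0: delete it by position; the new version would
        // then be appended as a new data file (not shown here).
        Set<PositionDelete> deletes = Set.of(new PositionDelete("data-1.parquet", 0));
        System.out.println(applyDeletes(rows, deletes).size()); // prints 1
    }
}
```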
@phd3 @electrum @findepi @losipiuk @caneGuy @rdblue @hashhar