Fix applyFilter when an Iceberg table does not have any snapshots #13576
Tables created by Spark may not have a snapshot committed if they are newly created, empty tables.
Does this indicate that it would be better to fix the CREATE TABLE logic in the connector?
This would have the benefit of improving test coverage, and support, for this case.
  .map(ManifestFile::partitionSpecId)
- .collect(toImmutableSet());
+ .collect(toImmutableSet()))
+ .orElseGet(() -> ImmutableSet.copyOf(icebergTable.specs().keySet()));
- .orElseGet(() -> ImmutableSet.copyOf(icebergTable.specs().keySet()));
+ // No snapshot, so no data. This case doesn't matter.
+ .orElseGet(() -> ImmutableSet.copyOf(icebergTable.specs().keySet()));
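For readers outside the codebase, the fallback under review can be sketched in isolation. This is a minimal illustration, not the connector's code: `Manifest`, `Snapshot`, and `Table` are hypothetical stand-ins for the Iceberg types, JDK collections replace Guava's `ImmutableSet`, and it assumes a table with no committed snapshot reports its current snapshot as `null`.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

public class SnapshotFallbackSketch {
    // Hypothetical stand-ins for the Iceberg manifest, snapshot, and table types.
    public record Manifest(int partitionSpecId) {}
    public record Snapshot(List<Manifest> manifests) {}
    public record Table(Snapshot currentSnapshot, Map<Integer, String> specs) {}

    // Collect the partition spec IDs referenced by the current snapshot's manifests.
    // If the table has no snapshot at all (assumed here to be a null current snapshot),
    // fall back to every spec ID the table defines instead of failing.
    public static Set<Integer> partitionSpecIds(Table table) {
        return Optional.ofNullable(table.currentSnapshot())
                .map(snapshot -> snapshot.manifests().stream()
                        .map(Manifest::partitionSpecId)
                        .collect(Collectors.toUnmodifiableSet()))
                .orElseGet(() -> Set.copyOf(table.specs().keySet()));
    }

    public static void main(String[] args) {
        // Newly created, empty table: no snapshot has been committed yet.
        Table empty = new Table(null, Map.of(0, "unpartitioned"));
        // Table with one snapshot whose manifests all use spec 1.
        Table populated = new Table(
                new Snapshot(List.of(new Manifest(1), new Manifest(1))),
                Map.of(0, "unpartitioned", 1, "by_day"));

        System.out.println(partitionSpecIds(empty));
        System.out.println(partitionSpecIds(populated));
    }
}
```

Since a table without a snapshot has no data files, which spec IDs the fallback returns should not affect query results; the point is only that the lookup no longer fails on the missing snapshot.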
13212c5 to 7190dae
Sorry, my comment was unclear. My question was whether, after merging this PR as-is, we will change the Iceberg connector's CREATE TABLE logic to match Spark's behavior.
Both are valid, going strictly by the specification. I don't know that matching the Spark behavior buys us anything besides making it easier to test this edge case where there are no snapshots in the history.
That's the sole reason I'd consider doing this. However, I actually value the fact that the table history (snapshots) contains an entry for the initial empty state. This is indeed part of the table history.
Description
Tables created by Spark may not have a snapshot committed if they are newly created, empty tables.
Fix
Iceberg connector
Fix handling of querying newly created, empty tables.
Related issues, pull requests, and links
Introduced by: #13239
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: