-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid reading Iceberg delete files when not needed #13395
Avoid reading Iceberg delete files when not needed #13395
Conversation
ReaderPageSource dataPageSource = readerPageSourceWithRowPositions.getReaderPageSource(); | ||
|
||
if (dataPageSource.get().isFinished()) { | ||
return new EmptyPageSource(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? add a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched the approach here to just wrap the DeleteFilter reading in a Supplier. I think that reads better
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/delete/DeleteFile.java
Outdated
Show resolved
Hide resolved
deleteFile.lowerBounds().entrySet().stream().collect(toImmutableMap(Map.Entry::getKey, entry -> entry.getValue().array())); | ||
Map<Integer, byte[]> upperBounds = deleteFile.upperBounds() == null ? | ||
ImmutableMap.of() : | ||
deleteFile.upperBounds().entrySet().stream().collect(toImmutableMap(Map.Entry::getKey, entry -> entry.getValue().array())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.array()
do we need to make a defensive copy of these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. Added a call to clone
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/delete/DeleteFile.java
Outdated
Show resolved
Hide resolved
d3b7369
to
43369ad
Compare
Applied comments in fixup commit, thanks @findepi |
b676542
to
716b527
Compare
squashed |
@alexjo2144 can you please rebase? |
Parqet only. Skip reading the delete files associated with a data file if the deletes are not relevant. This can happen when the statistics from the data file already show the split can be skipped. Additionally, this can happen when the line numbers read by the split are known and can be used to filter positional deletes.
716b527
to
6df8a8b
Compare
Description
Parqet only.
Skip reading the delete files associated with a data file if the deletes are
not relevant. This can happen when the statistics from the data file already
show the split can be skipped. Additionally, this can happen when the line
numbers read by the split are known and can be used to filter positional
deletes.
Performance improvement
Iceberg connector
Minimize I/O operations
Related issues, pull requests, and links
#13219
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
(x) No release notes entries required.
( ) Release notes entries required with the following suggested text: