-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API, Core: Add data file reference to DeleteFile #11443
Conversation
@@ -448,6 +449,12 @@ public Object get(int pos) { | |||
return wrapped.equalityFieldIds(); | |||
case 15: | |||
return wrapped.sortOrderId(); | |||
case 16: | |||
if (wrapped instanceof DeleteFile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of this instanceof
given that it is called in a tight loop. We need it as IndexedDataFile
is used for both data and deletes. At the same time, I am not sure it is reasonable to add IndexedDeleteFile
for the sake of handling 3 different fields.
Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would define referencedDataFile
on ContentFile
and have DataFile
always return null
. That's how we handle content
in v1/v2. DataFile
returns a constant value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to avoid unnecessary methods in DataFile
. The same goes for content offset and size, which will only be applicable to deletes. Let me think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better not to expose these methods on DataFile
:
- Avoids reliance on validation outside of this block to ensure these fields are null for data files. We had bugs before that allowed persisting equality field IDs for data files.
- Avoids unnecessary methods that can be invoked but have no meaning, which we have to explain.
@@ -339,6 +344,9 @@ protected <T> void internalSet(int pos, T value) { | |||
this.sortOrderId = (Integer) value; | |||
return; | |||
case 17: | |||
this.referencedDataFile = value != null ? value.toString() : null; | |||
return; | |||
case 18: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the ordinal of fileOrdinal
. This normally doesn't matter as we read with a projection. We also changed it a couple of times before. Anything I missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as both reads and writes are updated, it should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for confirming!
Thanks for reviewing, @rdblue @amogh-jahagirdar! |
This PR adds
referenced_data_file
to our delete metadata as per the design doc for improving position deletes.This PR cannot be merged until the vote is closed. This work is part of #11122.