RLE and dictionary filter only enabled for UTF8 since format version 17. #3868
This fixes an issue in the filter pipeline where we should only skip offsets unfiltering for RLE/dictionary filters for UTF8 strings starting at version 17.

---
TYPE: IMPROVEMENT
DESC: RLE and dictionary filter only enabled for UTF8 since format version 17.
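For readers skimming the thread, the version gate described above can be sketched roughly as follows. This is a minimal illustration, not the code changed in this PR: `FilterType`, `kUtf8StringFilterMinVersion`, and `utf8_skips_offsets_unfiltering` are hypothetical names, and the sketch only covers UTF8 string attributes (other datatypes are out of scope here).

```cpp
#include <cstdint>

// Hypothetical stand-in for the library's filter enum (assumption).
enum class FilterType { NONE, RLE, DICTIONARY, OTHER };

// Per the PR description, RLE/dictionary are only enabled for UTF8 strings
// starting at format version 17. The constant name is an assumption.
constexpr uint32_t kUtf8StringFilterMinVersion = 17;

// Returns true when offsets unfiltering may be skipped for a UTF8 string
// attribute: the filter must be RLE or dictionary AND the array must have
// been written with format version 17 or later.
bool utf8_skips_offsets_unfiltering(FilterType filter, uint32_t format_version) {
  const bool string_filter =
      filter == FilterType::RLE || filter == FilterType::DICTIONARY;
  return string_filter && format_version >= kUtf8StringFilterMinVersion;
}
```

With a gate like this, a UTF8 attribute in an array written before format 17 always goes through the regular offsets unfiltering path, even if an RLE or dictionary filter is present in its pipeline.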
Thanks for the fix @KiterLuc! Added a backport label for 2.14, but I wanted to check with @ihnorton: are we planning some maintenance release for 2.13 too? The `reader_base.cc` part of this bugfix is a zero day bug for RLE/dict when we have nullable string attributes, so it would need to be backported to every maintenance release we are planning to issue.
We only enabled this for 2.14, right? Is 2.13 affected? A further question: we didn't allow creation of UTF-8 attributes with RLE or Dictionary before format 17, so while the safety check is needed, is it actually possible for someone to have an array where this would cause a problem?
first question: second question: |
We have no choice but to make those two changes at once. Otherwise the back compat tests will not pass. This was actually found by the back compat tests, which include an RLE UTF8 nullable attribute. The back compat tests would not pass today if we were to generate back compat arrays for format 17/18, which is why I didn't add a unit test for it: we technically have one already. I have also checked that the back compat arrays prior to format 17 don't RLE encode UTF8, so we are good there.
…17. (#3868) This fixes an issue in the filter pipeline where we should only skip offsets unfiltering for RLE/dictionary filters for UTF8 strings starting at version 17. --- TYPE: IMPROVEMENT DESC: RLE and dictionary filter only enabled for UTF8 since format version 17.
…17. (#3868) (#3871) This fixes an issue in the filter pipeline where we should only skip offsets unfiltering for RLE/dictionary filters for UTF8 strings starting at version 17. --- TYPE: IMPROVEMENT DESC: RLE and dictionary filter only enabled for UTF8 since format version 17. Co-authored-by: KiterLuc <[email protected]>