physical-plan: Cast nested group values back to dictionary if necessary #12586
c1a9dae to 4adf217
Thanks for the contribution @brancz -- I have a suggestion for how to simplify the code. Let me know if it makes sense.
if expected.is_nested() && needs_nested_dictionary_encoding(expected, array)?
{
    *array = dictionary_encode_nested(array.clone(), expected)?;
}
It seems to me like dictionary_encode_nested would work on non-nested types as well, and thus you could simply always call it (and avoid the special case for Dict above too).
That would make this code simpler and easier to follow. You can probably avoid the clone if you wanted (not a huge deal given it is an Arc):
- if expected.is_nested() && needs_nested_dictionary_encoding(expected, array)?
- {
-     *array = dictionary_encode_nested(array.clone(), expected)?;
- }
+ *array = dictionary_encode_nested(&array, expected)?;
Fair enough! I was optimizing for not attempting to dictionary encode a bit prematurely. Changed it!
4888aac to d0e908a
Fixed all lints. Can you re-run tests?
BTW the code looks 👨‍🍳 to me -- once the CI passes I think this will be good to go. Also, once we have merged this, CI will run automatically on your future PRs.
Awesome! 🥳 We have more things on our list that we'll work off over the next weeks/months, so that'll come in handy!
d0e908a to 2894523
Huh weird, that last lint didn't fail for me locally, fixed it.
2894523 to 2e86402
I'm really sorry, I think now it should truly be the last time that you need to re-run. I fixed it exactly the way clippy suggested.
Got it!
🚀 thank you very much @brancz 🙏
here is hoping that the next few PRs will go more smoothly!
Which issue does this PR close?
Closes #12542
Rationale for this change
Non-nested arrays were already being dictionary encoded again if the input was dictionary encoded, so this just replicates that behavior for structs and lists.
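To make the idea concrete, here is a minimal sketch in plain Rust. It is a toy model only: the `Type` and `Array` enums and the `encode_to_match`, `dictionary_encode`, and `utf8` helpers are hypothetical stand-ins invented for illustration, not the Arrow or DataFusion API and not the code in this PR. It shows one recursive function re-encoding plain values as a dictionary wherever the expected type asks for one, whether at the top level or inside a struct or list.

```rust
use std::collections::HashMap;

// Toy stand-ins for illustration only; NOT the Arrow/DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Utf8,
    Dictionary(Box<Type>),
    List(Box<Type>),
    Struct(Vec<Type>),
}

#[derive(Debug, Clone, PartialEq)]
enum Array {
    Utf8(Vec<String>),
    // Each key is an index into `values`.
    Dictionary { keys: Vec<usize>, values: Box<Array> },
    // `offsets` delimit the child rows inside `values`.
    List { offsets: Vec<usize>, values: Box<Array> },
    // One child array per struct field.
    Struct(Vec<Array>),
}

// Hypothetical helper: build a plain string array from literals.
fn utf8(items: &[&str]) -> Array {
    Array::Utf8(items.iter().map(|s| s.to_string()).collect())
}

// Hypothetical helper: dictionary-encode plain strings, assigning keys
// in order of first appearance.
fn dictionary_encode(items: &[String]) -> Array {
    let mut values: Vec<String> = Vec::new();
    let mut index: HashMap<&str, usize> = HashMap::new();
    let mut keys = Vec::with_capacity(items.len());
    for item in items {
        let key = *index.entry(item.as_str()).or_insert_with(|| {
            values.push(item.clone());
            values.len() - 1
        });
        keys.push(key);
    }
    Array::Dictionary { keys, values: Box::new(Array::Utf8(values)) }
}

// Recursively convert `array` so that its shape matches `expected`.
// The same function handles nested and non-nested inputs.
fn encode_to_match(array: &Array, expected: &Type) -> Result<Array, String> {
    match (array, expected) {
        // Plain values where a dictionary is expected: encode them.
        (Array::Utf8(items), Type::Dictionary(inner)) if matches!(**inner, Type::Utf8) => {
            Ok(dictionary_encode(items))
        }
        // Already a dictionary: keep the keys, recurse into the values.
        (Array::Dictionary { keys, values }, Type::Dictionary(inner)) => Ok(Array::Dictionary {
            keys: keys.clone(),
            values: Box::new(encode_to_match(values, inner)?),
        }),
        (Array::Utf8(_), Type::Utf8) => Ok(array.clone()),
        // Lists: keep the offsets, recurse into the child values.
        (Array::List { offsets, values }, Type::List(inner)) => Ok(Array::List {
            offsets: offsets.clone(),
            values: Box::new(encode_to_match(values, inner)?),
        }),
        // Structs: recurse into each field.
        (Array::Struct(cols), Type::Struct(fields)) if cols.len() == fields.len() => {
            let cols = cols
                .iter()
                .zip(fields)
                .map(|(col, field)| encode_to_match(col, field))
                .collect::<Result<Vec<_>, _>>()?;
            Ok(Array::Struct(cols))
        }
        _ => Err(format!("cannot convert {:?} to {:?}", array, expected)),
    }
}
```

In this toy model, calling `encode_to_match` on a struct whose expected field type is `Dictionary(Utf8)` rewrites just that field, and calling it on a top-level string array behaves the same way, so no separate nested and non-nested code paths are needed.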
What changes are included in this PR?
Note that this may not be exhaustive: I've only included lists and structs, since those are the types that I need and that I'm most familiar with. I think it'd be nice to get these changes in, though, and handle any further cases in separate PRs.
Are these changes tested?
Yes, added a unit test that tests precisely the scenario that led me to opening the issue.
Are there any user-facing changes?
No, only a bug fix. If such queries had not panicked before, this could have been a change in behavior, but since all queries of this sort panicked before, it's purely a bug fix.
@alamb @tustvold @andygrove