colexec: hash aggregator doesn't maintain the partial ordering when spilling to disk #63159
Comments
I think that fixing this on the execution side will be too invasive and could be error-prone because we use the same component. So I think that, at least in the short term, it is better to fix it from the optimizer side. One idea that was mentioned is to use a segmented sort + streaming aggregation in this case or, alternatively, to plan a general sort after the hash aggregation. cc @rytaft @RaduBerinde
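To make the "segmented sort + streaming aggregation" idea concrete, here is a minimal standalone sketch in Go. It is not CockroachDB code: the `row` and `group` types and the `segmentedSort` and `streamingAgg` functions are hypothetical names used only for illustration. It assumes the input is already ordered on column `a` and the query groups by `(a, b)`.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical input row: grouping columns a and b, value v.
type row struct{ a, b, v int }

// group is one output row of SUM(v) GROUP BY a, b.
type group struct{ a, b, sum int }

// segmentedSort assumes rows are already ordered on a. It sorts each run
// ("segment") of rows with equal a by b, so the result is ordered on (a, b).
func segmentedSort(rows []row) {
	for start := 0; start < len(rows); {
		end := start
		for end < len(rows) && rows[end].a == rows[start].a {
			end++
		}
		seg := rows[start:end]
		sort.SliceStable(seg, func(i, j int) bool { return seg[i].b < seg[j].b })
		start = end
	}
}

// streamingAgg computes SUM(v) GROUP BY a, b over input ordered on (a, b).
// It only ever compares a row with the most recent group, so the output
// keeps the input ordering.
func streamingAgg(rows []row) []group {
	var out []group
	for _, r := range rows {
		if n := len(out); n > 0 && out[n-1].a == r.a && out[n-1].b == r.b {
			out[n-1].sum += r.v
			continue
		}
		out = append(out, group{a: r.a, b: r.b, sum: r.v})
	}
	return out
}

func main() {
	// Ordered on a, but not on b within each segment.
	rows := []row{{1, 2, 10}, {1, 1, 20}, {1, 2, 30}, {2, 3, 40}, {2, 1, 50}}
	segmentedSort(rows)
	fmt.Println(streamingAgg(rows)) // [{1 1 20} {1 2 40} {2 1 50} {2 3 40}]
}
```

Only one segment has to be sorted at a time, which is why this can be cheaper than a general sort of the whole input.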
I guess another idea would be to fall back to the row-by-row processor in such a case, but I think that would be quite unfortunate, and I would treat it as the last resort. The advantage of this approach is that it'll be a very small change (like 3 lines of code).
Another option is to plan an explicit external sort to restore the partial ordering not maintained by the hash aggregator when it spills to disk. This might be a bit finicky when the columns from the partial ordering are not output by the aggregator: we'll need to insert "fake" any_not_null aggregates for those and then project them out.
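A rough sketch of that plan shape, written as standalone Go rather than actual planner or colexec code. All names here (`aggRow`, `spilledAgg`, `sortAndProject`) are hypothetical, and the two-partition toy hash only stands in for whatever order a spilled aggregator happens to emit. The aggregator carries the ordering column `a` through an extra any_not_null-style field, the output is sorted on it, and the field is then projected away.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical input row: grouping columns a and b, value v.
type row struct{ a, b, v int }

// aggRow is an aggregator output row. anyNotNullA is the extra ("fake")
// aggregate that carries the ordering column; sum is what the query asked for.
type aggRow struct{ anyNotNullA, sum int }

type key struct{ a, b int }

// spilledAgg computes SUM(v) GROUP BY a, b partition by partition, so the
// output is not ordered on a. Assumes non-negative a and b (toy hash).
func spilledAgg(rows []row) []aggRow {
	const numPartitions = 2
	var parts [numPartitions][]row
	for _, r := range rows {
		p := (r.a + r.b) % numPartitions // toy hash of the grouping key
		parts[p] = append(parts[p], r)
	}
	var out []aggRow
	for _, part := range parts {
		idx := map[key]int{}
		for _, r := range part {
			k := key{r.a, r.b}
			i, ok := idx[k]
			if !ok {
				i = len(out)
				idx[k] = i
				out = append(out, aggRow{anyNotNullA: r.a})
			}
			out[i].sum += r.v
		}
	}
	return out
}

// sortAndProject restores the ordering on a with an explicit sort and then
// projects the helper column out, leaving only the sums.
func sortAndProject(groups []aggRow) []int {
	sort.SliceStable(groups, func(i, j int) bool {
		return groups[i].anyNotNullA < groups[j].anyNotNullA
	})
	sums := make([]int, len(groups))
	for i, g := range groups {
		sums[i] = g.sum
	}
	return sums
}

func main() {
	rows := []row{{1, 1, 10}, {1, 2, 20}, {2, 1, 30}, {2, 2, 40}, {3, 1, 50}}
	fmt.Println(sortAndProject(spilledAgg(rows))) // [10 20 40 30 50]
}
```

The extra field and the final projection are the parts that make this approach finicky: both exist only to support the sort and must not leak into the query result.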
That sounds error-prone. I agree that the cleanest solution is in the optimizer; I will work on it.
Thank you @RaduBerinde!
Currently, the vectorized hash aggregator doesn't maintain the partial ordering if it has to spill to disk. Consider the following logic test, which will fail on the fakedist-disk config:
The issue is present only on 21.1, since before this release we didn't have disk spilling support. There are several possible ways to mitigate this problem, and as the first step I will look into supporting the partial ordering in the external hash aggregator.
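To illustrate why spilling can lose the ordering, here is a small standalone Go sketch. It is not the logic test referred to above and not the colexec implementation: the types, the function names, and the two-partition toy hash are all assumptions made for the example. An in-memory hash aggregation that emits groups in first-seen order keeps an input ordering on `a`, while aggregating hash partitions one after another does not.

```go
package main

import "fmt"

// row is a hypothetical input row: grouping columns a and b, value v.
type row struct{ a, b, v int }

// group is one output row of SUM(v) GROUP BY a, b.
type group struct{ a, b, sum int }

type key struct{ a, b int }

// inMemoryAgg emits groups in first-seen order, so an input ordered on a
// produces an output ordered on a.
func inMemoryAgg(rows []row) []group {
	idx := map[key]int{}
	var out []group
	for _, r := range rows {
		k := key{r.a, r.b}
		i, ok := idx[k]
		if !ok {
			i = len(out)
			idx[k] = i
			out = append(out, group{a: r.a, b: r.b})
		}
		out[i].sum += r.v
	}
	return out
}

// partitionedAgg mimics spilling: rows are split into partitions by a toy
// hash of the grouping key, and each partition is aggregated in turn.
// Assumes non-negative a and b.
func partitionedAgg(rows []row, numPartitions int) []group {
	parts := make([][]row, numPartitions)
	for _, r := range rows {
		p := (r.a + r.b) % numPartitions
		parts[p] = append(parts[p], r)
	}
	var out []group
	for _, part := range parts {
		out = append(out, inMemoryAgg(part)...)
	}
	return out
}

// orderedOnA reports whether the groups are in non-decreasing order of a.
func orderedOnA(gs []group) bool {
	for i := 1; i < len(gs); i++ {
		if gs[i-1].a > gs[i].a {
			return false
		}
	}
	return true
}

func main() {
	rows := []row{{1, 1, 10}, {1, 2, 20}, {2, 1, 30}, {2, 2, 40}, {3, 1, 50}}
	fmt.Println(orderedOnA(inMemoryAgg(rows)))       // true
	fmt.Println(orderedOnA(partitionedAgg(rows, 2))) // false
}
```

In the partitioned case the values of `a` come out as 1, 2, 3, 1, 2: each partition is ordered on its own, but their concatenation is not.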