[BUG] Aggregations in ANSI mode do not detect overflows #3424
Comments
I went through ANSI mode in Spark 3.2.0-rc3 to find all of the operators that check the ANSI mode config. I then looked at the operators that we already support, and the following are the results. N/A means we don't support the operator but it checks ANSI mode. For the Aggregations, I pulled out just the ones that we support and started to look through them to see if there was anything special about them.
Does it mean that for all non-N/A operators there's work to do (since we do implement them, but we are likely not checking the ANSI mode flag)?
More or less. I have not gone through and done an audit on what our code is doing yet. I will start working through it and updating things. I will probably just have us fall back to the CPU for all of these initially in ANSI mode, and then file follow-on issues to have us clean it up so we are doing as much of the right thing as we can for each operator.
I'll take the aggregates for now, maybe just preventing them from running on the GPU in ANSI mode.
I will take all of the arithmetic ones.
I will start to work on the Collection Operations and Complex Type Extractors now.
Never mind, those are all done. I'll start to look at Date Time Expressions instead.
It looks like everything is in except for Aggregations right now, so I am going to assign this to @abellina, who is working on that code.
Describe the bug
We recently hit a bug in the tests where ANSI mode was being leaked #3423. That was fixed, but it should have failed on the GPU too, not only on the CPU. If we are in ANSI mode we really should fall back to the CPU for any aggregation that can overflow. This is going to be bad for anyone using ANSI mode, but until we can get help from cudf to support this type of thing we are not going to be able to support it.
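A minimal sketch of the fallback rule described above, written in Python purely for illustration. The names `OVERFLOW_PRONE_AGGS` and `should_fall_back_to_cpu` are hypothetical and are not part of the plugin, and the set of aggregations treated as able to overflow is an assumption. `spark.sql.ansi.enabled` is the Spark config that turns on ANSI mode; the sketch treats it as off when unset, which is assumed to match the Spark 3.x default.

```python
# Hypothetical: aggregations assumed to be able to overflow on integral input.
# A real audit would decide this list per aggregation and per input type.
OVERFLOW_PRONE_AGGS = {"sum"}


def should_fall_back_to_cpu(conf, agg_name):
    """Return True if this aggregation should stay on the CPU.

    conf is a plain dict standing in for the session configuration.
    """
    raw = conf.get("spark.sql.ansi.enabled", "false")
    ansi_enabled = str(raw).lower() == "true"
    # Only fall back when ANSI mode is on AND the aggregation can overflow.
    return ansi_enabled and agg_name.lower() in OVERFLOW_PRONE_AGGS
```

With this rule, `sum` falls back only when ANSI mode is enabled, while an aggregation outside the set, such as `count`, is unaffected.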
There is an added problem: because the aggregations happen in different orders on the GPU, even from one run to the next, we might overflow on one run and not on another if the data includes both negative and positive values. If we do implement overflow checking, we are going to need to think this through very carefully.
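The ordering problem can be shown with a small sketch, assuming a 64-bit signed sum that raises on overflow at every step. This is plain Python that emulates checked long addition; the helper names `checked_add`, `ansi_sum`, and `overflows` are made up for the example and do not reflect how Spark or cudf implement it.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def checked_add(a, b):
    """Add two values, raising if the result does not fit in a signed 64-bit long."""
    result = a + b
    if result < INT64_MIN or result > INT64_MAX:
        raise ArithmeticError("long overflow")
    return result


def ansi_sum(values):
    """Sum values left to right, checking for overflow after every addition."""
    total = 0
    for v in values:
        total = checked_add(total, v)
    return total


def overflows(values):
    """Return True if summing in the given order trips the overflow check."""
    try:
        ansi_sum(values)
        return False
    except ArithmeticError:
        return True


# Same three values, two different orders.
order_a = [INT64_MAX, 1, -1]   # INT64_MAX + 1 exceeds the range first
order_b = [INT64_MAX, -1, 1]   # the running total never leaves the range
```

Here `overflows(order_a)` is true while `overflows(order_b)` is false, even though both lists hold the same values and the exact sum fits in a long. Any overflow check that looks at intermediate results inherits this dependence on evaluation order.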