[VL] result mismatch found in round #6827
Comments
Yes, the round function receives data in double format; double is used throughout upstream. We might have to refactor the codebase and upstream functions to move to something that can handle the higher precision. Thanks.
Thanks!
Hi @jiangjiangtian, was this issue noticed in a runtime workload? How do you evaluate its importance to your workload? Thanks.
I found the issue in a test workload. We have some remaining problems, and this mismatch is one of them.
Hi @jiangjiangtian, here are my thoughts. If we cast the double to decimal before rounding, how do we make sure we obtain the expected decimal? I believe the problem lies with the number … To solve this issue ultimately, I agree with @ArnavBalyan: we need to map the Spark double type to a higher-precision type in C++.
Thanks for your reply!
@jiangjiangtian Would you like to open a PR to add a simple UT for the issue? We could mark it as ignored until we actually fix it.
Given that Spark uses … Although the constructor … @ArnavBalyan @jiangjiangtian, please share your thoughts. Thanks.
@rui-mo Thanks for your investigation. |
I did some tests on the Java BigDecimal setScale API but found it could also give unexpected results, e.g., round(0.575, 2) is 0.57.
Adding a mismatch case:
Gluten 1.2.0 with Velox:
vanilla Spark:
Backend
VL (Velox)
Bug description
SQL:
Gluten returns 0.56, but vanilla Spark returns 0.55.
The reason for this mismatch is that the result of std::nextafter(0.5549999999999999) is 0.55500000000000005, which makes std::round return 56.
Besides, the following SQL also fails to produce the correct result in Gluten:
Gluten returns 0.1933, Spark returns 0.1922.
I have tried the following modification in round:
It can fix the first example, but the second example still has a mismatch.
The modification also causes another mismatch:
Before the modification, Gluten returns the right result, which is 0.58. After the modification, Gluten returns 0.57.
Spark version
3.0
Spark configurations
No response
System information
No response
Relevant logs
No response