[VL] result mismatch found in round #6827

Open

jiangjiangtian opened this issue Aug 14, 2024 · 12 comments
Labels: bug, triage

@jiangjiangtian
Contributor

jiangjiangtian commented Aug 14, 2024

Backend

VL (Velox)

Bug description

SQL:

SELECT round(cast(0.5549999999999999 as double), 2);

Gluten returns 0.56, but vanilla Spark returns 0.55.
The reason for this mismatch is that std::nextafter(0.5549999999999999, kInf) returns 0.55500000000000005, which makes std::round return 56 after scaling by 100.
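A minimal sketch of the effect (assuming the scale-by-factor approach described above; not the exact Velox source):

```cpp
#include <cmath>
#include <cstdio>

int main() {
  const double d = 0.5549999999999999;
  // One ulp toward +inf lands exactly on the double that 0.555 is
  // stored as, so scaling by 100 crosses the 55.5 rounding boundary.
  const double nudged = std::nextafter(d, INFINITY);
  std::printf("%.17g\n", nudged);                 // 0.55500000000000005
  std::printf("%g\n", std::round(nudged * 100));  // 56
  std::printf("%g\n", std::round(d * 100));       // 55
}
```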
Besides, the following SQL also produces a wrong result in Gluten:

SELECT round(cast(0.19324999999999998 as double), 4);

Gluten returns 0.1933, but vanilla Spark returns 0.1932.

I have tried the following modification in round:

return static_cast<TNum>(std::round(std::nextafter(static_cast<long double>(number), kInf) * factor) / factor);

It fixes the first example, but the second example still mismatches.
The modification also introduces another mismatch:

SELECT round(cast(0.575 as double), 2);

Before the modification, Gluten returned the right result, 0.58. After the modification, Gluten returns 0.57.
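A minimal sketch of why the long double variant flips this case (assumes a platform where long double is wider than double, e.g. x86-64 Linux):

```cpp
#include <cmath>
#include <cstdio>

int main() {
  const double d = 0.575;  // actually stored as 0.57499999999999996...
  // One double ulp pushes the scaled value past 57.5, so it rounds up.
  const double nudged = std::nextafter(d, INFINITY);
  std::printf("%.2f\n", std::round(nudged * 100) / 100);  // 0.58

  // One long double ulp is far smaller, so the scaled value stays
  // below 57.5 and rounds down instead.
  const long double ld = static_cast<long double>(d);
  const long double lnudged =
      std::nextafter(ld, static_cast<long double>(INFINITY));
  std::printf("%.2Lf\n", std::roundl(lnudged * 100) / 100);  // 0.57
}
```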

Spark version

3.0

Spark configurations

No response

System information

No response

Relevant logs

No response

@jiangjiangtian added the bug and triage labels on Aug 14, 2024
@jiangjiangtian
Contributor Author

CC @rui-mo @ArnavBalyan @kecookier

@ArnavBalyan
Contributor

Yes, the round function receives its input as a double, and double is used throughout upstream. We might have to refactor the codebase and upstream functions to move to something that can handle higher precision. Thanks.

@jiangjiangtian
Contributor Author

Thanks!
I think we can use DecimalRoundFunction as a workaround, but that also needs a lot of effort.

@rui-mo
Contributor

rui-mo commented Aug 15, 2024

Hi @jiangjiangtian, was this issue noticed in a runtime workload? How do you evaluate its importance to your workload? Thanks.

@jiangjiangtian
Contributor Author

I found the issue in a test workload. We have some remaining mismatches, and this is one of them.
I'm trying to fix this issue using DecimalRoundFunction. Do you think that's a feasible solution?

@rui-mo
Contributor

rui-mo commented Aug 15, 2024

Hi @jiangjiangtian, here are my thoughts. If we cast the double to decimal before rounding, how do we make sure we obtain the expected decimal? I believe the problem with the number 0.5549999999999999 is that we do not have an exact binary representation for it, so we cannot tell whether it is simply a representation of 0.555 or whether it should actually be 0.5549999999999999 as you indicated. This problem, I believe, also arises when casting a double to a decimal, which could likewise produce an unexpected result.

To solve this issue ultimately, I agree with @ArnavBalyan: we would need to map the Spark double type to a higher-precision C++ type like boost::multiprecision::cpp_dec_float_50, but this will take a lot of work.
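For illustration, printing the two nearby literals at 17 significant digits shows what the inexact storage looks like:

```cpp
#include <cstdio>

int main() {
  // Both literals are stored inexactly: each becomes the nearest
  // representable double, and every real number in that double's
  // rounding interval maps to the same bit pattern.
  std::printf("%.17g\n", 0.555);               // 0.55500000000000005
  std::printf("%.17g\n", 0.5549999999999999);  // 0.55499999999999994
}
```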

@jiangjiangtian
Contributor Author

Thanks for your reply!
I think you are right.

@zhztheplayer
Member

@jiangjiangtian Would you like to open a PR to add a simple UT for the issue? We could mark it as ignored until we actually fix it.

@rui-mo
Contributor

rui-mo commented Aug 16, 2024

Given that Spark uses BigDecimal(d).setScale(_scale, mode).toDouble when rounding a double number, I wonder whether we could apply a similar strategy to the round function to achieve the same result as Spark.

Although the constructor BigDecimal(double val, MathContext mc) claims the result is unpredictable due to the imprecision of double, if we follow its implementation, it's possible the results for Spark and Gluten could become the same.

https://github.com/openjdk/jdk/blob/jdk-24%2B11/src/java.base/share/classes/java/math/BigDecimal.java#L991-L1087

https://github.com/openjdk/jdk/blob/jdk-24%2B11/src/java.base/share/classes/java/math/BigDecimal.java#L2900-L2985
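For what it's worth, C++17's std::to_chars (with no precision argument) gives the shortest decimal string that round-trips to the same double, which is the same contract as java.lang.Double.toString, the representation that Scala's BigDecimal(d: Double) appears to be built on. A sketch of a possible starting point, not the actual implementation:

```cpp
#include <charconv>
#include <cstdio>

int main() {
  const double d = 0.5549999999999999;
  char buf[32];
  // With no precision argument, std::to_chars emits the shortest
  // decimal string that parses back to the same double -- the same
  // contract as Java's Double.toString. Decimal rounding could then
  // be performed on this string, mirroring BigDecimal(d).setScale(...).
  const auto result = std::to_chars(buf, buf + sizeof(buf), d);
  *result.ptr = '\0';
  std::printf("%s\n", buf);  // 0.5549999999999999
}
```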

@ArnavBalyan @jiangjiangtian Please help share your thoughts. Thanks.

@jiangjiangtian
Contributor Author

@rui-mo Thanks for your investigation.
I think it is a feasible solution. If we go this way, we can follow Spark's implementation: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1532-L1595

@rui-mo
Contributor

rui-mo commented Aug 21, 2024

I did some tests on the Java BigDecimal setScale API but found it could also give unexpected results, e.g., round(0.575, 2) is 0.57.
Link for the test code: https://onlinegdb.com/DJIsuTIbc

@wForget
Member

wForget commented Nov 1, 2024

Adding another mismatch case:

select (199 / 426), round((199 / 426), 8), round(0.4671361502347418, 8)

Gluten 1.2.0 with Velox:

+---------------------+------------------------+-------------------------------+--+
|     (199 / 426)     | round((199 / 426), 8)  | round(0.4671361502347418, 8)  |
+---------------------+------------------------+-------------------------------+--+
| 0.4671361502347418  | 0.46713614999999997    | 0.46713615                    |
+---------------------+------------------------+-------------------------------+--+

vanilla Spark:

+---------------------+------------------------+-------------------------------+--+
|     (199 / 426)     | round((199 / 426), 8)  | round(0.4671361502347418, 8)  |
+---------------------+------------------------+-------------------------------+--+
| 0.4671361502347418  | 0.46713615             | 0.46713615                    |
+---------------------+------------------------+-------------------------------+--+
