Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement ROUND() UDF #3404

Merged
merged 42 commits into from
Sep 26, 2019
Merged

Conversation

purplefox
Copy link
Contributor

@purplefox purplefox commented Sep 24, 2019

Description

This PR implements the ROUND function for KSQL.

It continues the work started in #3028 - thanks @ouertani for your contribution!

To summarise:

  • The Round function is implemented using the new UDF framework
  • ROUND can take one arg (the value to round) in which case it rounds to the nearest integer value.
  • ROUND can take two args (the value to round, and the number of decimal places) in which case it rounds to that number of decimal places
  • The number of decimal places can also be negative, meaning the value can also be rounded to powers of ten to the left of the decimal point.
  • Rounding is “half up” meaning values are rounded to the nearest neighbour, if there is a tie (i.e. digit is 5) they are rounded up.
  • Rounding of double and decimal (BigDecimal) values is supported.
  • Rounding of null returns null
  • When number of decimal places is required BigDecimal is used to provide correct rounding. It is error prone to implement a correct rounding with decimal places using Math.round() and integer and float (double) arithmetic alone. This is due to the inherent imprecision in binary floating point representations. Please see this https://stackoverflow.com/questions/153724/how-to-round-a-number-to-n-decimal-places-in-java and related discussions for more information.
  • When constructing a BigDecimal from a double the BigDecimal.valueOf(double) method is used. This constructs via a String to provide an exact representation. This is important to give correct rounding behaviour for double values such as -265.365 which cannot be represented exactly as binary floating point numbers.
  • Java.lang.Math.round() is used for rounding of doubles where decimal places is not specified. We could use BigDecimal everywhere but this is likely to give somewhat better performance for these cases.
  • The behaviour of half up rounding in java.lang.Math.round and BigDecimal is inconsistent so we have to do a bit of jiggery pokery to get the Java.lang.Math behaviour consistently across all cases. This is discussed more in comments in the code

Testing:

  • New Unit test
  • Added to QTT test - in particular added test that verified using DECIMAL: values

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

ouertani and others added 30 commits June 28, 2019 14:55
Copy link
Member

@JimGalasyn JimGalasyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with a couple of suggestions.

@vpapavas
Copy link
Member

Great job Tim and awesome you used the schema provider! LGTM with one minor comment:
Why did you name the test json file array_functions? I think it would look clearer to have a json file per udf.

@vpapavas vpapavas self-requested a review September 24, 2019 15:58
Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! thanks @purplefox

@agavra agavra requested a review from a team September 24, 2019 16:01
@purplefox
Copy link
Contributor Author

Hi @vpapavas - the json file is called that because that's what it was called in the original PR where it was created. But I agree it could have a better name :)

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @purplefox

Few nits and few questions & suggestions below.

Also, this PR would benefit from tests that ensured the return type schema was correct. You can do this in the unit test, but also in the json test. (I've called this out below).

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @purplefox

Only nit's remaining.

I've again expressed my thoughts on the, in my view, excessive comments. But won't press you again on it.

I've also again asked for the QTT tests to be placed in a more appropriate file name. I'll ping you to discuss, as I think there is some confusion here.

Aside from those, LGTM!

@purplefox purplefox merged commit f9783a9 into confluentinc:master Sep 26, 2019
@purplefox purplefox deleted the ouertani-KSQL-2583 branch September 30, 2019 06:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants