-
Notifications
You must be signed in to change notification settings - Fork 324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert in-memory Column.round to Java #7521
Conversation
...o/table/data/column/operation/map/numeric/DoubleLongBooleanOpWithSpecialNumericHandling.java
Outdated
Show resolved
Hide resolved
...o/table/data/column/operation/map/numeric/DoubleLongBooleanOpWithSpecialNumericHandling.java
Outdated
Show resolved
Hide resolved
std-bits/table/src/main/java/org/enso/table/data/column/storage/numeric/DoubleStorage.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, I think the direction is good.
I'm glad that the refactor splitting into unary and binary ops actually made it a bit cleaner to add a ternary op as you just did.
I think the direction is good to proceed, just needs polishing. In particular, please ensure that the result type is like we expect.
Benchmarks: significant improvement for floats; not much for ints:
|
@radeusgd value_type tests are already in Column_Spec. |
New benchmarks; indeed, I commented out the parameters and forgot I had done it. The benchmarks show improvements for both types. (Except in the fast-return case, where it isn't doing anything anyway.)
|
@@ -6,7 +6,7 @@ from Standard.Test import Bench, Faker | |||
## Bench Utilities ============================================================ | |||
|
|||
vector_size = 1000000 | |||
iter_size = 100 | |||
iter_size = 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we leave this at 100? or is it so long now we want it shorter!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes 47s with iter_size = 10
so I assume we don't want it much longer than that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future - I think in that case we may want to consider to reduce the vector_size
instead, to make each iteration a bit shorter - but to keep the number of iterations. 10 may be too few to see warmup and also to see the distribution of the timings (i.e. if the variance is low or high).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am interested in ensuring consistency between the builtin and the column implementation. Either by sharing the algorithm or by sharing the tests. Otherwise OK as far as I can say.
@@ -31,8 +37,7 @@ public class Core_Math_Utils { | |||
* @throws IllegalArgumentException if `n` is outside the allowed range. | |||
* @throws IllegalArgumentException if `decimalPlaces` is outside the allowed range. | |||
*/ | |||
public static double roundDouble(double n, int decimalPlaces, boolean useBankers) { | |||
|
|||
public static double roundDouble(double n, long decimalPlaces, boolean useBankers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a copy of the code in RoundNode
? Shall not we seek a way to share the code among both versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing it, we decided it was sufficient to share the tests; sharing the implementation between builtins and regular polyglot libraries is a bigger task.
* | ||
* @param <I> the supported storage type. | ||
*/ | ||
public abstract class TernaryMapOperation<T, I extends Storage<? super T>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the naming - I
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank @radeusgd for that.
@Override | ||
public Storage<Long> runTernaryMap(AbstractLongStorage storage, Object decimalPlacesObject, Object useBankersObject, MapOperationProblemBuilder problemBuilder) { | ||
if (decimalPlacesObject == null) { | ||
throw new IllegalArgumentException("decimalPlacesObject must not be null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: shall not the exception be NullPointerException
? Or use java.util.Objects.requireNonNull
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the null check.
} | ||
|
||
if (!(decimalPlacesObject instanceof Long decimalPlaces)) { | ||
throw new UnexpectedTypeException("a long."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This exception would be thrown if decimalPlacesObject
was null
. E.g. do we need the above == null
check at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the null check.
@@ -518,6 +518,19 @@ spec = | |||
3.1416 . round 3 . should_equal 3.142 | |||
3.9999 . round 3 . should_equal 4.0 | |||
|
|||
Test.specify "Can round negative decimals to a specified number of decimal places" <| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To ensure the behavior of RoundNode
builtin and the Column.round
is the same, can we generalize this Numbers_Spec
and run it twice on the same values? Once using regular round
builtin, once creating a Column
with the same value and rounding via Column.round
? I assume the result should be the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sharing the tests this way shall ensure the two rounding implementations are as identical as we need.
|
||
polyglot java import java.math.BigInteger | ||
|
||
spec prefix round_fun = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -30,6 +31,8 @@ spec setup = | |||
result.to_vector.at 0 | |||
do_round n dp=0 use_bankers=False = do_op n (_.round dp use_bankers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. do_op
creates a table with row "x"
and single value n
and then applies the operation to column "x"
and returns the result.
@@ -9,6 +9,7 @@ import project.Function.Function | |||
import project.Internal.Rounding_Helpers | |||
import project.Nothing.Nothing | |||
import project.Panic.Panic | |||
import project.IO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this import used?
@@ -0,0 +1,288 @@ | |||
from Standard.Base import all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right place to have this.
Standard.Test
is a test library, like JUnit
. We should not distribute our tests with it.
IMO we should just move this test altogether to Table_Tests
if we want to run it for both scalars and Columns. I don't think there is much harm in moving the scalar case from Tests
to Table_Tests
- IMO much less than distributing our internal test suites as part of a general and user-facing testing library.
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
./run ide build
.