Optimize the implementation not to use any external function. #3
Conversation
Codecov Report

@@             Coverage Diff              @@
##              main        #3       +/-  ##
============================================
- Coverage   100.00%    25.00%   -75.00%
============================================
  Files            5         5
  Lines           42        28       -14
  Branches         5         4        -1
============================================
- Hits            42         7       -35
- Misses           0        21       +21
============================================

Continue to review full report at Codecov.
I'm confused by the CI errors on (IIUC) Node.js 1-4. They don't seem related to my changes, but the …
LGTM, leaving to @ljharb the questions about code style and CI.
implementation.js (Outdated)
 * causes an overflow, resulting in Infinity.
 * This code path is also used when the input is already an Infinity.
 */
if (mod >= 3.4028235677973366E38) { // overflow-threshold
The old implementation used 3.4028234663852886e+38, which is lower than this number. I verified that this number is correct, but could you add a test for it? We already have one for "big numbers":

Lines 48 to 49 in be55c4c

t.test('rounds properly with the max float 32', function (st) {
    var maxFloat32 = 3.4028234663852886e+38;

It would probably be good to test both the new magic number and 3.4028235677973362e+38, which is the float that comes right before it.
> The old implementation used 3.4028234663852886e+38, which is lower than this number.
The old implementation used the max-binary32 value. The new implementation refers to the midpoint between max-binary32 and Infinity in binary32.
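As a sanity check (a hedged sketch, using the engine's native `Math.fround` rather than this package), the threshold can be recomputed from first principles:

```javascript
// Sketch: recompute the overflow threshold as the midpoint between the
// largest finite binary32 value and 2^128, and check it against the
// native Math.fround (assumed available in the engine running this).
var maxFloat32 = 3.4028234663852886e+38; // largest finite binary32 value
var twoPow128 = Math.pow(2, 128);        // first power of two beyond it
var threshold = (maxFloat32 + twoPow128) / 2; // midpoint; exact in binary64

console.log(threshold); // the constant used in the diff: 3.4028235677973366e+38

// Just below the midpoint rounds down to maxFloat32; at the midpoint,
// ties-to-even picks the "even" side, which overflows to Infinity.
console.log(Math.fround(3.4028235677973362e+38)); // 3.4028234663852886e+38
console.log(Math.fround(threshold));              // Infinity
```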
> It would probably be good to test both the new magic number and 3.4028235677973362e+38, which is the float that comes right before it.
The new magic number was already tested in "returns infinity for large numbers". I have added a test for the float just below it. I have also added tests around the subnormal-normal boundary.
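For reference, the kind of boundary checks described can be sketched with plain assertions against the native `Math.fround` (a hedged sketch; the repository's actual tests use a tape-style harness):

```javascript
// Sketch: behaviour around the binary32 subnormal-normal boundary,
// checked against the native Math.fround (assumed available).
var minNormal32 = Math.pow(2, -126);                  // smallest normal binary32
var maxSubnormal32 = minNormal32 - Math.pow(2, -149); // largest subnormal binary32
var minSubnormal32 = Math.pow(2, -149);               // smallest positive binary32

// Both boundary values are exactly representable, so they round to themselves.
console.log(Math.fround(minNormal32) === minNormal32);       // true
console.log(Math.fround(maxSubnormal32) === maxSubnormal32); // true

// The midpoint between them ties to the even neighbour, the normal one.
console.log(Math.fround(minNormal32 - Math.pow(2, -150)) === minNormal32); // true

// The smallest subnormal survives; half of it ties to even, i.e. to zero.
console.log(Math.fround(minSubnormal32) === minSubnormal32); // true
console.log(Math.fround(Math.pow(2, -150)) === 0);           // true
```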
Do you have a microbenchmark that shows the perf diff?
The special code path for zeros and NaNs is removed. They fall through as the general case.

We avoid `$Number` by using the asm.js idiom `v = +x`. We avoid `abs` by using `s * v`. This causes `-0` to fall through as is, but it turns out that it has the desired effect in the end.

We optimize the subnormal path to use only one multiplication and one division.

We dispatch the overflow case from the normal and subnormal cases ahead of time, which allows us to more easily justify the correctness of the algorithm.

We hard-code the magical constants to avoid having to look up external identifiers, ensuring that immediate values are used in the generated code.

Finally, and perhaps most importantly, we add significant documentation about the algorithms for the normal and subnormal paths, with, if not a proof, at least an argument and references for why they are correct.
Force-pushed from 01e9356 to 34070bd.
How does hardcoding the constants in the algorithm improve performance? Any performance-motivated change should definitely include benchmarks (in the PR is fine). In particular, I'd like comparisons of:
Additionally, performance is only a sensible concept to discuss in the context of a specific implementation. This package supports basically every version of Firefox, Safari, IE, Edge, Chrome, node, and many others - so we'd want to compare performance in all of these.
I don't have micro-benchmarks, but I do have a benchmark of a raytracer where every single floating point value is a …
I guess it's easier for the engine's JIT to compile them to immediate values, rather than loading them from memory or having to protect the immediates with deoptimization guards.
True. Here the performance measurements were done on a Linux Ubuntu machine with Node.js 14.14.0. This is probably not representative of all the possible versions. But I would say that an implementation that uses no external function calls has a better chance of being compiled consistently across as many engines as possible.
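For what it's worth, a minimal microbenchmark along these lines can be sketched as follows (the harness and workload are hypothetical, and only the normal-range Veltkamp rounding is exercised, not the full polyfill):

```javascript
// Hypothetical microbenchmark sketch: native Math.fround vs. a minimal
// Veltkamp-splitting round (normal range only; not the full polyfill).
function veltkampRound(v) {
  var t = v * 536870913; // v * (2^29 + 1)
  return t + (v - t);    // v rounded to a 24-bit significand
}

function bench(label, f) {
  var inputs = [];
  for (var i = 0; i < 1000; i++) inputs.push(1 + Math.random() * 1000);
  var sum = 0;
  var start = Date.now();
  for (var iter = 0; iter < 5000; iter++) {
    for (var j = 0; j < inputs.length; j++) sum += f(inputs[j]);
  }
  console.log(label + ': ' + (Date.now() - start) + ' ms');
  return sum; // keep the result live so the loop is not optimized away
}

bench('native Math.fround', Math.fround);
bench('Veltkamp splitting', veltkampRound);
```

Timings from such a harness only characterize the one engine it runs on, which is exactly the caveat raised above.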
Since the "optimizations" are controversial, I resubmitted a variant of this PR as #4, which preserves the use of …
Here is the implementation of the `fround` polyfill used by Scala.js 1.9.0+ when targeting ECMAScript 5.1, adapted so that the variable names match those previously used in this codebase. The inspiration for using Veltkamp's splitting for the normal form case came from the polyfill in core-js, which seems to also be the inspiration for the previous algorithm in this codebase. I have however reconstructed a proof with references that it is indeed correct. I designed the algorithm (if it can be called that) for the subnormal form case.

I intentionally break two eslint rules for the sake of performance: using `+x` instead of `$Number(x)`, and hard-coding the magic constants instead of declaring them in `var`s.

If that is not welcome in this repository, or misguided, I am happy to revert to using `$Number` and/or declaring the magic constants in `var`s as before.
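The polyfill code itself is not reproduced above. A hedged reconstruction from the description (not the exact PR code; in particular, `TWO_POW_925` is computed here with `Math.pow` rather than hard-coded as an immediate literal) might look like:

```javascript
// Hedged sketch of an fround polyfill along the lines described above.
var TWO_POW_925 = Math.pow(2, 925); // the real code would hard-code this

function fround(x) {
  var v = +x;              // ToNumber via the asm.js idiom instead of $Number(x)
  var s = v < 0 ? -1 : 1;  // sign; note s === 1 for -0 and NaN
  var av = s * v;          // |v|, except that -0 stays -0 (harmless below)

  if (av >= 3.4028235677973366e+38) {
    // At or above the midpoint between the largest finite binary32 value
    // and 2^128, rounding overflows; also covers inputs already Infinity.
    return s * Infinity;
  } else if (av < 1.1754943508222875e-38) {
    // |v| below the smallest normal binary32 (2^-126): subnormal result.
    // Dividing by 2^925 maps the target granularity 2^-149 down onto
    // 2^-1074, the smallest positive double, so the division itself
    // performs the round-to-nearest-even; multiplying back is exact.
    // Both -0 and +0 pass through unchanged on this path.
    return (v / TWO_POW_925) * TWO_POW_925;
  } else {
    // Normal range (NaN also lands here and propagates): Veltkamp
    // splitting with 2^29 + 1 rounds the significand to 24 bits.
    var t = v * 536870913;
    return t + (v - t);
  }
}
```

The overflow dispatch up front is what lets the other two paths assume their inputs are in range, which matches the correctness argument outlined in the description.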