Exp and pow return infinity too early #600

MattyMuir · 2024-10-31T02:04:14Z

Describe the bug
The exp function (for SIMD and pure C) returns infinity for all values past a certain threshold, but this threshold is slightly lower than it should be. Looking at the code for double precision, infinity is returned for all x > 709.78271114955742909217217426. However, there are values past this limit whose exponential can still be represented with a finite double. All standard libraries I've tested use the following, slightly larger, threshold instead x > 709.782712893384 (0x1.62e42fefa39efp+9 in binary). This doesn't appear to be an issue for the single precision implementation.

This same threshold is also used in the pow function, so the bug is present there as well.

Command lines, logs and environment
All the build steps, environment and configuration are the same as in my other issue, #570, though they shouldn't matter in this case.

To Reproduce

int main()
{
	double x = 709.782712;
	std::cout << "libc:  " << exp(x) << '\n';
	std::cout << "Sleef: " << Sleef_exp_u10(x) << '\n';
}

This produces the following output on my machine

libc:  1.79769e+308
Sleef: inf

Fixing this issue might be as simple as just changing the constant in the implementation, provided the series is still accurate past this point.

The text was updated successfully, but these errors were encountered:

blapie · 2024-10-31T09:24:57Z

Hello @MattyMuir, thanks for reporting this!

We will try to adjust the threshold so it is a little closer to general expectation. I also think that constants should be represented using hexfloat, they currently use too many digits, which I don't think is necessary.

Please consider the 2 following notes just FYI

Note 1: I take back my original note, the ieee754 standard does mandate overflow for exp and pow (table 9.1) in the way defined in section 7.4:

The overflow exception shall be signaled if and only if the destination format’s largest finite number is exceeded in magnitude by what would have been the rounded floating-point result

Note 2: Other math libraries like Arm Optimized routines have a more elaborate approach, it evacuates the early overflow to a separate branch where it is fixed, see https://github.com/ARM-software/optimized-routines/blob/master/math/aarch64/sv_exp_1u5.c.

blapie · 2024-10-31T10:21:05Z

Just edited my notes above, as the first one was mistaken.

MattyMuir · 2024-11-16T21:30:05Z

Just had a look at some of the single precision functions. Looks like similar bugs are present in log1pf, sinhf, and coshf. I think the tester needs to be updated to catch these, especially if overflow is mandated by the IEEE standard.

blapie · 2024-11-18T09:25:20Z

Hello @MattyMuir, Thank you for reporting these too. It does seem to impact several routines indeed.
I'll start looking into fixing implementation and updating tests shortly.

blapie · 2024-11-20T17:36:58Z

I'll soon upload a PR to fix this and add some tests. I just realised #523 reported the same issue, and confirms that the ulp error is still 1.0 in the interval between the old and new thresholds.

blapie · 2024-11-21T09:20:20Z

Similarly the log1pf issue was pointed out in #473.

blapie · 2024-11-21T15:17:24Z

Hello @shibatch, We are looking at a range of functions for which SLEEF implements an early overflow bound (as in earlier than what the standard recommends). I suspect these bounds were set to maintain accuracy, could you please confirm that? If so we need to be careful if we are going to update them.
Were there any other reasons driving these choices of bounds?

Luckily it turns out that exp still provides 1ULP when threshold is changed to one meeting the standard, but it is notoriously hard to assess accuracy for bivariate functions like pow or functions for which the number of extra finite point to test is too large to be computed exhaustively (e.g. log1p).

I will run tests as thoroughly as I can, but I would appreciate your insight on this.

If you have specific comments on exp or pow please comment on #523, and on log1p(f) please comment on #473.

shibatch · 2024-11-21T15:27:02Z

Hello,
I don't see any problem in changing the overflow threshold.
I am aware of pretty many minor accuracy problems, and I have been fixing them bit by bit.
On the other hand, when I heard requests from users, they were mainly talking about how to make the computation faster rather than about those minor accuracy issues, so fixing them was a low priority.

blapie · 2024-11-21T16:05:15Z

Ok, good to know. I think it is fair, they are relatively minor issues. Thanks!

Issue shibatch#600 highlighted a limit of current implementation where early overflow is noticed, which diverges from the C99 standard. This patch adds test to catch this type of cases. It only checks a few points therefore we don't know for sure if overflow occurs at the right time but we know if something is wrong in the overflow region. Fix both scalar and simd implementation (double precision) Fixes shibatch#523.

blapie · 2024-11-28T16:53:05Z

I've just created a PR to fix the threshold. As indicated in #523 we can confidently say that the change does not affect overall accuracy in exp.

I have checked pow accuracy with our in-house ulp error assessment program, and here is what I get

...
Largest error so far is 0.50511 at [0x1.7fed001e5f0edp-1, 0x1.1b5ce4d1fb0aep+11] ([0x3fe7fed001e5f0ed, 0x40a1b5ce4d1fb0ae])
  Got  0x1.6e9d97108d4e2p-942
  Want 0x1.6e9d97108d4e1p-942
Largest error so far is 0.57493 at [0x1.7f136e35a1af6p-1, 0x1.2c2f3c91cf6c5p+11] ([0x3fe7f136e35a1af6, 0x40a2c2f3c91cf6c5])
  Got  0x1.ee75e5d42b4bp-1006
  Want 0x1.ee75e5d42b4afp-1006
Largest error so far is 0.59458 at [0x1.7e7a67798b72dp-1, 0x1.e0157ee6672fbp+10] ([0x3fe7e7a67798b72d, 0x409e0157ee6672fb])
  Got  0x1.fb663ae3f3a63p-809
  Want 0x1.fb663ae3f3a62p-809
Largest error so far is 0.65361 at [0x1.7f5c8e80a3cf7p-1, 0x1.235db085e49b7p+11] ([0x3fe7f5c8e80a3cf7, 0x40a235db085e49b7])
  Got  0x1.f992991fa3f97p-974
  Want 0x1.f992991fa3f96p-974
Largest error so far is 0.74696 at [0x1.7ff1b57d71188p-1, 0x1.2e8d51b04ab8p+11] ([0x3fe7ff1b57d71188, 0x40a2e8d51b04ab80])
  Got  0x1.e75a8477de9e7p-1006
  Want 0x1.e75a8477de9e6p-1006
Largest error so far is 0.74696 at [0x1.7ff1b57d71188p-1, 0x1.2e8d51b04ab8p+11] ([0x3fe7ff1b57d71188, 0x40a2e8d51b04ab80])
  Got  0x1.e75a8477de9e7p-1006
  Want 0x1.e75a8477de9e6p-1006

@shibatch Can you please confirm wether the 1.0ULP threshold include the extra 0.5ULP offset introduced by the rounding of the input value (worst case in round to nearest)?

If so then, our result indicates that there are a few points above 1ULP.
If it does not, then the effective accuracy is 1.5ULP, so our result simply confirms that with high confidence pow is under 1.0ULP(+0.5ULP) error.

These max are independent of the change in oflow threshold, since |ylogx| < 709, for instance

x = 7.498910e-01
y = 2.420416e+03
ylogx = -6.966623e+02 // |ylogx| < 709
ret = 2.776058e-303 (0x1.e75a8477de9e7p-1006) versus exact return value = 0x1.e75a8477de9e6p-1006

Note that it is quite difficult to isolate cases for which ylog(x) > thres.

shibatch · 2024-11-29T05:04:39Z

I have confirmed that the error exceeds 1ulp when arguments are (0x1.7fed001e5f0edp-1,0x1.1b5ce4d1fb0aep+11).

pow : arg0 = 0x1.7fed001e5f0edp-1 (0.749855), arg1 = 0x1.1b5ce4d1fb0aep+11 (2266.9), ulp = 1.00511, t = 3.852255144842898e-284, c = 3.85226e-284

blapie · 2024-11-29T14:13:38Z

Thanks for confirming.
Would you be able to confirm these 2 affirmations?

So the u10 routines are effectively 1.5ULP (u35 are effectively 4ULP) when considering worst-case rounding of the input?
Do you also confirm the maximum reported error of 0.74696 ULP (+0.5ULP)?

Largest error so far is 0.74696 at [0x1.7ff1b57d71188p-1, 0x1.2e8d51b04ab8p+11] ([0x3fe7ff1b57d71188, 0x40a2e8d51b04ab80])
  Got  0x1.e75a8477de9e7p-1006
  Want 0x1.e75a8477de9e6p-1006

shibatch · 2024-11-29T14:26:34Z

Rounding of inputs is not taken into account in calculating accuracy.
If we are going to take that into account, what happens when we give a very large argument for sin?

MattyMuir · 2024-11-29T15:02:53Z

I have confirmed that the error exceeds 1ulp when arguments are (0x1.7fed001e5f0edp-1,0x1.1b5ce4d1fb0aep+11).

pow : arg0 = 0x1.7fed001e5f0edp-1 (0.749855), arg1 = 0x1.1b5ce4d1fb0aep+11 (2266.9), ulp = 1.00511, t = 3.852255144842898e-284, c = 3.85226e-284

Testing with these same arguments in the latest version also exceeds 1ULP error. It looks like this inaccuracy is unrelated to the new PR.

blapie mentioned this issue Nov 20, 2024

F64 exp returns infinity slightly too soon #523

Open

blapie mentioned this issue Nov 28, 2024

Fix oflow bound in exp and pow. #604

Open

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Exp and pow return infinity too early #600

Exp and pow return infinity too early #600

MattyMuir commented Oct 31, 2024

blapie commented Oct 31, 2024 •

edited

Loading

blapie commented Oct 31, 2024

MattyMuir commented Nov 16, 2024

blapie commented Nov 18, 2024

blapie commented Nov 20, 2024

blapie commented Nov 21, 2024

blapie commented Nov 21, 2024

shibatch commented Nov 21, 2024

blapie commented Nov 21, 2024

blapie commented Nov 28, 2024

shibatch commented Nov 29, 2024

blapie commented Nov 29, 2024

shibatch commented Nov 29, 2024

MattyMuir commented Nov 29, 2024

Exp and pow return infinity too early #600

Exp and pow return infinity too early #600

Comments

MattyMuir commented Oct 31, 2024

blapie commented Oct 31, 2024 • edited Loading

blapie commented Oct 31, 2024

MattyMuir commented Nov 16, 2024

blapie commented Nov 18, 2024

blapie commented Nov 20, 2024

blapie commented Nov 21, 2024

blapie commented Nov 21, 2024

shibatch commented Nov 21, 2024

blapie commented Nov 21, 2024

blapie commented Nov 28, 2024

shibatch commented Nov 29, 2024

blapie commented Nov 29, 2024

shibatch commented Nov 29, 2024

MattyMuir commented Nov 29, 2024

blapie commented Oct 31, 2024 •

edited

Loading