-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Float min/max/clamp follow IEEE754-2008 (and not IEEE754-2019) WRT negative zeros #83984
Comments
@rustbot label: +A-floating-point +C-bug |
Rust just calls the
There are also
Maybe rustc should be changed to use |
LLVM's minnum (at the link above) also says
which is incorrect for -0.0, what this bug is. So maybe this is an LLVM bug. |
More reading: The minimum intrinsic seems to exist since LLVM 8.0, while the minimum LLVM that rustc supports is 10, so it should be safe to use from that point of view.
Hmm that's bad. :/. |
I think this is a bug in
and
is more important (since those two parts of the docs are in contradiction). If there's a performance cost to the former, I could imagine LLVM would rather keep it as-is. That said, for them it wouldn't be breaking to fix the bug so that they're in compliance with IEEE minNum, since they don't specify the result. |
For Rust, I don't know how much the documentation mentioning this behaviour prevents a change. It might just be that it only means that the documentation needs to be changed as well, nothing more. The sentence seems to have been introduced by commit f44d287 . |
I filed an LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=49886 And yeah, I just busted out my old copy of IEEE 754-2008 and can confirm that it doesn't specify the sign of -0.0. I mentioned maybe a new intrinsic could be added instead.
IMO it would be a breaking change (even if it weren't documented, but especially given that it is). Faking Ord for floats by a.partial_cmp(b).unwrap() isn't unheard of, and so changing things to introduce way more NaNs could cause working code to break. (Also, I could easily imagine using the documentation there to justify not checking for NaN in the result of Also way more people will notice and care about the handling of NaNs than -0.0 also. All that said, we can fix this for |
Do hardware implementations support this behaviour? Some random documentation for
thus not supporting the behaviour desired here. AArch's LLVM's x86 backend is unable to select machine code for the |
I looked at what cg_clif is doing: rust/compiler/rustc_codegen_cranelift/src/intrinsics/mod.rs Lines 1053 to 1062 in 673d0db
Taken together with Cranelift's documentation for its To the extent that has any bearing on this discussion, I guess it just pushes slightly toward the same semantics being adopted in cg_llvm too.
I agree, but I wonder whether there's any way something like a Crater run could provide an indication of the extent to which these semantics are actually being relied upon (I can't see how—though it could at very least determine how many crates actually use the existing If, as I suspect, the semantics of the existing methods cannot be changed, perhaps alternative methods might be added instead (eg |
Reproducing here @sunfishcode's helpful summary in https://github.com/bjorn3/rustc_codegen_cranelift/issues/1049#issuecomment-817067277:
|
I'm certain I've implicitly relied on this in ways that wouldn't be tested on crater. A lot of the times you don't bother testing with NaN. Often, even if you do, it can be tricky to get NaN all the way into the right place in the code, but you want it handled a certain way if it does get there. Also, floating point stuff is common in gui and graphics code, some of which isn't open source, and the stuff that is doesn't run on crater (needs a GUI). Again, I strongly feel that the documentation there is a promise that these are the semantics of the function, and that you can rely on them. Just because NaN is a weird value, and floats are less precise than the integers, doesn't mean the documented behavior of a function like this is allowed to change on that value. It's hard enough to handle floating point edge cases robustly. If it wasn't floats and NaN, would you consider changing the behavior? Say of i32::signum on zero (which is an example of a case where the result isn't exactly "obvious", since the input isn't precisely part of the obvious input range) — the answer is (hopefully) of course not, the function promises that it behaves that way on the function, so it can't change. Sorry, I might have a bit of a chip on my shoulder about this stuff, but I think it's a bit ridiculous to even consider, given that our IEEE compliance around NaN isn't even in question — we're compliant so long as this function is And I'd only want to deprecate or lint if we can't bring it into line with |
Anyway, changing the behavior on (Note: I don't think this example is that implausible) |
Just so we're absolutely clear on this: The current behavior regarding negative zero in both Rust and LLVM is very much intentional, and should very much not be "fixed". |
The current behavior around negative zero is that it's unspecified. It follows an old version of the spec, but that function was replaced with one that clarifies the behavior for negative zero. For LLVM, I have way weaker opinions, but if anything I think the "fix" is probably a new pair of intrinsics that match minimumNumber/maximumNumber. The current llvm.minnum/llvm.maxnum would stay as-is. For Rust, I think it should be updated to be compliant with the new IEEE754 spec, which has all minimum and maximum functions treat -0.0 as less than 0.0, and removed the min/max functions where they're treated as equal. Do you have a reason that the current behavior around that should be kept? |
The current behavior If it turns out it's a big performance hit to handle it as specified, that could make a difference in my opinion though, perhaps. If we don't change it, we should document that it's based on minNum from IEEE754-2008 (and not minimumNumber from IEEE754-2019), to avoid future confusion. I also think there's no reason not to change |
The behavior exists for performance reasons. x86 does not support IEEE754 min/max operations, so all special cases need to be implemented explicitly. When targeting newer x86, NaN is handled through an additional cmpunord+blendv. On older targets (and if non-NaN cannot be proven) it's significantly more expensive. Handling negative zero adds more cost on top of that. In addition to that, fmin/fmax in C are spec'd to follow IEEE754-2008 semantics, in that handling of negative zero is explicitly specified as optional for implementation feasibility reasons. This means that for soft-float targets, if the operation gets libcall-legalized, we do not have a guarantee that the invoked libcall will have special treatment of negative zero. |
Hmm, it's hard for me to say. Given that IEEE 754 2019 defines several min/max functions, but they all have the same behavior for -0.0 (that it's less than +0.0), that feels like a strong indication that that behavior is to be preferred. I also feel like these functions are typically outside the hot path, and that in practice the branch that's needed to check for -0.0 will be correctly predicted essentially every time (assuming its a branch — and I guess it may not be). That said, the decision here would presumably be mirrored by core_simd, and I do feel that it's very plausible that a simd version of min/max could be in the hot path — I've written code where this is the case, and while the scalar fallback in that code didn't use f32::min/max (nor could it have, really, it needed to use an explicit comparison)... And I guess there are probably cases I'm not thinking of where you need the minimum value in an array or something, and then f32::min would be in the hot path. That said, I was surprised that we didn't follow IEEE754 2019, and someone else may be too in the future. I'd feel a bit better if there's an explicit decision here, since it's a decision to stick with an outdated version of these APIs for performance reasons... |
Regarding min/max, it sounds like NaN handling requires additional instructions, regardless of which behavior is desired on x86, and so that does not sound like a particularly strong performance concern. |
It was definitely deliberate, because there's no good choice for what non-NAN value to return for a NAN input. |
In image processing these functions are very often in the hot path, and -0 never happens. Sadly, with float functions there's always this tension between prioritizing accuracy and performance. I know for some scientific cases it's necessary to handle every weird edge case up to ULP. OTOH there are lots of cases where floats are just used for banal imprecise arithmetic. Not because floats are the right tool for the job, but only because otherwise half of the CPU would sit unused. |
Most modern application processor architectures I have seen the pipeline diagrams for actually have their pipelines combined for floating point operations and SIMD, because they have combined floating point and SIMD registers, so integral SIMD math would use the same pipelines, and may actually outstrip floating point performance. Floating point performance on its own does matter, of course, which is why I think having a highly consistent semantics is not merely some academic or scientific mathematical concern, as it enables the fastest evaluation of all: the one taking zero time, because it was already calculated. So while I am sympathetic to the performance concerns, I do not believe we should ignore the possibilities for alternatives, other gains we could be getting, or places we can make up for any losses we might encounter as a result of tightening our specification. That said, I am not in any particular rush to a conclusion, here. I expect this issue, like many floating point issues in this repository, will remain open for some time before it resolves one way or another. I am merely noting that it is possible there is a solution that may satisfy both concerns. |
I have now adopted cg_llvm's semantics in bjorn3/rustc_codegen_cranelift@c6564f8. I would slightly prefer if either the semantics of Cranelift's fmin/fmax or fmin_pseudo/fmax_pseudo instructions would be chosen, but for now this fixes some libcore and stdsimd tests. In the later case by ensuring that the scalar min/max operations match the vector ones which already had the same semantics as cg_llvm due to explicitly using comparisons and lane selection instead of dedicated intrinsics. |
@kornelski Note that if some code does something equivalent to Concerning the min and max function, if the compiler can determine that the (+0,−0) or (−0,+0) case cannot occur, it can optimize. Ditto if the programmer gives such a hint to the compiler. |
As an example why |
Yeah, I find minNum/maxNum's behavior regarding nan's pretty good, although the arguments they make for why they changed it in 2019 are compelling as well. Changes here would need to be new functions, I think. |
Honestly I like |
Right, i think there's cases where you want to propagate nans, and cases where you want to swallow them, and propagation is more consistent with most of floating point semantics. |
…tmcm Adds IEEE 754-2019 minimun and maximum functions for f32/f64 IEEE 754-2019 removed the `minNum` (`min` in Rust) and `maxNum` (`max` in Rust) operations in favor of the newly created `minimum` and `maximum` operations due to their [non-associativity](https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/background/minNum_maxNum_Removal_Demotion_v3.pdf) that cannot be fix in a backwards compatible manner. This PR adds `fN::{minimun,maximum}` functions following the new rules. ### IEEE 754-2019 Rules > **minimum(x, y)** is x if x < y, y if y < x, and a quiet NaN if either operand is a NaN, according to 6.2. For this operation, −0 compares less than +0. Otherwise (i.e., when x = y and signs are the same) it is either x or y. > **maximum(x, y)** is x if x > y, y if y > x, and a quiet NaN if either operand is a NaN, according to 6.2. For this operation, +0 compares greater than −0. Otherwise (i.e., when x = y and signs are the same) it is either x or y. "IEEE Standard for Floating-Point Arithmetic," in IEEE Std 754-2019 (Revision of IEEE 754-2008) , vol., no., pp.1-84, 22 July 2019, doi: 10.1109/IEEESTD.2019.8766229. ### Implementation This implementation is inspired by the one in [`glibc` ](https://github.com/bminor/glibc/blob/90f0ac10a74b2d43b5a65aab4be40565e359be43/math/s_fminimum_template.c) (it self derived from the C2X draft) expect that: - it doesn't use `copysign` because it's not available in `core` and also because `copysign` is unnecessary (we only want to check the sign, no need to create a new float) - it also prefer `other > self` instead of `self < other` like IEEE 754-2019 does I originally tried to implement them [using intrinsics](Urgau@1d8aa13) but LLVM [error out](https://godbolt.org/z/7sMrxW49a) when trying to lower them to machine intructions, GCC doesn't yet have built-ins for them, only cranelift support them nativelly (as it doesn't support the nativelly the old sementics). Helps with rust-lang#83984
To define Clamp in terms of Minimum and Maximum instead of Min and Max is not possible right? |
It seems that C has added They have kept |
We do have But the discussion above talked a lot about NaNs when the issue really is about |
In a blog post (shameless plug) I wrote on clamping between 0.0 and 1.0, I noticed that we don't treat
-0.0
as less than+0.0
. Instead, we return whichever is the first argument:That behavior is kinda incoherent, probably unintentional, and seems highly unlikely that anybody is relying on it, so I suspect we're free to fix it.
While IEEE-754 defines two families of min/max functions (differing on NaN handling), all of them are required to treat
-0.0
as less than+0.0
. We should fix our version to do this ¹.This problem is present in
{float}::clamp
too, which ideally would produce +0.0 forf32::clamp(-0.0, +0.0, _)
. (However, note that clamp uses a different² version of min/max than max/min, so fixing it here won't automatically fix it for clamp).Excerpt from IEEE754 2019 section 9.6 "Minimum and maximum operations" (emphasis mine)
CC @workingjubilee
¹ Alternatively, since these functions only "should" be provided, we could say that we just don't provide equivalents to those functions and that ours are different functions. If we do this, we should have a very good reason IMO, and should document this as well.
² Note that for reasons³ our
{float}::clamp
functions are documented as propagating NaNs, but{float}::{min, max}
are not. This means this does have to be fixed in multiple places, although, it's coherent under IEEE754 if we assume{float}::{min, max}
useminimumNumber
andmaximumNumber
respectively{float}::clamp
usesminimum
andmaximum
³ Nobody asked, but personally, I'd rather clamp be consistent with min/max, which for this has the benefit of guaranteeing that the result is always in the provided bound, which is quite desirable sometimes. I know it was probably deliberate, and I do get why, and know it can't be changed, but I'm still going to use this footnote-within-a-footnote on an obscure floating point bug report to gripe about it.
The text was updated successfully, but these errors were encountered: