-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update the runtime to have deterministic floating-point to integer conversions #61885
Comments
Tagging subscribers to this area: @dotnet/area-system-numerics Issue DetailsIssue.NET currently depends on the underlying platform to perform However, for cases of overflow or cases where the underlying platform does not directly support a given conversion, each platform can return different results. This differing behavior can lead to downstream issues when dealing with such conversions and lead to hard to diagnose bugs. ProposalWe should follow in the footsteps of other languages and standardize our conversion behavior here to be consistent on all platforms. In particular, we should standardize our behavior to be consistent with ARM64, Rust, and WASM and to saturate the conversions rather than some other behavior such as using a sentinel value (x86/x64). What's RequiredFor ARM32 and ARM64, there will be effectively no impact and the codegen will remain identical to what it is today as the underlying conversions already perform saturation on overflow. The exception is for conversions to small integers where there is no direct underlying platform support and so manual clamping of the input to the appropriate range will be required before converting. For x86 and x64, the platform only directly supports converting to int32 with additional support for converting to int64 on x64. Conversions to small integer types and unsigned types will require the same manual clamping as ARM32/ARM64. Additionally, since the underlying platform returns a sentinel value ( Perf ImpactFor the conversions that need additional handling, there is going to be additional cost and measurable perf impact. For a Skylake processor simple conversions would go from ~7 cycles to approx. between 19 and 24 cycles, or a theoretical "worst case" taking ~3.4x more time. The measured impact by Rust was much lower and was closer to a 0.8x regression in real world scenarios involving heavy floating-point to integer conversions (in particular an "RBG JPEG encoding" algorithm). Additional ConsiderationsWe should additionally expose a set of unsafe APIs that perform the "platform specific" conversions. These would allow developers who understand the xplat differences and need the perf to still get the underlying behavior. It is likely not worth waiting for additional asks on this as other platforms, such as Rust have already exposed such APIs on their end. Additionally, we will likely require them for our own code in the BCL, including for I would propose these are exposed as This would result in the following additions (the commented out APIs already exist or are approved and will be implemented in .NET 7): namespace System
{
public struct Double
{
public static int ConvertToInt32(double value);
public static int ConvertToInt32Unsafe(double value);
public static long ConvertToInt64(double value);
public static long ConvertToInt64Unsafe(double value);
public static uint ConvertToUInt32(double value);
public static uint ConvertToUInt32Unsafe(double value);
public static ulong ConvertToUInt64(double value);
public static ulong ConvertToUInt64Unsafe(double value);
}
public struct Half
{
public static int ConvertToInt32(Half value);
public static int ConvertToInt32Unsafe(Half value);
public static long ConvertToInt64(Half value);
public static long ConvertToInt64Unsafe(Half value);
public static uint ConvertToUInt32(Half value);
public static uint ConvertToUInt32Unsafe(Half value);
public static ulong ConvertToUInt64(Half value);
public static ulong ConvertToUInt64Unsafe(Half value);
}
public struct Single
{
public static int ConvertToInt32(float value);
public static int ConvertToInt32Unsafe(float value);
public static long ConvertToInt64(float value);
public static long ConvertToInt64Unsafe(float value);
public static uint ConvertToUInt32(float value);
public static uint ConvertToUInt32Unsafe(float value);
public static ulong ConvertToUInt64(float value);
public static ulong ConvertToUInt64Unsafe(float value);
}
}
namspace System.Numerics
{
public static class Vector
{
// public static Vector<int> ConvertToInt32(Vector<float> value);
public static Vector<int> ConvertToInt32Unsafe(Vector<float> value);
// public static Vector<long> ConvertToInt64(Vector<double> value);
public static Vector<long> ConvertToInt64Unsafe(Vector<double> value);
// public static Vector<uint> ConvertToUInt32(Vector<float> value);
public static Vector<uint> ConvertToUInt32Unsafe(Vector<float> value);
// public static Vector<ulong> ConvertToUInt64(Vector<double> value);
public static Vector<ulong> ConvertToUInt64Unsafe(Vector<double> value);
}
}
namespace System.Runtime.Intrinsics
{
public static class Vector64
{
// public static Vector64<int> ConvertToInt32(Vector64<float> value);
public static Vector64<int> ConvertToInt32Unsafe(Vector64<float> value);
// public static Vector64<long> ConvertToInt64(Vector64<double> value);
public static Vector64<long> ConvertToInt64Unsafe(Vector64<double> value);
// public static Vector64<uint> ConvertToUInt32(Vector64<float> value);
public static Vector64<uint> ConvertToUInt32Unsafe(Vector64<float> value);
// public static Vector64<ulong> ConvertToUInt64(Vector64<double> value);
public static Vector64<ulong> ConvertToUInt64Unsafe(Vector64<double> value);
}
public static class Vector128
{
// public static Vector128<int> ConvertToInt32(Vector128<float> value);
public static Vector128<int> ConvertToInt32Unsafe(Vector128<float> value);
// public static Vector128<long> ConvertToInt64(Vector128<double> value);
public static Vector128<long> ConvertToInt64Unsafe(Vector128<double> value);
// public static Vector128<uint> ConvertToUInt32(Vector128<float> value);
public static Vector128<uint> ConvertToUInt32Unsafe(Vector128<float> value);
// public static Vector128<ulong> ConvertToUInt64(Vector128<double> value);
public static Vector128<ulong> ConvertToUInt64Unsafe(Vector128<double> value);
}
public static class Vector256
{
// public static Vector256<int> ConvertToInt32(Vector256<float> value);
public static Vector256<int> ConvertToInt32Unsafe(Vector256<float> value);
// public static Vector256<long> ConvertToInt64(Vector256<double> value);
public static Vector256<long> ConvertToInt64Unsafe(Vector256<double> value);
// public static Vector256<uint> ConvertToUInt32(Vector256<float> value);
public static Vector256<uint> ConvertToUInt32Unsafe(Vector256<float> value);
// public static Vector256<ulong> ConvertToUInt64(Vector256<double> value);
public static Vector256<ulong> ConvertToUInt64Unsafe(Vector256<double> value);
}
}
|
This is being prototyped under #61761. This had an internal design doc written up by @davidwrighton that largely states the above recommendation, notably differing on the CC. @jkotas |
I would prefer the terminology "fast" over "unsafe". It is not dangerous, just inaccurate in rare cases. This also fits the flag |
For me "unsafe" also doesn't seem to be the best choice for the suffix. What about |
I proposed
|
|
Will there be a compiler option to opt-out where this strictness is not required? |
@tannergooding is there a design document for the overall changes? We reviewed the necessary APIs, not the proposed changes to code generation and changing the default., which would require more representation from other groups. I should have asked this, but for some reason this didn't occur to me.
That would be an interesting question to ask. In general, I think we try to avoid global modes that change language semantics (like checked/unchecked) but at the same time we often have compatibility switches to get back old behavior. I don't think I've seen enough data to have a fully formed opinion where this would fall. |
There is already a C# compiler option for checked/unchecked, so it seems logical (to me, at least) to have a similar option here. It would be a bit tedious if the only way to get back the performance you previously had, was to find any affected conversions/casts and call |
Right. The believe from the compiler folks is that making checked/unchecked a compiler option was a mistake, so we shouldn't be using it as an example of prior art worth emulating. |
Yes, but that's the same decision that other languages in this space have made. Having correct and deterministic behavior across platforms is becoming increasingly prevalent and the bugs and issues due to people relying on x86/x64 specific behavior are only becoming more obvious as more libraries move to support things like ARM64. If it is tedious, its because it needs to be carefully considered on if the differing behavior is actually acceptable and how it may impact a given application. |
I agree that having correct and deterministic behavior across platforms is desirable, and I'm glad it's going to be the default. |
@terrajobst. There was an internal writeup by David Wrighton and otherwise this is the current doc covering everything. I still have to writeup a more extensive breaking change doc that details the specific error boundaries, but implementation wise its covered under the prototype here: #61761 |
Was David's document already reviewed? We should probably have a public version and share it via dotnet/designs. This seems like an impactful change that warrants wider circulation and most likely a breaking change notice for .NET 7. |
It was circulated already yes. And yes, it's on my plate to ensure all things like the design doc, breaking change doc, perf numbers, etc all get done and documented. |
I would like to put my 2 cents: the proposal smells like strictfp in Java. It was bad design decision introduced more than 20 years ago. As a result, Probably, the better choice is to use environment variable or MSBuild property for that. |
https://openjdk.java.net/jeps/306 forces all platforms to have consistent behavior, which is exactly what this proposal attains. The problem is that this then leads to real perf issues in certain workloads around graphics, machine learning, or other workloads that frequently deal with This is being handled by the introduction of APIs that specifically cover
This is generally a bad idea for several reasons, including that semantics of your program can no longer be determined simply by reading the code. Likewise, it would never be fine-grained enough to allow differentiation between methods that depend on consistent behavior vs those that are robust in the face of inconsistent behavior. |
Exactly, that's why JEP proposes consistent behavior over all platforms. Otherwise, strictfp causes fragmentation of math API: Regardless of the runtime behavior, method naming looks inconsistent across .NET: |
I am concerned about the fragmentation too but I haven't seen the full plan yet, so I don't believe I'm in a position to fully judge this yet. However, doing API design for quite some time now I've come to appreciate that extreme positions are only useful to illustrate a general design philosophy. In practice, the sweet spot is often to give most customers The One True Way of doing things while also having a more advanced API that has wrinkles for the corner cases or tight loops.
A lot of our naming convention is a result of considering how APIs are grouped in documentation and code completion. Suffixes work better because they collocate both APIs, thus making the developer aware that there is an unsafe (and thus presumably more "lightweight") alternative.
That is a good point and something we have done in other areas. However, to me this depends on how problematic the faster versions will be in practice. |
Issue
.NET currently depends on the underlying platform to perform
floating-point to integer
conversions. For most scenarios, this behavior is correct and IEEE 754 compliant.However, for cases of overflow or cases where the underlying platform does not directly support a given conversion, each platform can return different results.
This differing behavior can lead to downstream issues when dealing with such conversions and lead to hard to diagnose bugs.
Proposal
We should follow in the footsteps of other languages and standardize our conversion behavior here to be consistent on all platforms.
In particular, we should standardize our behavior to be consistent with ARM64, Rust, and WASM and to saturate the conversions rather than some other behavior such as using a sentinel value (x86/x64).
NaN
will likewise be specially handled and overflow to0
.What's Required
For ARM32 and ARM64, there will be effectively no impact and the codegen will remain identical to what it is today as the underlying conversions already perform saturation on overflow. The exception is for conversions to small integers where there is no direct underlying platform support and so manual clamping of the input to the appropriate range will be required before converting.
For x86 and x64, the platform only directly supports converting to int32 with additional support for converting to int64 on x64. Conversions to small integer types and unsigned types will require the same manual clamping as ARM32/ARM64. Additionally, since the underlying platform returns a sentinel value (
0x80000000
or0x80000000_00000000
) we will need to have logic to handle this.Perf Impact
For the conversions that need additional handling, there is going to be additional cost and measurable perf impact.
For a Skylake processor simple conversions would go from ~7 cycles to approx. between 19 and 24 cycles, or a theoretical "worst case" taking ~3.4x more time.
The measured impact by Rust was much lower and was closer to a 0.8x regression in real world scenarios involving heavy floating-point to integer conversions (in particular an "RBG JPEG encoding" algorithm).
Additional Considerations
We should additionally expose a set of unsafe APIs that perform the "platform specific" conversions. These would allow developers who understand the xplat differences and need the perf to still get the underlying behavior.
It is likely not worth waiting for additional asks on this as other platforms, such as Rust have already exposed such APIs on their end. Additionally, we will likely require them for our own code in the BCL, including for
Vector<T>
,Vector64<T>
,Vector128<T>
, andVector256<T>
. This will likewise allow the mentioned vector types to be consistent by default and provide fast fallbacks where appropriate.I would propose these are exposed as
ConvertTo*Unsafe
, matching the existingConvertTo*
algorithms we have in several other locations.This would result in the following additions (the commented out APIs already exist or are approved and will be implemented in .NET 7):
The text was updated successfully, but these errors were encountered: