API Proposal: Add System.Numerics.Half 16 bit floating point number conforming to IEEE 754:2008 binary16 #936

Closed
4creators opened this issue Dec 4, 2017 · 38 comments · Fixed by #37630
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Numerics
Milestone

Comments

@4creators
Contributor

4creators commented Dec 4, 2017

This proposal introduces the System.Half numeric structure conforming to the IEEE 754:2008 binary16 specification and defines it mainly as an interchange format used at the interface between managed code and external code capable of handling binary16 arithmetic operations. It can be extended to support full binary16 arithmetic once the required coreclr runtime support is available. Functionality could be extended further in the future via language support for new floating-point number literals.

Rationale and Proposed API

Changing computation requirements have led to the extension of the IEEE 754 floating-point standard with floating-point number sizes named binary16, binary32 (equivalent to float), binary64 (equivalent to double) and binary128. Further extensions are possible in 32-bit increments. Current computation workloads in AI, graphics, media and gaming take advantage of the binary16 format to simultaneously speed up calculations and keep data size small. A growing range of hardware supports binary16 (or similar) arithmetic: graphics cards from Nvidia and AMD, Arm processors, and Arm and Intel SIMD conversion instructions.

Adding System.Numerics.Half API should enable implementation and usage of F16C Intel conversion intrinsics as well as of several Arm intrinsics. In addition System.Numerics.Half can be used to support partially IEEE 754 conforming Arm floating-point 16 bit arithmetic.

The System.Numerics.Half format can represent normalized values, both positive and negative, with magnitudes in the range of 2^{-14} to 65504.

System.Numerics.Half API

System.Numerics.Half is a binary (base-2) floating-point number and its bit representation is as follows: (i) 1 bit represents the sign, (ii) 5 bits represent the exponent, and (iii) 11 bits (10 explicitly stored) represent the significand.
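As an illustration of that layout, here is a minimal sketch that splits a binary16 bit pattern into its three fields. It assumes a runtime where the type has shipped as System.Half and where BitConverter.HalfToUInt16Bits is available (.NET 6 or later); DecodeFields is a hypothetical helper, not part of the proposed API.

```csharp
using System;

// Hypothetical helper: split a Half into sign, biased exponent and stored fraction bits.
static (int Sign, int Exponent, int Fraction) DecodeFields(Half value)
{
    ushort bits = BitConverter.HalfToUInt16Bits(value);
    int sign = bits >> 15;               // 1 sign bit
    int exponent = (bits >> 10) & 0x1F;  // 5 exponent bits, stored with a bias of 15
    int fraction = bits & 0x3FF;         // 10 explicitly stored significand bits
    return (sign, exponent, fraction);
}

Console.WriteLine(DecodeFields((Half)1.0f));     // bit pattern 0x3C00
Console.WriteLine(DecodeFields(Half.MaxValue));  // bit pattern 0x7BFF, i.e. 65504
Console.WriteLine(DecodeFields((Half)(-2.0f)));  // bit pattern 0xC000
```

The 11th significand bit is implicit: it is 1 for normalized values and 0 for subnormals, which is why only 10 fraction bits appear in the pattern.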

namespace System.Numerics
{
    //
    // Summary:
    //     Represents a half-precision floating-point number.
    public struct Half : IComparable, IFormattable, IComparable<Half>, IEquatable<Half>, IConvertible, ISpanFormattable
    {
        public static readonly Half MinValue;

        public static readonly Half Epsilon;

        public static readonly Half MaxValue;

        public static readonly Half PositiveInfinity;

        public static readonly Half NegativeInfinity;

        public static readonly Half NaN;

        public static bool IsInfinity(Half h);
        
        public static bool IsFinite(Half value);

        public static bool IsNaN(Half h);

        public static bool IsNegativeInfinity(Half h);

        public static bool IsPositiveInfinity(Half h);

        public static bool IsNormal(Half h);

        public static bool IsSubnormal(Half h);

        public static bool IsNegative(Half h);

        public static Half Parse(string s);

        public static Half Parse(string s, NumberStyles style);

        public static Half Parse(string s, NumberStyles style, IFormatProvider provider);

        public static Half Parse(string s, IFormatProvider provider);
        
        public static Half Parse(ReadOnlySpan<char> s);
        
        public static Half Parse(ReadOnlySpan<char> s, NumberStyles style);
        
        public static Half Parse(ReadOnlySpan<char> s, IFormatProvider provider);
        
        public static Half Parse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider provider);

        public bool TryFormat(Span<char> destination, out int charsWritten, ReadOnlySpan<char> format, IFormatProvider provider);

        public static bool TryParse(string s, out Half result);

        public static bool TryParse(string s, NumberStyles style, IFormatProvider provider, out Half result);

        public static bool TryParse(ReadOnlySpan<char> s, out Half result);

        public static bool TryParse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider provider, out Half result);

        public int CompareTo(object value);

        public int CompareTo(Half value);

        public bool Equals(Half obj);

        public override bool Equals(object obj);

        public override int GetHashCode();

        public TypeCode GetTypeCode();

        public string ToString(IFormatProvider provider);

        public string ToString(string format);

        public string ToString(string format, IFormatProvider provider);

        public override string ToString();

        public static explicit operator Half(float value);

        public static explicit operator Half(double value);

        public static explicit operator float(Half value);

        public static explicit operator double(Half value);

        public static bool operator ==(Half left, Half right);

        public static bool operator !=(Half left, Half right);

        public static bool operator <(Half left, Half right);

        public static bool operator >(Half left, Half right);

        public static bool operator <=(Half left, Half right);

        public static bool operator >=(Half left, Half right);

/*
Operators not being considered for v1

        public static implicit operator Single(Half value);

        public static implicit operator Double(Half value);

        public static implicit operator Int32(Half value);

        public static implicit operator Int64(Half value);

        public static implicit operator Half(Byte value);

        public static implicit operator Half(SByte value);

        public static implicit operator Half(Int16 value);

        public static explicit operator Half(Single value);

        public static explicit operator Half(Double value);

        public static explicit operator Byte(Half value);

        public static explicit operator SByte(Half value);

        public static explicit operator Int16(Half value);

        public static explicit operator Half(Int32 value);

        public static explicit operator Half(Int64 value);
*/
    }
}
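A short usage sketch of this surface, assuming the type as it eventually shipped in the System namespace (.NET 5 or later). The variable names are illustrative only.

```csharp
using System;
using System.Globalization;

// Parse with an explicit culture so the decimal separator is unambiguous.
bool parsedOk = Half.TryParse("0.5", NumberStyles.Float, CultureInfo.InvariantCulture, out Half half);

// Only explicit conversions to and from float/double are defined, so casts are required.
float widened = (float)half;
Half narrowed = (Half)1.0e6;  // a double far above Half.MaxValue (65504)

Console.WriteLine(parsedOk);
Console.WriteLine(half.ToString(CultureInfo.InvariantCulture));
Console.WriteLine(widened);
Console.WriteLine(Half.IsPositiveInfinity(narrowed));
Console.WriteLine(half < Half.MaxValue);
```

Values that are too large for binary16 convert to infinity rather than throwing, which mirrors how a double-to-float cast behaves.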

Naming

Existing implementations use either Half or _FloatN (_Float16, _Float32) as the name of the type implementing IEEE 754 binary16. _FloatN is already defined in the ISO/IEC TS 18661-3:2015 C language standard. Since IEEE 754 specifies extended binary floating-point types, these were naturally adopted by the C2x ISO standard as _FloatNx, e.g. _Float64x. A similar naming convention could be useful in the .NET framework where, besides Single and Double, there could be IEEE 754:2008 binaryN- and binaryNx-conformant types: System.Numerics.Float8, System.Numerics.Float16, System.Numerics.Float32, System.Numerics.Float64, System.Numerics.Float128 etc., and the binaryNx equivalents System.Numerics.Float8Ext ... System.Numerics.Float64Ext, System.Numerics.Float128Ext.

This type of naming convention would be intuitively very familiar to programmers due to signed and unsigned integer naming and in addition similar naming convention could be used in the case of implementing decimal numbers covering wider range of bit sizes.

Language support

Although this part of the proposal belongs in the csharplang repository, it is included here as well for the sake of completeness.

Full support of new System.Numerics.Half numeric type would require introduction of new numeric literals allowing for type initialization in code. C#/VB/F# language support may be achieved with new numeric floating-point literal suffixes fxx for all floating-point type initialization:

Half h = 0.45e-11f16;
Single f = 0.45768e-35f32;
Double d = 0.45678342987e-57f64;

Such scheme would be future proof by allowing support of any size of floating point literal by adjusting fxx numeric value. Furthermore, it would allow to support extended precision binary floating-point literals just by adding e or x at the end of the suffix i.e. f64e or f64x.

It is not necessary to introduce a new language keyword, e.g. half, to provide language-level support for the new type; however, it could be beneficial. The timing of such support should depend on runtime support for arithmetic operations.

Arithmetic operations

It is possible to support arithmetic operations on System.Numerics.Half on all architectures by implicitly promoting it to System.Single, performing the calculation using native CPU support for System.Single, and converting the result back to System.Half. Alternatively, the partial support currently available on some Arm processors could be used. The Arm Half implementation is similar but does not conform to IEEE 754:2008, as it does not support infinities or NaNs. Instead, the range of exponents is extended, so that this format can represent normalized values in the range of 2^{-14} to 131008.
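The promote, compute and convert-back approach can be sketched as follows. Add and Multiply are hypothetical helpers written for illustration, and the sketch assumes the shipped System.Half type with its explicit float conversions.

```csharp
using System;

// Hypothetical helpers: widen both operands to float, compute, then round back to Half.
// The result is rounded to Half after every single operation.
static Half Add(Half left, Half right) => (Half)((float)left + (float)right);
static Half Multiply(Half left, Half right) => (Half)((float)left * (float)right);

Half a = (Half)1.5f;
Half b = (Half)2.25f;

Console.WriteLine(Add(a, b) == (Half)3.75f);
Console.WriteLine(Multiply(a, b) == (Half)3.375f);

// A sum beyond the representable range overflows to infinity when rounded back.
Console.WriteLine(Half.IsPositiveInfinity(Add(Half.MaxValue, Half.MaxValue)));
```

Rounding back after each operation, rather than once at the end of a longer expression, keeps the intermediate results in Half range and precision.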

Several vendors are in the process of implementing support for Half IEEE 754:2008 arithmetic in silicon, and full support for hardware-based arithmetic operations should be coming soon.

Updates

Converted const Half syntax to static readonly.

Added implicit conversion operators to Single and Double and explicit conversion operators from Single and Double and additional possible implicit integer conversions with matching reverse explicit conversions.

June 1, 2020: Removed all the implicit and explicit operators save for to/from float/double. The fact that the language may decide to add support for Half in the future means we need to be careful with the operators we add right now. I think the best approach here is to define just explicit operators for now. cc @tannergooding

Open Problems

Namespace

It is not entirely clear in which namespace Half should be defined: should it be System.Half or System.Numerics.Half? The C11 standard defines _FloatN (_Float16, _Float32) as supported basic types; since C# is in the C language family, it could follow C.

Naming

Referring to existing standardization efforts in C2x it could be desired to keep naming scheme similar and consistent with current naming scheme for integrals. Additional types introduced in IEEE 754:2008 and emergence of 8bit binary float usage may support using FloatNN naming scheme i.e. Float16, Float32, Float64 and in near future Float8 and Float128.

@tannergooding
Member

const Half will not be possible at this time (dotnet/csharplang#688 is one such language proposal that could support it).

I'm not sure that IConvertible should be implemented.

The rest of the API surface looks feasible to be implemented in software without too much overhead.

There should likely be an implicit conversion to System.Single and an explicit conversion from System.Single (possibly System.Double as well) so that users can create the types (this can also be software based without too much overhead).

It might be worthwhile putting this in the System.Numerics namespace (it does need to be in mscorlib, so that System.Runtime.Intrinsics can use it as an interchange type).

@4creators
Contributor Author

@tannergooding Updated proposal with const Half correction and implicit/explicit operators. Due to the range of values represented by System.Half it should be possible to have implicit conversion operators to Int32, Int64, BigInteger and from Int16, SByte, Byte. Conversion from Int16 would require rounding.

@jkotas
Member

jkotas commented Dec 5, 2017

@4creators
Contributor Author

cc @eerhardt @ViktorHofer @tarekgh
Pinging since I don't know which area it should be assigned to: System namespace or System.Numerics

@tannergooding
Member

I think the namespace it is in will likely come up in API review and get decided there (unless there is a clear consensus otherwise).

@eerhardt
Member

eerhardt commented Dec 7, 2017

I think it is fine being in the System.Numerics label on GitHub, but my opinion would be that the type belongs in the System namespace to be beside System.Single and System.Double.

I'm not familiar with how to add intrinsic types, would the runtime need to be updated? Or could this solely be done in System.Private.CoreLib?

@4creators
Contributor Author

Adding any intrinsics that use processor instructions with Half support would require runtime changes. This is ongoing work which adds Intel and Arm hardware intrinsics. Some of the instructions using 16-bit floating-point numbers cannot currently be supported due to the missing System.Half. Adding this API should unblock this work.

@eerhardt
Member

eerhardt commented Dec 7, 2017

I meant "would the runtime need to be updated strictly for adding the type?"- outside the hardware intrinsics portion.

For example, if you look at the IL for System.Double:

.class public sequential ansi sealed serializable beforefieldinit System.Double
    extends System.ValueType
    implements System.IComparable,
               System.IFormattable,
               System.IConvertible,
               class System.IComparable`1<float64>,
               class System.IEquatable`1<float64>
{
    .custom instance void System.Runtime.CompilerServices.TypeForwardedFromAttribute::.ctor(string) = (
        01 00 4b 6d 73 63 6f 72 6c 69 62 2c 20 56 65 72
        73 69 6f 6e 3d 34 2e 30 2e 30 2e 30 2c 20 43 75
        6c 74 75 72 65 3d 6e 65 75 74 72 61 6c 2c 20 50
        75 62 6c 69 63 4b 65 79 54 6f 6b 65 6e 3d 62 37
        37 61 35 63 35 36 31 39 33 34 65 30 38 39 00 00
    )
    .field private float64 m_value

A new IL type would need to be added - float16 - right?

@fiigii
Contributor

fiigii commented Dec 7, 2017

@eerhardt Another solution is to use [Intrinsic] attribute to mark System.Half, then the runtime/compiler can specially treat this type. This solution may be simpler and easier because we do not need to change IL and System.Half is just a struct.

See dotnet/coreclr#15340

@4creators
Contributor Author

4creators commented Dec 7, 2017

@eerhardt

I would imagine it could be created in such way:

.class public sequential ansi sealed serializable beforefieldinit System.Half
    extends System.ValueType
    implements System.IComparable,
               System.IFormattable,
               System.IConvertible,
               class System.IComparable`1<valuetype System.Half>,
               class System.IEquatable`1<valuetype System.Half>
{
    .custom instance void System.Runtime.CompilerServices.TypeForwardedFromAttribute::.ctor(string) = (
    )
    .field private unsigned int16 m_value

@tannergooding
Member

Yeah. I don't think modifying the IL to support a new type is a good idea. System.Half should likely just be a normal struct and use .field private int16 m_value as its backing field.

I think, overall, it should be fairly opaque and just have the basic operations that can be easily implemented in software (equality, compare, isfinite, etc). The other operations (such as the arithmetic operations, etc) can be fairly expensive to emulate and should likely not be included.

Instead, we should instruct users to cast to System.Single, perform the operations, and cast back (which is all the hardware support x86 gives under F16C anyways).

This basically means that System.Half is an interchange type to support interoping with hardware that supports it (which might be one reason for putting it in System.Numerics).

If full hardware support ends up getting implemented sometime in the future, it might be worth revisiting what operations are exposed (but I would suggest we also expose an IsHardwareAccelerated property at that time).

@4creators
Contributor Author

Hmm, perhaps it would be necessary to add the IL modifiers .size 2 and .pack 2.

@fiigii
Contributor

fiigii commented Dec 7, 2017

which might be one reason for putting it in System.Numerics

Agree.

@4creators
Contributor Author

The other operations (such as the arithmetic operations, etc) can be fairly expensive to emulate and should likely not be included.

Some Half implementations implicitly promote it to Single, do all calculations using the format and after return result as Half.
see: Half-precision floating point library http://half.sourceforge.net/

@4creators
Contributor Author

This basically means that System.Half is an interchange type to support interoping with hardware that supports it

Not for long. Arm already supports Half scalar arithmetic which is similar but not conforming to IEEE 754:2008 as it does not support infinities or NaNs. Instead, the range of exponents is extended, so that this format can represent normalized values in the range of 2^{-14} to 131008.

@tannergooding
Member

tannergooding commented Dec 7, 2017

Some Half implementations implicitly promote it to Single, do all calculations using the format and after return result as Half.

Those implementations might not be considered IEEE compliant (I don't remember the exact spec wording on this) as they can return a different result.

IIRC the spec dictates that each operation needs to be (effectively) computed to infinite precision and then rounded to the appropriate precision.

So:

Half a = Half.MaxValue;
Half b = Half.MaxValue;

Half c1 = (Half)((Single)a + (Single)b - (Single)a); // c1 = Half.MaxValue <-- I don't think this is compliant, as it does two operations and then rounds

Half c2 = (Half)((Single)a + (Single)b); // c2 = Half.PositiveInfinity
c2 = (Half)((Single)c2 - (Single)a); // c2 = Half.PositiveInfinity <-- I think this is compliant, as it rounds after each operation

Edit: Section 5.1 of the spec basically dictates it as I indicated. Each individual operation effectively computes an intermediate result to infinite precision and then rounds the intermediate result to the target precision. Doing multiple operations (unless there is a specific/explicit operation, such as fma, that indicates otherwise) and then rounding at the end is "wrong".

@4creators
Contributor Author

4creators commented Dec 7, 2017

Those implementations might not be considered IEEE compliant

Agree, they look a lot like FMA calculations. However, GCC introduced Half to libc in two modes: (i) IEEE compliant, (ii) Arm implementation compliant. IMO opinion most ppl using Half currently for AI or media/graphics/games will be more interested in getting fast calculations and saving memory than being strictly IEEE 754 compliant. Compliance may be most important in other scenarios where I do not think Half is very appealing.

@tannergooding
Member

IMO opinion most ppl using Half currently for AI or media/graphics/games will be more interested in getting fast calculations and saving memory than being strictly IEEE 754 compliant.

I think that is a separate discussion and should be separate from the initial proposal to add the type.

I believe the initial proposal should add an opaque interchange type with a minimum set of operations. The minimum set should basically be those listed in the OP, less the IConvertible implementation.

Additional operations can be discussed separately and added later, if required, as they likely warrant some further discussion (on implementation, exposure, fast vs compliant modes, etc).

@4creators
Contributor Author

I believe the initial proposal should add an opaque interchange type with a minimum set of operations.

Agree that's the best way to proceed.

@4creators
Contributor Author

@eerhardt Can we mark this API as ready for review, or should I change it before it is ready for review?

@eerhardt
Member

Did we decide on the namespace we are proposing? I thought that decision was still up in the air.

@tannergooding
Member

I think it's still up. I believe System.Numerics is more appropriate (it isn't a primitive type), but the Api review will end up discussing this in either case.

I do think the OP could be updated to reflect the alternate name space suggested and to remove the IConvertible implementation.

@4creators
Contributor Author

@eerhardt @tannergooding OP has been updated as suggested.

@tannergooding
Member

@eerhardt, this looks reasonable to me for the API proposal.

It should be everything required for it to be an opaque interchange type (base structure + conversion operations) and none of it should be overly complicated to implement in software.

@eerhardt
Member

Marking the API as ready for review.

@4creators 4creators changed the title API Proposal: Add System.Half 16 bit floating point number conforming to IEEE 754:2008 binary16 API Proposal: Add System.Numerics.Half 16 bit floating point number conforming to IEEE 754:2008 binary16 Jan 26, 2018
@eerhardt
Member

eerhardt commented Feb 6, 2018

Putting this in the Future milestone since the API is not required for the .NET Core 2.1 release.

@justinormont

Have we considered supporting bfloat16 in addition to IEEE float16?

Intel is adding hardware support for Google's bfloat16 format, we may want to follow them. They claim speed advantages due to reduced bus bandwidth with minor differences in accuracy for the machine learning process.

The bfloat16 is a truncated float32 which maintains the dynamic range of the float32 (~1e-38 to ~3e38) while reducing the significand's precision.

The other large gain is to avoid the overflow/underflow that a standard float16 would hit due to its limited range (~6e-8 to 65504). In an IEEE float16, a vanishing/exploding gradient can very easily hit the ~6e-8/66k limit of a float16. This otherwise is rather tricky for a normal coder to avoid in the float16, but easily handled in the bfloat16's larger range. Tensorflow's ops normally use bfloat16 in mixed precision by doing multiplies in bfloat16 and summation into float32.

On the differences of float16 & bfloat16, Google says, "One thing that we find with TensorFlow is that if you're training w/ half-precision IEEE like this you really have to pay attention and you need to get an expert involved in a model-by-model basis to make sure all of your numbers stay in the right range and don't overflow and those techniques are hard to apply in general across a wide range of models."

Conversion to/from float32 & bfloat16 is quite trivial as seen in Tensorflow's bfloat16 code. Until hardware support catches up, where we can fully utilize bfloat16, we can still make use of the format to reduce memory use and get the related bus transfer speed, and usability gains. When we need to operate on the bfloat16 numbers, we'll have to incur the small but non-zero cost of converting them back into IEEE 754 single-precision 32-bit floats until the hardware support arrives.
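To show why the conversion is cheap, here is a minimal sketch that keeps only the upper 16 bits of a float32. This is not TensorFlow's code: FloatToBFloat16Bits and BFloat16BitsToFloat are hypothetical helpers, and the sketch truncates rather than rounds and does not special-case NaN payloads.

```csharp
using System;

// Hypothetical helper: keep the sign bit, all 8 exponent bits and the top 7 significand bits.
static ushort FloatToBFloat16Bits(float value)
{
    uint bits = unchecked((uint)BitConverter.SingleToInt32Bits(value));
    return (ushort)(bits >> 16);  // truncation: the low 16 significand bits are dropped
}

// Hypothetical helper: widen back by padding the dropped bits with zeros.
static float BFloat16BitsToFloat(ushort bits)
{
    return BitConverter.Int32BitsToSingle(unchecked((int)((uint)bits << 16)));
}

float original = 3.14159f;
ushort packed = FloatToBFloat16Bits(original);
float restored = BFloat16BitsToFloat(packed);

Console.WriteLine($"0x{packed:X4}");
Console.WriteLine(restored);

// The float32 exponent range is preserved, so very large values stay finite.
Console.WriteLine(float.IsFinite(BFloat16BitsToFloat(FloatToBFloat16Bits(float.MaxValue))));
```

A production conversion would more likely round to nearest instead of truncating, at the cost of a few extra integer operations.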

The main argument against supporting bfloat16 is the lack of current hardware support: only Google's TPUs and various upcoming Intel processors support bfloat16, whereas IEEE float16 has wide support in GPUs and CPUs (Haswell and beyond).

@tannergooding
Member

@justinormont, it has come up before but there hasn't been a formal proposal yet.

I think that, eventually, both will need to be supported. Each has their own benefits/drawbacks and has use in various fields. For example:

  • float16 has reduced range
    • This means users need to be more careful about overflow/underflow
  • bfloat16 has reduced accuracy (2-4 digits)
    • This means that users need to be careful about the gap between values

As you mentioned, bfloat16 doesn't currently have any hardware support. The good news is, that like float16, it is fairly trivial to support a number of basic operations (such as conversion, etc) on these types in software.

I would imagine that, when/if hardware support becomes available, the respective hardware intrinsic API would also go up for review.

@tannergooding
Member

As a comparison of the two data formats, you can look at the following program and its output:
ComparisonProgram.zip

You'll notice that, on bfloat16 the reduced precision causes numbers above 128 to drop all decimal digits and numbers above 256 can't even be represented accurately as whole numbers (we start jumping by 2 and the largest gap between numbers is ~1.33×10^36).

While on float16, you'll notice that we have much greater precision but the maximum value we can represent is 65504 (float16 doesn't start skipping whole numbers until 2048 and the biggest gap between numbers is 32).
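The float16 figures can be checked with a small sketch along these lines. It is not the attached comparison program. It assumes the shipped System.Half type plus BitConverter.HalfToUInt16Bits and BitConverter.UInt16BitsToHalf (.NET 6 or later), and the variable names are illustrative.

```csharp
using System;

// Whole numbers up to 2048 survive a round trip through Half...
float below = (float)(Half)2047f;
// ...but 2049 needs 12 significant bits, so it cannot be stored exactly.
float above = (float)(Half)2049f;

// The largest finite gap lies between MaxValue and the next lower bit pattern.
ushort maxBits = BitConverter.HalfToUInt16Bits(Half.MaxValue);
Half predecessor = BitConverter.UInt16BitsToHalf((ushort)(maxBits - 1));
float largestGap = (float)Half.MaxValue - (float)predecessor;

Console.WriteLine(below == 2047f);
Console.WriteLine(above == 2049f);
Console.WriteLine(largestGap);
```

The first comparison should hold and the second should not, and the reported gap should be 32, matching the numbers quoted above.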

@danmoseley
Member

@pgovind will drive this along

@pgovind
Contributor

pgovind commented Jun 2, 2020

Do we want 2 Half constructors that take in float and double? I already updated the API proposal to define explicit conversion operators.

public Half(float value)
public Half(double value)

@tannergooding
Member

I don't believe so, the other "primitive" types don't have equivalents and I don't think we'd want to add them

You would just do Half h = (Half)floatValue like you would do with double to float (float f = (float)doubleValue)

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review labels Jun 4, 2020
@terrajobst
Member

terrajobst commented Jun 4, 2020

Video

  • Type looks good as proposed
  • But we decided to put it in System to align with the other primitives
namespace System
{
    public readonly struct Half : IComparable, IFormattable, IComparable<Half>, IEquatable<Half>, IConvertible, ISpanFormattable
    {
        public static readonly Half MinValue;
        public static readonly Half Epsilon;
        public static readonly Half MaxValue;
        public static readonly Half PositiveInfinity;
        public static readonly Half NegativeInfinity;
        public static readonly Half NaN;
        public static bool IsInfinity(Half h);
        public static bool IsFinite(Half value);
        public static bool IsNaN(Half h);
        public static bool IsNegativeInfinity(Half h);
        public static bool IsPositiveInfinity(Half h);
        public static bool IsNormal(Half h);
        public static bool IsSubnormal(Half h);
        public static bool IsNegative(Half h);
        public static Half Parse(string s);
        public static Half Parse(string s, NumberStyles style);
        public static Half Parse(string s, NumberStyles style, IFormatProvider provider);
        public static Half Parse(string s, IFormatProvider provider);
        public static Half Parse(ReadOnlySpan<char> s);
        public static Half Parse(ReadOnlySpan<char> s, NumberStyles style);
        public static Half Parse(ReadOnlySpan<char> s, IFormatProvider provider);
        public static Half Parse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider provider);
        public bool TryFormat(Span<char> destination, out int charsWritten, ReadOnlySpan<char> format, IFormatProvider provider);
        public static bool TryParse(string s, out Half result);
        public static bool TryParse(string s, NumberStyles style, IFormatProvider provider, out Half result);
        public static bool TryParse(ReadOnlySpan<char> s, out Half result);
        public static bool TryParse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider provider, out Half result);
        public int CompareTo(object value);
        public int CompareTo(Half value);
        public bool Equals(Half obj);
        public override bool Equals(object obj);
        public override int GetHashCode();
        public TypeCode GetTypeCode();
        public string ToString(IFormatProvider provider);
        public string ToString(string format);
        public string ToString(string format, IFormatProvider provider);
        public override string ToString();
        public static explicit operator Half(float value);
        public static explicit operator Half(double value);
        public static explicit operator float(Half value);
        public static explicit operator double(Half value);
        public static bool operator ==(Half left, Half right);
        public static bool operator !=(Half left, Half right);
        public static bool operator <(Half left, Half right);
        public static bool operator >(Half left, Half right);
        public static bool operator <=(Half left, Half right);
        public static bool operator >=(Half left, Half right);
    }
}

jkotas pushed a commit that referenced this issue Jun 25, 2020
pgovind pushed a commit to pgovind/runtime that referenced this issue Jun 25, 2020
kevinwkt pushed a commit to kevinwkt/runtimelab that referenced this issue Jul 15, 2020
@carlreinke
Contributor

System.Half in master doesn't implement IConvertible. Did that get dropped intentionally?

Will System.TypeCode get a new member for Half?

@tannergooding
Member

IConvertible was dropped intentionally. We can't readily expand the IConvertible interface and the overall support that IConvertible provides for non-primitive types is fairly limited.

.NET 6 will take a closer look at where in the framework System.Half support should be added (including System.Convert).

@ghost ghost locked as resolved and limited conversation to collaborators Dec 19, 2020
trylek pushed a commit to smx-smx/runtime that referenced this issue May 17, 2021