Add generic Math methods for numeric types #18244

Daniel-Svensson · 2016-08-20T17:24:13Z

Overview

With the addition of System.Numerics.Vector it suddenly became possible to write some generic algorithms that worked for most primitive numeric datatypes. However as soon as you need to work with single elements you are out of luck.

I propose a new generic API for scalar datatypes based upon the ideas behind System.Numerics.Vector.

Just as for vectors many of the genereic methods could be generated using .tt files such as in https://github.com/dotnet/corefx/blob/master/src/System.Numerics.Vectors/src/System/Numerics/Vector.tt

API

This is an example of how the API could look like.
All methods below are taken from the Vector class (https://msdn.microsoft.com/en-us/library/system.numerics.vector(v=vs.111).aspx).

Additional methods such as those found in System.Math (https://msdn.microsoft.com/en-us/library/system.math.aspx) would be interesting as well.
It could also be interesting to add non-generic methods such as float Round(float x, int digits); as well

All functionlity could be provided in a new nuget package in order to allow out of band releases without conflicts with the current Math type.

public static class Scalar
{
    public static T Abs<T>(T left, T right) where T : struct
    public static T Add<T>(T left, T right) where T : struct
    public static T AndNot<T>(T left, T right) where T : struct
    public static T BitwiseAnd<T>(T left, T right) where T : struct
    public static T BitwiseOr<T>(T left, T right) where T : struct
    public static T Divide<T>(T left, T right) where T : struct
    public static T Max<T>(T left, T right) where T : struct
    public static T Min<T>(T left, T right) where T : struct
    public static T Multiply<T>(T left, T right) where T : struct
    public static T Negate<T>(T left, T right) where T : struct
    public static T OnesComplement<T>(T left, T right) where T : struct
    public static T SquareRoot<T>(T left, T right) where T : struct
    public static T Subtract<T>(T left, T right) where T : struct
    public static T Xor<T>(T left, T right) where T : struct

    // Useful methods part of Vector<T>
    public static T Zero<T>()  where T : struct
    public static T One<T>()  where T : struct
}

The generic requirements for some methods might be changed a bit to allow additional types, such as using providing extra overload such as those suggested in https://github.com/dotnet/corefx/issues/1583.

Comparison and equality checking was not included since IComparable<> and IEquatable<> are implemented for the types already, but if it is considered a good idea to be as similar as possible to Vector then the following methods should also be considered

    public static bool GreaterThan<T>(T left, T right) where T : struct
    public static bool GreaterThanOrEqual<T>(T left, T right) where T : struct
    public static bool LessThan<T>(T left, T right) where T : struct
    public static bool LessThanOrEqual<T>(T left, T right) where T : struct

Types to support

I propose that all the primitive numeric types including decimal are supported (the same types as Vector<> support but with the addition of decimal). Invoking any method with any other type should result in an NotSupportedException()

Byte
SByte
UInt16
Int16
UInt32
Int32
UInt64
Int64
Single
Double
Decimal (The only difference from Vector)

Related issues:

I found a few issues regarding similar ideas

New API for single-precision math #14353 is about adding new Math functionality which could be added to the "Scalar" type
- With corresponding PR in Adding single-precision math functions. coreclr#5492
Add generic overloads to Math.Min and Math.Max #14510 - Add generic overloads to Math.Min and Math.Max
Add a Clamp method to System.Math #13994 - add Clamp method

The text was updated successfully, but these errors were encountered:

mellinoe · 2016-12-07T00:02:26Z

I like the concept, but I am worried that we will get really bad performance out of this. Vector<T> has a functional IL implementation using this pattern, but it is most certainly not fast. Part of that is because of the odd generic code style (inherited here), but it has other problems, like massive structure size, massive function length, etc. that aren't necessarily inherited here. Those generally result in degenerate code-gen.

I think it would be difficult to move forward on this until we were able to understand what it would take to make it high-performance. Is it possible to have a fully-portable binary with this stuff in it, or would we need special handling in the JIT (like Vector<T>)? If the former is possible, it would be good to have a minimal prototype already implemented and available to test.

As an aside, I use a very crude version of this pattern in the System.Numerics.Vectors tests themselves: https://github.com/dotnet/corefx/blob/master/src/System.Numerics.Vectors/tests/Util.cs#L109. It relies on dynamic.

Daniel-Svensson · 2016-12-11T21:15:48Z

@mellinoe what would be acceptable performance for code such as this?
I have compared 3 different implementations of which 2 makes sense and could be used to make a small prototype with some of the basic functions.

My first usage scenario was as a complement to Vector in order to process the last few elements or to "add the finishing" to algorithms, but then it would be very beneficiary to be able to use this for more general calculations or places where you don't need vector.

Ideally I think this is one area where it makes sense to improve the JIT. But I haven't added any feature request in CoreCLR since I wanted feedback on this idea.
I am unsure if the best approach is to add instrincts or to allowing the generic approach used by vector to be inlined (which maybe could help with other cases to), but in either case even Vector should be available with decent performance even without hardware acceleration.

Implementation Strategies

1. "IL"

There is no way to write genereic methods using arithmetics or primitives in C# (that I know of at least), but it is still valid IL (at least sort of).
An example of a method which does standard addition (no overflow checking) for builtin primititves

.method public hidebysig static !!T AddGeneric<valuetype T> (
            !!T '',
            !!T ''
        ) cil managed 
    {
        IL_0000: ldarg.0
        IL_0001: ldarg.1
        IL_0002: add
        IL_0003: ret
    }

The downside of this is that the JIT will raise a ExecutionEngineException when trying to run this code.
Since this code will almost certainly be inliened, using it with an unsupported type will raise the execption
when the calling method is first invoked (JITted).

I have tried multiple approaches of adding some kind of error checking in order to allow a nicer error to be raised,
but havent found a good solution so far. It seems like Vector will raise TypeInitializationException when trying
to create one instance with an unsupported T such as deciamal.

Pros:

As fast as possible
Jit will understand add method and perform constant propagation
That is Add(1,3) will just evaluate to at jit time 4, no add instruction will be emitted

Cons:

ExecutionEngineException if used with wrong type
Cannot only handle types supported by "add" opcode and not other types such as decimal
How to easily embed this with C# code ?

2. Handle all types explicilty (as done for vector)

Example code for add:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static T Add<T>(T lhs, T rhs)
    where T : struct
{
    if (typeof(T) == typeof(float))
    {
        return (T)(object)(((float)(object)lhs) + ((float)(object)rhs));
    }
    if (typeof(T) == typeof(int))
    {
        return (T)(object)(((int)(object)lhs) + ((int)(object)rhs));
    }
    ....
    throw new NotSupportedException();
}

Pros:

Quite fast
- Jit will optimize method to eliminate all comparisons at jit time
Can choose which types to handle (including deciamal)
Can choose exception

Cons:

Will not be inlined by current jit
No constant propagation
Much code and not to easy to read even if it is repetitive

3. Using Dynamic

The approach as used in https://github.com/dotnet/corefx/blob/master/src/System.Numerics.Vectors/tests/Util.cs#L109

Performance comparisons

Benchmark 1: Computing sum of 100 000 (float) array elements

A simple benchmark summing 100000 random floats that are located in an array.

BenchmarkDotNet=v0.10.1, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-2500K CPU 3.30GHz, ProcessorCount=4
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0
  Job-FPOZGJ : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0

Jit=RyuJit  Platform=X64  IterationTime=1.0000 s  
LaunchCount=1  TargetCount=3  WarmupCount=3

Method	Mean	StdDev	Max	Scaled	Scaled-StdDev	Gen 0	Allocated
ForLoopSumImpl	71.7315 us	0.2031 us	71.8703 us	1.00	0.00	-	0 B
GenericHelperClass	239.2871 us	0.5135 us	239.6566 us	3.34	0.01	-	0 B
GenericILMethod	71.9311 us	0.0304 us	71.9632 us	1.00	0.00	-	0 B
Generic_Sum_Dynamic	1,497.0433 us	4.9572 us	1,500.9469 us	20.87	0.07	1469.7421	4.8 MB

FoorLoopSumImpl

The baseline to comapre against

float sum = 0.0;
for (int i = 0; i < _values.Length; ++i)
    sum = sum + _values[i];
return sum;

GenericHelperClass

This is implementation strategy 2.

GenericILMethod

This is implementation strategy 1 where the IL defined method is located in another assembly.

Generic_Sum_Dynamic

Implementation strategy 3.
This approach takes much more time and generates a number of allocations.

Benchmark 2: Computing the sum of all integers from 1 .. 10 000

In this test the array accesses and corresponding memory load is taken out away from the equation.
We still have the for loop, but this should be able to measure a more realistic impact in more number
intensive applications.

BenchmarkDotNet=v0.10.1, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-2500K CPU 3.30GHz, ProcessorCount=4
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0
  Job-FPOZGJ : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0

Jit=RyuJit  Platform=X64  IterationTime=1.0000 s  
LaunchCount=1  TargetCount=3  WarmupCount=3

Method	Mean	StdDev	Max	Scaled	Scaled-StdDev	Gen 0	Allocated
ForLoopSumImpl	4.2266 us	0.0018 us	4.2283 us	1.00	0.00	-	0 B
GenericILMethod	4.2287 us	0.0008 us	4.2291 us	1.00	0.00	-	0 B
GenericHelperClass	19.1062 us	0.0351 us	19.1468 us	4.52	0.01	-	0 B
DynamicHelperC	143.2929 us	0.1961 us	143.4865 us	33.90	0.04	146.9374	480 kB

FoorLoopSumImpl

The baseline to compare against

int sum = 0.0;
for (int i = 0; i < 10000; ++i)
    sum = sum + i;
return sum;

GenericILMethod

Using the Add method from strategy 1.

GenericHelperClass

This is implementation strategy 2.

DynamicHelperC

This is implementation strategy 3.
This is exactly 1/10th the number of additions and also one tenth the number of allocations.

Benchmark summary

Strategy 1 (IL based method)
yields the same runtime performance (but there is more work for jitter) as when using "+" directly.
Strategy 2 (multiple if method)
results in a small overhead of 3-3.5 times for these simple examples.
Strategy 3 (dynamic)
results in a much larger overhead of 20-34 times the speed and with addition allocations as a results.

danmoseley · 2016-12-16T21:06:10Z

@mellinoe for the question above

mellinoe · 2016-12-16T21:45:09Z

@Daniel-Svensson Wow, that was extremely thorough! Thanks for the attention to detail. Do you have the code available anywhere for reference? It sounds like you may only have the Add operation implemented, but it should be easy to "auto-generate" the rest of the code using some text templating (I use T4 templates for Vector<T>).

I like this proposal overall, and based on the testing, performance numbers are not catastrophically bad (~3-5x). I think that using the Generic IL implementation is not feasible, even if it is super fast. It's not 100% clear to me whether it's actually valid IL in the first place, or if our JIT just allows it. The "multiple-if" method seems to be the way forward.

A few more questions/points come to mind:

Is this the full set of operations that are desirable? Anything on the border that might be excluded or included?
I think it makes sense to include the comparison operators, if nothing else but for consistency. Also, it seems like you might not have a T : IComparable constraint on your function. Even if you did, it would involve boxing to call the interface method.
Are there any other libraries out there tackling this sort of problem? How do they express these concepts?
Is the need for this widespread enough and useful enough to be directly included in the BCL, or can it be a separate extension library for those who want it?

Daniel-Svensson · 2016-12-27T21:02:13Z

I've cleaned up some of the examples and uploaded them to https://github.com/Daniel-Svensson/ClrExperiments

Is this the full set of operations that are desirable? Anything on the border that might be excluded or included?

I would not call the list complete at the moment.
Before calling it complete the API provided by Math (such as Sign etc), (MathF se link in description), Vector and Vector{T} should be considered to see if there are other methods which makes senes and could be useful.

I think it makes sense to include the comparison operators, if nothing else but for consistency. Also, it seems like you might not have a T : IComparable constraint on your function. Even if you did, it would involve boxing to call the interface method.

Including the constraints should not force boxing to occur, the runtime is pretty god at de-virtualization in order to avoid boxing when the type is known as is the case here. But it should certainly be considered if it is desired to require the calling code to require IComparable{T} in the generic constraints or not.

Are there any other libraries out there tackling this sort of problem? How do they express these concepts?

I've tried to find such a library before I started with these experiments. But I did not find any alternative for C# / VB.Net. F# in itself allows these kind of generic functions and algorithms to be written, but I do not know of any good way of writing them in F# and exposing them to other languages except of providing a facade with overloads for all types.

dcwuser · 2017-02-07T00:33:25Z

Two main observations: (1) The type signature issues here are complex. (2) This doesn't seem to me to add much practical value.

By the way, Abs takes one argument, not two.

First: The point of having methods like this is presumably to be able to build sophisticated algorithms on top of them which will then automatically work for all the built-in numeric types. Let's start with looks to me like the simplest algorithm one might think to build for arbitrary numeric types: averaging. To average, we need to add a bunch of Ts; no problem there. Then we need to divide, not by a T but by an integer. Well, if our values are doubles or decimals, that's probably not a problem, since integer can be cast to those types. But if our values are integers, we don't want to do integer division, we actually want to do double division and produce a double. (Or maybe sometimes a decimal?) To deal with this, we will need to start adding some mixed-type overloads and maybe name variants to specify different conventions.

By the way, what should Negate of an unsigned type produce? Maybe the corresponding signed type, but then the output T is different from the input T and what do we do if the input value is too big to negate? Maybe we just throw an exception, but then any algorithm using Negate is automatically useless for ~50% of our built-in classes.

Second: Suppose we added variants to cover all the ambiguities and were able to write something really sophisticated and useful on top of this, like say a matrix inversion algorithm. (That would be really impressive, considering that the inverse of an integer matrix is not an integer matrix, but still imagine we did.) How useful would that be to people who care about matrix inversion? As one of those people, I would say "not so much". The first thing we want is certainly to invert a matrix of doubles, but the next thing isn't to invert a matrix of singles or some other built-in-type, but rather to invert a complex matrix or a matrix of some arbitrary-precision type, in any case a matrix of some new type that goes beyond the capabilities of the built-ins, and such a case wouldn't be covered by these methods.

tannergooding · 2020-03-04T12:59:59Z

@agocke, @jaredpar.

Would this be covered or more readily implementable with something like the shapes proposal for C#?

TylerBrinkley · 2020-03-04T13:47:25Z

I apologize for the shameless plug, but if anyone is looking for this sort of functionality currently you may want to consider my library Genumerics.

danmoseley · 2020-03-04T17:13:43Z

@TylerBrinkley I think plugging relevant libraries is always welcome..!

tannergooding · 2021-04-29T20:45:20Z

Closing this. As mentioned in other issues (such as #50647 or #25058) we are looking into the correct surface area to expose here as part of the static abstracts in interfaces feature the language and runtime are adding.

This is currently being tracked in our dotnet/designs repo and I've added the initial rough draft of the surface area here: dotnet/designs#205

Further discussion or suggestions should happen there and in future PRs that further refine the proposed surface area.

stephentoub assigned mellinoe Sep 1, 2016

mellinoe removed their assignment Dec 7, 2016

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 5.0 milestone Jan 31, 2020

maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020

jnm2 mentioned this issue May 6, 2020

[Proposal] Add interface hierarchy to primitive types dotnet/csharplang#3423

Closed

tannergooding modified the milestones: 5.0.0, Future Jun 23, 2020

tannergooding removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2020

hansmbakker mentioned this issue Apr 29, 2021

Add support System.Numerics.Vectors types with double precision #24168

Open

tannergooding closed this as completed Apr 29, 2021

ghost locked as resolved and limited conversation to collaborators May 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add generic Math methods for numeric types #18244

Add generic Math methods for numeric types #18244

Daniel-Svensson commented Aug 20, 2016

mellinoe commented Dec 7, 2016

Daniel-Svensson commented Dec 11, 2016 •

edited

Loading

danmoseley commented Dec 16, 2016

mellinoe commented Dec 16, 2016

Daniel-Svensson commented Dec 27, 2016

dcwuser commented Feb 7, 2017

tannergooding commented Mar 4, 2020

TylerBrinkley commented Mar 4, 2020

danmoseley commented Mar 4, 2020

tannergooding commented Apr 29, 2021

Add generic Math methods for numeric types #18244

Add generic Math methods for numeric types #18244

Comments

Daniel-Svensson commented Aug 20, 2016

Overview

API

Types to support

Related issues:

mellinoe commented Dec 7, 2016

Daniel-Svensson commented Dec 11, 2016 • edited Loading

Implementation Strategies

1. "IL"

2. Handle all types explicilty (as done for vector)

3. Using Dynamic

Performance comparisons

Benchmark 1: Computing sum of 100 000 (float) array elements

Benchmark 2: Computing the sum of all integers from 1 .. 10 000

Benchmark summary

danmoseley commented Dec 16, 2016

mellinoe commented Dec 16, 2016

Daniel-Svensson commented Dec 27, 2016

dcwuser commented Feb 7, 2017

tannergooding commented Mar 4, 2020

TylerBrinkley commented Mar 4, 2020

danmoseley commented Mar 4, 2020

tannergooding commented Apr 29, 2021

Daniel-Svensson commented Dec 11, 2016 •

edited

Loading