Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add generic Math methods for numeric types #18244

Closed
Daniel-Svensson opened this issue Aug 20, 2016 · 10 comments
Closed

Add generic Math methods for numeric types #18244

Daniel-Svensson opened this issue Aug 20, 2016 · 10 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Numerics
Milestone

Comments

@Daniel-Svensson
Copy link
Contributor

Overview

With the addition of System.Numerics.Vector it suddenly became possible to write some generic algorithms that worked for most primitive numeric datatypes. However as soon as you need to work with single elements you are out of luck.

I propose a new generic API for scalar datatypes based upon the ideas behind System.Numerics.Vector.

Just as for vectors many of the genereic methods could be generated using .tt files such as in https://github.com/dotnet/corefx/blob/master/src/System.Numerics.Vectors/src/System/Numerics/Vector.tt

API

This is an example of how the API could look like.
All methods below are taken from the Vector class (https://msdn.microsoft.com/en-us/library/system.numerics.vector(v=vs.111).aspx).

Additional methods such as those found in System.Math (https://msdn.microsoft.com/en-us/library/system.math.aspx) would be interesting as well.
It could also be interesting to add non-generic methods such as float Round(float x, int digits); as well

All functionlity could be provided in a new nuget package in order to allow out of band releases without conflicts with the current Math type.

public static class Scalar
{
    public static T Abs<T>(T left, T right) where T : struct
    public static T Add<T>(T left, T right) where T : struct
    public static T AndNot<T>(T left, T right) where T : struct
    public static T BitwiseAnd<T>(T left, T right) where T : struct
    public static T BitwiseOr<T>(T left, T right) where T : struct
    public static T Divide<T>(T left, T right) where T : struct
    public static T Max<T>(T left, T right) where T : struct
    public static T Min<T>(T left, T right) where T : struct
    public static T Multiply<T>(T left, T right) where T : struct
    public static T Negate<T>(T left, T right) where T : struct
    public static T OnesComplement<T>(T left, T right) where T : struct
    public static T SquareRoot<T>(T left, T right) where T : struct
    public static T Subtract<T>(T left, T right) where T : struct
    public static T Xor<T>(T left, T right) where T : struct

    // Useful methods part of Vector<T>
    public static T Zero<T>()  where T : struct
    public static T One<T>()  where T : struct
}

The generic requirements for some methods might be changed a bit to allow additional types, such as using providing extra overload such as those suggested in https://github.com/dotnet/corefx/issues/1583.

Comparison and equality checking was not included since IComparable<> and IEquatable<> are implemented for the types already, but if it is considered a good idea to be as similar as possible to Vector then the following methods should also be considered

    public static bool GreaterThan<T>(T left, T right) where T : struct
    public static bool GreaterThanOrEqual<T>(T left, T right) where T : struct
    public static bool LessThan<T>(T left, T right) where T : struct
    public static bool LessThanOrEqual<T>(T left, T right) where T : struct

Types to support

I propose that all the primitive numeric types including decimal are supported (the same types as Vector<> support but with the addition of decimal). Invoking any method with any other type should result in an NotSupportedException()

  • Byte
  • SByte
  • UInt16
  • Int16
  • UInt32
  • Int32
  • UInt64
  • Int64
  • Single
  • Double
  • Decimal (The only difference from Vector)

Related issues:

I found a few issues regarding similar ideas

@mellinoe
Copy link
Contributor

mellinoe commented Dec 7, 2016

I like the concept, but I am worried that we will get really bad performance out of this. Vector<T> has a functional IL implementation using this pattern, but it is most certainly not fast. Part of that is because of the odd generic code style (inherited here), but it has other problems, like massive structure size, massive function length, etc. that aren't necessarily inherited here. Those generally result in degenerate code-gen.

I think it would be difficult to move forward on this until we were able to understand what it would take to make it high-performance. Is it possible to have a fully-portable binary with this stuff in it, or would we need special handling in the JIT (like Vector<T>)? If the former is possible, it would be good to have a minimal prototype already implemented and available to test.

As an aside, I use a very crude version of this pattern in the System.Numerics.Vectors tests themselves: https://github.com/dotnet/corefx/blob/master/src/System.Numerics.Vectors/tests/Util.cs#L109. It relies on dynamic.

@mellinoe mellinoe removed their assignment Dec 7, 2016
@Daniel-Svensson
Copy link
Contributor Author

Daniel-Svensson commented Dec 11, 2016

@mellinoe what would be acceptable performance for code such as this?
I have compared 3 different implementations of which 2 makes sense and could be used to make a small prototype with some of the basic functions.

My first usage scenario was as a complement to Vector in order to process the last few elements or to "add the finishing" to algorithms, but then it would be very beneficiary to be able to use this for more general calculations or places where you don't need vector.

Ideally I think this is one area where it makes sense to improve the JIT. But I haven't added any feature request in CoreCLR since I wanted feedback on this idea.
I am unsure if the best approach is to add instrincts or to allowing the generic approach used by vector to be inlined (which maybe could help with other cases to), but in either case even Vector should be available with decent performance even without hardware acceleration.

Implementation Strategies

1. "IL"

There is no way to write genereic methods using arithmetics or primitives in C# (that I know of at least), but it is still valid IL (at least sort of).
An example of a method which does standard addition (no overflow checking) for builtin primititves

.method public hidebysig static !!T AddGeneric<valuetype T> (
            !!T '',
            !!T ''
        ) cil managed 
    {
        IL_0000: ldarg.0
        IL_0001: ldarg.1
        IL_0002: add
        IL_0003: ret
    }

The downside of this is that the JIT will raise a ExecutionEngineException when trying to run this code.
Since this code will almost certainly be inliened, using it with an unsupported type will raise the execption
when the calling method is first invoked (JITted).

I have tried multiple approaches of adding some kind of error checking in order to allow a nicer error to be raised,
but havent found a good solution so far. It seems like Vector will raise TypeInitializationException when trying
to create one instance with an unsupported T such as deciamal.

Pros:

  • As fast as possible
  • Jit will understand add method and perform constant propagation
    That is Add(1,3) will just evaluate to at jit time 4, no add instruction will be emitted

Cons:

  • ExecutionEngineException if used with wrong type
  • Cannot only handle types supported by "add" opcode and not other types such as decimal
  • How to easily embed this with C# code ?

2. Handle all types explicilty (as done for vector)

Example code for add:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static T Add<T>(T lhs, T rhs)
    where T : struct
{
    if (typeof(T) == typeof(float))
    {
        return (T)(object)(((float)(object)lhs) + ((float)(object)rhs));
    }
    if (typeof(T) == typeof(int))
    {
        return (T)(object)(((int)(object)lhs) + ((int)(object)rhs));
    }
    ....
    throw new NotSupportedException();
}

Pros:

  • Quite fast
    • Jit will optimize method to eliminate all comparisons at jit time
  • Can choose which types to handle (including deciamal)
  • Can choose exception

Cons:

  • Will not be inlined by current jit
  • No constant propagation
  • Much code and not to easy to read even if it is repetitive

3. Using Dynamic

The approach as used in https://github.com/dotnet/corefx/blob/master/src/System.Numerics.Vectors/tests/Util.cs#L109

Performance comparisons

Benchmark 1: Computing sum of 100 000 (float) array elements

A simple benchmark summing 100000 random floats that are located in an array.

BenchmarkDotNet=v0.10.1, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-2500K CPU 3.30GHz, ProcessorCount=4
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0
  Job-FPOZGJ : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0

Jit=RyuJit  Platform=X64  IterationTime=1.0000 s  
LaunchCount=1  TargetCount=3  WarmupCount=3  
Method Mean StdDev Max Scaled Scaled-StdDev Gen 0 Allocated
ForLoopSumImpl 71.7315 us 0.2031 us 71.8703 us 1.00 0.00 - 0 B
GenericHelperClass 239.2871 us 0.5135 us 239.6566 us 3.34 0.01 - 0 B
GenericILMethod 71.9311 us 0.0304 us 71.9632 us 1.00 0.00 - 0 B
Generic_Sum_Dynamic 1,497.0433 us 4.9572 us 1,500.9469 us 20.87 0.07 1469.7421 4.8 MB

FoorLoopSumImpl

The baseline to comapre against

float sum = 0.0;
for (int i = 0; i < _values.Length; ++i)
    sum = sum + _values[i];
return sum;

GenericHelperClass

This is implementation strategy 2.

GenericILMethod

This is implementation strategy 1 where the IL defined method is located in another assembly.

Generic_Sum_Dynamic

Implementation strategy 3.
This approach takes much more time and generates a number of allocations.

Benchmark 2: Computing the sum of all integers from 1 .. 10 000

In this test the array accesses and corresponding memory load is taken out away from the equation.
We still have the for loop, but this should be able to measure a more realistic impact in more number
intensive applications.

BenchmarkDotNet=v0.10.1, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-2500K CPU 3.30GHz, ProcessorCount=4
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0
  Job-FPOZGJ : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1586.0

Jit=RyuJit  Platform=X64  IterationTime=1.0000 s  
LaunchCount=1  TargetCount=3  WarmupCount=3  
Method Mean StdDev Max Scaled Scaled-StdDev Gen 0 Allocated
ForLoopSumImpl 4.2266 us 0.0018 us 4.2283 us 1.00 0.00 - 0 B
GenericILMethod 4.2287 us 0.0008 us 4.2291 us 1.00 0.00 - 0 B
GenericHelperClass 19.1062 us 0.0351 us 19.1468 us 4.52 0.01 - 0 B
DynamicHelperC 143.2929 us 0.1961 us 143.4865 us 33.90 0.04 146.9374 480 kB

FoorLoopSumImpl

The baseline to compare against

int sum = 0.0;
for (int i = 0; i < 10000; ++i)
    sum = sum + i;
return sum;

GenericILMethod

Using the Add method from strategy 1.

GenericHelperClass

This is implementation strategy 2.

DynamicHelperC

This is implementation strategy 3.
This is exactly 1/10th the number of additions and also one tenth the number of allocations.

Benchmark summary

  • Strategy 1 (IL based method)
    yields the same runtime performance (but there is more work for jitter) as when using "+" directly.
  • Strategy 2 (multiple if method)
    results in a small overhead of 3-3.5 times for these simple examples.
  • Strategy 3 (dynamic)
    results in a much larger overhead of 20-34 times the speed and with addition allocations as a results.

@danmoseley
Copy link
Member

@mellinoe for the question above

@mellinoe
Copy link
Contributor

@Daniel-Svensson Wow, that was extremely thorough! Thanks for the attention to detail. Do you have the code available anywhere for reference? It sounds like you may only have the Add operation implemented, but it should be easy to "auto-generate" the rest of the code using some text templating (I use T4 templates for Vector<T>).

I like this proposal overall, and based on the testing, performance numbers are not catastrophically bad (~3-5x). I think that using the Generic IL implementation is not feasible, even if it is super fast. It's not 100% clear to me whether it's actually valid IL in the first place, or if our JIT just allows it. The "multiple-if" method seems to be the way forward.

A few more questions/points come to mind:

  • Is this the full set of operations that are desirable? Anything on the border that might be excluded or included?
  • I think it makes sense to include the comparison operators, if nothing else but for consistency. Also, it seems like you might not have a T : IComparable constraint on your function. Even if you did, it would involve boxing to call the interface method.
  • Are there any other libraries out there tackling this sort of problem? How do they express these concepts?
  • Is the need for this widespread enough and useful enough to be directly included in the BCL, or can it be a separate extension library for those who want it?

@Daniel-Svensson
Copy link
Contributor Author

I've cleaned up some of the examples and uploaded them to https://github.com/Daniel-Svensson/ClrExperiments

Is this the full set of operations that are desirable? Anything on the border that might be excluded or included?

I would not call the list complete at the moment.
Before calling it complete the API provided by Math (such as Sign etc), (MathF se link in description), Vector and Vector{T} should be considered to see if there are other methods which makes senes and could be useful.

I think it makes sense to include the comparison operators, if nothing else but for consistency. Also, it seems like you might not have a T : IComparable constraint on your function. Even if you did, it would involve boxing to call the interface method.

Including the constraints should not force boxing to occur, the runtime is pretty god at de-virtualization in order to avoid boxing when the type is known as is the case here. But it should certainly be considered if it is desired to require the calling code to require IComparable{T} in the generic constraints or not.

Are there any other libraries out there tackling this sort of problem? How do they express these concepts?

I've tried to find such a library before I started with these experiments. But I did not find any alternative for C# / VB.Net. F# in itself allows these kind of generic functions and algorithms to be written, but I do not know of any good way of writing them in F# and exposing them to other languages except of providing a facade with overloads for all types.

@dcwuser
Copy link
Contributor

dcwuser commented Feb 7, 2017

Two main observations: (1) The type signature issues here are complex. (2) This doesn't seem to me to add much practical value.

By the way, Abs takes one argument, not two.

First: The point of having methods like this is presumably to be able to build sophisticated algorithms on top of them which will then automatically work for all the built-in numeric types. Let's start with looks to me like the simplest algorithm one might think to build for arbitrary numeric types: averaging. To average, we need to add a bunch of Ts; no problem there. Then we need to divide, not by a T but by an integer. Well, if our values are doubles or decimals, that's probably not a problem, since integer can be cast to those types. But if our values are integers, we don't want to do integer division, we actually want to do double division and produce a double. (Or maybe sometimes a decimal?) To deal with this, we will need to start adding some mixed-type overloads and maybe name variants to specify different conventions.

By the way, what should Negate of an unsigned type produce? Maybe the corresponding signed type, but then the output T is different from the input T and what do we do if the input value is too big to negate? Maybe we just throw an exception, but then any algorithm using Negate is automatically useless for ~50% of our built-in classes.

Second: Suppose we added variants to cover all the ambiguities and were able to write something really sophisticated and useful on top of this, like say a matrix inversion algorithm. (That would be really impressive, considering that the inverse of an integer matrix is not an integer matrix, but still imagine we did.) How useful would that be to people who care about matrix inversion? As one of those people, I would say "not so much". The first thing we want is certainly to invert a matrix of doubles, but the next thing isn't to invert a matrix of singles or some other built-in-type, but rather to invert a complex matrix or a matrix of some arbitrary-precision type, in any case a matrix of some new type that goes beyond the capabilities of the built-ins, and such a case wouldn't be covered by these methods.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@tannergooding
Copy link
Member

@agocke, @jaredpar.

Would this be covered or more readily implementable with something like the shapes proposal for C#?

@TylerBrinkley
Copy link
Contributor

I apologize for the shameless plug, but if anyone is looking for this sort of functionality currently you may want to consider my library Genumerics.

@danmoseley
Copy link
Member

@TylerBrinkley I think plugging relevant libraries is always welcome..!

@tannergooding
Copy link
Member

Closing this. As mentioned in other issues (such as #50647 or #25058) we are looking into the correct surface area to expose here as part of the static abstracts in interfaces feature the language and runtime are adding.

This is currently being tracked in our dotnet/designs repo and I've added the initial rough draft of the surface area here: dotnet/designs#205

Further discussion or suggestions should happen there and in future PRs that further refine the proposed surface area.

@ghost ghost locked as resolved and limited conversation to collaborators May 29, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Numerics
Projects
None yet
Development

No branches or pull requests

8 participants