Perf: Vector operations up to 200x slower on Linux #15426

Closed
ianhays opened this issue Oct 13, 2015 · 7 comments
Labels
area-System.Numerics os-linux Linux OS (any supported distro) tenet-performance Performance related issue

Comments

@ianhays
Contributor

ianhays commented Oct 13, 2015

Perf results for the individual iterations are available on my share, or I can post them here if requested. The tests that produced these results are currently in PR dotnet/corefx#3764. I included all Vector tests here (not just the slower ones) to simplify comparisons.

| Test name | Windows time | Linux time | Linux/Windows |
| --- | --- | --- | --- |
| Perf_Vector2.Operation(operation: Add_Function) | 8.000289665 | 59.11430001 | 7.389019958 |
| Perf_Vector2.Operation(operation: Add_Operator) | 0.288809858 | 52.2279 | 180.8383562 |
| Perf_Vector2.Operation(operation: Distance_Squared) | 22.23037618 | 21.2417 | 0.955525891 |
| Perf_Vector2.Operation(operation: Dot) | 0.351532631 | 18.29149999 | 52.03357636 |
| Perf_Vector2.Operation(operation: Length_Squared) | 0.38232381 | 15.3581 | 40.17039899 |
| Perf_Vector2.Operation(operation: Mul_Function) | 8.379477337 | 60.432 | 7.211905656 |
| Perf_Vector2.Operation(operation: Mul_Operator) | 0.320741452 | 56.456 | 176.0171619 |
| Perf_Vector2.Operation(operation: Normalize) | 60.02426801 | 67.0325 | 1.116756642 |
| Perf_Vector2.Operation(operation: SquareRoot) | 0.321026555 | 73.6878 | 229.5380205 |
| Perf_Vector2.Operation(operation: Sub_Function) | 8.074986785 | 60.292 | 7.466513767 |
| Perf_Vector2.Operation(operation: Sub_Operator) | 0.266571784 | 56.0038 | 210.0890016 |
| Perf_Vector3.Operation(operation: Add_Function) | 10.99273613 | 78.2232 | 7.115898995 |
| Perf_Vector3.Operation(operation: Add_Operator) | 5.44148564 | 64.7522 | 11.89972818 |
| Perf_Vector3.Operation(operation: Cross) | 37.46773341 | 70.217 | 1.874065859 |
| Perf_Vector3.Operation(operation: Distance_Squared) | 23.74256521 | 20.9329 | 0.881661262 |
| Perf_Vector3.Operation(operation: Dot) | 5.302070023 | 18.5697 | 3.502349068 |
| Perf_Vector3.Operation(operation: Length_Squared) | 14.50720713 | 14.7616 | 1.01753562 |
| Perf_Vector3.Operation(operation: Mul_Function) | 10.89209459 | 74.3459 | 6.825675206 |
| Perf_Vector3.Operation(operation: Mul_Operator) | 5.499646757 | 64.2378 | 11.68035018 |
| Perf_Vector3.Operation(operation: Normalize) | 69.1874949 | 90.3523 | 1.305905065 |
| Perf_Vector3.Operation(operation: SquareRoot) | 2.665432739 | 117.7835 | 44.18925989 |
| Perf_Vector3.Operation(operation: Sub_Function) | 10.91575818 | 75.24789999 | 6.89351108 |
| Perf_Vector3.Operation(operation: Sub_Operator) | 5.528157108 | 65.3777 | 11.82631006 |
| Perf_Vector4.Operation(operation: Add_Function) | 2.999859159 | 70.9498 | 23.65104368 |
| Perf_Vector4.Operation(operation: Add_Operator) | 2.746117033 | 66.9818 | 24.39145863 |
| Perf_Vector4.Operation(operation: Distance_Squared) | 19.94584174 | 20.5697 | 1.03127761 |
| Perf_Vector4.Operation(operation: Dot) | 2.631220317 | 21.6594 | 8.231693811 |
| Perf_Vector4.Operation(operation: Length_Squared) | 2.74754255 | 15.3553 | 5.588739654 |
| Perf_Vector4.Operation(operation: Mul_Function) | 3.059730896 | 69.6198 | 22.75356963 |
| Perf_Vector4.Operation(operation: Mul_Operator) | 2.709053576 | 68.6514 | 25.34147003 |
| Perf_Vector4.Operation(operation: Normalize) | 69.79761642 | 82.259 | 1.178535947 |
| Perf_Vector4.Operation(operation: SquareRoot) | 1.368781964 | 144.8611 | 105.8321222 |
| Perf_Vector4.Operation(operation: Sub_Function) | 3.020101508 | 70.3999 | 23.31044165 |
| Perf_Vector4.Operation(operation: Sub_Operator) | 2.664292325 | 69.3502 | 26.0295011 |

Loop/testing overhead is roughly 0.3 ms on both platforms.
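For readers checking the numbers, the last column of the table appears to be the Linux time divided by the Windows time. A minimal sketch of that arithmetic follows; the helper name `Ratio` is hypothetical and not part of the tests in the PR.

```csharp
using System;

// Hypothetical helper: reproduces the linux/windows column as Linux time / Windows time.
double Ratio(double windowsTime, double linuxTime) => linuxTime / windowsTime;

// Values copied from the Perf_Vector2 Add_Operator row of the table.
double addOperator = Ratio(0.288809858, 52.2279);
Console.WriteLine($"Add_Operator linux/windows = {addOperator:F2}");
```

The result should agree with the roughly 180x figure reported for that row.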

@mellinoe

@stephentoub
Member

@ianhays, what's the value of Vector.IsHardwareAccelerated in your Windows tests and in your Linux tests?
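A minimal sketch of how that value could be checked on each machine is below. `Vector.IsHardwareAccelerated` and `Vector<float>.Count` are real `System.Numerics` members; the `Describe` helper is hypothetical and only labels the output.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: turns the flag into a readable label.
string Describe(bool accelerated) =>
    accelerated ? "hardware accelerated" : "not hardware accelerated";

bool accelerated = Vector.IsHardwareAccelerated;
int lanes = Vector<float>.Count; // number of float elements in one Vector<float>

Console.WriteLine($"Vector.IsHardwareAccelerated = {accelerated} ({Describe(accelerated)})");
Console.WriteLine($"Vector<float>.Count = {lanes}");
```

Running the same snippet on both the Windows and the Linux setup gives a quick way to see whether the two environments are in the same state before comparing timings.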

@ianhays
Contributor Author

ianhays commented Oct 13, 2015

False on Ubuntu, True on Windows. That explains that then.

@stephentoub
Member

@mellinoe, am I correct in assuming that this is likely because Ian's running the Ubuntu tests in a VM? Or are there known JIT issues that would cause it to return false on Unix in general?

If it's likely due to the VM, Ian, it'd be good to try it out with Ubuntu installed natively on the hardware, rather than in a VM (and/or trying Windows in a VM). If we're comparing hardware-accelerated on one platform vs non-hardware-accelerated on another platform, the results aren't meaningful.
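One way to guard against that kind of mismatch is to record the acceleration state next to every timing, so that runs are only compared when the state matches. The following is a rough sketch under that assumption; it is not the code from the PR, and `TimeVector2Add` and `Comparable` are hypothetical names.

```csharp
using System;
using System.Diagnostics;
using System.Numerics;

// Hypothetical micro-benchmark: times `iterations` Vector2 additions, in milliseconds.
double TimeVector2Add(int iterations)
{
    var a = new Vector2(1f, 2f);
    var b = new Vector2(3f, 4f);
    var sum = Vector2.Zero;
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
    {
        sum += a + b; // accumulate so the result is actually used
    }
    sw.Stop();
    Console.WriteLine($"checksum: {sum}");
    return sw.Elapsed.TotalMilliseconds;
}

// Two runs are treated as comparable only if both report the same acceleration state.
bool Comparable(bool acceleratedA, bool acceleratedB) => acceleratedA == acceleratedB;

double ms = TimeVector2Add(1_000_000);
Console.WriteLine($"accelerated={Vector.IsHardwareAccelerated} time={ms:F3} ms");
```

A real benchmark would also need warm-up iterations and repeated runs; this sketch only illustrates tagging each result with `Vector.IsHardwareAccelerated`.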

@mellinoe
Contributor

That may be the case. I haven't had a chance to try to repro this myself, but I will take a look in the coming weeks. I chatted with @CarolEidt briefly yesterday and didn't think of anything that would be blocking this on Linux, so we will have to investigate further.

@benaadams
Member

Windows in a VM reports Vector.IsHardwareAccelerated as true.

@stephentoub
Member

Turns out SIMD isn't enabled on Unix yet:
https://github.com/dotnet/coreclr/issues/983

@stephentoub
Member

Closing as a dup of https://github.com/dotnet/coreclr/issues/983

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 1.0.0-rtm milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 4, 2021