Surprising benchmark numbers #51
Comments
Unless I'm mistaken this doesn't look like an easy fix: The solutions I see:
Will put this on hold until further discussed (but will PR some current improvements made in the process).
Hm, I do want faster to seamlessly integrate with A good argument for moving away from those types would be that we could implement std::simd's traits on our types, but that'd be a manual process and I'd likely have more work to do to get it compiling on stable.
While working on #47 I noticed what looks like performance regressions in `cargo bench`, in particular in functions like `map_simd` and `map_scalar`, but also in quite a few others.

However, comparing #49 to the commit before the refactoring, the numbers are mostly unchanged.
I then assumed it was related to unfortunate default feature flags on my machine, but playing with `avx2` and `sse4.1` didn't have any effect either. I also have a first implementation of #48, and it actually looks like no fallbacks are emitted for `map_simd`. (I tried to cross-check that with `radare2`, but had some problems locating the right symbol / disassembly for the benchmarks.) Lastly, the functions `map_scalar` and `map_simd` differ a bit, but even when I make them equal (e.g., `sqrt` vs. `rsqrt`) the difference remains.

`rustc` became so good in auto-vectorization?
`tests::map_simd` and `tests::map_scalar`?

Running on `rustc 1.29.0-nightly (9fd3d7899 2018-07-07)`, MBP 2015, i7-5557U.

Update: I linked the latest faster version from my SVM library and I don't see these problems in 'production':
Update 2: Seems to be related to some intrinsics. When I dissect the benchmark, I get

I now think that each intrinsic should have its own benchmark, e.g. `intrinsic_abs_scalar`, `intrinsic_abs_simd`, ...

Update 3: ... oh boy. I think that by "arcane magic" Rust imports and prefers `std::simd::f32x4` and friends over the `faster` types and methods.

So when you do `my_f32s.abs()`, it calls `std::simd::f32x4::abs`, not `faster::arch::current::intrin::abs`.

The reason I think that's the problem is that you can now easily do `my_f32s.sqrte()`, which isn't implemented in `faster`, but in `std::simd`.

What's more annoying is that it doesn't warn about any collision, and that `std::simd` is actually slower than "vanilla" Rust.

TODO:
`#![feature(stdsimd)]` except in `lib.rs`
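One plausible mechanism for the silent "collision" in Update 3 is Rust's method resolution: when a type has an inherent method and an in-scope trait offers a method of the same name, the inherent method wins and the compiler emits no warning. Whether this is exactly what happens between `std::simd` and `faster` is an assumption and is not confirmed in this thread. The sketch below uses only hypothetical stand-ins (`Lanes`, `LibraryAbs`), not the real `std::simd` or `faster` items.

```rust
// Hypothetical stand-in for a vector type that ships its own `abs`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Lanes([f32; 4]);

impl Lanes {
    // Inherent method: plays the role of the built-in implementation.
    fn abs(self) -> (&'static str, Lanes) {
        ("inherent", Lanes(self.0.map(f32::abs)))
    }
}

// Hypothetical stand-in for a library trait offering the same method name.
trait LibraryAbs {
    fn abs(self) -> (&'static str, Lanes);
}

impl LibraryAbs for Lanes {
    fn abs(self) -> (&'static str, Lanes) {
        ("trait", Lanes(self.0.map(f32::abs)))
    }
}

fn main() {
    let v = Lanes([-1.0, 2.0, -3.0, 4.0]);

    // Method-call syntax resolves to the inherent method, without a warning,
    // even though `LibraryAbs` is in scope and also provides `abs`.
    let (picked, _) = v.abs();

    // Naming the trait explicitly forces the library implementation.
    let (forced, _) = LibraryAbs::abs(v);

    println!("v.abs() used the {} method", picked);
    println!("LibraryAbs::abs(v) used the {} method", forced);
}
```

If this is the cause, calling the library's implementation through its fully qualified path would sidestep the ambiguity until the feature gate is removed.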
Update 4: Now one more thing makes sense ... I sometimes got `use of unstable library feature 'stdsimd'` in test cases and I didn't understand why. Probably because that's where the `std::simd` built-ins were used.
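The per-intrinsic benchmarks suggested in Update 2 could look roughly like the sketch below. It is only an illustration under several assumptions: it times with `std::time::Instant` rather than the nightly `cargo bench` harness, `time_kernel` is a hypothetical helper, and `intrinsic_abs_simd` is a four-lane chunked stand-in, since the real version would call the library's vector `abs` instead.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Scalar kernel: absolute value of one element at a time.
fn intrinsic_abs_scalar(input: &[f32], output: &mut [f32]) {
    assert_eq!(input.len(), output.len());
    for (o, x) in output.iter_mut().zip(input) {
        *o = x.abs();
    }
}

// Stand-in for the SIMD kernel: handles up to four lanes per step.
// A real benchmark would call the vector type's `abs` here instead.
fn intrinsic_abs_simd(input: &[f32], output: &mut [f32]) {
    assert_eq!(input.len(), output.len());
    for (out_chunk, in_chunk) in output.chunks_mut(4).zip(input.chunks(4)) {
        for (lane_out, lane_in) in out_chunk.iter_mut().zip(in_chunk) {
            *lane_out = lane_in.abs();
        }
    }
}

// Hypothetical helper: runs one kernel `iterations` times and returns the
// elapsed wall-clock time. `black_box` keeps the optimizer from removing
// the work because its result is otherwise unused.
fn time_kernel(kernel: fn(&[f32], &mut [f32]), input: &[f32], iterations: u32) -> Duration {
    let mut output = vec![0.0f32; input.len()];
    let start = Instant::now();
    for _ in 0..iterations {
        kernel(black_box(input), black_box(output.as_mut_slice()));
    }
    start.elapsed()
}

fn main() {
    // Alternating signs so `abs` actually changes about half of the values.
    let input: Vec<f32> = (0..1024)
        .map(|i| if i % 2 == 0 { i as f32 } else { -(i as f32) })
        .collect();

    let scalar = time_kernel(intrinsic_abs_scalar, &input, 10_000);
    let simd = time_kernel(intrinsic_abs_simd, &input, 10_000);

    println!("intrinsic_abs_scalar: {:?}", scalar);
    println!("intrinsic_abs_simd:   {:?}", simd);
}
```

Keeping one kernel per intrinsic makes it easier to see which operation regresses, instead of having the effect averaged out inside a larger `map_simd` benchmark.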