#957 has a function that can be compiled with either SSE2 or AVX2. It would be desirable to ship binaries that contain both versions of the function and select one at runtime, depending on which instruction sets the CPU supports.
Caution: do not forget to call _mm256_zeroupper() after the last use of AVX/AVX2 instructions, to avoid transition penalties when returning to SSE code. See https://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties
(This is not needed when the code is compiled entirely with AVX2 enabled, since the compiler then emits the SSE instructions in their VEX-encoded form, which avoids the penalty.)
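A minimal sketch of what such runtime dispatch could look like, assuming GCC or Clang on x86. The function names (sum_baseline, sum_avx2, sum_dispatch) are hypothetical and are not taken from #957, and the scalar loop only stands in for the SSE2 build. It relies on the GCC/Clang extensions `__attribute__((target("avx2")))` and `__builtin_cpu_supports("avx2")`; other compilers would need a different mechanism (for example, a separate translation unit built with AVX2 flags plus a CPUID check).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang on x86; elsewhere only the baseline is built. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

/* Baseline version: portable scalar code standing in for the SSE2 build. */
static int32_t sum_baseline(const int32_t *v, size_t n)
{
    int32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#ifdef HAVE_X86_DISPATCH
/* AVX2 version: the target attribute enables AVX2 for this function only,
 * so the rest of the file is still compiled for the baseline. */
__attribute__((target("avx2")))
static int32_t sum_avx2(const int32_t *v, size_t n)
{
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;

    /* Process 8 x int32 per iteration. */
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_epi32(
            acc, _mm256_loadu_si256((const __m256i *)(v + i)));

    /* Horizontal sum of the 8 lanes. */
    int32_t lanes[8];
    _mm256_storeu_si256((__m256i *)lanes, acc);
    int32_t s = 0;
    for (int k = 0; k < 8; k++)
        s += lanes[k];

    /* Scalar tail for the remaining elements. */
    for (; i < n; i++)
        s += v[i];

    /* Clear the upper halves of the YMM registers before returning to
     * callers that may execute legacy-encoded SSE instructions. */
    _mm256_zeroupper();
    return s;
}
#endif

/* Entry point: picks the AVX2 version only if the running CPU supports it. */
int32_t sum_dispatch(const int32_t *v, size_t n)
{
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2(v, n);
#endif
    return sum_baseline(v, n);
}
```

Callers would only use sum_dispatch(). If the check turns out to be measurable in a hot path, the selected function pointer could be cached once at startup instead of testing on every call.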