Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recover lost SSE2 support on MSVC 64-bit builds; FFTW_VERSION #292

Open
olafk1 opened this issue Sep 5, 2022 · 2 comments
Open

Recover lost SSE2 support on MSVC 64-bit builds; FFTW_VERSION #292

olafk1 opened this issue Sep 5, 2022 · 2 comments

Comments

@olafk1
Copy link

olafk1 commented Sep 5, 2022

For MSVC, the HAVE_SSE2 preprocessor definition will only be enabled in CMakeLists.txt if the compiler successfully accepts /arch:SSE2 as a command line parameter. However, this applies to MSVC 32-bit compilers only. MSVC 64-bit compilers have the switch /arch:SSE2 removed, since this is already the default minimum architecture supported on all 64-bit CPUs (see Microsoft Docs here).

As a result, fftw DLLs lose all SSE2 support when the common configuration -DENABLE_SSE2=1 -DENABLE_AVX=1 is passed to cmake. This leads to an illegal instruction and a crash during runtime on non-AVX-CPUs.

The following patch should at least re-enable SSE2 support for MSVC 64-bit compilers, although /arch:SSE2 has been removed as the default minimum architecture.

diff --git a/CMakeLists.txt b/CMakeLists.txt
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -194,2 +194,7 @@

+# Activate SSE2 support even if the corresponding compiler flag for 64-bit MSVC has been removed
+if (CMAKE_CL_64 AND ENABLE_SSE2)
+  set (HAVE_SSE2 TRUE)
+endif ()
+
 if (HAVE_SSE2 OR HAVE_AVX)

Side note: the version number in CMakeLists.txt version 3.3.10 wrongly indicates FFTW_VERSION 3.3.9 and needs a brush up.

@olafk1
Copy link
Author

olafk1 commented Sep 15, 2022

The fix above addresses a significant performance flaw in MSVC non-AVX builds for 64-bit systems. It turned out that very common SSE2/AVX builds (like that from conda-forge) still crash on some older 64-bit CPUs. This is caused by a bigger conceptual bug in CMakeList.txt, which should be discussed in another ticket, because upcoming CPU architectures are now also affected.

@egpbos
Copy link

egpbos commented Dec 1, 2022

This makes sense. Have the maintainers seen this? @matteo-frigo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants