Add arm 64 build #8

Merged
merged 13 commits into from
Mar 4, 2024

Conversation

@k-dominik k-dominik commented Mar 1, 2024

Summary:

  • Added simde-based build for arm-64
    • added simde as submodule
  • pybind11 as dependency instead of submodule (-> pybind11 upgrade)

Details:

  • fastfilters is pretty flexible with respect to compilation with different capabilities. For the osx-arm64 builds, this PR sets all capabilities to 1 and lets simde handle the translation of the x86 SIMD intrinsics. Hence the conditional includes in the code (#ifdef _USE_SIMDE_ON_ARM_).
  • CMakeLists.txt now has a lot of conditional behavior that relies on
    • APPLE_ARM64: to activate general compiler flags for arm64 processors
    • USE_SIMDE_ON_ARM: to switch on simde headers, and also to bypass all processor capability checking and enable all capabilities. Furthermore, this excludes the "lower capability" files for arm (so if something was available with both avx and avx2 simd, then only the avx2 version is compiled for the arm build).

Added `USE_SIMDE_ON_ARM` variable that activates SIMDE headers if set
* compilation flags are only applied conditionally, for non-simde builds
* only include the most "advanced" instruction variants on arm
  (so exclude anything avx if avx2 or avx+fma is there).
* cpu.c variants are copied dynamically: the intel one (arm will be
  added later), or one with "fake" functions that all return true if
  queried for cpu features.
With SIMDE we always want to use the most "advanced" versions, so all
features are enabled at runtime.
For arm we can omit all the magic that compiles for different kinds of
simd instruction sets like avx and avx2, as is done for intel. For arm we
use simde with the avx2 instruction set versions.
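
To illustrate the "fake" cpu feature functions described above, here is a minimal sketch. All `sketch_*` names are hypothetical and not taken from the fastfilters sources; the macro is defined inside the snippet only so that it stands alone, whereas in the real build it is assumed to come from the CMake switch.

```c
#include <stdbool.h>

/* Assumption: the build system normally defines this macro when
 * USE_SIMDE_ON_ARM is set. It is defined here only so the sketch is
 * self-contained. */
#ifndef _USE_SIMDE_ON_ARM_
#define _USE_SIMDE_ON_ARM_ 1
#endif

/* All sketch_* names are hypothetical, for illustration only. */
#ifdef _USE_SIMDE_ON_ARM_
/* SIMDe build: no runtime detection, every feature is reported as
 * available so that callers always pick the most advanced variant. */
bool sketch_cpu_has_avx(void)  { return true; }
bool sketch_cpu_has_fma(void)  { return true; }
bool sketch_cpu_has_avx2(void) { return true; }
#else
/* Non-SIMDe build: real detection (e.g. via cpuid) would live here.
 * These placeholders conservatively report "not available". */
bool sketch_cpu_has_avx(void)  { return false; }
bool sketch_cpu_has_fma(void)  { return false; }
bool sketch_cpu_has_avx2(void) { return false; }
#endif
```

The point of the design is that callers do not change: they keep asking for features and simply always get "yes" in the SIMDe build.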

Hence, this commit adds two switches to the cmake interface:
* `USE_SIMDE_ON_ARM`: to activate using SIMDE to translate to whatever
  target architecture (that is supported)
* `APPLE_ARM64`: for compilation flags that explicitly target clang on
  m1/m2, foreseeing that they might be slightly different for aarch64

Then, throughout the code, `_USE_SIMDE_ON_ARM_` conditionals are now
scattered. E.g., the various `*_init` functions would otherwise check for
certain processor capabilities at runtime to activate the most effective
code path. These all point to the avx2 versions when SIMDE is active.
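
A rough sketch of what such an `*_init` conditional can look like. The kernel, the function pointer and all `sketch_*` names are made up for illustration and are not the actual fastfilters code; the macro is defined in the snippet only to keep it self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: normally defined by the build system, see above. */
#ifndef _USE_SIMDE_ON_ARM_
#define _USE_SIMDE_ON_ARM_ 1
#endif

/* Hypothetical tag so a caller can see which variant actually ran. */
enum sketch_variant { SKETCH_PLAIN = 0, SKETCH_AVX2 = 1 };

typedef enum sketch_variant (*sketch_kernel_fn)(const float *in, float *out, size_t n);

/* Stand-ins for the per-instruction-set implementations. Both just copy
 * the input; a real avx2 variant would use intrinsics instead. */
enum sketch_variant sketch_kernel_plain(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) out[i] = in[i];
    return SKETCH_PLAIN;
}

enum sketch_variant sketch_kernel_avx2(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) out[i] = in[i];
    return SKETCH_AVX2;
}

/* Placeholder for a real runtime capability check on x86. */
bool sketch_cpu_has_avx2(void) { return false; }

sketch_kernel_fn sketch_kernel = NULL;

void sketch_kernel_init(void)
{
#ifdef _USE_SIMDE_ON_ARM_
    /* SIMDe build: skip the capability check, always use the avx2 variant. */
    sketch_kernel = sketch_kernel_avx2;
#else
    /* Regular build: choose the best variant the CPU reports at runtime. */
    sketch_kernel = sketch_cpu_has_avx2() ? sketch_kernel_avx2 : sketch_kernel_plain;
#endif
    assert(sketch_kernel != NULL);
}
```

After `sketch_kernel_init()`, all callers go through `sketch_kernel` and never need to know which build they are in.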

All in all not very pretty - might want to clean this up a bit in the
future...

@k-dominik k-dominik left a comment

I had another look at this after almost a year... It's ugly ;) but it seems to work. Maybe over summer there is some quiet time to clean it up. But for now it's more important to build from the same source for all platforms.

@k-dominik k-dominik merged commit 38e606f into devel Mar 4, 2024
3 checks passed
@k-dominik k-dominik deleted the add-arm-64-build branch March 4, 2024 09:00