headless: ARM NEON intrinsics for the Linux version of Surge #570

Closed
jarkkojs opened this issue Feb 10, 2019 · 11 comments
Labels: Feature Request

Comments

@jarkkojs
Collaborator

jarkkojs commented Feb 10, 2019

In order to test the headless Surge on ARM (e.g. Raspberry Pi 3) we need to use NEON intrinsics. The fastest way to get this done right now is to use https://github.com/jratcliff63367/sse2neon. This will also allow us to maintain only one version of the DSP code. We should take a copy of SSE2NEON.h. Using a submodule would be overkill here.

Let's create a new file called SIMD.h with these contents:

#pragma once

#if defined(LINUX) && defined(__arm__)
#include "SSE2NEON.h"
#else
#include <xmmintrin.h>
#include <emmintrin.h>
#endif

Then include SIMD.h in every file that currently includes either of these headers.
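For illustration, DSP code written against such a header could look like the sketch below. It is not taken from Surge: `scale_add` and `SKETCH_HAS_SSE` are hypothetical names, and the preprocessor block is only a stand-in for the proposed SIMD.h. Since SSE2NEON.h is not bundled with the example, non-x86 targets fall back to plain scalar code here, whereas the proposal would route the same `_mm_*` calls through the wrapper.

```cpp
#include <cstddef>

// Stand-in for the proposed SIMD.h. On x86 the real SSE header is used.
// On ARM the proposal is to include "SSE2NEON.h" instead; that header is
// not available in this sketch, so other targets use the scalar loop only.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical helper: out[i] = a[i] * gain + b[i] for n floats.
void scale_add(const float* a, const float* b, float gain, float* out, std::size_t n)
{
    std::size_t i = 0;
#if SKETCH_HAS_SSE
    // Process four floats per iteration with unaligned loads and stores.
    const __m128 g = _mm_set1_ps(gain);
    for (; i + 4 <= n; i += 4)
    {
        const __m128 va = _mm_loadu_ps(a + i);
        const __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, g), vb));
    }
#endif
    // Scalar tail (and the whole loop when SSE is not available).
    for (; i < n; ++i)
        out[i] = a[i] * gain + b[i];
}
```

The point of the single-header approach is that the body of `scale_add` would stay the same on both architectures; only the include behind SIMD.h changes.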

@jarkkojs jarkkojs self-assigned this Feb 10, 2019
@jarkkojs jarkkojs added the Feature Request New feature request label Feb 10, 2019
@baconpaul
Collaborator

Yeah a copy into libs/ is fine if that works

@jarkkojs jarkkojs changed the title headless: ARM NEON intrisics headless: ARM NEON intrisics for the Linux version of Surge Feb 10, 2019
@jarkkojs
Collaborator Author

jarkkojs commented Feb 10, 2019

At this point I will test that the code compiles in an ARM cross-compilation environment (possibly with BuildRoot, as I'm a seasoned user of that tool). Do you agree that once the DSP code compiles on ARM, that is good enough to close this issue? I would.

@jarkkojs
Collaborator Author

This is what can be used for cross-compilation with premake5: https://github.com/premake/premake-core/wiki/gccprefix. CMake would again be the better choice, because most embedded build systems such as OpenEmbedded, Yocto and BuildRoot support it and integrate with it out of the box.

@jarkkojs jarkkojs changed the title headless: ARM NEON intrisics for the Linux version of Surge headless: ARM NEON intrinsics for the Linux version of Surge Feb 10, 2019
@baconpaul
Collaborator

Cool. You saw #568 also, right?

@jarkkojs
Collaborator Author

@baconpaul, I was not aware of it, but this issue supersedes that one.

@jarkkojs
Collaborator Author

jarkkojs commented Feb 21, 2019

OK, so here is the deal: premake5 really sucks for cross-compilation environments. I'm going to resolve it by creating a CMakeLists.txt targeted only at the headless build and using that in the BuildRoot recipe.

Gluing premake5 to BuildRoot would take me more time than doing that. premake5 does not really scale to that use case.

It is about this easy:

https://buildroot.org/downloads/manual/manual.html#_infrastructure_for_cmake_based_packages

@baconpaul
Collaborator

Cool!

Can I make one request as you do that: make the CMake file print a clear message when it is used for non-cross-compilation. So if I am on my Mac and run "cmake", it prints a "use premake for now" message or some such?

@jarkkojs
Collaborator Author

jarkkojs commented Feb 21, 2019

Not sure about that one. Cross-compilation only means being able to optionally pass a prefix for the toolchain and things like that. There is no way to reliably detect it. CMake can also be used to build the headless version in the host environment. I guess README.md could mention its current scope and purpose.

@baconpaul
Collaborator

OK!

Just trying to avoid confused users during the transitional period. Basically "don't use cmake except for ARM" is what we kinda need to accomplish, I think. Or are you thinking "use cmake for all the headless builds on all platforms"? That's also fine!

And I guess people can read the doc!

@jarkkojs
Collaborator Author

I'll try to get it working. Then figure out the next steps :)

@baconpaul baconpaul added this to the 1.7 or later milestone Feb 24, 2019
@jarkkojs
Collaborator Author

So, it is fairly obvious now that this header strategy is a mess: it is hard to verify that no errors are introduced when adding the missing SSE2 wrappers.

What I think we should do instead is write the SIMD code progressively using https://github.com/p12tic/libsimdpp. It is a header-only library, so you essentially just need to take a snapshot of it and add it to the include path. This is a far more sustainable strategy, as we can do the transformation effect by effect. Portability is not the only benefit: it should also give some performance boost, because it can utilize the latest SIMD ISAs such as AVX.
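To make the testing concern concrete, one way to check a hand-written wrapper is to compare it lane by lane against a plain scalar reference. The sketch below is only an assumption about how such a check could be structured and is not part of Surge, sse2neon or libsimdpp; `Lane4Op` and `matches_reference` are hypothetical names, and a real harness would also need to cover NaNs, denormals and integer operations.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>

// Hypothetical signature: reads four floats from each input, writes four floats.
using Lane4Op = std::function<void(const float*, const float*, float*)>;

// Runs `candidate` and `reference` over a small grid of inputs and reports
// whether every output lane agrees within `tolerance`.
bool matches_reference(const Lane4Op& candidate, const Lane4Op& reference,
                       float tolerance = 0.0f)
{
    const float samples[] = {0.0f, -0.0f, 1.0f, -1.0f, 0.5f, 3.25f, -1024.0f, 1e-6f};
    const std::size_t count = sizeof(samples) / sizeof(samples[0]);

    for (std::size_t i = 0; i < count; ++i)
    {
        for (std::size_t j = 0; j < count; ++j)
        {
            float a[4], b[4], got[4], want[4];
            // Rotate through the sample table so each lane sees different values.
            for (std::size_t k = 0; k < 4; ++k)
            {
                a[k] = samples[(i + k) % count];
                b[k] = samples[(j + k) % count];
            }
            candidate(a, b, got);
            reference(a, b, want);
            for (std::size_t k = 0; k < 4; ++k)
            {
                if (std::fabs(got[k] - want[k]) > tolerance)
                    return false;
            }
        }
    }
    return true;
}
```

A wrapper for a packed add, for example, would be passed as `candidate` with a four-iteration scalar loop as `reference`. Having to write this by hand for each added wrapper is exactly the overhead a maintained library avoids.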
