Does not compile on/for AArch64 (ARM64) #14
Comments
Thank you for the report. I plan to start working on AArch64 in the next few days.
Thanks for the fast reply. To follow up, I forgot to mention that I had to change some cudnn function calls to get the project to compile with cudnn v5, but it wasn't that difficult. Most of the affected functions have a backwards-compatible version, usually with a name ending in `_v3`.

I did manage to get the project to compile just now. I practically butchered the Makefile, effectively telling it to treat the `aarch*` case as one that also uses the system OpenBLAS libraries, and setting up the flags for AArch64 compilation. I wouldn't suggest that anyone apply the kind of hack I made to the Makefile; there has to be a saner way to structure those changes (perhaps by having separate cases for arm/aarch32 and aarch64). For reference, even though I am not a fan of what I did, here is the diff.

I have not yet checked whether the resulting program/library works properly. If I find any other issues or pitfalls while testing on AArch64, I'll report them.
Thank you very much for your help, it's much appreciated.
I've pushed a new commit that integrates OpenBLAS for aarch64. I've tested it on the Tegra TX1 with CUDNNv5. Your diff saved me precious time, thank you.
There are a couple of problems that prevent it from compiling for AArch64, but they pretty much all revolve around the ARM assembly found in `thvector.h` and in `OpenBLAS-stripped/arm`. ARM assembly isn't compatible with AArch64 assembly:

- For `aarch*`, it defines `__NEON__`, which enables the assembly optimizations in `thvector.h`. As I mentioned before, this assembly is not compatible with aarch64.
- The `OpenBLAS-stripped/arm/*.S` assembly files are not compatible with AArch64.
- GCC for AArch64 does not accept the `-mfpu` and `-mfp16-format` flags. See https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html for the list of options the latest GCC supports. The GCC 5.4 in Ubuntu Xenial does not even support all of the options on that list (namely, `+fp16` is not listed in the GCC 5.4 documentation for the same page).

I ran into these issues while trying to compile this project for the Nvidia TX1, which now bundles a GCC that compiles for AArch64 and runs Ubuntu Xenial LTS.
If I can get this project to compile, I'll try to explain what it was I had to do to achieve it. Currently, with modifications to the Makefile, I can compile most of the project but it is getting hung up on the OpenBLAS-stripped part. I'm trying to see if I can get it to compile with the system provided OpenBLAS library.
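
To illustrate the `__NEON__` problem described above, here is a minimal sketch of how a header could separate the 32-bit ARM assembly path from AArch64. It is not the project's actual code: `THVECTOR_ARM32_ASM` and `vector_add` are hypothetical names, and the sketch assumes the compiler predefines `__arm__` on 32-bit ARM and `__aarch64__` on AArch64, as GCC and Clang do. On AArch64 it uses NEON intrinsics from `arm_neon.h` instead of hand-written assembly, and on any other target it falls back to plain C.

```c
#include <stddef.h>

/* Hypothetical macro: enable the 32-bit ARM assembly path only when the
   target really is 32-bit ARM, even if __NEON__ is defined elsewhere. */
#if defined(__arm__) && !defined(__aarch64__) && defined(__NEON__)
#define THVECTOR_ARM32_ASM 1
#else
#define THVECTOR_ARM32_ASM 0
#endif

#if defined(__aarch64__)
#include <arm_neon.h> /* NEON intrinsics are available on AArch64 */
#endif

/* Hypothetical helper: y[i] += x[i] for n floats. */
static void vector_add(float *y, const float *x, size_t n)
{
    size_t i = 0;
#if THVECTOR_ARM32_ASM
    /* The existing 32-bit ARM assembly would stay here, untouched. */
#elif defined(__aarch64__)
    /* AArch64: process four floats per iteration with intrinsics. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t a = vld1q_f32(y + i);
        float32x4_t b = vld1q_f32(x + i);
        vst1q_f32(y + i, vaddq_f32(a, b));
    }
#endif
    /* Scalar loop: handles the tail, or everything on other targets. */
    for (; i < n; i++)
        y[i] += x[i];
}
```

With a guard like this, defining `__NEON__` for `aarch*` would no longer pull 32-bit assembly into an AArch64 build. The scalar loop keeps the function correct on every target.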