Does not compile on/for AArch64 (ARM64) #14

Open
gemarcano opened this issue Feb 20, 2017 · 4 comments


@gemarcano

Several problems prevent the project from compiling for AArch64, and nearly all of them come down to the 32-bit ARM assembly in thvector.h and in OpenBLAS-stripped/arm. 32-bit ARM assembly is not compatible with AArch64:

  • When the makefile detects the architecture as aarch*, it defines __NEON__, which enables the assembly optimizations in thvector.h. As noted above, that 32-bit ARM assembly is not compatible with AArch64.
  • OpenBLAS-stripped/arm/*.S assembly files are not compatible with AArch64.
  • AArch64 GCC does not accept the -mfpu and -mfp16-format flags. See https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html for the options the latest GCC supports. Ubuntu Xenial's GCC 5.4 does not support every option on that list (for example, +fp16 is not listed in the GCC 5.4 version of that page).

I ran into these issues while trying to compile this project for the Nvidia TX1, which runs Ubuntu Xenial LTS and now bundles a GCC that targets AArch64.

If I can get this project to compile, I'll explain what I had to do. Currently, with modifications to the Makefile, I can compile most of the project, but the build gets stuck on the OpenBLAS-stripped part. I'm trying to see if I can get it to compile with the system-provided OpenBLAS library.

@gemarcano changed the title from "Does not compile with AArch64 (ARM64)" to "Does not compile on/for AArch64 (ARM64)" on Feb 20, 2017
@mvitez
Owner

mvitez commented Feb 20, 2017

Thank you for the report. It is in my plans to start working on AArch64 in the following days.

@gemarcano
Author

Thanks for the fast reply.

Just to follow up, I forgot to mention that I also had to change some cudnn function calls to get the project to compile with cudnn v5, but it wasn't that difficult. Most of the affected functions have a backwards-compatible version, usually with the same name ending in _v3.

I did manage to get the project to compile just now. I practically butchered the Makefile, effectively telling it to treat the aarch* case as one that also uses the system OpenBLAS libraries, and setting up the flags for AArch64 compilation. I wouldn't recommend this kind of hack to anyone; there has to be a saner way to structure those changes (perhaps by having separate cases for arm/aarch32 and aarch64).

For reference, even though I am not a fan of what I did, here is the diff.

I have not yet checked whether the resulting program/library works properly. If I find any other issues or pitfalls while testing on AArch64, I'll report them.

@mvitez
Owner

mvitez commented Feb 20, 2017

Thank you very much for your help, it's much appreciated.

@mvitez
Owner

mvitez commented Feb 23, 2017

I've pushed a new commit that integrates OpenBLAS for aarch64. I've tested it on the Tegra TX1 with CUDNNv5. Your diff saved me precious time, thank you.
