Add support for CUDA_NO_HALF #436
Hello Chaoyang, First - thanks for the interest in contributing. :-) Personally, I don't do half-precision work, so my experience here is very limited. I noticed that there are several relevant definitions in the CUDA headers:
Are you sure we should be using the first of these? Also, would it not be sufficient if I just used
How about:
?
The core problem is that So, to answer your earlier questions, I think the only way to turn this off is to define But I think another solution could be that we do not employ
Well, you could still use CUDA_NO_HALF to control whether the aliasing of ... on the other hand, I don't use cuda_fp16 for anything else, so... maybe let's go with your initial suggestion, but also, if CUDA_NO_HALF is specified, we don't include the cuda_fp16 header either. How about that?
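A minimal sketch of what that arrangement could look like on the library side, assuming a hypothetical trait `element_info` in a hypothetical namespace `sketch` (neither name comes from this project). `CUDA_NO_HALF` is defined at the top only so that the sketch builds without the CUDA toolkit; in practice the user would define it before including the library.

```cpp
#include <cstddef>

// Defined by the user (here: so the sketch compiles without the CUDA toolkit).
#define CUDA_NO_HALF

// Only pull in the half-precision header when the user has not opted out.
#ifndef CUDA_NO_HALF
#include <cuda_fp16.h>
#endif

namespace sketch {  // hypothetical namespace, for illustration only

// Hypothetical trait describing array element types.
template <typename T> struct element_info;

template <> struct element_info<float> {
    static constexpr std::size_t bytes() { return sizeof(float); }
};

// Anything that names `half` sits behind the same guard as the include,
// so no use of the type survives when CUDA_NO_HALF is defined.
#ifndef CUDA_NO_HALF
template <> struct element_info<half> {
    static constexpr std::size_t bytes() { return sizeof(half); }
};
#endif

// Lets callers check at compile time whether half support was built in.
#ifndef CUDA_NO_HALF
constexpr bool half_support_enabled = true;
#else
constexpr bool half_support_enabled = false;
#endif

}  // namespace sketch
```

Keeping the include and every use of `half` under one guard means a single macro switches the whole feature off, which is the simpler of the two options discussed here.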
I think that's a good solution, and I amended the commit. Modify anything if you'd like :)
Done, unless someone else complains and wants a more elaborate solution. By the way - if you could take the time to give feedback on the changes since 0.5.6 on the development branch, that would help: all I need in order to release 0.6 is either a bit more time or more feedback from users, to give me enough confidence.
Cool. I'll try that, and if I find any bugs I'll submit an issue.
@georgelyu ... and if you don't - please email me
Sometimes the `half` type defined in the CUDA library conflicts with the one defined in other libraries (Imath, for example, which is the problem I'm encountering right now). I want to use `#define CUDA_NO_HALF` to get past this, but the type is used in `array.hpp`. So, I wrote a naive way to support the macro, which is checked at line 2535 of `cuda_fp16.hpp`.
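To illustrate the kind of clash described above, here is a self-contained sketch that uses stand-in types instead of the real headers. `vendor_half` and `other_lib` are hypothetical names that play the roles of the CUDA half type and of a second library that also claims the global name `half`; the real headers are more involved than this.

```cpp
#include <cstdint>
#include <type_traits>

// The user opts out before any header that would declare the alias.
#define CUDA_NO_HALF

// Stand-in for the vendor header (hypothetical names).
struct vendor_half { std::uint16_t bits; };
#ifndef CUDA_NO_HALF
typedef vendor_half half;  // skipped here, because CUDA_NO_HALF is defined
#endif

// Stand-in for another library that also wants the global name `half`.
namespace other_lib {
struct half { std::uint16_t bits; };
}
using half = other_lib::half;  // no clash: the vendor alias was never declared

// With the macro defined, `half` unambiguously names the other library's type.
static_assert(std::is_same<half, other_lib::half>::value,
              "half should refer to other_lib::half");

// The vendor type stays usable under its own name.
inline std::uint16_t raw_bits(vendor_half h) { return h.bits; }
```

Removing the `#define CUDA_NO_HALF` line makes the `using` declaration fail to compile, since `half` would then be declared as two different types.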