-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cufft: use CUFFT_COMPATIBILITY_FFTW_PADDING instead of CUFFT_COMPATIBILITY_NATIVE #52
Comments
Yes, I agree. I don't want to touch the kernel layout at any cost. |
If you are busy, I am happy to work on this next week. On 7 November 2015 at 21:09, Arne Vansteenkiste [email protected]
Mykola |
Sounds good. Let me know if you need any help. |
Thanks, this should help indeed. |
Thanks @godsic! All tests pass on my 680. |
CUFFT_COMPATIBILITY_NATIVE mode is broken since cuda7.0+. Since CUFFT_COMPATIBILITY_NATIVE has been marked as DEPRECATED since cuda6.0, Nvidia developers are not willing to fix it. Therefore mumax3 should switch to the default CUFFT_COMPATIBILITY_FFTW_PADDING mode. This is rather big change.
Since Nvidia now suggests to use out-of-place r2c and c2r transforms, i.e. see page 20 of the "CUFFT LIBRARY USER'S GUIDE DU-06707-001_v7.5", then I would prefer to take this route. This will marginally increase mumax3 memory consumption, but is way less harmful, then say touching mighty demag kernel code.
@barnex any thoughts?
The text was updated successfully, but these errors were encountered: