-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix UB in atomics with automatic storage #2586
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Georgy Evtushenko <[email protected]> Co-authored-by: gonzalobg <[email protected]>
// uses inline PTX to bypass __isLocal. | ||
_LIBCUDACXX_BEGIN_NAMESPACE_STD | ||
|
||
_CCCL_DEVICE inline bool __cuda_is_local(const void* __ptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we prefix those with __cccl
?
{ | ||
return false; | ||
} | ||
if (0 == __atomic_memcmp((const void*) __ptr, (const void*) __expected, sizeof(_Type))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please swap
if (0 == __atomic_memcmp((const void*) __ptr, (const void*) __expected, sizeof(_Type))) | |
if (__atomic_memcmp((const void*) __ptr, (const void*) __expected, sizeof(_Type)) == 0) |
} | ||
memcpy((void*) __ptr, (void const*) &__val, sizeof(_Type)); | ||
return true; | ||
#else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing the comments
template <class _Type> | ||
_CCCL_DEVICE bool __cuda_load_weak_if_local(const volatile _Type* __ptr, _Type* __ret) | ||
{ | ||
#if defined(_LIBCUDACXX_ATOMIC_UNSAFE_AUTOMATIC_STORAGE) | ||
if (!__cuda_is_local((const void*) __ptr)) | ||
{ | ||
return false; | ||
} | ||
memcpy((void*) __ret, (void const*) __ptr, sizeof(_Type)); | ||
// Required to workaround a compiler bug, see nvbug/4064730 | ||
__nanosleep(0); | ||
return true; | ||
#else | ||
return false; | ||
#endif | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: This is only needing the size of _Type
Could we pass in a void* and either template this on size_t or pass it as an argument?
Applies below
_CCCL_DEVICE bool | ||
__cuda_compare_exchange_weak_if_local(volatile _Type* __ptr, _Type* __expected, const _Type* __desired, bool* __success) | ||
{ | ||
if (!__cuda_is_local((const void*) __ptr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are missing the #if defined(_LIBCUDACXX_ATOMIC_UNSAFE_AUTOMATIC_STORAGE)
.
I'd expect that we'd want it here for the same reason we want it in the other ops.
And ditto for the ops below.
(If we don't need it, I don't remember why)
// Enable bypassing automatic storage checks in atomics when using CTK 12.2 and below and if NDEBUG is defined. | ||
# if defined(_CCCL_CUDACC_BELOW_12_2) && !defined(NDEBUG) | ||
# define _LIBCUDACXX_ATOMIC_UNSAFE_AUTOMATIC_STORAGE | ||
# endif // _CCCL_CUDACC_BELOW_12_2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gevtushenko fyi (I remember you had feedback about this)
@@ -154,6 +154,8 @@ static inline _CCCL_DEVICE bool __atomic_compare_exchange_cuda(_Type* __ptr, _Ty | |||
__proxy_t* __ptr_proxy = reinterpret_cast<__proxy_t*>(__ptr); | |||
__proxy_t* __exp_proxy = reinterpret_cast<__proxy_t*>(__exp); | |||
__proxy_t* __des_proxy = reinterpret_cast<__proxy_t*>(&__des); | |||
bool __res = false; | |||
if (__cuda_compare_exchange_weak_if_local(__ptr_proxy, __exp_proxy, __des_proxy, &__res)) {{return __res;}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably a n00b question, but why the double braces {{}}
?
Description
Closes #2585
See #478 for more details.
This fixes UB when using automatic storage with atomics (which has little or no use-case)
Example of users encountering wrong values in automatic storage: https://stackoverflow.com/questions/75640847/what-exactly-does-cudaatomic-does
Checklist