Added support for most of <mutex> #113
TODO: Test ports and heterogeneous tests.
@wmaxey @ogiroux is this just waiting on porting tests? I'd like to move forward with exploring NVIDIA/cccl#990, but it depends on having a |
@jrhemstad How important is this and should I try to get this rebased?
It's important, but not urgent. No need to try and get it merged before the next release. There is still a sizable amount of work to add tests before it can be merged anyways.
0f16134 to 2c19f4a
4952f9e to e8b2e53
...x/thread.mutex.requirements/thread.mutex.requirements.mutex/thread.mutex.class/lock.pass.cpp
Outdated
...ream-tests/test/cuda/thread/thread.mutex/thread.once/thread.once.callonce/call_once.pass.cpp
Outdated
handler.join_test_thread();
handler.syncthreads();

assert(init3_called[Sco] == 2);
Suggested change:
- assert(init3_called[Sco] == 2);
+ assert(init3_called[Sco] == 1);
if (init3_called[Sco] == 1)
#ifdef __CUDA_ARCH__
  _LIBCUDACXX_UNREACHABLE();
#else
  TEST_THROW(1);
#endif
Suggested change:
- if (init3_called[Sco] == 1)
- #ifdef __CUDA_ARCH__
-   _LIBCUDACXX_UNREACHABLE();
- #else
-   TEST_THROW(1);
- #endif
+ assert(init3_called[Sco] == 1);
Technically that would not be the same. We need to throw to ensure that we do not increment the init3_called variable afterwards, but still continue the test as expected.
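To make the point about throwing concrete, here is a minimal host-only sketch of the behavior this kind of test relies on: when the callable exits by exception, the flag stays unset, nothing after the throw runs, and the next call runs the callable again. `toy_once_flag`, `toy_call_once`, `once_counts`, and `run_once_demo` are hypothetical names for illustration and are not the implementation in this PR; `std::call_once` specifies the same behavior for an exceptional call.

```cpp
#include <atomic>

// Hypothetical toy once-flag (NOT the PR's implementation).
// 0 = not started, 1 = a call is running, 2 = completed.
struct toy_once_flag {
    std::atomic<int> state{0};
};

template <class F>
void toy_call_once(toy_once_flag& flag, F&& f) {
    for (;;) {
        int expected = 0;
        if (flag.state.compare_exchange_strong(expected, 1,
                                               std::memory_order_acquire)) {
            try {
                f();
            } catch (...) {
                // Exceptional call: reset the flag so a later call retries.
                flag.state.store(0, std::memory_order_release);
                throw;
            }
            flag.state.store(2, std::memory_order_release);
            return;
        }
        if (expected == 2) return;  // already completed by an earlier call
        // Otherwise another thread is running the callable: spin and retry.
    }
}

struct once_counts {
    int called = 0;       // how often the callable was entered
    int completed = 0;    // how often it ran to the end
    bool first_threw = false;
};

// Single-threaded walk-through: throw on the first call, succeed on the second.
inline once_counts run_once_demo() {
    toy_once_flag flag;
    once_counts c;
    auto init = [&c] {
        ++c.called;
        if (c.called == 1) throw 1;  // first attempt fails before completing
        ++c.completed;               // only reached on a successful attempt
    };
    try {
        toy_call_once(flag, init);
    } catch (int) {
        c.first_threw = true;
    }
    toy_call_once(flag, init);  // flag was reset, so init runs again
    toy_call_once(flag, init);  // flag is now set, so init is skipped
    return c;
}
```

Replacing the throw with a plain assert would let the first attempt fall through to the increment after it, which is the difference the reply is pointing at.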
While everything works fine locally, this PR is incredibly flaky in CI. We need to investigate why that is the case and find a strategy to deflake it.
if (__m_ == nullptr)
    __throw_system_error(EPERM, "unique_lock::lock: references null mutex");
if (__owns_)
    __throw_system_error(EDEADLK, "unique_lock::lock: already locked");
#endif // _LIBCUDACXX_NO_EXCEPTIONS
Without exceptions, on the host, we should always print the error and abort.
Without exceptions, on the device, we should call __trap(), and in debug mode we should also print the error.
The same applies to all other error handling in this PR where we already produce errors.
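The policy described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `SKETCH_HD`, `lock_precondition_error`, and `fail_without_exceptions` are hypothetical names, not helpers that exist in the library, and using `NDEBUG` to mean "not debug mode" is also an assumption. The CUDA-specific parts are guarded so the block also compiles as plain C++.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstdlib>

#if defined(__CUDACC__)
#  define SKETCH_HD __host__ __device__
#else
#  define SKETCH_HD
#endif

// Hypothetical helper mirroring the two checks in unique_lock::lock.
// Returns 0 when locking may proceed, otherwise the errno-style code.
SKETCH_HD inline int lock_precondition_error(bool has_mutex, bool owns) {
    if (!has_mutex) return EPERM;    // "references null mutex"
    if (owns)       return EDEADLK;  // "already locked"
    return 0;
}

// Hypothetical failure path for builds without exceptions.
SKETCH_HD inline void fail_without_exceptions(int ev, const char* what) {
#if defined(__CUDA_ARCH__)
#  if !defined(NDEBUG)
    // Device, debug mode (assumed to mean NDEBUG is not defined): print first.
    printf("%s (error %d)\n", what, ev);
#  endif
    __trap();  // device: always trap
#else
    // Host: always print the error, then abort.
    std::fprintf(stderr, "%s (error %d)\n", what, ev);
    std::abort();
#endif
}
```

A no-exceptions `lock()` would then call `fail_without_exceptions` whenever `lock_precondition_error` returns a nonzero code, instead of silently skipping the check.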
using __libcpp_mutex_base_t = __libcpp_mutex_t;
#else
template<int _Sco>
using __libcpp_mutex_base_t = __atomic_semaphore_base<_Sco,1>;
Some comments would help here, but from skimming through the implementation of __atomic_semaphore_base, it seems that it's not fair. Won't that starve threads from different NUMA nodes at system scope?
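For readers unfamiliar with the construction, here is a minimal sketch of a mutex layered on a semaphore whose maximum count is 1, and of why such a design has no built-in fairness. `toy_binary_semaphore` and `toy_mutex` are hypothetical stand-ins written with `std::atomic`; they are not `__atomic_semaphore_base`, and whether the real class behaves this way is exactly the open question in the comment above.

```cpp
#include <atomic>

// Hypothetical stand-in for a semaphore with a maximum count of 1.
class toy_binary_semaphore {
    std::atomic<int> count_{1};  // 1 = available, 0 = taken

public:
    bool try_acquire() {
        int expected = 1;
        return count_.compare_exchange_strong(expected, 0,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed);
    }
    void acquire() {
        // No queue and no tickets: whichever waiter's compare-exchange lands
        // first after a release wins, so a waiter can lose indefinitely.
        while (!try_acquire()) { /* spin */ }
    }
    void release() { count_.store(1, std::memory_order_release); }
};

// A mutex is then just the semaphore under lock/unlock names.
class toy_mutex {
    toy_binary_semaphore sem_;

public:
    void lock()     { sem_.acquire(); }
    bool try_lock() { return sem_.try_acquire(); }
    void unlock()   { sem_.release(); }
};
```

Because `toy_mutex` provides `lock`, `try_lock`, and `unlock`, it works with `std::lock_guard` and `std::unique_lock`. A fair variant would need some ordering of waiters, for example a ticket counter, which this sketch deliberately omits.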
@@ -274,6 +345,8 @@ void
swap(unique_lock<_Mutex>& __x, unique_lock<_Mutex>& __y) _NOEXCEPT
    {__x.swap(__y);}

#ifndef _LIBCUDACXX_HAS_THREAD_API_CUDA
Is condition variable supported?
Dropping in favor of NVIDIA/cccl#187
Adds mutex, timed_mutex, once_flag, call_once, unique_lock, scoped_lock, and the various free functions that go with them. Excludes only condition variable support and the recursive versions of mutex.
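As a reminder of what these facilities look like in use, here is a short host-side sketch written against the standard `std::` names. The namespace and any scope parameters this PR exposes are not shown in the conversation, so nothing here should be read as the PR's API; `guarded_increment`, `lock_both_and_swap`, and `double_lock_error` are hypothetical helpers for illustration.

```cpp
#include <mutex>
#include <system_error>
#include <utility>

// unique_lock: RAII ownership that can be queried and released early.
inline bool guarded_increment(std::mutex& m, int& counter) {
    std::unique_lock<std::mutex> lk(m);
    ++counter;
    return lk.owns_lock();  // true while the lock is held
}                           // unlocked when lk is destroyed

// std::lock is one of the free functions: it acquires several mutexes
// without deadlocking. std::scoped_lock is the C++17 shorthand for this.
inline void lock_both_and_swap(std::mutex& ma, int& a,
                               std::mutex& mb, int& b) {
    std::unique_lock<std::mutex> la(ma, std::defer_lock);
    std::unique_lock<std::mutex> lb(mb, std::defer_lock);
    std::lock(la, lb);
    std::swap(a, b);
}

// Locking a unique_lock that already owns its mutex reports
// resource_deadlock_would_occur (EDEADLK) when exceptions are enabled.
inline std::error_code double_lock_error(std::mutex& m) {
    std::unique_lock<std::mutex> lk(m);
    try {
        lk.lock();
    } catch (const std::system_error& e) {
        return e.code();
    }
    return std::error_code();
}
```

The last helper exercises the same "already locked" check that appears in the diff for `unique_lock::lock`.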