add `_LIBCUDACXX_REQUIRES_EXPR` to the concepts emulation macros #2564
🟩 CI finished in 1h 49m: Pass: 100%/366 | Total: 4d 14h | Avg: 18m 06s | Max: 1h 18m | Hits: 34%/27865
| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| | CUB |
| | Thrust |
| | CUDA Experimental |
| | pycuda |
| | CCCL C Parallel Library |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| +/- | CUDA Experimental |
| +/- | pycuda |
| +/- | CCCL C Parallel Library |

🏃 Runner counts (total jobs: 366)
# | Runner |
---|---|
298 | linux-amd64-cpu16 |
28 | linux-arm64-cpu16 |
25 | linux-amd64-gpu-v100-latest-1 |
15 | windows-amd64-cpu16 |
I have to say that I am not a big fan of this, because it introduces a subtle potential for bugs: there is no short-circuiting pre-C++20.
Previously, if a type did not satisfy `resource`, we would not even check for `allocate_async`.
With the `resource` concept this is fine, because none of the later requirements rely on an earlier concept fragment. In libcu++, however, we have a lot of cases where an expression is invalid when a previous requirement is not met.
That means we would have a fragmentation in how we do things, with a nonobvious difference in the way they work.
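To illustrate the concern with a generic example (this is not libcu++ code, and every name below is hypothetical): combining two boolean traits with `&&` instantiates both operands, so a second trait that is only valid when the first one holds turns into a hard error. Selecting the second trait lazily avoids that. A minimal C++14 sketch:

```cpp
#include <type_traits>

// C++14 stand-in for std::void_t.
template <class...>
struct make_void { using type = void; };
template <class... Ts>
using void_t = typename make_void<Ts...>::type;

// SFINAE-friendly check: does T have a nested value_type?
template <class T, class = void>
struct has_value_type : std::false_type {};
template <class T>
struct has_value_type<T, void_t<typename T::value_type>> : std::true_type {};

// NOT SFINAE-friendly: instantiating this for a T without value_type
// is a hard error, because the failure happens in the base-class list.
template <class T>
struct value_type_is_int : std::is_same<typename T::value_type, int> {};

// Short-circuiting combination: value_type_is_int<T> is only instantiated
// when has_value_type<T> is true. (std::conjunction behaves the same way
// in C++17.)
template <class T>
constexpr bool int_container =
    std::conditional_t<has_value_type<T>::value,
                       value_type_is_int<T>,
                       std::false_type>::value;

// The non-short-circuiting spelling
//   has_value_type<T>::value && value_type_is_int<T>::value
// would instantiate value_type_is_int<int> and fail to compile for T = int.

struct ints   { using value_type = int; };
struct floats { using value_type = float; };

static_assert(int_container<ints>, "value_type is int");
static_assert(!int_container<floats>, "value_type is not int");
static_assert(!int_container<int>, "no value_type; second trait never instantiated");
```

The point of the sketch is only the instantiation order; whether a given emulation macro short-circuits depends on how its expansion is structured.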
ah that's true because of the change i made to the i've switched back to a nested
```cpp
// Expansion based on _LIBCUDACXX_CONCEPT_FRAGMENT
template <class _Resource>
inline auto __async_resource__LIBCUDACXX_CONCEPT_FRAGMENT_impl_(
    _Resource &__res, void *__ptr, size_t __bytes, size_t __alignment,
    ::cuda::stream_ref __stream)
    -> _Concept::_Enable_if_t<!(
        decltype(_Concept::_Requires<resource<_Resource>>,
                 _Concept::_Requires<_CUDA_VSTD::same_as<
                     void *, decltype(__res.allocate_async(__bytes, __alignment,
                                                           __stream))>>,
                 _Concept::_Requires<_CUDA_VSTD::same_as<
                     void, decltype(__res.deallocate_async(
                               __ptr, __bytes, __alignment, __stream))>>,
                 void(), false){})> {}
template <typename... _As>
inline char __async_resource__LIBCUDACXX_CONCEPT_FRAGMENT_(
    _Concept::_Tag<_As...> *,
    decltype(&__async_resource__LIBCUDACXX_CONCEPT_FRAGMENT_impl_<_As...>));
inline char (&__async_resource__LIBCUDACXX_CONCEPT_FRAGMENT_(...))[2];
template <class _Resource>
inline constexpr bool async_resource =
    (1u == sizeof(__async_resource__LIBCUDACXX_CONCEPT_FRAGMENT_(
               static_cast<_Concept::_Tag<_Resource> *>(nullptr), nullptr)));
```

```cpp
// Expansion based on _LIBCUDACXX_REQUIRES_EXPR
template <class _Resource>
inline constexpr bool async_resource = _Concept::_Requires_expr_impl<
    struct _Libcudacxx_requires_expr_detail_539,
    _Resource>::_Is_satisfied((_Concept::_Tag<void, _Resource> *)nullptr,
                              (void (*)(_Resource &__res, void *__ptr,
                                        size_t __bytes, size_t __alignment,
                                        ::cuda::stream_ref __stream)) nullptr);
struct _Libcudacxx_requires_expr_detail_539 {
  using _Self_t = _Libcudacxx_requires_expr_detail_539;
  // Well-formed only if every requirement in the decltype list is valid.
  template <class, class _Resource>
  static auto _Well_formed(_Resource &__res, void *__ptr, size_t __bytes,
                           size_t __alignment, ::cuda::stream_ref __stream)
      -> decltype(_Concept::_Requires<resource<_Resource>>,
                  _Concept::_Requires<_CUDA_VSTD::same_as<
                      void *, decltype(__res.allocate_async(
                                  __bytes, __alignment, __stream))>>,
                  _Concept::_Requires<_CUDA_VSTD::same_as<
                      void, decltype(__res.deallocate_async(
                                __ptr, __bytes, __alignment, __stream))>>,
                  void()) {}
  // Selected when _Well_formed<Args...> can be converted to Sig*.
  template <
      class... Args, class Sig,
      class = decltype(static_cast<Sig *>(&_Self_t::_Well_formed<Args...>))>
  static constexpr bool _Is_satisfied(_Concept::_Tag<Args...> *, Sig *) {
    return true;
  }
  // Fallback when the requirements are not met.
  static constexpr bool _Is_satisfied(void *, ...) { return false; }
};
```
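The overload-based detection used by the `_Is_satisfied` expansion can be reduced to a small standalone sketch. The names `sketch::tag`, `sketch::requires_v`, and `sketch::has_allocate` are illustrative assumptions rather than the library's actual helpers, and the requirement checked here is deliberately simpler than `async_resource`:

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {

template <class...>
struct tag {};

// Naming requires_v<false> is a substitution failure, because
// std::enable_if_t<false, int> does not exist.
template <bool B>
constexpr std::enable_if_t<B, int> requires_v = 0;

struct has_allocate_detail {
  // The trailing return type is valid only if res.allocate(bytes)
  // compiles and yields exactly void*.
  template <class R>
  static auto well_formed(R& res, std::size_t bytes)
      -> decltype(requires_v<std::is_same<
                      void*, decltype(res.allocate(bytes))>::value>,
                  void()) {}

  // Viable only if well_formed<Args...> exists with signature Sig.
  template <class... Args, class Sig,
            class = decltype(static_cast<Sig*>(
                &has_allocate_detail::well_formed<Args...>))>
  static constexpr bool is_satisfied(tag<Args...>*, Sig*) { return true; }

  // Fallback, chosen when the overload above drops out via SFINAE.
  static constexpr bool is_satisfied(void*, ...) { return false; }
};

template <class R>
constexpr bool has_allocate = has_allocate_detail::is_satisfied(
    static_cast<tag<R>*>(nullptr),
    static_cast<void (*)(R&, std::size_t)>(nullptr));

}  // namespace sketch

// Hypothetical test types.
struct good_resource { void* allocate(std::size_t) { return nullptr; } };
struct wrong_return  { int allocate(std::size_t) { return 0; } };
struct no_allocate   {};

static_assert(sketch::has_allocate<good_resource>, "");
static_assert(!sketch::has_allocate<wrong_return>, "");
static_assert(!sketch::has_allocate<no_allocate>, "");
```

Because all requirements live in one trailing return type, substitution stops at the first invalid one, which is what lets an early requirement guard the later expressions.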
🟨 CI finished in 1h 56m: Pass: 99%/366 | Total: 6d 10h | Avg: 25m 22s | Max: 1h 21m | Hits: 36%/27865
🟩 CI finished in 3h 15m: Pass: 100%/366 | Total: 6d 10h | Avg: 25m 24s | Max: 1h 21m | Hits: 36%/27865
🟩 CI finished in 3h 23m: Pass: 100%/366 | Total: 7d 18h | Avg: 30m 38s | Max: 1h 29m | Hits: 10%/27915
…DIA#2564)
* add `_LIBCUDACXX_REQUIRES_EXPR` to the concepts emulation macros
* work around nvcc pre-12.2 bug and mollify nvrtc
* silence warning about an always true condition
* simplify macro substitution with the help of an alias template
* fix the short-circuiting behavior of the `async_resource` concept pre-c++20
* replace C-style casts with C++-style `static_cast`s
* add missing `_LIBCUDACXX_HIDE_FROM_ABI` function annotations
* restore short-circuiting in the `resource` concept
Description
while working on other things, i stumbled on a nicer way to author "concepts" when concepts are not available. consider the definition of the `resource` concept today: first you have to define a goofy concept fragment (which needs to be named), and then you use the fragment when defining the concept.
this PR adds a macro named `_LIBCUDACXX_REQUIRES_EXPR` which can be used directly within the definition of the pseudo-concept. with it, the `resource` concept is written as a single definition, with no separately named fragment. i have also added support for type constraints on required expressions: within a `_LIBCUDACXX_REQUIRES_EXPR`, a required expression that carries a type constraint is equivalent to a C++20 compound requirement on that expression.
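As a standalone illustration of what a type constraint on a required expression means, the sketch below writes the same check twice: as a C++20 compound requirement when concepts are available, and as a plain SFINAE trait otherwise. It does not use the libcu++ macros, and `allocates_void_ptr` and both test types are hypothetical names chosen for this example:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

#if defined(__cpp_concepts) && __cpp_concepts >= 201907L
#include <concepts>

// C++20: the compound requirement checks that the expression is valid
// AND that its type is exactly void*.
template <class R>
concept allocates_void_ptr = requires(R& res, std::size_t bytes) {
  { res.allocate(bytes) } -> std::same_as<void*>;
};

#else

// Pre-C++20 emulation of the same check with SFINAE.
template <class R, class = void>
struct allocates_void_ptr_impl : std::false_type {};

template <class R>
struct allocates_void_ptr_impl<
    R, std::enable_if_t<std::is_same<
           void*, decltype(std::declval<R&>().allocate(
                      std::declval<std::size_t&>()))>::value>>
    : std::true_type {};

template <class R>
constexpr bool allocates_void_ptr = allocates_void_ptr_impl<R>::value;

#endif

// Hypothetical test types.
struct returns_void_ptr { void* allocate(std::size_t) { return nullptr; } };
struct returns_int      { int allocate(std::size_t) { return 0; } };

static_assert(allocates_void_ptr<returns_void_ptr>, "");
static_assert(!allocates_void_ptr<returns_int>, "");
```

Either branch gives the same answers for these two types, which is the property an emulation macro needs in order to be a drop-in replacement across language versions.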
i have tested this with C++14/17/20 and with microsoft's broken preprocessor.
we could replace all uses of `_LIBCUDACXX_CONCEPT_FRAGMENT` with `_LIBCUDACXX_REQUIRES_EXPR`. this PR only makes the change for the `resource` and `async_resource` concepts.
Checklist