enqueue_raw_kernel_launch_in_current_context does not check for an error with cudaGetLastError after launching, so kernel launch errors such as cudaErrorInvalidConfiguration go uncaught and the kernel launch fails silently.
Minimal example:
The following example defines a kernel whose launch should fail: the maximum number of threads per block is 1024, and we try to launch it with 1500 threads per block.
#include <device_launch_parameters.h>
#include <stdio.h>
#include <cstdlib>
#include <iostream>
#include <cuda/api.hpp>

using namespace std;

// Prints a greeting from the first thread of the grid only.
__global__ void test_kernel() {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i == 0) {
        printf("Hello CUDA\n");
    }
}

int main() {
    if (cuda::device::count() == 0) {
        std::cerr << "No CUDA devices on this system" << "\n";
        exit(EXIT_FAILURE);
    }
    cuda::device::current::set(cuda::device::get(0));
    auto device = cuda::device::current::get();
    try {
        cout << "Executing the kernel:" << endl;
        // 1500 threads per block exceeds the 1024 limit, so the launch is invalid.
        cuda::launch_configuration_t lc = cuda::launch_config_builder()
            .overall_size(2048)
            .block_dimensions(1500)
            .build();
        cuda::launch(test_kernel, lc);
    } catch (const std::exception& ex) {
        // Catch by const reference so the derived exception's message is not sliced away.
        cout << ex.what() << endl;
    }
    cuda::synchronize(device);
    return 0;
}
Current output:
Executing the kernel:
The kernel launch fails silently.
Expected output:
Executing the kernel:
Kernel launch failed: invalid configuration argument
The kernel launch error reported by cudaGetLastError should be converted into an exception, which is then caught by the catch block in the example.