pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle #406

janehwu · 2023-03-03T00:59:24Z

Hi, I'm trying to execute a CUDA kernel inside the backward() of a PyTorch autograd.Function during network training (mixing PyTorch and PyCUDA, which I know is tricky). It seems that PyTorch autograd changes the context used by PyCUDA, so I get a cuFuncSetBlockShape error whenever I try to execute any kernel.

A sketch of my code is below:

import pycuda.driver as drv
from torch import autograd
from cuda_kernel_file import mod

# This does nothing
DummyFunction_gpu = mod.get_function("Dummy_Function")

drv.init()
pycuda_ctx = drv.Device(0).retain_primary_context()

class CustomFunction(autograd.Function):
    @staticmethod
    def forward(ctx, ...):
        # Do something

    @staticmethod
    def backward(ctx, ...):
        blockdim = 1, 1, 1
        blocks_per_grid = 1, 1, 1
        DummyFunction_gpu(block=blockdim, grid=blocks_per_grid)

And the error I'm getting is:

    loss.backward()
  File "/home/janehwu/anaconda3/envs/cir/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/janehwu/anaconda3/envs/cir/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/janehwu/anaconda3/envs/cir/lib/python3.8/site-packages/torch/autograd/function.py", line 274, in apply
    return user_fn(self, *args)
  File "/home/janehwu/test.py", line 257, in backward
    DummyFunction_gpu(block=blockdim, grid=blocks_per_grid)
  File "/home/janehwu/anaconda3/envs/cir/lib/python3.8/site-packages/pycuda/driver.py", line 481, in function_call
    func._set_block_shape(*block)
pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle

I've also tried adding pycuda_ctx.push() and pycuda_ctx.pop() before/after the kernel call, but that gives the same error.

Interestingly, this is only a problem on an A100, and the above code works fine on a 3090 (with pycuda version 2021.1). Is it possible to resolve this error on the A100 with pycuda version 2022.2.2? Thanks!
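One possible explanation, offered as a guess rather than a diagnosis: a kernel handle is only valid in the CUDA context that was current when its module was loaded, and autograd may run backward() on a thread where that context is not current. In the sketch above, mod.get_function() runs at import time, before retain_primary_context() is called, so the handle may belong to a different context than the one pushed later. A minimal sketch of resolving the handle lazily inside the pushed context follows. The names active_context, LazyKernel, and load_module are hypothetical helpers, not part of PyCUDA; the code only assumes the context object has push() and pop().

```python
from contextlib import contextmanager


@contextmanager
def active_context(cuda_ctx):
    """Make cuda_ctx current on the calling thread for the duration of the block."""
    cuda_ctx.push()
    try:
        yield cuda_ctx
    finally:
        # Always restore the previous context, even if the kernel call raises.
        cuda_ctx.pop()


class LazyKernel:
    """Resolve a kernel handle inside the context it will later run in."""

    def __init__(self, cuda_ctx, load_module, name):
        self._ctx = cuda_ctx
        # Hypothetical callable that builds/loads the module (e.g. wraps SourceModule).
        self._load_module = load_module
        self._name = name
        self._func = None

    def __call__(self, *args, **kwargs):
        with active_context(self._ctx):
            if self._func is None:
                # First call: load the module while the target context is current.
                self._func = self._load_module().get_function(self._name)
            return self._func(*args, **kwargs)
```

With the names from the question, this would be used roughly as DummyFunction_gpu = LazyKernel(pycuda_ctx, load_module, "Dummy_Function"), where load_module is whatever currently builds mod in cuda_kernel_file. Whether this resolves the A100 case is untested.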

inducer (Maintainer) · 2023-03-03T04:06:41Z

Honestly not sure, and I don't currently have the bandwidth to help. Sorry!

laugh12321 · 2024-01-20T14:32:01Z

I encountered a similar issue. When I set grid and block to (32, 32, 1), the program runs normally. However, with larger values such as (64, 64, 1), I get the error pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid argument. Here is the code.

code

self._preprocess_kernel(
    self._image_device, width * 3, width, height, 
    device, dst_width, dst_height, 
    np.int32(128), self._d2s_device,
    grid=(32, 32, 1),
    block=(32, 32, 1),
    stream=self._stream
)

error

  File "e:\Projects\Cpp\YOLO\python\yolo.py", line 271, in _preprocess
    self._preprocess_kernel(
  File "D:\laugh\Program\Miniconda3\envs\paddle\lib\site-packages\pycuda\driver.py", line 481, in function_call
    func._set_block_shape(*block)
pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid argument
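
A likely cause of this second error, which differs from the "invalid resource handle" case: a block of (64, 64, 1) asks for 4096 threads, while current NVIDIA GPUs typically allow at most 1024 threads per block. (32, 32, 1) is exactly 1024, which would explain why it works. The sketch below checks a block shape against assumed limits and computes a grid to cover the output. check_block and grid_for are hypothetical helpers, and the one-thread-per-pixel mapping over dst_width x dst_height is an assumption about the kernel.

```python
import math

# Assumed limits, typical for current NVIDIA GPUs. To be sure, query the device, e.g.
# drv.Device(0).get_attribute(drv.device_attribute.MAX_THREADS_PER_BLOCK)
MAX_THREADS_PER_BLOCK = 1024
MAX_BLOCK_DIMS = (1024, 1024, 64)


def check_block(block, max_threads=MAX_THREADS_PER_BLOCK, max_dims=MAX_BLOCK_DIMS):
    """Return True if the block shape fits within the given limits."""
    within_dims = all(1 <= b <= m for b, m in zip(block, max_dims))
    return within_dims and math.prod(block) <= max_threads


def grid_for(width, height, block):
    """Blocks needed to cover a width x height image with one thread per pixel."""
    return (math.ceil(width / block[0]), math.ceil(height / block[1]), 1)
```

To process a larger image, keep the block at or below the limit and grow the grid instead, for example grid=grid_for(dst_width, dst_height, (32, 32, 1)) with block=(32, 32, 1). The kernel then needs a bounds check for threads that fall outside the image.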
