Async benchmarks always deadlock #136

Closed
gevtushenko opened this issue Jun 30, 2023 · 1 comment · Fixed by #157
Labels
type: bug: functional Does not work as intended.

Comments

@gevtushenko
Collaborator

The switch to lazy module loading by default in CTK 12.2 appears to have broken the async benchmarks: they now always deadlock. This can be reproduced by running nvbench.example.axes. Setting the environment variable CUDA_MODULE_LOADING=EAGER avoids the deadlock. We should either mention this workaround in the error message or set the variable ourselves.

@gevtushenko added the "type: bug: functional" label Jun 30, 2023
@alliepiper
Collaborator

We likely want eager loading by default anyway, to make sure that lazy module loads aren't affecting measurements. Let's look into setting that variable from the NVBench main implementation.

@jrhemstad added this to CCCL Aug 9, 2023
github-project-automation bot moved this to Todo in CCCL Aug 9, 2023
alliepiper added a commit to alliepiper/nvbench that referenced this issue Apr 3, 2024
alliepiper added a commit to alliepiper/nvbench that referenced this issue Apr 6, 2024
This is the best way we have to diagnose a regression for
NVIDIA#136.
alliepiper added a commit that referenced this issue Apr 6, 2024
* Set `CUDA_MODULE_LOADING=EAGER` before `main`.

Fixes #136

* Portability for `setenv`.

* Remove pre-main CUDART usage and setup env in main.

* Fail examples if they deadlock.

This is the best way we have to diagnose a regression for
#136.

* Add an initialize method to benchmark_manager for CUDA-related setup.

Benchmarks are created statically, so their constructors cannot call the CUDA APIs without breaking the CUDA_MODULE_LOADING setup.

This method is called from `main` after the environment has been configured.
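
The deferred-setup idea in the last item can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `benchmark_manager` code: the `registry` class, its `add` and `initialize` members, and the `device_setup` callback are all hypothetical names. Static registration only records work, and anything that would touch CUDA runs later, once `main` has configured the environment.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a benchmark registry; not NVBench's real class.
class registry
{
  struct entry
  {
    std::string name;
    std::function<void()> device_setup; // work that would call CUDA APIs
  };

  std::vector<entry> m_entries;
  bool m_initialized{false};

public:
  static registry &get()
  {
    static registry instance; // constructed on first use
    return instance;
  }

  // Safe to call during static initialization: it only stores the callback
  // and never runs it, so no CUDA call can happen before main.
  void add(std::string name, std::function<void()> device_setup)
  {
    m_entries.push_back({std::move(name), std::move(device_setup)});
  }

  // Intended to be called from main after the environment is configured.
  // Runs every deferred setup callback once; later calls do nothing.
  void initialize()
  {
    if (m_initialized)
    {
      return;
    }
    for (auto &e : m_entries)
    {
      e.device_setup();
    }
    m_initialized = true;
  }

  bool initialized() const { return m_initialized; }
  std::size_t size() const { return m_entries.size(); }
};
```

In this sketch, `main` would first set `CUDA_MODULE_LOADING` and only then call `registry::get().initialize()`, so the order of environment setup and first CUDA use is explicit rather than dependent on static initialization order.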
github-project-automation bot moved this from Todo to Done in CCCL Apr 6, 2024

2 participants