Add info about NVTX ranges to dev guide #8461

Merged
merged 1 commit into from
Jun 14, 2021
7 changes: 7 additions & 0 deletions cpp/docs/DEVELOPER_GUIDE.md
@@ -342,6 +342,7 @@ namespace detail{
} // namespace detail

void external_function(...){
CUDF_FUNC_RANGE(); // Automatically generates an NVTX range for the lifetime of this function
detail::external_function(...);
}
```
@@ -355,6 +356,12 @@ asynchrony if and when we add an asynchronous API to libcudf.
**Note:** `cudaDeviceSynchronize()` should *never* be used.
This limits the ability to do any multi-stream/multi-threaded work with libcudf APIs.

### NVTX Ranges

To aid performance optimization and debugging, all compute-intensive libcudf functions should have a corresponding NVTX range.
libcudf provides a convenience macro, `CUDF_FUNC_RANGE()`, that automatically annotates the lifetime of the enclosing function and uses the function's name as the name of the NVTX range.
For more information about NVTX, see the [NVTX C++ documentation](https://github.com/NVIDIA/NVTX/tree/dev/cpp).
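
To illustrate what "annotate the lifetime of the enclosing function" means, here is a minimal sketch of the RAII pattern that such a macro could rely on. It is not libcudf's implementation and does not call NVTX: `scoped_range`, `range_log()`, and `EXAMPLE_FUNC_RANGE()` are hypothetical names invented for this example, and the log is a stand-in for the profiler so the ordering of events is visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a profiler backend (assumption for illustration only):
// records range events in the order they happen.
inline std::vector<std::string>& range_log()
{
  static std::vector<std::string> log;
  return log;
}

// Hypothetical RAII guard: opens a range on construction and closes it on
// destruction, so the range spans exactly the guard's scope.
class scoped_range {
 public:
  explicit scoped_range(char const* name) : name_{name}
  {
    assert(name_ != nullptr);
    range_log().push_back(std::string{"push:"} + name_);
  }
  ~scoped_range() { range_log().push_back(std::string{"pop:"} + name_); }

  scoped_range(scoped_range const&)            = delete;
  scoped_range& operator=(scoped_range const&) = delete;

 private:
  char const* name_;
};

// Hypothetical macro: names the range after the enclosing function via __func__.
#define EXAMPLE_FUNC_RANGE() scoped_range const example_func_range_guard{__func__}

namespace detail {
void external_function() { range_log().push_back("work"); }
}  // namespace detail

void external_function()
{
  EXAMPLE_FUNC_RANGE();          // range opens here
  detail::external_function();   // work happens inside the range
}                                // range closes when the guard is destroyed
```

Because the guard is a local object, the range closes on every exit path, including early returns and exceptions, which is why a single macro call at the top of the public function is enough.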

### Stream Creation

There may be times in implementing libcudf features where it would be advantageous to use streams