Is your feature request related to a problem? Please describe.
As discussed in several issues (#166, rapidsai/raft#1722, NVIDIA/cutlass#1027, NVIDIA/cub#545), a number of insidious problems arise from the visibility and linkage of __global__ functions and their enclosing functions.
Given the importance of getting this right for all current and future symbols, we should have a way to automate checking this as part of CI.
Describe the solution you'd like
In an ideal world, we would have a tool that verifies that __global__ functions and their immediately enclosing functions have hidden visibility (i.e., t/T as reported by nm).
I do not believe it is possible to robustly identify which functions invoke a __global__ function, so I doubt this part will be possible.
At minimum, we should be able to identify all __global__ symbols from an arbitrary object file and then verify their visibility using nm.
At a high level, I expect the solution will be:
Get a list of all the __global__ symbols
Cross-reference those symbols with nm and verify they are t/T (hidden)
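The cross-referencing step could be sketched roughly as follows. This is only an illustration, not an existing tool: the helper names (`run_nm`, `parse_nm`, `misplaced_kernels`) are hypothetical, and it assumes the default three-column `nm` output. The accepted type letters default to the t/T described above, but they are a parameter. GNU nm documents lowercase letters as usually local and uppercase as global, so a stricter check might accept only `t`.

```python
import subprocess


def run_nm(path):
    """Return the default-format `nm` output for an object file or library."""
    result = subprocess.run(["nm", path], capture_output=True, text=True, check=True)
    return result.stdout


def parse_nm(nm_output):
    """Map each symbol name to its nm type letter."""
    symbols = {}
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols look like "<address> <type> <name>";
        # undefined ones look like "<type> <name>".
        if len(parts) >= 2 and len(parts[-2]) == 1:
            symbols[parts[-1]] = parts[-2]
    return symbols


def misplaced_kernels(kernel_names, nm_output, allowed_types=("t", "T")):
    """Return {name: type} for kernels whose nm type is not allowed.

    Kernels that nm does not list at all are reported with type None.
    """
    symbols = parse_nm(nm_output)
    return {
        name: symbols.get(name)
        for name in kernel_names
        if symbols.get(name) not in allowed_types
    }
```

In CI this might be wired up as `misplaced_kernels(kernels, run_nm("A.so"))`, failing the job whenever the returned dictionary is non-empty.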
To get a list of all the __global__ symbols there are a few preliminary options:
Use cuobjdump A.so --list-text
Use nm and grep for __device_stub*. Everything after __device_stub should be the kernel symbol name
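The nm-based option might look like the sketch below. It applies the heuristic literally: for any symbol containing `__device_stub`, everything after that marker is taken as the kernel name. The function name `kernel_names_from_nm` is hypothetical, and the exact shape of the stub symbols is an assumption here. The remainder may need trimming (for example, leading underscores or a trailing mangled parameter list) once checked against real output.

```python
STUB_MARKER = "__device_stub"


def kernel_names_from_nm(nm_output):
    """Collect whatever follows `__device_stub` in each symbol nm reports."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        if not parts:
            continue
        symbol = parts[-1]  # the symbol name is the last column
        _, marker, rest = symbol.partition(STUB_MARKER)
        # Keep the remainder verbatim; no trimming or demangling is attempted.
        if marker and rest:
            names.add(rest)
    return sorted(names)
```

The output is deliberately unprocessed so that any mismatch with the names reported by the other option is easy to spot.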
Describe alternatives you've considered
No response
Additional context
This solution should be general enough that it could be shared and used by other projects like RAFT and CUTLASS.
Area
General CCCL