Hello! I currently need to use some NVIDIA library APIs (such as NVSHMEM) that I believe are not yet supported in Triton. So I want to know: if a user needs to insert a custom or third-party API call (with parameters related to threadIdx or blockIdx) into a Triton CUDA kernel, can this be done as efficiently and concisely as possible?
For the NVIDIA backend, Triton generates IR in the order ["ttir", "ttgir", "llir", "ptx", "cubin"]. Should the user focus on carrying the API call information through "ttir" and "ttgir", and then handle it when generating llir? (I'm not sure.)
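For context, here is the kind of thing I have in mind. This is only a minimal sketch, assuming a recent Triton with `tl.inline_asm_elementwise`; the kernel name, block size, and PTX snippet are just illustrative. It injects inline PTX that reads the hardware thread index, which is roughly the thread-level access a third-party device-side call would also need:

```python
import torch
import triton
import triton.language as tl


@triton.jit
def tid_kernel(out_ptr, BLOCK: tl.constexpr):
    offs = tl.arange(0, BLOCK)
    # Inline PTX: read the hardware thread index %tid.x for each lane.
    # A third-party device call would similarly be spliced in at this level.
    tid = tl.inline_asm_elementwise(
        "mov.u32 $0, %tid.x;",
        "=r,r",          # one output register, one (dummy) input register
        [offs],          # dummy input that fixes the elementwise shape
        dtype=tl.int32,
        is_pure=True,
        pack=1,
    )
    tl.store(out_ptr + offs, tid)


if torch.cuda.is_available():
    out = torch.empty(128, dtype=torch.int32, device="cuda")
    tid_kernel[(1,)](out, BLOCK=128)
```

This covers single-instruction cases, but it is unclear to me how to extend it to a full library call such as an NVSHMEM function, which is why I am asking about the IR pipeline above.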
Thanks for any help anyone could give.
The following are some references I found: