I'm confused about when Triton chooses to use cp.async to load from global memory into shared memory (bypassing registers), and when it chooses the global->register->smem path instead.
I ask because cp.async can reduce register usage. (My kernel suffers from register spilling, and I want to find out whether I can use cp.async to reduce register pressure.)
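For reference, here is a minimal sketch of how one might check which path a compiled kernel took, by counting the relevant instructions in its PTX text. The helper name `count_copy_instructions` and the sample PTX are hypothetical and only for illustration. How the PTX string is obtained is also an assumption: in the Triton versions I have seen it is exposed on the compiled kernel handle as `.asm["ptx"]`, but this may differ between releases.

```python
import re

def count_copy_instructions(ptx: str) -> dict:
    """Count global->shared copy-related instructions in a PTX string.

    Hypothetical helper: it only does text matching on PTX mnemonics and
    does not depend on Triton itself.
    """
    patterns = {
        # Asynchronous global->shared copies (bypass registers).
        # commit_group / wait_group are deliberately not counted.
        "cp.async": r"\bcp\.async\.(?:ca|cg)\.shared\.global\b",
        # Synchronous path: load into registers, then store to shared memory.
        "ld.global": r"\bld\.global\b",
        "st.shared": r"\bst\.shared\b",
    }
    return {name: len(re.findall(pat, ptx)) for name, pat in patterns.items()}

# Illustrative PTX fragment (made up, not real compiler output).
sample_ptx = """
    cp.async.cg.shared.global [ %r10 + 0 ], [ %rd4 + 0 ], 0x10, %r11;
    cp.async.commit_group ;
    @%p1 ld.global.v4.b32 { %r1, %r2, %r3, %r4 }, [ %rd1 + 0 ];
    st.shared.v4.b32 [ %r5 ], { %r1, %r2, %r3, %r4 };
"""

counts = count_copy_instructions(sample_ptx)
print(counts)
```

If `cp.async` is zero while `ld.global` and `st.shared` are non-zero, the kernel is presumably going through registers for that load.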