Prepare example to isolate shader miscompilation #199
When tile_alloc uses a workgroup shared variable to broadcast `tile_offset`, it gives wrong answers, seemingly consistent with a nonexistent barrier, on Mac.
Uploading not to merge but to help with investigating a potential shader miscompilation. When run on mac (just
With exact values varying from run to run. The correct output is:
The values at offsets 4, 12, and 20 are instances of I looked at the naga output and the |
This version of the repro just runs a single very simple shader with two bindings, and doesn't use the preprocessor. It should print nothing (buffer is all zero), but prints garbage.
I have indeed reduced the example a lot. Currently it's just running tile_alloc.wgsl, which is now a very simple shader that just does an atomic add (on one thread), followed by a barrier and writing the result of that add.
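For readers without the branch checked out, the pattern described above looks roughly like the sketch below. This is not the actual tile_alloc.wgsl from the PR: the binding layout, the names `bump`, `out_buf`, and `sh_tile_offset`, and the workgroup size are all assumptions made for illustration.

```wgsl
// Hypothetical reduction of the pattern, not the shader from this PR.
// Binding numbers and all identifiers below are assumed.
@group(0) @binding(0)
var<storage, read_write> bump: atomic<u32>;

@group(0) @binding(1)
var<storage, read_write> out_buf: array<u32>;

// Workgroup shared variable used to broadcast the result to every thread.
var<workgroup> sh_tile_offset: u32;

@compute @workgroup_size(256)
fn main(
    @builtin(global_invocation_id) global_id: vec3<u32>,
    @builtin(local_invocation_id) local_id: vec3<u32>,
) {
    // Only one thread per workgroup performs the atomic add.
    // atomicAdd returns the value held before the addition.
    if local_id.x == 0u {
        sh_tile_offset = atomicAdd(&bump, 1u);
    }
    // Every thread waits here, so the write above is visible to all of them.
    workgroupBarrier();
    let tile_offset = sh_tile_offset;
    // Bounds check written by hand; the buffer length is not assumed.
    if global_id.x < arrayLength(&out_buf) {
        out_buf[global_id.x] = tile_offset;
    }
}
```

Under WGSL semantics, every thread in a workgroup should write the same value, namely whatever the atomic add returned for that workgroup. Differing or garbage values within one workgroup would be consistent with the barrier not taking effect.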
Make it so correct output is [1, 1, 1] rather than [0, 0, 0]
Not intended to merge, so closing. Working around the miscompilation is still a relevant issue, though, and this could be done at many possible layers in the stack: in the shader (as is currently the case), by disabling buffer robustness (or possibly changing the configuration to use clamping rather than conditionalization), or in naga, by avoiding patterns that trigger miscompilation (such as passing threadgroup shared variables by reference to the main entry point). Probably an issue should be opened to track that work.
We have an ugly workaround for miscompilation of a workgroup uniform load (see #199). The miscompilation no longer repros, and it's not clear we still need it. Possibly something changed in naga, or possibly Apple drivers have improved. It would be possible to go a little further and trim the allocation of the Paths array so it's not rounded up to workgroup size, but the practical benefit of that is marginal. Sending this as a PR to see if there still may be problems.