Pipeline layout changes invalidate all bindings on DX12 #1926
Comments
So this to me seems partially like a developer education issue: users shouldn't be switching layouts many times per frame (yes, I know about getBindGroupLayout, I consider it mostly a misfeature). We should teach users that switching pipeline layouts is slow, and if they share pipeline layouts a lot, their code will be faster. |
You are totally correct. However, I think this is more than just an education issue. The API is good when it surfaces the costs. So, supposing I read the following code:
How can I reason about performance? It would require me to do this series of steps:
This is very well hidden. Basically, one can't look at the code and say how expensive it is. If we consider WebGPU on Vulkan and Metal, then the situation is much better: if I see |
In Dawn we chose to go with option 2) and lazily re-bind descriptor tables as needed. Looking at Vulkan drivers, they also lazily apply descriptor sets at draw time, and setting the pipeline invalidates the descriptor sets. While this shows that Vulkan's descriptor sets aren't a representative abstraction of what the hardware can do, I think that for WebGPU it is useful to help the developers by making the bindgroup state persistent so 1) they can do less |
Thanks for the links! I find the code fairly convincing. |
It could be that Mesa's drivers are just not optimized enough. |
There are probably things you can do on very bindless hardware to decouple the two things. I don't see anything in AMD hardware that would prevent implementing Vulkan in a way that keeps the bindings on pipeline change. However, what I think this shows is that drivers decided that the opportunity to do more optimizations (promoting some stuff to inline descriptors, other reductions of indirection, etc.) has more value than the pure CPU cost of emitting commands again. CPU perf is no longer the bottleneck: GPU perf is. |
D3D12 has an interesting command: SetGraphicsRootSignature.
In WebGPU terms, that would be equivalent to something like setPipelineLayout(required GPUPipelineLayout layout). It's documented that an application has to set the root signature before doing a draw call; it doesn't matter whether that is done before or after the pipeline setup.
Here is what particularly worries me:
Perhaps, we could get some clarification from Microsoft about the "behavior is undefined" part? I.e. why it's undefined, and what happens under the hood.
Since the WebGPU user doesn't have direct access to SetGraphicsRootSignature, the user agent has one of the following options:
Option-1
Set the root signature on the pipeline change (setPipeline). Re-bind all the resources in bind groups at this point.
This option has a problem: it makes setPipeline performance difficult to reason about. It would be lightweight on Vulkan/Metal but very heavy on D3D12, and only when the pipeline layout is changing.
Note: On Vulkan specifically, we also have to re-bind a portion of the pipeline layout, based on the layout compatibility rules.
At least it's possible to control the costs of it: if I know that my pipelines' layouts are only different in bind group number 3, then I can be assured that bind groups 0, 1, and 2 aren't going to be rebound internally when I do the pipeline switch.
On D3D12, however, it appears that any minor change in the pipeline layout leads to a performance cliff.
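To make the asymmetry concrete, here is a rough cost model of Option-1. It is only a sketch: the Layout type and both functions are hypothetical names, not part of WebGPU, Vulkan, or D3D12. It assumes a bind group is set in every slot of the new layout, that Vulkan keeps bindings valid up to the first slot where the two layouts differ, and that D3D12 drops all of them whenever the root signature changes.

```typescript
// Hypothetical cost model; none of these names exist in WebGPU or any backend API.
// A pipeline layout is modelled as one bind group layout id per slot.
type Layout = readonly string[];

// Index of the first slot where the two layouts differ
// (or the shorter length if one is a prefix of the other).
function firstMismatch(a: Layout, b: Layout): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return i;
  }
  return n;
}

// Number of bind groups the user agent must re-bind internally on setPipeline,
// assuming every slot of `next` already has a bind group set.
function rebindsOnSetPipeline(
  backend: "vulkan" | "d3d12",
  prev: Layout,
  next: Layout,
): number {
  const k = firstMismatch(prev, next);
  if (k === prev.length && k === next.length) return 0; // identical layouts
  // Assumption: a new root signature invalidates every table on D3D12,
  // while Vulkan only disturbs slots from the first mismatch onward.
  return backend === "d3d12" ? next.length : next.length - k;
}

// Two layouts that differ only in bind group 3.
const layoutA: Layout = ["frame", "material", "object", "shadowA"];
const layoutB: Layout = ["frame", "material", "object", "shadowB"];

const vulkanCost = rebindsOnSetPipeline("vulkan", layoutA, layoutB); // 1
const d3d12Cost = rebindsOnSetPipeline("d3d12", layoutA, layoutB); // 4
```

Under these assumptions the same setPipeline call costs one internal re-bind on Vulkan and four on D3D12, which is the cliff described above.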
Option-2
Don't bind any resources at setBindGroup. Instead, lazily bind everything at the draw call.
This option has a similar problem: it's difficult to reason about the cost of setBindGroup and draw when any lazy binding is taking place. It puts DX12 at a disadvantage, since Vulkan and Metal draw calls aren't encumbered by the lazy state.
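A minimal sketch of what Option-2 could look like inside a user agent, using dirty tracking. The LazyBindingEncoder class, its methods, and the nativeBinds counter are all hypothetical and only model the bookkeeping; they are not taken from any actual implementation. Pipeline layouts are reduced to string ids, and a layout change is assumed to invalidate every bound table.

```typescript
// Hypothetical model of lazy binding; names are illustrative, not a real API.
class LazyBindingEncoder {
  private groups: (string | undefined)[] = [];
  private dirty = new Set<number>();
  private layout: string | undefined;
  nativeBinds = 0; // backend descriptor-table binds actually issued

  setPipeline(layoutId: string): void {
    if (layoutId !== this.layout) {
      this.layout = layoutId;
      // Assumption: a new root signature invalidates every table,
      // so every bind group that is currently set becomes dirty.
      this.groups.forEach((g, i) => {
        if (g !== undefined) this.dirty.add(i);
      });
    }
  }

  setBindGroup(index: number, group: string): void {
    this.groups[index] = group; // record only; no backend work happens here
    this.dirty.add(index);
  }

  draw(): void {
    this.nativeBinds += this.dirty.size; // flush the deferred binds
    this.dirty.clear();
  }
}

const enc = new LazyBindingEncoder();
enc.setPipeline("layoutA");
enc.setBindGroup(0, "frame");
enc.setBindGroup(1, "material");
enc.draw(); // flushes 2 binds
enc.draw(); // nothing dirty, flushes 0
enc.setPipeline("layoutB");
enc.draw(); // layout changed, flushes both groups again
```

In this model setBindGroup is always cheap, and the cost moves into whichever draw happens to follow a layout change, which is why the price of an individual draw is hard to read off the calling code.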