Suballocate DX12 buffer creation #3163
Conversation
Nice to see these changes, @Elabajaba. If there are any features that wgpu would like to see in our allocator, please just file an issue on the repo.
Blocked on #3207 until Mozilla gets around to vendoring windows-rs.
Codecov Report
@@            Coverage Diff             @@
##           master    #3163      +/-   ##
==========================================
+ Coverage   64.30%   64.36%   +0.05%
==========================================
  Files          83       85       +2
  Lines       42270    42397     +127
==========================================
+ Hits        27181    27287     +106
- Misses      15089    15110      +21
Just stumbled upon this PR. I'll see to accelerating Traverse-Research/gpu-allocator#138 so that you're unblocked in that regard!
As #3207 is basically perma-blocked until further notice, I think we should work around the situation by having a feature flag and falling back to the old behavior when it's disabled. This will let us continue to innovate, and also not force the issue with moz.
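A minimal sketch of that kind of gate, assuming a Cargo feature named `windows_rs` (the name used in the checklist below). `BufferPath` and `buffer_path` are hypothetical names for illustration and are not taken from the wgpu source:

```rust
/// Which buffer-creation path a given build takes (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum BufferPath {
    /// New behavior: suballocate from shared heaps.
    Placed,
    /// Old behavior: one committed resource per buffer.
    Committed,
}

/// Compiled only when the `windows_rs` feature is enabled.
#[cfg(feature = "windows_rs")]
pub fn buffer_path() -> BufferPath {
    BufferPath::Placed
}

/// Fallback compiled when the feature is disabled.
#[cfg(not(feature = "windows_rs"))]
pub fn buffer_path() -> BufferPath {
    BufferPath::Committed
}
```

Because exactly one of the two functions exists in any build, callers do not need their own `cfg` checks; a build without the feature simply keeps the old path.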
…and which is the slow path
Thank you so much for all this work! Looks great!
Checklist
[ ] ~~Blocked on Migrate to Windows-rs from winapi #3207~~ Worked around by feature gating it behind the `windows_rs` feature
`cargo clippy`
presser
Traverse-Research/gpu-allocator#138 lands, and see if migrating to presser would be needed
Connections
~~Blocked on #3207~~ Worked around by feature gating it behind the `windows_rs` feature
closes #2720
Description
DX12 is currently quite slow in wgpu. This PR uses gpu-allocator to batch allocations together into heaps, and creates buffers and textures with CreatePlacedResource instead of CreateCommittedResource. This leads to large performance gains: ~30-50% in "normal" scenarios, and significantly larger gains in write_buffer-heavy scenarios (~250x+ in an unrealistic scenario that calls write_buffer 1000x in a loop, going from ~1fps to ~250fps). In my testing there were no performance decreases.
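To illustrate the idea of suballocation, here is a toy sketch under stated assumptions. It is not gpu-allocator's algorithm or API: `HeapPool`, `Placement`, and the bump-pointer strategy are hypothetical, and no D3D12 calls are made. Each "heap" stands in for one large allocation, and each `Placement` stands in for the heap and offset that a placed resource would be created at.

```rust
/// Where a resource would be placed: a heap and a byte offset within it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Placement {
    pub heap_index: usize,
    pub offset: u64,
    pub size: u64,
}

/// Toy bump suballocator over fixed-size heaps (never frees).
pub struct HeapPool {
    heap_size: u64,
    /// Next free offset in each heap created so far.
    cursors: Vec<u64>,
}

impl HeapPool {
    pub fn new(heap_size: u64) -> Self {
        Self { heap_size, cursors: Vec::new() }
    }

    /// Number of heaps created so far (the expensive operations).
    pub fn heap_count(&self) -> usize {
        self.cursors.len()
    }

    /// Returns a placement, or `None` if the request can never fit.
    pub fn allocate(&mut self, size: u64, alignment: u64) -> Option<Placement> {
        if size == 0 || size > self.heap_size || !alignment.is_power_of_two() {
            return None;
        }
        // Cheap path: reuse space in a heap that already exists.
        for (heap_index, cursor) in self.cursors.iter_mut().enumerate() {
            let offset = align_up(*cursor, alignment);
            if offset + size <= self.heap_size {
                *cursor = offset + size;
                return Some(Placement { heap_index, offset, size });
            }
        }
        // Expensive path: a new heap is needed; the resource goes at offset 0.
        self.cursors.push(size);
        Some(Placement { heap_index: self.cursors.len() - 1, offset: 0, size })
    }
}

/// Rounds `value` up to the next multiple of a power-of-two `alignment`.
fn align_up(value: u64, alignment: u64) -> u64 {
    (value + alignment - 1) & !(alignment - 1)
}
```

With a 1024-byte pool, requests of 256 and 100 bytes at 256-byte alignment land in the same heap at offsets 0 and 256, so only one heap is created for both. A committed resource, by contrast, gets its own implicit heap per resource, which is the cost that batching avoids.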
Testing
Tested the examples, ran cargo test, backported it to 0.14 and tested against bevy+bistro, and tested against a modified water example that loops the render write_buffer 1000x on the main thread, 500x each on 2 scoped threads, or 100x each on 10 scoped threads, to make sure multithreading wouldn't panic.
It was quite a bit faster in all of these scenarios, except for bevy+bistro at 4k, where it was heavily GPU-limited and ran about the same.
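The shape of the multithreaded stress test can be sketched roughly as follows. This is not the modified water example itself: `fake_write_buffer` is a hypothetical stand-in for `write_buffer` that only counts calls, and `run_split` is an illustrative helper.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for a `write_buffer` call; it only counts invocations.
fn fake_write_buffer(calls: &AtomicUsize, data: &[u8]) {
    // Keep the compiler from optimizing the "upload" away entirely.
    std::hint::black_box(data);
    calls.fetch_add(1, Ordering::Relaxed);
}

/// Splits the writes across `threads` scoped threads and returns the total
/// number of writes performed. A panic in any worker propagates out of
/// `thread::scope`, so a panicking write would fail the whole run.
pub fn run_split(threads: usize, writes_per_thread: usize) -> usize {
    let calls = AtomicUsize::new(0);
    let data = [0u8; 64];
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..writes_per_thread {
                    fake_write_buffer(&calls, &data);
                }
            });
        }
    });
    // All scoped threads have been joined by the time `scope` returns.
    calls.load(Ordering::Relaxed)
}
```

Calling `run_split(2, 500)` or `run_split(10, 100)` mirrors the 2-thread and 10-thread splits described above; `run_split(1, 1000)` approximates the single-threaded case, although it runs on one spawned thread rather than the main thread.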
Potential Future Improvements
ZwAllocateLocallyUniqueId taking almost 40% of the time in the 1000x looped write_buffer test