RFC: WebGPURenderer prototype single uniform buffer update / pass #27388
Reduce the number of calls from I like your idea, but I wonder if it wouldn't be better to have this configured in Node and adjusted at |
Force-pushed: 8424829 to 1bc9388
Hi @aardgoose, do you plan to fix the conflicts? I was thinking about merging this PR soon. |
I'll take a look tomorrow. |
Awesome! Is it ready for review @aardgoose? Can you promote it from Draft to PR maybe? 😊 |
@RenaudRohlinger will do. We might want to select specific uniform groups to be managed in this way, which is now possible as the buffer is passed through the NodeBuilder. An obvious next stage is to look at reclaiming unused buffers, but we need a deallocation mechanism first, when a material is disposed of. |
Force-pushed: 34c5527 to 82107c4
Added free lists, one per extent size (a multiple of the block size), for buffers freed when objects are removed from the scene graph. New allocations take from these lists in preference to the free space at the end of the buffer. Block size is typically 256 bytes (https://web3dsurvey.com/webgpu/limits/minStorageBufferOffsetAlignment). Added a reworked example with continuous removal and addition of new objects, plus stats demonstrating buffer use. The example only uses blocks of 256 B or less. |
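As a rough illustration of the free-list scheme described above, here is a minimal sketch. It is not the code in this PR: the class name `BlockAllocator` and its methods are hypothetical, and it assumes a fixed 256-byte block size with extents measured in whole blocks.

```javascript
// Hypothetical sketch of a block allocator with one free list per extent size.
// None of these names come from the PR; they are for illustration only.
class BlockAllocator {
  constructor(totalBlocks, blockSize = 256) {
    this.blockSize = blockSize;     // assumed alignment, in bytes
    this.totalBlocks = totalBlocks; // capacity of the shared buffer, in blocks
    this.nextFreeBlock = 0;         // first never-used block at the end of the buffer
    this.freeLists = new Map();     // extent size (in blocks) -> array of start blocks
  }

  // Returns { start, blocks } or null when the buffer is full.
  allocate(byteLength) {
    const blocks = Math.max(1, Math.ceil(byteLength / this.blockSize));

    // Prefer a previously freed extent of exactly this size.
    const list = this.freeLists.get(blocks);
    if (list !== undefined && list.length > 0) {
      return { start: list.pop(), blocks };
    }

    // Otherwise take fresh space from the end of the buffer.
    if (this.nextFreeBlock + blocks > this.totalBlocks) return null;
    const start = this.nextFreeBlock;
    this.nextFreeBlock += blocks;
    return { start, blocks };
  }

  // Return an extent to the free list matching its size.
  free(extent) {
    let list = this.freeLists.get(extent.blocks);
    if (list === undefined) {
      list = [];
      this.freeLists.set(extent.blocks, list);
    }
    list.push(extent.start);
  }

  // Byte offset of an extent inside the shared buffer.
  byteOffset(extent) {
    return extent.start * this.blockSize;
  }
}
```

Keeping one list per extent size makes reuse an O(1) pop with no splitting or merging, at the cost of never coalescing neighbouring free extents. That trade-off seems reasonable when most uniform groups fit in a single block, as in the reworked example.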
Refactored and removed global state
Force-pushed: b3f48ce to 40d6ed1
I've been conducting performance benchmarks and believe this PR could significantly enhance the webgpu_performances.html example, particularly within the WebGL backend. It could potentially boost performance from around 30fps to over 120fps. Due to the force-push, I'm unable to check out the PR myself. If possible, could you give it a try? Additionally, to address the performance issues in While I'm still investigating the exact cause of the performance drop in WebGL, I'm fairly confident this PR addresses a major bottleneck. The issue seems to stem from overwhelming the GPU with hundreds of buffer uploads, or at least CPU-GPU data transfer, which then causes a drop in the subsequent 5-6 frames every 6 frames in the RAF. |
We need to check if the I haven't had time to implement UniformGroup on all nodes yet. If we don't do this, we won't be able to achieve optimal performance because the model's matrix groups will be confused with those of the material, causing unnecessary overhead for both backends. I think after this we will be able to implement buffer sharing more safely. |
@aardgoose This is better than I imagined. This is an entire ecosystem for managing uniform groups. I have to take a look at it in peace, because I see you've already put a lot of energy into it. |
What I don't see yet is the possibility of bundling uniforms in custom uniformsGroups as a user. This made it possible to separate uniforms that are constantly updated from uniforms that are rarely or not updated at all. |
I think this is a very good thing, but there is a lack of parameterization if users want to create UniformGroups themselves. As far as I can see, the UniformsGroup class is intended for use in the backend. As an illustrative example, here are some uniforms that I use. At the moment I can pass these all individually to the shader.
But I am also thinking a little further ahead, about where the journey with WebGPU is going. I replaced the classic attributes in my vertex shader with storage buffers because they give me access to all vertices. With a drawIndirect buffer and an info buffer that holds the visibility of each instance, I no longer have any attributes at all. The vertex shader is controlled via drawIndirect and my InstanceInfoBuffer, both of which I fill in a compute shader. |
Prototype mechanism to reduce the number of writeBuffer() calls by using a single large buffer for all object uniform groups, which is updated before the render pass is submitted, as some other WebGPU engines do. All examples run correctly with this PR. The effect is greatest when large numbers of objects are rendered. The largest change is in GPU thread time, which is greatly reduced when testing with the webgpu_sprites example: from 5 ms/frame with per-object buffers to 2.5 ms/frame with a single buffer in my brief testing.
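The general idea can be sketched roughly as follows. This is a simplified illustration, not the implementation in this PR: `SharedUniformStaging` and its methods are hypothetical names, and the sketch assumes all uniform data is 32-bit floats. The only real API used is `GPUQueue.writeBuffer(buffer, bufferOffset, data, dataOffset, size)`; a stand-in queue object is used in the usage lines so the snippet runs without a GPU.

```javascript
// Hypothetical sketch: stage all per-object uniform writes in one CPU-side
// array and upload them with a single writeBuffer() call per pass.
class SharedUniformStaging {
  constructor(queue, gpuBuffer, byteLength) {
    this.queue = queue;         // a GPUQueue, or any object with writeBuffer()
    this.gpuBuffer = gpuBuffer; // the single large GPU buffer
    this.data = new Float32Array(byteLength / 4);
    this.dirtyStart = Infinity; // dirty range, in bytes
    this.dirtyEnd = 0;
  }

  // Copy one object's uniform values into the staging array (no GPU call).
  set(byteOffset, values) {
    this.data.set(values, byteOffset / 4);
    this.dirtyStart = Math.min(this.dirtyStart, byteOffset);
    this.dirtyEnd = Math.max(this.dirtyEnd, byteOffset + values.length * 4);
  }

  // Call once before the render pass is submitted.
  flush() {
    if (this.dirtyEnd <= this.dirtyStart) return; // nothing changed
    const size = this.dirtyEnd - this.dirtyStart;
    // With an ArrayBuffer as data, dataOffset and size are given in bytes.
    this.queue.writeBuffer(this.gpuBuffer, this.dirtyStart, this.data.buffer, this.dirtyStart, size);
    this.dirtyStart = Infinity;
    this.dirtyEnd = 0;
  }
}

// Usage with a stand-in queue that only records calls.
const recordingQueue = { calls: [], writeBuffer(...args) { this.calls.push(args); } };
const staging = new SharedUniformStaging(recordingQueue, {}, 1024);
staging.set(0, [1, 0, 0, 1]);   // object A, block 0
staging.set(256, [0, 1, 0, 1]); // object B, block 1
staging.flush();                // one upload covering both objects
```

Each object would then bind its own slice of the shared buffer through the `offset` and `size` of its bind group entry. Uploading one contiguous dirty range also re-sends any untouched bytes in between, which is the assumption this sketch makes in exchange for a single call.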
No attempt has been made: