BatchedMesh: Proposal #22376
Sounds like it fits https://developer.mozilla.org/en-US/docs/Web/API/WEBGL_multi_draw
Advantage:
Disadvantage:
|
Oh even better! Do you know if there's any signal from Firefox or Safari on plans to support it yet? If it's not on the way I would probably lean toward doing a first implementation without it, but trying to keep a compatible API so we can upgrade later. This also gives the advantage that we can implement it in … |
This would be really helpful... And it saves us from having to keep an extra (un-transformed) copy of vertex data in memory. If I understand correctly, dynamic batching in popular game engines will not do this, but instead rewrites vertices on the CPU. I'm not sure why that would be the case if this extension is widely available. 🤔 |
@donmccurdy I did some experiments with this idea a few years back; it's definitely an improvement if you have lots of mostly-static geometry in the scene. One trick I found that made it easier to work with is to use proxy objects, something like:
const batchedMesh = new BatchedMesh( templateGeometry, material, maxVertexCount, maxTriangleCount );
// Append meshes, up to max vertex and triangle limit.
const obj1 = batchedMesh.add( mesh1 );
const obj2 = batchedMesh.add( mesh2 );
const obj3 = batchedMesh.add( mesh3 );
function update(dt) {
obj1.position.set(Math.sin(performance.now() / 1000), 0, 0);
obj1.rotation.y += dt;
// etc
} |
I have been wanting to introduce this extension.
The Firefox tracking bug is https://bugzilla.mozilla.org/show_bug.cgi?id=1536673. I have no idea about Safari.
That sounds good to me. Even if we use the extension, we will need a fallback path until all major browsers support it. Even without the extension, we may be able to use the uniform array approach by adding extra vertex data holding an id. It may be a bit memory costly, though.
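A minimal sketch of that extra-vertex-data idea (the helper name is hypothetical; it assumes each member geometry is later merged into the batch):

import * as THREE from 'three';

// Tag every vertex of a member geometry with its batch id so the shader can
// look up that member's matrix later (from a uniform array or a data texture).
function addBatchIdAttribute( geometry, batchId ) {

	const vertexCount = geometry.attributes.position.count;
	const ids = new Float32Array( vertexCount ).fill( batchId );
	geometry.setAttribute( 'batchId', new THREE.BufferAttribute( ids, 1 ) );

}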
Yeah. And we will not need to transfer updated positions to WebGL, which is another performance advantage.
Interesting... I guess updating a big uniform array every frame might be costly, or they want scalability (the maximum uniform array size limits the number of objects)? How can I know the maximum uniform array size? Max Fragment/Vertex Uniform Vectors? (See the sketch below.) And I'm wondering how many objects fit per one BatchedMesh. Personally I prefer your API idea because it would be more user friendly, but a few thoughts come to mind, for example what … Update: I prefer passing …
|
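A rough way to answer the question above about the maximum uniform array size, assuming an existing WebGLRenderer named renderer:

// Query how many mat4s could fit in a vertex-shader uniform array.
const gl = renderer.getContext();
const maxVec4s = gl.getParameter( gl.MAX_VERTEX_UNIFORM_VECTORS ); // available vec4 slots
const maxMatrices = Math.floor( maxVec4s / 4 ); // a mat4 occupies 4 vec4 slots
console.log( maxMatrices, renderer.capabilities.maxVertexUniforms ); // the latter reports the same limit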
Maybe two ways to approach the proxy objects:
Perhaps relevant, Babylon.js' instancing API offers a choice of proxy objects and "thin" instancing (without proxies), and the latter improves performance "drastically" for large counts. I don't necessarily want to jump straight to offering two APIs, but do wonder if there are lessons learned here. Also – the performance difference between "merged" and "naive" on this example is... not much. Perhaps because it's the same geometry? Both around 25-30 FPS on my machine. Without a better understanding of why that is, it might be better to skip all the avoidable scene graph overhead for now, until we can prove that we've got a performance win, and then make the API easier to use from there. |
why not simply use mesh and that's it? i wonder why we need proxies or object ids at all. once the meshes are added the batchedmesh instance has full access to all its children and can read out the transforms whenever it wants, or maybe i just don't understand the limitations that would cause. but in any case, i would always prefer an api surface that can be expressed imperatively and declaratively, and that is something three's own object3d/mesh does exceptionally well.
const batchedMesh = new BatchedMesh(templateGeometry, material, maxVertexCount, maxTriangleCount)
batchedMesh.add(mesh1)
batchedMesh.add(mesh2)
batchedMesh.add(mesh3)
...
function update(dt) {
mesh1.position.set(Math.sin(performance.now() / 1000), 0, 0)
mesh1.rotation.y += dt
}
...
batchedMesh.remove(mesh1)
...
edit: i take it the main problem with meshes is the material. batched mesh already sets one, so repurposing mesh like that probably is unwanted. could an intermediate object maybe help? if this isn't thrown into a batch it could maybe render black or not at all.
class MeshInstance extends THREE.Object3D {
constructor(geom) {
super()
this.geometry = geom
// this.needsUpdate = false → is this the reason we're talking proxies? aren't flags a precedent?
}
} |
Using a proxy object (either a literal ES6 Proxy or just some intermediate with getters/setters) can help with performance. There isn't really an easy built-in way to observe an object for changes and only run some code (e.g., reprojecting some vectors) when it changes position, orientation, scale, or any other properties that could be expressed as uniforms or attributes. Using a proxy object you can present the same interface, but set dirty flags which can then be checked each frame to know whether that object needs updating in the batched mesh.
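A minimal sketch of that dirty-flag idea; BatchHandle is hypothetical, and setMatrixAt stands in for whatever update method the batch ends up exposing:

import * as THREE from 'three';

class BatchHandle {

	constructor( batchedMesh, id ) {

		this.batchedMesh = batchedMesh;
		this.id = id;
		this.matrix = new THREE.Matrix4();
		this.needsUpdate = false;

	}

	setMatrix( matrix ) {

		this.matrix.copy( matrix );
		this.needsUpdate = true;

	}

	// called once per frame by the batch; only dirty handles touch the batched data
	flush() {

		if ( this.needsUpdate ) {

			this.batchedMesh.setMatrixAt( this.id, this.matrix );
			this.needsUpdate = false;

		}

	}

}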
Part of my experiment involved exposing this in a declarative way for JanusWeb, if I recall it just involved wrapping the objects I wanted to merge in a <CombinedMesh>
<Object src="model1" pos="1 2 3" />
<Object src="model2" pos="0 0 1" />
...etc...
</CombinedMesh>
As proposed, yeah, the BatchedMesh object takes a material as the argument, which would make the … |
There's a big difference on my machine. When I put the slider to full it's 60fps on both instanced and merged, 16fps on naive. |
I'm a little late to the party. Here is a set of ideas (not necessarily mutually compatible):
new BatchedMesh( material, maxVertexCount, maxTriangleCount );
new BatchedMesh( material );
new BatchedMesh( arrayOfGeometries, material );
const id = batchedMesh.add( mesh );
This might be a shortcut of:
const id = batchedMesh.add( mesh.geometry, mesh.matrixWorld );
for( var i=0; i<n; i++) batchedMesh.setMatrixAt( i, otherMatrix );
|
Many vertex attributes do not have meaningful default values (e.g. UVs, normals, tangents), and because of how WebGL semantics work we can't really ignore attributes for specific parts of the batch. I'm hoping that being strict about these requirements will create fewer surprises for users than hiding them behind leaky workarounds. I'm not 100% set on having the "template geometry" argument, but we do get complaints when a subclass does not accept the same initial arguments as its parent class.
Note that these are maximum vertex/index counts; they don't have to be exact. I think being able to preallocate a certain amount of space for adding and removing objects in the batch (without incurring the cost of rebuilding) is an important benefit. InstancedMesh takes a similar approach.
Hm this is a good point. We probably do want users to be able to iterate over the objects. But we also want stable IDs, and if an object is removed from the batch, would that invalidate higher indexes? Unlike with InstancedMesh, we actually could support a sparse index list here, e.g. having objects at indexes 1-4 and 6-10, and "nothing" at index=5, could be supported if we wanted.
I don't think we can support every feature and use case of BufferGeometry here; e.g. if you need to recompute normals it would be best to keep the original BufferGeometry around. Two other thoughts. First, a static flag could control whether the original geometry is retained:
batchedMesh.addGeometry( geometry, matrix, static = false ); // stores geometry
batchedMesh.addGeometry( geometry, matrix, static = true ); // does not store geometry, setMatrixAt will fail
In either case we can retain the geometry's bounding box to support raycasting. |
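A rough sketch of the bounding-box raycast that enables; the per-member record ({ box, matrix }) is hypothetical bookkeeping, not part of any proposed API:

import * as THREE from 'three';

const _inverse = new THREE.Matrix4();
const _localRay = new THREE.Ray();

// Cheap broad-phase test: transform the ray into the member's local space
// and test it against the stored (untransformed) bounding box.
function rayHitsMember( raycaster, member ) {

	_inverse.copy( member.matrix ).invert();
	_localRay.copy( raycaster.ray ).applyMatrix4( _inverse );
	return _localRay.intersectsBox( member.box );

}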
Alternative, index-based API:
const templateGeometry = geometries[ 0 ];
const batchedMesh = new BatchedMesh( templateGeometry, material, maxVertexCount, maxIndexCount );
batchedMesh.setGeometryAt( 0, geometries[ 0 ], matrix1 );
batchedMesh.setGeometryAt( 1, geometries[ 1 ], matrix2, static = true );
console.log( batchedMesh.count ); // → 2
batchedMesh.setGeometryAt( 1, null );
console.log( batchedMesh.count ); // → 1 (trailing nulls are dropped?)
batchedMesh.setMatrixAt( 0, matrix3 );
^this could be compatible with a proxy- or handle-based API, too, but I guess I like the idea of not requiring proxies to use BatchedMesh, and/or making it possible to build that proxy interface in userland. |
I see. Two questions:
The answers may give clues about which features need specific attention for optimization. Personally, I see myself using BatchedMesh for a project with hundreds of rooms in dungeons (https://boytchev.github.io/meiro/) - the rooms share the same material but are geometrically different, they are static, and having them batched will benefit from frustum culling. |
Both are important; I wouldn't want to pick just one. But
Being able to loop over the batch later seems fundamental for a container API, e.g. for raycasting and frustum culling. This thread is intended to gather feedback on what users need from the API. |
I really like this addition. I'd hacked together a similar helper a while ago for a prototype that used mesh skinning to get this type of batching with dynamic transforms: https://twitter.com/garrettkjohnson/status/1390406757134917638 A bit on my use case: the robotics models I was loading were deeply nested with a lot of rigid meshes that shared a few materials, and the number of draw calls was taxing the framerate in VR. I had already written the code for all the raycast interactions, so I wanted something that would "just work" and be a drop-in replacement without other changes. The approach used skinning, bone weights, and a similar "proxy" approach for raycasting and transform updating. I think everything I described above, or that has been described regarding "proxies", can be implemented on top of the originally proposed API and more. I like the approach of manually adding geometry for three.js core because it's simple, flexible, and doesn't impose any transform update overhead (matrix world updates and batched transform assignment). The Proxy meshes approach could make a nice example class, though, for those that want that convenience.
I think a sparse index makes sense. I'm imagining a mapping from id -> position in index array (and length of geometry) so it can be removed or updated easily.
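For example, a sketch of that bookkeeping (id, indexOffset, and geometry are assumed to exist elsewhere; names are hypothetical):

// id -> location and length of that geometry's slice in the merged index buffer
const ranges = new Map();
ranges.set( id, { start: indexOffset, count: geometry.index.count } );
// removal: mark the slice free (or compact later slices down) and update their entries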
What's your reasoning for making the …? Definitely looking forward to this! |
I was thinking of the |
Oh I see, I misunderstood the first few posts in the thread -- I didn't realize you were originally intending to transform every vertex on the CPU. I think doing the vertex transformations in a shader would be a good approach. If I were making a more purpose-built, from-scratch implementation of my prototype above I wouldn't use skinning, because it involves multiple weights per vertex and therefore more transformations. In theory, without the multi_draw extension you'd only need a single new vertex attribute storing the geometry index (just another 1 byte per vertex for up to 256 geometries?) which would be used to access a matrix4 uniform to rigidly transform the geometry.
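A rough GLSL sketch of that single-attribute approach, written here as a three.js ShaderMaterial vertex shader; batchId and batchingMatrices are made-up names, and the array size is only an example:

const vertexShader = /* glsl */ `
	attribute float batchId;
	uniform mat4 batchingMatrices[ 64 ]; // size bounded by MAX_VERTEX_UNIFORM_VECTORS / 4

	void main() {

		// rigidly transform this vertex by its owning geometry's matrix
		mat4 batchingMatrix = batchingMatrices[ int( batchId ) ];
		gl_Position = projectionMatrix * modelViewMatrix * batchingMatrix * vec4( position, 1.0 );

	}
`;

With WEBGL_multi_draw available, the per-vertex id could presumably be replaced by gl_DrawID.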
I've heard of this, too, and it's not clear why something like the skinning or shader-transform approach wouldn't be preferred even if the extension is available. It's hard to find much about dynamic batching online that doesn't just discuss Unity's approach at a high level, but I did find this GPU Gems 2 article, which touches on a few different batching approaches and mentions it, though it's fairly dated (2005). And from Godot's documentation on batching and vertex baking it seems they "bake" vertex transforms and colors to avoid per-instance parameters. Perhaps it's worth asking Juan? |
I went ahead and messaged Juan from Godot and Aras from Unity to get their take on the modern benefits of baking vertices for dynamic batching, and the bottom line is it doesn't sound like something we should be using. For Unity it sounds like dynamic batching with vertex baking was designed for use on platforms that don't support shaders. And for Godot apparently that style of vertex baking is only done for 2D rendering because it affords some other tricks in that scenario. Juan also mentioned mobile limitations (max UBO size, texture fetch performance) and a special case in tile-based deferred rendering (which we're not using) that might crop up with the uniform matrix4x4 array or texture approach, but considering we're already using data textures for skinning I'd think that's what we should go with, unless we find that the uniform limitation wouldn't be a problem. Lastly, here are a few more thoughts on a few of the API functions:
class BatchedMesh extends Mesh {
// no template geometry needed? Derive the required attributes from the first geometry
// added which cannot be changed after
constructor( material, maxVertexCount, maxTriangleCount );
// include offset and count so only a subset of the geometry can be added (a la geometry groups
// for cases where only a piece of the geometry is rendered with the appropriate material)
addGeometry( geometry, offset = 0, count = Infinity ) : id;
updateGeometry( id, geometry, offset = 0, count = Infinity );
removeGeometry( id );
setMatrixAt( id, matrix );
// I'm not exactly sure how visibility would work without the multi draw arrays extension. Perhaps
// all vertices for that geometry are collapsed to a single point outside of the screen clip?
setVisibilityAt( id, visible );
} |
Thanks @gkjohnson! I'm happy to take their word for it and skip vertex baking, or at most consider it a short-term solution. I think you're suggesting an … I'm having trouble finding numbers on real UBO sizes in WebGL. Assuming the minimum is similar to OpenGL's 16 KB, we have an upper bound of something like 250 objects per batch with float32 storage, which may be a plus for data textures here.
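For reference, the arithmetic behind that estimate (assuming a 16 KB uniform block holding nothing but one float32 mat4 per object):

const bytesPerMatrix = 16 * 4; // 16 floats * 4 bytes = 64 bytes per mat4
const maxObjectsPerUBO = Math.floor( 16384 / bytesPerMatrix ); // ~256, before any other uniforms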
Could also support this by respecting the draw range on the given geometry. 👍
Might be cheap enough to rewrite or rearrange the index buffer for visibility changes, no need to change the vertex data. |
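One possible sketch of that index-buffer trick; member.start, member.count, and member.indices are hypothetical bookkeeping for one geometry's slice and its original index values:

function setVisibleAt( batchedMesh, member, visible ) {

	const index = batchedMesh.geometry.index;

	for ( let i = 0; i < member.count; i ++ ) {

		// pointing all indices at vertex 0 produces zero-area (invisible) triangles
		index.setX( member.start + i, visible ? member.indices[ i ] : 0 );

	}

	index.needsUpdate = true;

}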
I think a good short-term / simplest implementation of the class as we've described it would just be done on top of SkinnedMesh. The reason to use a purpose-built implementation instead of just using a skinned mesh would be shader complexity and memory improvements. For skinning, 4 bone weights and 4 bone indices are specified per vertex and the shader has to do 16 texture lookups to get all the matrices (4 pixels per matrix). A more optimal implementation for a BatchedMesh class would only require 1 index per vertex in an attribute, no weight attribute, and only 4 texture samples in the shader.
In my opinion if the project has chosen DataTextures as the solution for skinning it makes sense to do the same here.
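A rough sketch of the corresponding data-texture lookup (GLSL ES 3.00 / WebGL2, mirroring how skinning reads bone matrices; the texture and function names are illustrative):

const batchingChunk = /* glsl */ `
	uniform highp sampler2D batchingTexture;

	// one mat4 is stored as 4 consecutive RGBA float texels
	mat4 getBatchingMatrix( const in float i ) {

		int size = textureSize( batchingTexture, 0 ).x;
		int j = int( i ) * 4;
		int x = j % size;
		int y = j / size;
		vec4 v1 = texelFetch( batchingTexture, ivec2( x, y ), 0 );
		vec4 v2 = texelFetch( batchingTexture, ivec2( x + 1, y ), 0 );
		vec4 v3 = texelFetch( batchingTexture, ivec2( x + 2, y ), 0 );
		vec4 v4 = texelFetch( batchingTexture, ivec2( x + 3, y ), 0 );
		return mat4( v1, v2, v3, v4 );

	}
`;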
That's true, and maybe that should be the default, but I'm imagining the use case I had above where multiple meshes had geometry groups and some groups between meshes shared a material. If you needed to adjust the draw range to add only a certain part of the geometry, that would impose a pattern like this:
const group = geometry.groups[ index ];
const initialDrawRange = { ...geometry.drawRange };
geometry.setDrawRange( group.start, group.count );
batchedMesh.addGeometry( geometry );
geometry.setDrawRange( initialDrawRange.start, initialDrawRange.count );
Yeah, I'd think that would be fast enough! I'm not exactly clear on how we'd easily fall back to that if the multi draw arrays extension is available in the long term, but it's probably not worth worrying about now. |
I would like to vote for DataTexture. In addition to what you folks mentioned above, Three.js doesn't support UBOs yet, and UBOs are WebGL2-only. We may be able to revisit the choice later if we encounter texture fetch or upload performance issues. I started to make a prototype, mainly the renderer and shaders, with this proposal + WEBGL_multi_draw. |
Made a prototype. Branch: … Online demo: … Video: three.js webgl - multi_draw (Chrome, 2021-09-05). 40 fps (BatchedMesh) vs 30 fps (regular Mesh) on my Windows 10 Chrome. |
would it be much trouble to add multi material support like @takahirox indicated? i think multiple geometries sharing the same material is quite unusual. in pretty much 100% of all cases that i've worked on (assemblies for instance), geometries have distinct materials, and that would then allow us to deal with instanced GLTF. |
BatchedMesh will only have performance benefits if all "sub meshes" share a single shader / material, just like InstancedMesh. If you want to batch multiple meshes that share a shader but have different material properties you'll have to do some preprocessing, e.g. use vertex colors in place of the diffuse material color and/or use a texture atlas, which would also require adjusting geometry UVs. With more shader changes you could index into an array uniform for individual sub mesh surface properties (roughness, color, textures, etc). The multi draw extension provides a gl_DrawID in the shader that could be used for that kind of per-sub-mesh indexing, but it doesn't remove the shared-shader requirement. |
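A minimal sketch of that vertex-color preprocessing, assuming the meshes' materials differ only in diffuse color; the helper name is made up:

import * as THREE from 'three';

// Bake each mesh's material color into a per-vertex color attribute so all
// members can share one material created with vertexColors: true.
function bakeMaterialColor( mesh ) {

	const geometry = mesh.geometry;
	const count = geometry.attributes.position.count;
	const colors = new Float32Array( count * 3 );
	const c = mesh.material.color;

	for ( let i = 0; i < count; i ++ ) {

		colors[ i * 3 + 0 ] = c.r;
		colors[ i * 3 + 1 ] = c.g;
		colors[ i * 3 + 2 ] = c.b;

	}

	geometry.setAttribute( 'color', new THREE.BufferAttribute( colors, 3 ) );

}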
gl_DrawID sounded like the golden goose for a moment. thanks for clearing that up! with how powerful materials in threejs have gotten we can still probably get a lot out of it with vertex props. |
This looks awesome!! |
Hi all, I am relatively new to WebGL/Three.js but was working with desktop OpenGL before. I was a little surprised that Three.js does not support batching already. This looks great and my current project would benefit from this for sure. I am looking forward to this feature a lot. After reading through the thread I also have some ideas I wanted to share:
As far as I understand it would be possible to render different materials in one batch, as long as they use the same shader. Using vertex attributes would be one way to do it, but it would also be possible to write the material settings into a UBO or a data texture and access them with a material id. The material id would be stored in a vertex attribute, or together with the transformation matrices for the model. If I understand correctly, in the current proposal the user would be required to manually set up the BatchedMesh and then add the corresponding pseudo-objects to the scene graph to track transformations. |
After almost 30 PRs we have a fully featured BatchedMesh.
So I think this can be closed. Looking forward to seeing what kinds of performance improvements this brings to projects! Thanks to @takahirox and @donmccurdy for getting it started! |
Wow, it looks wonderful! 😍 Do you have a link to the documentation? I struggle to find it. |
This is not released yet - BatchedMesh and the docs will be available in r159. |
Interesting to see this happen! Would love to see some documentation. 10 years ago almost to the day I did suggest a batching system for ThreeJS here: #4221 (comment). But I am not completely sure how it compares to what was delivered in these PRs. |
If you want a sneak peek at the docs you can see them via githack (there are some other updates in #27231, though): https://raw.githack.com/mrdoob/three.js/dev/docs/index.html?q=Batched#api/en/objects/BatchedMesh
Heh it's hard for me to tell, I think - three.js has changed a lot in those 10 years it looks like 😅 |
Hi, thanks for such a feature, it's so incredible... I tried the r158 version and ran into some issues, like no raycasting and no reaction to shadows or ambient occlusion in the samples on the doc page. I think that's normal, no? Because I have seen improvements to batched meshes landing for r159. I also noticed that the scene's bounding box is not calculated like it is with InstancedMesh or normal meshes... Perhaps @gkjohnson can illuminate me a little. |
Please try the implementation in the next release and ask at the forum if you have further questions about InstancedMesh vs BatchedMesh. |
Maybe nice to know about for later: |
thank you for this, this is fantastic work! |
Currently the APIs available for reducing draw calls include:
InstancedMesh
BufferGeometryUtils.mergeBufferGeometries([a, b, c, ...])
I think we might be able to solve some common difficulties with reducing draw calls, even where the objects to be rendered are not instances of the same thing. I've floated the idea in a couple of earlier comments (#19164 (comment), #18918 (comment)) but wanted to open a separate issue for it.
Calling this BatchedMesh for now, it's meant to be a way of (1) merging, (2) rendering, and (3) updating a group of objects that would otherwise be drawn as separate Mesh instances. The objects must all be the same primitive type (i.e. triangles), must share a material, and must have compatible vertex attributes (e.g. if one has normals, they all have normals). With those requirements we could offer an API roughly like the sketch below.
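A rough sketch of that kind of API, using the names discussed in the comments above; the signatures are illustrative, and templateGeometry, material, the size limits, and the matrices are assumed to be defined elsewhere:

const batchedMesh = new BatchedMesh( templateGeometry, material, maxVertexCount, maxIndexCount );

// each added geometry gets an id that can be used to update or remove it later
const id = batchedMesh.addGeometry( geometry, matrix );
batchedMesh.setMatrixAt( id, newMatrix );
batchedMesh.removeGeometry( id );

scene.add( batchedMesh );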
This offers a few advantages over mergeBufferGeometries(). First, you get an ID that lets you remove or modify each geometry later. Second, we can potentially do much faster raycasting (we know the bounding box and transform of each member geometry) and return that ID as a result. Third, we can do frustum culling (maybe?) by updating the index buffer.