GLB packing has issues, stemming from its one buffer limit #1318

Open
chipweinberger opened this issue Apr 13, 2018 · 17 comments
Labels
breaking change Changes under consideration for a future glTF spec version, which would require breaking changes. spec:glb
@chipweinberger

chipweinberger commented Apr 13, 2018

The GLB single-file format should be augmented imo. Only supporting a single buffer brings a couple big issues:

  1. It is not reversible. You lose the filenames and file delineations when smushing everything into a single buffer. This lost data is a huge barrier to using the GLB format at all.
  2. It comes with a memory cost at load time. If you want to load a single scene from a GLB file, you need to load the entire GLB binary buffer into memory. Edit: It's actually possible to implement partial buffer loading in an intelligent way, but it's not as straightforward as it could be.

I don't see a reason why GLB packing is limited to a single buffer. IMO, every buffer with uri == "" should appear in the GLB as a separate chunk, in the order that they are found in the JSON.

^ Unfortunately that's a breaking change. To keep breaking changes small, we could also do something like this - an extension to the buffer definition:


// Proposed extension: buffer.extensions.glbUnpacking.unpackedBuffers,
// an array of GlbUnpackedBuffer objects.
#include <cstdint>
#include <string>
#include <vector>

struct GlbUnpackedBuffer {
    uint64_t byteOffset;  // where this unpacked buffer starts within the GLB binary chunk
    uint64_t byteLength;  // the length of the unpacked buffer
    std::string uri;      // the URI of the unpacked buffer
    std::vector<uint32_t> associatedBufferViews;  // the buffer views that will be updated to point to this new unpacked buffer
};
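
As a rough illustration of how an unpacker might consume such entries, here is a minimal sketch. It assumes the extension exists exactly as proposed above (it is not part of any published glTF spec), and `BufferView`, `UnpackedBuffer`, and `unpackBuffers` are hypothetical names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Mirrors the proposed (not standardized) extension entry.
struct GlbUnpackedBuffer {
    std::uint64_t byteOffset;
    std::uint64_t byteLength;
    std::string uri;
    std::vector<std::uint32_t> associatedBufferViews;
};

// Hypothetical minimal stand-in for a glTF bufferView.
struct BufferView {
    std::uint32_t buffer;
    std::uint64_t byteOffset;
    std::uint64_t byteLength;
};

// Hypothetical result type: one restored buffer with its original URI.
struct UnpackedBuffer {
    std::string uri;
    std::vector<std::uint8_t> bytes;
};

// Splits the single GLB binary chunk back into separate buffers and rebases
// the listed buffer views so they point into the restored buffers.
std::vector<UnpackedBuffer> unpackBuffers(const std::vector<std::uint8_t>& binChunk,
                                          const std::vector<GlbUnpackedBuffer>& entries,
                                          std::vector<BufferView>& views) {
    std::vector<UnpackedBuffer> out;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        const GlbUnpackedBuffer& e = entries[i];
        if (e.byteOffset > binChunk.size() || e.byteLength > binChunk.size() - e.byteOffset) {
            throw std::out_of_range("entry exceeds the binary chunk");
        }
        auto first = binChunk.begin() + static_cast<std::ptrdiff_t>(e.byteOffset);
        auto last = first + static_cast<std::ptrdiff_t>(e.byteLength);
        UnpackedBuffer restored{e.uri, std::vector<std::uint8_t>(first, last)};
        out.push_back(std::move(restored));

        for (std::uint32_t v : e.associatedBufferViews) {
            BufferView& view = views.at(v);
            if (view.byteOffset < e.byteOffset) {
                throw std::out_of_range("buffer view starts before its unpacked buffer");
            }
            view.buffer = static_cast<std::uint32_t>(i);  // index of the restored buffer
            view.byteOffset -= e.byteOffset;              // offset is now relative to it
        }
    }
    return out;
}
```

A packer would write the inverse mapping, so a glTF -> GLB -> glTF round trip could restore the original file names and buffer boundaries.
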
@chipweinberger
Author

@donmccurdy @zellski

Curious to hear your thoughts on these issues.

@javagl
Contributor

javagl commented Apr 13, 2018

I'll (have to) leave it to others to address the critique and suggestions in detail. But there has been some research for a streaming format that may, to some extent, be what you're aiming at with your second point: https://x3dom.org/src/ The format is basically equivalent to Binary glTF (1.0), but adds the concept of "chunks", so that it is not necessary to load the whole buffer before the first elements can be displayed. It does not address the issue as a whole, though (and glTF is not supposed to be a streaming format, after all). But some further, related discussion is here: #1177

@gkv311
Contributor

gkv311 commented Apr 14, 2018

  1. it comes with a huge memory cost. If you want to load a single scene from a GLB file, now you need to load the entire buffer into memory.

Is it something specific to JavaScript?
I don't see why the buffer needs to be loaded into memory as a whole - normal streaming should work fine based on file offsets within accessor/bufferView...

@donmccurdy
Contributor

donmccurdy commented Apr 14, 2018

If you want to load a single scene from a GLB file, you need to load the entire GLB binary buffer into memory....

I assume then that you are putting many scenes into a single GLB file? While the format supports this, I think most people are using it for single models, not even an entire single scene. So it seems like you have a specific use case in mind, do you mind saying more to clarify that? It would be good to understand the scenario (streaming? LODs? encapsulating all assets for a small game?) before adjusting the format.

Also note that the best-practice guidance does not say that binary GLB is always the right choice. When sharing textures among multiple models, having several .gltf files referencing the same textures would be a very reasonable way to avoid duplicate texture data.

@chipweinberger
Author

chipweinberger commented Apr 15, 2018

@gkv311 technically you are right. You don't need to. But it adds difficulty in writing low-memory/partial gltf loading libraries. I have 2 functions: loadBuffer and loadBufferView. Neither is sufficient when considering GLB and strided vertex attributes, because both will load more memory than needed or require multiple calls to overlapping regions (strided buffer views). My current solution to this problem is to just divide the big GLB buffer into smaller ones on load (strided buffer views get mapped to a single buffer, and other buffer views each get their own buffer), and then just use loadBuffer exclusively. It's not a perfect solution, since to keep things simple you need to alter the gltf document during unpacking, and a loader should not necessitate that. I welcome suggestions.

@donmccurdy I'm writing a gltf loading library, so the use case is up to the developer. Partial loading is a design goal given to me by a developer =)

@scurest

scurest commented Apr 17, 2018

I don't understand how it adds difficulty. You have the exact same problem for a regular glTF file that stores everything in a single separate BIN file.

To load just the parts of the buffer you need, this is my first idea:

Collect a list of all the buffer views you need. For each one, find the smallest interval [start_off, end_off) of bytes in the buffer you need. Sort the list of all these intervals by their start_off. Make a pass through the list merging overlapping intervals. The result is a minimal list of intervals you need in memory. Load a chunk of memory from the buffer for each interval. Now when you need some interval [a,b) of the buffer for a buffer view, scan the list for an interval [c,d) that contains [a,b), take the chunk you loaded for interval [c,d), and return a reference to the range [a-c, b-c) inside that chunk.

(This is assuming you need the buffer views exposed as actual byte arrays eg. for uploading to the GL. For buffer views where you don't need this you can, of course, just use seek and read values on the fly.)
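
The interval bookkeeping described above could be sketched roughly as follows (C++17). This is only an illustration: `Interval`, `ChunkRef`, `mergeIntervals`, and `locate` are hypothetical names, the file reads themselves are left out, and the sketch also merges intervals that merely touch, which is a small assumption beyond "overlapping".

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Half-open byte range [start, end) within a buffer.
struct Interval {
    std::uint64_t start;
    std::uint64_t end;
};

// Sorts the intervals by start offset, then merges any that overlap or touch.
std::vector<Interval> mergeIntervals(std::vector<Interval> intervals) {
    std::sort(intervals.begin(), intervals.end(),
              [](const Interval& a, const Interval& b) { return a.start < b.start; });
    std::vector<Interval> merged;
    for (const Interval& iv : intervals) {
        if (!merged.empty() && iv.start <= merged.back().end) {
            merged.back().end = std::max(merged.back().end, iv.end);
        } else {
            merged.push_back(iv);
        }
    }
    return merged;
}

// Which merged interval holds a requested range, and where the range starts
// inside the chunk of memory loaded for that interval.
struct ChunkRef {
    std::size_t chunkIndex;
    std::uint64_t offsetInChunk;
};

// Scans the merged list for an interval [c, d) that contains [a, b).
std::optional<ChunkRef> locate(const std::vector<Interval>& merged,
                               std::uint64_t a, std::uint64_t b) {
    for (std::size_t i = 0; i < merged.size(); ++i) {
        if (merged[i].start <= a && b <= merged[i].end) {
            return ChunkRef{i, a - merged[i].start};
        }
    }
    return std::nullopt;
}
```

A loader would read one block of bytes per merged interval and then hand out `offsetInChunk` plus the view's length as the slice for each buffer view.
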

@vpenades
Contributor

vpenades commented Apr 19, 2018

I agree with @chipweinberger , GLB should support multiple buffers to align with gltf capabilities.

For example, converting a glTF with multiple buffers to a single GLB already presents a problem because it requires rebuilding the big buffer from scratch, and updating all the buffer views accordingly.
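
To make that rebuild step concrete, here is a minimal sketch of what a packer has to do. `BufferView` and `packBuffers` are hypothetical names for this example, and padding each buffer's start to a 4-byte boundary is an assumption made here to keep accessor alignment intact, not something stated in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal stand-in for a glTF bufferView.
struct BufferView {
    std::size_t buffer;        // index into the buffers array
    std::uint64_t byteOffset;  // offset within that buffer
    std::uint64_t byteLength;
};

// Concatenates all buffers into one, padding each start to a 4-byte boundary
// (an assumption of this sketch), and rewrites every view so that it points
// into the combined buffer, which becomes buffer 0.
std::vector<std::uint8_t> packBuffers(const std::vector<std::vector<std::uint8_t>>& buffers,
                                      std::vector<BufferView>& views) {
    std::vector<std::uint8_t> combined;
    std::vector<std::uint64_t> baseOffsets;
    for (const auto& buf : buffers) {
        while (combined.size() % 4 != 0) {
            combined.push_back(0);  // zero padding between buffers
        }
        baseOffsets.push_back(combined.size());
        combined.insert(combined.end(), buf.begin(), buf.end());
    }
    for (BufferView& view : views) {
        view.byteOffset += baseOffsets.at(view.buffer);  // shift into combined buffer
        view.buffer = 0;
    }
    return combined;
}
```

The original buffer boundaries and URIs are gone after this step, which is the loss of information the issue describes.
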

Also, there's no need for an extension or anything in the schema to support multiple buffers in GLB.... just say that you find 4 buffer declarations in the JSON with an empty URI; then, expect to find 4 buffer chunks, and assign in order as they're found.
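
A reader following that rule might look roughly like the sketch below. The container layout it uses (12-byte header, then chunks of length, type, and data, with magic `glTF` and chunk type `BIN\0`) comes from the GLB 2.0 spec; accepting more than one BIN chunk is the hypothetical part, and `ChunkSpan`, `listChunks`, and `assignBinChunks` are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr std::uint32_t kGlbMagic = 0x46546C67;  // ASCII "glTF"
constexpr std::uint32_t kChunkBin = 0x004E4942;  // ASCII "BIN\0"

// Location of one chunk's payload inside the GLB byte array.
struct ChunkSpan {
    std::uint32_t type;
    std::size_t offset;
    std::size_t length;
};

// Reads a little-endian uint32 at `pos`, throwing if it runs past the end.
std::uint32_t readU32(const std::vector<std::uint8_t>& d, std::size_t pos) {
    if (pos + 4 > d.size()) throw std::runtime_error("truncated GLB");
    return static_cast<std::uint32_t>(d[pos]) |
           (static_cast<std::uint32_t>(d[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(d[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(d[pos + 3]) << 24);
}

// Walks the 12-byte header and every chunk that follows it.
std::vector<ChunkSpan> listChunks(const std::vector<std::uint8_t>& glb) {
    if (readU32(glb, 0) != kGlbMagic) throw std::runtime_error("not a GLB file");
    std::vector<ChunkSpan> chunks;
    std::size_t pos = 12;  // skip magic, version, total length
    while (pos < glb.size()) {
        std::uint32_t length = readU32(glb, pos);
        std::uint32_t type = readU32(glb, pos + 4);
        if (pos + 8 + length > glb.size()) throw std::runtime_error("truncated chunk");
        chunks.push_back({type, pos + 8, length});
        pos += 8 + length;
    }
    return chunks;
}

// HYPOTHETICAL rule from the comment above: the i-th buffer without a URI maps
// to the i-th BIN chunk. bufferHasUri[i] mirrors buffers[i] in the JSON.
// Returns, per buffer, an index into `chunks`, or -1 for buffers with a URI.
std::vector<int> assignBinChunks(const std::vector<ChunkSpan>& chunks,
                                 const std::vector<bool>& bufferHasUri) {
    std::vector<int> assignment;
    std::size_t next = 0;
    for (bool hasUri : bufferHasUri) {
        int found = -1;
        if (!hasUri) {
            while (next < chunks.size() && chunks[next].type != kChunkBin) ++next;
            if (next == chunks.size()) throw std::runtime_error("missing BIN chunk");
            found = static_cast<int>(next++);
        }
        assignment.push_back(found);
    }
    return assignment;
}
```
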

@scurest

scurest commented Apr 19, 2018

FWIW I was also surprised that there was only one BIN chunk allowed when I first read the spec.

Also, there's no need for an extension or anything in the schema to support multiple buffers in GLB.... just say that you find 4 buffer declarations in the JSON with an empty URI; then, expect to find 4 buffer chunks, and assign in order as they're found.

I'd prefer to explicitly say what chunk the buffer is referencing (where if you don't say anything it defaults to the first one, as today). Currently the meaning of buffers[i] depends only on buffers[i] itself, but if you "assigned in order as they're found" its meaning could depend on the whole sequence buffers[0], buffers[1], ..., buffers[i].

@chipweinberger
Author

chipweinberger commented Apr 23, 2018

Unfortunately these are still breaking changes, so they would need to wait until glTF 3...

My extension would be nice to have in the meantime, as it solves much of the issue, particularly making packing fully reversible.

@scurest that is essentially what I am doing. It is possible to avoid the memory problems with GLB files, but it's harder than it should be IMO.

chipweinberger changed the title from "GLB packing has big issues, stemming from its one buffer limit" to "GLB packing has issues, stemming from its one buffer limit" on Jun 4, 2018
@naikrovek

I realize that this doesn't address the issue topic: can you Base64 encode additional buffers and include them in the JSON chunk?

@donmccurdy
Contributor

Note that the spec allows you to (a) reference external .bin files from additional buffers, or (b) include base64 buffers in the JSON chunk of a GLB. The latter is not efficient, but the former certainly could have uses. Of course that means you now have multiple files.

I think we could consider an extension or future spec change for this, if someone would like to put together a proposal. But this hasn't seemed to be an issue for most applications, and the vast majority of .gltf files only contain a single buffer. The only code I've seen that writes glTF files with multiple buffers was some hacky scripts of my own, actually. So more feedback would also be helpful to prioritize this.

@naikrovek

I am not sure multiple binary buffers are needed, though I do remember wishing for them at some point. I can see them as a slight convenience for serialization library authors. A reader wouldn't be slowed down at all by multiple binary buffers, I wouldn't think. It seems like this could be served by an extension easily, and I'm not going to write it. 😁

@chipweinberger
Author

chipweinberger commented Feb 7, 2019

I would love to see an extension for this. Fully reversible gltf->glb->gltf seems like a no-brainer. Until then, GLBs will remain lossy, which is a big shame.

I personally use multiple binary buffers all the time. In fact it's my default encoding to put each mesh into its own buffer. The huge upside of doing this is it allows you to edit gltfs by hand, easily remove things, and easily see what takes up all the space in your gltf files in a file browser.

Maybe @donmccurdy you wouldn't mind making the extension PR for this?

@donmccurdy
Contributor

I won't have the bandwidth to write and push through an extension PR on this right now. Some of that ("easily remove things", "easily see what takes up all the space") seems like an opportunity for tooling like an SDK that wouldn't depend on the internal layout of the data being arranged for human edits. It may also be difficult to convince client implementations to support an extension that primarily makes things easier for authoring tools. Nevertheless I think it's a reasonable idea, and I'm not opposed to an extension, or considering inclusion in a TBD (2.x? 3.x?) release.

@chipweinberger
Author

chipweinberger commented Feb 13, 2019

The nice thing about an extension is that it's backwards compatible. It would be optionally supported by GLB packers and unpackers, but no one else needs to know about it. Everything would still be stored in a single buffer in the GLB.

lexaknyazev added this to the glTF Next milestone on May 6, 2019
@prenex

prenex commented Oct 12, 2020

Hi!

I also find glb to be lacking for not supporting multiple buffers IF regular gltf does support it when buffers are separate files.

I am making an exporter plugin for an existing app and have already implemented gltf + bin export, but another app (whose importer I didn't write) needs glb and cannot load the former kind. Both applications are custom software components, but they need to communicate.

When I write out the gltf + bin files, I have multiple buffers and can very efficiently write the binary straight out of the in-memory buffers. We also use an interleaved vertex format, which further complicates converting these buffers into a single buffer and reworking buffer views and accessors, because interleaved data is itself handled through views and accessors.

It would have been ten times easier, and likely much simpler and less error-prone, if I could just save the already existing data into multiple buffers. It would likely also be much more memory efficient on our C++ side, but I have no idea about the other side.

I actually estimated this addition can be done in 1-2 hours - until I found that I either need to write some code that takes existing buffers and compacts indexed interleaved data from 6-7 buffers into a single one - or call some kind of external code or service which implements gltf -> glb packing.

I like gltf because it usually lets us directly work with data that is in GPU-usable formats, but having no support to package a gltf + bin directly into a glb file without reorganizing the buffer layout - and thus add processing steps to what otherwise is just an efficient copy / writeout - is crazy.

My first thought was like OMG I found a typo in the docs of them saying only a single buffer is supported. Wow is it not a typo???
:-(

Will of course work on this, but isn't gltf's original goal to let I/O just work by utilizing direct read or memcopy of data without processing binaries into some other format the application needs???

@prenex

prenex commented Oct 14, 2020

Okay... after finishing my refactors, I must say it is likely better this way. The real heart of the problem is why .gltf output is allowed to spread multiple buffers across multiple files at all. I can imagine it helping with data sharing and maybe load times, but I doubt people use it.

It is not that bad, but maybe it should be much better documented.

9 participants