
feat(gltf): Add GLBArrowLoader #3160

Draft · donmccurdy wants to merge 3 commits into base: master
Conversation

donmccurdy (Collaborator) commented Nov 8, 2024

For various reasons this might not be the implementation we ultimately prefer for GLBArrowLoader, but it's interesting as a draft for discussion. I've implemented GLBArrowLoader (and GLTFArrowLoader would be a trivial addition) parsing with glTF Transform and encoding the output directly to a list of [ArrowTable, Matrix4] tuples. If the input is compatible (non-interleaved, non-instanced, non-indexed, non-quantized...) then no pre-processing is required for vertex buffers, and buffers are handed to Arrow as-is. If these constraints are not satisfied, the conversion is handled automatically:

```js
await document.transform(unweld(), uninstance(), dequantize());
```

Ideally, any pre-processing would be done offline; there's little reason to do it at runtime in a production application.
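As a rough illustration of what the unweld() step does conceptually (this is not gltf-transform's actual implementation): indexed vertex attributes are expanded so each index gets its own vertex, leaving a flat buffer that can be handed to Arrow as-is.

```typescript
// Hypothetical sketch of unwelding: expand an indexed vertex attribute so
// every index gets its own vertex. Illustration only, not gltf-transform's
// unweld() implementation.
function unweldAttribute(
  attribute: Float32Array,
  indices: Uint32Array,
  itemSize: number
): Float32Array {
  const out = new Float32Array(indices.length * itemSize);
  for (let i = 0; i < indices.length; i++) {
    for (let j = 0; j < itemSize; j++) {
      out[i * itemSize + j] = attribute[indices[i] * itemSize + j];
    }
  }
  return out;
}

// Three 2D positions referenced by four indices become four vertices.
const positions = new Float32Array([0, 0, 1, 0, 0, 1]);
const indices = new Uint32Array([0, 1, 2, 0]);
const unwelded = unweldAttribute(positions, indices, 2);
```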

For at least some of these cases (indexed meshes, in particular) it may be preferable to find some way to represent the source data directly with Arrow – perhaps one table for the vertex attributes and another for the indices. But this gets into larger questions about what we'd prefer an Arrow-centric representation of a Mesh to be.

If we prefer to use the existing parsing code instead of parsing with glTF Transform I have no objection; this just seemed like a simple (<150 LOC) proof-of-concept that understands any ratified glTF extensions.

Preview: dragon (screenshot)

Comment on lines +38 to +40:

```js
// Unclear how to represent indexed, instanced, or normalized meshes as
// ArrowTable. Convert to simpler representations for now.
await document.transform(unweld(), uninstance(), dequantize());
```
donmccurdy (Collaborator, Author):
Pre-processing to ensure the vertex buffers can be represented as Arrow tables. Possibly quantized meshes (int8 or int16 vertex attributes) could be represented, but we'd need to put the 'normalized: boolean' option somewhere.

Similarly there could be ways to represent instanced draws with a second Arrow table for the instance transforms, if we want to go that direction.
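For reference, glTF's convention for normalized signed 16-bit attributes maps a component c to max(c / 32767, -1). A minimal sketch of that dequantization (an illustration only, not gltf-transform's dequantize()):

```typescript
// Dequantize a normalized Int16 vertex attribute to Float32, following
// glTF's convention for normalized signed shorts: f = max(c / 32767, -1).
// Illustration only; gltf-transform's dequantize() handles this generally.
function dequantizeInt16Normalized(src: Int16Array): Float32Array {
  const out = new Float32Array(src.length);
  for (let i = 0; i < src.length; i++) {
    out[i] = Math.max(src[i] / 32767, -1);
  }
  return out;
}

const quantized = new Int16Array([32767, 0, -32768]);
const floats = dequantizeInt16Normalized(quantized);
```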

ibgreen (Collaborator) replied:
My thought about indexed tables was to add an index column as a List<Uint32> and put all the indexes in the first row. We'd need to store an offset to the end of the index list in every other row (in case we had such rows).

```js
const indexes = new Uint32Array([0, 1, 2, 3, 4, 5]);
const nextIndex = indexes.length;
const indexOffsets = new Uint32Array(5).fill(nextIndex);
```
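For what it's worth, Arrow's variable-size list layout stores numRows + 1 offsets, with row i spanning [offsets[i], offsets[i + 1]). A runnable sketch of the all-indexes-in-row-0 idea using only plain typed arrays (the row count is made up for illustration):

```typescript
// Sketch of an Arrow-style List<Uint32> index column: all indexes live in
// row 0, and every later row is empty. Arrow's list layout uses
// numRows + 1 offsets; row i covers [offsets[i], offsets[i + 1]).
const indexes = new Uint32Array([0, 1, 2, 3, 4, 5]);
const numRows = 6; // hypothetical row count for the table
const offsets = new Uint32Array(numRows + 1);
offsets.fill(indexes.length, 1); // offsets = [0, 6, 6, 6, 6, 6, 6]

const row0 = indexes.subarray(offsets[0], offsets[1]); // all 6 indexes
const row1 = indexes.subarray(offsets[1], offsets[2]); // empty
```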

ibgreen (Collaborator) replied:
Every column (field) has a metadata: Map<string, string>, so we can add metadata on every column, and we could even add a mesharrow: JSON.stringify(...) key there.
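A sketch of how such a per-field metadata key might look; the mesharrow key name comes from the comment above, but the payload shape here is entirely made up for illustration:

```typescript
// Hypothetical per-field metadata carrying mesh details that the plain
// Arrow schema cannot express (the payload shape is illustrative only).
const fieldMetadata = new Map<string, string>([
  [
    'mesharrow',
    JSON.stringify({normalized: true, componentType: 'int16', attribute: 'POSITION'}),
  ],
]);

// A consumer would parse the payload back out of the field's metadata.
const parsed = JSON.parse(fieldMetadata.get('mesharrow')!);
```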

```ts
io?: PlatformIO;
};

export type ArrowTableTransformList = [ArrowTable, Matrix4][];
```
donmccurdy (Collaborator, Author):
TBD what the output should ideally be here?

Comment on lines +12 to +14:

```ts
export type GLBArrowLoaderOptions = LoaderOptions & {
  io?: PlatformIO;
};
```
donmccurdy (Collaborator, Author):
If the application is running in a non-browser environment (Node, Deno) or requires other dependencies (Draco, Meshopt) then a custom I/O class should be provided. Example:

```js
import { NodeIO } from '@gltf-transform/core';
import { KHRONOS_EXTENSIONS } from '@gltf-transform/extensions';
import draco3d from 'draco3dgltf';

const io = new NodeIO()
  .registerExtensions(KHRONOS_EXTENSIONS)
  .registerDependencies({
    'draco3d.decoder': await draco3d.createDecoderModule()
  });
```

ibgreen (Collaborator) replied:
loaders.gl has a number of abstractions like ReadableFile that have implementations working under Node, browsers, HTTPS, etc. This seems to duplicate some of the IO class responsibilities.

I wonder if we could create a LoadersIO custom IO class for gltf-transform on top of the loaders.gl abstractions that glues the two libraries together, so that gltf-transform would work with the abstractions we typically use in loaders.gl.

That way this gltf-transform based loader wouldn't become a one-off loader that doesn't fully work like other loaders do.

Comment on lines +90 to +94:

```js
const arrowSchema = new arrow.Schema(arrowFields);
const arrowStruct = new arrow.Struct(arrowFields);
const arrowData = new arrow.Data(arrowStruct, 0, vertexCount, 0, undefined, arrowAttributes);
const arrowRecordBatch = new arrow.RecordBatch(arrowSchema, arrowData);
const arrowTable = new arrow.Table([arrowRecordBatch]);
```
donmccurdy (Collaborator, Author) commented Nov 8, 2024:

If we happen to know that a scene contains ~100 mesh primitives that are similar¹, possibly they could be returned as a single Table, with a separate table of transforms?

Footnotes

  1. Where 'similar' might mean ... sharing a material? ... same vertex attribute types? ... same node transform?
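One way to make "similar" concrete is a grouping key over primitive properties; everything below, including the Primitive shape and the choice of key, is hypothetical and only illustrates batching primitives into shared tables:

```typescript
// Hypothetical grouping of mesh primitives that could share one Arrow
// table: same material and same set of vertex attribute types.
type Primitive = {material: string; attributes: string[]};

function groupKey(p: Primitive): string {
  // Sort attribute names so the key ignores attribute order.
  return p.material + '|' + [...p.attributes].sort().join(',');
}

function groupPrimitives(primitives: Primitive[]): Map<string, Primitive[]> {
  const groups = new Map<string, Primitive[]>();
  for (const p of primitives) {
    const key = groupKey(p);
    const group = groups.get(key) ?? [];
    group.push(p);
    groups.set(key, group);
  }
  return groups;
}

const groups = groupPrimitives([
  {material: 'steel', attributes: ['POSITION', 'NORMAL']},
  {material: 'steel', attributes: ['NORMAL', 'POSITION']},
  {material: 'glass', attributes: ['POSITION']},
]);
```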

@donmccurdy donmccurdy requested a review from ibgreen November 8, 2024 19:12
ibgreen (Collaborator) left a comment:

Very nice. I will take a deeper look but some quick comments.

First, FWIW: in the current model, the existing non-Arrow GLB loader just cracks the GLB container structure but doesn't know anything about glTF. In that model, this might more properly be called GLTFArrowLoader even though it loads a .glb file.

The GLTFLoader understands the glTF structure inside a GLB and uses the GLB loader to get to that data (JSON chunk and binary chunk).

We used the GLB loader for some other non-glTF custom binary data use cases in the past, but that is a very minor consideration at this time.

We can discuss moving to your glTF library. The API seems a little bit fancy for my tastes, and appears to duplicate some of the loaders.gl infrastructure for loading from different sources, but I am sure I can get behind it if you are keen.


donmccurdy (Collaborator, Author):
Thanks @ibgreen! I don't have strong feelings about whether to use the library at the moment. I guess the fundamental question is what should be returned, which we can perhaps discuss via #3161, other questions might be downstream of that.
