
feat(gltf): Add GLBArrowLoader #3160

Draft · donmccurdy wants to merge 3 commits into base: master
Conversation

donmccurdy (Collaborator) commented Nov 8, 2024

For various reasons this might not be the implementation we ultimately prefer for GLBArrowLoader, but it's interesting as a draft for discussion. I've implemented GLBArrowLoader (and GLTFArrowLoader would be a trivial addition) parsing with glTF Transform and encoding the output directly to a list of [ArrowTable, Matrix4] tuples. If the input is compatible (non-interleaved, non-instanced, non-indexed, non-quantized...) then no pre-processing is required for vertex buffers, and buffers are handed to Arrow as-is. If these constraints are not satisfied, the conversion is handled automatically:

```js
await document.transform(unweld(), uninstance(), dequantize());
```

Ideally, any pre-processing would be done offline; there's little reason to do it at runtime in a production application.
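As a rough illustration of what the unweld() step does conceptually (this is not gltf-transform's actual implementation): indexed vertex attributes are expanded so each index gets its own vertex, leaving a flat buffer that can be handed to Arrow as-is.

```typescript
// Hypothetical sketch of unwelding: expand an indexed vertex attribute so
// every index gets its own vertex. Illustration only, not gltf-transform's
// unweld() implementation.
function unweldAttribute(
  attribute: Float32Array,
  indices: Uint32Array,
  itemSize: number
): Float32Array {
  const out = new Float32Array(indices.length * itemSize);
  for (let i = 0; i < indices.length; i++) {
    for (let j = 0; j < itemSize; j++) {
      out[i * itemSize + j] = attribute[indices[i] * itemSize + j];
    }
  }
  return out;
}

// Three 2D positions referenced by four indices become four vertices.
const positions = new Float32Array([0, 0, 1, 0, 0, 1]);
const indices = new Uint32Array([0, 1, 2, 0]);
const unwelded = unweldAttribute(positions, indices, 2);
```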

For at least some of these cases (indexed meshes, in particular) it may be preferable to find some way to represent the source data directly with Arrow – perhaps one table for the vertex attributes and another for the indices. But this gets into larger questions about what we'd prefer an Arrow-centric representation of a Mesh to be.

If we prefer to use the existing parsing code instead of parsing with glTF Transform I have no objection; this just seemed like a simple (<150 LOC) proof-of-concept that understands any ratified glTF extensions.

Preview: dragon (screenshot)

Comment on lines +38 to +40:

```js
// Unclear how to represent indexed, instanced, or normalized meshes as
// ArrowTable. Convert to simpler representations for now.
await document.transform(unweld(), uninstance(), dequantize());
```
donmccurdy (Collaborator, Author):
Pre-processing to ensure the vertex buffers can be represented as Arrow tables. Possibly quantized meshes (int8 or int16 vertex attributes) could be represented, but we'd need to put the 'normalized: boolean' option somewhere.

Similarly there could be ways to represent instanced draws with a second Arrow table for the instance transforms, if we want to go that direction.
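For reference, glTF's convention for normalized signed 16-bit attributes maps a component c to max(c / 32767, -1). A minimal sketch of that dequantization (an illustration only, not gltf-transform's dequantize()):

```typescript
// Dequantize a normalized Int16 vertex attribute to Float32, following
// glTF's convention for normalized signed shorts: f = max(c / 32767, -1).
// Illustration only; gltf-transform's dequantize() handles this generally.
function dequantizeInt16Normalized(src: Int16Array): Float32Array {
  const out = new Float32Array(src.length);
  for (let i = 0; i < src.length; i++) {
    out[i] = Math.max(src[i] / 32767, -1);
  }
  return out;
}

const quantized = new Int16Array([32767, 0, -32768]);
const floats = dequantizeInt16Normalized(quantized);
```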

ibgreen (Collaborator) replied:
My thought about indexed tables was to add an index column as a List<Uint32> and put all the indexes in the first row. We'd need to store an offset to the end of the index list in every other row (in case we had such rows).

```js
const indexes = new Uint32Array([0, 1, 2, 3, 4, 5]);
const nextIndex = indexes.length;
const indexOffsets = new Uint32Array(5).fill(nextIndex);
```
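For what it's worth, Arrow's variable-size list layout stores numRows + 1 offsets, with row i spanning [offsets[i], offsets[i + 1]). A runnable sketch of the all-indexes-in-row-0 idea using only plain typed arrays (the row count is made up for illustration):

```typescript
// Sketch of an Arrow-style List<Uint32> index column: all indexes live in
// row 0, and every later row is empty. Arrow's list layout uses
// numRows + 1 offsets; row i covers [offsets[i], offsets[i + 1]).
const indexes = new Uint32Array([0, 1, 2, 3, 4, 5]);
const numRows = 6; // hypothetical row count for the table
const offsets = new Uint32Array(numRows + 1);
offsets.fill(indexes.length, 1); // offsets = [0, 6, 6, 6, 6, 6, 6]

const row0 = indexes.subarray(offsets[0], offsets[1]); // all 6 indexes
const row1 = indexes.subarray(offsets[1], offsets[2]); // empty
```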

ibgreen (Collaborator) replied:
Every column (field) has a metadata: Map<string, string>, so we can add metadata on every column, and we could even add a mesharrow: JSON.stringify(...) key there.
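A sketch of how such a per-field metadata key might look; the mesharrow key name comes from the comment above, but the payload shape here is entirely made up for illustration:

```typescript
// Hypothetical per-field metadata carrying mesh details that the plain
// Arrow schema cannot express (the payload shape is illustrative only).
const fieldMetadata = new Map<string, string>([
  [
    'mesharrow',
    JSON.stringify({normalized: true, componentType: 'int16', attribute: 'POSITION'}),
  ],
]);

// A consumer would parse the payload back out of the field's metadata.
const parsed = JSON.parse(fieldMetadata.get('mesharrow')!);
```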

```ts
io?: PlatformIO;
};

export type ArrowTableTransformList = [ArrowTable, Matrix4][];
```
donmccurdy (Collaborator, Author):
TBD what the output should ideally be here?

Comment on lines +12 to +14:

```ts
export type GLBArrowLoaderOptions = LoaderOptions & {
  io?: PlatformIO;
};
```
donmccurdy (Collaborator, Author):
If the application is running in a non-browser environment (Node, Deno) or requires other dependencies (Draco, Meshopt) then a custom I/O class should be provided. Example:

```js
import { NodeIO } from '@gltf-transform/core';
import { KHRONOS_EXTENSIONS } from '@gltf-transform/extensions';
import draco3d from 'draco3dgltf';

const io = new NodeIO()
  .registerExtensions(KHRONOS_EXTENSIONS)
  .registerDependencies({
    'draco3d.decoder': await draco3d.createDecoderModule()
  });
```

ibgreen (Collaborator) replied:
loaders.gl has a number of abstractions like ReadableFile that have implementations working under Node, browsers, HTTPS, etc. This seems to duplicate some of the IO class responsibilities.

I wonder if we could create a LoadersIO custom IO class for gltf-transform on top of the loaders.gl abstractions that glues the two libraries together, so that gltf-transform would work with the abstractions we typically use in loaders.gl.

That way this gltf-transform based loader wouldn't become a one-off loader that doesn't fully work like other loaders do.

Comment on lines +90 to +94:

```js
const arrowSchema = new arrow.Schema(arrowFields);
const arrowStruct = new arrow.Struct(arrowFields);
const arrowData = new arrow.Data(arrowStruct, 0, vertexCount, 0, undefined, arrowAttributes);
const arrowRecordBatch = new arrow.RecordBatch(arrowSchema, arrowData);
const arrowTable = new arrow.Table([arrowRecordBatch]);
```
donmccurdy (Collaborator, Author) commented Nov 8, 2024:

If we happen to know that a scene contains ~100 mesh primitives that are similar¹, possibly they could be returned as a single Table, with a separate table of transforms?

Footnotes

  1. Where 'similar' might mean ... sharing a material? ... same vertex attribute types? ... same node transform?
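One way to make "similar" concrete is a grouping key over primitive properties; everything below, including the Primitive shape and the choice of key, is hypothetical and only illustrates batching primitives into shared tables:

```typescript
// Hypothetical grouping of mesh primitives that could share one Arrow
// table: same material and same set of vertex attribute types.
type Primitive = {material: string; attributes: string[]};

function groupKey(p: Primitive): string {
  // Sort attribute names so the key ignores attribute order.
  return p.material + '|' + [...p.attributes].sort().join(',');
}

function groupPrimitives(primitives: Primitive[]): Map<string, Primitive[]> {
  const groups = new Map<string, Primitive[]>();
  for (const p of primitives) {
    const key = groupKey(p);
    const group = groups.get(key) ?? [];
    group.push(p);
    groups.set(key, group);
  }
  return groups;
}

const groups = groupPrimitives([
  {material: 'steel', attributes: ['POSITION', 'NORMAL']},
  {material: 'steel', attributes: ['NORMAL', 'POSITION']},
  {material: 'glass', attributes: ['POSITION']},
]);
```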

@donmccurdy donmccurdy requested a review from ibgreen November 8, 2024 19:12
ibgreen (Collaborator) left a comment:

Very nice. I will take a deeper look but some quick comments.

First, FWIW: in the current model, the existing non-Arrow GLB loader just cracks the GLB container structure but doesn't know anything about glTF. In that model, this might more properly be called GLTFArrowLoader even though it loads a .glb file.

The GLTFLoader understands the glTF structure inside a GLB and uses the GLB loader to get to that data (JSON chunk and binary chunk).

We used the GLB loader for some other non-glTF custom binary data use cases in the past, but that is a very minor consideration at this time.

We can discuss moving to your glTF library. The API seems a little bit fancy for my tastes, and appears to duplicate some of the loaders.gl infrastructure for loading from different sources, but I am sure I can get behind it if you are keen.


donmccurdy (Collaborator, Author):
Thanks @ibgreen! I don't have strong feelings about whether to use the library at the moment. I guess the fundamental question is what should be returned, which we can perhaps discuss via #3161, other questions might be downstream of that.
