Training guide #51
Replies: 2 comments
-
Hey there, @MarcusLoppe, hope you're doing great! Are you looking for interesting projects to work on at the moment? I'm building a 3D design tool that is set to disrupt any industry where CAD software is used. We're starting with the architecture industry and working our way up from there. Imagine if you could generate 3D exteriors, façades, interior scenes, and eventually whole cars, planes and buildings using natural language and/or reference images? The GPT of 3D creation, if you will. If that's something you're interested in, shoot me an e-mail at [email protected] and we can have a chat.
-
Your demo notebook, MeshGPT_demo.ipynb, has been particularly helpful to me. However, there are some questions that I need your assistance with. Could you please tell me where I can download the pre-trained models, mesh-transformer and mesh-encoder, that are loaded in the code?
-
Hi,
I thought I could create a post about how to train the models.
I made a demo notebook, MeshGPT_demo.ipynb, that loads and trains on the demo meshes. It uses some functions I've created, so you'll need to install my fork.
Training on 3D meshes is very resource intensive, so preprocess the data to minimize the amount of work the training loop needs to do.
You can see the order of the preprocessing in the demo notebook; if you want to implement this on your own, take a look at the MeshDataset class, which does the processing (a rough sketch of the entry layout is just below).
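To give a rough idea before going through the steps: each dataset entry can be thought of as a plain dict of tensors, where the extra fields get filled in by the preprocessing steps described below. This is a minimal sketch of that idea, not the actual MeshDataset from my fork.

```python
import torch
from torch.utils.data import Dataset

class SimpleMeshDataset(Dataset):
    """Minimal sketch: each entry is a dict of tensors.
    'vertices' and 'faces' come from loading + augmenting the mesh;
    'face_edges' and 'codes' are precomputed later and cached on the entry."""

    def __init__(self, entries):
        # entries: list of dicts with 'vertices' (float, [num_vertices, 3])
        # and 'faces' (long, [num_faces, 3])
        self.entries = entries

    def __len__(self):
        return len(self.entries)

    def __getitem__(self, idx):
        return self.entries[idx]
```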
Loading mesh
In the paper they preprocess the mesh before anything else; in the demo notebook this is done in the get_mesh() function.
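Roughly, that step looks something like the sketch below. This is my own paraphrase of what the paper describes (normalize into a unit range, sort vertices in z-y-x order, sort faces by their lowest vertex index), not the exact code from get_mesh(), and the [-1, 1] target range is an assumption, so check the notebook for the details.

```python
import numpy as np
import trimesh

def load_and_preprocess_mesh(path):
    """Sketch of the mesh loading step: normalize and sort, roughly as in the paper."""
    mesh = trimesh.load(path, force='mesh')
    vertices = np.asarray(mesh.vertices, dtype=np.float32)
    faces = np.asarray(mesh.faces, dtype=np.int64)

    # Center on the origin and scale into a unit range (assumed here to be [-1, 1])
    vertices -= (vertices.max(axis=0) + vertices.min(axis=0)) / 2.0
    vertices /= np.abs(vertices).max()

    # Sort vertices lexicographically (z, then y, then x) and remap the faces
    order = np.lexsort((vertices[:, 0], vertices[:, 1], vertices[:, 2]))
    vertices = vertices[order]
    remap = np.empty_like(order)
    remap[order] = np.arange(len(order))
    faces = remap[faces]

    # Sort faces by their lowest vertex index
    faces = faces[np.argsort(faces.min(axis=1))]

    return vertices, faces
```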
Augment mesh
In the paper they augment the mesh to make the model generalize better; they used a jitter offset from -0.1 to 0.1 and rescaled the mesh by a factor of 0.70-1.25.
I implemented this, but an issue occurs if you rescale by 1.25, since this can push the highest vertex above the 1.0 height limit, so I reduced the range to 0.7-1.0.
They never mention how many times they augmented each 3D mesh in the paper, but in the PolyGen paper they used 50 augmentations per model.
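A sketch of what that augmentation can look like in practice; the ranges are the ones discussed above, and the clamp at the end is just a safety net I'm adding here so the jitter can't push vertices out of range.

```python
import numpy as np

def augment_vertices(vertices, jitter=0.1, scale_low=0.7, scale_high=1.0):
    """Sketch of the augmentation step: random rescale plus a small random offset.
    Uses a 0.7-1.0 scale range instead of the paper's upper bound of 1.25,
    so the augmented mesh stays below the 1.0 limit."""
    scale = np.random.uniform(scale_low, scale_high)
    offset = np.random.uniform(-jitter, jitter, size=(1, 3))
    augmented = vertices * scale + offset
    # Safety net: clamp so the jitter can't push vertices out of range
    return np.clip(augmented, -1.0, 1.0)
```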
Details about why preprocessing is very important & what values the dataset needs to contain:
For training the autoencoder, the dataset needs the face_edges in addition to the vertices and faces.
For training the transformer, it additionally needs the codes, i.e. the token sequence produced by the trained autoencoder.
If this information isn't provided by the dataset, the models have to generate it while training on each batch.
This is very wasteful since the data is never saved: the dataloader works on a temporary copy that is never reused, so every training step regenerates the data and throws it away afterwards.
This not only slows down training, since generating the data takes time, but also drives up VRAM usage.
If this data is already stored in the dataset, it takes up almost no VRAM (see the sketch below for how to precompute it).
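Here is a sketch of precomputing the face_edges and codes once and caching them on each dataset entry. The import path of derive_face_edges_from_faces and the exact tokenize() signature may differ between versions of meshgpt-pytorch, so treat this as a sketch rather than copy-paste code; `entries` is assumed to be the list of dicts described above and `autoencoder` an already-trained MeshAutoencoder.

```python
import torch
from meshgpt_pytorch import MeshAutoencoder
from meshgpt_pytorch.data import derive_face_edges_from_faces

@torch.no_grad()
def precompute_derived_data(entries, autoencoder, device="cuda"):
    """Precompute face_edges and codes once and cache them on each entry,
    so the training loop doesn't regenerate them on every step."""
    autoencoder = autoencoder.to(device).eval()
    for entry in entries:
        vertices = entry["vertices"].to(device)
        faces = entry["faces"].to(device)

        # face_edges: needed by the autoencoder's graph encoder
        face_edges = derive_face_edges_from_faces(faces.unsqueeze(0))
        entry["face_edges"] = face_edges.squeeze(0).cpu()

        # codes: the token sequence the transformer is trained on,
        # produced by the (frozen) trained autoencoder
        codes = autoencoder.tokenize(
            vertices=vertices.unsqueeze(0),
            faces=faces.unsqueeze(0),
            face_edges=face_edges,
        )
        entry["codes"] = codes.squeeze(0).cpu()
```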
I did some testing where I generated the face_edges and codes with a batch size of 64, running 55 steps with random vertices and faces.
With a dataset of meshes under 700 faces, generating for 55 steps at batch size 64 used 11.8 GB of VRAM.
But when I increased the face count to 2,800, the VRAM usage went wild, reaching 35 GB.