Training reproducible with PyTorch but not with PyTorch + PyTorch3D #659
Comments
Which parts of PyTorch3D are you using?
@bottler Thank you for your reply. I am using the following code:

import torch
import torch.nn as nn
from pytorch3d.renderer import MeshRasterizer, RasterizationSettings

class MeshRendererWithDepth(nn.Module):
    """Rasterizes meshes and returns only the depth buffer."""

    def __init__(self, rasterizer):
        super().__init__()
        self.rasterizer = rasterizer

    def forward(self, meshes_world, **kwargs) -> torch.Tensor:
        fragments = self.rasterizer(meshes_world, **kwargs)
        return fragments.zbuf

# raster_image_size, cameras, mesh, R_camera and T_camera are defined elsewhere.
raster_settings = RasterizationSettings(
    image_size=raster_image_size,
    blur_radius=0,
    faces_per_pixel=2,
    perspective_correct=False,
    cull_backfaces=True,
    max_faces_per_bin=320,
)
renderer = MeshRendererWithDepth(
    rasterizer=MeshRasterizer(
        cameras=cameras,
        raster_settings=raster_settings,
    )
)
depth_maps = renderer(meshes_world=mesh, R=R_camera, T=T_camera)
Separate CUDA threads deal with separate faces. If there are two or more faces which have exactly the same distance to a certain pixel, then the order in which they appear in the output for that pixel is not determined. Further, if the nearest [...]
It should be possible to change PyTorch3D to remove this non-determinism, e.g. by making a lower-indexed equally-distant face count as "closer".
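The tie-breaking idea can be illustrated outside of PyTorch3D. The following is a minimal pure-Python sketch, not PyTorch3D's rasterizer code: the function name `nearest_faces` and the `(face_index, depth)` input format are hypothetical, chosen only to show how ordering by depth first and face index second makes the result independent of the order in which faces arrive.

```python
from typing import List, Tuple


def nearest_faces(candidates: List[Tuple[int, float]], k: int) -> List[int]:
    """Return the indices of the k nearest faces for one pixel.

    `candidates` holds (face_index, depth) pairs in arbitrary order,
    standing in for the order in which parallel threads report faces.
    This helper is illustrative only and is not part of PyTorch3D.
    """
    # Sort by depth first, then by face index, so that faces with equal
    # depth always resolve the same way regardless of arrival order.
    ordered = sorted(candidates, key=lambda fd: (fd[1], fd[0]))
    return [face for face, _ in ordered[:k]]


# Faces 2 and 7 are equally distant; face 5 is nearer than both.
run_a = nearest_faces([(7, 1.0), (2, 1.0), (5, 0.5)], k=2)
run_b = nearest_faces([(2, 1.0), (5, 0.5), (7, 1.0)], k=2)
```

Both calls select face 5 and then face 2, because the lower index wins the tie at depth 1.0. Without the secondary key, the choice between faces 2 and 7 would depend on input order, which is the analogue of the thread-scheduling dependence described above.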
@bottler Thank you for your reply. One option is that I change the [...] Can we ensure that the lower-indexed of several equally distant faces is treated as the closer face through a [...]
Yes. This would be a code change in a couple of places in [...]
You might know more about your specific meshes. But in general, setting [...]
Will determinism be added as a PyTorch3D feature in the future? In other words, is reproducibility on your TODO list? In my opinion, reproducible rasterization is an important feature to add to PyTorch3D, since it would make training straightforward to reproduce.
Could you elaborate more on this? A code snippet explaining the same would be great.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
❓ How to ensure reproducibility of training with PyTorch3D
I am trying to make my training reproducible with PyTorch + PyTorch3D. When I use only PyTorch, without PyTorch3D, my entire training is reproducible: every time I execute my training script, the errors and the logs match. However, when I introduce PyTorch3D-based rendering into training, the results are no longer reproducible.
Libraries and their versions -
Code to seed out the training
I also checked whether I am missing something from the PyTorch 1.5.1 reproducibility documentation, but could not find anything else.
The latest PyTorch reproducibility documentation says that
Furthermore, if you are using CUDA tensors, and your CUDA version is 10.2 or greater, you should set the environment variable CUBLAS_WORKSPACE_CONFIG according to CUDA documentation
Since I am using CUDA 10.1, I assume this problem should not arise.
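For readers following along, a typical seeding setup of the kind discussed here looks roughly like the sketch below. It is not the code from this issue: the helper name `seed_everything` is hypothetical, and NumPy and PyTorch are imported optionally only so the sketch runs without them. As the comments in this thread explain, seeding of this kind does not remove the tie-ordering non-determinism in the rasterizer.

```python
import os
import random

# NumPy and PyTorch are optional here so the sketch runs without them.
try:
    import numpy as np
except ImportError:
    np = None
try:
    import torch
except ImportError:
    torch = None


def seed_everything(seed: int) -> None:
    """Seed the common random number generators (hypothetical helper)."""
    random.seed(seed)
    # Only affects hash randomization in Python processes started later.
    os.environ["PYTHONHASHSEED"] = str(seed)
    if np is not None:
        np.random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)
        # Silently ignored when CUDA is not available.
        torch.cuda.manual_seed_all(seed)
        # Ask cuDNN for deterministic kernels and disable autotuning.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    # With CUDA 10.2 or later, the PyTorch documentation additionally asks
    # for the CUBLAS_WORKSPACE_CONFIG environment variable to be set
    # before the program starts; it is not needed with CUDA 10.1.


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
```

After re-seeding with the same value, `first` and `second` are equal, which is the behavior expected from the pure-PyTorch part of the pipeline.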
It would be great if you could explain how to remove randomness when using PyTorch3D so that the training is fully reproducible.