
Performance gap on NeRF synthetic dataset #4

Open
walsvid opened this issue Feb 18, 2022 · 12 comments

Comments

@walsvid

walsvid commented Feb 18, 2022

As mentioned in another issue (#2), training with the default config (as in nerf-pytorch) does not reach performance comparable to Instant-NGP.

@yashbhalgat
Owner

yashbhalgat commented Feb 20, 2022

Hi @walsvid, this is indeed the case, thanks for pointing it out! Although the renders look "good", the PSNR is not as good as the values reported in Instant-NGP.

Note: You should look at the testset PSNR (not the training PSNR). I have just pushed some code to print these values. Pull the latest master branch.
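For clarity, here is a minimal sketch of what I mean by test-set PSNR (computed against the held-out test views; the variable names below are just placeholders, not the repo's exact code):

    import torch

    def psnr(rendered: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
        """PSNR in dB for images with values scaled to [0, 1]."""
        mse = ((rendered - gt) ** 2).mean()
        return -10.0 * torch.log10(mse)

    # Average over all held-out test views (not the training views), e.g.:
    # test_psnr = torch.stack([psnr(r, g) for r, g in zip(test_renders, test_gts)]).mean()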

Here's an example with Lego dataset: PSNR on test set = 28.89. Corresponding renders:

video.mp4

To be honest, I am not sure why this is the case. We can try different values for finest_res (the above result is with 1024). If you figure out the reason, please let me know!
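For anyone experimenting with finest_res, here is a small self-contained sketch of how the per-level grid resolutions follow from it via the geometric progression in the Instant-NGP paper (base_res=16 and n_levels=16 are assumed defaults here, not necessarily this repo's):

    import math

    def level_resolutions(base_res=16, finest_res=1024, n_levels=16):
        # Growth factor b = exp((ln(finest_res) - ln(base_res)) / (n_levels - 1)),
        # per-level resolution N_l = floor(base_res * b**l), as in the paper.
        b = math.exp((math.log(finest_res) - math.log(base_res)) / (n_levels - 1))
        return [int(base_res * b ** l) for l in range(n_levels)]

    print(level_resolutions(finest_res=512))    # coarser top level
    print(level_resolutions(finest_res=1024))   # the setting used for the result above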

@ashawkey

ashawkey commented Mar 6, 2022

A possible reason is that instant-ngp loads all train/val/test data for training (see here) on the nerf_synthetic dataset?

@Xuanqing-C

It seems that if we increase the number of iterations for which the TV loss is applied (in run_nerf.py, line 876), the PSNR could rise.
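For reference, a minimal self-contained sketch of a total-variation regularizer over a dense feature grid. This only illustrates the idea behind the TV term; it is not the repo's exact implementation, and all names are placeholders:

    import torch

    def tv_loss_dense(grid: torch.Tensor) -> torch.Tensor:
        # grid: (D, H, W, C) feature grid. Penalize squared differences between
        # neighbouring cells so the learned features stay locally smooth.
        dx = (grid[1:, :, :, :] - grid[:-1, :, :, :]).pow(2).sum()
        dy = (grid[:, 1:, :, :] - grid[:, :-1, :, :]).pow(2).sum()
        dz = (grid[:, :, 1:, :] - grid[:, :, :-1, :]).pow(2).sum()
        return (dx + dy + dz) / grid.numel()

    # e.g. add it to the photometric loss with a small weight, and keep it active
    # for more training iterations than the default cutoff:
    # loss = img_loss + 1e-6 * tv_loss_dense(feature_grid)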

@Feynman1999

Feynman1999 commented May 31, 2022

So, can someone provide a benchmark of this repo vs. the official instant-ngp repo? It would be very useful!

@shreyask3107

Hi @yashbhalgat, may I know what the expected difference in training speed is between this repo, nerf-pytorch, and instant-ngp?
I ran nerf-pytorch and this repo and did not find any significant gain in speed (which is the main purpose of instant-ngp).
Thank you.

@yashbhalgat
Owner

@Feynman1999 regarding benchmarks vs. Instant-NGP, I will try to get this ready, but it won't be any time soon as I am currently caught up with other projects. If you or someone else can work on this, feel free to open a pull request. :)

@shreyk25,

  1. This is a pure PyTorch (+Python) implementation, so it isn't as fast as the CUDA (C++) implementation by Instant-NGP.
  2. Compared to nerf-pytorch, the iterations/second during training are almost the same. However, as mentioned in the README, the HashNeRF algorithm converges much faster (refer to the video below). For the "chair" dataset, HashNeRF converges in about 3000 iterations (which translates to 15 minutes of training time), while vanilla NeRF (nerf-pytorch) takes a few hours to reach similar performance. I have observed approximately a 20x convergence speedup compared to nerf-pytorch.
Chair.Convergence.mp4

Hope this helps. :)

@Feynman1999

It seems that if we increase the number of iterations for which the TV loss is applied (in run_nerf.py, line 876), the PSNR could rise.

Have you tried this TV loss? How does it affect PSNR?

@kwea123

kwea123 commented Jul 4, 2022

A possible reason is that instant-ngp loads all train/val/test data for training (see here) on the nerf_synthetic dataset?

The author just clarified to me that they only use the train split to train: kwea123/ngp_pl#1

@leo-frank

I can verify that "hash encoding + small MLP" converges faster than a "vanilla large MLP"; see the figures below (the bottom is HashNeRF, the top is vanilla NeRF):

At 500 iterations:
image
At 1000 iterations:
image

@wen-yuan-zhang

Thanks for the great work @yashbhalgat. Do you have any idea by now about the performance (numerical results) gap between Instant-NGP and this implementation? I suspect some essential implementation detail is missing from this repo, but I have not been able to find it...

@MrMois

MrMois commented Aug 3, 2023

@zParquet I am wondering the same thing and have been searching for quite a while now, because the gap is quite large. Two differences I have found so far are:

  1. Here, the small eps for Adam is only applied to the hash table entries, yet in the paper it seems they also use it for the MLP (see the sketch after this list):

    {'params': embedding_params, 'eps': 1e-15}

  2. They mention that the mapping between grid and table is 1:1 at the lower resolutions. This injectivity is not guaranteed in this implementation, because in theory collisions could also happen at the lower resolutions. However, they are very unlikely, which makes the effect on PSNR questionable.
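A minimal sketch of point 1 (module sizes and variable names are placeholders, not the repo's actual code): give the MLP parameters the same small eps the hash-table embeddings already get, instead of PyTorch's default 1e-8.

    import torch
    import torch.nn as nn

    # Placeholders standing in for one hash-table level and the small MLP.
    embeddings = nn.Embedding(2 ** 19, 2)
    mlp = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))

    optimizer = torch.optim.Adam(
        [
            {"params": embeddings.parameters(), "eps": 1e-15},
            {"params": mlp.parameters(), "eps": 1e-15},  # paper seems to apply it here too
        ],
        lr=5e-4,
        betas=(0.9, 0.99),
    )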

Further ideas are welcome :D

This work seems to reach at least PSNR 34; however, it uses its own CUDA version of the encoding:
https://github.com/ashawkey/torch-ngp

@MrMois

MrMois commented Aug 3, 2023

After reading appendix E.3, it seems a huge benefit comes from a (very) large number of rays in each batch, at the cost of fewer samples per ray. However, these fewer samples seem to be possible only because of their additional nested occupancy grids.

The batch size has a significant effect on the quality and speed of NeRF convergence. We found that training from a larger number of rays, i.e. incorporating more viewpoint variation into the batch, converged to lower error in fewer steps.

image

EDIT: Author of the original paper states something similar: NVlabs/instant-ngp#118
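A small sketch of that trade-off, using nerf-pytorch-style flag names (N_rand / N_samples / N_importance are an assumption about this repo's CLI, so double-check run_nerf.py):

    from argparse import Namespace

    # Many more rays per batch (more viewpoint variation per step) at the cost of
    # sampling each ray less densely; Instant-NGP can afford few samples per ray
    # because its occupancy grids skip empty space.
    baseline = Namespace(N_rand=1024, N_samples=64, N_importance=128)   # typical nerf-pytorch setting
    many_rays = Namespace(N_rand=16384, N_samples=32, N_importance=0)

    for cfg in (baseline, many_rays):
        print(cfg, "->", cfg.N_rand * (cfg.N_samples + cfg.N_importance), "samples per step")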
