Re-initializing main() because the training of light MLP diverged and all the values are zero. #4
Thanks for open-sourcing such a great project!
When I trained on the yufeng and marcel datasets, this error occurred very quickly: Re-initializing main() because the training of light MLP diverged and all the values are zero.
Code tested on an RTX 3090 Ti.
How can I solve this problem?

Comments
Hi, did the code restart or crash?
It keeps restarting and then repeating this error.
Hi, the only solution, unfortunately, is to restart the code. I did not find time to check whether it is a GPU problem: on one of the GPUs I used it did not occur, and I ran into the problem only on other GPUs. So please try restarting the code; if it still does not work, let me know! The error arises because we do not use an activation function on the output layer of the light MLP, keeping the values unconstrained since we tonemap them afterwards.
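To illustrate the point, here is a minimal PyTorch sketch (an assumed structure, not the repo's actual code) of an MLP whose final layer has no activation. Because the outputs are unbounded, an unlucky initialization or a large gradient step can drive them to diverge or collapse to zero, which is what triggers the re-initialization message:

```python
# Hypothetical sketch of a light MLP with an unconstrained output layer.
# The final nn.Linear has no Sigmoid/ReLU after it, since the values are
# tonemapped downstream; this is what leaves the output free to diverge.
import torch.nn as nn

class LightMLP(nn.Module):  # name is illustrative, not the repo's class
    def __init__(self, in_dim=3, hidden=64, out_dim=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),  # no output activation
        )

    def forward(self, x):
        return self.net(x)  # tonemapping happens later in the pipeline
```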
The same problem occurs on an A800.
Hi, does the code never work, or does it happen only a few times?
Unfortunately, the code never works.
Hi,
I found the problem: the module robust_loss_pytorch was missing. It runs successfully now!
yeah, pip install git+https://github.com/jonbarron/robust_loss_pytorch |
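For reference, here is a minimal sketch of how the package is typically used once installed (this follows the general Barron robust loss API; the exact call site in this repo may differ):

```python
# Assumed usage of robust_loss_pytorch's general loss (Barron, CVPR 2019).
import torch
from robust_loss_pytorch import general

residual = torch.randn(128, 3, requires_grad=True)  # e.g. prediction - target
alpha = torch.tensor(1.0)                           # shape parameter of the loss
scale = torch.tensor(0.1)                           # scale parameter
loss = general.lossfun(residual, alpha, scale).mean()
loss.backward()
```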
Glad to hear it works now :) I think I forgot to include it in the requirements.txt. @Orange-Ctrl I can also see a tinycudann installation warning in your log image: it looks like you compiled tinycudann on one GPU device and are running it on another. To get the best performance, please make sure it is properly installed.
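If you hit that tinycudann warning, one way to check which architecture to rebuild for is a generic PyTorch query (this snippet is illustrative, not something from this repo):

```python
# Print the GPU's compute capability; rebuild tinycudann to match it, e.g.
#   TCNN_CUDA_ARCHITECTURES=86 pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
# (86 = RTX 3090 / Ampere; use whatever number this script prints for your card.)
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability {major}.{minor} -> TCNN_CUDA_ARCHITECTURES={major}{minor}")
```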
Sorry to bother you @sbharadwajj, but it really doesn't work, even after installing robust_loss_pytorch as above. My GPU is an RTX 3090 and it has never run successfully, not even once! Could you give me some help? Thanks a lot!
Hi @zydmu123, did you manage to run the code in the end? I'm having the same issue on an RTX 3090, and I also tried
@Yingyan-Xu did you verify that the mask is correct? Can you quickly save the mask and check?
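For anyone unsure how to dump the mask, one quick way is shown below (the variable name is hypothetical; substitute the mask tensor from your own pipeline):

```python
# Save a [H, W] float mask with values in [0, 1] as a grayscale PNG for inspection.
import torch
from torchvision.utils import save_image

mask = torch.rand(256, 256)  # stand-in for your real mask tensor
save_image(mask.unsqueeze(0), "mask_debug.png")  # shape becomes [1, H, W]
```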