
loss is nan at the beginning #16

Open
ustczhouyu opened this issue Jan 15, 2020 · 4 comments

ustczhouyu commented Jan 15, 2020

❓ Questions and Help

Help! When I train on my own dataset, the loss is NaN at the beginning. Can anybody tell me how to deal with it? Thanks a lot!
[screenshot attached]
@mrlooi

mrlooi commented Jan 21, 2020

That's odd, but without more info I can't really provide more help. Have you resolved it?

@HashiamKadhim

Are you working with a single GPU? If so, did you decrease the batch size so that a batch fits into GPU memory? If yes to both:

Set SOLVER.BASE_LR in your model_config.yaml about an order of magnitude lower (for example, set it to 0.0025).

A larger batch size gives you more stable gradient estimates, which is what allows a higher learning rate. When the batch size goes down, a good rule of thumb is to scale the learning rate down with it (see the sketch below).
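
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The reference setting of 16 images per batch with BASE_LR = 0.02 is an assumption (typical of maskrcnn-benchmark-style configs), not a value taken from this repo, so check your own config:

```python
# Minimal sketch of the "scale the LR with the batch size" rule of thumb.
# The reference values (16 images per batch, BASE_LR = 0.02) are assumed;
# substitute the values from your own config.

def scaled_base_lr(ref_lr: float, ref_batch: int, my_batch: int) -> float:
    """Scale the base learning rate linearly with the batch size."""
    return ref_lr * my_batch / ref_batch

if __name__ == "__main__":
    # e.g. going from 8 GPUs x 2 images (16 per batch) down to a single GPU with 2 images
    print(scaled_base_lr(ref_lr=0.02, ref_batch=16, my_batch=2))  # -> 0.0025
```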

@ustczhouyu (Author)

@HashiamKadhim @mrlooi Thank you very much; when I set the lr to 0.005, it works. But when I train the model on a dataset containing many small objects, I run into two other difficulties:

1. The model detects two or more small objects that are close together horizontally or vertically as a single object.
2. The background of this dataset is complex, and some background regions even have textures similar to the foreground, which leads to false positives.

What should I do to solve these two problems? (For example, which parameters should be modified, or what kind of branch should be added?) Please help.

@Johnsyisme

Hi! I also ran into the same issue. I did some tests, and the gradient is always NaN.
[screenshot attached]
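
One way to narrow this down is to check which gradients actually go non-finite. A minimal PyTorch sketch, not code from this repo; `model` is a placeholder for your own network:

```python
import torch

# Report the operation that first produces NaN/inf during the backward pass.
torch.autograd.set_detect_anomaly(True)

def check_nan_grads(model: torch.nn.Module) -> None:
    """Print every parameter whose gradient contains NaN or inf."""
    for name, param in model.named_parameters():
        if param.grad is not None and not torch.isfinite(param.grad).all():
            print(f"non-finite grad in: {name}")

# Inside the training loop, after loss.backward():
#   check_nan_grads(model)
# It is also worth verifying the inputs and targets with torch.isfinite(...).
```

With anomaly detection enabled, PyTorch will also raise an error pointing at the backward operation that first produced the NaN, which helps tell a bad input/label apart from an exploding learning rate.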
