At training the loss bbox_loss is always zero #266
After figuring out the NaN, I adjusted the learning rate and also used a smaller dataset, to be sure whether the network is learning anything. At least I no longer get NaN values in the loss, but bbox_loss stays at zero:
The bbox loss should be different from zero, right?
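(Background, not from the thread: in Fast R-CNN the bbox regression term is masked by per-ROI weights that are zero for background ROIs, so if no proposal ever overlaps a ground-truth box enough to be labeled foreground, the loss is exactly zero. A minimal NumPy sketch of that masking, with hypothetical shapes:)

```python
import numpy as np

def smooth_l1(x):
    # Smooth L1, as used for the bbox regression term in Fast R-CNN
    ax = np.abs(x)
    return np.where(ax < 1, 0.5 * x ** 2, ax - 0.5)

# Hypothetical minibatch of 128 ROIs in which no proposal was labeled
# foreground: targets and weights are only filled in for foreground ROIs.
bbox_pred = np.random.randn(128, 4)    # arbitrary network output
bbox_targets = np.zeros((128, 4))
bbox_weights = np.zeros((128, 4))      # 0 for every background ROI

loss = np.sum(bbox_weights * smooth_l1(bbox_pred - bbox_targets)) / 128
print(loss)  # exactly 0.0 -> a constantly zero bbox_loss usually means "no foreground ROIs"
```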
Hi, I've met the same issue here. Did you figure out how to solve it? Does it have anything to do with the image dataset or the annotations? Thank you!
Hi,
Thank you! I solved this. There were some negative bbox values in my dataset. I just got rid of that data and everything works fine now.
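(For anyone hitting the same thing: a minimal sketch for spotting such boxes, assuming Pascal-VOC-style XML annotations; adjust the path and field names to your dataset. Note that the stock VOC loader subtracts 1 from each coordinate, so a 0-valued xmin/ymin already becomes negative after loading.)

```python
import glob
import xml.etree.ElementTree as ET

# Scan VOC-style annotations for negative or degenerate boxes before training.
for path in glob.glob('Annotations/*.xml'):
    root = ET.parse(path).getroot()
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        xmin, ymin, xmax, ymax = (float(box.find(tag).text)
                                  for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        if xmin < 1 or ymin < 1 or xmax <= xmin or ymax <= ymin:
            print('suspicious box in', path, (xmin, ymin, xmax, ymax))
```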
Hi, what is meant by "pad the images"? During my training the values of bbox_loss and rpn_loss_bbox are always 0. I tried different datasets, but no change. Do these values have to be different from 0? The results with this model are not satisfying: very big boxes and low scores for the classes! Any advice?
Hi~ I have the same problem, or even worse. Because I cannot use a GPU, I set up the environment following this blog and this blog, and ran the demo. This is the training log:
So, how can I fix it? Does it have anything to do with the CPU MOD files? UPDATE: I changed base_lr to 0.00001 (1e-5), and it worked.
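(For reference, base_lr lives in the Caffe solver prototxt of the stage being trained; a sketch of the relevant line, assuming a stock-style solver where the default is 0.001. The exact file name and path depend on your model and training stage.)

```
# Caffe solver prototxt (exact file depends on the model/stage being trained)
base_lr: 0.00001    # i.e. 1e-5, lowered from the usual 0.001
```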
Hi~
If you are using CPU mode, please use this pull request. |
@jinyu121 thank you~
@Mato98 @Soda-Wong Hi, I am facing the exact same issue with my bbox_loss = 0. Even using an LR of 0.00005 does not change it! Any ideas or solutions?
Has anybody solved the problem? I meet this issue when I run R-FCN training with OHEM: loss_bbox = 0 (* 1 = 0 loss) from the very first iteration. However, when I run the code without OHEM, everything is OK and I get an mAP of 0.78, so I'm sure it has nothing to do with my data annotation. What else can cause this?
Having the same issue; I've narrowed it down to the anchor boxes. The IoU between the anchor boxes and the ground truth is always zero, so every anchor gets a negative label, and hence everything ends up zero. Still looking for a solution; I'd appreciate it if you guys found one. @VersionHX
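(A quick way to check that, using a standalone NumPy sketch rather than the repo's own bbox_overlaps utility: compute the IoU of your anchors against the ground-truth boxes for a few images. If the best overlap is near zero everywhere, every anchor is labeled negative and the bbox losses stay at zero, which usually points at mis-scaled or x/y-swapped coordinates.)

```python
import numpy as np

def iou(anchors, gt):
    """IoU between anchors (N, 4) and ground-truth boxes (K, 4), both as x1, y1, x2, y2."""
    ix1 = np.maximum(anchors[:, None, 0], gt[None, :, 0])
    iy1 = np.maximum(anchors[:, None, 1], gt[None, :, 1])
    ix2 = np.minimum(anchors[:, None, 2], gt[None, :, 2])
    iy2 = np.minimum(anchors[:, None, 3], gt[None, :, 3])
    iw = np.clip(ix2 - ix1 + 1, 0, None)
    ih = np.clip(iy2 - iy1 + 1, 0, None)
    inter = iw * ih
    area_a = (anchors[:, 2] - anchors[:, 0] + 1) * (anchors[:, 3] - anchors[:, 1] + 1)
    area_g = (gt[:, 2] - gt[:, 0] + 1) * (gt[:, 3] - gt[:, 1] + 1)
    return inter / (area_a[:, None] + area_g[None, :] - inter)

# Example: the first anchor should clearly overlap the ground-truth box.
anchors = np.array([[0, 0, 100, 100], [200, 200, 300, 300]], dtype=float)
gt = np.array([[10, 10, 90, 90]], dtype=float)
print(iou(anchors, gt).max(axis=1))  # first value should be well above 0
```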
@mukeshmithrakumar Yeah, I solved the problem with the solution here. It seems to be a version conflict with NumPy.
Thanks @VersionHX
I'm trying to train a ZF network for a custom dataset (following instructions from here), using the command:
In the RPN stage the loss seems OK, but when it arrives at stage 1 of the Fast R-CNN model, the loss is NaN:
What does this mean? Is the network learning something, or is it not working?