Finetuning issues, loss: -nan after 100 iterations #644
Comments
Sergio: Try a smaller base_lr.

mezN: I reduced the size of my dataset and also changed the class proportions (e.g. 50 aeroplane vs. 100 non-aeroplane images), and I reduced the learning rate; now it seems to work. Thanks! Would you mind telling me what exactly momentum and weight_decay do? I now understand all the parameters of solver.prototxt except those two :)

Sergio: Take a look here.

mezN: Thanks, I closed this issue :)
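For reference on the two parameters asked about above: momentum carries over a fraction of the previous update (a "velocity") to smooth out noisy gradients, and weight_decay adds an L2 penalty that pulls weights toward zero. A minimal sketch of an SGD update of this form (a simplification of what a solver like Caffe's does; the default values below mirror common ImageNet solver settings, not this user's config):

```python
def sgd_step(w, v, grad, lr=0.001, momentum=0.9, weight_decay=0.0005):
    """One momentum-SGD update for a single weight.

    momentum:     fraction of the previous velocity carried over (smoothing)
    weight_decay: L2 penalty, pulls the weight toward zero
    """
    v = momentum * v - lr * (grad + weight_decay * w)  # new velocity
    return w + v, v

# One step from w=1.0 with gradient 0.5: the weight moves a small step
# downhill, and the velocity remembers the direction for the next step.
w, v = sgd_step(1.0, 0.0, grad=0.5)
```

With momentum close to 1 the optimizer keeps moving in a consistent direction even when individual mini-batch gradients are noisy, while weight_decay keeps the finetuned weights from drifting arbitrarily far from zero.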
Hello,
First of all, I know that several issues on this topic already exist, but unfortunately none of them provided enough information for me to solve my problem.
I was trying to reuse the pretrained ImageNet model to solve a binary classification task. Here is what I did:
What I got was the following:
I0707 17:01:49.294651 13063 solver.cpp:106] Iteration 0, Testing net
I0707 17:20:26.931828 13063 solver.cpp:142] Test score #0: 0.002
I0707 17:20:26.931887 13063 solver.cpp:142] Test score #1: 1.84863
I0707 22:03:18.925554 13063 solver.cpp:237] Iteration 100, lr = 0.001
I0707 22:03:19.511451 13063 solver.cpp:87] Iteration 100, loss = -nan
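A loss of nan usually means the optimization diverged, which is why "try a smaller base_lr" is the first suggestion above. A toy sketch (not Caffe code, just plain gradient descent on f(w) = w²) showing how a step size above the stability limit blows up to a non-finite value:

```python
def run(lr, steps=2000):
    """Plain gradient descent on f(w) = w**2, whose gradient is 2*w."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return w

small = run(0.1)  # stable: w shrinks toward the minimum at 0
big = run(1.5)    # |w| doubles every step, overflows, and ends up as nan
```

The same mechanism applies to a deep net: a learning rate that is too large for the loss surface makes the weights oscillate with growing magnitude until they overflow, after which the loss is reported as nan or -nan.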
Additionally, I made a few runs with slightly different network definitions, e.g. keeping all the ImageNet layers and putting an extra fully connected layer with 2 outputs (or with just 1 output) on top, but these runs failed with the same result.
I did not find much documentation on finetuning apart from the presentation slides and several issues (#31, #328, #140, among others).
I am new to Caffe, and this is my first time working with neural networks, so please don't hesitate to write detailed answers. For example: is it sufficient to just reduce the number of outputs of the last fully connected layer to make the ImageNet model suitable for a binary classification task?
Best regards,
Chris
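On the closing question above: the usual finetuning recipe in the Caffe examples is to both change num_output on the last fully connected layer and rename that layer, so that the 1000-way pretrained weights are not copied into it and it is initialized fresh. A hedged sketch in the newer layer syntax (the layer names are illustrative, and the original prototxt in this issue may use the older layers/INNER_PRODUCT format):

```
# Sketch: replacing the ImageNet classifier layer for 2 classes.
# Renaming fc8 -> fc8_binary (illustrative name) prevents Caffe from
# copying the pretrained 1000-way weights into this layer.
layer {
  name: "fc8_binary"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_binary"
  inner_product_param {
    num_output: 2   # two classes instead of ImageNet's 1000
  }
}
```

It is also common to give the renamed layer a larger learning-rate multiplier than the copied layers, so the fresh classifier adapts quickly while the pretrained features change slowly.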