DeepNet in Pytorch #23
Comments
Hi @Goutam-Kelam. Are you initializing the first three layers from VGG-M? You might also need to adjust the learning rates or train for longer -- losses tend to be implemented differently in different frameworks.
In one of my trials I initialized the weights from VGG-M as mentioned and trained the network for about 15 epochs. The average loss didn't change much and stayed at approximately 4. I reduced the learning rate by half every 2000 iterations, as specified in your prototxt file. When the results were not as expected, I manually initialized the weights of the layers the way the prototxt file specifies. I am not sure where I am going wrong. It could be the prediction stage itself, but there too I did the exact post-processing you described in one of the earlier issues.
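For reference, the "halve every 2000 iterations" schedule described above can be written out as below. This is only a minimal sketch in plain Python, not code from the repository; the function name `step_decay_lr` and its parameters are made up for illustration. In PyTorch, `torch.optim.lr_scheduler.StepLR` provides the same kind of step decay.

```python
def step_decay_lr(base_lr, iteration, step_size=2000, gamma=0.5):
    """Learning rate in effect at `iteration` (0-based).

    The rate is multiplied by `gamma` once every `step_size` iterations.
    The name and defaults are illustrative assumptions, not from the thread.
    """
    return base_lr * gamma ** (iteration // step_size)


if __name__ == "__main__":
    # Show how quickly repeated halving shrinks the starting rate of 1.3e-7.
    for it in (0, 2000, 10000, 20000):
        print(it, step_decay_lr(1.3e-7, it))
```

After 20000 iterations the rate has been halved ten times, so it is roughly a thousandth of its starting value. With a starting rate as small as 1.3e-7, that may be one reason the loss stops moving.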
You need to transfer weights from VGG-M to get good results, if I recall correctly. You might need to play around with the initial learning rate too. Try setting it as high as you can initially without causing divergence, then reduce it only when the training loss flattens off for a few epochs.
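The "reduce only when the training loss flattens off" idea can be sketched roughly as follows. This is a plain-Python illustration under assumptions: the class name `PlateauLR`, the `patience` and `min_delta` parameters, and their defaults are hypothetical, not taken from the thread or the repository. PyTorch ships `torch.optim.lr_scheduler.ReduceLROnPlateau`, which is similar in spirit.

```python
class PlateauLR:
    """Lower the learning rate when the loss stops improving (illustrative sketch)."""

    def __init__(self, lr, factor=0.5, patience=3, min_delta=1e-4):
        self.lr = lr                # current learning rate
        self.factor = factor        # multiplier applied on a plateau
        self.patience = patience    # epochs without improvement to tolerate
        self.min_delta = min_delta  # smallest drop that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, train_loss):
        """Call once per epoch with the average training loss; returns the rate to use next."""
        if train_loss < self.best - self.min_delta:
            self.best = train_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

Compared with a fixed step schedule, this keeps the rate high for as long as the loss is still dropping, which matches the advice above.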
I had kept the LR at 1.3e-7. So are you suggesting I should set it somewhere near 1e-5, train for a few epochs, say 5, and then reduce it every 2000 iterations?
Hi @Goutam-Kelam: is this PyTorch implementation stable now? I would like to try it. |
Hi, I would like to try this PyTorch implementation. Is it stable now? Thanks. |
Hi, I am trying to implement the DeepNet architecture in PyTorch. The code seems to work fine, but the results are not as expected. I have followed the prototxt files provided in issues 3 and 9. You can find my implementation at https://github.com/Goutam-Kelam/Visual-Saliency/tree/master/Deep_Net. It would be helpful if you could tell me where my mistake lies.
Thank you in advance.