Ask for advice for att2in2 under scst training #37

Closed
miracle24 opened this issue Apr 12, 2018 · 2 comments

Comments

@miracle24

Hi. I trained the att2in2 model with the default settings, but my CIDEr score is lower than the one reported in https://github.com/ruotianluo/ImageCaptioning.pytorch/issues/10
Here is my result:
Bleu_1: 0.796 Bleu_2: 0.622 Bleu_3: 0.471 Bleu_4: 0.351 ROUGE_L: 0.561 CIDEr: 1.118
and here is the result from https://github.com/ruotianluo/ImageCaptioning.pytorch/issues/10:
Bleu_1: 0.777 Bleu_2: 0.613 Bleu_3: 0.465 Bleu_4: 0.347 ROUGE_L: 0.560 CIDEr: 1.156
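
To make the gap concrete, here is a small sketch that computes the per-metric difference between the two result lines above. The scores are copied from this post; the names `mine`, `reference`, and `metric_deltas` are only illustrative.

```python
# Scores copied from the two result lines above.
mine = {
    "Bleu_1": 0.796, "Bleu_2": 0.622, "Bleu_3": 0.471,
    "Bleu_4": 0.351, "ROUGE_L": 0.561, "CIDEr": 1.118,
}
reference = {
    "Bleu_1": 0.777, "Bleu_2": 0.613, "Bleu_3": 0.465,
    "Bleu_4": 0.347, "ROUGE_L": 0.560, "CIDEr": 1.156,
}


def metric_deltas(a, b):
    """Return a[name] - b[name] for every metric, rounded to 3 decimals."""
    return {name: round(a[name] - b[name], 3) for name in a}


deltas = metric_deltas(mine, reference)
for name, delta in deltas.items():
    # A positive delta means my run scored higher on that metric.
    print(f"{name:8s} {delta:+.3f}")
```

Every metric except CIDEr comes out slightly positive, while CIDEr is about 0.038 lower.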

This is how I trained the model:
(1) I pretrained att2in2 for 25 epochs with the same settings (the same spatial image features, the same batch size, scheduled sampling starting from 0, the same learning rate decay, and so on), and obtained results comparable to yours.
(2) I then trained it with SCST for another 35 epochs. The learning rate was fixed at 5e-5. The cache for computing CIDEr is coco-train-idxs.
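
For readers unfamiliar with step (2), the following is a minimal sketch of the self-critical (SCST) objective for a single caption, assuming the reward is a CIDEr score and the baseline is the reward of the greedily decoded caption. It is not the repository's code; `scst_loss` and its arguments are hypothetical names, and the numbers in the usage lines are made up.

```python
def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    """Self-critical loss for one sampled caption.

    sample_logprobs: log-probabilities of the sampled words (one per step).
    sample_reward:   reward (e.g. CIDEr) of the sampled caption.
    greedy_reward:   reward of the greedy caption, used as the baseline.
    """
    # Advantage is positive when the sample beats the greedy baseline.
    advantage = sample_reward - greedy_reward
    # REINFORCE with baseline: minimizing this raises the probability of
    # samples with positive advantage and lowers it for negative ones.
    return -advantage * sum(sample_logprobs)


# Illustrative values only (not taken from any real run).
better = scst_loss([-0.5, -1.0, -0.25], sample_reward=1.2, greedy_reward=1.0)
worse = scst_loss([-0.5, -1.0, -0.25], sample_reward=0.8, greedy_reward=1.0)
print(better, worse)
```

Because the reward in this setup is CIDEr itself, it is at least plausible for CIDEr to move differently from BLEU and ROUGE_L during this stage.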

Compared with your result, my CIDEr is worse, but the other metrics are better. This bothers me a little and makes me doubt my experiment settings.
Are there any small details I missed?
I wonder whether the scheduled sampling used for the pretrained model affects exploration during RL, but I have not tried it yet. Any advice will be appreciated. Thanks a lot.

@miracle24 miracle24 changed the title from "The performance of att2in2 with scst bothers me." to "Ask for advice for att2in2 under scst training" on Apr 12, 2018
@ruotianluo
Owner

I did find that if you start SCST at different times, the performance differs in this way (higher on the other metrics and lower on CIDEr).

I actually forgot the exact setting I used. Maybe try starting SCST at 30 or 35 epochs?
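
As a rough illustration of what changing the start epoch means, here is a sketch of the switch from cross-entropy training to SCST. The function `use_scst`, the variable names, and the fixed 60-epoch budget (25 + 35 from the run described above) are assumptions for this example, not the repository's actual logic.

```python
def use_scst(epoch, scst_start_epoch):
    """Return True once training should switch from cross-entropy to SCST.

    A negative scst_start_epoch is treated here as "never use SCST".
    """
    return scst_start_epoch >= 0 and epoch >= scst_start_epoch


TOTAL_EPOCHS = 60  # assumed budget, for illustration only

split = {}
for start in (25, 30, 35):
    scst_epochs = sum(use_scst(e, start) for e in range(TOTAL_EPOCHS))
    split[start] = (TOTAL_EPOCHS - scst_epochs, scst_epochs)
    print(f"start={start}: {split[start][0]} cross-entropy epochs, "
          f"{split[start][1]} SCST epochs")
```

Under a fixed budget, a later start trades SCST epochs for more cross-entropy pretraining, so runs with different start epochs are not directly comparable on SCST epochs alone.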

@miracle24
Author

Ok. Thanks a lot. I will keep trying. It really takes too much time to train the model with RL. Sigh.
