# Models

Results are reported on the Karpathy test split with beam size 5. For each model, the evaluated checkpoint is the one with the highest CIDEr on the validation set; beyond that, the numbers are not cherry-picked. The scores are provided so you can verify that your setup is correct: if the scores you get are close to the ones listed here (slightly higher or lower is fine), you are on the right track.
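For reference, evaluating a downloaded checkpoint could look like the sketch below. This is a minimal sketch, not the repository's documented command: the script name and flags (`eval.py`, `--model`, `--infos_path`, `--num_images`, `--language_eval`, `--beam_size`) and the paths are assumptions based on typical usage of this codebase and should be checked against your checkout; only the beam size of 5 comes from the protocol above.

```bash
# Minimal sketch, assuming the evaluation script and flag names below exist in
# your checkout; paths are placeholders for the downloaded model and infos files.
python eval.py \
    --model downloaded/model-best.pth \
    --infos_path downloaded/infos-best.pkl \
    --num_images -1 \
    --language_eval 1 \
    --beam_size 5
```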

## Trained with Resnet101 feature

Collection: link

| Name | CIDEr | SPICE | Download | Note |
| --- | --- | --- | --- | --- |
| FC | 0.953 | 0.1787 | model&metrics | `--caption_model newfc` |
| FC+self_critical | 1.045 | 0.1838 | model&metrics | `--caption_model newfc` |
| FC+new_self_critical | 1.053 | 0.1857 | model&metrics | `--caption_model newfc` |

## Trained with Bottomup feature

Collection: link

| Name | CIDEr | SPICE | Download | Note |
| --- | --- | --- | --- | --- |
| Att2in | 1.089 | 0.1982 | model&metrics | My replication |
| Att2in+self_critical | 1.173 | 0.2046 | model&metrics | |
| Att2in+new_self_critical | 1.195 | 0.2066 | model&metrics | |
| UpDown | 1.099 | 0.1999 | model&metrics | My replication |
| UpDown+self_critical | 1.227 | 0.2145 | model&metrics | |
| UpDown+new_self_critical | 1.239 | 0.2154 | model&metrics | |
| UpDown+Schedule long+new_self_critical | 1.280 | 0.2200 | model&metrics | Best of 5 models; schedule proposed by yangxuntu |
| Transformer | 1.1259 | 0.2063 | model&metrics | |
| Transformer (warmup+step decay) | 1.1496 | 0.2093 | model&metrics | Although this schedule is better, the final self-critical results are similar. |
| Transformer+self_critical | 1.277 | 0.2249 | model&metrics | Could likely be higher: the reported checkpoint is the one with the highest CIDEr on the validation set, so another checkpoint may score better on test. |
| Transformer+new_self_critical | 1.303 | 0.2289 | model&metrics | |

## Trained with vilbert-12-in-1 feature

Collection: link

| Name | CIDEr | SPICE | Download | Note |
| --- | --- | --- | --- | --- |
| Transformer | 1.158 | 0.2114 | model&metrics | The config needs to be changed to use the vilbert feature (see the sketch below). |
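As a rough illustration of the note above, switching to the vilbert-12-in-1 features mainly means pointing the feature directory and the attention-feature dimensionality at the new features. This is a minimal sketch under assumptions: the flag names (`--cfg`, `--input_att_dir`, `--att_feat_size`), the paths, and the feature size are not taken from this document and must be checked against your checkout and your extracted features.

```bash
# Minimal sketch: flag names, paths, and the feature size are assumptions,
# not verified values; set --att_feat_size to the dimensionality of the
# vilbert-12-in-1 features you actually extracted.
python train.py \
    --cfg configs/transformer.yml \
    --input_att_dir data/vilbert_att \
    --att_feat_size 1024
```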