My reproduction of Google's im2txt (Show and Tell) model, with scripts for training and evaluating it on a SLURM cluster.
I followed this tutorial from the im2txt repository to develop the scripts included in this repository for training the model using SLURM. Start there and run the appropriate script. Each script has TODOs requiring updates for your directories, etc. Here are the basic steps to reproduce the training once all dependencies are installed (need help installing TF on CentOS 6.8 without root? Check here):
- Clone the im2txt model into your desired directory
- Download and process the MSCOCO data using
MSCOCO.sh
- Download the InceptionV3 checkpoint using
get_inception.sh
- Modify
tensorflowShowandTell0.sh
and
tensorflowShowandTell1.sh
for your system.
- Initialize the training and evaluation using
im2txttrain.sh
to call the scripts modified in step 4.
- After 5,000 steps complete, run
tensorboard.sh
to start a TensorBoard server.
- Follow this gist to access the server.
- You can modify and run
showAndTellClassify.sh
to test your model when the script from step 5 completes.
- Run
trainWithInception.sh
to complete an additional 2 million training steps, including updates to the Inception network.
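
Since every script carries TODOs that must be edited before the steps above will work, a quick check for leftover markers can save a failed job. The following is a minimal sketch and not part of the original scripts. It assumes the markers contain the literal string TODO and that all scripts sit in one directory; the function name check_todos is hypothetical.

```sh
#!/bin/sh
# Sketch only: report TODO markers still present in the scripts named in
# the steps above, so they can be edited before anything is launched.

# Script names are taken from the steps above.
SCRIPTS="MSCOCO.sh get_inception.sh tensorflowShowandTell0.sh
tensorflowShowandTell1.sh im2txttrain.sh tensorboard.sh
showAndTellClassify.sh trainWithInception.sh"

# check_todos [DIR]: print each line containing "TODO" (assumed marker text)
# in the listed scripts under DIR (default: current directory).
# Returns 1 if a TODO is found or a script is missing, 0 otherwise.
check_todos() {
  _dir="${1:-.}"
  _status=0
  for _script in $SCRIPTS; do
    if [ ! -f "$_dir/$_script" ]; then
      echo "missing: $_script"
      _status=1
    # Passing /dev/null as a second file makes grep print the file name.
    elif grep -n "TODO" "$_dir/$_script" /dev/null; then
      _status=1
    fi
  done
  return "$_status"
}
```

After sourcing this file, `check_todos /path/to/scripts` prints each remaining marker with its file name and line number. A zero exit status means no marker was found and no script was missing, so the scripts are ready to launch as described in the steps.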