
Implementation of CVPR2017 paper "A Hierarchical Approach for Generating Descriptive Image Paragraphs" in Tensorflow (in progress...)


InnerPeace-Wu/im2p-tensorflow

Densecap-tensorflow

Implementation of the CVPR 2017 paper "A Hierarchical Approach for Generating Descriptive Image Paragraphs" by Jonathan Krause, Justin Johnson, Ranjay Krishna, and Fei-Fei Li.

NOTE: This repo is based on densecap-tensorflow, and it's still buggy.

Note

Update 2018.1.27

  • The following procedures will be adapted for IM2P soon.

Dependencies

Install the required Python modules with:

pip install -r lib/requirements.txt

Preparing data

Download

Website of Visual Genome Dataset

  • Make a new directory VG wherever you like.
  • Download the images (Part 1 and Part 2) and extract both parts to the directory VG/images.
  • Download the image meta data and extract it to VG/1.2 or VG/1.0, according to the version you downloaded.
  • Download the region descriptions and extract them to VG/1.2 or VG/1.0 accordingly.
  • In the following steps, the directory VG is referred to as raw_data_path.
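Before preprocessing, it can help to confirm that the layout described above is in place. The following is a minimal sketch, not part of this repo: the helper name `missing_vg_entries` is hypothetical, and the file names `image_data.json` and `region_descriptions.json` are assumptions about what the Visual Genome archives contain, so adjust them if your download differs.

```python
import os

# Assumed file names inside VG/<version>; adjust if your download differs.
EXPECTED_FILES = ("image_data.json", "region_descriptions.json")


def missing_vg_entries(raw_data_path, version="1.2"):
    """Return the expected paths under raw_data_path that do not exist."""
    expected = [os.path.join(raw_data_path, "images")]
    expected += [
        os.path.join(raw_data_path, version, name) for name in EXPECTED_FILES
    ]
    return [path for path in expected if not os.path.exists(path)]


if __name__ == "__main__":
    # "VG" stands in for your own raw_data_path.
    for path in missing_vg_entries("VG", version="1.2"):
        print("missing:", path)
```

An empty result means the images directory and both JSON files were found for the chosen version.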

Unlimited RAM (more than 16G)

If you have more than 16G of RAM, you can preprocess the dataset with the following command.

$ cd $ROOT/lib
$ python preprocess.py --version [version] --path [raw_data_path] \
        --output_dir [dir] --max_words [max_len]

Limited RAM (less than 16G)

If you have less than 16G of RAM, preprocess the dataset in two stages.

  • First, set the data path in info/read_regions.py and run the script with Python. It dumps the regions into the REGION_JSON directory. Processing more than 100k images takes time, so be patient.
$ cd $ROOT/info
$ python read_regions.py --version [version] --vg_path [raw_data_path]
  • In lib/preprocess.py, set the data path accordingly. Running the script dumps the gt_regions of each image separately into the OUTPUT_DIR directory.
$ cd $ROOT/lib
$ python preprocess.py --version [version] --path [raw_data_path] \
        --output_dir [dir] --max_words [max_len] --limit_ram
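The idea behind the limited-RAM path, as described above, is to store the regions of each image separately so that later stages can read one image at a time. The sketch below illustrates that idea only; it is not the code in info/read_regions.py or lib/preprocess.py. The function names are hypothetical, and the input layout (a list of entries with `id` and `regions` keys) is an assumption about the region descriptions file.

```python
import json
import os


def dump_regions_per_image(entries, out_dir):
    """Write each image's regions to out_dir/<image_id>.json; return the count."""
    os.makedirs(out_dir, exist_ok=True)
    for entry in entries:
        path = os.path.join(out_dir, "{}.json".format(entry["id"]))
        with open(path, "w") as f:
            json.dump(entry["regions"], f)
    return len(entries)


def load_regions(out_dir, image_id):
    """Read back the regions of a single image without touching the others."""
    with open(os.path.join(out_dir, "{}.json".format(image_id))) as f:
        return json.load(f)
```

The one-time split still needs the source entries in memory; the saving comes afterwards, when each training or preprocessing step loads only the file for the image it is working on.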

Compile local libs

$ cd $ROOT/lib
$ make

Train

Add or modify configurations in $ROOT/scripts/dense_cap_config.yml. Refer to lib/config.py for more configuration details.

$ cd $ROOT
$ bash scripts/dense_cap_train.sh [dataset] [net] [ckpt_to_init] [data_dir] [step]

Parameters:

  • dataset: visual_genome_1.2 or visual_genome_1.0.
  • net: res50, res101
  • ckpt_to_init: pretrained model to be initialized with. Refer to tf_faster_rcnn for more init weight details.
  • data_dir: the directory where the outputs of the data preparation step were saved.
  • step: the training stage, used to continue training from an earlier stage.
    • step 1: fix convnet weights
    • step 2: finetune convnet weights
    • step 3: add context fusion, but fix convnets weights
    • step 4: finetune the whole model.
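The positional arguments above are easy to mix up, so a small wrapper can validate them before launching the script. This is a hedged sketch rather than part of the repo: `build_train_command` is a hypothetical helper, the allowed values are taken from the parameter list above, and the paths in the usage example are placeholders.

```python
import shlex

VALID_DATASETS = ("visual_genome_1.2", "visual_genome_1.0")
VALID_NETS = ("res50", "res101")
VALID_STEPS = (1, 2, 3, 4)


def build_train_command(dataset, net, ckpt_to_init, data_dir, step):
    """Return the argument list for scripts/dense_cap_train.sh after basic checks."""
    if dataset not in VALID_DATASETS:
        raise ValueError("unknown dataset: {}".format(dataset))
    if net not in VALID_NETS:
        raise ValueError("unknown net: {}".format(net))
    if step not in VALID_STEPS:
        raise ValueError("step must be one of {}".format(VALID_STEPS))
    return [
        "bash", "scripts/dense_cap_train.sh",
        dataset, net, ckpt_to_init, data_dir, str(step),
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute your own checkpoint and data directory.
    cmd = build_train_command(
        "visual_genome_1.2", "res50", "path/to/init.ckpt", "path/to/data", 1
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to `subprocess.run(cmd, cwd=root)` with `root` set to the repository root, which matches the `cd $ROOT` step above.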

Demo

Create the directory data/demo:

$ mkdir $ROOT/data/demo

Then put the images to be tested in that directory and run:

$ cd $ROOT
$ bash scripts/dense_cap_demo.sh [ckpt_path] [vocab_path]

This creates HTML files in $ROOT/demo; open them in a browser. Alternatively, use the web-based visualizer created by karpathy:

$ cd $ROOT/vis
$ python -m SimpleHTTPServer 8181

Then point your web browser to http://localhost:8181/view_results.html.
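Note that SimpleHTTPServer is a Python 2 module. Under Python 3 the equivalent command is `python -m http.server 8181`. The same server can also be started from a script, as in the sketch below; `make_vis_server` is a hypothetical helper name, and the `directory` argument of the handler requires Python 3.7 or later.

```python
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler


def make_vis_server(directory, port=8181):
    """Create (but do not start) an HTTP server that serves files from directory."""
    # Bind the handler to the given directory instead of the current one.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return HTTPServer(("localhost", port), handler)
```

From $ROOT, calling `make_vis_server("vis").serve_forever()` serves the visualizer at the same address as above until interrupted.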

TODO:

  • Debugging.
