[project page] [paper]
If this code helps with your work or research, please consider citing:
Rohit Girdhar and Deva Ramanan. Attentional Pooling for Action Recognition. Advances in Neural Information Processing Systems (NIPS), 2017.
@inproceedings{Girdhar_17b_AttentionalPoolingAction,
title = {Attentional Pooling for Action Recognition},
author = {Girdhar, Rohit and Ramanan, Deva},
booktitle = {NIPS},
year = 2017
}
This code was trained and tested with
- CentOS 6.5
- Python 2.7
- TensorFlow 1.1.0-rc2 (6a1825e2)
[Updated by Xiaoyu Liu]
My machine runs Ubuntu 16.04 with CUDA 9.0 and cuDNN 7.
Clone the code and create some directories for outputs
$ git clone --recursive https://github.com/rohitgirdhar/AttentionalPoolingAction.git
$ export ROOT=`pwd`/AttentionalPoolingAction
$ cd $ROOT/src/
$ mkdir -p expt_outputs data
$ # compile some custom ops
$ sudo ldconfig /usr/local/lib/python2.7/dist-packages/tensorflow /usr/local/cuda-9.0/lib64
$ cd custom_ops; make; cd ..
$ mkdir $ROOT/src/libs && cd $ROOT/src/libs
$ git clone --recursive git@github.com:ronghanghu/tensorflow_compact_bilinear_pooling.git
$ # modify the compile.sh
$ cd tensorflow_compact_bilinear_pooling/sequential_fft/
$ sh compile.sh
Before running it, update tensorflow_compact_bilinear_pooling/sequential_fft/compile.sh to:
#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
nvcc -std=c++11 -c -o sequential_batch_fft_kernel.cu.o \
sequential_batch_fft_kernel.cu.cc \
-D_GLIBCXX_USE_CXX11_ABI=0 \
-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC \
-L$TF_LIB -ltensorflow_framework
g++ -std=c++11 -shared -o ./build/sequential_batch_fft.so \
sequential_batch_fft_kernel.cu.o \
sequential_batch_fft.cc \
-D_GLIBCXX_USE_CXX11_ABI=0 \
-I $TF_INC -fPIC \
-lcudart -lcufft -L/usr/local/cuda/lib64 \
-L$TF_LIB -ltensorflow_framework
rm -rf sequential_batch_fft_kernel.cu.o
[Updated finish]
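After running compile.sh, it can help to confirm that the shared library was actually produced before moving on. The following is a minimal sketch, not part of the repository: the helper names `compiled_op_path` and `is_op_compiled` are hypothetical, and the path is assembled from the clone location and the `-o ./build/sequential_batch_fft.so` target shown in the steps above.

```python
from __future__ import print_function
import os

# Location of the compiled op relative to the checkout, assembled from the
# clone and compile steps above (an assumption about your directory layout).
SO_RELATIVE_PATH = os.path.join(
    "src", "libs", "tensorflow_compact_bilinear_pooling",
    "sequential_fft", "build", "sequential_batch_fft.so")


def compiled_op_path(root):
    """Return the expected path of the compiled FFT op under `root`."""
    return os.path.join(root, SO_RELATIVE_PATH)


def is_op_compiled(root):
    """Return True if a non-empty shared library exists at the expected path."""
    path = compiled_op_path(root)
    return os.path.isfile(path) and os.path.getsize(path) > 0


if __name__ == "__main__":
    # Falls back to the current directory if $ROOT is not exported.
    root = os.environ.get("ROOT", os.getcwd())
    status = "found" if is_op_compiled(root) else "missing"
    print(compiled_op_path(root), status)
```

If the file is reported missing, one thing worth checking is whether the build directory existed when compile.sh ran, since g++ does not create output directories. Once the file exists, `tf.load_op_library(path)` is the usual TensorFlow call for loading a custom op library.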
You can download the tfrecord files for MPII that I used from here and uncompress them onto a fast local disk.
If you want to create your own tfrecords, you can use the following steps, which are what I used to create the linked tfrecord files.
Convert the MPII data into tfrecords. The system can also read from individual JPEG files, but that needs a slightly different initial setup.
First download the MPII images and annotations, and uncompress the files.
$ cd $ROOT/utils/dataset_utils
$ # Set the paths for MPII images and annotations file in gen_tfrecord_mpii.py
$ python gen_tfrecord_mpii.py # Will generate the tfrecord files
While the MPII dataset comes with pose labels, I also experiment with HMDB-51 and HICO, for which pose was computed using an initial version of OpenPose. I provide the extracted keypoints here: HMDB51 and HICO.
First download the pretrained models and unzip them to $ROOT/src/pretrained_models/.
The models can then be evaluated with:
# Baseline model (no attention)
$ python eval.py --cfg ../experiments/001_MPII_ResNet_pretrained.yaml
# With attention
$ python eval.py --cfg ../experiments/002_MPII_ResNet_withAttention_pretrained.yaml
# With pose regularized attention
$ python eval.py --cfg ../experiments/003_MPII_ResNet_withPoseAttention_pretrained.yaml
Method | mAP | Accuracy |
---|---|---|
Baseline (no attention) | 26.2 | 33.5 |
With attention | 30.3 | 37.2 |
With pose regularized attention | 30.6 | 37.8 |
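The table reports two different metrics, and they can rank models differently. As an illustration only, here is a minimal sketch of how mean average precision (mAP) and top-1 accuracy are commonly computed from per-class scores. The function names and the toy data are hypothetical, and the evaluation code in this repository may differ in details such as tie handling or interpolation.

```python
from __future__ import division, print_function


def average_precision(scores, is_positive):
    """AP for one class: mean of precision@k over the ranks k holding a positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, idx in enumerate(order, 1):
        if is_positive[idx]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_rows, true_classes):
    """Average the per-class AP over classes that have at least one example."""
    aps = []
    for c in range(len(score_rows[0])):
        positives = [t == c for t in true_classes]
        if any(positives):
            class_scores = [row[c] for row in score_rows]
            aps.append(average_precision(class_scores, positives))
    return sum(aps) / len(aps) if aps else 0.0


def accuracy(score_rows, true_classes):
    """Fraction of examples whose highest-scoring class is the true class."""
    correct = 0
    for row, t in zip(score_rows, true_classes):
        predicted = max(range(len(row)), key=lambda c: row[c])
        correct += int(predicted == t)
    return correct / len(true_classes)


# Toy data: three examples, two classes (made up for illustration).
scores = [[0.5, 0.4], [0.9, 0.1], [0.2, 0.8]]
labels = [0, 1, 1]
print("mAP: %.3f" % mean_average_precision(scores, labels))
print("accuracy: %.3f" % accuracy(scores, labels))
```

mAP depends on how each class ranks all examples, while accuracy only looks at the top prediction per example, which is why the two columns need not move together.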
Train an attentional pooling model on the MPII dataset using python train.py --cfg <path to YAML file>.
$ cd $ROOT/src
$ python train.py --cfg ../experiments/002_MPII_ResNet_withAttention.yaml
# To train the model with pose regularized attention, use the following config
$ python train.py --cfg ../experiments/003_MPII_ResNet_withPoseAttention.yaml
# To train the baseline without attention, use the following config
$ python train.py --cfg ../experiments/001_MPII_ResNet.yaml
Test the model trained above on the validation set using python eval.py --cfg <path to YAML file>.
$ python eval.py --cfg ../experiments/002_MPII_ResNet_withAttention.yaml
# To evaluate the model with pose regularized attention
$ python eval.py --cfg ../experiments/003_MPII_ResNet_withPoseAttention.yaml
# To evaluate the model without attention
$ python eval.py --cfg ../experiments/001_MPII_ResNet.yaml
The performance of these models should be similar to that of the pre-trained models released above.
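Since training and evaluation use the same config file, the two steps can be chained per experiment. The sketch below is a hypothetical convenience wrapper, not a script shipped with the repository. It assumes it is launched with $ROOT/src as the working directory for the commands and uses only the `train.py`, `eval.py`, and `--cfg` names shown above.

```python
from __future__ import print_function
import subprocess

# Experiment names taken from the config files listed above.
EXPERIMENTS = [
    "001_MPII_ResNet",
    "002_MPII_ResNet_withAttention",
    "003_MPII_ResNet_withPoseAttention",
]


def experiment_commands(name, experiments_dir="../experiments"):
    """Return the train and eval commands for one experiment config."""
    cfg = experiments_dir + "/" + name + ".yaml"
    return [
        ["python", "train.py", "--cfg", cfg],
        ["python", "eval.py", "--cfg", cfg],
    ]


def run_experiment(name, src_dir=".", dry_run=True):
    """Print each command; execute it from `src_dir` only when dry_run is False."""
    for cmd in experiment_commands(name):
        print(" ".join(cmd))
        if not dry_run:
            # check_call raises if the command exits with a non-zero status,
            # so evaluation is skipped when training fails.
            subprocess.check_call(cmd, cwd=src_dir)


if __name__ == "__main__":
    for experiment in EXPERIMENTS:
        run_experiment(experiment, dry_run=True)
```

The default `dry_run=True` only prints the commands, which makes it easy to check the config paths before committing to a long training run.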
Use the following steps to obtain the final numbers on the MPII test set.
# Train on the train + val set
$ python train.py --cfg ../experiments/004_MPII_ResNet_withAttention_train+val.yaml
# Test on the test set
$ python eval.py --cfg ../experiments/004_MPII_ResNet_withAttention_train+val.yaml --save
# Convert the output into the MAT files as expected by MPII authors (requires matlab/octave)
$ cd ../utils;
$ bash convert_mpii_result_for_eval.sh ../src/expt_outputs/004_MPII_ResNet_withAttention_train+val.yaml/<filename.h5>
# Now the generated MAT file can be emailed to the MPII authors for test evaluation
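The conversion step above needs the path of the HDF5 file written by `eval.py --save`, whose name is left as a placeholder. A small sketch for listing candidate files follows. The helper `find_h5_outputs` is hypothetical, and it assumes the outputs sit directly inside the experiment's directory under expt_outputs, as the command above suggests.

```python
from __future__ import print_function
import os


def find_h5_outputs(src_dir,
                    experiment="004_MPII_ResNet_withAttention_train+val.yaml"):
    """Return sorted paths of .h5 files in src_dir/expt_outputs/<experiment>/."""
    out_dir = os.path.join(src_dir, "expt_outputs", experiment)
    if not os.path.isdir(out_dir):
        return []
    return sorted(
        os.path.join(out_dir, name)
        for name in os.listdir(out_dir)
        if name.endswith(".h5")
    )


if __name__ == "__main__":
    # Assumes the script is run from $ROOT/src.
    for path in find_h5_outputs("."):
        print(path)
```

Any path it prints can be passed to convert_mpii_result_for_eval.sh in place of the placeholder. If several files are listed, you still need to choose the one that corresponds to the test-set run.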