Attentional Pooling for Action Recognition

[project page] [paper]

If this code helps your work or research, please consider citing:

Rohit Girdhar and Deva Ramanan. Attentional Pooling for Action Recognition. Advances in Neural Information Processing Systems (NIPS), 2017.

@inproceedings{Girdhar_17b_AttentionalPoolingAction,
    title = {Attentional Pooling for Action Recognition},
    author = {Girdhar, Rohit and Ramanan, Deva},
    booktitle = {NIPS},
    year = 2017
}
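
For background: the paper approximates second-order (bilinear) pooling with a rank-1 factorization, so the score for class k reduces to the dot product of a class-specific attention map X a_k and a class-agnostic, bottom-up attention map X b, where X is the n x f matrix of spatial features. A minimal numpy sketch of that idea (the variable names are illustrative, not the repo's):

import numpy as np

def attentional_pooling_scores(X, A, b):
    # X: (n, f) spatial features from the CNN trunk
    # A: (f, K) stacked class-specific weights a_k
    # b: (f,) class-agnostic (bottom-up) attention weights
    top_down = X.dot(A)         # (n, K): one attention map per class
    bottom_up = X.dot(b)        # (n,): shared saliency map
    return top_down.T.dot(bottom_up)  # score_k = (X a_k)^T (X b)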

Pre-requisites

This code was trained and tested with

  1. CentOS 6.5
  2. Python 2.7
  3. TensorFlow 1.1.0-rc2 (6a1825e2)

Getting started

[Updated by Xiaoyu Liu]

This fork was tested on Ubuntu 16.04 with CUDA 9.0 and cuDNN 7.

Clone the code and create some directories for outputs

$ git clone --recursive https://github.com/rohitgirdhar/AttentionalPoolingAction.git
$ export ROOT=`pwd`/AttentionalPoolingAction
$ cd $ROOT/src/
$ mkdir -p expt_outputs data
$ # compile some custom ops
$ sudo ldconfig /usr/local/lib/python2.7/dist-packages/tensorflow /usr/local/cuda-9.0/lib64
$ cd custom_ops; make; cd ..
$ mkdir $ROOT/src/libs && cd $ROOT/src/libs
$ git clone --recursive git@github.com:ronghanghu/tensorflow_compact_bilinear_pooling.git
$ # modify compile.sh as shown below before running it
$ cd tensorflow_compact_bilinear_pooling/sequential_fft/
$ sh compile.sh

Update tensorflow_compact_bilinear_pooling/sequential_fft/compile.sh to:

#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

nvcc -std=c++11 -c -o sequential_batch_fft_kernel.cu.o \
  sequential_batch_fft_kernel.cu.cc \
  -D_GLIBCXX_USE_CXX11_ABI=0 \
  -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC \
  -L$TF_LIB -ltensorflow_framework

g++ -std=c++11 -shared -o ./build/sequential_batch_fft.so \
  sequential_batch_fft_kernel.cu.o \
  sequential_batch_fft.cc \
  -D_GLIBCXX_USE_CXX11_ABI=0 \
  -I $TF_INC -fPIC \
  -lcudart -lcufft -L/usr/local/cuda/lib64 \
  -L$TF_LIB -ltensorflow_framework

rm -rf sequential_batch_fft_kernel.cu.o
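
Once compile.sh succeeds, a quick way to confirm the op actually loads (run from the sequential_fft directory; the .so path follows from the script above):

# Sanity check that the custom op compiled against this TensorFlow build.
import tensorflow as tf
tf.load_op_library('./build/sequential_batch_fft.so')
print('sequential_batch_fft op loaded OK')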

[Updated finish]

Data setup

You can download the MPII tfrecord files I used from here and uncompress them onto a fast local disk. If you want to create your own tfrecords, follow the steps below, which are what I used to create the linked files.

Convert the MPII data into tfrecords. The system can also read individual JPEG files, but that needs a slightly different initial setup.

First download the MPII images and annotations, and uncompress the files.

$ cd $ROOT/utils/dataset_utils
$ # Set the paths for MPII images and annotations file in gen_tfrecord_mpii.py
$ python gen_tfrecord_mpii.py  # Will generate the tfrecord files
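
For reference, the conversion follows the standard tf.train.Example pattern; a minimal sketch is below. The feature keys and keypoint layout here are illustrative assumptions, so check gen_tfrecord_mpii.py for the exact names it uses.

import tensorflow as tf

def make_example(jpeg_bytes, label, keypoints):
    # keypoints: flat list of (x, y) pairs for the 16 MPII joints (assumed layout)
    feature = {
        'image/encoded': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[jpeg_bytes])),
        'image/class/label': tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
        'image/pose/keypoints': tf.train.Feature(
            float_list=tf.train.FloatList(value=keypoints)),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

writer = tf.python_io.TFRecordWriter('mpii_train.tfrecord')  # TF 1.x API
with open('some_mpii_image.jpg', 'rb') as f:
    writer.write(make_example(f.read(), label=3,
                              keypoints=[0.5, 0.5] * 16).SerializeToString())
writer.close()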

Keypoint labels for other datasets

While the MPII dataset comes with pose labels, I also experiment with HMDB-51 and HICO, for which pose was computed using an initial version of OpenPose. I provide the extracted keypoints here: HMDB51 and HICO.

Testing pre-trained models

First download and unzip the pretrained models into $ROOT/src/pretrained_models/. The models can be run as follows:

# Baseline model (no attention)
$ python eval.py --cfg ../experiments/001_MPII_ResNet_pretrained.yaml
# With attention
$ python eval.py --cfg ../experiments/002_MPII_ResNet_withAttention_pretrained.yaml
# With pose regularized attention
$ python eval.py --cfg ../experiments/003_MPII_ResNet_withPoseAttention_pretrained.yaml

Expected performance on MPII Validation set

Method                            mAP    Accuracy
Baseline (no attention)           26.2   33.5
With attention                    30.3   37.2
With pose regularized attention   30.6   37.8
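
For reference, mAP here is the mean over classes of per-class average precision, and accuracy is top-1. A hedged sketch of how these two numbers can be computed from saved scores (eval.py's own metric code may differ in details):

import numpy as np
from sklearn.metrics import average_precision_score

def map_and_accuracy(scores, labels, num_classes):
    # scores: (N, C) per-class scores; labels: (N,) ground-truth class ids
    onehot = np.eye(num_classes)[labels]
    aps = [average_precision_score(onehot[:, c], scores[:, c])
           for c in range(num_classes)]
    acc = (scores.argmax(axis=1) == labels).mean()
    return np.mean(aps), acc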

Training

Train a model with attentional pooling on the MPII dataset, using python train.py --cfg <path to YAML file>.

$ cd $ROOT/src
$ python train.py --cfg ../experiments/002_MPII_ResNet_withAttention.yaml
# To train the model with pose regularized attention, use the following config
$ python train.py --cfg ../experiments/003_MPII_ResNet_withPoseAttention.yaml
# To train the baseline without attention, use the following config
$ python train.py --cfg ../experiments/001_MPII_ResNet.yaml

Testing and evaluation

Test the model trained above on the validation set, using python eval.py --cfg <path to YAML file>.

$ python eval.py --cfg ../experiments/002_MPII_ResNet_withAttention.yaml
# To evaluate the model with pose regularized attention
$ python eval.py --cfg ../experiments/003_MPII_ResNet_withPoseAttention.yaml
# To evaluate the model without attention
$ python eval.py --cfg ../experiments/001_MPII_ResNet.yaml

The performance of these models should be similar to that of the pre-trained models released above.

Train + test on the final test set

This is for getting the final number on the MPII test set.

# Train on the train + val set
$ python train.py --cfg ../experiments/004_MPII_ResNet_withAttention_train+val.yaml
# Test on the test set
$ python eval.py --cfg ../experiments/004_MPII_ResNet_withAttention_train+val.yaml --save
# Convert the output into the MAT files as expected by MPII authors (requires matlab/octave)
$ cd ../utils;
$ bash convert_mpii_result_for_eval.sh ../src/expt_outputs/004_MPII_ResNet_withAttention_train+val.yaml/<filename.h5>
# Now the generated mat file can be emailed to MPII authors for test evaluation
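
With --save, eval.py writes its predictions to an HDF5 file under expt_outputs. If you want to inspect that file before converting it, a generic h5py walk works (the dataset names inside depend on eval.py, so this simply lists whatever is there):

from __future__ import print_function
import h5py, sys

# Usage: python inspect_h5.py <path to the saved .h5 file>
with h5py.File(sys.argv[1], 'r') as f:
    f.visititems(lambda name, obj: print(name, getattr(obj, 'shape', '')))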
