This is the source code for "Slicing Convolutional Neural Network for Crowd Video Understanding". It aims at learning generic spatio-temporal features from crowd videos, especially for long-term temporal learning (i.e. 100 frames).
- Three-branch Slicing CNN model (i.e. xy-, xt-, and yt-branch)
- Crowd attribute recognition (i.e. 94 crowd-related attributes)
- A fork of the well-known Caffe framework with multi-GPU training and a Dimension Swap layer.
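To make the three branches concrete, here is a minimal NumPy sketch of how one video volume can be viewed as xy-, xt-, and yt-slices by swapping dimensions. It illustrates the idea only and is not the repository's Dimension Swap layer. The function name `slice_views`, the `(T, H, W)` layout, and the toy volume size are assumptions made for this example.

```python
import numpy as np

def slice_views(volume):
    """Return xy-, xt-, and yt-ordered views of a (T, H, W) volume."""
    # xy view: a stack of T frames, each H x W (the original layout).
    xy = volume
    # xt view: a stack of H slices, each T x W (time against the horizontal axis).
    xt = np.transpose(volume, (1, 0, 2))
    # yt view: a stack of W slices, each T x H (time against the vertical axis).
    yt = np.transpose(volume, (2, 0, 1))
    return xy, xt, yt

# Hypothetical toy volume: 100 frames of 6 x 8 pixels.
video = np.arange(100 * 6 * 8).reshape(100, 6, 8)
xy, xt, yt = slice_views(video)
print(xy.shape, xt.shape, yt.shape)
```

Each view holds the same values; only the axis order changes, so a 2D filter applied to the leading-axis slices sees a different pair of dimensions in each branch.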
Apart from the official Caffe installation prerequisites, there are several other dependencies:
- Install openmpi to allow multi-GPU running
- Python packages (e.g. numpy, scipy, scikit-image)
- Add export PYTHONPATH="[path_python_layer]:$PYTHONPATH" to ~/.bashrc and restart the terminal. Here [path_python_layer] indicates the absolute path of the directory containing the Python script py_dim_swap_layer.py.
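After restarting the terminal, a quick check like the following can confirm that Python is able to locate the layer module. This is a sketch that uses only the standard library; the helper name `module_on_path` is made up for this example, and the check assumes the module is importable under the name `py_dim_swap_layer`.

```python
import importlib.util

def module_on_path(name):
    """Return True if a top-level module `name` can be located on the import path."""
    return importlib.util.find_spec(name) is not None

if module_on_path("py_dim_swap_layer"):
    print("py_dim_swap_layer found")
else:
    print("py_dim_swap_layer not found; check PYTHONPATH in ~/.bashrc")
```

`find_spec` only locates the module without importing it, so the check does not require Caffe's Python bindings to load.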
Get the Caffe code
git clone --recursive https://github.com/amandajshao/Slicing-CNN.git
- The WWW dataset, introduced at CVPR 2015, contains 10,000 crowd videos from 8,257 different crowded scenes, annotated with 94 attributes.
- The LMDB data used by the model, with training/validation/test splits.
- CNN Initial Model
  The initial model (VGG-16) is pre-trained on the UCF-101 action dataset (single frame) and fine-tuned on the WWW dataset (single frame).
  BaiduDisk link: http://pan.baidu.com/s/1jH5VLNw (password: 76zl)
- CNN Best Model
  Three models: SCNN-xy, SCNN-xt, SCNN-yt.
  BaiduDisk link: http://pan.baidu.com/s/1pK7h5sJ (password: j024)
- Prototxt
  The prototxt files correspond to the three models above (SCNN-xy/-xt/-yt).
  BaiduDisk link: http://pan.baidu.com/s/1o85xUI2 (password: mwvo)
- Scripts
  Two scripts are provided with the code: model_run.sh and extract_features.sh.
Deeply Learned Attributes for Crowd Scene Understanding
J. Shao, C. C. Loy, K. Kang, and X. Wang. Slicing Convolutional Neural Network for Crowd Video Understanding. Computer Vision and Pattern Recognition (CVPR), 2016.
@inproceedings{shao2016scnn,
title={Slicing Convolutional Neural Network for Crowd Video Understanding},
author={Shao, Jing and Loy, Chen Change and Kang, Kai and Wang, Xiaogang},
booktitle={Computer Vision and Pattern Recognition (CVPR)},
year={2016}
}