This repository contains the code to reproduce the main results of the following paper: "Multimodal and multiview distillation for real-time player detection on a football field". The paper can be found here: Paper. This work will be presented at the 6th International Workshop on Computer Vision in Sports (CVsports) at CVPR 2020.
@InProceedings{Cioppa2020Multimodal,
author = {Cioppa, Anthony and Deliège, Adrien and Huda, Noor Ul and Gade, Rikke and Van Droogenbroeck, Marc and Moeslund, Thomas B.},
title = {Multimodal and multiview distillation for real-time player detection on a football field},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020}
}
The objective of this work is to monitor the occupancy of a football field using player detection on a thermal and a fisheye camera. We train a network with an online knowledge distillation approach, as in our previous work ARTHuS (GitHub link). The main difference with our previous work is that the student and the teacher operate on different modalities and have different views of the same scene. In particular, we design a custom data augmentation, combined with a motion detection algorithm, to handle the training in the region of the fisheye view that is not covered by the thermal camera. We show that our solution effectively detects players on the whole field filmed by the fisheye camera.
This repository provides every module needed to reproduce the main results of the paper on our dataset: the network architecture, the losses, the training procedure and the evaluation.
For more information about the content of the paper, check out our presentation video. To see more of our work, subscribe to our YouTube channel Acad AI Research.
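At its core, the online distillation loop turns the teacher's detections on the thermal camera into training targets for the student on the fisheye view, and restricts the loss to regions where supervision is reliable (thermal coverage, or motion-based targets elsewhere). Below is a minimal illustrative sketch of such a masked distillation step; all names are ours and this is not the repository's actual training code.

```python
# Illustrative sketch of one masked online distillation step (not the actual
# repository code). The teacher's detections are assumed to have been rendered
# into a target heatmap, and valid_mask marks the pixels where supervision is
# trusted (thermal coverage or motion-based pseudo-labels).
import torch.nn.functional as F

def distillation_step(student, fisheye_frame, teacher_target, valid_mask, optimizer):
    # fisheye_frame: (1, 3, H, W); teacher_target and valid_mask: (1, 1, H, W)
    optimizer.zero_grad()
    prediction = student(fisheye_frame)
    per_pixel = F.binary_cross_entropy_with_logits(
        prediction, teacher_target, reduction="none")
    # Average the loss only over the trusted pixels.
    loss = (per_pixel * valid_mask).sum() / valid_mask.sum().clamp(min=1.0)
    loss.backward()
    optimizer.step()
    return loss.item()
```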
The following instructions will help you install the required libraries and the dataset to run the code. The code runs in Python 3 and was tested inside an nvidia-docker container with the following base image: pytorch:18.02-py3, which can be found in the NVIDIA NGC image repository, as well as in a conda environment.
Whether you are using the docker image or not, here are the versions of the libraries that are used:
h5py 2.9.0
matplotlib 3.0.3
natsort 6.0.0
numpy 1.16.3
opencv-python 4.1.0.25
pandas 0.22.0
Pillow 6.0.0
scikit-image 0.15.0
scikit-learn 0.20.3
scipy 1.0.1
torch 1.0.1.post2
torchvision 0.2.2.post3
tqdm 4.23.1
If you are using the nvidia-docker, you can follow these steps to instantiate the docker and install the libraries:
In our case, we used the following command to create the docker container. Note that you will need to replace /path/to/your/directory/ by the path to one of your directories and path/to/the/docker/image by the path to the docker image (in our case, nvcr.io/nvidia/pytorch:18.02-py3). Note that you can select the GPUs to use by changing the index(es) of the NV_GPU variable.
NV_GPU=0 nvidia-docker run --name Multimodal_Multiview_Distillation -it --rm --shm-size=1g --ulimit memlock=-1 -v /path/to/your/directory/:/workspace/generic path/to/the/docker/image
To install the code and libraries, simply run:
1. git clone https://github.com/cioppaanthony/multimodal-multiview-distillation
2. cd multimodal-multiview-distillation
3. bash docker_install.sh
At this step, all the required libraries are installed. Note that outside of a docker, sudo permission can be required to install the libraries in the docker_install.sh file.
If you are using conda, simply follow these steps to create the environment and install the required libraries:
conda create -n multimodal_multiview_distillation
conda activate multimodal_multiview_distillation
conda install python=3.7 pip cudnn cudatoolkit=10.1
pip install numpy==1.16.3 tqdm==4.23.1 h5py==2.9.0 matplotlib==3.0.3 opencv-python-headless==4.1.0.25 opencv-contrib-python-headless==4.1.0.25 torch==1.0.1.post2 torchvision==0.2.2.post3 natsort==6.0.0 pandas==0.22.0 Pillow==6.0.0 scikit-image==0.15.0 scikit-learn==0.20.3 scipy==1.0.1
The dataset needs to be installed in the data folder. We provide a script for automatically downloading the dataset from a Google Drive repository. The dataset is shared only for scientific research purposes. Please see the License file that comes along with the dataset.
pip install -U pip setuptools
pip install gdown
sudo apt-get install zip
bash dataset_install.sh
The dataset contains:
- The fisheye video of the players on the field, at 12 fps.
- The thermal video, re-sampled at 12 fps and temporally synchronized with the fisheye video.
- A JSON file containing the bounding boxes of the teacher network on the thermal camera, pre-computed by a YOLO network from a previous work.
- The pre-computed ViBe masks (the background subtraction algorithm) on the fisheye video, in image format. These masks have been obtained with our PyTorch implementation of ViBe, which will also be shared on our GitHub this summer.
- The field and calibration masks.
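As an illustration, the following hedged sketch shows how these pieces could be read in Python; the file names and JSON structure used below are placeholders, not the dataset's guaranteed layout.

```python
# Hedged loading sketch: paths and the JSON structure are placeholders.
import json
import cv2

fisheye = cv2.VideoCapture("data/fisheye_video.mp4")   # fisheye video at 12 fps
thermal = cv2.VideoCapture("data/thermal_video.mp4")   # synchronized thermal video

# Teacher bounding boxes pre-computed on the thermal stream
with open("data/teacher_boxes.json") as f:
    teacher_boxes = json.load(f)

# One pre-computed ViBe motion mask (stored as images)
vibe_mask = cv2.imread("data/vibe_masks/frame_000000.png", cv2.IMREAD_GRAYSCALE)

ret, fisheye_frame = fisheye.read()
if ret and vibe_mask is not None:
    print(fisheye_frame.shape, vibe_mask.shape)
```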
The code for the online distillation is located inside the performance_benchmark folder under the name main.py.
To run the code, simply go to the performance_benchmark folder and run the following command:
cd performance_benchmark
python3 main.py
This code will save its results in the output folder. At each run, it creates an experiment_i folder containing three sub-folders:
- student_outputs: a JSON file containing the output bounding boxes of the student network on the fisheye camera.
- networks: the last weights of the network during the online distillation.
- teacher_output: if requested by the --outteacher 1 argument, the outputs of the teacher network on the fisheye and thermal images.
Note that many other arguments are available for this command in the utils/argument_parser.py file to control the training.
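For example, to also save the teacher outputs during the run:
python3 main.py --outteacher 1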
The detections are stored in JSON format and must be passed to the following command line to apply the background subtraction post-processing and get the final video with the detected players overlaid. This command must also be run from the performance_benchmark folder:
python3 bgs_postprocessing.py --fisheyebbox path/to/the/json/detection/file -s /path/to/save/results/
This code also saves the final bounding boxes for the evaluation in the detections_with_BGS.json file. By default, it is saved in the output folder.
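For reference, the idea behind this post-processing can be sketched as follows: a detected bounding box is kept only if the motion mask produced by the background subtraction contains enough foreground pixels inside it. This is an illustrative snippet with assumed names and threshold, not the actual bgs_postprocessing.py implementation.

```python
# Illustrative motion-based box filtering (assumed names and threshold).
def filter_boxes_with_motion(boxes, motion_mask, min_motion_ratio=0.05):
    # boxes: list of (x1, y1, x2, y2); motion_mask: (H, W) array, >0 where motion.
    kept = []
    for x1, y1, x2, y2 in boxes:
        region = motion_mask[int(y1):int(y2), int(x1):int(x2)]
        if region.size == 0:
            continue
        if (region > 0).mean() >= min_motion_ratio:
            kept.append((x1, y1, x2, y2))
    return kept
```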
Finally, we provide a way to compute the same graphs as in our paper (Figure 8), evaluating the quality of the detected players during the online distillation phase, post-processed by the background subtraction algorithm. By default, the graph is saved in the output folder.
python3 evaluate.py --fisheyebbox path/to/the/json/detection/file/detections_with_bgs.json -s /path/to/save/the/graph/
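For context, an IoU-based matching in the spirit of this evaluation could look like the sketch below; evaluate.py may compute its metrics differently, and the functions here are only illustrative.

```python
# Illustrative IoU matching and F1 computation (not the repository's evaluate.py).
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def f1_score(detections, ground_truth, threshold=0.5):
    # Greedily match each detection to the best unmatched ground-truth box.
    matched, tp = set(), 0
    for det in detections:
        candidates = [i for i in range(len(ground_truth)) if i not in matched]
        if not candidates:
            break
        best = max(candidates, key=lambda i: iou(det, ground_truth[i]))
        if iou(det, ground_truth[best]) >= threshold:
            matched.add(best)
            tp += 1
    precision = tp / len(detections) if detections else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```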
- Anthony Cioppa, University of Liège (ULiège).
- Adrien Deliège, University of Liège (ULiège).
- Noor Ul Huda, Aalborg University (AAU).
See the AUTHORS file for details.
GNU GENERAL PUBLIC LICENSE Version 3
See the LICENSE file for details.
- Anthony Cioppa is funded by the FRIA, Belgium.
- This work is supported by the DeepSport project of the Walloon Region, at the University of Liège (ULiège), Belgium.
- Many thanks to Aalborg University and all the staff of the Visual Analysis of People Laboratory for their warm welcome during this research stay.