TransCenter: Transformers with Dense Queries for Multiple-Object Tracking
Yihong Xu, Yutong Ban, Guillaume Delorme, Chuang Gan, Daniela Rus, Xavier Alameda-Pineda
[Paper] [Project]
If you find this code useful, please star the project and consider citing:
@misc{xu2021transcenter,
title={TransCenter: Transformers with Dense Queries for Multiple-Object Tracking},
author={Yihong Xu and Yutong Ban and Guillaume Delorme and Chuang Gan and Daniela Rus and Xavier Alameda-Pineda},
year={2021},
eprint={2103.15145},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
We provide two Singularity images (similar to Docker) containing all the packages needed for TransCenter:
- Install Singularity (version > 3.7.1): https://sylabs.io/guides/3.0/user-guide/installation.html#install-on-linux
- Download one of the singularity images:
pytorch1-5cuda10-1.sif: tested with Nvidia GTX TITAN; or
pytorch1-5cuda10-1_RTX.sif: tested with Nvidia RTX TITAN, Quadro RTX 8000, RTX 2080Ti, and Quadro RTX 4000.
- Launch a Singularity image:
singularity shell --nv --bind yourLocalPath:yourPathInsideImage YourSingularityImage.sif
- --bind: links a path inside the image to a local path, so that data on your local machine is visible inside the Singularity image;
- --nv: use the local Nvidia driver.
You can also build your own environment:
- We use Anaconda (4.9.2) to simplify package installation; you can download it here: https://www.anaconda.com/products/individual
- Create your conda environment with:
conda create --name <YourEnvName> --file requirements.txt
- TransCenter uses the deformable transformer from Deformable DETR, so the deformable attention modules must be compiled:
cd ./to_install/ops
sh ./make.sh
# unit test (all checks should print True)
python test.py
- TransCenter uses pytorch-liteflownet during tracking, which depends on correlation_package. Install it with:
cd ./to_install/correlation_package
python setup.py install
- The up-scale-and-merge module in TransCenter uses deformable convolutions (DCNv2); install them with:
cd ./to_install/DCNv2
./make.sh # build
python testcpu.py # run examples and gradient check on cpu
python testcuda.py # run examples and gradient check on gpu
See also the known issues at https://github.com/CharlesShang/DCNv2. If you run into CUDA issues with the third-party modules, try recompiling them on the GPU you use for training and testing. The dependencies are compatible with PyTorch 1.5 and CUDA 10.2.
MS COCO: we use only the person category for pretraining TransCenter. The filtering code is provided in ./data/coco_person.py; a minimal sketch of the idea follows the citation below.
@inproceedings{lin2014microsoft,
title={Microsoft coco: Common objects in context},
author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
booktitle={European conference on computer vision},
pages={740--755},
year={2014},
organization={Springer}
}
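For reference, here is a minimal sketch of the person-only filtering idea. It is not the exact ./data/coco_person.py script, and the annotation paths are examples assuming standard MS COCO files:

```python
# Hedged sketch of person-only COCO filtering (example paths, not the
# exact ./data/coco_person.py script).
import json

src = "annotations/instances_train2017.json"          # example input
dst = "annotations/instances_train2017_person.json"   # example output

with open(src) as f:
    coco = json.load(f)

# Keep only the "person" category and its annotations.
person_ids = {c["id"] for c in coco["categories"] if c["name"] == "person"}
coco["categories"] = [c for c in coco["categories"] if c["id"] in person_ids]
coco["annotations"] = [a for a in coco["annotations"]
                       if a["category_id"] in person_ids]

# Drop images that no longer contain any annotation.
kept = {a["image_id"] for a in coco["annotations"]}
coco["images"] = [im for im in coco["images"] if im["id"] in kept]

with open(dst, "w") as f:
    json.dump(coco, f)
```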
CrowdHuman: the CrowdHuman labels are converted to COCO format with ./data/convert_crowdhuman_to_coco.py; a sketch of the conversion follows the citation below.
@article{shao2018crowdhuman,
title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
journal={arXiv preprint arXiv:1805.00123},
year={2018}
}
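For reference, a hedged sketch of the odgt-to-COCO conversion. The real script is ./data/convert_crowdhuman_to_coco.py; the file names and the choice of the full-body box ("fbox") here are assumptions:

```python
# Hedged sketch of CrowdHuman odgt -> COCO conversion (file names and the
# use of the full-body box "fbox" are assumptions; see the real script in
# ./data/convert_crowdhuman_to_coco.py).
import json

out = {"images": [], "annotations": [],
       "categories": [{"id": 1, "name": "person"}]}
ann_id = 0
with open("annotation_train.odgt") as f:
    for img_id, line in enumerate(f):
        rec = json.loads(line)  # one JSON record per line
        out["images"].append({"id": img_id,
                              "file_name": rec["ID"] + ".jpg"})
        for gt in rec["gtboxes"]:
            if gt["tag"] != "person":  # skip mask/ignore regions
                continue
            x, y, w, h = gt["fbox"]    # full-body box
            out["annotations"].append({
                "id": ann_id, "image_id": img_id, "category_id": 1,
                "bbox": [x, y, w, h], "area": w * h, "iscrowd": 0,
            })
            ann_id += 1

with open("crowdhuman_train_coco.json", "w") as f:
    json.dump(out, f)
```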
MOT17: the MOT17 labels are converted to COCO format with ./data/convert_mot_to_coco.py; the MOT ground-truth format is sketched after the MOT20 citation below.
@article{milan2016mot16,
title={MOT16: A benchmark for multi-object tracking},
author={Milan, Anton and Leal-Taix{\'e}, Laura and Reid, Ian and Roth, Stefan and Schindler, Konrad},
journal={arXiv preprint arXiv:1603.00831},
year={2016}
}
MOT20: the MOT20 labels are converted to COCO format with ./data/convert_mot20_to_coco.py.
@article{dendorfer2020mot20,
title={Mot20: A benchmark for multi object tracking in crowded scenes},
author={Dendorfer, Patrick and Rezatofighi, Hamid and Milan, Anton and Shi, Javen and Cremers, Daniel and Reid, Ian and Roth, Stefan and Schindler, Konrad and Leal-Taix{\'e}, Laura},
journal={arXiv preprint arXiv:2003.09003},
year={2020}
}
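Both MOT converters read the standard MOTChallenge gt/gt.txt files. Below is a hedged sketch of parsing that format; the sequence path is an example, and the row layout follows the public MOTChallenge specification:

```python
# Hedged sketch of parsing MOTChallenge ground truth (example sequence
# path). Each gt.txt row is:
# frame, track_id, x, y, w, h, conf_flag, class, visibility
import csv

annotations = []
with open("MOT17/train/MOT17-02-SDP/gt/gt.txt") as f:
    for row in csv.reader(f):
        frame, track_id = int(row[0]), int(row[1])
        x, y, w, h = map(float, row[2:6])
        flag, cls = int(row[6]), int(row[7])
        if flag == 0 or cls != 1:  # keep only valid pedestrian boxes
            continue
        annotations.append({"image_id": frame, "track_id": track_id,
                            "category_id": 1, "bbox": [x, y, w, h],
                            "area": w * h})
print(len(annotations), "boxes kept")
```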
We also provide the filtered/converted labels:
MS COCO person labels: put the annotations folder inside cocoperson into your MS COCO dataset root folder.
CrowdHuman COCO-format labels: put the annotations folder inside crowdhuman into your CrowdHuman dataset root folder.
MOT17 COCO-format labels: put the annotations and annotations_onlySDP folders inside MOT17 into your MOT17 dataset root folder.
MOT20 COCO-format labels: put the annotations folder inside MOT20 into your MOT20 dataset root folder.
deformable transformer pretrained: the pretrained model from Deformable-DETR.
coco_pretrained: model trained on the COCO person dataset.
CH_pretrained: model pretrained on COCO person and fine-tuned on the CrowdHuman dataset.
MOT17_fromCoCo: model pretrained on COCO person and fine-tuned on the MOT17 train set.
MOT17_fromCH: model pretrained on CrowdHuman and fine-tuned on the MOT17 train set.
MOT20_fromCoCo: model pretrained on COCO person and fine-tuned on the MOT20 train set.
MOT20_fromCH: model pretrained on CrowdHuman and fine-tuned on the MOT20 train set.
Please put all the pretrained models into ./model_zoo (a minimal checkpoint-loading sketch follows).
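A minimal loading sketch, assuming the usual Deformable-DETR-style checkpoint layout where the weights sit under a "model" key (the model construction itself is not shown):

```python
# Hedged sketch: inspect and restore a downloaded checkpoint. The "model"
# key follows the Deformable-DETR checkpoint layout; fall back to the raw
# dict if it is absent.
import torch

ckpt = torch.load("./model_zoo/coco_pretrained.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)
print(f"{len(state_dict)} parameter tensors in the checkpoint")
# To restore into a built TransCenter model (construction not shown):
# missing, unexpected = model.load_state_dict(state_dict, strict=False)
```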
- Pretrain on the COCO person dataset:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=4 --use_env ./training/transcenter/main_coco_tracking.py --output_dir=./output/whole_coco --batch_size=4 --num_workers=20 --resume=./model_zoo/r50_deformable_detr-checkpoint.pth --pre_hm --tracking --data_dir=PathToCoCoDataset
- Pretrain on the CrowdHuman dataset:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=4 --use_env ./training/transcenter/main_crowdHuman_tracking.py --output_dir=./output/whole_ch_from_COCO --batch_size=4 --num_workers=20 --resume=./model_zoo/coco_pretrained.pth --pre_hm --tracking --data_dir=PathToCrowdHumanDataset
- Train MOT17 from the COCO-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot17_tracking.py --output_dir=./output/whole_MOT17_from_COCO --batch_size=2 --num_workers=20 --resume=./model_zoo/coco_pretrained.pth --pre_hm --tracking --same_aug_pre --image_blur_aug --data_dir=PathToMOT17dataset
- Train MOT17 from the CrowdHuman-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot17_tracking.py --output_dir=./output/whole_MOT17_from_CH --batch_size=2 --num_workers=20 --resume=./model_zoo/CH_pretrained.pth --pre_hm --tracking --same_aug_pre --image_blur_aug --data_dir=PathToMOT17dataset
- Train MOT20 from the COCO-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot20_tracking.py --output_dir=./output/whole_MOT20_from_COCO --batch_size=2 --num_workers=20 --resume=./model_zoo/coco_pretrained.pth --pre_hm --tracking --same_aug_pre --image_blur_aug --not_max_crop --data_dir=PathToMOT20dataset
- Train MOT20 from the CrowdHuman-pretrained model:
cd TransCenter_official
python -m torch.distributed.launch --nproc_per_node=2 --use_env ./training/transcenter/main_mot20_tracking.py --output_dir=./output/whole_MOT20_from_CH --batch_size=2 --num_workers=20 --resume=./model_zoo/CH_pretrained.pth --pre_hm --tracking --same_aug_pre --image_blur_aug --not_max_crop --data_dir=PathToMOT20dataset
Tips:
- If you encounter RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR on some GPUs, try setting torch.backends.cudnn.benchmark=False. In most cases, setting torch.backends.cudnn.benchmark=True is more memory-efficient.
- Depending on your environment and GPUs, you might experience MOTA jitter in your final models.
- You may see training noise during fine-tuning, especially when training on MOT17/MOT20 from well-pretrained models. You can reduce the learning rate to 1/10 of its value, apply early stopping, or increase the batch size on GPUs with more memory.
- If you run into GPU memory issues, lower the training and evaluation batch size in main_****.py, freeze the ResNet backbone, and start from our COCO/CH pretrained models. Sketches for the cuDNN setting and backbone freezing are shown below.
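Both tips are standard PyTorch; in the sketch below, the "backbone" attribute name is an assumption about the model object:

```python
# Standard PyTorch sketches for the tips above; the "backbone" attribute
# name is an assumption about the model object.
import torch

# Work around CUDNN_STATUS_INTERNAL_ERROR on some GPUs.
torch.backends.cudnn.benchmark = False

def freeze_backbone(model: torch.nn.Module) -> None:
    """Freeze the ResNet backbone to reduce memory use during fine-tuning."""
    for p in model.backbone.parameters():
        p.requires_grad = False
```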
Using Public detections:
- MOT17:
cd TransCenter_official
python ./tracking/transcenter/mot17_pub.py --data_dir=YourMOT17Path
- MOT20:
cd TransCenter_official
python ./tracking/transcenter/mot20_pub.py --data_dir=YourMOT20Path
Using Private detections:
- MOT17:
cd TransCenter_official
python ./tracking/transcenter/mot17_private.py --data_dir=YourMOT17Path
- MOT20:
cd TransCenter_official
python ./tracking/transcenter/mot20_private.py --data_dir=YourMOT20Path
Notes:
- We recently corrected an image-loading bug that affected certain images with an aspect ratio close to 1 (in MOT20), which improves MOT20 performance.
- You can test your own model by changing model_path inside mot17[20]_private[pub].py.
MOT17 public detections:
Pretrained | MOTA | MOTP | IDF1 | FP | FN | IDS |
---|---|---|---|---|---|---|
CoCo | 68.8% | 79.9% | 61.4% | 22,860 | 149,188 | 4,102 |
CH | 71.9% | 81.4% | 62.3% | 17,378 | 137,008 | 4,046 |
MOT20 public detections:
Pretrained | MOTA | MOTP | IDF1 | FP | FN | IDS |
---|---|---|---|---|---|---|
CoCo | 61.0% | 79.5% | 49.8% | 49,189 | 147,890 | 4,493 |
CH | 62.3% | 79.9% | 50.3% | 43,006 | 147,505 | 4,545 |
MOT17 private detections:
Pretrained | MOTA | MOTP | IDF1 | FP | FN | IDS |
---|---|---|---|---|---|---|
CoCo | 70.0% | 79.6% | 62.1% | 28,119 | 136,722 | 4,647 |
CH | 73.2% | 81.1% | 62.2% | 23,112 | 123,738 | 4,614 |
MOT20 private detections:
Pretrained | MOTA | MOTP | IDF1 | FP | FN | IDS |
---|---|---|---|---|---|---|
CoCo | 60.6% | 79.5% | 49.6% | 52,332 | 146,809 | 4,604 |
CH | 61.9% | 79.9% | 50.4% | 45,895 | 146,347 | 4,653 |
Note:
- The results can be slightly different depending on the running environment.
- We might keep updating the results in the near future.
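For reference when reading the tables, MOTA combines the three reported error counts, normalized by the total number of ground-truth boxes (which the tables do not list):

```python
# MOTA from the reported error counts; num_gt is the total number of
# ground-truth boxes in the benchmark (not shown in the tables above).
def mota(fp: int, fn: int, ids: int, num_gt: int) -> float:
    return 1.0 - (fp + fn + ids) / num_gt
```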
The code for TransCenter is modified from, and the network pre-trained weights are obtained from, the following repositories:
- The person ReID network (./tracking/transcenter/model_zoo/ResNet_iter_25245.pth) is from Tracktor.
- The LiteFlowNet pretrained model (./tracking/transcenter/util/LiteFlownet/network-kitti.pytorch) is from pytorch-liteflownet and LiteFlowNet.
- The deformable transformer pretrained model (./model_zoo/r50_deformable_detr-checkpoint.pth) is from Deformable-DETR.
- The data format conversion code is modified from CenterTrack.
If you use this code, please also consider citing CenterTrack, Deformable-DETR, and Tracktor:
@inproceedings{zhou2020tracking,
title={Tracking Objects as Points},
author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
booktitle={ECCV},
year={2020}
}
@InProceedings{tracktor_2019_ICCV,
author = {Bergmann, Philipp and Meinhardt, Tim and Leal{-}Taix{\'{e}}, Laura},
title = {Tracking Without Bells and Whistles},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}}
@article{zhu2020deformable,
title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
author={Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng},
journal={arXiv preprint arXiv:2010.04159},
year={2020}
}
Several modules are from:
MOT Metrics in Python: py-motmetrics (a toy usage example is shown after the LiteFlowNet citation below)
Soft-NMS: Soft-NMS
DETR: DETR
DCNv2: DCNv2
correlation_package: correlation_package
pytorch-liteflownet: pytorch-liteflownet
LiteFlowNet: LiteFlowNet
@InProceedings{hui18liteflownet,
author = {Tak-Wai Hui and Xiaoou Tang and Chen Change Loy},
title = {LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation},
booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2018},
pages = {8981--8989},
}
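A toy py-motmetrics example (illustrative numbers, one frame) showing how metrics like those in the tables above are accumulated and computed:

```python
# Toy py-motmetrics example: accumulate one frame of GT/hypothesis
# matches, then compute MOTA/MOTP/IDF1 as in the tables above.
import motmetrics as mm
import numpy as np

acc = mm.MOTAccumulator(auto_id=True)
acc.update(
    [1, 2],                          # ground-truth ids in this frame
    ["a", "b"],                      # tracker hypothesis ids
    [[0.1, np.nan], [np.nan, 0.3]],  # pairwise distances (e.g. 1 - IoU)
)
mh = mm.metrics.create()
summary = mh.compute(acc, metrics=["mota", "motp", "idf1"], name="toy")
print(mm.io.render_summary(summary))
```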