NOTE: SimDR is the old name of this work; we now officially use SimCC in our paper. For simplicity, we keep the old name in the code, since it is already used by many people.
The 2D heatmap representation has dominated human pose estimation for years due to its high performance. However, heatmap-based approaches suffer from several shortcomings:
- Performance drops dramatically on low-resolution images, which are frequently encountered in real-world scenarios.
- To improve localization precision, multiple upsampling layers may be needed to recover the feature-map resolution from low to high, which is computationally expensive.
- Extra coordinate refinement is usually necessary to reduce the quantization error of downscaled heatmaps.
Given the shortcomings above, we do not think the 2D heatmap is the final answer for keypoint coordinate representation in this field. By contrast, SimDR is a simple yet effective scheme that gets rid of extra post-processing and reduces quantization error through its coordinate representation design. For the first time, SimDR brings heatmap-free methods to a performance level competitive with heatmap-based methods, and it outperforms them by a large margin at low input resolutions. Additionally, SimDR allows one to directly remove the time-consuming upsampling modules of some methods, which may inspire new research on lightweight models for human pose estimation.
We hope the proposed SimDR will motivate the community to rethink the design of coordinate representation for 2D human pose estimation.
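As a rough illustration of the coordinate-representation idea described above (a minimal sketch with made-up values, not code from this repository), each coordinate is treated as a 1-D classification problem over bins that are finer than the input resolution by a splitting ratio k, so sub-pixel precision comes from the representation itself rather than from extra post-processing:

```python
import numpy as np

# Minimal sketch (hypothetical values): the x coordinate of one keypoint is encoded as a
# 1-D classification target over k * W bins instead of a column of a 2D heatmap.
W, k = 192, 2.0                     # input width and splitting ratio (k > 1 gives sub-pixel bins)
x_gt = 57.3                         # ground-truth x coordinate in input-image pixels

target = np.zeros(int(W * k))
target[int(round(x_gt * k))] = 1.0  # one-hot encoding; SimDR* smooths this with a 1-D Gaussian

x_pred = np.argmax(target) / k      # decoding: argmax over bins, then divide by k
print(x_pred)                       # 57.5 -> quantization error is at most 1/(2k) pixels
```

The y coordinate is handled the same way with k * H bins, so the network predicts one such vector per keypoint and per axis.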
For details see SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation by Yanjie Li, Sen Yang, Peidong Liu, Shoukui Zhang, Yunxiao Wang, Zhicheng Wang, Wankou Yang and Shu-Tao Xia.
- [2022.07.17] Our paper "SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation" has been accepted by ECCV'2022 as an Oral presentation (acceptance rate: 2.7%). If you find this repository useful, please give it a star 🌟.
- [2021.08.17] The pretrained models are released in Google Drive!
- [2021.07.09] The codes for SimDR and SimDR* (space-aware SimDR) are released!
Method | Representation | Input size | GFLOPs | AP | AR |
---|---|---|---|---|---|
SimBa-Res50 | heatmap | 384x288 | 20.0 | 71.5 | 76.9 |
SimBa-Res50 | SimDR* | 384x288 | 20.2 | 72.7 | 78.0 |
HRNet-W48 | heatmap | 256x192 | 14.6 | 74.2 | 79.5 |
HRNet-W48 | SimDR* | 256x192 | 14.6 | 75.4 | 80.5 |
HRNet-W48 | heatmap | 384x288 | 32.9 | 75.5 | 80.5 |
HRNet-W48 | SimDR* | 384x288 | 32.9 | 76.0 | 81.1 |
- Flip test is used.
- Person detector has person AP of 60.9 on COCO test-dev2017 dataset.
- GFLOPs is for convolution and linear layers only.
| Method | Representation | Input size | #Params | GFLOPs | Extra post. | AP | AR |
|---|---|---|---|---|---|---|---|
| SimBa-Res50 | heatmap | 64x64 | 34.0M | 0.7 | Y | 34.4 | 43.7 |
| | heatmap | 64x64 | 34.0M | 0.7 | N | 25.8 | 36.0 |
| | SimDR (ours) | 64x64 | 34.1M | 0.7 | N | 40.8 | 49.6 |
| | heatmap | 128x128 | 34.0M | 3.0 | Y | 60.3 | 67.6 |
| | heatmap | 128x128 | 34.0M | 3.0 | N | 55.4 | 63.6 |
| | SimDR (ours) | 128x128 | 34.8M | 3.0 | N | 62.6 | 69.5 |
| | heatmap | 256x192 | 34.0M | 8.9 | Y | 70.4 | 76.3 |
| | heatmap | 256x192 | 34.0M | 8.9 | N | 68.5 | 74.8 |
| | SimDR (ours) | 256x192 | 36.8M | 9.0 | N | 71.4 | 77.4 |
| TokenPose-S | heatmap | 64x64 | 4.9M | 1.4 | Y | 57.1 | 64.8 |
| | heatmap | 64x64 | 4.9M | 1.4 | N | 35.9 | 47.0 |
| | SimDR (ours) | 64x64 | 4.9M | 1.4 | N | 62.8 | 70.1 |
| | heatmap | 128x128 | 5.2M | 1.6 | Y | 65.4 | 71.6 |
| | heatmap | 128x128 | 5.2M | 1.6 | N | 57.6 | 64.9 |
| | SimDR (ours) | 128x128 | 5.1M | 1.6 | N | 71.4 | 76.4 |
| | heatmap | 256x192 | 6.6M | 2.2 | Y | 72.5 | 78.0 |
| | heatmap | 256x192 | 6.6M | 2.2 | N | 69.9 | 75.8 |
| | SimDR (ours) | 256x192 | 5.5M | 2.2 | N | 73.6 | 78.9 |
| SimBa-Res101 | heatmap | 64x64 | 53.0M | 1.0 | Y | 34.1 | 43.5 |
| | heatmap | 64x64 | 53.0M | 1.0 | N | 25.7 | 36.1 |
| | SimDR (ours) | 64x64 | 53.1M | 1.0 | N | 39.6 | 48.9 |
| | heatmap | 128x128 | 53.0M | 4.1 | Y | 59.2 | 66.7 |
| | heatmap | 128x128 | 53.0M | 4.1 | N | 54.4 | 62.5 |
| | SimDR (ours) | 128x128 | 53.5M | 4.1 | N | 63.1 | 70.1 |
| | heatmap | 256x192 | 53.0M | 12.4 | Y | 71.4 | 77.1 |
| | heatmap | 256x192 | 53.0M | 12.4 | N | 69.5 | 75.6 |
| | SimDR (ours) | 256x192 | 53.7M | 12.4 | N | 72.3 | 78.0 |
| HRNet-W32 | heatmap | 64x64 | 28.5M | 0.6 | Y | 45.8 | 55.3 |
| | heatmap | 64x64 | 28.5M | 0.6 | N | 34.6 | 45.6 |
| | SimDR (ours) | 64x64 | 28.6M | 0.6 | N | 56.4 | 64.9 |
| | heatmap | 128x128 | 28.5M | 2.4 | Y | 67.2 | 74.1 |
| | heatmap | 128x128 | 28.5M | 2.4 | N | 61.9 | 69.4 |
| | SimDR (ours) | 128x128 | 29.1M | 2.4 | N | 70.7 | 76.7 |
| | heatmap | 256x192 | 28.5M | 7.1 | Y | 74.4 | 79.8 |
| | heatmap | 256x192 | 28.5M | 7.1 | N | 72.3 | 78.2 |
| | SimDR (ours) | 256x192 | 31.3M | 7.1 | N | 75.3 | 80.8 |
| HRNet-W48 | heatmap | 64x64 | 63.6M | 1.2 | Y | 48.5 | 57.8 |
| | heatmap | 64x64 | 63.6M | 1.2 | N | 36.9 | 47.8 |
| | SimDR (ours) | 64x64 | 63.7M | 1.2 | N | 59.7 | 67.5 |
| | heatmap | 128x128 | 63.6M | 4.9 | Y | 68.9 | 75.3 |
| | heatmap | 128x128 | 63.6M | 4.9 | N | 63.3 | 70.5 |
| | SimDR (ours) | 128x128 | 64.1M | 4.9 | N | 72.0 | 77.9 |
| | heatmap | 256x192 | 63.6M | 14.6 | Y | 75.1 | 80.4 |
| | heatmap | 256x192 | 63.6M | 14.6 | N | 73.1 | 78.7 |
| | SimDR (ours) | 256x192 | 66.3M | 14.6 | N | 75.9 | 81.2 |
- Flip test is used.
- Person detector has person AP of 56.4 on COCO val2017 dataset.
- GFLOPs is for convolution and linear layers only.
- Extra post. = extra post-processing for refining the predicted keypoint coordinates (illustrated by the sketch below).
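For reference, the extra post-processing referred to in the table is the sub-pixel refinement commonly applied to heatmap predictions, such as the quarter-pixel shift toward the higher-valued neighbor of the argmax. The sketch below only illustrates that kind of refinement step; the function name and array layout are ours, not taken from this repository:

```python
import numpy as np

def refine_heatmap_coord(heatmap, x, y):
    """Illustrative quarter-pixel refinement of an argmax location (x, y) on a single heatmap."""
    h, w = heatmap.shape
    if 1 < x < w - 1 and 1 < y < h - 1:
        # Shift by 0.25 pixel toward the neighbor with the larger response on each axis.
        dx = np.sign(heatmap[y, x + 1] - heatmap[y, x - 1])
        dy = np.sign(heatmap[y + 1, x] - heatmap[y - 1, x])
        return x + 0.25 * dx, y + 0.25 * dy
    return float(x), float(y)
```

Rows with Extra post. = N skip this refinement, which is where the heatmap baselines lose the most accuracy at low input resolutions.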
Results on the COCO validation set with the input size of 384×288.
| Method | Representation | AP | AP_50 | AP_75 | AP_M | AP_L | AR |
|---|---|---|---|---|---|---|---|
| SimBa-Res50 | heatmap | 72.2 | 89.3 | 78.9 | 68.1 | 79.7 | 77.6 |
| | SimDR (ours) | 73.0 | 89.3 | 79.7 | 69.5 | 79.9 | 78.6 |
| | SimDR* (ours) | 73.4 | 89.2 | 80.0 | 69.7 | 80.6 | 78.8 |
| SimBa-Res101 | heatmap | 73.6 | 89.6 | 80.3 | 69.9 | 81.1 | 79.1 |
| | SimDR (ours) | 74.2 | 89.6 | 80.9 | 70.7 | 80.9 | 79.8 |
| SimBa-Res152 | heatmap | 74.3 | 89.6 | 81.1 | 70.5 | 81.6 | 79.7 |
| | SimDR (ours) | 74.9 | 89.9 | 81.5 | 71.4 | 81.7 | 80.4 |
| HRNet-W48 | heatmap | 76.3 | 90.8 | 82.9 | 72.3 | 83.4 | 81.2 |
| | SimDR* (ours) | 76.9 | 90.9 | 83.2 | 73.2 | 83.8 | 82.0 |
- Flip test is used.
- Person detector has person AP of 56.4 on COCO val2017 dataset.
| Method | Representation | Input size | Hea | Sho | Elb | Wri | Hip | Kne | Ank | Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| PCKh@0.5 | | | | | | | | | | |
| HRNet-W32 | heatmap | 64x64 | 89.7 | 86.6 | 75.1 | 65.7 | 77.2 | 69.2 | 63.6 | 76.4 |
| | SimDR (ours) | 64x64 | 96.5 | 89.5 | 77.5 | 67.6 | 79.8 | 71.5 | 65.0 | 78.7 |
| | heatmap | 256x256 | 97.1 | 95.9 | 90.3 | 86.4 | 89.1 | 87.1 | 83.3 | 90.3 |
| | SimDR (ours) | 256x256 | 96.8 | 95.9 | 90.0 | 85.0 | 89.1 | 85.4 | 81.3 | 89.6 |
| | SimDR* (ours) | 256x256 | 97.2 | 96.0 | 90.4 | 85.6 | 89.5 | 85.8 | 81.8 | 90.0 |
| PCKh@0.1 | | | | | | | | | | |
| HRNet-W32 | heatmap | 64x64 | 12.9 | 11.7 | 9.7 | 7.1 | 7.2 | 7.2 | 6.6 | 9.2 |
| | SimDR (ours) | 64x64 | 30.9 | 23.3 | 18.1 | 15.0 | 10.5 | 13.1 | 12.8 | 18.5 |
| | heatmap | 256x256 | 44.5 | 37.3 | 37.5 | 36.9 | 15.1 | 25.9 | 27.2 | 33.1 |
| | SimDR (ours) | 256x256 | 50.1 | 41.0 | 45.3 | 42.4 | 16.6 | 29.7 | 30.3 | 37.8 |
- Flip test is used.
- It seems there is a bug in the original code when computing PCKh@0.1; we have fixed it in this repo.
| Method | Representation | Input size | AP | AP_50 | AP_75 | AP_E | AP_M | AP_H |
|---|---|---|---|---|---|---|---|---|
| HRNet-W32 | heatmap | 64x64 | 42.4 | 69.6 | 45.5 | 51.2 | 43.1 | 31.8 |
| | SimDR (ours) | 64x64 | 46.5 | 70.9 | 50.0 | 56.0 | 47.5 | 34.7 |
| | heatmap | 256x192 | 66.4 | 81.1 | 71.5 | 74.0 | 67.4 | 55.6 |
| | SimDR (ours) | 256x192 | 66.7 | 82.1 | 72.0 | 74.1 | 67.8 | 56.2 |
Please refer to THIS to prepare the environment step by step.
Pretrained models are provided in our model zoo.
To train with SimDR as the keypoint coordinate representation:
python tools/train.py \
--cfg experiments/coco/hrnet/simdr/nmt_w48_256x192_adam_lr1e-3.yaml
To train with SimDR* as the keypoint coordinate representation:
python tools/train.py \
--cfg experiments/coco/hrnet/sa_simdr/w48_256x192_adam_lr1e-3_split2_sigma4.yaml
*Note: When using SimDR, the deconvolution layers of SimpleBaseline can be either kept or removed.*
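As a rough sketch of why the deconvolution layers become optional (the module and variable names below are illustrative, not the exact layers in this repo): a SimDR-style head only needs a flattened feature per joint, so it can work directly on the low-resolution backbone output instead of first upsampling it back to heatmap resolution.

```python
import torch.nn as nn

class FlattenSimDRHead(nn.Module):
    """Illustrative SimDR-style head on a low-resolution feature map (no deconvolution)."""

    def __init__(self, feat_channels, feat_h, feat_w, num_joints, img_w, img_h, k=2):
        super().__init__()
        # One channel per joint, then per-joint 1-D classification along x and y.
        self.joint_conv = nn.Conv2d(feat_channels, num_joints, kernel_size=1)
        self.mlp_x = nn.Linear(feat_h * feat_w, int(img_w * k))
        self.mlp_y = nn.Linear(feat_h * feat_w, int(img_h * k))

    def forward(self, feat):
        # feat: (B, feat_channels, feat_h, feat_w), e.g. the stride-32 output of a ResNet backbone.
        joint_feat = self.joint_conv(feat).flatten(start_dim=2)  # (B, num_joints, feat_h * feat_w)
        return self.mlp_x(joint_feat), self.mlp_y(joint_feat)    # per-axis bin logits
```

Keeping the deconvolution layers simply means attaching the same kind of head to the upsampled feature map instead.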
To train with SimDR as the keypoint coordinate representation:
python tools/train.py \
--cfg experiments/mpii/hrnet/simdr/norm_w32_256x256_adam_lr1e-3_ls2e1.yaml
To train with SimDR* as the keypoint coordinate representation:
python tools/train.py \
--cfg experiments/mpii/hrnet/sa_simdr/w32_256x256_adam_lr1e-3_split2_sigma6.yaml
python tools/test.py \
--cfg experiments/coco/hrnet/simdr/nmt_w48_256x192_adam_lr1e-3.yaml \
TEST.MODEL_FILE _PATH_TO_CHECKPOINT_ \
TEST.USE_GT_BBOX False
python tools/test.py \
--cfg experiments/coco/hrnet/sa_simdr/w48_256x192_adam_lr1e-3_split2_sigma4.yaml \
TEST.MODEL_FILE _PATH_TO_CHECKPOINT_ \
TEST.USE_GT_BBOX False
python tools/test.py \
--cfg experiments/mpii/hrnet/simdr/norm_w32_256x256_adam_lr1e-3_ls2e1.yaml \
TEST.MODEL_FILE _PATH_TO_CHECKPOINT_ TEST.PCKH_THRE 0.5
If you use our code or models in your research, please cite with:
@misc{li20212d,
title={Is 2D Heatmap Representation Even Necessary for Human Pose Estimation?},
author={Yanjie Li and Sen Yang and Shoukui Zhang and Zhicheng Wang and Wankou Yang and Shu-Tao Xia and Erjin Zhou},
year={2021},
eprint={2107.03332},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Thanks to the open-source HRNet project.