[Feature] Voxelpose (open-mmlab#1050)
* [Enhancement] inference speed and flops tools. (open-mmlab#986)

* add the function to test the dummy forward speed of models.

* add tools to test the flops and inference speed of multiple models.

* [Feature] Add ViPNAS models for wholebody keypoint detection (open-mmlab#1009)

* add configs

* add dark configs

* add checkpoint and readme

* update webcam demo

* fix model path in webcam demo

* fix unittest

* update model metafiles (open-mmlab#1001)

* [Feature] Add ViPNAS mbv3 (open-mmlab#1025)

* add vipnas mbv3

* test other variants

* submission for mmpose

* add unittest

* add readme

* update .yml

* fix lint

* rebase

* fix pytest

Co-authored-by: jin-s13 <[email protected]>

* add cfg file for flops and speed test, change build_posenet to init_pose_model, and fix some typos in cfg (open-mmlab#1028)

* Skip CI when some specific files were changed (open-mmlab#1041)

* add voxelpose

* unit test

* unit test

* unit test

* add docs/ckpts

* del unnecessary comments

* correct typos in comments and docs

* Add or modify docs

* change variable names

* reduce memory cost in test

* get person_id

* rebase

* resolve comments

* rebase master

* rename cfg files

* fix typos in comments

Co-authored-by: zengwang430521 <[email protected]>
Co-authored-by: Yining Li <[email protected]>
Co-authored-by: Lumin <[email protected]>
Co-authored-by: jin-s13 <[email protected]>
Co-authored-by: Qikai Li <[email protected]>
Co-authored-by: QwQ2000 <[email protected]>
7 people committed Jan 5, 2022
1 parent 4352386 commit fadd4c2
Showing 37 changed files with 26,684 additions and 16 deletions.
1 change: 1 addition & 0 deletions .dev_scripts/github/update_model_index.py
@@ -146,6 +146,7 @@ def parse_config_path(path):
'2d_kpt_sview_rgb_img': '2D Keypoint',
'2d_kpt_sview_rgb_vid': '2D Keypoint',
'3d_kpt_sview_rgb_img': '3D Keypoint',
'3d_kpt_mview_rgb_img': '3D Keypoint',
'3d_kpt_sview_rgb_vid': '3D Keypoint',
'3d_mesh_sview_rgb_img': '3D Mesh',
None: None
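For illustration, here is a hypothetical, simplified sketch of how a task-name mapping like the one above could be used to label a config by its path component. The lookup helper is invented for this example and is not the actual `update_model_index.py` logic:

```python
# Hypothetical usage of the task-name mapping extended by this commit
# (the dict contents are copied from the diff; the lookup is illustrative).
task_names = {
    '2d_kpt_sview_rgb_img': '2D Keypoint',
    '2d_kpt_sview_rgb_vid': '2D Keypoint',
    '3d_kpt_sview_rgb_img': '3D Keypoint',
    '3d_kpt_mview_rgb_img': '3D Keypoint',
    '3d_kpt_sview_rgb_vid': '3D Keypoint',
    '3d_mesh_sview_rgb_img': '3D Mesh',
}

path = ('configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/'
        'voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py')

# Pick the first task key that appears in the config path.
task = next((name for key, name in task_names.items() if key in path), None)
print(task)  # -> 3D Keypoint
```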
160 changes: 160 additions & 0 deletions configs/_base_/datasets/panoptic_body3d.py
@@ -0,0 +1,160 @@
dataset_info = dict(
dataset_name='panoptic_pose_3d',
paper_info=dict(
author='Joo, Hanbyul and Simon, Tomas and Li, Xulong'
'and Liu, Hao and Tan, Lei and Gui, Lin and Banerjee, Sean'
'and Godisart, Timothy and Nabbe, Bart and Matthews, Iain'
'and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser',
title='Panoptic Studio: A Massively Multiview System '
'for Interaction Motion Capture',
container='IEEE Transactions on Pattern Analysis'
' and Machine Intelligence',
year='2017',
homepage='http://domedb.perception.cs.cmu.edu',
),
keypoint_info={
0:
dict(name='neck', id=0, color=[51, 153, 255], type='upper', swap=''),
1:
dict(name='nose', id=1, color=[51, 153, 255], type='upper', swap=''),
2:
dict(name='mid_hip', id=2, color=[0, 255, 0], type='lower', swap=''),
3:
dict(
name='left_shoulder',
id=3,
color=[0, 255, 0],
type='upper',
swap='right_shoulder'),
4:
dict(
name='left_elbow',
id=4,
color=[0, 255, 0],
type='upper',
swap='right_elbow'),
5:
dict(
name='left_wrist',
id=5,
color=[0, 255, 0],
type='upper',
swap='right_wrist'),
6:
dict(
name='left_hip',
id=6,
color=[0, 255, 0],
type='lower',
swap='right_hip'),
7:
dict(
name='left_knee',
id=7,
color=[0, 255, 0],
type='lower',
swap='right_knee'),
8:
dict(
name='left_ankle',
id=8,
color=[0, 255, 0],
type='lower',
swap='right_ankle'),
9:
dict(
name='right_shoulder',
id=9,
color=[255, 128, 0],
type='upper',
swap='left_shoulder'),
10:
dict(
name='right_elbow',
id=10,
color=[255, 128, 0],
type='upper',
swap='left_elbow'),
11:
dict(
name='right_wrist',
id=11,
color=[255, 128, 0],
type='upper',
swap='left_wrist'),
12:
dict(
name='right_hip',
id=12,
color=[255, 128, 0],
type='lower',
swap='left_hip'),
13:
dict(
name='right_knee',
id=13,
color=[255, 128, 0],
type='lower',
swap='left_knee'),
14:
dict(
name='right_ankle',
id=14,
color=[255, 128, 0],
type='lower',
swap='left_ankle'),
15:
dict(
name='left_eye',
id=15,
color=[51, 153, 255],
type='upper',
swap='right_eye'),
16:
dict(
name='left_ear',
id=16,
color=[51, 153, 255],
type='upper',
swap='right_ear'),
17:
dict(
name='right_eye',
id=17,
color=[51, 153, 255],
type='upper',
swap='left_eye'),
18:
dict(
name='right_ear',
id=18,
color=[51, 153, 255],
type='upper',
swap='left_ear')
},
skeleton_info={
0: dict(link=('nose', 'neck'), id=0, color=[51, 153, 255]),
1: dict(link=('neck', 'left_shoulder'), id=1, color=[0, 255, 0]),
2: dict(link=('neck', 'right_shoulder'), id=2, color=[255, 128, 0]),
3: dict(link=('left_shoulder', 'left_elbow'), id=3, color=[0, 255, 0]),
4: dict(
link=('right_shoulder', 'right_elbow'), id=4, color=[255, 128, 0]),
5: dict(link=('left_elbow', 'left_wrist'), id=5, color=[0, 255, 0]),
6:
dict(link=('right_elbow', 'right_wrist'), id=6, color=[255, 128, 0]),
7: dict(link=('left_ankle', 'left_knee'), id=7, color=[0, 255, 0]),
8: dict(link=('left_knee', 'left_hip'), id=8, color=[0, 255, 0]),
9: dict(link=('right_ankle', 'right_knee'), id=9, color=[255, 128, 0]),
10: dict(link=('right_knee', 'right_hip'), id=10, color=[255, 128, 0]),
11: dict(link=('mid_hip', 'left_hip'), id=11, color=[0, 255, 0]),
12: dict(link=('mid_hip', 'right_hip'), id=12, color=[255, 128, 0]),
13: dict(link=('mid_hip', 'neck'), id=13, color=[51, 153, 255]),
},
joint_weights=[
1.0, 1.0, 1.0, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2,
1.5, 1.0, 1.0, 1.0, 1.0
],
sigmas=[
0.026, 0.026, 0.107, 0.079, 0.072, 0.062, 0.107, 0.087, 0.089, 0.079,
0.072, 0.062, 0.107, 0.087, 0.089, 0.025, 0.035, 0.025, 0.035
])
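For reference, a minimal sketch of reading this dataset meta file the way mmpose configs are normally loaded; it assumes mmcv 1.x is installed and uses the file path from the diff header above:

```python
# Minimal sketch: load the dataset meta file added in this commit with mmcv's
# config loader (mmcv 1.x API, as used by mmpose at the time of this commit).
from mmcv import Config

cfg = Config.fromfile('configs/_base_/datasets/panoptic_body3d.py')
info = cfg.dataset_info

# keypoint_info is a dict keyed by joint index 0..18.
names = [info['keypoint_info'][i]['name'] for i in range(len(info['keypoint_info']))]
print(len(names), names[:3])   # 19 ['neck', 'nose', 'mid_hip']
print(len(info['sigmas']))     # one sigma per keypoint -> 19
```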
8 changes: 8 additions & 0 deletions configs/body/3d_kpt_mview_rgb_img/README.md
@@ -0,0 +1,8 @@
# Multi-view 3D Human Body Pose Estimation

Multi-view 3D human body pose estimation aims to predict the X, Y, Z coordinates of human body joints from multi-view RGB images.
For this task, we currently support [VoxelPose](configs/body/3d_kpt_mview_rgb_img/voxelpose).

## Data preparation

Please follow [DATA Preparation](/docs/tasks/3d_body_keypoint.md) to prepare data.
23 changes: 23 additions & 0 deletions configs/body/3d_kpt_mview_rgb_img/voxelpose/README.md
@@ -0,0 +1,23 @@
# VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
booktitle={ECCV},
year={2020}
}
```

</details>

VoxelPose proposes to break down the task of 3D human pose estimation into two stages: (1) human center detection by a Cuboid Proposal Network, and (2) human pose regression by a Pose Regression Network.

The networks in both stages are based on 3D convolution, and the input feature volumes are generated by projecting each voxel onto the multi-view images and sampling the 2D heatmaps at the projected locations.
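For intuition, the sketch below illustrates the voxel-to-heatmap sampling step described above with a single synthetic pinhole camera and random heatmaps. All names, shapes, and the nearest-neighbour sampling are illustrative simplifications of the idea, not mmpose's actual implementation (which uses calibrated cameras and bilinear sampling):

```python
# Simplified sketch of building per-voxel features by projecting voxel centers
# into a camera view and sampling 2D heatmaps at the projected pixels.
import numpy as np

def project(points_3d, K, R, t):
    """Project Nx3 world points into pixel coordinates with a pinhole camera."""
    cam = (R @ points_3d.T + t.reshape(3, 1)).T   # world -> camera frame
    uv = (K @ cam.T).T                            # camera -> image plane
    return uv[:, :2] / uv[:, 2:3]                 # perspective divide

def sample_heatmaps(heatmaps, uv):
    """Nearest-neighbour sampling of per-joint heatmaps at projected pixels."""
    h, w = heatmaps.shape[1:]
    x = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    y = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return heatmaps[:, y, x]                      # (num_joints, num_voxels)

# Toy setup: one camera at the origin looking down +z, 19-joint heatmaps.
K = np.array([[1000., 0., 960.], [0., 1000., 540.], [0., 0., 1.]])
R, t = np.eye(3), np.zeros(3)
heatmaps = np.random.rand(19, 1080 // 4, 1920 // 4)   # downsampled heatmaps

# Voxel centers of a small grid in front of the camera (meters).
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 8),
                            np.linspace(-1, 1, 8),
                            np.linspace(2, 4, 8)), -1).reshape(-1, 3)

uv = project(grid, K, R, t) / 4.0                     # match heatmap scale
features = sample_heatmaps(heatmaps, uv)              # per-voxel features
print(features.shape)                                 # (19, 512)
```

In the multi-camera case, the same projection and sampling are repeated per view and the sampled responses are fused into one feature volume, which the 3D convolutional networks then consume.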
@@ -0,0 +1,37 @@
<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
booktitle={ECCV},
year={2020}
}
```

</details>

<!-- [DATASET] -->

<details>
<summary align="right"><a href="https://openaccess.thecvf.com/content_iccv_2015/html/Joo_Panoptic_Studio_A_ICCV_2015_paper.html">CMU Panoptic (ICCV'2015)</a></summary>

```bibtex
@inproceedings{joo_iccv_2015,
  author = {Hanbyul Joo and Hao Liu and Lei Tan and Lin Gui and Bart Nabbe and Iain Matthews and Takeo Kanade and Shohei Nobuhara and Yaser Sheikh},
  title = {Panoptic Studio: A Massively Multiview System for Social Motion Capture},
  booktitle = {ICCV},
  year = {2015}
}
```

</details>

Results on CMU Panoptic dataset.

| Arch | mAP | mAR | MPJPE (mm) | Recall@500mm | ckpt | log |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| [prn64_cpn80_res50](/configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py) | 97.31 | 97.99 | 17.57| 99.85| [ckpt](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5-545c150e_20211103.pth) | [log](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5_20211103.log.json) |
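As a usage note, the following is a minimal sketch of loading the released checkpoint listed above with mmpose's high-level API (`init_pose_model`, the entry point this commit series switched to). It only builds the model and loads the weights; running full multi-view inference additionally requires the Panoptic data pipeline described in the data-preparation docs:

```python
# Minimal sketch: build the VoxelPose model and load the released checkpoint.
from mmpose.apis import init_pose_model

config = ('configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/'
          'voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py')
checkpoint = ('https://download.openmmlab.com/mmpose/body3d/voxelpose/'
              'voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5'
              '-545c150e_20211103.pth')

model = init_pose_model(config, checkpoint, device='cpu')
print(type(model).__name__)
```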