[Feature] Voxelpose (open-mmlab#1050)
* [Enhancement] inference speed and flops tools. (open-mmlab#986)

* add the function to test the dummy forward speed of models.

* add tools to test the flops and inference speed of multiple models.

* [Feature] Add ViPNAS models for wholebody keypoint detection (open-mmlab#1009)

* add configs

* add dark configs

* add checkpoint and readme

* update webcam demo

* fix model path in webcam demo

* fix unittest

* update model metafiles (open-mmlab#1001)

* [Feature] Add ViPNAS mbv3 (open-mmlab#1025)

* add vipnas mbv3

* test other variants

* submission for mmpose

* add unittest

* add readme

* update .yml

* fix lint

* rebase

* fix pytest

Co-authored-by: jin-s13 <[email protected]>

* add cfg file for flops and speed test, change build_posenet to init_pose_model, and fix some typos in cfg (open-mmlab#1028)

* Skip CI when some specific files were changed (open-mmlab#1041)

* add voxelpose

* unit test

* unit test

* unit test

* add docs/ckpts

* del unnecessary comments

* correct typos in comments and docs

* Add or modify docs

* change variable names

* reduce memory cost in test

* get person_id

* rebase

* resolve comments

* rebase master

* rename cfg files

* fix typos in comments

Co-authored-by: zengwang430521 <[email protected]>
Co-authored-by: Yining Li <[email protected]>
Co-authored-by: Lumin <[email protected]>
Co-authored-by: jin-s13 <[email protected]>
Co-authored-by: Qikai Li <[email protected]>
Co-authored-by: QwQ2000 <[email protected]>
7 people committed Jan 5, 2022
1 parent 4352386 commit fadd4c2
Showing 37 changed files with 26,684 additions and 16 deletions.
1 change: 1 addition & 0 deletions .dev_scripts/github/update_model_index.py
@@ -146,6 +146,7 @@ def parse_config_path(path):
'2d_kpt_sview_rgb_img': '2D Keypoint',
'2d_kpt_sview_rgb_vid': '2D Keypoint',
'3d_kpt_sview_rgb_img': '3D Keypoint',
'3d_kpt_mview_rgb_img': '3D Keypoint',
'3d_kpt_sview_rgb_vid': '3D Keypoint',
'3d_mesh_sview_rgb_img': '3D Mesh',
None: None
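For illustration, here is a hypothetical, simplified sketch of how a task-name mapping like the one above could be used to label a config by its path component. The lookup helper is invented for this example and is not the actual `update_model_index.py` logic:

```python
# Hypothetical usage of the task-name mapping extended by this commit
# (the dict contents are copied from the diff; the lookup is illustrative).
task_names = {
    '2d_kpt_sview_rgb_img': '2D Keypoint',
    '2d_kpt_sview_rgb_vid': '2D Keypoint',
    '3d_kpt_sview_rgb_img': '3D Keypoint',
    '3d_kpt_mview_rgb_img': '3D Keypoint',
    '3d_kpt_sview_rgb_vid': '3D Keypoint',
    '3d_mesh_sview_rgb_img': '3D Mesh',
}

path = ('configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/'
        'voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py')

# Pick the first task key that appears in the config path.
task = next((name for key, name in task_names.items() if key in path), None)
print(task)  # -> 3D Keypoint
```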
160 changes: 160 additions & 0 deletions configs/_base_/datasets/panoptic_body3d.py
@@ -0,0 +1,160 @@
dataset_info = dict(
dataset_name='panoptic_pose_3d',
paper_info=dict(
author='Joo, Hanbyul and Simon, Tomas and Li, Xulong'
'and Liu, Hao and Tan, Lei and Gui, Lin and Banerjee, Sean'
'and Godisart, Timothy and Nabbe, Bart and Matthews, Iain'
'and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser',
title='Panoptic Studio: A Massively Multiview System '
'for Interaction Motion Capture',
container='IEEE Transactions on Pattern Analysis'
' and Machine Intelligence',
year='2017',
homepage='http://domedb.perception.cs.cmu.edu',
),
keypoint_info={
0:
dict(name='neck', id=0, color=[51, 153, 255], type='upper', swap=''),
1:
dict(name='nose', id=1, color=[51, 153, 255], type='upper', swap=''),
2:
dict(name='mid_hip', id=2, color=[0, 255, 0], type='lower', swap=''),
3:
dict(
name='left_shoulder',
id=3,
color=[0, 255, 0],
type='upper',
swap='right_shoulder'),
4:
dict(
name='left_elbow',
id=4,
color=[0, 255, 0],
type='upper',
swap='right_elbow'),
5:
dict(
name='left_wrist',
id=5,
color=[0, 255, 0],
type='upper',
swap='right_wrist'),
6:
dict(
name='left_hip',
id=6,
color=[0, 255, 0],
type='lower',
swap='right_hip'),
7:
dict(
name='left_knee',
id=7,
color=[0, 255, 0],
type='lower',
swap='right_knee'),
8:
dict(
name='left_ankle',
id=8,
color=[0, 255, 0],
type='lower',
swap='right_ankle'),
9:
dict(
name='right_shoulder',
id=9,
color=[255, 128, 0],
type='upper',
swap='left_shoulder'),
10:
dict(
name='right_elbow',
id=10,
color=[255, 128, 0],
type='upper',
swap='left_elbow'),
11:
dict(
name='right_wrist',
id=11,
color=[255, 128, 0],
type='upper',
swap='left_wrist'),
12:
dict(
name='right_hip',
id=12,
color=[255, 128, 0],
type='lower',
swap='left_hip'),
13:
dict(
name='right_knee',
id=13,
color=[255, 128, 0],
type='lower',
swap='left_knee'),
14:
dict(
name='right_ankle',
id=14,
color=[255, 128, 0],
type='lower',
swap='left_ankle'),
15:
dict(
name='left_eye',
id=15,
color=[51, 153, 255],
type='upper',
swap='right_eye'),
16:
dict(
name='left_ear',
id=16,
color=[51, 153, 255],
type='upper',
swap='right_ear'),
17:
dict(
name='right_eye',
id=17,
color=[51, 153, 255],
type='upper',
swap='left_eye'),
18:
dict(
name='right_ear',
id=18,
color=[51, 153, 255],
type='upper',
swap='left_ear')
},
skeleton_info={
0: dict(link=('nose', 'neck'), id=0, color=[51, 153, 255]),
1: dict(link=('neck', 'left_shoulder'), id=1, color=[0, 255, 0]),
2: dict(link=('neck', 'right_shoulder'), id=2, color=[255, 128, 0]),
3: dict(link=('left_shoulder', 'left_elbow'), id=3, color=[0, 255, 0]),
4: dict(
link=('right_shoulder', 'right_elbow'), id=4, color=[255, 128, 0]),
5: dict(link=('left_elbow', 'left_wrist'), id=5, color=[0, 255, 0]),
6:
dict(link=('right_elbow', 'right_wrist'), id=6, color=[255, 128, 0]),
7: dict(link=('left_ankle', 'left_knee'), id=7, color=[0, 255, 0]),
8: dict(link=('left_knee', 'left_hip'), id=8, color=[0, 255, 0]),
9: dict(link=('right_ankle', 'right_knee'), id=9, color=[255, 128, 0]),
10: dict(link=('right_knee', 'right_hip'), id=10, color=[255, 128, 0]),
11: dict(link=('mid_hip', 'left_hip'), id=11, color=[0, 255, 0]),
12: dict(link=('mid_hip', 'right_hip'), id=12, color=[255, 128, 0]),
13: dict(link=('mid_hip', 'neck'), id=13, color=[51, 153, 255]),
},
joint_weights=[
1.0, 1.0, 1.0, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2,
1.5, 1.0, 1.0, 1.0, 1.0
],
sigmas=[
0.026, 0.026, 0.107, 0.079, 0.072, 0.062, 0.107, 0.087, 0.089, 0.079,
0.072, 0.062, 0.107, 0.087, 0.089, 0.025, 0.035, 0.025, 0.035
])
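For reference, a minimal sketch of reading this dataset meta file the way mmpose configs are normally loaded; it assumes mmcv 1.x is installed and uses the file path from the diff header above:

```python
# Minimal sketch: load the dataset meta file added in this commit with mmcv's
# config loader (mmcv 1.x API, as used by mmpose at the time of this commit).
from mmcv import Config

cfg = Config.fromfile('configs/_base_/datasets/panoptic_body3d.py')
info = cfg.dataset_info

# keypoint_info is a dict keyed by joint index 0..18.
names = [info['keypoint_info'][i]['name'] for i in range(len(info['keypoint_info']))]
print(len(names), names[:3])   # 19 ['neck', 'nose', 'mid_hip']
print(len(info['sigmas']))     # one sigma per keypoint -> 19
```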
8 changes: 8 additions & 0 deletions configs/body/3d_kpt_mview_rgb_img/README.md
@@ -0,0 +1,8 @@
# Multi-view 3D Human Body Pose Estimation

Multi-view 3D human body pose estimation aims to predict the X, Y, Z coordinates of human body joints from multi-view RGB images.
For this task, we currently support [VoxelPose](configs/body/3d_kpt_mview_rgb_img/voxelpose).

## Data preparation

Please follow [DATA Preparation](/docs/tasks/3d_body_keypoint.md) to prepare data.
23 changes: 23 additions & 0 deletions configs/body/3d_kpt_mview_rgb_img/voxelpose/README.md
@@ -0,0 +1,23 @@
# VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
booktitle={ECCV},
year={2020}
}
```

</details>

VoxelPose proposes to break down the task of 3D human pose estimation into two stages: (1) human center detection by a Cuboid Proposal Network, and (2) human pose regression by a Pose Regression Network.

The networks in both stages are based on 3D convolution, and the input feature volumes are generated by projecting each voxel onto the multi-view images and sampling the 2D heatmaps at the projected locations.
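For intuition, the sketch below illustrates the voxel-to-heatmap sampling step described above with a single synthetic pinhole camera and random heatmaps. All names, shapes, and the nearest-neighbour sampling are illustrative simplifications of the idea, not mmpose's actual implementation (which uses calibrated cameras and bilinear sampling):

```python
# Simplified sketch of building per-voxel features by projecting voxel centers
# into a camera view and sampling 2D heatmaps at the projected pixels.
import numpy as np

def project(points_3d, K, R, t):
    """Project Nx3 world points into pixel coordinates with a pinhole camera."""
    cam = (R @ points_3d.T + t.reshape(3, 1)).T   # world -> camera frame
    uv = (K @ cam.T).T                            # camera -> image plane
    return uv[:, :2] / uv[:, 2:3]                 # perspective divide

def sample_heatmaps(heatmaps, uv):
    """Nearest-neighbour sampling of per-joint heatmaps at projected pixels."""
    h, w = heatmaps.shape[1:]
    x = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    y = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return heatmaps[:, y, x]                      # (num_joints, num_voxels)

# Toy setup: one camera at the origin looking down +z, 19-joint heatmaps.
K = np.array([[1000., 0., 960.], [0., 1000., 540.], [0., 0., 1.]])
R, t = np.eye(3), np.zeros(3)
heatmaps = np.random.rand(19, 1080 // 4, 1920 // 4)   # downsampled heatmaps

# Voxel centers of a small grid in front of the camera (meters).
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 8),
                            np.linspace(-1, 1, 8),
                            np.linspace(2, 4, 8)), -1).reshape(-1, 3)

uv = project(grid, K, R, t) / 4.0                     # match heatmap scale
features = sample_heatmaps(heatmaps, uv)              # per-voxel features
print(features.shape)                                 # (19, 512)
```

In the multi-camera case, the same projection and sampling are repeated per view and the sampled responses are fused into one feature volume, which the 3D convolutional networks then consume.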
@@ -0,0 +1,37 @@
<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
booktitle={ECCV},
year={2020}
}
```

</details>

<!-- [DATASET] -->

<details>
<summary align="right"><a href="https://openaccess.thecvf.com/content_iccv_2015/html/Joo_Panoptic_Studio_A_ICCV_2015_paper.html">CMU Panoptic (ICCV'2015)</a></summary>

```bibtex
@inproceedings{joo_iccv_2015,
  author = {Hanbyul Joo and Hao Liu and Lei Tan and Lin Gui and Bart Nabbe and Iain Matthews and Takeo Kanade and Shohei Nobuhara and Yaser Sheikh},
  title = {Panoptic Studio: A Massively Multiview System for Social Motion Capture},
  booktitle = {ICCV},
  year = {2015}
}
```

</details>

Results on CMU Panoptic dataset.

| Arch | mAP | mAR | MPJPE (mm) | Recall@500mm | ckpt | log |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| [prn64_cpn80_res50](/configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py) | 97.31 | 97.99 | 17.57| 99.85| [ckpt](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5-545c150e_20211103.pth) | [log](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5_20211103.log.json) |
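As a usage note, the following is a minimal sketch of loading the released checkpoint listed above with mmpose's high-level API (`init_pose_model`, the entry point this commit series switched to). It only builds the model and loads the weights; running full multi-view inference additionally requires the Panoptic data pipeline described in the data-preparation docs:

```python
# Minimal sketch: build the VoxelPose model and load the released checkpoint.
from mmpose.apis import init_pose_model

config = ('configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/'
          'voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py')
checkpoint = ('https://download.openmmlab.com/mmpose/body3d/voxelpose/'
              'voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5'
              '-545c150e_20211103.pth')

model = init_pose_model(config, checkpoint, device='cpu')
print(type(model).__name__)
```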