[Feature] Add ViPNAS_Mbv3 wholebody model (#1055)
* add vipnas mbv3 coco_wholebody

* add vipnas mbv3 coco_wholebody md&yml

* fix lint

Co-authored-by: ly015 <[email protected]>
luminxu and ly015 authored Dec 6, 2021
1 parent 8da1257 commit 0bf6ab5
Showing 6 changed files with 342 additions and 4 deletions.
@@ -34,4 +34,5 @@ Results on COCO-WholeBody v1.0 val with detector having human AP of 56.4 on COCO

| Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
| :---- | :--------: | :-----: | :-----: | :-----: | :-----: | :-----: | :------: | :-----: | :-----: | :------: |:-------: |:------: | :------: |
| [S-ViPNAS-MobileNetV3](/configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_mbv3_coco_wholebody_256x192.py) | 256x192 | 0.619 | 0.700 | 0.477 | 0.608 | 0.585 | 0.689 | 0.386 | 0.505 | 0.473 | 0.578 | [ckpt](https://download.openmmlab.com/mmpose/top_down/vipnas/vipnas_mbv3_coco_wholebody_256x192-0fee581a_20211205.pth) | [log](https://download.openmmlab.com/mmpose/top_down/vipnas/vipnas_mbv3_coco_wholebody_256x192_20211205.log.json) |
| [S-ViPNAS-Res50](/configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_res50_coco_wholebody_256x192.py) | 256x192 | 0.643 | 0.726 | 0.553 | 0.694 | 0.587 | 0.698 | 0.410 | 0.529 | 0.495 | 0.607 | [ckpt](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmpose/top_down/vipnas/vipnas_res50_wholebody_256x192-49e1c3a4_20211112.pth) | [log](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmpose/top_down/vipnas/vipnas_res50_wholebody_256x192_20211112.log.json) |
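The Body, Foot, Face, Hand and Whole columns score different subsets of the 133 keypoints that each model predicts. A minimal sketch of that partition, assuming the usual COCO-WholeBody ordering (17 body, 6 foot, 68 face, then 21 keypoints per hand); `PART_SIZES` and `part_slices` are illustrative names and not part of this commit:

```python
# Assumed COCO-WholeBody keypoint layout (illustrative, not from this commit).
PART_SIZES = [
    ("body", 17),
    ("foot", 6),
    ("face", 68),
    ("left_hand", 21),
    ("right_hand", 21),
]


def part_slices(part_sizes):
    """Map each part name to the slice of keypoint indices it occupies."""
    slices, start = {}, 0
    for name, size in part_sizes:
        slices[name] = slice(start, start + size)
        start += size
    return slices


SLICES = part_slices(PART_SIZES)
NUM_KEYPOINTS = sum(size for _, size in PART_SIZES)

print(NUM_KEYPOINTS)   # 133, matching num_output_channels in the configs
print(SLICES["face"])  # slice(23, 91, None)
```

The total matches the `num_output_channels=133` value used by both new configs.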
@@ -4,12 +4,34 @@ Collections:
     Title: 'ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search'
     URL: https://arxiv.org/abs/2105.10154
 Models:
-- Config: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_res50_coco_wholebody_256x192.py
+- Config: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_mbv3_coco_wholebody_256x192.py
   In Collection: ViPNAS
   Metadata:
-    Architecture:
+    Architecture: &id001
     - ViPNAS
     Training Data: COCO-WholeBody
+  Name: topdown_heatmap_vipnas_mbv3_coco_wholebody_256x192
+  README: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_coco-wholebody.md
+  Results:
+  - Dataset: COCO-WholeBody
+    Metrics:
+      Body AP: 0.619
+      Body AR: 0.7
+      Face AP: 0.585
+      Face AR: 0.689
+      Foot AP: 0.477
+      Foot AR: 0.608
+      Hand AP: 0.386
+      Hand AR: 0.505
+      Whole AP: 0.473
+      Whole AR: 0.578
+    Task: Wholebody 2D Keypoint
+  Weights: https://download.openmmlab.com/mmpose/top_down/vipnas/vipnas_mbv3_coco_wholebody_256x192-0fee581a_20211205.pth
+- Config: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_res50_coco_wholebody_256x192.py
+  In Collection: ViPNAS
+  Metadata:
+    Architecture: *id001
+    Training Data: COCO-WholeBody
   Name: topdown_heatmap_vipnas_res50_coco_wholebody_256x192
   README: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_coco-wholebody.md
   Results:
@@ -51,4 +51,5 @@ Results on COCO-WholeBody v1.0 val with detector having human AP of 56.4 on COCO

| Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
| :---- | :--------: | :-----: | :-----: | :-----: | :-----: | :-----: | :------: | :-----: | :-----: | :------: |:-------: |:------: | :------: |
| [S-ViPNAS-MobileNetV3_dark](/configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_mbv3_coco_wholebody_256x192_dark.py) | 256x192 | 0.632 | 0.710 | 0.530 | 0.660 | 0.672 | 0.771 | 0.404 | 0.519 | 0.508 | 0.607 | [ckpt](/mnt/lustre/share_data/xulumin/MMPose/vipnas_mbv3_coco_wholebody_256x192_dark-e2158108_20211205.pth) | [log](/mnt/lustre/share_data/xulumin/MMPose/vipnas_mbv3_coco_wholebody_256x192_dark_20211205.log.json) |
| [S-ViPNAS-Res50_dark](/configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_res50_coco_wholebody_256x192_dark.py) | 256x192 | 0.650 | 0.732 | 0.550 | 0.686 | 0.684 | 0.784 | 0.437 | 0.554 | 0.528 | 0.632 | [ckpt](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmpose/top_down/vipnas/vipnas_res50_wholebody_256x192_dark-67c0ce35_20211112.pth) | [log](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmpose/top_down/vipnas/vipnas_res50_wholebody_256x192_dark_20211112.log.json) |
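Comparing the S-ViPNAS-MobileNetV3_dark row with the S-ViPNAS-MobileNetV3 row of the non-DARK table shows where DARK helps most. The sketch below hard-codes the AP values from the two tables in this commit and prints the absolute gain per part; the dictionary and function names are illustrative.

```python
# AP values copied from the two results tables in this commit.
MBV3_AP = {"Body": 0.619, "Foot": 0.477, "Face": 0.585,
           "Hand": 0.386, "Whole": 0.473}
MBV3_DARK_AP = {"Body": 0.632, "Foot": 0.530, "Face": 0.672,
                "Hand": 0.404, "Whole": 0.508}


def ap_gain(base, improved):
    """Absolute AP gain per part, rounded to three decimals."""
    return {part: round(improved[part] - base[part], 3) for part in base}


GAINS = ap_gain(MBV3_AP, MBV3_DARK_AP)
for part, gain in GAINS.items():
    print(f"{part:>5}: {gain:+.3f}")
```

With these numbers the largest gain is on the face keypoints (+0.087 AP), and whole-body AP rises by 0.035.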
@@ -4,13 +4,35 @@ Collections:
     Title: 'ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search'
     URL: https://arxiv.org/abs/2105.10154
 Models:
-- Config: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_res50_coco_wholebody_256x192_dark.py
+- Config: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_mbv3_coco_wholebody_256x192_dark.py
   In Collection: ViPNAS
   Metadata:
-    Architecture:
+    Architecture: &id001
     - ViPNAS
     - DarkPose
     Training Data: COCO-WholeBody
+  Name: topdown_heatmap_vipnas_mbv3_coco_wholebody_256x192_dark
+  README: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_dark_coco-wholebody.md
+  Results:
+  - Dataset: COCO-WholeBody
+    Metrics:
+      Body AP: 0.632
+      Body AR: 0.71
+      Face AP: 0.672
+      Face AR: 0.771
+      Foot AP: 0.53
+      Foot AR: 0.66
+      Hand AP: 0.404
+      Hand AR: 0.519
+      Whole AP: 0.508
+      Whole AR: 0.607
+    Task: Wholebody 2D Keypoint
+  Weights: /mnt/lustre/share_data/xulumin/MMPose/vipnas_mbv3_coco_wholebody_256x192_dark-e2158108_20211205.pth
+- Config: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_res50_coco_wholebody_256x192_dark.py
+  In Collection: ViPNAS
+  Metadata:
+    Architecture: *id001
+    Training Data: COCO-WholeBody
   Name: topdown_heatmap_vipnas_res50_coco_wholebody_256x192_dark
   README: configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/vipnas_dark_coco-wholebody.md
   Results:
@@ -0,0 +1,146 @@
_base_ = ['../../../../_base_/datasets/coco_wholebody.py']
log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=10)
evaluation = dict(interval=10, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[170, 200])
total_epochs = 210
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=133,
    dataset_joints=133,
    dataset_channel=[
        list(range(133)),
    ],
    inference_channel=list(range(133)))

# model settings
model = dict(
    type='TopDown',
    pretrained=None,
    backbone=dict(type='ViPNAS_MobileNetV3'),
    keypoint_head=dict(
        type='ViPNASHeatmapSimpleHead',
        in_channels=160,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_filters=(160, 160, 160),
        num_deconv_groups=(160, 160, 160),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[192, 256],
    heatmap_size=[48, 64],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    bbox_file='data/coco/person_detection_results/'
    'COCO_val2017_detections_AP_H_56_person.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=30,
        scale_factor=0.25),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=2),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

data_root = 'data/coco'
data = dict(
    samples_per_gpu=64,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)
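The `lr_config` in this file describes a step schedule with linear warmup on top of the Adam base rate of `5e-4`. A rough sketch of the resulting learning rate follows. It assumes a linear warmup that ramps from `warmup_ratio * lr` up to `lr` over `warmup_iters` iterations and a decay factor of 0.1 at each step epoch; neither is set explicitly in this config, so treat both as assumptions. `learning_rate` is a hypothetical helper, not an MMPose or MMCV API.

```python
# Values copied from the config above.
BASE_LR = 5e-4
WARMUP_ITERS = 500
WARMUP_RATIO = 0.001
STEPS = (170, 200)

# Assumption: each step multiplies the learning rate by 0.1.
GAMMA = 0.1


def learning_rate(epoch, cur_iter):
    """Learning rate for a 0-based epoch and a global iteration count."""
    # Step decay: apply GAMMA once for every step epoch already reached.
    regular = BASE_LR * GAMMA ** sum(epoch >= step for step in STEPS)
    if cur_iter < WARMUP_ITERS:
        # Assumed linear warmup from WARMUP_RATIO * lr up to lr.
        k = (1 - cur_iter / WARMUP_ITERS) * (1 - WARMUP_RATIO)
        return regular * (1 - k)
    return regular


for epoch, cur_iter in [(0, 0), (0, 250), (0, 500), (170, 10**6), (200, 10**6)]:
    print(epoch, cur_iter, learning_rate(epoch, cur_iter))
```

Under these assumptions training starts near `5e-7`, reaches `5e-4` after 500 iterations, and drops to roughly `5e-5` and `5e-6` at epochs 170 and 200 of the 210 total.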
@@ -0,0 +1,146 @@
_base_ = ['../../../../_base_/datasets/coco_wholebody.py']
log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=10)
evaluation = dict(interval=10, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[170, 200])
total_epochs = 210
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=133,
    dataset_joints=133,
    dataset_channel=[
        list(range(133)),
    ],
    inference_channel=list(range(133)))

# model settings
model = dict(
    type='TopDown',
    pretrained=None,
    backbone=dict(type='ViPNAS_MobileNetV3'),
    keypoint_head=dict(
        type='ViPNASHeatmapSimpleHead',
        in_channels=160,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_filters=(160, 160, 160),
        num_deconv_groups=(160, 160, 160),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='unbiased',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[192, 256],
    heatmap_size=[48, 64],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    bbox_file='data/coco/person_detection_results/'
    'COCO_val2017_detections_AP_H_56_person.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=30,
        scale_factor=0.25),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=2, unbiased_encoding=True),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

data_root = 'data/coco'
data = dict(
    samples_per_gpu=64,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)
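This config differs from the non-DARK one in two places: `unbiased_encoding=True` in `TopDownGenerateTarget` and `post_process='unbiased'` in `test_cfg`. The sketch below illustrates only the encoding side. It assumes the standard encoding snaps the Gaussian centre to the nearest heatmap pixel while the unbiased encoding keeps the sub-pixel centre. It omits the truncation window and target weights that a real pipeline would apply, and `gaussian_target` is a hypothetical helper, not the MMPose implementation.

```python
import math

IMAGE_SIZE = (192, 256)   # (width, height), from data_cfg['image_size']
HEATMAP_SIZE = (48, 64)   # (width, height), from data_cfg['heatmap_size']
SIGMA = 2                 # from TopDownGenerateTarget


def gaussian_target(joint_xy, unbiased):
    """Return a heatmap (list of rows) for one joint given in image pixels."""
    width, height = HEATMAP_SIZE
    # Map image coordinates to heatmap coordinates (stride 4 here).
    mu_x = joint_xy[0] * width / IMAGE_SIZE[0]
    mu_y = joint_xy[1] * height / IMAGE_SIZE[1]
    if not unbiased:
        # Assumed standard encoding: quantise the centre to a pixel.
        mu_x = int(mu_x + 0.5)
        mu_y = int(mu_y + 0.5)
    return [[math.exp(-((x - mu_x) ** 2 + (y - mu_y) ** 2) / (2 * SIGMA ** 2))
             for x in range(width)]
            for y in range(height)]


def peak(heatmap):
    """Largest value in the heatmap."""
    return max(max(row) for row in heatmap)


joint = (101, 130)  # maps to the sub-pixel heatmap position (25.25, 32.5)
biased_map = gaussian_target(joint, unbiased=False)
unbiased_map = gaussian_target(joint, unbiased=True)
print(peak(biased_map), peak(unbiased_map))
```

In the quantised target the peak sits exactly on a pixel, so the sub-pixel offset of the joint is lost. In the unbiased target no pixel reaches 1.0, and the offset stays encoded in the values around the peak, which is the information a sub-pixel decoding step can use.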
