
Accelerating the inference of the trained model for COCO-WholeBody #520

Closed
volkov-maxim opened this issue Mar 11, 2021 · 8 comments

@volkov-maxim

Hi!
I want to accelerate the inference of the trained model for COCO-WholeBody.
I have ideas:

  1. This model returns 133 keypoints, so I think there is a possibility to accelerate the trained model by decreasing the number of returned keypoints. For example: 17 for body, 6 for feet, 3 for face, and 2 for hands = 28 of 133.

    The question: does this make sense? If yes, how can I customize the output of the trained model?

  2. If I write my own inference script powered by the MMLab API, will it accelerate the process?

@innerlee
Contributor

Hi thanks for your interest!

The first step is to do profiling. Identify the bottleneck and improve it.

Steps:

  1. Create a mini-dataset containing a small number of samples, say 100 of them.
  2. Run the test script on the mini-dataset and use cProfile in the meantime.
  3. Analyze the profiling result.
  4. Improve the bottleneck.
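The profiling step above can be sketched with the standard-library cProfile and pstats modules. The functions below are stand-ins I made up for this illustration (a real run would profile the mmpose test script instead); the pattern of enabling the profiler around the workload and sorting stats by cumulative time is the part that carries over.

```python
import cProfile
import io
import pstats
import time

def slow_postprocess():
    # Hypothetical stand-in for heatmap post-processing; dominates runtime here.
    time.sleep(0.02)

def fast_decode():
    # Hypothetical stand-in for a cheap step in the pipeline.
    sum(range(1000))

def pipeline():
    # Stand-in for running inference over the mini-dataset.
    for _ in range(5):
        fast_decode()
        slow_postprocess()

profiler = cProfile.Profile()
profiler.enable()
pipeline()
profiler.disable()

# Sort by cumulative time so the bottleneck surfaces at the top of the report.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

In the printed report, the function with the largest cumulative time (here, the sleeping stand-in) is the bottleneck to improve first.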

You can post the profiling image here.

Related: #73 (comment)

@jin-s13
Collaborator

jin-s13 commented Mar 11, 2021

It is possible. Since only 28 keypoints are needed, not all heatmaps have to be transformed to keypoints. The post-processing (from heatmaps to keypoints) can be accelerated.

You may need to add a new keypoint head, and modify the post-processing part.
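The idea of accelerating post-processing by decoding fewer heatmaps can be sketched as below. The decoder here is a simplified per-channel argmax I wrote for illustration, not the real keypoints_from_heatmaps implementation, and slicing the first 28 channels is the rough approach discussed in this thread; selecting a specific keypoint subset would use an index list instead.

```python
import numpy as np

KEEP = 28  # number of keypoint channels to keep, per the proposal above

def decode_heatmaps(heatmaps):
    """Simplified stand-in for keypoints_from_heatmaps: per-channel argmax.

    heatmaps: array of shape (N, K, H, W); returns (preds, maxvals) with
    preds of shape (N, K, 2) holding (x, y) and maxvals of shape (N, K).
    """
    n, k, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, k, -1)
    idx = flat.argmax(axis=2)
    maxvals = flat.max(axis=2)
    preds = np.stack([idx % w, idx // w], axis=2).astype(float)
    return preds, maxvals

# Dummy output matching the (1, 133, 64, 48) shape mentioned later in the thread.
heatmaps = np.random.rand(1, 133, 64, 48).astype(np.float32)

# Slice before decoding so post-processing only touches the needed channels.
preds, maxvals = decode_heatmaps(heatmaps[:, :KEEP])
```

For a specific subset (e.g. 17 body + 6 feet + 3 face + 2 hand keypoints), replace the `[:, :KEEP]` slice with fancy indexing over the desired channel indices, e.g. `heatmaps[:, keep_idx]`.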

preds, maxvals = keypoints_from_heatmaps(

@volkov-maxim
Author

volkov-maxim commented Mar 11, 2021

@jin-s13, I use a copy of top_down_base_head.py as the keypoint_head. Should I transform the heatmaps array from shape (1, 133, 64, 48) to (1, 28, 64, 48) in the keypoints_from_heatmaps function, or somewhere earlier?

@volkov-maxim
Author

@jin-s13, I have made a rough transform of the heatmaps array from shape (1, 133, 64, 48) to (1, 28, 64, 48) in top_down_eval.py: heatmaps = heatmaps[:, :28, :, :], and also edited the lists in inference.py for dataset == 'TopDownCocoWholeBodyDataset'.
After that, I ran several inferences with top_down_img_demo_with_mmdet.py and got a decrease in inference time of ~0.1 sec/img.

Question: Is this a normal acceleration, or can I accelerate inference further?
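When comparing timings like the ~0.1 sec/img above, it helps to average over many runs after a warm-up, since the first inference often pays one-off costs (CUDA context, caching). A minimal timing helper, with a dummy workload standing in for one inference call:

```python
import time

def benchmark(fn, n_warmup=2, n_runs=10):
    """Average wall-clock time of fn over n_runs, after warm-up iterations."""
    for _ in range(n_warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return (time.perf_counter() - start) / n_runs

# Usage: pass the inference call as a zero-argument callable.
avg = benchmark(lambda: sum(range(10000)))
```

With GPU inference, a synchronization call (e.g. torch.cuda.synchronize()) is also needed before reading the clock, or the measured time only covers kernel launch.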

@jin-s13
Collaborator

jin-s13 commented Mar 12, 2021

Not very sure about that. But you may follow @innerlee's suggestion to do profiling, and check which part costs the most time.

You may also modify the network structure, especially the keypoint head, to decrease the output channels from 133 to 28.
The pre-trained weights also need to be reshaped to make them compatible with the changed output channels.

By the way, have you tried these tips to speed up inference?

@volkov-maxim
Author

volkov-maxim commented Mar 12, 2021

Profiling is a good idea, but I still believe I can accelerate inference by modifying the network structure.
Could you give me a hint about the filenames where I can modify the network structure, the keypoint head, and the pre-trained weights?

Yes, I have tried the tips to speed up inference.

@jin-s13
Collaborator

jin-s13 commented Mar 12, 2021

  1. Change the config file. For example, in hrnet, replace 133 with 28.
  2. Directly modify the pre-trained weights. First load the pre-trained weights by
weights = torch.load("hrnet_w32_coco_wholebody_256x192-853765cd_20200918.pth")

Then modify
weights['state_dict']['keypoint_head.final_layer.weight']
and
weights['state_dict']['keypoint_head.final_layer.bias']
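The weight modification in step 2 can be sketched as below. To keep the example self-contained I build the checkpoint dict with NumPy arrays as stand-ins for the tensors; a real run would use torch.load/torch.save on the checkpoint file named above. As earlier in the thread, taking the first 28 channels is the rough slice; a specific keypoint subset would use an index list.

```python
import numpy as np

KEEP = 28  # target number of output channels, matching the config change

# Stand-in for the loaded checkpoint. The real keys come from
# weights = torch.load("hrnet_w32_coco_wholebody_256x192-853765cd_20200918.pth");
# the final 1x1 conv maps head features to 133 keypoint heatmaps.
weights = {"state_dict": {
    "keypoint_head.final_layer.weight": np.random.randn(133, 32, 1, 1),
    "keypoint_head.final_layer.bias": np.random.randn(133),
}}

# Keep only the first KEEP output channels of the final layer so the
# checkpoint matches a head configured with 28 output channels.
sd = weights["state_dict"]
sd["keypoint_head.final_layer.weight"] = sd["keypoint_head.final_layer.weight"][:KEEP]
sd["keypoint_head.final_layer.bias"] = sd["keypoint_head.final_layer.bias"][:KEEP]
```

After slicing, the modified dict would be written back with torch.save and loaded by the config whose output channel count was changed from 133 to 28.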

@jin-s13 jin-s13 closed this as completed Mar 24, 2021
@volkov-maxim
Author

@jin-s13, thank you very much for the help!
As you advised, I changed the config and modified the tensors for the final layer in \mmcv\mmcv\runner\checkpoint.py.
I was able to speed up inference to 9 fps on a GTX 1650 (4 GB) by combining yolov3 608 (mmdetection) + res50 256x192 (mmpose).

rollingman1 pushed a commit to rollingman1/mmpose that referenced this issue Nov 5, 2021
HAOCHENYE added a commit to HAOCHENYE/mmpose that referenced this issue Jun 27, 2023