
Accelerating the inference of the trained model for COCO-WholeBody #520

Closed
volkov-maxim opened this issue Mar 11, 2021 · 8 comments

@volkov-maxim

Hi!
I want to accelerate the inference of the trained model for COCO-WholeBody.
I have ideas:

  1. This model returns 133 keypoints, so I think there is a possibility to accelerate the trained model by decreasing the number of returned keypoints. For example: 17 for body, 6 for feet, 3 for face, and 2 for hands = 28 of 133.

    The question: does this make sense? If yes, how can I customize the output of the trained model?

  2. If I write my own inference script powered by the MMLab API, will it accelerate the process?

@innerlee
Contributor

Hi thanks for your interest!

The first step is to do profiling. Identify the bottleneck and improve it.

Steps:

  1. Create a mini-dataset containing a small number of samples, say 100 of them.
  2. Run the test script on the mini-dataset and use cProfile in the meantime.
  3. Analyze the profiling result.
  4. Improve the bottleneck.
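The profiling step above can be sketched with the standard-library cProfile and pstats modules. The functions below are stand-ins I made up for this illustration (a real run would profile the mmpose test script instead); the pattern of enabling the profiler around the workload and sorting stats by cumulative time is the part that carries over.

```python
import cProfile
import io
import pstats
import time

def slow_postprocess():
    # Hypothetical stand-in for heatmap post-processing; dominates runtime here.
    time.sleep(0.02)

def fast_decode():
    # Hypothetical stand-in for a cheap step in the pipeline.
    sum(range(1000))

def pipeline():
    # Stand-in for running inference over the mini-dataset.
    for _ in range(5):
        fast_decode()
        slow_postprocess()

profiler = cProfile.Profile()
profiler.enable()
pipeline()
profiler.disable()

# Sort by cumulative time so the bottleneck surfaces at the top of the report.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

In the printed report, the function with the largest cumulative time (here, the sleeping stand-in) is the bottleneck to improve first.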

You can post the profiling image here.

Related: #73 (comment)

@jin-s13
Collaborator

jin-s13 commented Mar 11, 2021

It is possible. Since only 28 keypoints are needed, not all heatmaps have to be transformed to keypoints. The post-processing (from heatmaps to keypoints) can be accelerated.

You may need to add a new keypoint head, and modify the post-processing part.
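The idea of accelerating post-processing by decoding fewer heatmaps can be sketched as below. The decoder here is a simplified per-channel argmax I wrote for illustration, not the real keypoints_from_heatmaps implementation, and slicing the first 28 channels is the rough approach discussed in this thread; selecting a specific keypoint subset would use an index list instead.

```python
import numpy as np

KEEP = 28  # number of keypoint channels to keep, per the proposal above

def decode_heatmaps(heatmaps):
    """Simplified stand-in for keypoints_from_heatmaps: per-channel argmax.

    heatmaps: array of shape (N, K, H, W); returns (preds, maxvals) with
    preds of shape (N, K, 2) holding (x, y) and maxvals of shape (N, K).
    """
    n, k, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, k, -1)
    idx = flat.argmax(axis=2)
    maxvals = flat.max(axis=2)
    preds = np.stack([idx % w, idx // w], axis=2).astype(float)
    return preds, maxvals

# Dummy output matching the (1, 133, 64, 48) shape mentioned later in the thread.
heatmaps = np.random.rand(1, 133, 64, 48).astype(np.float32)

# Slice before decoding so post-processing only touches the needed channels.
preds, maxvals = decode_heatmaps(heatmaps[:, :KEEP])
```

For a specific subset (e.g. 17 body + 6 feet + 3 face + 2 hand keypoints), replace the `[:, :KEEP]` slice with fancy indexing over the desired channel indices, e.g. `heatmaps[:, keep_idx]`.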

preds, maxvals = keypoints_from_heatmaps(

@volkov-maxim
Author

volkov-maxim commented Mar 11, 2021

@jin-s13, I use a copy of top_down_base_head.py as the keypoint_head. Should I transform the heatmaps array from shape (1, 133, 64, 48) to (1, 28, 64, 48) in the keypoints_from_heatmaps function, or somewhere earlier?

@volkov-maxim
Author

@jin-s13, I have made a rough transform of the heatmaps array from shape (1, 133, 64, 48) to (1, 28, 64, 48) in top_down_eval.py: heatmaps = heatmaps[:, :28, :, :], and also edited the lists in inference.py for dataset == 'TopDownCocoWholeBodyDataset'.
After that, I ran several inferences with top_down_img_demo_with_mmdet.py and got a decrease in inference time of ~0.1 sec/img.

Question: Is this a normal acceleration, or can I accelerate inference further?
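When comparing timings like the ~0.1 sec/img above, it helps to average over many runs after a warm-up, since the first inference often pays one-off costs (CUDA context, caching). A minimal timing helper, with a dummy workload standing in for one inference call:

```python
import time

def benchmark(fn, n_warmup=2, n_runs=10):
    """Average wall-clock time of fn over n_runs, after warm-up iterations."""
    for _ in range(n_warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return (time.perf_counter() - start) / n_runs

# Usage: pass the inference call as a zero-argument callable.
avg = benchmark(lambda: sum(range(10000)))
```

With GPU inference, a synchronization call (e.g. torch.cuda.synchronize()) is also needed before reading the clock, or the measured time only covers kernel launch.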

@jin-s13
Collaborator

jin-s13 commented Mar 12, 2021

Not very sure about that. But you may follow @innerlee's suggestion to do profiling, and check which part costs the most time.

You may also modify the network structure, especially the keypoint head, to decrease the output channels from 133 to 28.
The pre-trained weights also need to be reshaped to make them compatible with the changed output channels.

By the way, have you tried these tips to speed up inference?

@volkov-maxim
Author

volkov-maxim commented Mar 12, 2021

Profiling is a good idea, but I still believe I can accelerate inference by modifying the network structure.
Could you give me a hint about the filenames where I can modify the network structure, the keypoint head, and the pre-trained weights?

Yes, I have tried the tips to speed up inference.

@jin-s13
Collaborator

jin-s13 commented Mar 12, 2021

  1. Change the config file. For example, in hrnet, replace 133 with 28.
  2. Directly modify the pre-trained weights. First load the pre-trained weights by
weights = torch.load("hrnet_w32_coco_wholebody_256x192-853765cd_20200918.pth")

Then modify
weights['state_dict']['keypoint_head.final_layer.weight']
and
weights['state_dict']['keypoint_head.final_layer.bias']
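The weight modification in step 2 can be sketched as below. To keep the example self-contained I build the checkpoint dict with NumPy arrays as stand-ins for the tensors; a real run would use torch.load/torch.save on the checkpoint file named above. As earlier in the thread, taking the first 28 channels is the rough slice; a specific keypoint subset would use an index list.

```python
import numpy as np

KEEP = 28  # target number of output channels, matching the config change

# Stand-in for the loaded checkpoint. The real keys come from
# weights = torch.load("hrnet_w32_coco_wholebody_256x192-853765cd_20200918.pth");
# the final 1x1 conv maps head features to 133 keypoint heatmaps.
weights = {"state_dict": {
    "keypoint_head.final_layer.weight": np.random.randn(133, 32, 1, 1),
    "keypoint_head.final_layer.bias": np.random.randn(133),
}}

# Keep only the first KEEP output channels of the final layer so the
# checkpoint matches a head configured with 28 output channels.
sd = weights["state_dict"]
sd["keypoint_head.final_layer.weight"] = sd["keypoint_head.final_layer.weight"][:KEEP]
sd["keypoint_head.final_layer.bias"] = sd["keypoint_head.final_layer.bias"][:KEEP]
```

After slicing, the modified dict would be written back with torch.save and loaded by the config whose output channel count was changed from 133 to 28.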

@jin-s13 jin-s13 closed this as completed Mar 24, 2021
@volkov-maxim
Author

@jin-s13, thank you very much for the help!
As you advised, I changed the config and modified the tensors for the final layer in \mmcv\mmcv\runner\checkpoint.py.
I was able to speed up inference to 9 fps on a GTX 1650 (4 GB) by combining yolov3 608 (mmdetection) + res50 256x192 (mmpose).

rollingman1 pushed a commit to rollingman1/mmpose that referenced this issue Nov 5, 2021
HAOCHENYE added a commit to HAOCHENYE/mmpose that referenced this issue Jun 27, 2023