
[Enhance] add inferencer #2164

Merged
merged 12 commits into from
Feb 7, 2023
2 changes: 2 additions & 0 deletions configs/recognition/i3d/metafile.yml
@@ -7,6 +7,8 @@ Collections:

Models:
- Name: i3d_imagenet-pretrained-r50-nl-dot-product_8xb8-32x2x1-100e_kinetics400-rgb
Alias:
- i3d
Config: configs/recognition/i3d/i3d_imagenet-pretrained-r50-nl-dot-product_8xb8-32x2x1-100e_kinetics400-rgb.py
In Collection: I3D
Metadata:
2 changes: 2 additions & 0 deletions configs/recognition/slowfast/metafile.yml
@@ -30,6 +30,8 @@ Models:
Weights: https://download.openmmlab.com/mmaction/v1.0/recognition/slowfast/slowfast_r50_8xb8-4x16x1-256e_kinetics400-rgb/slowfast_r50_8xb8-4x16x1-256e_kinetics400-rgb_20220901-701b0f6f.pth

- Name: slowfast_r50_8xb8-8x8x1-256e_kinetics400-rgb
Alias:
- slowfast
Config: configs/recognition/slowfast/slowfast_r50_8xb8-8x8x1-256e_kinetics400-rgb.py
In Collection: SlowFast
Metadata:
2 changes: 2 additions & 0 deletions configs/recognition/tsn/metafile.yml
@@ -53,6 +53,8 @@ Models:
Weights: https://download.openmmlab.com/mmaction/v1.0/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x5-100e_kinetics400-rgb/tsn_imagenet-pretrained-r50_8xb32-1x1x5-100e_kinetics400-rgb_20220906-65d68713.pth

- Name: tsn_imagenet-pretrained-r50_8xb32-1x1x8-100e_kinetics400-rgb
Alias:
- TSN
Config: configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x8-100e_kinetics400-rgb.py
In Collection: TSN
Metadata:
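The three metafile hunks above each add an `Alias` entry so a short name like `tsn` can stand in for the full model name. As an illustrative sketch only (not MMAction2's actual lookup code), alias resolution against metafile entries can be modeled like this:

```python
# Illustrative sketch: resolve a model name or alias from metafile-style
# entries to its config path. The real resolution lives in MMAction2/MMEngine.
models = [
    {
        "Name": "tsn_imagenet-pretrained-r50_8xb32-1x1x8-100e_kinetics400-rgb",
        "Alias": ["TSN"],
        "Config": "configs/recognition/tsn/tsn_imagenet-pretrained-r50_8xb32-1x1x8-100e_kinetics400-rgb.py",
    },
    {
        "Name": "slowfast_r50_8xb8-8x8x1-256e_kinetics400-rgb",
        "Alias": ["slowfast"],
        "Config": "configs/recognition/slowfast/slowfast_r50_8xb8-8x8x1-256e_kinetics400-rgb.py",
    },
]


def resolve_config(name: str) -> str:
    """Return the config path for a full model name or one of its aliases."""
    for entry in models:
        if name == entry["Name"] or name in entry.get("Alias", []):
            return entry["Config"]
    raise KeyError(f"unknown model name or alias: {name}")
```

This is why `--rec tsn` in the demo examples below is enough to select the full `tsn_imagenet-pretrained-r50_8xb32-1x1x8-100e_kinetics400-rgb` model.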
65 changes: 63 additions & 2 deletions demo/README.md
@@ -7,6 +7,7 @@
- [Video GradCAM Demo](#video-gradcam-demo): A demo script to visualize GradCAM results using a single video.
- [Webcam demo](#webcam-demo): A demo script to implement real-time action recognition from a web camera.
- [Skeleton-based Action Recognition Demo](#skeleton-based-action-recognition-demo): A demo script to predict the skeleton-based action recognition result using a single video.
- [Inferencer Demo](#inferencer): A demo script to implement fast prediction for video analysis tasks based on a unified inferencer interface.

## Modify configs through script arguments

@@ -52,7 +53,7 @@ Optional arguments:
Examples:

Assume that you are located at `$MMACTION2` and have already downloaded the checkpoints to the directory `checkpoints/`,
or use checkpoint url from to directly load corresponding checkpoint, which will be automatically saved in `$HOME/.cache/torch/checkpoints`.
or use checkpoint url from `configs/` to directly load corresponding checkpoint, which will be automatically saved in `$HOME/.cache/torch/checkpoints`.

1. Recognize a video file as input by using a TSN model on cuda by default.

@@ -183,7 +184,7 @@ Users can change:

## Skeleton-based Action Recognition Demo

MMAction2 provides an demo script to predict the skeleton-based action recognition result using a single video.
MMAction2 provides a demo script to predict the skeleton-based action recognition result using a single video.

```shell
python demo/demo_skeleton.py ${VIDEO_FILE} ${OUT_FILENAME} \
@@ -247,3 +248,63 @@ python demo/demo_skeleton.py demo/demo_skeleton.mp4 demo/demo_skeleton_out.mp4 \
--pose-checkpoint https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth \
--label-map tools/data/skeleton/label_map_ntu60.txt
```

## Inferencer

MMAction2 provides a demo script to perform fast prediction for video analysis tasks based on a unified inferencer interface. Currently, it only supports the action recognition task.

```shell
python demo/demo.py ${INPUTS} \
[--vid-out-dir ${VID_OUT_DIR}] \
[--rec ${RECOG_TASK}] \
[--rec-weights ${RECOG_WEIGHTS}] \
[--label-file ${LABEL_FILE}] \
[--device ${DEVICE_TYPE}] \
[--batch-size ${BATCH_SIZE}] \
[--print-result ${PRINT_RESULT}] \
    [--pred-out-file ${PRED_OUT_FILE}]
```

Optional arguments:

- `--show`: If specified, the demo will display the video in a popup window.
- `--print-result`: If specified, the demo will print the inference results.
- `VID_OUT_DIR`: Output directory of saved videos. Defaults to None, which means videos are not saved.
- `RECOG_TASK`: Type of action recognition algorithm. It can be the path to a config file, or a model name or alias defined in the metafile.
- `RECOG_WEIGHTS`: Path to the custom checkpoint file of the selected recognition model. If it is not specified and `rec` is a model name defined in the metafile, the weights will be loaded from the metafile.
- `LABEL_FILE`: Label file for the dataset the algorithm was pretrained on. Defaults to None, which means labels are not shown in the result.
- `DEVICE_TYPE`: Type of device to run the demo on. Allowed values are a CUDA device like `cuda:0` or `cpu`. Defaults to `cuda:0`.
- `BATCH_SIZE`: The batch size used in inference. Defaults to 1.
- `PRED_OUT_FILE`: File path to save the inference results. Defaults to None, which means prediction results are not saved.

Examples:

Assume that you are located at `$MMACTION2`.

1. Recognize a video file as input by using a TSN model, loading checkpoint from metafile.

```shell
# The demo.mp4 and label_map_k400.txt are both from Kinetics-400
   python demo/demo_inferencer.py demo/demo.mp4 \
--rec configs/recognition/tsn/tsn_r50_8xb32-1x1x8-100e_kinetics400-rgb.py \
--label-file tools/data/kinetics/label_map_k400.txt
```

2. Recognize a video file as input by using a TSN model, using model alias in metafile.

```shell
# The demo.mp4 and label_map_k400.txt are both from Kinetics-400
   python demo/demo_inferencer.py demo/demo.mp4 \
--rec tsn \
--label-file tools/data/kinetics/label_map_k400.txt
```

3. Recognize a video file as input by using a TSN model, and then save the visualization video.

```shell
# The demo.mp4 and label_map_k400.txt are both from Kinetics-400
   python demo/demo_inferencer.py demo/demo.mp4 \
--vid-out-dir demo_out \
--rec tsn \
--label-file tools/data/kinetics/label_map_k400.txt
```
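All three invocations above follow the same two-phase pattern: model-selection options (`--rec`, `--rec-weights`, `--device`, `--label-file`) configure the inferencer once, and per-run options (inputs, `--vid-out-dir`, `--batch-size`, ...) go into the call itself. A hypothetical stub (the real `MMAction2Inferencer` loads a model and runs inference; this one only records its arguments) illustrates the interface shape:

```python
# Hypothetical stub mirroring the inferencer's two-phase interface:
# the constructor takes model-selection arguments, __call__ takes per-run ones.
class InferencerStub:

    def __init__(self, rec=None, rec_weights=None, device=None,
                 label_file=None):
        # The real class would resolve `rec` via the metafile and load weights.
        self.rec = rec
        self.label_file = label_file

    def __call__(self, inputs, vid_out_dir='', batch_size=1, show=False,
                 print_result=False, pred_out_file=''):
        # The real class runs recognition; the stub reports what it would do.
        return {
            'model': self.rec,
            'inputs': inputs,
            'saved_video': bool(vid_out_dir),
        }


runner = InferencerStub(
    rec='tsn', label_file='tools/data/kinetics/label_map_k400.txt')
result = runner('demo/demo.mp4', vid_out_dir='demo_out')
```

Splitting construction from invocation lets one configured inferencer be reused across many videos without re-selecting the model each time.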
36 changes: 7 additions & 29 deletions demo/demo.py
@@ -4,7 +4,6 @@
from operator import itemgetter
from typing import Optional, Tuple

import cv2
from mmengine import Config, DictAction

from mmaction.apis import inference_recognizer, init_recognizer
@@ -88,34 +87,9 @@ def get_output(
if video_path.startswith(('http://', 'https://')):
raise NotImplementedError

try:
import decord
except ImportError:
raise ImportError('Please install decord to enable output file.')

# Channel Order is `BGR`
video = decord.VideoReader(video_path)
frames = [x.asnumpy()[..., ::-1] for x in video]
if target_resolution:
w, h = target_resolution
frame_h, frame_w, _ = frames[0].shape
if w == -1:
w = int(h / frame_h * frame_w)
if h == -1:
h = int(w / frame_w * frame_h)
frames = [cv2.resize(f, (w, h)) for f in frames]

# init visualizer
out_type = 'gif' if osp.splitext(out_filename)[1] == '.gif' else 'video'
vis_backends_cfg = [
dict(
type='LocalVisBackend',
out_type=out_type,
save_dir='demo',
fps=fps)
]
visualizer = ActionVisualizer(
vis_backends=vis_backends_cfg, save_dir='place_holder')
visualizer = ActionVisualizer()
visualizer.dataset_meta = dict(classes=labels)

text_cfg = {'colors': font_color}
@@ -124,11 +98,15 @@

visualizer.add_datasample(
out_filename,
frames,
video_path,
data_sample,
draw_pred=True,
draw_gt=False,
text_cfg=text_cfg)
text_cfg=text_cfg,
fps=fps,
out_type=out_type,
out_path=osp.join('demo', out_filename),
target_resolution=target_resolution)


def main():
70 changes: 70 additions & 0 deletions demo/demo_inferencer.py
@@ -0,0 +1,70 @@
# Copyright (c) OpenMMLab. All rights reserved.
from argparse import ArgumentParser

from mmaction.apis.inferencers import MMAction2Inferencer


def parse_args():
parser = ArgumentParser()
parser.add_argument(
'inputs', type=str, help='Input video file or rawframes folder path.')
parser.add_argument(
'--vid-out-dir',
type=str,
default='',
help='Output directory of videos.')
parser.add_argument(
'--rec',
type=str,
default=None,
help='Pretrained action recognition algorithm. It\'s the path to the '
'config file or the model name defined in metafile.')
parser.add_argument(
'--rec-weights',
type=str,
default=None,
help='Path to the custom checkpoint file of the selected recog model. '
'If it is not specified and "rec" is a model name of metafile, the '
'weights will be loaded from metafile.')
parser.add_argument(
'--label-file', type=str, default=None, help='label file for dataset.')
parser.add_argument(
'--device',
type=str,
default=None,
help='Device used for inference. '
'If not specified, the available device will be automatically used.')
parser.add_argument(
'--batch-size', type=int, default=1, help='Inference batch size.')
parser.add_argument(
'--show',
action='store_true',
help='Display the video in a popup window.')
parser.add_argument(
'--print-result',
action='store_true',
help='Whether to print the results.')
parser.add_argument(
'--pred-out-file',
type=str,
default='',
help='File to save the inference results.')

call_args = vars(parser.parse_args())

init_kws = ['rec', 'rec_weights', 'device', 'label_file']
init_args = {}
for init_kw in init_kws:
init_args[init_kw] = call_args.pop(init_kw)

return init_args, call_args


def main():
init_args, call_args = parse_args()
mmaction2 = MMAction2Inferencer(**init_args)
mmaction2(**call_args)


if __name__ == '__main__':
main()
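The `parse_args()` above parses everything into one namespace and then pops the four initializer keywords out, leaving only the per-call options. This pop-based split can be exercised standalone (a sketch of the same pattern with a reduced argument set):

```python
from argparse import ArgumentParser


def split_args(argv):
    """Parse argv, then split options into constructor kwargs vs. call
    kwargs, mirroring the pop-based pattern in demo_inferencer.py."""
    parser = ArgumentParser()
    parser.add_argument('inputs', type=str)
    parser.add_argument('--rec', type=str, default=None)
    parser.add_argument('--rec-weights', type=str, default=None)
    parser.add_argument('--device', type=str, default=None)
    parser.add_argument('--label-file', type=str, default=None)
    parser.add_argument('--batch-size', type=int, default=1)

    call_args = vars(parser.parse_args(argv))
    init_kws = ['rec', 'rec_weights', 'device', 'label_file']
    # pop() moves each init keyword out, leaving only per-call options.
    init_args = {kw: call_args.pop(kw) for kw in init_kws}
    return init_args, call_args


init_args, call_args = split_args(
    ['demo.mp4', '--rec', 'tsn', '--batch-size', '2'])
```

Keeping one parser and splitting afterwards means a new option only needs to be added in one place, plus (if it belongs to construction) one entry in `init_kws`.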
1 change: 1 addition & 0 deletions mmaction/apis/__init__.py
@@ -1,6 +1,7 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .inference import (detection_inference, inference_recognizer,
init_recognizer, pose_inference)
from .inferencers import * # NOQA

__all__ = [
'init_recognizer', 'inference_recognizer', 'detection_inference',
5 changes: 5 additions & 0 deletions mmaction/apis/inferencers/__init__.py
@@ -0,0 +1,5 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .actionrecog_inferencer import ActionRecogInferencer
from .mmaction2_inferencer import MMAction2Inferencer

__all__ = ['ActionRecogInferencer', 'MMAction2Inferencer']