The i3d-rgb-tf model performs video classification and is based on the paper "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset". It uses an RGB input stream and was trained on the Kinetics-400 dataset. Its weights were initialized from an Inception v1 model pre-trained on the ImageNet dataset.
The model was originally distributed as a checkpoint file and was converted to a frozen graph. The steps below reproduce that conversion:
- Clone or download the original repository:
git clone https://github.com/deepmind/kinetics-i3d.git
- (Optional) Check out the commit that the conversion was tested on:
git checkout 0667e88
- Install the prerequisites. The conversion was tested with:
tensorflow==1.11 tensorflow-probability==0.4.0 dm-sonnet==1.26
- Copy the <omz_dir>/models/public/i3d-rgb-tf/freeze.py script to the root directory of the original repository and run it:
python freeze.py
Metric | Value |
---|---|
Type | Action recognition |
GFLOPs | 278.981 |
MParams | 12.69 |
Source framework | TensorFlow* |
Accuracy was measured on the validation part of the Kinetics-400 dataset. The "subset 400" column reports results on a subset of 400 randomly chosen videos from this dataset.
Metric | Converted Model | Converted Model (subset 400) |
---|---|---|
Top 1 | 65.96% | 64.83% |
Top 5 | 86.01% | 84.58% |
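
To make the two metrics concrete, here is a minimal NumPy sketch of top-k accuracy, assuming the usual definition: a clip counts as correct when its ground-truth class is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy data are illustrative only and are not part of the Open Model Zoo validation tooling.

```python
import numpy as np


def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label is among the k best-scoring classes.

    probs:  array of shape (N, num_classes) with per-class scores.
    labels: array of shape (N,) with integer ground-truth class indices.
    """
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    # argsort is ascending, so the last k columns hold the k largest scores.
    top_k = np.argsort(probs, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy example with 3 samples and 3 classes (not real model output).
toy_probs = np.array([[0.1, 0.7, 0.2],
                      [0.5, 0.3, 0.2],
                      [0.2, 0.3, 0.5]])
toy_labels = np.array([1, 1, 0])
top1 = top_k_accuracy(toy_probs, toy_labels, k=1)
top2 = top_k_accuracy(toy_probs, toy_labels, k=2)
```

For this model, `probs` would be the stacked 400-class outputs for the evaluated clips, with `k=1` and `k=5`.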
Original model:

Video clip, name - Placeholder, shape - 1, 79, 224, 224, 3, format is B, D, H, W, C, where:
- B - batch size
- D - duration of input clip
- H - height
- W - width
- C - channel

Channel order is RGB. Mean value - 127.5, scale value - 127.5.

Converted model:

Video clip, name - Placeholder, shape - 1, 79, 224, 224, 3, format is B, D, H, W, C, where:
- B - batch size
- D - duration of input clip
- H - height
- W - width
- C - channel

Channel order is RGB.
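
The mean and scale values listed above can be applied as in the following sketch. It assumes the common convention `(pixel - mean) / scale`, which maps 8-bit pixel values to the range [-1, 1]; the section itself does not spell out the formula. It also assumes the frames are already decoded, resized to 224x224, and in RGB order. The helper name `preprocess_clip` is hypothetical.

```python
import numpy as np

MEAN = 127.5
SCALE = 127.5
CLIP_SHAPE = (79, 224, 224, 3)  # D, H, W, C


def preprocess_clip(frames):
    """Normalize one RGB clip and add the batch dimension.

    frames: array of shape (79, 224, 224, 3) with pixel values in [0, 255].
    Returns a float32 array of shape (1, 79, 224, 224, 3).
    """
    clip = np.asarray(frames, dtype=np.float32)
    if clip.shape != CLIP_SHAPE:
        raise ValueError(f"expected shape {CLIP_SHAPE}, got {clip.shape}")
    # Assumed convention: subtract the mean, then divide by the scale.
    clip = (clip - MEAN) / SCALE
    # Prepend the batch axis to get the B, D, H, W, C layout.
    return clip[np.newaxis, ...]


# Usage with a synthetic all-white clip (placeholder data, not a real video).
white_clip = np.full(CLIP_SHAPE, 255, dtype=np.uint8)
batch = preprocess_clip(white_clip)
```

Apply this step only where the input description lists mean and scale values; feeding already-normalized data through it a second time would distort the input.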
Action classifier according to Kinetics-400 action classes, name - Softmax, shape - 1, 400, format is B, C, where:
- B - batch size
- C - predicted probabilities for each class in [0, 1] range

This description applies to both the original and the converted model.
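
As an illustration of how the output can be read, the sketch below picks the most probable classes from a 1 x 400 probability array. The function name `top_k_classes` is hypothetical, and the probabilities are synthetic. Mapping class indices to Kinetics-400 action names needs a label list, which this section does not describe and which is left out here.

```python
import numpy as np


def top_k_classes(probs, k=5):
    """Return the k most probable (class_index, probability) pairs.

    probs: array of shape (1, num_classes), as produced for a single clip.
    """
    probs = np.asarray(probs)
    if probs.ndim != 2 or probs.shape[0] != 1:
        raise ValueError(f"expected shape (1, num_classes), got {probs.shape}")
    scores = probs[0]
    # Sort ascending, reverse to descending, keep the first k indices.
    order = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in order]


# Synthetic output with three non-zero classes (not real model output).
fake_output = np.zeros((1, 400), dtype=np.float32)
fake_output[0, 7] = 0.6
fake_output[0, 3] = 0.3
fake_output[0, 100] = 0.1
best = top_k_classes(fake_output, k=3)
```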
You can download models and, if necessary, convert them into Inference Engine format using the Model Downloader and other automation tools, as shown in the examples below.
An example of using the Model Downloader:
omz_downloader --name <model_name>
An example of using the Model Converter:
omz_converter --name <model_name>
The original model is distributed under the Apache License, Version 2.0. A copy of the license is provided in <omz_dir>/models/public/licenses/APACHE-2.0.txt.