
How to infer from new videos? #12

Closed
lecidhugo opened this issue Sep 16, 2019 · 5 comments

Comments

@lecidhugo

Hi @LuoweiZhou,
My question is two-fold:
1) I downloaded the pre-trained models and first tried to run the inference example you provided. I got this error:
IOError: [Errno 2] No such file or directory: u'data/anet/rgb_motion_1d/K6Tm5xHkJ5c_resnet.npy'
Below is the point where the error arises:
Loading the model save/anet-unsup-0-0-0-run1/model-best.pth...
Finetune param: ctx2pool_grd.0.weight
Finetune param: ctx2pool_grd.0.bias
Finetune param: vis_embed.0.weight

I verified that the file is indeed missing, but I do not know how to obtain it (I saw a similar issue, but I could not proceed with the provided answer as it was unclear to me). A minimal check for missing feature files is sketched at the end of this comment.
2) I am wondering how I can use your code to run inference on my own videos. Can you please guide me?
Thanks in advance
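
For reference, a minimal check for missing feature files (the feature directory comes from the error message above; the ID list file is hypothetical):

```python
import os

feat_dir = 'data/anet/rgb_motion_1d'   # directory from the error message
id_file = 'my_video_ids.txt'           # hypothetical file, one video ID per line
video_ids = [line.strip() for line in open(id_file) if line.strip()]

# collect the IDs whose per-video feature file is absent
missing = [vid for vid in video_ids
           if not os.path.isfile(os.path.join(feat_dir, vid + '_resnet.npy'))]
print('%d of %d feature files missing' % (len(missing), len(video_ids)))
for vid in missing:
    print('missing:', os.path.join(feat_dir, vid + '_resnet.npy'))
```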

@LuoweiZhou
Contributor

Hi @lecidhugo, I have updated this issue: #5
Basically, you will first need to pre-process your dataset/annotations (e.g., anet). Then, extract frame-wise features (for temporal attention) and region features (for region attention), as described in issue #5. The dataloader needs to be updated accordingly.
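
For illustration, a rough sketch of what such an updated dataloader could serve per video; the directory layout, the `_resnet.npy` suffix, and the array shapes are assumptions, not the exact GVD format:

```python
import os
import numpy as np
import torch
from torch.utils.data import Dataset

class VideoFeatureDataset(Dataset):
    """Minimal example of serving pre-extracted features for inference.
    Directory names and the '_resnet.npy' suffix are assumptions borrowed
    from the anet layout; adapt them to your own feature dumps."""

    def __init__(self, video_ids, frame_feat_dir, region_feat_dir):
        self.video_ids = video_ids
        self.frame_feat_dir = frame_feat_dir    # frame-wise features (temporal attention)
        self.region_feat_dir = region_feat_dir  # region features (region attention)

    def __len__(self):
        return len(self.video_ids)

    def __getitem__(self, idx):
        vid = self.video_ids[idx]
        # frame-wise features: one row per sampled frame (e.g., 2 fps over the whole video)
        frame_feats = np.load(os.path.join(self.frame_feat_dir, vid + '_resnet.npy'))
        # region features: proposals for the frames sampled from each segment
        region_feats = np.load(os.path.join(self.region_feat_dir, vid + '.npy'))
        return vid, torch.from_numpy(frame_feats).float(), torch.from_numpy(region_feats).float()
```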

@lecidhugo
Author

lecidhugo commented Oct 3, 2019

Hi @LuoweiZhou,
Thank you for sharing your code and for your help.
I finally reproduced your steps correctly.
However, I have not figured out how to pre-process my own videos for inference. If my understanding is correct, the pre-processing I have to do is:
1- sample frames from the video
2- compute the features of the sampled frames:
2.1- Region features: can be obtained using extract_features.py and Detectron
2.2- Frame-wise features: I have no idea how to compute them
3- use your code for inference
My goal is to do testing, so I do not need to annotate my videos. Right?
Could you please confirm whether my understanding is correct? Also, how should I do the sampling, and how can I get the frame-wise features?
Thanks in advance,

@LuoweiZhou
Contributor

Hi @lecidhugo, yes, you're right. For the frame-wise features, please refer to this answer. Note that when extracting the region features, we uniformly sample 10 frames from each video segment, while for frame-wise features we sample the entire video at 2 fps. Yes, if your end goal is inference/testing, you do not need any caption annotations.
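
For illustration, a minimal sketch of the two sampling schemes described above, using OpenCV rather than the repo's extraction scripts (function names and the fps fallback are illustrative):

```python
import cv2
import numpy as np

def sample_at_2fps(video_path):
    """Decode the whole video and keep frames at ~2 fps (for frame-wise features)."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back to 30 fps if metadata is missing
    step = max(int(round(fps / 2.0)), 1)       # keep every `step`-th frame
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % step == 0:
            frames.append(frame)
        i += 1
    cap.release()
    return frames

def sample_10_uniform(video_path, start_sec, end_sec):
    """Uniformly sample 10 frames from one segment (for region features)."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    idxs = np.linspace(start_sec * fps, end_sec * fps, num=10).astype(int)
    frames = []
    for idx in idxs:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames
```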

@lecidhugo
Author

Thank you @LuoweiZhou,
One last question, please: how can I produce segments from a video?
For example, in the following output (which I got from the log), how did you divide the video "v_K6Tm5xHkJ5c" into two segments?
segment v_K6Tm5xHkJ5c_segment_00: A woman is seen sitting in a chair holding a
segment v_K6Tm5xHkJ5c_segment_01: The woman then begins playing the accordion while looking back

@LuoweiZhou
Contributor

LuoweiZhou commented Oct 4, 2019

@lecidhugo The definition of video segments can be found here. You will see the start/end timestamp of each segment in the annotation file. For short videos, you can also directly feed them into the model. GVD captions each video (segment) independently.
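
For illustration, reading the segment timestamps from an ActivityNet-Captions-style annotation file (the path is an assumption; the JSON layout follows the public ActivityNet Captions format):

```python
import json

# Assumed ActivityNet-Captions-style layout:
# { "v_K6Tm5xHkJ5c": { "duration": ..., "timestamps": [[start, end], ...], "sentences": [...] }, ... }
ann_path = 'data/anet/val_1.json'  # hypothetical path
anns = json.load(open(ann_path))

vid = 'v_K6Tm5xHkJ5c'
for i, (start, end) in enumerate(anns[vid]['timestamps']):
    print('%s_segment_%02d: %.2f s -> %.2f s' % (vid, i, start, end))
```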
