question about use pre-train model on my own video #32

pandababyer · 2020-08-04T02:48:58Z

Hello, thanks for sharing this great work; it is very helpful! I am using your pre-trained model to generate descriptions for my own video. I used the code you provide to extract the features (_segment.npy, anet_detection_vg_fc6_feat_100rois.h5, bn.npy, and resnet.npy). However, when I use them to generate a caption, I get 'TypeError: only size-1 arrays can be converted to Python scalars'. I found that the cause is a difference between the anet_detection_vg_fc6_feat_100rois.h5 file you provide and the one I generate with the code in detectron-vlp: the dimensions of dets_num, dets_labels, and other datasets produced by detectron-vlp differ from those in your .h5 file. https://github.com/LuoweiZhou/detectron-vlp/blob/b9140d298538703205fd2c0421b06c4b40e00018/tools/extract_features_gvd_anet.py#L221
Looking forward to your reply. Thanks!

LuoweiZhou · 2020-08-10T05:50:12Z

@pandababyer Thanks for your interest in our work. Could you check if you have dic_anet.json set up appropriately?
https://github.com/LuoweiZhou/detectron-vlp/blob/master/tools/extract_features_gvd_anet.py#L136
https://github.com/LuoweiZhou/detectron-vlp/blob/master/tools/extract_features_gvd_anet.py#L207
Also, the error is related to NumPy format so you may want to check if the code raises any exception when feeding in one video (in your case).

pandababyer · 2020-08-13T09:15:00Z

@LuoweiZhou Thanks for your reply. Maybe my question was not clear. When I use the code in detectron-vlp to extract features for one video, I get dets_num = np.zeros((1, 10)). But line 184 of dataloader_anet.py in GVD, num_proposal = int(self.num_proposals[ix]), then raises "TypeError: only size-1 arrays can be converted to Python scalars".
This confused me a lot. Thanks for your time!

LuoweiZhou · 2020-08-13T18:35:28Z

@pandababyer It turned out there is a minor bug in the feature extraction file:
f.create_dataset("dets_num", data=dets_num) -> f.create_dataset("dets_num", data=dets_num.sum(axis=-1))
f.create_dataset("nms_num", data=nms_num) -> f.create_dataset("nms_num", data=nms_num.sum(axis=-1))
I have made the fix. Thank you for your feedback!
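
For readers hitting the same TypeError, here is a minimal NumPy-only sketch of why the sum over the last axis helps. The sizes (1 video, 10 frames per video, 100 regions per frame) and the helper name to_num_proposal are assumptions for illustration; only the int(...[ix]) pattern and the .sum(axis=-1) fix come from this thread.

```python
import numpy as np

# Assumed sizes for illustration: 1 video, 10 sampled frames per video.
N, fpv = 1, 10

# Per-frame counts, shaped (N, fpv) as the extraction script first stored them.
# The value 100 per frame is an assumption based on the "100rois" file name.
dets_num = np.full((N, fpv), 100)

def to_num_proposal(num_proposals, ix):
    # Hypothetical helper mirroring the data loader line quoted above.
    return int(num_proposals[ix])

try:
    to_num_proposal(dets_num, 0)
    raised = False
except TypeError:
    raised = True  # a length-10 row cannot be converted to a Python int

# Summing over the frame axis leaves one total per video, shape (N,).
dets_num_fixed = dets_num.sum(axis=-1)
num_proposal = to_num_proposal(dets_num_fixed, 0)

print(raised, dets_num_fixed.shape, num_proposal)  # True (1,) 1000
```

With one value per video, indexing by ix yields a scalar, which is what int() expects.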

pandababyer · 2020-08-14T02:34:54Z

@LuoweiZhou Thanks for your quick reply; the problem is solved! My last question is about the environment configuration for the anet2016-cuhk-feature project. I tried Ubuntu 16 and Ubuntu 14 but always get the error "Cannot use GPU in CPU-only Caffe: check mode." The ResNet feature output has shape (n,2048), which is right, but the BN feature output has shape (0,1024). I think the problem occurs when building dense_flow. What are the configuration details, and would it be possible to provide a Dockerfile? Thanks a lot!
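
A quick way to catch the empty BN output before running the captioning model is to check the feature shapes right after extraction. The sketch below is only an illustration: check_features is a hypothetical helper, and the arrays stand in for whatever np.load returns for your own bn and resnet feature files. The expected widths (1024 and 2048) are the ones reported in this comment.

```python
import numpy as np

def check_features(bn, resnet):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, feat, dim in (("bn", bn, 1024), ("resnet", resnet, 2048)):
        if feat.ndim != 2 or feat.shape[1] != dim:
            problems.append(f"{name}: expected (n, {dim}), got {feat.shape}")
        elif feat.shape[0] == 0:
            problems.append(f"{name}: no segments extracted, got {feat.shape}")
    return problems

# Stand-ins for the arrays loaded from the extracted .npy files.
bn = np.zeros((0, 1024), dtype=np.float32)      # the empty case reported above
resnet = np.zeros((8, 2048), dtype=np.float32)  # 8 segments, chosen arbitrarily

for problem in check_features(bn, resnet):
    print(problem)
```

This does not fix the Caffe build; it only makes the failure visible earlier than the data loader would.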

ycxia · 2020-08-14T08:08:37Z

@LuoweiZhou Thanks for sharing this great work! I ran into the same problem.
It was solved by modifying
f.create_dataset("dets_num", data=dets_num) -> f.create_dataset("dets_num", data=dets_num.sum(axis=-1))
f.create_dataset("nms_num", data=nms_num) -> f.create_dataset("nms_num", data=nms_num.sum(axis=-1))
But another problem arises:
File "/data/yongcheng/grounded-video-description/misc/dataloader_anet.py", line 331, in __getitem__
pad_proposals[:num_pps] = proposals[:num_pps]
ValueError: could not broadcast input array from shape (10,100,6) into shape (10,7)
This confused me a lot. Thanks for your time!

pandababyer · 2020-08-14T08:16:56Z

@ycxia Hello, could you please tell me how you did the feature extraction with the anet2016-cuhk-feature repo? I have been struggling with it for a long time. Thank you.

ycxia · 2020-08-14T08:30:18Z

@pandababyer I just followed build_all.sh, then ran: python2 examples/extract_feature_activitynet.py data/ --use_flow
Good luck!
Do you know how to debug "ValueError: could not broadcast input array from shape (10,100,6) into shape (10,7)"?
Thank you!

pandababyer · 2020-08-14T08:41:20Z

@ycxia Sorry, I haven't solved that problem. I use the cuda8.0-cudnn5-devel-ubuntu14.04 Dockerfile and build the project with build_all.sh, but it always shows: Cannot use GPU in CPU-only Caffe: check mode.

LuoweiZhou · 2020-08-14T23:40:27Z

@ycxia It turned out the frame index was missing; the feature extraction code has been updated to include it.

Besides, it seems that in your case self.max_proposal is somehow 10 rather than 1000.

ycxia · 2020-08-17T01:48:31Z

@LuoweiZhou Thanks for your reply. I printed self.max_proposal and its value is 1000.
After updating the feature extraction code, there is another error:
dets_labels[i, j, :num_proposal] = proposals
ValueError: could not broadcast input array from shape (100,7) into shape (100,6)

So I changed the shape from dets_labels = np.zeros((N, fpv, 100, 6)) to dets_labels = np.zeros((N, fpv, 100, 7)).

But there is another error when executing main.py of grounded-video-description:
File "/data/yongcheng/grounded-video-description/misc/dataloader_anet.py", line 334, in __getitem__
pad_proposals[:num_pps] = proposals[:num_pps]
ValueError: could not broadcast input array from shape (10,100,7) into shape (10,7)

So I changed extract_features_gvd_anet.py at https://github.com/LuoweiZhou/detectron-vlp/blob/master/tools/extract_features_gvd_anet.py#L271
f.create_dataset("dets_labels", data=dets_labels) -> f.create_dataset("dets_labels", data=dets_labels.reshape(1,fpv*100,7))

There are no other errors!

LuoweiZhou · 2020-08-21T02:32:43Z

@ycxia Thank you for your feedback. We will need to reshape dets_labels to N*(fpv*100)*7. Please see this commit for the fix: LuoweiZhou/detectron-vlp@9ecc981
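
As a rough illustration of why the reshape resolves the broadcast error, the NumPy sketch below reproduces the shape mismatch and then the flattened layout. The sizes (N = 1, fpv = 10, 100 regions, 7 columns, max_proposal = 1000) are taken from the values mentioned in this thread; the variable names outside the quoted lines and the num_pps value are assumptions, and this is not the actual code from the commit.

```python
import numpy as np

N, fpv, rois, cols = 1, 10, 100, 7
max_proposal = 1000  # the value printed for self.max_proposal above

dets_labels = np.zeros((N, fpv, rois, cols), dtype=np.float32)
pad_proposals = np.zeros((max_proposal, cols), dtype=np.float32)

# Before the reshape, one video's entry is (fpv, rois, cols) == (10, 100, 7).
proposals = dets_labels[0]
num_pps = 10  # hypothetical; matches the leading 10 in the reported shapes
try:
    pad_proposals[:num_pps] = proposals[:num_pps]
    failed = False
except ValueError:
    failed = True  # (10, 100, 7) cannot be broadcast into (10, 7)

# After the reshape, frames and regions share a single proposal axis.
flat = dets_labels.reshape(N, fpv * rois, cols)
num_pps = min(flat.shape[1], max_proposal)
pad_proposals[:num_pps] = flat[0][:num_pps]

print(failed, flat.shape, pad_proposals.shape)  # True (1, 1000, 7) (1000, 7)
```

Using N instead of a hard-coded 1 in the reshape keeps the same layout when more than one video is written to the file.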

rohrbach closed this as completed Sep 18, 2020

yunsujeon mentioned this issue Jun 19, 2021

Does process exist to use pre-trained model on my own video? #37

Open
