Question about using the pre-trained model on my own video #32
Comments
@pandababyer Thanks for your interest in our work. Could you check if you have dic_anet.json set up appropriately?
@LuoweiZhou Thanks for your reply. Maybe my question was not clear. When I use the code in detectron-vlp to extract features for one video, I get dets_num = np.zeros((1, 10)). But line 184 of dataloader_anet.py in GVD, num_proposal = int(self.num_proposals[ix]), then raises the error "TypeError: only size-1 arrays can be converted to Python scalars".
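The error above can be reproduced in isolation. The following is only a minimal sketch, assuming the dataloader expects one proposal count per video (a 1-D array) while the extracted file stores a (1, 10) array; the array names and the helper `to_count` are hypothetical and only mirror the quoted line.

```python
import numpy as np

# Hypothetical arrays that only mirror the shapes quoted in this thread.
num_proposals_2d = np.zeros((1, 10))  # shape reported for the extracted dets_num
num_proposals_1d = np.zeros((1,))     # assumed shape: one count per video

ix = 0  # index of the video


def to_count(num_proposals, ix):
    """Mimic `int(self.num_proposals[ix])` from the quoted dataloader line."""
    return int(num_proposals[ix])


# Indexing a (1, 10) array with ix yields a length-10 row,
# which int() cannot convert to a single Python scalar.
try:
    to_count(num_proposals_2d, ix)
    raised_type_error = False
except TypeError:
    raised_type_error = True

# Indexing a 1-D array yields a single value, so int() succeeds.
count = to_count(num_proposals_1d, ix)
```

Under these assumptions the failure comes purely from the array's shape, not from its contents.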
@pandababyer It turned out there is a minor bug in the feature extraction file:
@LuoweiZhou Thanks for your quick reply, the problem is solved! My last question is about the environment configuration for the anet2016-cuhk-feature project. I tried Ubuntu 16 and Ubuntu 14 but always get the error "Cannot use GPU in CPU-only Caffe: check mode." The ResNet feature output has shape (n, 2048), which is right, but the BN feature output has shape (0, 1024). I think the problem occurs when building dense_flow. What are the configuration details, and would it be possible to provide a Dockerfile? Thanks a lot!
@LuoweiZhou Thanks for sharing the great work! I ran into the same problem.
@ycxia Hello, could you please tell me how you did the feature extraction with the anet2016-cuhk-feature repo? I have been stuck on this for a long time. Thank you.
@pandababyer Just follow build_all.sh. Then run: python2 examples/extract_feature_activitynet.py data/ --use_flow
@ycxia Sorry, I haven't solved the problem. I use the cuda8.0-cudnn5-devel-ubuntu14.04 Dockerfile and build the project with build_all.sh, but it always shows: Cannot use GPU in CPU-only Caffe: check mode.
@ycxia It turned out the frame index was missing; the feature extraction code has been updated to include it. Besides, it seems that in your case self.max_proposal is 10 rather than 1000 somehow.
@LuoweiZhou Thanks for your reply. I printed self.max_proposal and its value is 1000. So I changed the shape from dets_labels = np.zeros((N, fpv, 100, 6)) to dets_labels = np.zeros((N, fpv, 100, 7)). But there was another error when executing main.py of the grounded video description code, so I changed extract_features_gvd_anet.py at https://github.com/LuoweiZhou/detectron-vlp/blob/master/tools/extract_features_gvd_anet.py#L271 There are no other errors now!
@ycxia Thank you for your feedback. We will need to reshape dets_labels to N*(fpv*100)*7. Please see this commit for the fix: LuoweiZhou/detectron-vlp@9ecc981
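The reshape described above, from (N, fpv, 100, 7) to N*(fpv*100)*7, can be sketched with NumPy as follows. This is only an illustration, not the code from the linked commit: the helper name `flatten_frame_axis` is hypothetical, and N = 1 and fpv = 10 are assumed values chosen to match the shapes mentioned in this thread.

```python
import numpy as np


def flatten_frame_axis(dets_labels):
    """Merge the frame axis and the per-frame ROI axis into one proposal axis.

    Input shape:  (N, fpv, rois_per_frame, dim)
    Output shape: (N, fpv * rois_per_frame, dim)
    """
    n, fpv, rois_per_frame, dim = dets_labels.shape
    return dets_labels.reshape(n, fpv * rois_per_frame, dim)


# Assumed illustrative sizes: 1 video, 10 frames per video, 100 ROIs, 7 values.
N, fpv = 1, 10
dets_labels = np.arange(N * fpv * 100 * 7, dtype=np.float64).reshape(N, fpv, 100, 7)

flat = flatten_frame_axis(dets_labels)
```

With these sizes the merged axis has 10 * 100 = 1000 entries, which is consistent with the self.max_proposal value of 1000 reported earlier. The reshape keeps frame order: the proposal at frame f, ROI r ends up at index f * 100 + r.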
Hello, thanks for sharing the great work, it is very helpful! I am using your pre-trained model to generate descriptions for my own video, and I used the code you provide to extract the features (_segment.npy, anet_detection_vg_fc6_feat_100rois.h5, bn.npy and resnet.npy). However, when I use them to generate the caption, it says 'TypeError: only size-1 arrays can be converted to Python scalars'. I found that the cause is a difference between the anet_detection_vg_fc6_feat_100rois.h5 you provide and the anet_detection_vg_fc6_feat_100rois.h5 file I generate with the code in detectron-vlp: the dimensions of dets_num, dets_labels and the other entries in the detectron-vlp output differ from those in the .h5 file you provide. https://github.com/LuoweiZhou/detectron-vlp/blob/b9140d298538703205fd2c0421b06c4b40e00018/tools/extract_features_gvd_anet.py#L221
Looking forward to your reply. Thanks!