FAQs
It's there!!!
We haven't released the training code yet, but we will do so once we run a functional test.
Our program generate_proposals.py explicitly says that an HDF5 file with C3D features for all the frames in the video is needed to run successfully. It sounds like an excessive requirement, but take a look at our visual encoder interface before getting sad/mad 😄.
The visual encoder interface was designed to deal with different striding and sampling schemes. Therefore, we have two solutions for you:
- Are you an easy-going person? Here you go:
  - Get C3D features every eight frames and use NaNs for the other frames.
  - Save the HDF5 file enabling compression flags and chunked storage so you won't waste space (see the sketch after this list).
  - Run our demo program as it is.

  Take a look at the `c3d_features` dataset of the `video_test_0000541` group in the sample HDF5 file. You will find a lot of NaN values, but they do not break the code 😉.
- Do you like principled solutions?
  - Change the `f_stride` of the visual encoder according to your desired sampling scheme.
  - Make sure that the `duration` used to request features is set up according to your sampling scheme.
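For the easy-going route, here is a minimal sketch of building such an HDF5 file with h5py; the frame count, feature dimension, and random values are made-up placeholders standing in for real C3D features:

```python
import h5py
import numpy as np

# Hypothetical video: 800 frames, 500-d C3D features sampled every 8 frames.
num_frames, feat_dim, stride = 800, 500, 8
features = np.full((num_frames, feat_dim), np.nan, dtype=np.float32)
# Random values stand in for the real C3D features at the sampled frames.
features[::stride] = np.random.rand(len(range(0, num_frames, stride)), feat_dim)

with h5py.File('sample_c3d.hdf5', 'w') as fid:
    group = fid.create_group('video_test_0000541')
    # Chunked + gzip storage keeps the NaN padding cheap on disk.
    group.create_dataset('c3d_features', data=features,
                         chunks=(stride, feat_dim), compression='gzip')
```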
I tried your best model with the stable version of Lasagne (0.1) using Theano 0.8.2, and I got different results.
Ensure that you are using the same environment as our software stack and that you followed the installation steps described here, especially the steps involving pip. You can find an explanation for this behavior here.
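As a quick sanity check, this minimal sketch prints the versions installed in your environment (we tested with Theano 0.8.2; for Lasagne, use whatever the installation steps pin, since the stable 0.1 release gives different results):

```python
import lasagne
import theano

# We tested with Theano 0.8.2; the stable Lasagne release (0.1) is known
# to give different results, so match the version from the install steps.
print('Theano:', theano.__version__)
print('Lasagne:', lasagne.__version__)
```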
Could you share with us the evaluation script used to obtain the Average Recall curve and the Recall vs. tIoU curve?
You can find scripts to compute both curves in this IPython Notebook.
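If you only need the core computation, here is a minimal sketch of recall at a fixed tIoU threshold; it illustrates the metric rather than reproducing the notebook, and the toy segments are made up:

```python
import numpy as np

def tiou(segment, proposals):
    """Temporal IoU between one segment and an (n, 2) array of proposals."""
    t_init = np.maximum(segment[0], proposals[:, 0])
    t_end = np.minimum(segment[1], proposals[:, 1])
    intersection = np.clip(t_end - t_init, 0, None)
    union = ((segment[1] - segment[0])
             + (proposals[:, 1] - proposals[:, 0]) - intersection)
    return intersection / union

def recall_at_tiou(ground_truth, proposals, threshold=0.5):
    """Fraction of ground-truth segments matched by >= 1 proposal."""
    hits = [tiou(gt, proposals).max() >= threshold for gt in ground_truth]
    return float(np.mean(hits))

# Toy example: two annotations and three proposals, in frame units.
annotations = np.array([[10, 50], [80, 120]])
proposals = np.array([[5, 45], [60, 90], [100, 140]])
print(recall_at_tiou(annotations, proposals, threshold=0.5))  # 0.5
```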
- DAPs proposal results on the THUMOS-14 test set with annotations here.
- DAPs proposal results on the ActivityNet v1.2 validation set here.
Please download the numbers from the following URLs: Figure 4a, Figure 4b. Additionally, we have shared an IPython Notebook where you can find hints on how to read and plot the data.
Sure, download it from here. If you want to keep the same colors and style, check out this notebook.
Figure 5 compares the Average Recall on two different datasets: ActivityNet version 1.2 and THUMOS-14.
- THUMOS-14 (green line) corresponds to the results of our model on the videos from the test set of THUMOS-14.
- ActivityNet (purple line) corresponds to the results of our model on the videos from the validation set of ActivityNet.
- ActivityNet ∩ THUMOS-14 (yellow line) corresponds to the results of our model on the videos from the validation set of ActivityNet that contain instances whose action label matches any of the action labels in THUMOS-14.
- ActivityNet <= 1024 frames (pink line) corresponds to the results of our model on the videos from the validation set of ActivityNet whose annotations span up to 1024 frames. Note that the videos themselves can be longer than 1024 frames.
Notes
- The last two subsets are disjoint, i.e., the action labels associated with instances appearing in videos from ActivityNet <= 1024 frames do not match any action label in THUMOS-14.
- Figure 5 shows the performance of a model trained using the videos in the validation set of THUMOS-14.
The following Python script saves the subsets of ActivityNet to disk as CSV files. Make sure you have pandas and requests installed before executing it.
```python
import io

import pandas as pd
import requests

# Retrieve ActivityNet v1.2 annotations (ground truth)
ground_truth_url = ('https://gist.githubusercontent.com/escorciav/'
                    'f21e798a9bab759b583864c8994ec63f/raw/'
                    '45682887f395dcf6c80fd6608404fc78ce12b75b/'
                    'activitynet_v1-2_val_groundtruth.csv')
s = requests.get(ground_truth_url).content
ground_truth = pd.read_csv(io.StringIO(s.decode('utf-8')), sep=' ')

# Define a couple of constants
ANET_OVERLAP_THUMOS14 = [159, 82, 233, 224, 195, 116, 80, 106, 169]
MAX_DURATION = 1024

# Get subsets
idx_similar_length = (ground_truth['f-end'] - ground_truth['f-init']) <= MAX_DURATION
idx_overlapped = ground_truth['label-idx'].isin(ANET_OVERLAP_THUMOS14)
df_similar_length = ground_truth.loc[idx_similar_length & (~idx_overlapped), :]
df_overlap_thumos14 = ground_truth.loc[idx_similar_length & idx_overlapped, :]

# Dump each subset to disk as a CSV file
for df, filename in [(df_similar_length, 'subset_similar_length.csv'),
                     (df_overlap_thumos14, 'subset_overlap_thumos14.csv')]:
    df.to_csv(filename, index=False, sep=' ')
```
Do you want all the proposals for a bunch of videos?
Create a CSV file with the video names (one per line) and do something like:
```bash
# Lines 2..215: skip the header row (line 1) of your CSV.
for i in {2..215}; do
  video_name=$(sed -n "${i}p" [your CSV file])
  generate_proposals.py -iv "$video_name" -ic3d [HDF5 file with C3D] -imd [NPZ file with DAPs model]
done
```
Do you want to use the IPython Notebook above, but you have a bunch of CSV files?
```bash
touch all_results
for i in *.csv; do
  # Append every line except the header (line 1) of each CSV.
  sed -n "2,$(wc -l "$i" | awk '{print $1}')p" "$i" >> all_results
done
mv all_results all_results.csv
```
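Alternatively, a minimal pandas sketch that does the same merge, assuming every CSV shares the same header and the space-separated format used elsewhere in this FAQ:

```python
import glob

import pandas as pd

# Read every CSV in the current directory and drop the duplicated headers,
# mirroring what the shell loop above does with sed.
frames = [pd.read_csv(name, sep=' ') for name in glob.glob('*.csv')]
pd.concat(frames, ignore_index=True).to_csv('all_results.csv',
                                            index=False, sep=' ')
```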