This code generates the dataset from the parent datasets. If you just want to use ASRL as a training bed, you can skip this section; see data.
Very briefly, the process is as follows:
- Add semantic roles to the captions in AC.
- Preprocess AE. In particular, resize all the proposal and ground-truth bounding boxes (required for the SPAT/TEMP settings).
- Preprocess the features and keep only 5 proposals per frame for the GT5 setting.
- Obtain the bounding boxes and category names from AE for the relevant phrases.
- Filter out some verbs like "is", "are", "complete", "begin".
- Filter some SRL arguments based on frequency.
- Get the training/validation/test videos.
- Do contrastive sampling and store the dictionary files for easier sampling during training.
First, download the relevant files. Optionally, specify the folder where the data should be downloaded:
```
bash download_asrl_parent_ann.sh [save_point]
```
The folder should look like:

```
anet_cap_ent_files
|-- anet_captions_all_splits.json  (AC captions)
|-- anet_entities_test_1.json
|-- anet_entities_test_2.json
|-- anet_entities_val_1.json
|-- anet_entities_val_2.json
|-- cap_anet_trainval.json         (AE train annotations)
|-- dic_anet.json                  (train/valid/test video splits for AE)
```
Use the SRL labeling system from AllenAI (should take ~15 mins) to add the semantic roles to the captions from AC:

```
cd $ROOT
python dcode/sem_role_labeller.py
```
This will create `$ROOT/cache_dir` and store the output SRL files, which should look like:

```
cache_dir/
|-- SRL_Anet
    |-- SRL_Anet_bert_cap_annots.csv  # AC annotations in csv format to input into BERT
    |-- srl_bert_preds.pkl            # BERT outputs
```
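The SRL predictions stored in `srl_bert_preds.pkl` follow the standard BIO tagging scheme over PropBank-style roles. As a minimal, hypothetical illustration (the sentence, tags, and helper below are examples, not taken from the actual pickle), this is how a BIO tag sequence maps to argument phrases:

```python
def bio_to_spans(words, tags):
    """Collect BIO-tagged tokens into {role: phrase} argument spans."""
    spans = {}
    cur_role, cur_words = None, []
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            # a new span begins; flush any span in progress
            if cur_role is not None:
                spans.setdefault(cur_role, " ".join(cur_words))
            cur_role, cur_words = tag[2:], [word]
        elif tag.startswith("I-") and cur_role == tag[2:]:
            cur_words.append(word)  # continue the current span
        else:
            # "O" tag or inconsistent continuation: flush
            if cur_role is not None:
                spans.setdefault(cur_role, " ".join(cur_words))
            cur_role, cur_words = None, []
    if cur_role is not None:
        spans.setdefault(cur_role, " ".join(cur_words))
    return spans

words = "A man throws a ball to a dog".split()
tags = ["B-ARG0", "I-ARG0", "B-V", "B-ARG1", "I-ARG1",
        "B-ARG2", "I-ARG2", "I-ARG2"]
print(bio_to_spans(words, tags))
# -> {'ARG0': 'A man', 'V': 'throws', 'ARG1': 'a ball', 'ARG2': 'to a dog'}
```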
Resize the boxes in AE:

```
cd $ROOT
python dcode/preproc_anet_files.py --task='resize_boxes_ae'
```
This takes `cap_anet_trainval.json` (the main AE annotation file) as input and outputs `anet_ent_cls_bbox_trainval.json`, which contains the resized ground-truth boxes. It also resizes the proposal boxes, taking `anet_detection_vg_fc6_feat_100rois.h5` as input and producing `anet_detection_vg_fc6_feat_100rois_resized.h5`, which contains the resized proposals.
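The resizing itself is a linear rescale of box coordinates from the original frame size to the target size. A minimal sketch (the helper and frame sizes below are illustrative, not the script's actual code):

```python
def resize_box(box, orig_wh, new_wh):
    """Rescale an (x1, y1, x2, y2) box from orig (w, h) to new (w, h)."""
    sx = new_wh[0] / orig_wh[0]  # horizontal scale factor
    sy = new_wh[1] / orig_wh[1]  # vertical scale factor
    x1, y1, x2, y2 = box
    return [x1 * sx, y1 * sy, x2 * sx, y2 * sy]

# e.g. a box in a 1280x720 frame mapped into a 720x405 frame
print(resize_box([640, 360, 1280, 720], (1280, 720), (720, 405)))
# -> [360.0, 202.5, 720.0, 405.0]
```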
GT5 setting:

```
cd $ROOT
python dcode/preproc_anet_files.py --task='choose_gt_5'
```
Initially, there are 100 proposals per frame. For faster iteration, we choose only 5 proposals from each frame. If there is a ground-truth box, we include that box, and the remaining slots are filled in order of proposal score (not a fair way, but the best that could be done). If there is no ground-truth box, we choose the top-5 scoring proposals.

To compute the recall scores (as a sanity check):
```
python dcode/preproc_anet_files.py --task='compute_recall'
```
By default, it computes recall scores for GT5; you can change the proposal file for other settings.
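The GT5 selection described above can be sketched as follows (a simplified illustration over plain lists; the actual script operates on the h5 proposal files):

```python
def choose_gt5(proposals, scores, gt_idxs, k=5):
    """Keep k proposals per frame: ground-truth-matched boxes first,
    then the remaining proposals in descending order of score."""
    keep = list(gt_idxs)[:k]
    chosen = set(keep)
    # remaining proposals, highest score first
    rest = sorted((i for i in range(len(proposals)) if i not in chosen),
                  key=lambda i: scores[i], reverse=True)
    keep += rest[: k - len(keep)]
    return [proposals[i] for i in keep]

# proposal "c" is matched to a ground-truth box, so it is kept first
print(choose_gt5(["a", "b", "c", "d", "e", "f", "g"],
                 [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.6], gt_idxs=[2]))
# -> ['c', 'b', 'd', 'f', 'g']
```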
Align the SRL outputs and noun phrases from AE to create ASRL, and add the bounding boxes to the ASRL files (<1 min):

```
cd $ROOT
python dcode/asrl_creator.py
```
Now `$ROOT/data/anet_srl_files/` should look like:

```
anet_srl_files/
|-- verb_ent_file.csv     # main file with SRLs, BBoxes
|-- verb_lemma_dict.json  # dictionary mapping verbs to their lemmas
```
Use the train/val videos from AE to create the train/val/test videos for ASRL (~5-7 mins). Additionally, create the vocab file for the SRL arguments:

```
cd $ROOT
python dcode/prepoc_ds_files.py
```
This will create `anet_cap_ent_files/csv_dir`, which should look like:

```
csv_dir
|-- train.csv
|-- train_postproc.csv
|-- val.csv
|-- val_postproc.csv
```
Further, `$ROOT/data/anet_srl_files/` should now look like:

```
anet_srl_files/
|-- trn_verb_ent_file.csv  # train file
|-- val_verb_ent_file.csv  # val & test file
|-- verb_ent_file.csv
|-- verb_lemma_dict.json
```
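The frequency-based filtering of SRL arguments mentioned earlier amounts to building a vocabulary with a count threshold. A minimal sketch (the helper name and threshold are illustrative assumptions, not the script's actual code):

```python
from collections import Counter

def build_arg_vocab(arg_phrases, min_count=2):
    """Keep SRL argument tokens seen at least min_count times;
    everything else maps to <unk>. Threshold is illustrative."""
    counts = Counter(tok for phrase in arg_phrases for tok in phrase.split())
    vocab = {"<unk>": 0}
    for tok, c in counts.most_common():
        if c >= min_count:
            vocab[tok] = len(vocab)
    return vocab

# "man" and "the" occur only once, so they fall below the threshold
print(build_arg_vocab(["a man", "a dog", "the dog"]))
```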
Do contrastive sampling for the train and validation sets (~30 mins):

```
cd $ROOT
python code/contrastive_sampling.py
```
Now your `anet_srl_files` directory should look like:

```
anet_srl_files/
|-- trn_asrl_annots.csv             # used for training
|-- trn_srl_obj_to_index_dict.json  # used for CS
|-- trn_verb_ent_file.csv           # not used anymore
|-- val_asrl_annots.csv             # used for val/test
|-- val_srl_obj_to_index_dict.json  # used for CS
|-- val_verb_ent_file.csv           # not used anymore
|-- verb_ent_file.csv               # not used anymore
|-- verb_lemma_dict.json            # not used anymore
```
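The `*_srl_obj_to_index_dict.json` files exist to make contrastive sampling (CS) fast during training. Their exact schema is not shown here; as a purely illustrative sketch, assume keys of the form `<verb>_<object>` mapping to annotation row indices, so that a contrastive example shares the verb but pairs it with a different object:

```python
import random

# Hypothetical layout; the real json's schema may differ.
srl_obj_to_index = {
    "throw_ball": [0, 4],
    "throw_frisbee": [7],
    "catch_ball": [2, 9],
}

def sample_contrastive(key, index_dict, rng=random):
    """Sample a row sharing the verb of `key` but with a different object."""
    verb = key.split("_")[0]
    candidates = [i
                  for k, rows in index_dict.items()
                  if k != key and k.split("_")[0] == verb
                  for i in rows]
    return rng.choice(candidates) if candidates else None

print(sample_contrastive("throw_ball", srl_obj_to_index))
# -> 7 (the only row sharing the verb "throw" with a different object)
```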
Drive links to the processed files (generated after completing all the previous steps) are provided:

- `anet_cap_ent_files` and `anet_srl_files`: https://drive.google.com/open?id=1mH8TyVPU4w7864Hxiukzg8dnqPIyBuE3
- `SRL_Anet`: https://drive.google.com/open?id=1vGgqc8_-ZBk3ExNroRP-On7ArWN-d8du
- resized proposal h5 files: https://drive.google.com/open?id=1a6UOK90Epz7n-dncKAeFDQP4TBgqdTS9
- fc6_feats_5rois: https://drive.google.com/open?id=13tvBIEAgv4VS5dqkZBK1gvTI_Z22gRLM
Alternatively, you can download these files with `download_asrl_parent_ann.sh` by passing `asrl_proc_files`:

```
bash download_asrl_parent_ann.sh asrl_proc_files
```