This is the implementation of the paper "Temporal Sentence Grounding with Relevance Feedback in Videos".
- Ubuntu 20.04
- CUDA 11.7
- Python 3.7
Install the other required packages with:
pip install -r requirements.txt
This paper reconstructs the validation and test sets of two datasets widely used in the TSG domain, Charades-STA and ActivityNet Captions, to build a testing environment for the TSG-RF task: Charades-STA-RF and ActivityNet Captions-RF. The reconstructed datasets are located in the ./data/dataset directory.
The Charades-STA and ActivityNet Captions features are prepared following previous work: VSLNet Datasets Preparation. Alternatively, you can download the prepared visual features from Mega and place them in the ./data/features/ directory.
Download the word embeddings from here and place them in the ./data/features/ directory.
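Before training, it can help to confirm that the directories named above are in place. The following is a minimal sketch and not part of the repository: the helper name `missing_dirs` is hypothetical, and it only checks that `./data/dataset` and `./data/features` exist, not that their contents are complete.

```python
import os
from typing import Iterable, List

# Directories mentioned in this README, relative to the repository root.
EXPECTED_DIRS = ("data/dataset", "data/features")


def missing_dirs(root: str, expected: Iterable[str] = EXPECTED_DIRS) -> List[str]:
    """Return the expected sub-directories that do not exist under `root`.

    Hypothetical helper for illustration; the repository may organize
    its own checks differently.
    """
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories: " + ", ".join(missing))
    else:
        print("All expected data directories are present.")
```

Run it from the repository root; an empty result means both directories were found.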
Train
# train RaTSG on Charades-STA-RF dataset
bash charades_RF_train.sh
# train RaTSG on ActivityNet Captions-RF dataset
bash activitynet_RF_train.sh
Test
Run the following scripts to test the trained models:
# test RaTSG on Charades-STA-RF dataset
bash charades_RF_test.sh
# test RaTSG on ActivityNet Captions-RF dataset
bash activitynet_RF_test.sh
We release several pretrained checkpoints. Please download them and place them in ./ckpt/.
- RaTSG on Charades-STA-RF: RaTSG_charades_RF_i3d_128
- RaTSG on ActivityNet Captions-RF: RaTSG_activitynet_RF_i3d_128
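To see where each checkpoint is expected after downloading, the names listed above can be joined with `./ckpt/`. This is an illustrative sketch only: the dataset labels `charades` and `activitynet` and the helper `checkpoint_path` are assumptions introduced here, and the test scripts may locate checkpoints in their own way.

```python
import os

# Checkpoint names as listed in this README, keyed by a short dataset label.
# The labels themselves are hypothetical and chosen for illustration.
CHECKPOINTS = {
    "charades": "RaTSG_charades_RF_i3d_128",
    "activitynet": "RaTSG_activitynet_RF_i3d_128",
}


def checkpoint_path(dataset: str, ckpt_root: str = "./ckpt") -> str:
    """Build the expected location of a downloaded checkpoint."""
    if dataset not in CHECKPOINTS:
        raise KeyError("unknown dataset label: " + dataset)
    return os.path.join(ckpt_root, CHECKPOINTS[dataset])


if __name__ == "__main__":
    for label in CHECKPOINTS:
        path = checkpoint_path(label)
        # os.path.exists accepts a file or a directory, because the README
        # does not state which form the download takes.
        status = "found" if os.path.exists(path) else "missing"
        print(label + ": " + path + " (" + status + ")")
```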