Introduction: This repository applies the Temporal Relation Network (TRN) to the ASLLRP dataset. The goal is to recognize sign labels, limited to lexical signs. In addition to the TRN framework, keypoint detection, hand detection, and left/right hand recognition are also included.
Note: always use git clone --recursive https://github.com/metalbubble/TRN-pytorch
to clone this project; otherwise you will not be able to use the Inception-series CNN architectures.
- PyTorch 1.3
- Weights & Biases
- Detectron2
- Extract frames.
python extract_frame.py
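A minimal sketch of what a frame-extraction step like this typically does, assuming OpenCV (the paths and the img_00001.jpg naming are assumptions, not necessarily what extract_frame.py uses):

```python
# Hypothetical frame extraction with OpenCV; paths and naming are illustrative.
import os
import cv2

def extract_frames(video_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # img_00001.jpg-style naming is a common TSN convention (assumed here).
        cv2.imwrite(os.path.join(out_dir, "img_%05d.jpg" % (count + 1)), frame)
        count += 1
    cap.release()
    return count

# extract_frames("videos/session_01.mp4", "frames/session_01")
```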
- Cut videos to utterance clips.
python cut_video_to_clip.py
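One way to cut clips, sketched with ffmpeg via subprocess (the helper and the timestamps are hypothetical; the real start/end times come from the ASLLRP annotations):

```python
# Hypothetical clip cutting with ffmpeg; times would come from the annotations.
import subprocess

def cut_clip(src, dst, start_sec, end_sec):
    subprocess.run(
        ["ffmpeg", "-y", "-i", src,
         "-ss", str(start_sec), "-to", str(end_sec),
         "-c", "copy", dst],
        check=True)

# cut_clip("videos/session_01.mp4", "clips/utterance_0001.mp4", 12.4, 15.9)
```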
- Delete blank frames.
python delete_unvalid_frame.py
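A plausible blank-frame test, sketched below: treat frames with near-zero pixel variance as blank (the threshold is a guess, not the value delete_unvalid_frame.py uses):

```python
# Hypothetical blank-frame removal; the variance threshold is a guess.
import os
import cv2
import numpy as np

def delete_blank_frames(frame_dir, std_threshold=2.0):
    for name in sorted(os.listdir(frame_dir)):
        path = os.path.join(frame_dir, name)
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        # Unreadable or almost-uniform images are treated as blank.
        if img is None or np.std(img) < std_threshold:
            os.remove(path)
```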
- Generate vocabulary and handshape list.
python generate_vocabulary.py
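The vocabulary step boils down to collecting the unique labels; a sketch under assumed annotation keys and file names:

```python
# Hypothetical vocabulary writer; the "label" key and file name are assumed.
def write_vocabulary(annotations, path="vocabulary.txt"):
    labels = sorted({a["label"] for a in annotations})
    with open(path, "w") as f:
        f.write("\n".join(labels) + "\n")
    return labels

# write_vocabulary([{"label": "BOOK"}, {"label": "HOUSE"}])
```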
- Parse the annotation XML file and generate a JSON file for each utterance video. Note: you can also include handshape annotations in this step; I instead do so in process_dataset_dai_hand.py.
python parse_sign.py
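A sketch of the XML-to-JSON idea, assuming one element per utterance with per-sign children (the tag and attribute names are guesses; the real ASLLRP schema may differ):

```python
# Hypothetical XML -> JSON conversion; the ASLLRP schema names are guesses.
import json
import xml.etree.ElementTree as ET

def parse_annotations(xml_path, out_prefix="utterance"):
    root = ET.parse(xml_path).getroot()
    for i, utt in enumerate(root.iter("UTTERANCE")):
        signs = [{"label": s.get("LABEL"),
                  "start": s.get("START_FRAME"),
                  "end": s.get("END_FRAME")}
                 for s in utt.iter("SIGN")]
        with open("%s_%04d.json" % (out_prefix, i), "w") as f:
            json.dump(signs, f, indent=2)
```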
- Cut the utterance videos into sign videos and reorganize them by label. Note: since a label name might include '/', it is necessary to replace '/' with 'or'.
python GenerateClassificationFolders.py
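The '/'-to-'or' substitution and the per-class folder layout might look like this (helper names and the layout are illustrative):

```python
# Hypothetical reorganization into one folder per class label.
import os
import shutil

def class_dir(root, label):
    # '/' would be read as a path separator, so replace it with 'or'.
    return os.path.join(root, label.replace("/", "or"))

def move_sign_video(video_path, root, label):
    dst = class_dir(root, label)
    os.makedirs(dst, exist_ok=True)
    shutil.move(video_path, dst)

# move_sign_video("clips/utterance_0001_sign3.mp4", "dataset", "FINISH/DONE")
```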
- Generate the category list and the train/test lists. Each row in a train/test list is [video_id, num_frames, class_idx].
python process_dataset_dai.py
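The list format itself is simple; a sketch of a writer for the three-column rows (the separator and file name are assumptions):

```python
# Hypothetical split-list writer for [video_id, num_frames, class_idx] rows.
def write_split(rows, path):
    with open(path, "w") as f:
        for video_id, num_frames, class_idx in rows:
            f.write("%s %d %d\n" % (video_id, num_frames, class_idx))

# write_split([("BOOK_0001", 42, 7)], "dai_train_list.txt")
```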
To compute the optical flow image of each frame, run python extract_of.py
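For reference, a hedged optical-flow sketch using OpenCV's Farneback method (extract_of.py may use a different algorithm; the clip-and-rescale convention shown is a common TSN-style choice, not confirmed here):

```python
# Hypothetical optical-flow computation; algorithm and scaling are assumptions.
import cv2
import numpy as np

def flow_image(prev_gray, curr_gray, bound=20.0):
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    # Clip to [-bound, bound] and rescale to 0..255 for saving as an image.
    flow = np.clip(flow, -bound, bound)
    return ((flow + bound) / (2 * bound) * 255).astype(np.uint8)  # (H, W, 2)
```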
- Pretrain on the EgoHands dataset (a Detectron2 training sketch covering this and the fine-tuning step follows below).
python Hand/train_detectron2_ego.py
- Fine-tune on my annotated images from the ASLLRP dataset.
python Hand/findtune_asl_detectron2.py
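Both detector-training steps follow the standard Detectron2 recipe; a hedged sketch of the general shape is below (the dataset registration names, iteration count, and paths are assumptions, and fine-tuning differs mainly in the starting weights and training set):

```python
# Generic Detectron2 hand-detector training sketch (illustrative values).
import os
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
# Pretraining: start from COCO weights and train on EgoHands.
# Fine-tuning: point WEIGHTS at the EgoHands checkpoint and swap the dataset.
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("egohands_train",)  # assumes the dataset is registered
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1       # a single "hand" class
cfg.SOLVER.IMS_PER_BATCH = 4
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 5000                # illustrative
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```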
- Train the keypoint detection model.
python Hand/train_detectron2_keypoint.py
- Classify left and right hands by associating the left and right wrist keypoints with the detected bounding boxes. Results are saved in root/dai_lexical_handbox; each file contains the bounding boxes in [left, right] order.
python Hand/classify_leftright_asl_detectron2.py
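The association rule amounts to a nearest-box match; a sketch (array shapes and names are illustrative):

```python
# Hypothetical left/right assignment: each wrist keypoint gets the nearest box.
import numpy as np

def assign_left_right(boxes, left_wrist, right_wrist):
    """boxes: (N, 4) array of [x1, y1, x2, y2]; wrists: (x, y) coordinates."""
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2.0,
                        (boxes[:, 1] + boxes[:, 3]) / 2.0], axis=1)
    left_idx = np.argmin(np.linalg.norm(centers - np.asarray(left_wrist), axis=1))
    right_idx = np.argmin(np.linalg.norm(centers - np.asarray(right_wrist), axis=1))
    return [boxes[left_idx], boxes[right_idx]]  # saved in [left, right] order
```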
Then we are able to generate a train/test list in which each row is [video_id, num_frames, class_idx, right_start_hs, left_start_hs, right_end_hs, left_end_hs].
python process_dataset_dai_hand.py
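A sketch of a writer for the extended seven-column rows (the example values are illustrative):

```python
# Hypothetical writer for the extended rows with handshape indices.
def write_hand_split(rows, path):
    with open(path, "w") as f:
        for row in rows:
            # video_id, num_frames, class_idx,
            # right_start_hs, left_start_hs, right_end_hs, left_end_hs
            f.write(" ".join(str(x) for x in row) + "\n")

# write_hand_split([("BOOK_0001", 42, 7, 12, 12, 3, 3)], "dai_hand_train_list.txt")
```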
The core code implementing the Temporal Relation Network module is TRNmodule.py. It is plug-and-play on top of TSN.
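To give a feel for what the module computes, here is a toy 2-frame relation head in PyTorch (dimensions and names are illustrative; see TRNmodule.py for the real multi-scale implementation):

```python
# Toy 2-frame temporal relation: score every ordered frame pair with a small
# MLP and sum the scores. This only illustrates the idea, not the repo's code.
import itertools
import torch
import torch.nn as nn

class PairwiseRelation(nn.Module):
    def __init__(self, feat_dim=256, num_classes=10):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, num_classes))

    def forward(self, frame_feats):  # frame_feats: (B, T, feat_dim)
        T = frame_feats.size(1)
        scores = 0
        for i, j in itertools.combinations(range(T), 2):  # i < j keeps order
            pair = torch.cat([frame_feats[:, i], frame_feats[:, j]], dim=1)
            scores = scores + self.mlp(pair)
        return scores

# logits = PairwiseRelation()(torch.randn(4, 3, 256))
```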
- The command to train the single-scale TRN on full frames. To train on hand images, use main-hand.py; to train jointly, use main-handjoint.py.
CUDA_VISIBLE_DEVICES=0 python main.py \
--dataset dai --modality RGB \
--arch BNInception --num_segments 3 \
--consensus_type TRN --batch_size 64
- The command to train the multi-scale TRN.
CUDA_VISIBLE_DEVICES=0 python main.py \
--dataset dai --modality RGB \
--arch BNInception --num_segments 3 \
--consensus_type TRNmultiscale --batch_size 64
- The command to test the single-scale TRN.
python test_video_dai.py \
--frame_folder $FRAME_FOLDER \
--test_segments 3 \
--consensus_type TRN \
--weight $WEIGHT_PATH \
--arch BNInception \
--dataset dai
- The command to test the single-scale two-stream TRN.
python test-dai-2stream.py \
--dataset rachel --modality RGB \
--arch BNInception --num_segments 3 \
--resume $RGB_WEIGHTPATH \
--resume_of $OF_WEIGHTPATH \
--consensus_type TRN --batch_size 64 \
--data_length 3 \
--evaluate
B. Zhou, A. Andonian, A. Oliva, and A. Torralba. Temporal Relational Reasoning in Videos. European Conference on Computer Vision (ECCV), 2018.
@article{zhou2017temporalrelation,
    title = {Temporal Relational Reasoning in Videos},
    author = {Zhou, Bolei and Andonian, Alex and Oliva, Aude and Torralba, Antonio},
    journal = {European Conference on Computer Vision},
    year = {2018}
}