
SWL-LSE: A Dataset of Spanish Sign Language Health Signs with an ISLR Baseline Method

This repository contains the code necessary to replicate the experiments outlined in the paper "SWL-LSE: A Dataset of Spanish Sign Language Health Signs with an ISLR Baseline Method."

Repository Structure

The repository is divided into two main sections, each in its own folder:

  • Mediapipe_keypoints: Contains all the scripts and models used to extract feature vectors from videos and build the datasets for training and evaluation.
  • Msg3d: Contains the original MS-G3D model from GitHub, with some modifications (see Acknowledgements).

Datasets Used

We employ three datasets:

  1. SWL-LSE: A newly created dataset, available at Zenodo.
  2. WLASL: Available at GitHub.
    • Annotation files: train, val, test
    • WLASL300C: An extension of WLASL in which the classes whose labeling presented numerous anomalies in the original version were replaced with other, better-labeled classes.
  3. ASL-Citizen: Available at Microsoft Research.

Mediapipe Keypoints: Dataset Construction, Feature Extraction, and Preprocessing

This process involves multiple steps in which the Mediapipe model is applied, data is normalized, and the feature vectors are generated to construct the final datasets. Below is the pipeline:

(Figure: pipeline for generating the dataset)

Key Scripts

generate_mediapipe.py

parser.add_argument('--folder_input_videos', required=True, type=str)      # folder containing the input videos
parser.add_argument('--pose_hands', action='store_true')                   # run the separate Pose and Hands models
parser.add_argument('--holistic', action='store_true')                     # run the Holistic model
parser.add_argument('--holistic_legacy', action='store_true')              # run the legacy Holistic solution
parser.add_argument('--folder_output_mediapipe', required=True, type=str)  # folder where the raw Mediapipe output (.pkl) is written

generate_arr_keypoints.py

parser.add_argument('--pose_hands', action='store_true')               # read Pose+Hands output
parser.add_argument('--holistic', action='store_true')                 # read Holistic output
parser.add_argument('--holistic_legacy', action='store_true')          # read legacy Holistic output
parser.add_argument('--folder_input_mediapipe', default='', type=str)  # folder with the raw Mediapipe .pkl files
parser.add_argument('--folder_output_kps', required=True, type=str)    # folder where the keypoint arrays (.npy) are written
parser.add_argument('--world', action='store_true')                    # use Mediapipe world landmarks instead of image coordinates

The default keypoints array consists of 61 keypoints (assembled as sketched below):

  • Pose Keypoints: 19
  • Hands (Left & Right): 21 each
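
As a rough illustration of how such a per-frame array could be assembled from Mediapipe Holistic output (a minimal sketch, not the repository's exact code; in particular, which 19 of Mediapipe's 33 pose landmarks are kept is an assumption here):

import numpy as np

# Hypothetical selection: the first 19 of Mediapipe's 33 pose landmarks.
# The repository may keep a different subset.
POSE_SUBSET = list(range(19))

def frame_to_keypoints(results):
    """Pack one Holistic frame into a (61, 3) array of (x, y, z)."""
    def to_arr(landmark_list, n):
        if landmark_list is None:  # missing detection -> zero-filled block
            return np.zeros((n, 3))
        return np.array([[p.x, p.y, p.z] for p in landmark_list.landmark])

    pose = to_arr(results.pose_landmarks, 33)[POSE_SUBSET]  # (19, 3)
    left = to_arr(results.left_hand_landmarks, 21)          # (21, 3)
    right = to_arr(results.right_hand_landmarks, 21)        # (21, 3)
    return np.concatenate([pose, left, right], axis=0)      # (61, 3)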

generate_features.py

parser.add_argument('--folder_in_kps', required=True, type=str)                 # folder with the keypoint arrays
parser.add_argument('--folder_out_features', required=True, type=str)           # folder where the feature vectors (.npy) are written
parser.add_argument('--type_kps', required=False, default='C4_xyzc', type=str)  # feature layout (e.g. C4_xyzc, C3_xyc)
parser.add_argument('--offset', action='store_true')                            # apply an offset to the keypoints
parser.add_argument('--normalize', action='store_true')                         # normalize the keypoints
parser.add_argument('--noFramesLimit', action='store_true')                     # do not cap the number of frames
parser.add_argument('--jump_reset', action='store_false')                       # note store_false: passing the flag sets jump_reset to False
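
The --type_kps values are not documented here. A plausible reading (an assumption, not confirmed by this README) is that C4_xyzc keeps four channels per keypoint (x, y, z, confidence) while C3_xyc drops the z-dimension, which matches Step 3.1 below. A toy sketch of that layout:

import numpy as np

# Toy keypoints: (frames, 61 keypoints, 4 channels) holding x, y, z, confidence.
kps = np.random.rand(10, 61, 4)

feat_c4 = kps                        # 'C4_xyzc': keep x, y, z, c -> 4 channels
feat_c3 = kps[..., [0, 1, 3]]        # 'C3_xyc':  drop z          -> 3 channels
print(feat_c4.shape, feat_c3.shape)  # (10, 61, 4) (10, 61, 3)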

generate_dataset.py

parser.add_argument('--folder_npy', required=True, type=str)     # folder with the feature .npy files
parser.add_argument('--folder_labels', required=True, type=str)  # folder with the train/val/test label CSVs
parser.add_argument('--folder_out', required=True, type=str)     # folder where the final dataset is written

Simple Usage Example: Video + Annotations to Dataset (Step-by-step)

A step-by-step guide to generating a dataset with the Holistic model. The path definitions below cover both normalized and non-normalized variants; the steps shown build the normalized one (the non-normalized run would presumably omit --normalize and use the NO_NORM paths).

Requirements

  1. Folder: ANNOTATIONS with files: train_labels.csv, val_labels.csv, test_labels.csv
  2. Folder: VIDEOS with .mp4 files
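
This README does not document the column layout of the label files. A minimal inspection sketch, assuming each CSV maps a video identifier to a gloss label (the column names are hypothetical):

import pandas as pd

# Hypothetical layout: one row per clip, e.g. columns 'video_id' and 'label'.
df = pd.read_csv('ANNOTATIONS/train_labels.csv')
print(df.columns.tolist())
print(df.head())
print(len(df), 'training clips')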

Define paths (BBDD_PATH is the root folder where all data is stored):

PATH_VIDEOS=BBDD_PATH/VIDEOS
PATH_MEDIAPIPE=BBDD_PATH/MEDIAPIPE
PATH_KEYPOINTS_HL=BBDD_PATH/KEYPOINTS/HL
PATH_FEATURES_HL_NORM=BBDD_PATH/FEATURES/NORM/HL
PATH_FEATURES_HL_NO_NORM=BBDD_PATH/FEATURES/NO_NORM/HL
PATH_DATASET_HL_NORM=BBDD_PATH/DATASET/NORM/HL
PATH_DATASET_HL_NO_NORM=BBDD_PATH/DATASET/NO_NORM/HL
PATH_ANNOTATIONS=BBDD_PATH/ANNOTATIONS

Step 1: Obtain Mediapipe Output (Hands & Pose, Holistic Legacy: "raw output")

python generate_mediapipe.py --pose_hands --holistic_legacy --folder_input_videos $PATH_VIDEOS --folder_output_mediapipe $PATH_MEDIAPIPE
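
Each input video should produce one .pkl of raw Mediapipe output (see Dataset Structure below). A quick sanity check that no video was skipped (a sketch; it assumes the output files keep the video's file stem):

from pathlib import Path

videos = {p.stem for p in Path('BBDD_PATH/VIDEOS').glob('*.mp4')}
outputs = {p.stem for p in Path('BBDD_PATH/MEDIAPIPE').glob('*.pkl')}
missing = sorted(videos - outputs)
print(len(missing), 'videos without Mediapipe output:', missing[:10])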

Step 2.1: Generate the Keypoints Array (from Holistic), as used in Signamed

python generate_arr_keypoints.py --holistic_legacy --folder_input_mediapipe $PATH_MEDIAPIPE --folder_output_kps $PATH_KEYPOINTS_HL

Step 3.1: Extract Features (with normalization and angle calculation, either using the z-dimension, C4_xyzc, or discarding it, C3_xyc)

python generate_features.py --type_kps C4_xyzc --offset --normalize --folder_in_kps $PATH_KEYPOINTS_HL --folder_out_features $PATH_FEATURES_HL_NORM
python generate_features.py --type_kps C3_xyc --offset --normalize --folder_in_kps $PATH_KEYPOINTS_HL --folder_out_features $PATH_FEATURES_HL_NORM

Step 4.1: Build Dataset

python generate_dataset.py --folder_npy $PATH_FEATURES_HL_NORM --folder_labels $PATH_ANNOTATIONS --folder_out $PATH_DATASET_HL_NORM

Dataset Structure

BBDD_PATH
├── ANNOTATIONS
│   ├── train_labels.csv
│   ├── val_labels.csv
│   └── test_labels.csv
├── VIDEOS
│   └── *.mp4
├── MEDIAPIPE
│   └── *.pkl
├── KEYPOINTS
│   └── *.npy
├── FEATURES
│   └── *.npy
└── DATASET
    ├── *.npy
    └── *.pkl
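
If the final dataset follows the usual MS-G3D convention (an assumption; the file names below are hypothetical), the .npy holds the feature tensor and the .pkl the sample names and labels:

import pickle
import numpy as np

# Assumed MS-G3D-style layout: data as (N, C, T, V, M); labels as a
# (sample_names, labels) pair. File names are hypothetical.
data = np.load('BBDD_PATH/DATASET/NORM/HL/train_data.npy')
with open('BBDD_PATH/DATASET/NORM/HL/train_label.pkl', 'rb') as f:
    names, labels = pickle.load(f)
print(data.shape, len(labels))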

MSG3D: Model

Key Script

main.py

    parser.add_argument('--work-dir', type=str, required=True, help='the work folder for storing results')  
    parser.add_argument('--dataset', type=str, required=True, help='Dataset used')  
    parser.add_argument('--stream', type=str, required=True, help='Stream used')  
    parser.add_argument('--num-classes', type=int, required=True, help='Number of classes')  
    parser.add_argument('--config', default='/home/bdd/LSE_Lex40_uvigo/dataconfig/nturgbd-cross-view/test_bone.yaml', help='path to the configuration file')  
    parser.add_argument('--phase', default='train', help='must be train or test')  
    parser.add_argument('--seed', type=int, default=random.randrange(200), help='random seed')    
    parser.add_argument('--weights', default=None, help='the weights for network initialization')  
    parser.add_argument('--ignore-weights', type=str, default=[], nargs='+', help='the name of weights which will be ignored in the initialization')   
    parser.add_argument('--base-lr', type=float, default=0.01, help='initial learning rate')  
    parser.add_argument('--step', type=int, default=[20, 40, 60], nargs='+', help='the epochs at which the optimizer reduces the learning rate')  
    parser.add_argument('--device', type=int, default=0, nargs='+', help='the indexes of GPUs for training or testing')  
    parser.add_argument('--optimizer', default='SGD', help='type of optimizer')  
    parser.add_argument('--nesterov', type=str2bool, default=False, help='use nesterov or not')  
    parser.add_argument('--batch-size', type=int, default=32, help='training batch size')  
    parser.add_argument('--test-batch-size', type=int, default=256, help='test batch size')  
    parser.add_argument('--forward-batch-size', type=int, default=16, help='Batch size during forward pass, must be factor of --batch-size')  
    parser.add_argument('--num-epoch', type=int, default=80, help='total number of training epochs')  
    parser.add_argument('--weight-decay', type=float, default=0.0005, help='weight decay for optimizer')  
    parser.add_argument('--use-tta', action='store_true', help='activate TTA; if not set, only the first element of the TTA config is used')  
    parser.add_argument('--tta', default=[[False, 1]], help='TTA configuration')  
    parser.add_argument('--lr-scheduler', default='MultiStepLR', help='type of LR scheduler')  
    parser.add_argument('--gamma', type=float, default=0.1, help='Gamma parameter MultiStepLR')  
    parser.add_argument('--factor', type=float, default=0.1, help='Factor parameter ReduceLROnPlateau')  
    parser.add_argument('--patience', type=int, default=10, help='Patience parameter ReduceLROnPlateau')  
    parser.add_argument('--cooldown', type=int, default=0, help='Cooldown parameter ReduceLROnPlateau')  
    parser.add_argument('--tmax', type=int, default=0, help='tmax parameter CosineAnnealingLR')  
    parser.add_argument('--eta-min', type=float, default=0.0001, help='eta_min parameter CosineAnnealingLR')  
    parser.add_argument('--epoch-warn', type=int, default=0, help='Epoch without scheduler steps')  
    parser.add_argument('--early-stopping', type=int, default=0, help='stop training if there is no improvement for X epochs')  
    parser.add_argument('--use-train-normalization', type=str, default=None, help='Use normalized data and provide the folder where this data is located')  
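
The scheduler flags correspond to the standard PyTorch schedulers; the wiring is presumably along these lines (a sketch, not the repository's exact code):

import torch.optim.lr_scheduler as sched

def build_scheduler(optimizer, args):
    # --lr-scheduler selects the class; the remaining flags parameterize it.
    if args.lr_scheduler == 'MultiStepLR':
        return sched.MultiStepLR(optimizer, milestones=args.step, gamma=args.gamma)
    if args.lr_scheduler == 'ReduceLROnPlateau':
        return sched.ReduceLROnPlateau(optimizer, factor=args.factor,
                                       patience=args.patience, cooldown=args.cooldown)
    if args.lr_scheduler == 'CosineAnnealingLR':
        return sched.CosineAnnealingLR(optimizer, T_max=args.tmax, eta_min=args.eta_min)
    raise ValueError(f'unknown scheduler: {args.lr_scheduler}')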

Training Example

STREAM=joints_C4_xyzc 
DATASET=/path/to/ASL_Citizen/DATASET/NORM/HP 
DEVICE=5
NUM_CLASSES=300
SEED=42 
ESTUDIO=E0
CONFIG=config/TRAIN_CUSTOM/train.yaml
EXPERIMENT=TRAIN_ASL_CITIZEN_HP/IMAGE_05/$ESTUDIO/$SEED/$STREAM-T1

nohup python main.py --work-dir work_dir/$EXPERIMENT --config $CONFIG --dataset $DATASET --stream $STREAM --num-classes $NUM_CLASSES --device $DEVICE --batch-size 32 --forward-batch-size 32 --test-batch-size 32 --nesterov true --weight-decay 0.0005 --base-lr 0.1 --seed $SEED --use-deterministic --num-worker 50 --early-stopping 30 --step 250 --num-epoch 250 --optimizer 'SGD' --lr-scheduler ReduceLROnPlateau --factor 0.5 --patience 10 --cooldown 0 &

Evaluation and Testing Example

STREAM=joints_C4_xyzc 
DATASET=/path/to/ASL_Citizen/DATASET/NORM/HP 
DEVICE=5
NUM_CLASSES=300
SEED=42 
ESTUDIO=E0
EXPERIMENT=TRAIN_ASL_CITIZEN_HP/IMAGE_05/$ESTUDIO/$SEED/$STREAM-T1
WEIGHT=work_dir/TRAIN_ASL_CITIZEN_HP/IMAGE_05/E0/42/joints_C3_xyc-T11/weights/weights-110.pt
CONFIG=config/TRAIN_CUSTOM/val.yaml

python main_GTM.py --work-dir eval/$EXPERIMENT --config $CONFIG --weights $WEIGHT --device $DEVICE --test-batch-size 50 --seed $SEED --stream $STREAM --dataset $DATASET --num-classes $NUM_CLASSES

To run on the test split, keep the same variables and point CONFIG at the test configuration:

CONFIG=config/TRAIN_CUSTOM/test.yaml

python main_GTM.py --work-dir eval/$EXPERIMENT --config $CONFIG --weights $WEIGHT --device $DEVICE --test-batch-size 50 --seed $SEED --stream $STREAM --dataset $DATASET --num-classes $NUM_CLASSES

Acknowledgements

This repo is based on:

  • [MS-G3D](https://github.com/kenziyuliu/MS-G3D)

    @inproceedings{liu2020disentangling,
      title={Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition},
      author={Liu, Ziyu and Zhang, Hongwen and Chen, Zhenghao and Wang, Zhiyong and Ouyang, Wanli},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      pages={143--152},
      year={2020}
    }
    
