
# Dataset Preparation

## Download our copy

We have prepared a copy of the Kinetics dataset. To obtain it, please email [email protected]. Our copy includes 234,643 training and 19,761 validation videos. All the videos are rescaled to height=256 pixels via ffmpeg, so the total size is much smaller: 132 GB.

Because some YouTube links have expired, we could only download 234,643 of the 246,535 training videos (as of 12/20/2017), so about 10K (5%) of the training videos are missing. Training with this data leads to a slight performance drop (<0.5%) compared to the numbers reported in our paper.

Note: In this repo, we release models trained with the same data as in our paper; however, that version of the data, with fewer missing videos, is no longer available.
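The rescaling mentioned above can be reproduced with ffmpeg. Below is a minimal Python sketch that builds such an ffmpeg command; the exact flags are an assumption (`scale=-2:256` preserves the aspect ratio with an even width), not the repo's actual invocation.

```python
import shlex

def rescale_cmd(src, dst, height=256):
    """Build an ffmpeg command that rescales a video to a fixed height.

    scale=-2:256 keeps the aspect ratio and rounds the width to an
    even value, which most codecs require; the audio stream is copied
    unchanged. These flags are an assumption, not the repo's script.
    """
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale=-2:{height}",
        "-c:a", "copy",
        dst,
    ]

print(" ".join(shlex.quote(c) for c in rescale_cmd("in.mp4", "out_256.mp4")))
```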

## Download via the official code

One can also download the videos with the official code, as follows:

  1. Download the videos via the official scripts.

  2. After all the videos are downloaded to YOUR_DATASET_FOLDER, go into the folder:

    cd process_data/kinetics

    We provide a dictionary with the corresponding name of each class:

    classids.json

    Use the following script to generate the txt lists for both the training and validation sets. This script also renames some of the dataset folders (e.g., "petting cat" -> "petting_cat"):

    gen_py_list.py

    Use the following script to rescale the videos to height=256 pixels. This is not required, but it makes the I/O much faster during training:

    downscale_video_joblib.py
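As a rough illustration of what gen_py_list.py does, the sketch below walks the class folders, renames them (spaces to underscores), and writes "<video_path> <label>" lines. The real script and the exact classids.json format may differ, so the name-to-label mapping here is an assumption.

```python
import json
import os

def gen_list(dataset_dir, classids_path, out_txt):
    """Write "<video_path> <label>" lines for every video under dataset_dir.

    classids.json is assumed to map class name -> integer label;
    the actual format in the repo may differ.
    """
    with open(classids_path) as f:
        classids = json.load(f)
    lines = []
    for cls in sorted(os.listdir(dataset_dir)):
        src = os.path.join(dataset_dir, cls)
        safe = cls.replace(" ", "_")  # "petting cat" -> "petting_cat"
        dst = os.path.join(dataset_dir, safe)
        if src != dst:
            os.rename(src, dst)
        label = classids.get(cls, classids.get(safe))
        for video in sorted(os.listdir(dst)):
            lines.append(f"{os.path.join(dst, video)} {label}")
    with open(out_txt, "w") as f:
        f.write("\n".join(lines) + "\n")
```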

## Creating lmdb for training and testing

In this code, video decoding and frame extraction are performed on the fly during training. Following the instructions below, we can prepare the lmdb files for training.

cd process_data/kinetics
  1. This step is only necessary if you are using our copy of the Kinetics data; otherwise, "trainlist.txt" and "vallist.txt" should already have been generated by following Download via the official code. If you downloaded our copy, the folder names in the train/val lists need to be changed. Assuming the dataset is downloaded to YOUR_DATASET_FOLDER, run:

    python change_listname.py trainlist_download.txt trainlist.txt /nfs.yoda/xiaolonw/kinetics/data/train $YOUR_DATASET_FOLDER/train_256
    python change_listname.py vallist_download.txt vallist.txt /nfs.yoda/xiaolonw/kinetics/data/val $YOUR_DATASET_FOLDER/val_256
  2. Since lmdb does not support shuffling during training, we shuffle the training list and repeat the shuffling 100 times (for 100 epochs). The result is saved in a single txt file (2 GB). Shuffling after each epoch is crucial when training with Batch Normalization.

    python shuffle_list_rep.py trainlist_shuffle_rep.txt
  3. Create the lmdb according to the list. Note that the lmdb stores only the file names, not the videos themselves. The test set is the same as the validation set; during training, the validation error is measured on randomly cropped examples. Each entry in the lmdb has 2 elements: the video name and the class label of the video.

    mkdir ../../data/lmdb
    mkdir ../../data/lmdb/kinetics_lmdb_multicrop
    python create_video_lmdb.py --dataset_dir ../../data/lmdb/kinetics_lmdb_multicrop/train  --list_file trainlist_shuffle_rep.txt
    python create_video_lmdb.py --dataset_dir ../../data/lmdb/kinetics_lmdb_multicrop/val  --list_file vallist.txt
  4. The testing lmdb is in a different format from the validation lmdb; it is the lmdb for FCN (spatial) testing without flipping. Each sample has 4 elements: the video name, the index of the video, the index of the start frame, and the index of the spatial location for cropping.

    python create_video_lmdb_test_multicrop.py --dataset_dir ../../data/lmdb/kinetics_lmdb_multicrop/test  --list_file vallist.txt

Note that at test time the variable labels holds the index of the video instead of the actual class label. The actual labels for evaluation are read from cfg.FILENAME_GT in the code.
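The repeated shuffling in step 2 can be sketched as follows; concatenating 100 independently shuffled copies into one list is based on the description above, and the output format is an assumption.

```python
import random

def shuffle_rep(lines, epochs=100, seed=0):
    """Concatenate `epochs` independently shuffled copies of the
    training list, so that sequential reads from the lmdb see a
    fresh permutation of the data in every epoch."""
    rng = random.Random(seed)
    out = []
    for _ in range(epochs):
        epoch = list(lines)
        rng.shuffle(epoch)
        out.extend(epoch)
    return out
```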

  5. (Optional) This is the lmdb for FCN (spatial) testing with flipping.

    mkdir ../../data/lmdb/kinetics_lmdb_flipcrop
    python create_video_lmdb_test_flipcrop.py --dataset_dir ../../data/lmdb/kinetics_lmdb_flipcrop/test  --list_file vallist.txt
  6. (Optional) This is the lmdb for testing using only center-cropped examples.

    mkdir ../../data/lmdb/kinetics_lmdb_singlecrop
    python create_video_lmdb_test.py --dataset_dir ../../data/lmdb/kinetics_lmdb_singlecrop/test  --list_file vallist.txt
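The multicrop test samples described in step 4 can be enumerated as below: one entry per (video, start frame, spatial crop) combination. The numbers of temporal clips and spatial crops are illustrative placeholders, not the repo's actual settings.

```python
def multicrop_test_entries(list_lines, num_clips=10, num_crops=3):
    """Enumerate FCN (spatial) test samples.

    Each entry is (video_name, video_index, clip_index, crop_index),
    mirroring the 4 elements stored per test sample; num_clips and
    num_crops are assumed values for illustration only.
    """
    entries = []
    for video_idx, line in enumerate(list_lines):
        video_name = line.rsplit(" ", 1)[0]  # strip the trailing label
        for clip in range(num_clips):
            for crop in range(num_crops):
                entries.append((video_name, video_idx, clip, crop))
    return entries
```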