Instructions to create ImageNet 2012 data
Note: you need about 250GB of free disk space in total.
#Step1: Create a working directory for the ImageNet data
mkdir /hdd/ImageNet
cd /hdd/ImageNet
#Step2: Download ImageNet data
Download training images (about 138GB):
wget -c http://www.image-net.org/challenges/LSVRC/2012/nonpub/ILSVRC2012_img_train.tar &
Download validation images:
wget -c http://www.image-net.org/challenges/LSVRC/2012/nonpub/ILSVRC2012_img_val.tar &
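Both downloads run in the background (the trailing &), so it is worth confirming that they have finished and that the archives are readable before moving on to extraction. The following is a minimal sketch, not part of the original instructions; the helper name check_tar is hypothetical. Listing an archive catches most truncated downloads, although it is not a full checksum.

```shell
# check_tar ARCHIVE: succeed only if ARCHIVE exists and its member list
# can be read from start to end (hypothetical helper, for illustration).
check_tar() {
    if [ ! -f "$1" ]; then
        echo "missing: $1" >&2
        return 1
    fi
    # tar -tf lists the members without extracting anything.
    if ! tar -tf "$1" > /dev/null 2>&1; then
        echo "unreadable: $1" >&2
        return 1
    fi
    echo "ok: $1"
}

# Usage, from the same shell that started the downloads:
#   wait    # block until the background wget jobs have exited
#   check_tar ILSVRC2012_img_train.tar && check_tar ILSVRC2012_img_val.tar
```

Listing the training archive reads the whole file, so expect this check to take a while on a spinning disk.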
#Step3: decompress ImageNet data
To extract training data:
mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
Note: check that the extraction is complete. You should have 1,281,167 images in the train folder.
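One way to check the image count is to count the extracted files directly. This is a minimal sketch; the helper name count_images is hypothetical, and it assumes the images keep their original .JPEG extension.

```shell
# count_images DIR: print the number of .JPEG files found anywhere under DIR
# (hypothetical helper, for illustration).
count_images() {
    find "$1" -type f -name '*.JPEG' | wc -l | tr -d ' '
}

# Usage, from inside the train folder:
#   [ "$(count_images .)" -eq 1281167 ] && echo "train set complete"
```

The same helper can be pointed at the val folder once it has been extracted; the ILSVRC2012 validation set contains 50,000 images.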
To extract validation data:
cd ../ && mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xvf ILSVRC2012_img_val.tar
#Step4: preprocess ImageNet data
This step requires that you have built the Caffe project (either the OpenCL Caffe or the original Caffe in CPU_ONLY mode), because it uses some of the scripting tools provided by Caffe.
From the Caffe root directory, fetch the auxiliary ImageNet files (label lists and image mean):
cd data/ilsvrc12
./get_ilsvrc_aux.sh
cd ../../
vi examples/imagenet/create_imagenet.sh
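Modify the following variables to point to your ImageNet data directories (keep the trailing slash, because the conversion tool prepends these paths directly to the relative file names):
TRAIN_DATA_ROOT=/hdd/ImageNet/train/
VAL_DATA_ROOT=/hdd/ImageNet/val/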
Then set the resize flag to true, so that the images are resized to 256x256 during conversion:
RESIZE=true
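If you prefer not to edit the script by hand, the same three settings can be applied with sed. This is a minimal sketch under the assumption that each variable is assigned at the start of a line in the form NAME=value, as in the stock Caffe script; the helper name set_var is hypothetical.

```shell
# set_var FILE NAME VALUE: rewrite the line "NAME=..." in FILE to "NAME=VALUE"
# (hypothetical helper, for illustration).
set_var() {
    # '|' is used as the sed delimiter so that slashes in paths need no escaping.
    # -i.bak works with both GNU and BSD sed; the backup is removed on success.
    sed -i.bak "s|^$2=.*|$2=$3|" "$1" && rm -f "$1.bak"
}

# Usage, from the Caffe root directory:
#   f=examples/imagenet/create_imagenet.sh
#   set_var "$f" TRAIN_DATA_ROOT /hdd/ImageNet/train/
#   set_var "$f" VAL_DATA_ROOT /hdd/ImageNet/val/
#   set_var "$f" RESIZE true
```

Because the pattern is anchored at the start of the line and ends with "=", indented assignments such as RESIZE_HEIGHT are left untouched.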
You are now ready to convert the ImageNet data to the LMDB format needed for training:
./examples/imagenet/create_imagenet.sh
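After the script finishes, a quick sanity check is to confirm that the databases were actually written. The sketch below relies on the fact that an LMDB directory contains a data.mdb file; the output directory names in the usage comment are assumptions taken from the stock Caffe script and may differ in your copy, and the helper name has_lmdb is hypothetical.

```shell
# has_lmdb DIR: succeed if DIR contains a non-empty LMDB data file
# (hypothetical helper, for illustration).
has_lmdb() {
    [ -s "$1/data.mdb" ]
}

# Usage, from the Caffe root directory (directory names are assumptions):
#   for db in examples/imagenet/ilsvrc12_train_lmdb examples/imagenet/ilsvrc12_val_lmdb; do
#       if has_lmdb "$db"; then echo "ok: $db"; else echo "missing or empty: $db"; fi
#   done
```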