Evaluation was based on five datasets; these are briefly introduced in the following. The pre-processing, formatting and conversion tools used are detailed afterwards.
| Dataset | Link |
| --- | --- |
| BSDS500 | Web |
| NYUV2 | Web |
| SBD | Web |
| SUNRGBD | Web |
| Fash | Web |
Downloads of the pre-processed datasets will be made available here: davidstutz/superpixel-benchmark-data
Sample images of all datasets are shown below.
The Berkeley Segmentation Dataset 500 [2] was the first dataset used for evaluating superpixel algorithms. It consists of 500 images, each with 5 different ground truth segmentations of high quality, divided into a training set of 200 images, a validation set of 100 images and a test set of 200 images. For parameter optimization, the validation set was used.
[2] P. Arbelaez, M. Maire, C. Fowlkes, J. Malik.
Contour detection and hierarchical image segmentation.
IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (5) (2011) 898–916.
The ground truth was used as provided, only converted from `.mat` format to `.csv` format. The converted dataset is not yet available.

In order to manually convert the BSDS500 dataset, use `lib_tools/bsds500_convert_script.m`:
- Download the BSDS500 dataset from here.
- Extract the `BSR/BSDS500/data` folder into `data/BSDS500` (overwriting the provided examples in `data/BSDS500/images` and `data/BSDS500/csv_groundTruth`). Note that afterwards there are three folders: `groundTruth`, `csv_groundTruth` and `images`.
- In `lib_tools/bsds500_convert_script.m`, adapt the path to the directory, i.e. set `BSDS500_DIR` correctly.
- Run the script. Note that this may take some time.
The instructions are also found in `lib_tools/bsds500_convert_script.m`.
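For illustration, the following is a minimal sketch of the conversion step, assuming the layout of the BSR release: each `.mat` file holds a cell array `groundTruth`, where `groundTruth{i}.Segmentation` is the i-th ground truth segmentation. The output naming scheme is an assumption, not necessarily the one the benchmark expects:

```matlab
% Minimal sketch: convert the BSDS500 validation ground truth to .csv.
% Assumes the BSR layout (groundTruth{i}.Segmentation per .mat file);
% the output naming below is an assumption.
in_dir = 'data/BSDS500/groundTruth/val/';
out_dir = 'data/BSDS500/csv_groundTruth/val/';
files = dir([in_dir '*.mat']);
for f = 1:numel(files)
    data = load([in_dir files(f).name]);
    [~, name, ~] = fileparts(files(f).name);
    for i = 1:numel(data.groundTruth)
        % Write each of the (typically 5) segmentations to a separate file.
        csvwrite(sprintf('%s%s-%d.csv', out_dir, name, i), ...
            data.groundTruth{i}.Segmentation);
    end
end
```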
The NYU Depth Dataset V2 [3] includes 1449 images with pre-processed depth. Semantic ground truth segmentations with instance labels are provided. Following Ren and Bo [4], the ground truth has been pre-processed to remove small unlabeled segments. 199 images were randomly chosen for the validation set and 399 images were randomly chosen for testing.
[3] N. Silberman, D. Hoiem, P. Kohli, R. Fergus.
Indoor segmentation and support inference from RGBD images.
European Conference on Computer Vision, 2012, pp. 746–760.
[4] X. Ren, L. Bo.
Discriminatively trained sparse code gradients for contour detection.
Neural Information Processing Systems, 2012, pp. 593–601.
The randomly chosen validation and test subsets can be found in `data/NYUV2/nyuv2_train_subset.txt` and `data/NYUV2/nyuv2_test_subset.txt` (note that the validation set corresponds to the train set in this case). The ground truth was converted to `.csv` files after thinning unlabeled regions. The converted dataset is available in the data repository: davidstutz/superpixel-benchmark-data.

In order to manually convert the NYUV2 dataset and extract the used validation and testing subsets, use `lib_tools/nyuv2_convert.script.m`:
- Download the dataset from here. Make sure that the downloaded file is `nyu_depth_v2_labeled.mat`.
- Put the file in `data/NYUV2/`.
- Make sure that `data/NYUV2` contains `nyuv2_test_subset.txt`, `nyuv2_train_subset.txt`, `nyuv2_test.txt` and `nyuv2_train.txt`.
- In `lib_tools/nyuv2_convert.script.m`, set `NYUV2_DIR` to point to the `data/NYUV2` directory.
- Run the script. Note that this may take some time and memory.
The instructions are also found in `lib_tools/nyuv2_convert.script.m`.
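As a rough illustration, the raw label export could look as follows, assuming the variable layout of `nyu_depth_v2_labeled.mat` (a `labels` array of size height x width x 1449). The actual script additionally thins unlabeled regions following Ren and Bo [4], which this sketch omits, and the output naming is an assumption:

```matlab
% Minimal sketch: export the raw NYUV2 label maps to .csv. matfile reads
% one slice at a time to keep memory usage low (the file is saved in the
% -v7.3 format). The unlabeled-region thinning is omitted here.
file = matfile('data/NYUV2/nyu_depth_v2_labeled.mat');
[~, ~, count] = size(file, 'labels');
for i = 1:count
    labels = file.labels(:, :, i);
    csvwrite(sprintf('data/NYUV2/csv_groundTruth/%05d.csv', i), labels);
end
```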
The Stanford Background Dataset [5] combines 715 images from several datasets. The images vary in size, quality and scene. The provided semantic ground truth segmentations needed to be pre-processed in order to guarantee connected components. Validation and testing sets of size 238 and 477, respectively, were chosen at random.
[5] S. Gould, R. Fulton, D. Koller.
Decomposing a scene into geometric and semantically consistent regions.
International Conference on Computer Vision, 2009, pp. 1–8.
The ground truth was converted to `.csv` files. The converted dataset is available in the data repository: davidstutz/superpixel-benchmark-data.

To manually convert the SBD and select validation and testing images, follow `lib_tools/sbd_convert_script.m`:
- Download the Stanford Background Dataset from here.
- Extract the dataset such that `data/SBD` contains two folders: `images` and `labels`.
- Make sure that `data/SBD` contains `sbd_test.txt` and `sbd_train.txt`.
- In `lib_tools/sbd_convert_script.m`, adapt the variable `SBD_DIR` to match the path to `data/SBD`.
- Run the script. Note that this may take some time and memory.
The instructions are also found in `lib_tools/sbd_convert_script.m`.
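The connected-components pre-processing can be sketched as follows for a single label file, assuming the label files store one integer label per pixel (with -1 for unlabeled) as in the SBD release, and that the Image Processing Toolbox is available for `bwlabel`. The file names are only examples:

```matlab
% Minimal sketch: enforce connected segments for one SBD label file by
% splitting every semantic label into its connected components and giving
% each component a distinct new label. (The actual script may treat
% unlabeled pixels, label -1, differently.)
labels = dlmread('data/SBD/labels/0000047.regions.txt');
connected = zeros(size(labels));
next = 1;
for label = unique(labels)'
    components = bwlabel(labels == label, 4);
    for c = 1:max(components(:))
        connected(components == c) = next;
        next = next + 1;
    end
end
csvwrite('data/SBD/csv_groundTruth/0000047.csv', connected);
```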
The SUNRGBD dataset [6] contains 10335 images including pre-processed depth. Semantic ground truth segmentations are provided and needed to be pre-processed similarly to the NYUV2 dataset. Validation and testing sets of size 200 and 400, respectively, were chosen at random. Images that are also included in the NYUV2 dataset were ignored.
[6] S. Song, S. P. Lichtenberg, J. Xiao.
SUN RGB-D: A RGB-D scene understanding benchmark suite.
IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 567–576.
The ground truth was converted to `.csv` files. The converted dataset is available in the data repository: davidstutz/superpixel-benchmark-data.

To manually convert the dataset, follow `lib_tools/sunrgbd_convert_script.m`:
- Download the SUNRGBD dataset from here.
- Make sure to download both the SUNRGBD V1 dataset and the SUNRGBDtoolbox containing the annotations.
- From the SUNRGBDtoolbox, extract `SUNRGBD2dseg.mat` and `SUNRGBDMeta.mat` to `data/SUNRGBD`.
- From the SUNRGBD V1 dataset, extract all files into `data/SUNRGBD`; note that this may take quite some time! It might be wise to extract the contained directories (`xtion`, `realsense`, `kv1`, `kv2`) separately.
- In `lib_tools/sunrgbd_convert_script.m`, adapt `ROOT_DIR` to point to the data directory (i.e. the parent directory of the SUNRGBD directory).
- Run the script. Note that this may take some time and memory.
The instructions are also found in `lib_tools/sunrgbd_convert_script.m`.
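For orientation, a minimal sketch of the label export is given below. It assumes that `SUNRGBD2dseg.mat` provides a struct array `SUNRGBD2Dseg` whose `seglabel` field holds the per-pixel semantic labels; the variable and field names follow the SUNRGBDtoolbox but are assumptions here, as is the output naming, and the NYUV2-style pre-processing is omitted:

```matlab
% Minimal sketch: export SUNRGBD semantic segmentations to .csv.
% Variable name SUNRGBD2Dseg and field name seglabel are assumptions
% based on the SUNRGBDtoolbox.
data = load('data/SUNRGBD/SUNRGBD2dseg.mat'); % large file; takes time and memory
for i = 1:numel(data.SUNRGBD2Dseg)
    csvwrite(sprintf('data/SUNRGBD/csv_groundTruth/%05d.csv', i), ...
        data.SUNRGBD2Dseg(i).seglabel);
end
```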
The Fashionista dataset [7] contains 685 images with semantic ground truth segmentations. The ground truth segmentations were pre-processed to ensure connected segments. Validation and training sets of size 222 and 463, respectively, were chosen at random.
[7] K. Yamaguchi, M. H. Kiapour, L. E. Ortiz, T. L. Berg.
Parsing clothing in fashion photographs.
IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 3570–3577.
The ground truth needs to be converted to `.csv` files using the steps in `lib_tools/fash_convert_script.m`:

- Download the Fashionista dataset from here.
- Extract `fashionista_v0.2.1.mat` into `data/Fash`.
- In `lib_tools/fash_convert_script.m`, adapt the variables to match the path where `data/Fash` can be found.
- Run the script. Note that this may take some time and memory.