This repository contains the selected list of datasets found in our survey "A Survey on RGB-D Datasets". We gathered 231 datasets that contain accessible depth data; that is the only criterion a dataset must meet to be considered awesome!
Datasets are divided into three categories and six sub-categories, which represent distinct applications of RGB-D data. The taxonomy tree of the application types is shown in Figure 1, and extra information and examples of each category are available in our paper.
Fig 1: Taxonomy for RGB-D datasets.
We present each dataset with 8 columns that summarize important information of the datasets: Year, Scene Type, Sensor Type, Sensor Name, Data Modalities, Extra Data, Images/Scenes, and Application.
In Figure 2, we illustrate the variability of datasets presented in this paper, and how the type of sensor produces distinct depth results, changing reliability and sparsity.
Fig 2: Examples of depth data with image (first row) and depth (second row) of the following sensors: (a) Structured Light from NYUv2, (b) TOF from AVD Dataset, (c) LIDAR from KITTI, and (d) Stereo Camera Sensing from ReDWeb, where the authors compute correspondence maps by using optical flow.
Data is organized here by "Application Type", "Scene Type", and "Year" in this order. Since it is a long list, we also have it available on our website in a filterable way.
Our paper also discusses the different applications for each sensor type, explains how these sensors work, and identifies the influential and trending datasets in each field.
15/07/2022: Added 28 new datasets after reviewing another 606 papers. The majority of datasets included in this revision contain saliency maps and are from the "Body" category.
Please cite our paper if you use our survey in your work:
@article{LOPES2022103489,
title = {A survey on RGB-D datasets},
journal = {Computer Vision and Image Understanding},
volume = {222},
pages = {103489},
year = {2022},
issn = {1077-3142},
doi = {https://doi.org/10.1016/j.cviu.2022.103489},
}
We have made a website available with filtering and ordering options to help you find the best datasets for your work. We encourage you to check it out on our page.
If you have any comments or want to add your work to the list, just contact us at:
- alexandre.lopes at ic.unicamp.br or
- alexrlops at gmail.com
We expect to continue updating this list of datasets through the years, and your contribution is fundamental to continue this work.
No. | Dataset Name | Sensor Type | Sensor Name | Application Type | Scene Type | Data Modalities | Extra Data | Images/Scenes | Year |
---|---|---|---|---|---|---|---|---|---|
0 | No Name Defined | - | Synthetic | SOR, and SOE | Aerial | Color, Depth | Normal Maps, Edges, Semantic Labels | 15 scenes (144000 images) | 2020 |
1 | DDAD | LiDAR | Luminar-H2 LIDAR | SOR, and SOE | Driving | Color, Depth | Instance Segmentation | 150 scenes (12650 frames) | 2020/2021 |
2 | Woodscape | LiDAR | Velodyne HDL-64E | SOR, and SOE | Driving | Color, IMU, GPS, Depth | instance segmentation, 2d object detection | 50 sequences (100k frames) | 2019/2020/2021 |
3 | NuScenes | LiDAR | - | SOR, and SOE | Driving | Color, Depth, Radar, IMU | 3D object detection, semantic segmentation | 1000 scenes (20 seconds each). 1.4M Images and 390k Lidar Sweeps | 2019/2020 |
4 | EventScape | - | Synthetic | SOR, and SOE | Driving | Color, Depth | Semantic Segmentation, Navigation Data (Position, orientation, angular velocity, etc) | 758 sequences | 2021 |
5 | KITTI-360 | SCS, LiDAR | Velodyne (LIDAR) Points Cloud, MVS | SOR, and SOE | Driving | Color, Depth, GPS, IMU | 2d-object detection, 3d-object detection, tracking, instance segmentation, optical flow. These are not necessarily in the same dataset | 11 sequences to over 320k images and 100k laser scans | 2021 |
6 | Lyft level 5 | SCS, LiDAR | 3 LiDAR (40 and 64-beam lidars), 5 radars, MVS | SOR, and SOE | Driving | Color, Depth, Radar | 3d object detection | 170000 scenes (25 seconds each) | 2020 |
7 | Virtual Kitti | - | Synthetic | SOR, and SOE | Driving | Color, Depth | semantic segmentation, instance segmentation, optical flow | 50 videos (21260 frames) | 2016 |
8 | KITTI | SCS, LiDAR | Velodyne (LIDAR) Points Cloud, MVS | SOR, and SOE | Driving | Color, Grayscale, Depth, GPS, IMU | instance segmentation | 61 scenes (42746 frames) | 2012 |
9 | Hypersim | - | Synthetic | SOR, and SOE | Indoor | Color, Depth | Normal Maps, Instance Segmentation, Diffuse Reflectance | 461 scenes (77400 images) | 2021 |
10 | RoboTHOR | - | Synthetic | SOR, and SOE | Indoor | Color, Depth | Instance Segmentation | 75 scenes | 2020 |
11 | Structured3D Dataset | - | Synthetic | SOR, and SOE | Indoor | Color, Depth | Object Detection, Semantic Segmentation | 3500 scenes with 21835 rooms (196515 frames) | 2020 |
12 | Replica | Structured Light | - | SOR, and SOE | Indoor | Color, Depth, IMU, Grayscale Camera | Normal Maps, Instance Segmentation | 18 scenes | 2019 |
13 | Gibson | LiDAR, Structured Light | NavVis, Matterport Camera, DotProduct | SOR, and SOE | Indoor | Color, Depth | Normal Maps, Semantic Segmentation | 572 scenes. 1400 floor spaces from 572 buildings | 2018 |
14 | InteriorNet | - | Synthetic | SOR, and SOE | Indoor | Color, Depth, IMU | Normal Maps, Semantic Segmentation | 20 Million Images | 2018 |
15 | Taskonomy | Structured Light | - | SOR, and SOE | Indoor | Color, Depth | 25 Tags (Normal Maps, Semantic Segmentation, Scene Classification, etc.) | 4.5 Million Scenes | 2018 |
16 | AVD | TOF | Kinect v2 | SOR, and SOE | Indoor | Color, Depth | Object Detection | 15 scenes (over 30k images) | 2017 |
17 | MatterPort3D | SCS, Structured light | Matterport Camera, MVS | SOR, and SOE | Indoor | Color, Depth | semantic segmentation, 3D semantic-voxel segmentation | 90 scenes, 10800 panoramic views (194400 images) | 2017 |
18 | ScanNet | Structured Light | Occipital Structure Sensor - similar to Microsoft Kinect v1 | SOR, and SOE | Indoor | Color, Depth | 3D semantic-voxel segmentation | 1513 sequences (over 2.5 million frames) | 2017 |
19 | SceneNet RGB-D | - | Synthetic | SOR, and SOE | Indoor | Color, Depth | instance segmentation, optical flow | 15K trajectories (scenes) (5M images) | 2017 |
20 | SunCG | - | Synthetic | SOR, and SOE | Indoor | Color, Depth | semantic segmentation | 45622 scenes | 2017 |
21 | GMU Kitchen Dataset | TOF | Kinect v2 | SOR, and SOE | Indoor | Color, Depth | Object Detection | 9 scenes (6735 frames) | 2016 |
22 | Stanford2D3D | Structured light | Matterport Camera | SOR, and SOE | Indoor | Color, Depth | semantic segmentation, Normal Maps | 6 large-scale indoor areas (70496 images) | 2016 |
23 | SUN3D | Structured Light | Asus Xtion Pro Live | SOR, and SOE | Indoor | Color, Depth | semantic segmentation | 415 sequences | 2013 |
24 | Starter Dataset | Structured Light | Synthetic, Matterport Pro2, NA | SOR and SOE (depending on subdataset) | Indoor, In-the-wild | Color, Depth, IMU, Grayscale Camera (depending on subdataset) | Normal Maps, Semantic Segmentation, Scene Classification, etc. (depending on subdataset) | Over 14.6M Images (multiple scenes) | 2021 |
25 | RGBD Object dataset | Structured Light | Kinect v1 | SOR, and SOE | Indoor, Isolated Objects / Focussed on Objects | Color, Depth | 3d segmentation | 8 sequences and 300 isolated objects (250000 frames) | 2011 |
26 | TartanAir | - | Synthetic LiDAR | SOR, and SOE | Indoor, Outdoor | Color, Depth | semantic segmentation, optical flow | 1037 scenes (Over 1M frames). Each scene contains 500-4000 frames. | 2020 |
27 | RGB-D Semantic Segmentation Dataset | Structured Light | Kinect v1 | SOR, and SOE | Isolated Objects / Focussed on Objects | Color, Depth | 3D semantic segmentation | 16 test scenes | 2011 |
28 | GTA-SfM Dataset | - | Synthetic | SOR, and SOE | Outdoor | Color, Depth | Optical Flow | 76000 images | 2020 |
29 | GL3D | SCS | MVS | SOR | Aerial | Color, Depth | - | 543 scenes (125623 images) | 2018 |
30 | ApolloScape | LiDAR | Velodyne HDL-64E S3 | SOR | Driving | Color, Depth, GPS, Radar | - | 155 min with 93k frames | 2020 |
31 | KAIST | SCS, LiDAR | Velodyne VLP-16, SICK LMS-511, MVS | SOR | Driving | Color, Depth, GPS, IMU, Altimeter | - | 19 sequences (191 km) | 2019 |
32 | RobotCar | LiDAR | 2 x SICK LMS-151 2D LIDAR, 1 x SICK LD-MRS 3D LIDAR | SOR | Driving | Color, Depth, GPS, INS (Inertial navigation system) | - | 133 scenes (almost 20M images from multiple sensors) | 2016 |
33 | Malaga Urban Dataset | SCS, LiDAR | 2 SICK LMS, 3 HOKUYO, MVS | SOR | Driving | Color, Depth, IMU, GPS | - | 15 sequences | 2014 |
34 | Omnidirectional Dataset | SCS, LiDAR | Velodyne HDL-64E, MVS | SOR | Driving | Color, Depth | - | 152 scenes (12607 frames) | 2014 |
35 | Ford Campus Vision and Lidar | SCS, LiDAR | Velodyne HDL-64E, MVS | SOR | Driving | Color, Depth, IMU, GPS | - | 2 sequences | 2011 |
36 | Karlsruhe | SCS | MVS | SOR | Driving | Color, GPS/IMU | - | 20 sequences (16657 frames) | 2011 |
37 | Multi-FoV (Urban Canyon dataset) | - | Synthetic | SOR | Driving, Indoor | Color, Depth | - | 2 sequences | 2016 |
38 | No Name Defined | - | RGB-D scans (NA) | SOR | Driving, Outdoor | Color, Depth | - | 13 scenes (5 Castle, 5 Church, 3 Street scenes) | 2013 |
39 | BlendedMVS | - | Synthetic | SOR | In-the-wild | Color, Depth | - | 113 scenes (17000 images) | 2020 |
40 | Youtube3D | - | Two Points Automatically Annotated | SOR | In-the-wild | Color, Relative Depth | - | 795066 images | 2019 |
41 | No Name Defined | Structured Light, TOF | Kinect v1, v2 and Synthetic | SOR | In-the-wild | Color, Depth | - | 10 scenes (2703 frames) | 2019 |
42 | 4D Light Field Benchmark | SCS | light-field (Synthetic MVS) | SOR | In-the-wild | Color, Depth | - | 24 scenes | 2016 |
43 | Habitat Matterport (HM3D) | Structured light | Matterport Pro2 | SOR | Indoor | Color, Depth | - | 1000 scenes | 2021 |
45 | ODS Dataset | SCS | MiniPolar 360 Camera (MVS) | SOR | Indoor | Color, Depth | Normal Maps | 6 indoor areas (50000 images) | 2019 |
46 | 360D | Structured light | Synthetic and Matterport Camera | SOR | Indoor | Color, Depth | - | 12072 scanned scenes and 10024 CG Scenes | 2018 |
47 | PanoSUNCG | - | Synthetic | SOR | Indoor | Color, Depth | - | 103 scenes (25000 images) | 2018 |
48 | CoRBS | TOF | Kinect v2 | SOR | Indoor | Color, Depth | - | 4 scenes (9 hours of recording) | 2016 |
49 | EuRoC MAV Dataset | TOF, MVS | Vicon Motion Capture, Leica MS50 | SOR | Indoor | Color, Depth, IMU | - | 11 scenes | 2016 |
50 | Augmented ICL-NUIM Dataset | - | Synthetic | SOR | Indoor | Color, Depth | - | 4 scenes (2 living room, 2 offices) | 2015 |
51 | Ikea Dataset | Structured Light | Kinect v1 and PrimeSense | SOR | Indoor | Color, Depth | - | 7 scenes | 2015 |
52 | ViDRILO | Structured Light | Kinect v1 | SOR | Indoor | Color, Depth | semantic category of the scene | 5 sequences (22454 images) | 2015 |
53 | ICL-NUIM dataset | - | Synthetic | SOR | Indoor | Color, Depth | - | 8 scenes (4 living room, 4 office) | 2014 |
54 | MobileRGBD | TOF | Kinect v2 | SOR | Indoor | Color, Depth | - | 3 scenes (9.5 hours of recording) | 2014 |
55 | RGBD Object dataset v2 | Structured Light | Kinect v1 | SOR | Indoor | Color, Depth | - | 14 sequences | 2014 |
56 | No Name Defined | LiDAR | Faro Focus 3D laser | SOR | Indoor | Depth | - | 40 scenes (rooms from three offices) | 2014 |
57 | RGB-D Dataset 7-Scenes | Structured Light | Kinect v1 | SOR | Indoor | Color, Depth | - | 7 scenes (500-1000 frames/scene) | 2013 |
58 | Reading Room Dataset | Structured Light | Asus Xtion Pro Live | SOR | Indoor | Color, Depth | - | 1 scene | 2013 |
59 | TUM-RGBD | Structured Light | Kinect v1 | SOR | Indoor | Color, Depth, accelerometer | - | 39 sequences | 2012 |
60 | IROS 2011 Paper Kinect | Structured Light | Kinect v1 | SOR | Indoor | Depth | - | 27 sequences | 2011 |
61 | No Name Defined | Structured Light | Asus Xtion Pro Live | SOR | Indoor, Isolated Objects / Focussed on Objects | Color, Depth | - | 6 scenes | 2013 |
62 | No Name Defined | Structured Light, TOF | KinectFusion (Kinect v1) for two scenes. Riegl VZ-400 for office | SOR | Indoor, Isolated Objects / Focussed on Objects | Color, Depth | - | 2 scenes: statue and targetbox | 2012 |
63 | M&M | SCS | MVS | SOR | Indoor, Outdoor | Color, Depth | - | 4690 sequences (170k frames) and 130000 images | 2020 |
64 | Mannequin Challenge datasets | SCS | MVS | SOR | Indoor, Outdoor | Color, Depth | - | 4690 sequences (170k frames) | 2019 |
65 | MVSEC Dataset | SCS, LiDAR | Velodyne (LiDAR), MVS | SOR | Indoor, Outdoor | Color, Depth, IMU | - | 5 sequences | 2018 |
66 | ETH3D | SCS, LiDAR | FaroFocus X 330 (Laser Sensor), MVS | SOR | Indoor, Outdoor | Color, Depth | - | 25 high-res, 10 low-res | 2017 |
67 | DiLiGenT-MV Dataset | SCS | MVS | SOR | Isolated Objects / Focussed on Objects | Color | - | 5 objects (scenes) | 2020 |
68 | A Large Dataset of Object Scans | Structured Light | PrimeSense Carmine | SOR | Isolated Objects / Focussed on Objects | Color, Depth | - | over 10000 3D scans of objects. | 2016 |
69 | No Name Defined | SCS, Structured Light | PrimeSense, MVS | SOR | Isolated Objects / Focussed on Objects | Color, Depth | - | 9 scenes: 4 scenes using PrimeSense, 5 scenes using MVS | 2015 |
70 | BigBIRD | Structured Light | Carmine 1.09 sensor | SOR | Isolated Objects / Focussed on Objects | Color, Depth | - | 600 images (from 125 objects) | 2014 |
71 | Fountain Dataset | Structured Light | Asus Xtion Pro Live | SOR | Isolated Objects / Focussed on Objects | Color, Depth | - | 1 scene | 2014 |
72 | MVS | SCS | MVS | SOR | Isolated Objects / Focussed on Objects | Color | - | 124 scenes | 2014 |
73 | Live Color+3D Database | TOF | Range Scanner (RIEGL VZ-400) | SOR | Outdoor | Color, Depth | - | 12 scenes | 2011/2013/2017 |
74 | The Newer College Dataset | Structured Light, LiDAR | Intel D435i, Ouster OS-1 (Gen 1) 64 | SOR | Outdoor | Color, Depth, IMU | - | 6 scenes | 2020 |
75 | Megadepth | SCS | MVS | SOR | Outdoor | Color, Depth | - | 130000 images | 2018 |
76 | CVC-13: Multimodal Stereo Dataset | SCS | MVS | SOR | Outdoor | Color, Infrared | - | 4 scenes | 2013 |
77 | Make3D | TOF | custom-built 3-D scanner | SOR | Outdoor | Color, Depth | - | 534 images | 2009 |
78 | Fountain-P11 and Herz-Jesu-P8 | LiDAR | - | SOR | Outdoor | Color, Depth | - | 2 scenes (19 images) | 2008 |
79 | No Name Defined | SCS | MVS (seven cameras) | SOR | Partial Body w/o Scene | Color | - | NA (2 actors) | 2011 |
80 | No Name Defined | SCS | MVS | SOR | Underwater | Color, Depth, IMU, Sonar | - | 3 sequences | 2016 |
81 | DeMon | SCS, Synthetic, Structured Light | Synthetic, MVS, Asus Xtion Pro Live, Kinect v1 | SOR | - | Color, Depth | - | 20537 sequences and scenes | 2017 |
82 | Scenes11 | - | Synthetic | SOR | - | Color, Depth | - | 19959 sequences | 2017 |
83 | VALID | - | Synthetic | SOE | Aerial | Color, Depth | Object Detection, Panoptic Segmentation, Instance Segmentation, Semantic Segmentation | 6 scenes (6690 images) | 2020 |
84 | US3D | LiDAR | Airborne LiDAR | SOE | Aerial | Color, Depth | Semantic Segmentation | 4160 images from 3 different cities (a fourth is not available) | 2019 |
85 | Potsdam | - | - | SOR | Aerial | Color, Depth | - | 38 Patches | 2011 |
86 | Vaihingen | LiDAR | Leica ALS50 and ALTM-ORION M | Only Depth | Aerial | Color, Depth | - | 33 Patches | 2011 |
87 | Leddar Pixset Dataset | LiDAR | Leddar Pixell LiDAR | SOE and Tracking (Other) | Driving | Color, Depth, IMU, Radar | 3D Bounding Boxes, 2D Bounding Boxes, Semantic Segmentation | 97 sequences (29k frames) | 2021 |
88 | Virtual Kitti 2 | - | Synthetic | SOE and Tracking (Other) | Driving | Color, Depth | Semantic Segmentation, Instance Segmentation, Optical Flow | 5 scenes (multiple conditions for each scene) | 2020 |
89 | Waymo Perception | LiDAR | - | SOE | Driving | Color, Depth | 3D object detection | 1150 scenes (20 seconds/scene) | 2020 |
90 | Argoverse Dataset | SCS, LiDAR | Argo LiDAR, MVS | SOE and Tracking (Other) | Driving | Color, Depth | 3D Object Detection | 113 scenes | 2019 |
91 | CityScapes | SCS | MVS | SOE | Driving | Color, Odometry | semantic segmentation, 3d-object detection and pose | 50 cities (25000 images) | 2016 |
92 | SYNTHIA | Virtual 8 depth sensors | Synthetic | SOE | Driving | Color, Depth | instance segmentation | 5 sequences (with sub-sequences) at 5 fps. 200k images from videos | 2016 |
93 | Daimler Urban Segmentation Dataset | SCS | MVS | SOE | Driving | Color | Semantic Labeling | 5000 images | 2014 |
94 | Ground Truth Stixel Dataset | SCS | MVS | SOE | Driving | Color | Stixels | 12 sequences | 2013 |
95 | Daimler Stereo Pedestrian Dataset | SCS | MVS | SOE | Driving | Color | Object Detection | 28919 images | 2011 |
96 | UnrealDataset | - | Synthetic | SOE | Driving, Outdoor | Color, Depth | Semantic Segmentation | 21 sequences (100k images) | 2018 |
97 | OASIS v2 | - | From Human Annotation | SOE | In-the-wild | Color, Depth | Normal Maps, Instance Segmentation | 102000 images | 2021 |
98 | OASIS | - | From Human Annotation | SOE | In-the-wild | Color, Depth | Normal Maps, Instance Segmentation | 140000 images | 2020 |
99 | Scene Flow Datasets | - | Synthetic | SOE | In-the-wild | Color | Optical Flow, object segmentation | 2256 scenes (39049 frames) | 2016 |
100 | RGBD Salient Object Detection | Structured Light | Kinect v1 | SOE | In-the-wild | Color, Depth | Saliency Maps | 1000 images | 2014 |
101 | Saliency Detection on Light Field | SCS | Lytro light field (MVS) | SOE | In-the-wild | Color, Depth | Saliency Maps | 100 images | 2014 |
102 | MPI Sintel | - | Synthetic | SOE | In-the-wild | Color, Depth | Optical Flow | 35 scenes (50 frames/scene) | 2012 |
103 | NYUv2-OC++ | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth, accelerometer | occlusion boundaries maps | 1449 images from NYUv2 | 2020 |
104 | Near-Collision Set | SCS, LiDAR | LiDAR (NA), MVS | SOE | Indoor | Color, Depth | 2D Object Detection | 13658 sequences | 2019 |
105 | SUN_RGB-D | Structured Light and TOF | Intel RealSense 3D Camera, Asus Xtion LIVE PRO, Kinect v1 and v2 | SOE | Indoor | Color, Depth | semantic segmentation, object detection and pose | 10335 images | 2015 |
106 | TUW | Structured Light | ASUS Xtion ProLive RGB-D | SOE | Indoor | Color, Depth | object instance recognition | 15 sequences (163 frames) | 2014 |
107 | Willow and Challenge Dataset | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth | object instance recognition | 24 sequences (353 frames) for Willow, 39 sequences (176 frames) | 2014 |
108 | NYU Depth V2 | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth, accelerometer | semantic segmentation | 464 scenes (407024 frames) with 1449 labeled aligned RGB-D images | 2012 |
109 | No Name Defined | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth | semantic segmentation | 3 options. Large: 2 sequences (397 frames) | 2012 |
110 | Berkeley B3DO | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth | object detection | 75 scenes (849 images) | 2011 |
111 | NYU Depth V1 | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth | semantic segmentation | 64 scenes (108617 frames) with 2347 labeled RGB-D frames | 2011 |
112 | ClearGrasp | Structured Light | Synthetic, Intel RealSense D415 | SOE | Isolated Objects / Focussed on Objects | Color, Depth | Normal Maps, Semantic Segmentation - Synthetic | over 50000 synthetic images of 9 objects. 286 real images of 10 objects | 2019 |
113 | T-LESS | Structured Light, TOF | Kinect v2, PrimeSense Carmine 1.0 | SOE | Isolated Objects / Focussed on Objects | Color, Depth | 3D instance segmentation | NA scenes (38k images) for training. 20 scenes (10k images) for testing | 2017 |
114 | DROT | Structured Light, TOF | Kinect v1, v2 and RealSense R200 | SOE | Isolated Objects / Focussed on Objects | Color, Depth | Object Motion | 5 scenes (112 frames) | 2016 |
115 | MPII Multi-Kinect | SCS, Structured Light | Kinect v1, MVS | SOE | Isolated Objects / Focussed on Objects | Color, Depth | Object Detection | 33 scenes (560 images) | 2012 |
116 | Mid-Air Dataset | - | Synthetic | SOE | Outdoor | Color, Depth, Accelerometer, Gyroscope, GPS | Normal Maps, Semantic Segmentation | 54 sequences (420,000 frames) | 2019 |
117 | SCARED Dataset | SCS, Structured Light | Structured Light System (using P300 Neo Pico), MVS | Medical | Endoscopy | Color, Depth | - | 9 sequences | 2021 |
118 | Colonoscopy CG dataset | - | Synthetic | Medical | Endoscopy | Color, Depth | - | 16016 images | 2019 |
119 | Endoscopic Video Datasets | SCS | MVS | Medical | Endoscopy | Color | - | 25 scenes | 2010 |
120 | Name Not Defined | - | Synthetic | Medical | Medical | Color, Depth | - | 100 irises (72000 images) | 2020 |
121 | 50 Salads | Structured Light | Kinect v1 | Gestures | Indoor | Color, Depth, Accelerometer | Activity Classification | 50 sequences (25 people) | 2013 |
122 | RGB2Hands | Structured Light | Intel RealSense SR300/Synthetic | Gestures | Partial Body w/o Scene | Color, Depth | Segmentation, 2D Keypoints, Dense matching map, inter-hand distance, intra-hand distance | Real: 4 sequences (1724 frames). Synthetic: NA | 2020 |
123 | ObMan Dataset | - | Synthetic | Gestures | Partial Body w/o Scene | Color, Depth | 3D Hand Keypoints, Object Segmentation, Hand Segmentation | 150000 images | 2019 |
124 | Name Not Defined | Structured Light | Intel RealSense SR300 | Gestures | Partial Body w/o Scene | Color, Depth, magnetic and kinematic sensors | - | 1175 sequences (over 100000 frames) | 2018 |
125 | BigHand2.2M | Structured Light | Intel RealSense SR300 | Gestures | Partial Body w/o Scene | Color, Depth, 6D magnetic sensor | - | NA sequences (2.2 million images), 10 subjects | 2017 |
126 | Pandora Dataset | TOF | Kinect v2 | Gestures | Partial Body w/o Scene | Color, Depth | Upper Body Part Person Pose (Skeleton) | 100 sequences (more than 250k frames) from 20 subjects | 2017 |
127 | RHD | - | Synthetic | Gestures | Partial Body w/o Scene | Color, Depth | Segmentation, Keypoints | 43986 images | 2017 |
128 | THU-READ | Structured Light | PrimeSense Carmine | Gestures | Partial Body w/o Scene | Color, Depth | - | 1920 sequences | 2017 |
129 | STB | SCS, Structured Light | Intel Real Sense F200, MVS | Gestures | Partial Body w/o Scene | Color, Depth | - | 12 sequences (18000 images) | 2016 |
130 | EYEDIAP | Structured Light | Kinect v1 | Gestures | Partial Body w/o Scene | Color, Depth | Eye Points, Head Pose | 94 sequences | 2014 |
131 | Eurecom Kinect Face Dataset | Structured Light | Kinect v1 | Gestures | Partial Body w/o Scene | Color, Depth | Face Points | 936 sequences | 2014 |
132 | MANIAC Dataset | Structured Light | Kinect v1 | Gestures | Partial Body w/o Scene | Color, Depth | - | 103 sequences | 2014 |
133 | NYU Hand Pose Dataset | Structured Light | Synthetic/Kinect v1 | Gestures | Partial Body w/o Scene | Color, Depth | Hand Pose | 81009 frames | 2014 |
134 | 3DMAD | Structured Light | Kinect v1 | Gestures | Partial Body w/o Scene | Color, Depth | Eye Points | 255 sequences (76500 frames) | 2013 |
135 | Dexter 1 | SCS, Structured Light, TOF | Kinect v1, Creative Gesture Camera, MVS | Gestures | Partial Body w/o Scene | Color, Depth | - | 7 sequences | 2013 |
136 | No Name Defined | TOF | SoftKinetic DS 325 | Gestures | Partial Body w/o Scene | Color, Depth, Measurand ShapeHand | - | 870 images (30 subjects) | 2013 |
138 | MSR Gesture3D | Structured Light | Kinect v1 | Gestures | Partial Body w/o Scene | Color, Depth | - | 336 sequences | 2012 |
139 | Florence 3D Faces | - | Synthetic | Gestures | Partial Body w/o Scene | Color | - | 53 people (NA Frames/Seqs) | 2011 |
140 | Espada Dataset | - | Synthetic | Only Depth | Aerial | Color, Depth | - | 49 environments (80k images) | 2021 |
141 | DSEC Dataset | SCS, LiDAR | Velodyne VLP-16, MVS | Only Depth | Driving | Color, Depth, GPS | - | 53 sequences | 2021 |
142 | Mapillary | SCS | MVS | Only Depth | Driving | Color, Depth | - | 50000 Scenes (750000 images) | 2020 |
143 | rabbitAI Benchmark | SCS, MVS | 17-camera light-field (MVS) | Only Depth | Driving | Color | - | 200 scenes (100 for training, 100 for testing) | 2020 |
144 | DrivingStereo - Driving Stereo | SCS, LiDAR | Velodyne HDL-64E, MVS | Only Depth | Driving | Color, Depth, IMU, GPS | - | 42 sequences (182188 frames) | 2019 |
145 | Urban Virtual Dataset (UVD) | - | Synthetic | Only Depth | Driving | Color, Depth | - | 58500 images | 2017 |
146 | DiverseDepth Dataset | SCS | MVS | Only Depth | In-the-wild | Color | - | 320000 images | 2020 |
147 | HRWSI | SCS | MVS | Only Depth | In-the-wild | Color, Depth | - | 20778 images | 2020 |
148 | Holopix50k | SCS | MVS | Only Depth | In-the-wild | Color | - | 49368 images | 2020 |
149 | DualPixels Dataset | SCS | MVS | Only Depth | In-the-wild | Color, Depth | - | 3190 images | 2019 |
150 | TAU Agent Dataset | - | Synthetic | Only Depth | In-the-wild | Color, Depth | - | 5 scenes | 2019 |
151 | WSVD | SCS | MVS | Only Depth | In-the-wild | Color, Depth | - | 553 videos (1500000 frames) | 2019 |
152 | ReDWeb | SCS | MVS | Only Depth | In-the-wild | Color, Depth | - | 3600 images | 2018 |
153 | IRS Dataset | - | Synthetic | Only Depth | Indoor | Color, Depth | Normal Maps | 100025 images | 2019/2021 |
154 | IBims-1 | TOF | Laser (Leica HDS7000 laser scanner) | Only Depth | Indoor | Color, Depth | Semantic Segmentation (only for planar areas: walls, tables, floor) | 100 images | 2018/2020 |
155 | AirSim Building_99 | - | Synthetic | Only Depth | Indoor | Color, Depth | - | 20000 images | 2021 |
156 | Pano3D Dataset | LiDAR, Structured Light | 3 RGB cameras. The 3 depth cameras. Matterport Camera, NavVis, DotProduct (depending on subdataset) | Only Depth | Indoor | Color, Depth | Normal Maps | 42923 samples | 2021 |
157 | Multiscopic Vision | SCS | Synthetic MVS | Only Depth | Indoor | Color | - | around 1200 scenes of synthetic data, 92 scenes of real data. | 2020 |
158 | Middlebury 2014 Dataset | SCS | MVS | Only Depth | Indoor | Color, Depth | - | 33 images | 2014 |
159 | Middlebury 2006 Dataset | Structured Light | custom-built Structured Light | Only Depth | Indoor | Color, Depth | - | 21 images | 2006 |
160 | Middlebury 2005 Dataset | Structured Light | custom-built Structured Light | Only Depth | Indoor | Color, Depth | - | 9 images | 2005 |
161 | Middlebury 2003 Dataset | Structured Light | custom-built Structured Light | Only Depth | Indoor | Color, Depth | - | 2 images | 2003 |
162 | Middlebury 2001 Dataset | SCS | MVS | Only Depth | Indoor | Color, Depth | - | 6 images | 2001 |
163 | DIML/CVL | SCS, TOF | Kinect v2 for Indoor, ZED Stereo Camera (MVS) for Outdoor | Only Depth | Indoor, Outdoor | Color, Depth | - | more than 200 scenes | 2016, 2017, 2018, 2021 |
164 | DIODE | LiDAR | FARO Focus S350 | Only Depth | Indoor, Outdoor | Color, Depth | Normal Maps | 30 scenes (8574 indoor images, 16884 outdoor images) | 2019 |
165 | Forest Virtual Dataset (FVD) | - | Synthetic | Only Depth | Outdoor | Color, Depth | - | 49500 images | 2017 |
166 | Zurich Forest Dataset | SCS | MVS | Only Depth | Outdoor | Color, Depth | - | 3 sequences (9846 images) | 2017 |
167 | No Name Defined | SCS | MVS | Only Depth | Underwater | Color, Depth | - | 600 pairs (51 with depth Ground Truth) | 2021 |
168 | SQUID | SCS | MVS | Only Depth | Underwater | Color, Depth | - | 57 images | 2020 |
169 | No Name Defined | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth | - | 800 frames for each person (26 people) | 2019 |
170 | UOW Online Action3D | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 20 sequences (20 participants performing multiple actions in a sequence) | 2018 |
171 | TVPR | Structured Light | Asus Xtion Pro Live | Human Activities | Full Body | Color, Depth | - | 23 sequences (100 people, 2004 secs) | 2017 |
172 | TST Fall detection Dataset v2 | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth, accelerometer | Person Pose (Skeleton) | 264 scenes | 2016 |
173 | UOW LargeScale Combined Action3D | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 4953 sequences | 2016 |
174 | TST Intake Monitoring dataset v1 | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 48 sequences | 2015 |
175 | TST Intake Monitoring dataset v2 | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 60 sequences | 2015 |
176 | TST TUG dataBase | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth, accelerometer | Person Pose (Skeleton) | 60 sequences | 2015 |
177 | UTD-MHAD | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth, accelerometer | Person Pose (Skeleton) | 861 sequences | 2015 |
178 | Human3.6M | TOF | MESA Imaging SR4000 from SwissRanger | Human Activities | Full Body | Color, Depth, motion capture (mx) camera | Person Pose (Skeleton) | 447260 RGB-D frames (almost 3.6M RGB frames) | 2014 |
179 | Northwestern-UCLA Multiview Action 3D Dataset | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 1473 sequences | 2014 |
180 | TST Fall detection Dataset v1 | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth, accelerometer | Person Pose (Skeleton) | 20 sequences | 2014 |
181 | Chalearn Multimodal Gesture Recognition | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth, Audio | User mask, Person Pose (Skeleton) | 707 sequences (1720800 frames) | 2013 |
182 | MHAD | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth, Accelerometer, Motion Capture System | - | 660 sequences | 2013 |
183 | ChaLearn gesture challenge | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 50000 sequences | 2012 |
184 | DGait | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 583 sequences (53 subjects) | 2012 |
185 | MSR DailyActivity3D | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 320 sequences | 2012 |
186 | RGBD-ID | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 316 sequences (79 people) | 2012 |
187 | SBU Kinect Interaction | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 21 sequences from seven participants | 2012 |
188 | MSR Action3D | Structured Light | Similar to Kinect v1 (NA) | Human Activities | Full Body | Color, Depth | - | 557 sequences (23797 frames) | 2010 |
189 | Hollywood 3D | SCS | MVS | Human Activities | In-the-wild | Color, Depth | - | around 650 video clips | 2013 |
190 | Depth 2 Height | TOF | Kinect v2 | Human Activities | Indoor | Color, Depth | - | 2136 images | 2020 |
191 | HHOI | TOF | Kinect v2 | Human Activities | Indoor | Color, Depth | Person Pose (Skeleton) | 8 actors recorded interactions. Each interaction lasts 2-7 seconds presented at 10-15 fps | 2016 |
192 | CMU Panoptic Dataset | TOF | Kinect v2 | Human Activities | Indoor | Color, Depth | 3D Skeleton | 65 sequences (5.5 capture hours) | 2015 |
193 | UR Fall Detection Dataset | Structured Light | Kinect v1 | Human Activities | Indoor | Color, Depth, Accelerometer | - | 70 sequences | 2014 |
194 | RGB-D People | Structured Light | Kinect v1 | Human Activities | Indoor | Color, Depth | object detection and tracking | 1 sequence (1132 frames of 3 sensors) | 2011 |
195 | DIW | - | Two points (manually annotated) | Points (Other) | In-the-wild | Color, Depth Points (2 points) | - | 495000 images | 2016 |
196 | LightField Dataset | SCS | Lytro Illum (Light field) (MVS) | Synthesizes a 4D RGBD LF (Other) | Other (Flowers) | Color | - | 3343 images | 2017 |
197 | Princeton Tracking Benchmark | Structured Light | Kinect v1 | Tracking (Other) | Indoor | Color, Depth | - | 100 sequences | 2013 |
198 | FRIDA dataset | - | Synthetic | Fog (Other) | Driving | Color, Depth | - | 18 scenes (90 images) | 2010 |
199 | FRIDA2 dataset | - | Synthetic | Fog (Other) | Driving | Color, Depth | - | 66 scenes (330 images) | 2012 |
200 | Dynamic Scene | SCS | MVS | Novel View Synthesis (Other) | Indoor, Outdoor | Color | Semantic Segmentation | 9 scenes | 2020 |
201 | 3D Ken Burns | - | Synthetic | 3D Ken Burns (Other) | In-the-wild | Color, Depth | Normal Maps | 46 sequences | 2019 |
202 | Mirror3D Dataset | SCS, Structured Light | Matterport Camera, MVS, Kinect v1, Occipital Structure Sensor - similar to Microsoft Kinect v1 | Mirror (Other) | Indoor | Color, Depth | Mirror Mask | 7011 scenes with mirror | 2021 |
203 | SBM-RGBD dataset | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth | Saliency Mask | 33 sequences (~15000 frames) | 2017 |
204 | LFSD | - | Lytro light field (MVS) | SOE | In-the-wild | Color, Depth | Saliency Mask | 100 images | 2015 |
205 | An In Depth View of Saliency | Structured Light | Kinect v1 | SOE | Indoor | Color, Depth | Saliency Mask | 80 images | 2013 |
206 | DUTLF-Depth | - | Lytro light field (MVS) | SOE | In-the-wild | Color, Depth | Saliency Mask | 1200 images | 2019 |
207 | ReDWeb-S | - | MVS | SOE | In-the-wild | Color, Depth | Saliency Mask | 3179 images | 2020 |
208 | COTS | - | Intel Realsense D435 (MVS) | SOE | Isolated Objects / Focussed on Objects | Color, Depth | Saliency Mask | 120 images | 2021 |
209 | NTU RGB+D | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth, IR | Person Pose (Skeleton) | 56880 sequences | 2016 |
210 | NTU RGB+D 120 | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth, IR | Person Pose (Skeleton) | 111480 sequences | 2019 |
211 | Mivia Action | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 28 sequences | 2013 |
212 | Chalearn LAP IsoGD | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 47933 sequences | 2016 |
213 | SYSU 3D HOI | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 480 sequences | 2015 |
213 | G3D | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton), Semantic Segmentation | 7 sequences (multiple actions per sequence) | 2016 |
214 | IAS-Lab RGBD-ID | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton), Semantic Segmentation | 33 sequences | 2013 |
215 | Online RGBD Action Dataset (ORGBD) | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 48 sequences | 2014 |
216 | MAD | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 40 sequences | 2014 |
217 | Hand Gesture | Structured Light | Kinect v1 | Gestures (Head and Hand) | Partial Body w/o Scene | Color, Depth | - | 1400 sequences | 2014 |
218 | Creative Senz3D | Structured Light | Creative Senz3D | Gestures (Head and Hand) | Partial Body w/o Scene | Color, Depth | - | 1320 sequences | 2015 |
219 | PKU-MMD | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 3076 sequences | 2017 |
220 | Florence 3D Actions | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 215 sequences | 2013 |
221 | UTKinect-Action3D | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 200 sequences | 2012 |
222 | KARD | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 540 sequences | 2014 |
223 | SOR3D-AFF | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth | - | 1201 sequences | 2020 |
224 | CMDFALL | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth, Accelerometer | - | 20 sequences | 2018 |
225 | EgoGesture | Structured Light | Intel RealSense SR300 | Gestures (Head and Hand) | Partial Body w/o Scene | Color, Depth | - | 2081 sequences | 2018 |
226 | LIRIS | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | - | 180 sequences | 2014 |
227 | Bimanual Actions | Structured Light | PrimeSense Carmine 1.09 | Gestures (Head and Hand) | Partial Body w/o Scene | Color, Depth | - | 540 sequences | 2020 |
228 | ISR-UoL 3D Social Activity | Structured Light | Kinect v1 | Human Activities | Full Body | Color, Depth | Person Pose (Skeleton) | 10 sequences | 2016 |
229 | UESTC | TOF | Kinect v2 | Human Activities | Full Body | Color, Depth | - | 25600 sequences | 2018 |
230 | DDAD | LiDAR | Luminar-H2 LIDAR | SOR and SOE | Driving | Color, Depth | Instance Segmentation | 150 scenes (12650 frames) | 2020/2021 |
This repository has the following license: