This is a unified codebase for neural-network-based monocular depth estimation. The framework is built on detectron2 (with many modifications) and supports both supervised and self-supervised monocular depth estimation methods. The main goal of this repository is to help readers understand popular depth estimation papers, so I have tried to keep the code simple.
- Add unsupervised motion learning
- Add waymo dataset support
- clone this repo
    SDE_ROOT=/path/to/SimpleDepthEstimation
    git clone https://github.com/zzzxxxttt/SimpleDepthEstimation $SDE_ROOT
    cd $SDE_ROOT
- create a new conda environment and activate it
    conda create -n sde python=3.7
    conda activate sde
- install torch==1.8.0 and torchvision==0.9.0 following the official instructions (I haven't tried other PyTorch versions)
- install other requirements
pip install -r requirements.txt
- to use the Waymo dataset, compile waymo-open-dataset according to the official instructions
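
After installation, a quick check can confirm that the environment matches the versions listed above. The following is only a minimal sketch and not part of this repository: `check_versions` and `EXPECTED` are hypothetical names, and the comparison simply ignores local build tags such as `+cu111`.

```python
import importlib

# Versions named in the installation steps above.
EXPECTED = {"torch": "1.8.0", "torchvision": "0.9.0"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed in the active environment.
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", "unknown")
        # Drop a local build tag (e.g. "1.8.0+cu111" -> "1.8.0") before comparing.
        report[name] = (installed, installed.split("+")[0] == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        print(f"{name}: installed={installed} expected={EXPECTED[name]} ok={ok}")
```

A mismatch does not necessarily mean the code will fail; it only means the combination is untested.
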
- Download and extract the KITTI raw dataset, the refined KITTI depth ground truth, and the Eigen split files
- Modify the data path in the config file
- Download Waymo tfrecords
- Extract image and depth from tfrecords
    python tools/extract_waymo_data.py --src_dir path/to/tfrecords --dst_dir path/to/extracted/data --split training
    python tools/extract_waymo_data.py --src_dir path/to/tfrecords --dst_dir path/to/extracted/data --split validation
- Modify the data path in the config file
python path/to/project/train.py --num-gpus 2 --cfg path/to/config RUN_NAME run_name
python path/to/project/train.py --num-gpus 2 --cfg path/to/config --eval MODEL.WEIGHTS /path/to/checkpoint_file
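
The trailing `RUN_NAME run_name` and `MODEL.WEIGHTS /path/to/checkpoint_file` arguments look like detectron2-style config overrides, passed as alternating key/value pairs where dots address nested entries. The sketch below illustrates that idea in plain Python. It assumes a nested-dict config and a hypothetical `apply_overrides` helper; it is not the code this repository or detectron2 actually runs, and the default values shown are made up.

```python
def apply_overrides(cfg, opts):
    """Apply ["KEY.SUBKEY", value, ...] pairs to a nested dict config, in place."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must be given as KEY VALUE pairs")
    for key, value in zip(opts[0::2], opts[1::2]):
        node = cfg
        *parents, leaf = key.split(".")
        for name in parents:
            node = node[name]  # raises KeyError if a parent section is missing
        if leaf not in node:
            # Design choice of this sketch: unknown keys are rejected, not created.
            raise KeyError(key)
        node[leaf] = value
    return cfg


# Hypothetical defaults; the real keys live in the repository's config files.
cfg = {"RUN_NAME": "default", "MODEL": {"WEIGHTS": ""}}
apply_overrides(cfg, ["RUN_NAME", "run_name",
                      "MODEL.WEIGHTS", "/path/to/checkpoint_file"])
print(cfg)
```

Rejecting unknown keys means a typo in an override name fails immediately instead of being silently ignored.
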
model | type | config | abs rel err | sq rel err | rms | log rms | d1 | d2 | d3 |
---|---|---|---|---|---|---|---|---|---|
ResNet-18 | supervised | link | 0.076 | 0.306 | 3.066 | 0.116 | 0.936 | 0.990 | 0.998 |
ResNet-50 | supervised | link | 0.069 | 0.282 | 2.977 | 0.107 | 0.943 | 0.991 | 0.998 |
BTSNet (ResNet-50) | supervised | link | 0.062 | 0.259 | 2.859 | 0.100 | 0.950 | 0.992 | 0.998 |
MonoDepth2 (ResNet-18) | self-supervised | link | 0.118 | 0.735 | 4.517 | 0.163 | 0.860 | 0.974 | 0.994 |
MonoDepth2 (ResNet-50) | self-supervised | link | 0.108 | 0.674 | 4.414 | 0.153 | 0.882 | 0.976 | 0.994 |
PackNet (1A) | self-supervised | link | 0.107 | 0.762 | 4.577 | 0.159 | 0.884 | 0.972 | 0.992 |
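
For readers unfamiliar with the column names, the sketch below computes the metrics in the table using their commonly used definitions (d1, d2, d3 are the fractions of pixels whose ratio max(gt/pred, pred/gt) is below 1.25, 1.25², and 1.25³). It is an illustration only: `depth_metrics` is a hypothetical helper, and any depth caps, crops, or scale alignment applied by this repository's evaluation code are not reproduced here.

```python
import numpy as np


def depth_metrics(gt, pred):
    """Common depth error metrics over pixels with valid ground truth (gt > 0).

    Assumes pred is strictly positive wherever gt is valid.
    """
    gt = np.asarray(gt, dtype=np.float64)
    pred = np.asarray(pred, dtype=np.float64)
    valid = gt > 0  # pixels without ground truth are stored as 0 in this sketch
    gt, pred = gt[valid], pred[valid]

    ratio = np.maximum(gt / pred, pred / gt)
    return {
        "abs_rel": float(np.mean(np.abs(gt - pred) / gt)),
        "sq_rel": float(np.mean((gt - pred) ** 2 / gt)),
        "rms": float(np.sqrt(np.mean((gt - pred) ** 2))),
        "log_rms": float(np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))),
        "d1": float(np.mean(ratio < 1.25)),
        "d2": float(np.mean(ratio < 1.25 ** 2)),
        "d3": float(np.mean(ratio < 1.25 ** 3)),
    }


# Toy example: three valid pixels, one of them off by a factor of two.
print(depth_metrics([1.0, 2.0, 4.0, 0.0], [1.0, 2.0, 2.0, 5.0]))
```

Lower is better for the four error columns, and higher is better for d1, d2, and d3.
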
python tools/demo.py --cfg path/to/config --input path/to/image --output path/to/output_dir MODEL.WEIGHTS /path/to/checkpoint_file
visualization:
- add PackNet
- add Dynamic Motion Learning (implemented but still buggy; help is welcome!)
- add Depth From Videos in the Wild
- add Full Surround Monodepth
- support more datasets
- detectron2
- Digging into Self-Supervised Monocular Depth Prediction
- From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation
- 3D Packing for Self-Supervised Monocular Depth Estimation
- Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
- Unsupervised Monocular Depth and Motion Learning