This is a Python3 / PyTorch implementation of FSRE-Depth, as described in the following paper:
Fine-grained Semantics-aware Representation Enhancement for Self-supervised Monocular Depth Estimation
Hyunyoung Jung, Eunhyeok Park and Sungjoo Yoo
ICCV 2021 (oral)
The code is based on Monodepth2.
This code was implemented under torch==1.3.0 and torchvision==0.4.1, using two NVIDIA TITAN Xp GPUs with distributed training. Different versions may produce different results.
pip install -r requirements.txt
KITTI Raw Data and pre-computed segmentation images are required for training.
KITTI/
├── 2011_09_26/
├── 2011_09_28/
├── 2011_09_29/
├── 2011_09_30/
├── 2011_10_03/
└── segmentation/ # download and unzip "segmentation.zip"
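Before launching a long training run, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of this repository: the helper name `missing_kitti_dirs` is hypothetical, and it only checks that the listed top-level directories exist, not their contents.

```python
from pathlib import Path

# Top-level directory names copied from the layout shown above.
EXPECTED_DIRS = [
    "2011_09_26",
    "2011_09_28",
    "2011_09_29",
    "2011_09_30",
    "2011_10_03",
    "segmentation",
]


def missing_kitti_dirs(data_path):
    """Return the expected sub-directories that are absent under data_path.

    Hypothetical helper for illustration; it does not inspect file contents.
    """
    root = Path(data_path)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_kitti_dirs("YOUR_KITTI_DATA_PATH")` returns an empty list when every directory is present, and otherwise the names that still need to be downloaded or unzipped.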
To train the full model, run the command below:
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node 2 --master_port YOUR_PORT_NUMBER train_ddp.py --data_path YOUR_KITTI_DATA_PATH
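If you are unsure which value to use for YOUR_PORT_NUMBER, one option is to ask the operating system for a free port and assemble the command programmatically. The sketch below is an illustration only: `find_free_port` and `build_train_command` are hypothetical helpers, and only the flags shown in the command above are used. Note that the port could in principle be taken by another process between the check and the launch.

```python
import socket


def find_free_port():
    """Ask the OS for a TCP port that is currently unused (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("", 0))  # port 0 lets the OS choose any free port
        return sock.getsockname()[1]


def build_train_command(data_path, master_port, nproc_per_node=2):
    """Assemble the training command shown above as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(nproc_per_node),
        "--master_port", str(master_port),
        "train_ddp.py",
        "--data_path", str(data_path),
    ]
```

The resulting list can be passed to `subprocess.run`, with `CUDA_VISIBLE_DEVICES` set through the `env` argument to select the GPUs.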
The ground truth depth maps should be prepared prior to evaluation.
python export_gt_depth.py --data_path YOUR_KITTI_DATA_PATH --split eigen
MODEL_DIR should be laid out as follows:
MODEL_DIR
├── encoder.pth # required
├── decoder.pth # required
├── ...
Run the evaluation command.
python evaluate_depth.py --load_weights_folder MODEL_DIR --data_path YOUR_KITTI_DATA_PATH
| Backbone | Input | Download | AbsRel | SqRel | RMSE | RMSE log | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
|---|---|---|---|---|---|---|---|---|---|
| ResNet-18 | 192 x 640 | Drive (.zip) | 0.105 | 0.708 | 4.546 | 0.182 | 0.886 | 0.964 | 0.983 |
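For readers unfamiliar with the error and accuracy columns in the table above, the sketch below computes them using the definitions commonly used for monocular depth evaluation. It is a plain-Python illustration over flat lists of valid depth values, not a copy of `evaluate_depth.py`; the function name `depth_metrics` and the dictionary keys are assumptions, and any masking, cropping, or scaling applied by the actual evaluation script is not reproduced here.

```python
import math


def depth_metrics(gt, pred):
    """Compute common depth metrics from paired positive depth values.

    gt and pred are equal-length sequences of ground-truth and predicted
    depths. Hypothetical helper for illustration only.
    """
    if len(gt) != len(pred) or len(gt) == 0:
        raise ValueError("gt and pred must be non-empty and of equal length")
    if any(g <= 0 or p <= 0 for g, p in zip(gt, pred)):
        raise ValueError("depth values must be strictly positive")

    n = len(gt)
    pairs = list(zip(gt, pred))

    # Error metrics (lower is better).
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in pairs) / n
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    rmse_log = math.sqrt(
        sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n
    )

    # Accuracy metrics (higher is better): fraction of values whose
    # ratio max(gt/pred, pred/gt) falls below 1.25, 1.25^2 and 1.25^3.
    ratios = [max(g / p, p / g) for g, p in pairs]
    a1 = sum(r < 1.25 for r in ratios) / n
    a2 = sum(r < 1.25 ** 2 for r in ratios) / n
    a3 = sum(r < 1.25 ** 3 for r in ratios) / n

    return {
        "abs_rel": abs_rel, "sq_rel": sq_rel,
        "rmse": rmse, "rmse_log": rmse_log,
        "a1": a1, "a2": a2, "a3": a3,
    }
```

A perfect prediction gives zero for the four error metrics and 1.0 for the three accuracy metrics.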
Please use the following citation when referencing our work:
@InProceedings{Jung_2021_ICCV,
author = {Jung, Hyunyoung and Park, Eunhyeok and Yoo, Sungjoo},
title = {Fine-Grained Semantics-Aware Representation Enhancement for Self-Supervised Monocular Depth Estimation},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {12642-12652}
}