Created by Wang Zhao, Shaohui Liu, Yezhi Shu, Yong-Jin Liu.
This implementation is based on our CVPR'2020 paper "Towards Better Generalization: Joint Depth-Pose Learning without PoseNet". You can find the arXiv version of the paper here. In this repository we release code and pre-trained models for TrianFlow (our method) and a strong baseline PoseNet-Flow method.
The code is based on Python 3.6. You can use either virtualenv or conda to set up a dedicated environment, and then run:
pip install -r requirements.txt
To run a depth prediction demo, first download the pretrained model from here.
python test.py --config_file ./config/default_1scale.yaml --gpu 0 --mode depth --task demo --image_path ./data/demo/kitti.png --pretrained_model ./models/pretrained/depth_pretrained.pth --result_dir ./data/demo
This will give you a predicted depth map for the demo image.
- For KITTI depth and flow tasks, download the KITTI raw dataset using the script provided on the official website. You also need to download the KITTI 2015 dataset to evaluate the predicted optical flow. Run the following commands to generate ground-truth files for the Eigen test images.
cd ./data/eigen
python export_gt_depth.py --data_path /path/to/your/kitti/root
- For the KITTI Odometry task, download the KITTI Odometry dataset.
- For NYUv2 experiments, download the NYUv2 raw sequences and the labeled data mat, as well as the training/test split mat from here. Put the labeled data and the splits file under the same directory. The data structure should be:
nyuv2
| basements
| cafe
| ...
nyuv2_test
| nyu_depth_v2_labeled.mat
| splits.mat
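Before training on NYUv2, it can help to confirm that the folders match the layout above. The following is a minimal sketch, not part of the released code: the function name `check_nyuv2_layout` is hypothetical, and it assumes you pass the `nyuv2` and `nyuv2_test` directories explicitly, wherever you placed them.

```python
from pathlib import Path


def check_nyuv2_layout(raw_dir, test_dir):
    """Return a list of problems; an empty list means the layout looks right.

    raw_dir  -- the 'nyuv2' folder holding one sub-folder per raw scene
    test_dir -- the 'nyuv2_test' folder holding the labeled data and split files
    (Hypothetical helper; only the file names come from the layout above.)
    """
    problems = []
    raw, test = Path(raw_dir), Path(test_dir)

    # The raw directory should exist and contain at least one scene folder.
    if not raw.is_dir():
        problems.append(f"missing raw sequence directory: {raw}")
    elif not any(p.is_dir() for p in raw.iterdir()):
        problems.append(f"no scene folders found in: {raw}")

    # Both .mat files are expected side by side in the test directory.
    for name in ("nyu_depth_v2_labeled.mat", "splits.mat"):
        if not (test / name).is_file():
            problems.append(f"missing file: {test / name}")

    return problems
```

Calling `check_nyuv2_layout("/path/to/nyuv2", "/path/to/nyuv2_test")` and printing the returned list gives a quick report of anything that is out of place.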
- Modify the configuration file in the ./config directory to set your paths. The config file contains the important paths and default hyper-parameters used in the training process.
- For KITTI depth, we use a three-stage training schedule.
1. python train.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode flow --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models]
2. python train.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode depth --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models] --flow_pretrained_model [path/to/your/stage1_flow_model]
3. python train.py --config_file ./config/kitti_3stage.yaml --gpu [gpu_id] --mode depth_pose --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models] --depth_pretrained_model [path/to/your/stage2_depth_model]
If you are running experiments on a dataset for the first time, the script first processes the data and saves it in the [prepared_base_dir] path defined in your config file. For other datasets such as KITTI Odometry and NYUv2, you can run the same commands with the appropriate config file.
We also implement and release code for the strong baseline PoseNet-Flow method. You can run it with two-stage training:
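The three-stage KITTI depth schedule above passes the checkpoint from each stage into the next, so it can be convenient to generate the three commands from one place. Below is a minimal sketch that only assembles the command lines using the flags listed above. The helper names `build_kitti_depth_stages` and `format_command` are hypothetical, and the checkpoint paths are placeholders you must fill in yourself, since the file names that train.py writes are not specified here.

```python
import shlex


def build_kitti_depth_stages(gpu, prepared_save_dir, model_dir, flow_model, depth_model):
    """Return the three train.py invocations as argument lists.

    flow_model  -- path to the stage-1 flow checkpoint (placeholder, chosen by you)
    depth_model -- path to the stage-2 depth checkpoint (placeholder, chosen by you)
    """
    base = ["python", "train.py", "--config_file"]
    common = [
        "--gpu", str(gpu),
        "--prepared_save_dir", prepared_save_dir,
        "--model_dir", model_dir,
    ]
    return [
        # Stage 1: train the optical flow network.
        base + ["./config/kitti.yaml"] + common + ["--mode", "flow"],
        # Stage 2: train depth, starting from the stage-1 flow model.
        base + ["./config/kitti.yaml"] + common
        + ["--mode", "depth", "--flow_pretrained_model", flow_model],
        # Stage 3: joint depth-pose training, starting from the stage-2 depth model.
        base + ["./config/kitti_3stage.yaml"] + common
        + ["--mode", "depth_pose", "--depth_pretrained_model", depth_model],
    ]


def format_command(args):
    """Join an argument list into a shell-safe string for printing or logging."""
    return " ".join(shlex.quote(a) for a in args)
```

Each returned list can be printed with `format_command` for inspection, or passed to `subprocess.run(args, check=True)` once the previous stage has finished and its checkpoint path is known.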
1. python train.py --config_file [path/to/your/config/file] --gpu [gpu_id] --mode flow --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models]
2. python train.py --config_file [path/to/your/config/file] --gpu [gpu_id] --mode flowposenet --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models] --flow_pretrained_model [path/to/your/stage1_flow_model]
We provide pretrained models here for different tasks. The performance may differ slightly from the paper due to randomness.
- To evaluate the monocular depth estimation on the KITTI Eigen test split, run:
python test.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode depth --task kitti_depth --pretrained_model [path/to/your/model] --result_dir [path/to/save/results]
- To evaluate the monocular depth estimation on the NYUv2 test split, run:
python test.py --config_file ./config/nyu.yaml --gpu [gpu_id] --mode depth --task nyuv2 --pretrained_model [path/to/your/model] --result_dir [path/to/save/results]
- To evaluate the optical flow estimation on KITTI 2015, run:
python test.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode flow_3stage --task kitti_flow --pretrained_model [path/to/your/model] --result_dir [path/to/save/results]
- To evaluate the visual odometry task on the KITTI Odometry dataset, first get predictions on a single sequence and then evaluate:
python infer_vo.py --config_file ./config/odo.yaml --gpu [gpu_id] --traj_save_dir_txt [where/to/save/the/prediction/file] --sequences_root_dir [the/root/dir/of/your/image/sequences] --sequence [the sequence id] --pretrained_model [path/to/your/model]
python ./core/evaluation/eval_odom.py --gt_txt [path/to/your/groundtruth/poses/txt] --result_txt [path/to/your/prediction/txt] --seq [sequence id to evaluate]
You can also evaluate on a sampled KITTI Odometry dataset by sampling the raw image sequences and the ground-truth pose txt with the same stride. Then run infer_vo.py on the sampled image sequence, and eval_odom.py with the predicted txt and the sampled ground-truth txt, to get the results.
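When sampling a sequence, the images and the ground-truth poses have to be subsampled with the same stride so that they stay aligned. The sketch below illustrates this. It is not part of the released code: the function names are hypothetical, and it assumes the pose txt holds one pose per line in frame order, with as many lines as there are images.

```python
def sample_every(items, stride, offset=0):
    """Keep every `stride`-th element of a list, starting at `offset`."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return items[offset::stride]


def sample_sequence(frame_names, pose_lines, stride):
    """Subsample frames and ground-truth pose lines with the same stride.

    frame_names -- image file names sorted in frame order
    pose_lines  -- lines of the ground-truth pose txt, one pose per frame (assumed)
    """
    if len(frame_names) != len(pose_lines):
        raise ValueError("number of frames and pose lines must match")
    return sample_every(frame_names, stride), sample_every(pose_lines, stride)
```

The sampled frame names can then be copied into a new sequence folder and the sampled pose lines written to a new txt file, which serve as the inputs to infer_vo.py and eval_odom.py respectively.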
If you find our work useful in your research, please consider citing:
@inproceedings{zhao2020towards,
title={Towards Better Generalization: Joint Depth-Pose Learning without PoseNet},
author={Zhao, Wang and Liu, Shaohui and Shu, Yezhi and Liu, Yong-Jin},
booktitle={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2020}
}