ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction
Ziyu Tang, Weicai Ye, Yifan Wang, Di Huang, Hujun Bao, Tong He, Guofeng Zhang
https://zju3dv.github.io/nd-sdf/static/videos/e0_notop_rgb.mp4
See more results on the project page.
Clone this repository:
git clone https://github.com/zju3dv/ND-SDF.git
cd ND-SDF
Create a new conda environment:
conda create -n ndsdf python=3.8
conda activate ndsdf
Install PyTorch and other dependencies:
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
TODO List:
- Light training config (512 or 1024 rays per batch): We do not use an MLP for scene representation, since it inherently loses detail in complex scenes. With a hash grid as the scene primitive, we sample 4×1024 rays per batch, which is more stable, especially when training on ScanNet, where the images suffer heavily from motion blur.
Our code is compatible with the data format of MonoSDF. We thank MonoSDF for providing the following indoor datasets: ScanNet, Replica, and Tanks and Temples. Click the hyperlinks to download the datasets with preprocessed priors. After downloading, extract the files to the data folder in the root directory.
The data structure should look like this:
ND-SDF
├── data
│ ├── scannet
│ │ ├── scan1
│ │ ├── ...
│ ├── replica
│ │ ├── scan1
│ │ ├── ...
│ ├── ...
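Each scan folder follows the MonoSDF per-scan layout. As a quick sanity check (a minimal sketch, assuming the standard MonoSDF files, i.e., cameras.npz plus per-frame *_rgb.png / *_depth.npy / *_normal.npy priors; exact file names may vary by dataset):

# list one scan folder and confirm the monocular priors are present
ls data/scannet/scan1 | head
ls data/scannet/scan1/*_rgb.png | wc -l     # number of RGB frames
ls data/scannet/scan1/*_normal.npy | wc -l  # should match the RGB frame count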
- We provide three preprocessed scenes from ScanNet++ with incorporated monocular priors. For the full dataset, please refer to NeuRodin for tips on downloading and preprocessing.
ScanNet:
torchrun --nproc_per_node=1 exp_runner.py --conf confs/scannet.yaml --scan_id 1
Replica:
torchrun --nproc_per_node=1 exp_runner.py --conf confs/replica.yaml --scan_id 1
Tanksandtemples:
torchrun --nproc_per_node=1 exp_runner.py --conf confs/tnt.yaml --scan_id 1
ScanNet++:
torchrun --nproc_per_node=1 exp_runner.py --conf confs/scannetpp.yaml --scan_id 1
You can continue training from the latest checkpoint:
torchrun --nproc_per_node=1 exp_runner.py \
--conf <path_to_config_file> \ # e.g., runs/exp_name/timestamp/conf.yaml
--is_continue
Or continue training from a specific checkpoint:
torchrun --nproc_per_node=1 exp_runner.py \
--conf <path_to_config_file> \
--is_continue \
--checkpoint <path_to_checkpoint> # e.g., runs/exp_name/timestamp/checkpoints/xxx.pth
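For instance, with a hypothetical run directory (substitute the experiment name, timestamp, and checkpoint file that exp_runner.py actually wrote under runs/):

# illustrative paths only; replace with your own run directory
torchrun --nproc_per_node=1 exp_runner.py \
    --conf runs/scannet_1/2024-08-20_12-00-00/conf.yaml \
    --is_continue \
    --checkpoint runs/scannet_1/2024-08-20_12-00-00/checkpoints/20000.pth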
Our code supports multi-GPU training (DDP). You can specify the number of GPUs via --nproc_per_node, e.g., torchrun --nproc_per_node=4 ...
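For example, to train a ScanNet scene on 4 GPUs:

torchrun --nproc_per_node=4 exp_runner.py --conf confs/scannet.yaml --scan_id 1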
After training, you can extract the mesh by running:
python scripts/extract_mesh.py \
--conf <path_to_config_file> \
--checkpoint <path_to_checkpoint> \
--res 512 \ # marching cubes resolution; we recommend 1024 or higher when evaluating large, complex scenes
--textured # optional, if you want to extract textured mesh
Our evaluation protocols for the ScanNet and Replica datasets are consistent with DebSDF, and we thank the authors for providing their code. Below are the evaluation scripts for meshes extracted during or after training.
ScanNet:
cd evals/scannet_eval
python evaluate.py --exp_name <exp_name> # <exp_name> refers to the name defined in the config file
Replica:
cd evals/replica_eval
python evaluate.py --exp_name <exp_name>
ScanNet++:
cd evals/scannetpp_eval
python evaluate.py --exp_name <exp_name>
You can also evaluate a specific extracted mesh. Take the ScanNet dataset as an example:
cd evals/scannet_eval
python evaluate_single_mesh.py \
--mesh_dir <path_to_mesh_dir> \
--scan_id 1
For the Tanks and Temples dataset, please refer to the official evaluation page for details. We provide a submission example in Submission.
Tips:
- For the ScanNet++ dataset, we recommend evaluating a mesh extracted at 1024/2048 resolution, as it contains more details and finer structures. First run scripts/extract_mesh.py to extract the high-resolution mesh, then evaluate it with evals/scannetpp_eval/evaluate_single_mesh.py, as shown below.
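Putting the two steps together (placeholder paths as above):

# extract a high-resolution mesh, then evaluate it directly
python scripts/extract_mesh.py \
    --conf <path_to_config_file> \
    --checkpoint <path_to_checkpoint> \
    --res 1024
cd evals/scannetpp_eval
python evaluate_single_mesh.py \
    --mesh_dir <path_to_mesh_dir> \
    --scan_id 1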
We support converting your own COLMAP dataset to the ND-SDF format. The input data structure should look like this:
<custom_dataset_dir>
├── input
│ ├── 1.jpg
│ ├── ...
Run COLMAP on the custom dataset:
cd preprocess/datasets
python convert.py -s <custom_dataset_dir>
Convert the COLMAP format to the ND-SDF format:
python process_colmap_to_json.py -i <custom_dataset_dir>
We provide a simple, interactive procedure to help locate the bounding box of the scene (highly recommended). Try it by adding the --if_interactive flag.
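For example:

python process_colmap_to_json.py -i <custom_dataset_dir> --if_interactive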
We provide scripts to extract monocular cues using Omnidata.
Install:
cd preprocess/omnidata
sh download.sh
Then extract monocular cues:
python extract_mono_cues_square.py --input_dir <custom_dataset_dir>/rgb --output_dir <custom_dataset_dir> --task normal
python extract_mono_cues_square.py --input_dir <custom_dataset_dir>/rgb --output_dir <custom_dataset_dir> --task depth
Train on the custom dataset:
torchrun --nproc_per_node=1 exp_runner.py --conf confs/custom.yaml --scan_id -1 --data_dir <custom_dataset_dir>
Notes: This codebase is built from scratch and additionally features NeuS as an alternative backend, NerfAcc for fast, Instant-Angelo-style acceleration, etc. For more details, please refer to the codebase.
If you find this code useful for your research, please use the following BibTeX entry.
@article{tang2024ndsdf,
title={ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction},
author={Ziyu Tang and Weicai Ye and Yifan Wang and Di Huang and Hujun Bao and Tong He and Guofeng Zhang},
journal={arXiv preprint},
year={2024}
}