Haithem Turki, Jason Y. Zhang, Francesco Ferroni, Deva Ramanan
This repository contains the code needed to train SUDS models.
    @misc{turki2023suds,
        title={SUDS: Scalable Urban Dynamic Scenes},
        author={Haithem Turki and Jason Y. Zhang and Francesco Ferroni and Deva Ramanan},
        year={2023},
        eprint={2303.14536},
        archivePrefix={arXiv},
        primaryClass={cs.CV}
    }
    conda env create -f environment.yml
    conda activate suds
    python setup.py install
The codebase has mainly been tested against CUDA >= 11.3 and A100/A6000 GPUs. GPUs with compute capability greater than or equal to 7.5 should generally work, although you may need to adjust batch sizes to fit within GPU memory constraints.
- Download the following from the KITTI MOT dataset:
- Extract everything to `./data/kitti` and keep the data structure.
- Generate depth maps from the Velodyne point clouds:
  `python scripts/create_kitti_depth_maps.py --kitti_sequence $SEQUENCE`
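Conceptually, `create_kitti_depth_maps.py` projects each Velodyne return into the camera image to form a sparse depth map. A minimal NumPy sketch of that projection (assuming points already transformed into camera coordinates and a 3x3 intrinsics matrix `K`; the actual script additionally parses the KITTI calibration files):

```python
import numpy as np

def points_to_depth_map(points_cam, K, height, width):
    """Project LiDAR points (in camera coordinates) into a sparse depth map,
    keeping the nearest depth when several points land on the same pixel."""
    z = points_cam[:, 2]
    in_front = z > 0                       # discard points behind the camera
    pts, z = points_cam[in_front], z[in_front]
    uvw = (K @ pts.T).T                    # homogeneous pixel coordinates
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth = np.full((height, width), np.inf)
    for ui, vi, zi in zip(u[inside], v[inside], z[inside]):
        depth[vi, ui] = min(depth[vi, ui], zi)
    depth[np.isinf(depth)] = 0.0           # 0 marks pixels with no LiDAR return
    return depth
```

Pixels without a return stay at depth 0, which is how sparse LiDAR depth maps are commonly encoded.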
- (Optional) Generate sky and static masks from semantic labels:
  `python scripts/create_kitti_masks.py --kitti_sequence $SEQUENCE`
- Create the metadata file:
  `python scripts/create_kitti_metadata.py --config_file scripts/configs/$CONFIG_FILE`
- Extract DINO features:
  `python scripts/extract_dino_features.py --metadata_path $METADATA_PATH`
  or, for multi-GPU extraction:
  `python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_features.py --metadata_path $METADATA_PATH`
- Run PCA on the extracted features:
  `python scripts/run_pca.py --metadata_path $METADATA_PATH`
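The PCA step (`run_pca.py`) compresses the high-dimensional DINO descriptors into a compact basis before training. The idea can be sketched with plain NumPy; this is a conceptual illustration rather than the script's actual implementation, and the 384-dim input is just an example matching ViT-S DINO features:

```python
import numpy as np

def pca_reduce(features, n_components=64):
    """Project row-vector descriptors onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Right singular vectors of the centered matrix are the principal axes,
    # ordered by decreasing singular value (i.e. decreasing variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 384))      # stand-in for per-pixel DINO descriptors
reduced = pca_reduce(feats, n_components=64)
print(reduced.shape)                      # (1000, 64)
```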
- Extract DINO correspondences:
  `python scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH`
  or, for multi-GPU extraction:
  `python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH`
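The correspondence extraction pairs pixels across frames by DINO descriptor similarity. A common scheme for this, shown here as a hedged sketch rather than the script's exact logic, is mutual nearest-neighbor matching on normalized features:

```python
import numpy as np

def mutual_nearest_neighbors(feats_a, feats_b):
    """Match two sets of descriptors by cosine similarity, keeping only
    pairs that are each other's nearest neighbor."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T
    nn_ab = sim.argmax(axis=1)   # best match in B for each descriptor in A
    nn_ba = sim.argmax(axis=0)   # best match in A for each descriptor in B
    mutual = nn_ba[nn_ab] == np.arange(len(a))
    return np.stack([np.arange(len(a))[mutual], nn_ab[mutual]], axis=1)
```

The mutual check filters out one-sided matches, which keeps correspondences precise at the cost of density.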
- (Optional) Generate feature clusters for visualization:
  `python scripts/create_kitti_feature_clusters.py --metadata_path $METADATA_PATH --output_path $OUTPUT_PATH`
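Feature clustering for visualization can be illustrated with a bare-bones Lloyd's k-means over descriptors; this is a sketch only, and `create_kitti_feature_clusters.py` may use different clustering settings:

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's algorithm: assign points to the nearest center,
    then recompute each center as the mean of its members."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # Distance from every feature to every center.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers
```

Each cluster index can then be mapped to a color to render the learned features as a segmentation-like overlay.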
- Download the following from the VKITTI2 dataset:
  - RGB images
  - Depth images
  - Camera intrinsics/extrinsics
  - (Optional) Ground truth forward flow
  - (Optional) Ground truth backward flow
  - (Optional) Semantic labels
- Extract everything to `./data/vkitti2` and keep the data structure.
- (Optional) Generate sky and static masks from semantic labels:
  `python scripts/create_vkitti2_masks.py --vkitti2_path $SCENE_PATH`
- Create the metadata file:
  `python scripts/create_vkitti2_metadata.py --config_file scripts/configs/$CONFIG_FILE`
- Extract DINO features:
  `python scripts/extract_dino_features.py --metadata_path $METADATA_PATH`
  or, for multi-GPU extraction:
  `python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_features.py --metadata_path $METADATA_PATH`
- Run PCA on the extracted features:
  `python scripts/run_pca.py --metadata_path $METADATA_PATH`
- If not using the ground truth flow provided by VKITTI2, extract DINO correspondences:
  `python scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH`
  or, for multi-GPU extraction:
  `python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS scripts/extract_dino_correspondences.py --metadata_path $METADATA_PATH`
- (Optional) Generate feature clusters for visualization:
  `python scripts/create_vkitti2_feature_clusters.py --metadata_path $METADATA_PATH --vkitti2_path $SCENE_PATH --output_path $OUTPUT_PATH`
To train a model, run:
`python suds/train.py suds --experiment-name $EXPERIMENT_NAME --pipeline.datamanager.dataparser.metadata_path $METADATA_PATH [--pipeline.feature_clusters $FEATURE_CLUSTERS]`
To evaluate a trained model, run:
`python suds/eval.py --load_config $SAVED_MODEL_PATH`
or, for multi-GPU evaluation:
`python -m torch.distributed.run --standalone --nnodes=1 --nproc_per_node $NUM_GPUS suds/eval.py --load_config $SAVED_MODEL_PATH`
This project is built on Nerfstudio and tiny-cuda-nn. The DINO feature extraction scripts are based on ShirAmir's implementation, and parts of the KITTI processing code are adapted from Neural Scene Graphs.