MINSU3D: MinkowskiEngine-powered Scene Understanding in 3D contains reimplementations of state-of-the-art 3D scene understanding methods on point clouds, powered by MinkowskiEngine.
We support the following instance segmentation methods:
- PointGroup
- HAIS
- SoftGroup
We also provide bounding box predictions derived from the instance segmentation results for 3D object detection.
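One simple way to derive boxes from an instance segmentation is to take the axis-aligned extent of the points assigned to each instance. The sketch below illustrates that idea only; the function name, the `ignore_id` convention and the sample data are assumptions for illustration and are not taken from the MINSU3D code.

```python
from collections import defaultdict


def instance_bounding_boxes(points, instance_ids, ignore_id=-1):
    """Return {instance_id: (min_xyz, max_xyz)} as axis-aligned boxes."""
    grouped = defaultdict(list)
    for point, inst in zip(points, instance_ids):
        if inst != ignore_id:  # skip points that belong to no instance
            grouped[inst].append(point)

    boxes = {}
    for inst, pts in grouped.items():
        xs, ys, zs = zip(*pts)
        boxes[inst] = ((min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs)))
    return boxes


# Made-up points: two instances plus one unassigned point.
points = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (4.0, 4.0, 1.0), (5.0, 6.0, 3.0), (9.0, 9.0, 9.0)]
instance_ids = [0, 0, 1, 1, -1]
boxes = instance_bounding_boxes(points, instance_ids)
print(boxes)
```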
- Highly-modularized design enables researchers to easily add different models and datasets.
- Multi-GPU and distributed training support through PyTorch Lightning.
- Better logging with W&B, periodic evaluation during training.
- Easy experiment configuration and management with Hydra.
- Unified and optimized C++ and CUDA extensions.
- MINSU3D v2.0 release: ~1.8 times faster, with ~4GB less CPU memory usage and ~400MB less GPU memory usage.
We recommend the use of miniconda to manage system dependencies.
# create and activate the conda environment
conda create -n minsu3d python=3.10
conda activate minsu3d
# install PyTorch 2.0
conda install pytorch pytorch-cuda=11.7 -c pytorch -c nvidia
# install Python libraries
pip install .
# install OpenBLAS
conda install openblas-devel --no-deps -c anaconda
# install MinkowskiEngine
pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps \
--install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas"
# install C++ extensions
export CPATH=$CONDA_PREFIX/include:$CPATH
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
cd minsu3d/common_ops
python setup.py develop
Note: Setting up with pip (no conda) requires OpenBLAS to be pre-installed on your system.
# create and activate the virtual environment
virtualenv --no-download env
source env/bin/activate
# install PyTorch 2.0
pip3 install torch
# install Python libraries
pip install .
# install OpenBLAS via APT
sudo apt install libopenblas-dev
# install MinkowskiEngine
pip install MinkowskiEngine
# install C++ extensions
cd minsu3d/common_ops
python setup.py develop
- Download the ScanNet v2 dataset and put it under `minsu3d/data/scannetv2`. To get access to the dataset, please refer to their instructions. You will get a `download-scannet.py` script after your request is approved:
# about 10.7GB in total
python download-scannet.py -o data/scannetv2 --type _vh_clean_2.ply
python download-scannet.py -o data/scannetv2 --type _vh_clean.aggregation.json
python download-scannet.py -o data/scannetv2 --type _vh_clean_2.0.010000.segs.json
The raw dataset files should be organized as follows:
minsu3d
├── data
│ ├── scannetv2
│ │ ├── scans
│ │ │ ├── [scene_id]
│ │ │ │ ├── [scene_id]_vh_clean_2.ply
│ │ │ │ ├── [scene_id]_vh_clean_2.0.010000.segs.json
│ │ │ │ ├── [scene_id].aggregation.json
│ │ │ │ ├── [scene_id].txt
- Preprocess the data. This step converts the original meshes and annotations to `.pth` data:
python data/scannetv2/preprocess_all_data.py data=scannetv2
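Before preprocessing, it can help to confirm that every scene folder contains the raw files listed in the directory tree above. The following is a minimal sketch of such a check; the helper name is hypothetical, and the file suffixes are copied from that tree rather than from the preprocessing script itself.

```python
from pathlib import Path

# File suffixes taken from the directory tree in this README.
EXPECTED_SUFFIXES = (
    "_vh_clean_2.ply",
    "_vh_clean_2.0.010000.segs.json",
    ".aggregation.json",
    ".txt",
)


def find_incomplete_scenes(scans_dir):
    """Map each scene id to the list of expected files that are missing."""
    missing = {}
    for scene_dir in sorted(Path(scans_dir).iterdir()):
        if not scene_dir.is_dir():
            continue
        absent = [
            scene_dir.name + suffix
            for suffix in EXPECTED_SUFFIXES
            if not (scene_dir / (scene_dir.name + suffix)).is_file()
        ]
        if absent:
            missing[scene_dir.name] = absent
    return missing


if __name__ == "__main__":
    scans = Path("data/scannetv2/scans")  # run from the repository root
    if scans.is_dir():
        for scene_id, files in find_incomplete_scenes(scans).items():
            print(scene_id, "is missing", ", ".join(files))
```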
Note: Configuration files are managed by Hydra, so you can add or override any configuration attribute by passing it as a command-line argument.
# log in to WandB
wandb login
# train a model from scratch
# available model_name: pointgroup, hais, softgroup
# available dataset_name: scannetv2
python train.py model={model_name} data={dataset_name} experiment_name={experiment_name}
# train a model from scratch with 2 GPUs
python train.py model={model_name} data={dataset_name} model.trainer.devices=2
# train a model from a checkpoint
python train.py model={model_name} data={dataset_name} model.ckpt_path={checkpoint_path}
# test a pretrained model
python test.py model={model_name} data={dataset_name} model.ckpt_path={pretrained_model_path}
# evaluate inference results
python eval.py model={model_name} data={dataset_name} experiment_name={experiment_name}
# examples:
# python train.py model=pointgroup data=scannetv2 model.trainer.max_epochs=480
# python test.py model=pointgroup data=scannetv2 model.ckpt_path=PointGroup_best.ckpt
# python eval.py model=hais data=scannetv2 experiment_name=run_1
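Arguments such as `model.trainer.devices=2` in the commands above address a nested configuration value by its dotted path. The sketch below mimics only that effect in plain Python. It is not Hydra's implementation, it does not cover config-group selection such as `model=pointgroup`, and the default values in `config` are invented for illustration.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})  # walk down, creating levels as needed
        # Only integers are converted here; everything else stays a string.
        node[leaf] = int(raw_value) if raw_value.isdigit() else raw_value
    return result


# Hypothetical defaults, for illustration only.
config = {"model": {"trainer": {"devices": 1, "max_epochs": 100}}}
updated = apply_overrides(config, ["model.trainer.devices=2", "model.trainer.max_epochs=480"])
print(updated)
```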
We provide pretrained models for ScanNet v2. The pretrained models, corresponding config files, and performance on the ScanNet v2 val set are given below. Note that all MINSU3D models are trained from scratch. After downloading a pretrained model, run `test.py` to do inference as described in the section above.
Model | Code | mean AP | AP 50% | AP 25% | Bbox AP 50% | Bbox AP 25% | Download |
---|---|---|---|---|---|---|---|
MINSU3D PointGroup | config / model | 36.4 | 57.9 | 71.1 | 49.9 | 60.0 | link |
Official PointGroup | - | 35.2 | 57.1 | 71.4 | - | - | - |
MINSU3D HAIS | config / model | 42.6 | 61.9 | 72.6 | 51.4 | 62.9 | link |
Official HAIS (retrained) [2] | - | 42.2 | 61.0 | 72.9 | - | - | - |
Official HAIS | - | 44.1 | 64.4 | 75.7 | - | - | - |
MINSU3D SoftGroup | config / model | 42.3 | 65.1 | 77.8 | 55.8 | 69.3 | link |
Official SoftGroup [1] | - | 46.0 | 67.6 | 78.9 | 59.4 | 71.6 | - |

[1] The official pretrained SoftGroup model was trained with a HAIS checkpoint as the pretrained backbone.

[2] The MINSU3D HAIS model's scores are 2-3 points lower than those of the official pretrained HAIS model. To investigate, we retrained the official HAIS model using their code. The best scores we could get are 42.2 / 61.0 / 72.9 for mean AP / AP 50% / AP 25%, which closely match our MINSU3D HAIS model's scores.
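The AP 50% and AP 25% columns count a predicted instance as correct when its overlap (IoU) with a ground-truth instance is above 0.5 or 0.25, respectively. In the ScanNet benchmark, mean AP averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below shows only the overlap test on sets of point indices, with made-up data; it is not the evaluation code used by `eval.py`.

```python
def mask_iou(pred_points, gt_points):
    """IoU between two instances given as collections of point indices."""
    pred_points, gt_points = set(pred_points), set(gt_points)
    union = len(pred_points | gt_points)
    return len(pred_points & gt_points) / union if union else 0.0


def is_match(pred_points, gt_points, threshold):
    """True if the prediction overlaps the ground truth above the threshold."""
    return mask_iou(pred_points, gt_points) > threshold


pred = range(0, 60)    # predicted instance covers points 0..59
gt = range(20, 100)    # ground-truth instance covers points 20..99
iou = mask_iou(pred, gt)  # 40 shared points out of 100 in the union

print(iou, is_match(pred, gt, 0.25), is_match(pred, gt, 0.5))
```

With this data the prediction would count toward AP 25% but not toward AP 50%.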
We provide scripts to visualize the predicted segmentations and bounding boxes. To use the visualization scripts, place the mesh (ply) files from the ScanNet dataset as follows.
minsu3d
├── data
│ ├── scannetv2
│ │ ├── scans
│ │ │ ├── [scene_id]
│ │ │ │ ├── [scene_id]_vh_clean_2.ply
To visualize the predictions, use `visualize/scannet/generate_prediction_ply.py` to generate ply files with vertices colored according to their semantic or instance labels.
cd visualize/scannet
python generate_prediction_ply.py --predict_dir {path to the predictions} --split {test/val/train} --bbox --mode {semantic/instance} --output_dir {output directory of ply files}
# example:
# python generate_prediction_ply.py --predict_dir ../../output/ScanNet/PointGroup/test/predictions/instance --split val --bbox --mode semantic --output_dir output_ply
The `--mode` option specifies the color mode. In `semantic` mode, objects with the same semantic prediction have the same color. In `instance` mode, each object instance has a unique color, which lets you check how well the model performs on instance segmentation.
The `--bbox` option generates a ply file that uses bounding boxes to mark the positions of objects.
Example renderings (images not included here): Semantic Segmentation (color), Instance Segmentation (color), Semantic Segmentation (bbox), Instance Segmentation (bbox).
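The difference between the two color modes can be sketched as follows: the same per-vertex coloring routine is driven either by semantic labels or by instance ids. The palette function, the names and the sample labels are assumptions for illustration and do not reflect how `generate_prediction_ply.py` picks its colors.

```python
import colorsys


def label_color(label, num_hues=20):
    """Deterministically map an integer label to an RGB tuple in 0..255."""
    hue = (label % num_hues) / num_hues
    r, g, b = colorsys.hsv_to_rgb(hue, 0.8, 0.9)
    return (round(r * 255), round(g * 255), round(b * 255))


def vertex_colors(semantic_labels, instance_ids, mode):
    """Pick the per-vertex label according to mode, then look up its color."""
    if mode == "semantic":
        labels = semantic_labels
    elif mode == "instance":
        labels = instance_ids
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [label_color(label) for label in labels]


# Four vertices sharing one semantic label but split into two instances.
semantic_labels = [5, 5, 5, 5]
instance_ids = [0, 0, 1, 1]
semantic_colors = vertex_colors(semantic_labels, instance_ids, "semantic")
instance_colors = vertex_colors(semantic_labels, instance_ids, "instance")
print(len(set(semantic_colors)), len(set(instance_colors)))
```

In semantic mode all four vertices share one color, while instance mode separates the two objects.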
If you find that many bounding boxes overlap, you can apply non-maximum suppression during inference by adjusting `TEST_NMS_THRESH` in the config file.
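For readers unfamiliar with non-maximum suppression, here is a minimal greedy version on axis-aligned 3D boxes: keep the highest-scoring box, then drop any remaining box whose IoU with a kept box exceeds the threshold. This is a generic sketch with made-up boxes. MINSU3D may apply the suppression differently (for example on instance masks), and treating `TEST_NMS_THRESH` as an IoU threshold is an assumption.

```python
def box_volume(box):
    (x0, y0, z0), (x1, y1, z1) = box
    return (x1 - x0) * (y1 - y0) * (z1 - z0)


def box_iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (min_xyz, max_xyz)."""
    inter = 1.0
    for axis in range(3):
        overlap = min(a[1][axis], b[1][axis]) - max(a[0][axis], b[0][axis])
        if overlap <= 0:
            return 0.0  # boxes are disjoint along this axis
        inter *= overlap
    return inter / (box_volume(a) + box_volume(b) - inter)


def nms(boxes, scores, iou_threshold):
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou_3d(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [
    ((0, 0, 0), (2, 2, 2)),
    ((0, 0, 0), (2, 2, 1.8)),  # almost the same object as the first box
    ((5, 5, 5), (6, 6, 6)),    # a separate object
]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.3)
print(kept)
```

A lower threshold suppresses more aggressively; a threshold near 1.0 removes only near-duplicate boxes.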
Test environment
- CPU: Intel Core i9-9900K @ 3.60GHz × 16
- RAM: 64GB
- GPU: NVIDIA GeForce RTX 2080 Ti 11GB
- System: Ubuntu 22.04.2 LTS
Training time in total (train set only, without validation)
Model | Epochs | Batch Size | MINSU3D | Official Version |
---|---|---|---|---|
PointGroup | 450 | 4 | 28hr | 51hr |
HAIS | 450 | 4 | 38hr | 60hr |
SoftGroup | 256 | 4 | (to be updated) | 30hr |
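The relative speedup implied by the training times above can be computed directly. The snippet uses only the two rows for which both numbers are reported.

```python
# (MINSU3D hours, official hours), copied from the training time table.
training_hours = {"PointGroup": (28, 51), "HAIS": (38, 60)}

speedups = {
    model: official / minsu3d
    for model, (minsu3d, official) in training_hours.items()
}

for model, ratio in speedups.items():
    print(f"{model}: {ratio:.2f}x")
```

This gives roughly 1.82x for PointGroup and 1.58x for HAIS.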
Inference time per scene (avg)
Model | MINSU3D | Official Version |
---|---|---|
PointGroup | (to be updated) | 176ms |
HAIS | (to be updated) | 165ms |
SoftGroup | (to be updated) | 204ms |
MINSU3D makes it easy to add custom datasets and models. All code under `minsu3d/data/dataset` and `minsu3d/model` is automatically registered and managed by Hydra using the configuration files under `config/data` and `config/model`, respectively.
- Add a new dataset config file (.yaml) at `config/data/{your_dataset}.yaml`.
- Add the new dataset processing code at `minsu3d/data/dataset/{your_dataset}.py`. It should inherit the `GeneralDataset()` class from `minsu3d/data/dataset/general_dataset.py`.
- Add a new model config file (.yaml) at `config/model/{your_model}.yaml`.
- Add the new model code at `minsu3d/model/{your_model}.py`. It should inherit the `GeneralModel()` class from `minsu3d/model/general_model.py`.
This repo is built upon MinkowskiEngine, PointGroup, HAIS, and SoftGroup. We train our models on ScanNet. If you use this repo or the pretrained models, please cite the original papers.