Created by Junsheng Zhou, Xin Wen, Baorui Ma, Yu-Shen Liu, Yue Gao, Yi Fang, Zhizhong Han
[arXiv] [Project Page] [Models]
This repository contains official PyTorch implementation for 3D-OAE: Occlusion Auto-Encoders for Self-Supervised Learning on Point Clouds.
Manual annotation of large-scale point clouds remains tedious and is unavailable for many harsh real-world tasks. Self-supervised learning, which pre-trains deep neural networks on raw, unlabeled data, is a promising approach to this issue. Existing works usually rely on auto-encoders to establish self-supervision through a self-reconstruction scheme. However, previous auto-encoders merely focus on global shapes and do not distinguish local from global geometric features. To address this problem, we present a novel and efficient self-supervised point cloud representation learning framework, named 3D Occlusion Auto-Encoder (3D-OAE), to exploit the detailed supervision inherent in local regions and global shapes. We propose to randomly occlude some local patches of the point cloud and establish supervision by inpainting the occluded patches from the remaining ones. Specifically, we design an asymmetrical encoder-decoder architecture based on the standard Transformer, where the encoder operates only on the visible subset of patches to learn local patterns, and a lightweight decoder leverages these visible patterns to infer the missing geometries via self-attention. We find that occluding a very high proportion of the input point cloud (e.g., 75%) still yields a nontrivial self-supervisory task, which enables training 3-4 times faster while also improving accuracy. Experimental results show that our approach outperforms the state-of-the-art on a diverse range of downstream discriminative and generative tasks.
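For illustration, a minimal PyTorch sketch of the high-ratio random occlusion that makes the encoder cheap (the helper name, tensor shapes, and default ratio are assumptions for this sketch, not the repo's API):

```python
import torch

def split_visible_occluded(patch_tokens: torch.Tensor, occlusion_ratio: float = 0.75):
    """Randomly occlude a high ratio of patch tokens (hypothetical helper).

    patch_tokens: [B, G, C] embeddings of G point patches.
    Returns the visible tokens [B, G_vis, C] plus the index split, so a
    lightweight decoder can later re-insert learnable occlusion tokens
    at the occluded positions.
    """
    B, G, C = patch_tokens.shape
    num_visible = int(G * (1.0 - occlusion_ratio))
    # Independent random permutation of patch indices per sample.
    shuffle = torch.rand(B, G, device=patch_tokens.device).argsort(dim=1)
    visible_idx, occluded_idx = shuffle[:, :num_visible], shuffle[:, num_visible:]
    visible = torch.gather(patch_tokens, 1, visible_idx.unsqueeze(-1).expand(-1, -1, C))
    return visible, visible_idx, occluded_idx
```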
We first extract seed points from the input point cloud using FPS, and then separate the input into point patches by grouping local points around each seed point using KNN. After that, we randomly occlude a high ratio of the patches and subtract from each visible patch its corresponding seed point, detaching the patch from its spatial location. The encoder operates only on the embeddings of the visible patches, and learnable occlusion tokens are combined with the latent feature before the decoder. Finally, we add the corresponding seed points back to the output patches to regain their spatial locations and merge the local patches into a complete shape, on which we compute a loss against the ground truth. A sketch of this patch grouping and seed-point normalization is given below.
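A minimal sketch of the grouping step, using plain-PyTorch stand-ins for the CUDA FPS/kNN ops; the function names and patch sizes are illustrative only:

```python
import torch

def fps(xyz: torch.Tensor, num_seeds: int) -> torch.Tensor:
    """Farthest point sampling (plain PyTorch stand-in for the CUDA op).
    xyz: [B, N, 3]; returns seed indices [B, num_seeds]."""
    B, N, _ = xyz.shape
    idx = torch.zeros(B, num_seeds, dtype=torch.long, device=xyz.device)
    dist = torch.full((B, N), float("inf"), device=xyz.device)
    farthest = torch.randint(N, (B,), device=xyz.device)
    batch = torch.arange(B, device=xyz.device)
    for i in range(num_seeds):
        idx[:, i] = farthest
        centroid = xyz[batch, farthest].unsqueeze(1)               # [B, 1, 3]
        dist = torch.minimum(dist, ((xyz - centroid) ** 2).sum(-1))
        farthest = dist.argmax(-1)
    return idx

def group_patches(xyz: torch.Tensor, num_seeds: int = 64, k: int = 32):
    """Split a cloud into local patches detached from their spatial locations.
    Returns seed points [B, S, 3] and seed-centered patches [B, S, k, 3]."""
    seed_idx = fps(xyz, num_seeds)                                     # [B, S]
    seeds = torch.gather(xyz, 1, seed_idx.unsqueeze(-1).expand(-1, -1, 3))
    knn_idx = torch.cdist(seeds, xyz).topk(k, largest=False).indices  # [B, S, k]
    patches = torch.gather(
        xyz.unsqueeze(1).expand(-1, num_seeds, -1, -1), 2,
        knn_idx.unsqueeze(-1).expand(-1, -1, -1, 3))
    # Subtract the seed so each patch encodes only local geometry.
    return seeds, patches - seeds.unsqueeze(2)

# After decoding, the patches regain their spatial locations by adding the
# seeds back and merging into a complete shape:
# full_shape = (decoded_patches + seeds.unsqueeze(2)).reshape(B, -1, 3)
```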
| Model | Dataset | Task | Performance | Config | Url |
| --- | --- | --- | --- | --- | --- |
| 3D-OAE (SSL) | ShapeNet | Linear SVM | 92.3 (Acc.) | config | Google Drive |
| Transformer/PoinTr | PCN | Point Cloud Completion | 6.97 (CD) | config | Google Drive |
| Transformer | ModelNet | Classification | 93.4 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 89.16 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 88.64 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 83.17 (Acc.) | config | Google Drive |
| Transformer | ShapeNetPart | Part Segmentation | 85.7 (mIoU) | config | Google Drive |
- PyTorch >= 1.7.0
- python >= 3.7
- CUDA >= 9.0
- GCC >= 4.9
- torchvision
- timm
- open3d
- tensorboardX
pip install -r requirements.txt
NOTE: PyTorch >= 1.7 and GCC >= 4.9 are required.
# Chamfer Distance
bash install.sh
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
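After installation, a quick import check for the compiled extensions can help catch build problems early; the module paths below follow the packages installed above and are assumptions if your environment differs:

```python
# Sanity check for the CUDA extensions used for sampling and grouping.
import torch
from knn_cuda import KNN                               # GPU kNN
from pointnet2_ops import pointnet2_utils              # PointNet++ CUDA ops

xyz = torch.rand(2, 1024, 3).cuda()
seed_idx = pointnet2_utils.furthest_point_sample(xyz, 64)    # [2, 64] seed indices
_, knn_idx = KNN(k=16, transpose_mode=True)(xyz, xyz)        # [2, 1024, 16] neighbor indices
print(seed_idx.shape, knn_idx.shape)
```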
We use ShapeNet for the self-supervised pre-training of 3D-OAE models, and fine-tune the pre-trained models on ModelNet, ScanObjectNN, PCN, and ShapeNetPart.
The details of used datasets can be found in DATASET.md.
For self-supervised learning of 3D-OAE models on ShapeNet, simply run:
bash ./scripts/run_OAE.sh <NUM_GPU> \
--config cfgs/SSL_models/Point-OAE_2k.yaml \
--exp_name <name> \
--val_freq 1
val_freq controls how often the Transformer is evaluated on ModelNet40 with a Linear SVM.
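For reference, a minimal sketch of this Linear SVM protocol (the encoder, data loaders, and the C value are placeholders; the repo runs the evaluation internally every val_freq epochs):

```python
import numpy as np
import torch
from sklearn.svm import LinearSVC

def linear_svm_eval(encoder, train_loader, test_loader, device="cuda"):
    """Fit a linear SVM on frozen encoder features and report test accuracy."""
    def extract(loader):
        feats, labels = [], []
        for points, label in loader:
            with torch.no_grad():
                feats.append(encoder(points.to(device)).cpu().numpy())  # one global feature per cloud
            labels.append(label.numpy())
        return np.concatenate(feats), np.concatenate(labels)

    x_train, y_train = extract(train_loader)
    x_test, y_test = extract(test_loader)
    clf = LinearSVC(C=0.01).fit(x_train, y_train)   # C is a placeholder value
    return clf.score(x_test, y_test)
```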
We fine-tune our 3D-OAE on 6 downstream tasks: Linear SVM on ModelNet40, classification on ModelNet40, few-shot learning on ModelNet40, point cloud completion on the PCN dataset, transfer learning on ScanObjectNN, and part segmentation on ShapeNetPart.
To finetune a pre-trained 3D-OAE model on ModelNet40, simply run:
bash ./scripts/run_OAE.sh <GPU_IDS> \
--config cfgs/ModelNet_models/Transformer_1k.yaml \
--finetune_model \
--ckpts <path> \
--exp_name <name>
First, prepare the few-shot learning split and dataset (see DATASET.md). Then run:
bash ./scripts/run_OAE.sh <GPU_IDS> \
--config cfgs/Fewshot_models/Transformer_1k.yaml \
--finetune_model \
--ckpts <path> \
--exp_name <name> \
--way <int> \
--shot <int> \
--fold <int>
To finetune a pre-trained 3D-OAE model on ScanObjectNN, simply run:
bash ./scripts/run_OAE.sh <GPU_IDS> \
--config cfgs/ScanObjectNN_models/Transformer_hardest.yaml \
--finetune_model \
--ckpts <path> \
--exp_name <name>
To finetune a pre-trained 3D-OAE model on PCN, simply run:
bash ./scripts/run_OAE_pcn.sh <GPU_IDS> \
--config cfgs/PCN_models/Transformer_pcn.yaml \
--finetune_model \
--ckpts <path> \
--exp_name <name>
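Completion quality on PCN is reported in Chamfer Distance (see the results table above). Below is a dense, illustration-only PyTorch sketch of one common L1 Chamfer convention; the repository itself uses the CUDA kernel built by install.sh:

```python
import torch

def chamfer_l1(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Symmetric L1 Chamfer Distance between two point clouds.
    pred: [B, N, 3], gt: [B, M, 3]. O(N*M) memory; for illustration only."""
    dist = torch.cdist(pred, gt)                      # [B, N, M] pairwise distances
    pred_to_gt = dist.min(dim=2).values.mean(dim=1)   # nearest gt point per predicted point
    gt_to_pred = dist.min(dim=1).values.mean(dim=1)   # nearest predicted point per gt point
    return ((pred_to_gt + gt_to_pred) / 2).mean()
```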
To finetune a pre-trained 3D-OAE model on ShapeNetPart, simply run:
bash ./scripts/run_OAE_seg.sh <GPU_IDS> \
--config cfgs/ShapeNetPart_models/Transformer_seg.yaml \
--finetune_model \
--ckpts <path> \
--exp_name <name>
Point cloud self-reconstruction results using our 3D-OAE model trained on ShapeNet:
Point cloud completion results using our 3D-OAE model trained on PCN dataset:
MIT License
Some of the code in this repo is borrowed from Point-BERT. We thank the authors for their great work!
If you find our work useful in your research, please consider citing:
@article{zhou20223d,
title={3D-OAE: Occlusion Auto-Encoders for Self-Supervised Learning on Point Clouds},
author={Zhou, Junsheng and Wen, Xin and Ma, Baorui and Liu, Yu-Shen and Gao, Yue and Fang, Yi and Han, Zhizhong},
journal={IEEE International Conference on Robotics and Automation (ICRA)},
year={2024}
}