Streaming Object Detection and Segmentation with Polar Pillars
PolarStream,
Qi Chen, Sourabh Vora, Oscar Beijbom,
NeurIPS 2021 Poster (arXiv 2006.11275)
@article{chen2021polarstream,
title={PolarStream: Streaming Object Detection and Segmentation with Polar Pillars},
author={Chen, Qi and Vora, Sourabh and Beijbom, Oscar},
journal={Advances in Neural Information Processing Systems},
volume={34},
year={2021}
}
Any questions or suggestions are welcome!
Qi Chen [email protected]
Recent works recognized lidars as an inherently streaming data source and showed that the end-to-end latency of lidar perception models can be reduced significantly by operating on wedge-shaped point cloud sectors rather than the full point cloud. However, because they use Cartesian coordinate systems, these methods represent the sectors as rectangular regions, wasting memory and compute. In this work we propose using a polar coordinate system and make two key improvements on this design. First, we increase the spatial context by using multi-scale padding from neighboring sectors: the preceding sector from the current scan and/or the following sector from the past scan. Second, we improve the core polar convolutional architecture by introducing feature undistortion and range stratified convolutions. Experimental results on the nuScenes dataset show significant improvements over other streaming-based methods. We also achieve comparable results to existing non-streaming methods but with lower latencies.
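
To make the "wasting memory and compute" argument concrete, the sketch below estimates how much of an axis-aligned bounding rectangle a wedge actually covers. It is a purely geometric illustration: the function name `wedge_bbox_fill` and the choice of a tight axis-aligned box are assumptions made here, not the way STROBE or Han et al. allocate their rectangular regions.

```python
import math

def wedge_bbox_fill(start, width, radius=1.0):
    """Fraction of a wedge's axis-aligned bounding box covered by the wedge.

    start and width are angles in radians; the wedge apex is at the origin.
    Assumes 0 < width < 2*pi.
    """
    end = start + width
    # Extreme points of the wedge: the apex, both arc endpoints, and every
    # arc point at a multiple of pi/2 that lies inside [start, end].
    angles = [start, end]
    k = math.ceil(start / (math.pi / 2))
    while k * math.pi / 2 <= end:
        angles.append(k * math.pi / 2)
        k += 1
    xs = [0.0] + [radius * math.cos(a) for a in angles]
    ys = [0.0] + [radius * math.sin(a) for a in angles]
    bbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    wedge_area = 0.5 * radius ** 2 * width
    return wedge_area / bbox_area

# A 90-degree wedge aligned with the axes vs. one centred on the x-axis.
print(wedge_bbox_fill(0.0, math.pi / 2))
print(wedge_bbox_fill(-math.pi / 4, math.pi / 2))
```

Under these assumptions a quarter-circle wedge fills about 79% of its bounding box when aligned with the axes and about 56% when rotated by 45 degrees. In a polar grid the same wedge is exactly a rectangle in (range, azimuth), so no cells fall outside it.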
Model | mAP ↑ | NDS ↑ | PKL ↓ | FPS ↑ |
---|---|---|---|---|
PolarStream-Full Sweep | 52.9 | 61.2 | | 26.3 |
PolarStream-4 CPx1 | 53.5 | 61.8 | 89.3 | 47.2 |
PolarStream-4 CPx2 | 52.9 | 61.2 | | 47.2 |
Model | mIoU | freq_weighted mIoU | FPS |
---|---|---|---|
PolarStream-Full Sweep | 73.4 | 87.4 | 33.9 |
PolarStream-4 CPx1 | 73 | 87.5 | 59.2 |
PolarStream-4 CPx2 | 73.1 | 87.5 | 59.2 |
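
For readers unfamiliar with the two segmentation metrics, here is a minimal sketch of how mIoU and a frequency-weighted IoU are commonly computed from a confusion matrix. The function name `iou_metrics` and the toy matrix are hypothetical, and it is an assumption that the `freq_weighted mIoU` column follows this standard definition (per-class IoU weighted by the ground-truth frequency of each class).

```python
def iou_metrics(confusion):
    """Return (mIoU, frequency-weighted IoU) from a square confusion matrix.

    confusion[i][j] = number of points with ground-truth class i that were
    predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    ious, weights = [], []
    for c in range(n):
        tp = confusion[c][c]
        gt = sum(confusion[c])                         # ground-truth points of class c
        pred = sum(confusion[r][c] for r in range(n))  # points predicted as class c
        union = gt + pred - tp
        if union == 0:
            continue  # class absent from both ground truth and predictions
        ious.append(tp / union)
        weights.append(gt / total)
    miou = sum(ious) / len(ious)
    fw_iou = sum(w * i for w, i in zip(weights, ious))
    return miou, fw_iou

# Toy two-class example: one frequent class, one rare class.
miou, fw_iou = iou_metrics([[90, 10], [5, 5]])
print(round(miou, 4), round(fw_iou, 4))
```

Because frequent classes dominate the weighted sum, the frequency-weighted score is typically higher than plain mIoU when rare classes are the hard ones, which matches the pattern in the table above.
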
Model | PQ | SQ | RQ | FPS |
---|---|---|---|---|
PolarStream-Full Sweep | 71 | 86 | 82 | 22.3 |
Panoptic Segmentation on nuScenes Validation Set (following Panoptic-PolarNet's label generation and evaluation)
Model | PQ | SQ | RQ | FPS |
---|---|---|---|---|
PolarStream-Full Sweep | 68.7 | 85.3 | 79.9 | 22.3 |
PolarStream-4 CPx1 | 69 | 85.2 | 80.4 | 44.3 |
PolarStream-4 CPx2 | 69.6 | 85.5 | 80.8 | 44.3 |
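
As a reminder of what PQ, SQ and RQ measure, the sketch below computes them for a single class using the standard panoptic quality definition (matched segments count as true positives when their IoU exceeds 0.5). The function name `panoptic_quality` and the sample numbers are hypothetical; this is not the repository's evaluation code.

```python
def panoptic_quality(tp_ious, num_fp, num_fn):
    """Return (PQ, SQ, RQ) for one class.

    tp_ious: IoU of each matched (true positive) segment pair.
    num_fp:  number of unmatched predicted segments.
    num_fn:  number of unmatched ground-truth segments.
    """
    tp = len(tp_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return 0.0, 0.0, 0.0
    sq = sum(tp_ious) / tp if tp else 0.0  # segmentation quality: mean IoU of matches
    rq = tp / denom                        # recognition quality: an F1-style score
    return sq * rq, sq, rq

pq, sq, rq = panoptic_quality([0.8, 0.9, 0.7], num_fp=1, num_fn=1)
print(round(pq, 3), round(sq, 3), round(rq, 3))
```

For a single class PQ equals SQ times RQ. Reported scores are usually averaged over classes, which is presumably why the PQ values in the tables are not exactly the product of the SQ and RQ columns.
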
All results are tested on a V100 with batch size 1.
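
Since the motivation is latency, it can help to read the FPS columns as time per inference. The helper below is a trivial conversion; `fps_to_latency_ms` is a hypothetical name, and whether the FPS of the 4-sector models is counted per sector or per full sweep is not stated here, so treat the result as time per model invocation only.

```python
def fps_to_latency_ms(fps):
    """Convert a throughput in frames per second to milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps

# Detection FPS values taken from the tables in this README.
for name, fps in [("full sweep", 26.3), ("4 sectors", 47.2)]:
    print(f"{name}: {fps_to_latency_ms(fps):.1f} ms")
```
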
- Polar Representation
- Artificially simulating streaming lidar
- Multi-scale context padding
- Simultaneous object detection and semantic segmentation
- Single detection head with comparable accuracy to multi-group heads
- Panoptic labels and predictions generation (compatible with nuScenes official panoptic eval)
- Reimplementation of STROBE
- Reimplementation of Han et al.
- Dynamic voxelization
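
The idea behind artificially simulating streaming lidar can be sketched as follows: a full sweep is split into equal-angle sectors by azimuth, and the sectors are then fed to the model one at a time. The function name `split_into_sectors`, the sector numbering (sector 0 starts at azimuth 0 and indices increase counter-clockwise) and the plain-tuple point format are all assumptions for illustration; the repository's own data pipeline may differ.

```python
import math
from collections import defaultdict

def split_into_sectors(points, num_sectors):
    """Group (x, y, z) points into num_sectors equal-angle wedges by azimuth."""
    sector_width = 2 * math.pi / num_sectors
    sectors = defaultdict(list)
    for x, y, z in points:
        azimuth = math.atan2(y, x) % (2 * math.pi)  # map to [0, 2*pi)
        # Guard against floating-point rounding pushing the index to num_sectors.
        idx = min(int(azimuth / sector_width), num_sectors - 1)
        sectors[idx].append((x, y, z))
    return [sectors[i] for i in range(num_sectors)]

sweep = [(1.0, 0.1, 0.0), (-0.1, 1.0, 0.0), (-1.0, -0.1, 0.0), (0.1, -1.0, 0.0)]
for i, sector in enumerate(split_into_sectors(sweep, num_sectors=4)):
    print(i, sector)
```

With sectors indexed this way, context padding for sector `i` would draw on its angular neighbors: the preceding sector from the current scan and/or the following sector from the past scan.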
Please refer to INSTALL to set up libraries needed for distributed training.
- The experiments are run with PyTorch 1.9, CUDA 11.2, and cuDNN 7.5.
- The training is conducted on 8 V100 GPUs.
- Testing times are measured on a V100 GPU with batch size 1.
We provide training and validation configurations and pretrained models for the experiments in the paper.
Please refer to GETTING_START to prepare the data. Then follow the instructions there to reproduce our detection, semantic segmentation, and panoptic segmentation results. Configurations are included in configs.
Model | det FPS | seg FPS | panoptic FPS | Test mAP | Test NDS | Test mIoU | Test freq_weighted mIoU | Validation PQ | Validation SQ | Validation RQ | Link |
---|---|---|---|---|---|---|---|---|---|---|---|
polarstream_det_n_seg_1_sector | 26.3 | 33.9 | 22.3 | 52.9 | 61.2 | 73.4 | 87.4 | 68.7 | 85.3 | 79.9 | URL |
polarstream_det_n_seg_4_sector_bidirectional | 47.2 | 59.2 | 44.3 | 52.9 | 61.2 | 73.1 | 87.5 | 69.6 | 85.5 | 80.8 | URL |
Model | val det mAP | Link |
---|---|---|
voxelnet_det_cylinder_singlehead | 57.7 | URL |
Model | val seg mIoU | Link |
---|---|---|
voxelnet_seg_cylinder | 77.7 | URL |
Reimplementation of STROBE and Han et al.
Model | Link |
---|---|
han_1_sector | URL |
han_4_sector | URL |
strobe_1_sector | URL |
strobe_4_sector | URL |
PolarStream is released under the MIT license (see LICENSE). It is developed based on a forked version of CenterPoint. We also incorporate code from PolarNet. See the NOTICE for details. Note that both the nuScenes and Waymo datasets are under non-commercial licenses.
This project would not be possible without multiple great open-source codebases. We list some notable examples below.