Learning Ego 3D Representation as Ray Tracing

Website | Paper

Learning Ego 3D Representation as Ray Tracing,
Jiachen Lu, Zheyuan Zhou, Xiatian Zhu, Hang Xu, Li Zhang
ECCV 2022

Demo

Video

News

[2022/07/19]: Configs and instructions for training are released!
[2022/07/05]: First version of Ego3RT is released! Code for the detection head and training configs is coming soon.
[2022/07/04]: Ego3RT is accepted by ECCV 2022!

Abstract

A self-driving perception model aims to extract 3D semantic representations from multiple cameras collectively into the bird's-eye-view (BEV) coordinate frame of the ego car in order to ground the downstream planner. Existing perception methods often rely on error-prone depth estimation of the whole scene or on learning sparse virtual 3D representations without the target geometry structure, both of which remain limited in performance and/or capability. In this paper, we present a novel end-to-end architecture for ego 3D representation learning from an arbitrary number of unconstrained camera views. Inspired by the ray tracing principle, we design a polarized grid of "imaginary eyes" as the learnable ego 3D representation and formulate the learning process with the adaptive attention mechanism in conjunction with the 3D-to-2D projection. Critically, this formulation allows extracting rich 3D representations from 2D images without any depth supervision, and with a built-in geometry structure consistent w.r.t. BEV. Despite its simplicity and versatility, extensive experiments on standard BEV visual tasks (e.g., camera-based 3D object detection and BEV segmentation) show that our model outperforms all state-of-the-art alternatives significantly, with an extra advantage in computational efficiency from multi-task learning.
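The two geometric ingredients named in the abstract, a polar grid of points around the ego car and a 3D-to-2D projection into a camera, can be illustrated with a small NumPy sketch. This is not the code from this repository and it omits the learned attention entirely. The function names, the 50 m range, the ground-plane height, and the camera intrinsics and extrinsics are all assumptions chosen for illustration; only the 80x256 polar size is taken from the result tables, and treating it as radius by angle is also an assumption.

```python
import numpy as np


def polar_grid(num_radius=80, num_angle=256, max_range=50.0, height=0.0):
    """Return ego-frame points of shape (num_radius, num_angle, 3).

    Hypothetical helper: cell centers of a polar BEV grid at a fixed height.
    Ego frame assumed here: x forward, y left, z up.
    """
    radii = (np.arange(num_radius) + 0.5) * (max_range / num_radius)
    angles = (np.arange(num_angle) + 0.5) * (2.0 * np.pi / num_angle)
    r, a = np.meshgrid(radii, angles, indexing="ij")
    return np.stack([r * np.cos(a), r * np.sin(a), np.full_like(r, height)], axis=-1)


def project_to_image(points_ego, K, T_cam_from_ego, image_size):
    """Pinhole projection of ego-frame points into one camera.

    Returns pixel coordinates (..., 2) and a boolean mask (...) that is True
    where the point lies in front of the camera and inside the image.
    """
    lead_shape = points_ego.shape[:-1]
    pts = points_ego.reshape(-1, 3)
    homo = np.concatenate([pts, np.ones((pts.shape[0], 1))], axis=1)
    cam = (T_cam_from_ego @ homo.T).T[:, :3]      # points in the camera frame
    depth = cam[:, 2]
    in_front = depth > 1e-6
    safe_depth = np.where(in_front, depth, 1.0)   # avoid division by zero
    uv = (K @ cam.T).T[:, :2] / safe_depth[:, None]
    width, height = image_size
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < width) & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    return uv.reshape(lead_shape + (2,)), (in_front & inside).reshape(lead_shape)


# Assumed front camera: 1.5 m above the ego origin, looking along ego +x.
# Camera frame: x right, y down, z forward.
R = np.array([[0.0, -1.0, 0.0],
              [0.0, 0.0, -1.0],
              [1.0, 0.0, 0.0]])
T_cam_from_ego = np.eye(4)
T_cam_from_ego[:3, :3] = R
T_cam_from_ego[:3, 3] = -R @ np.array([0.0, 0.0, 1.5])

# Assumed intrinsics for a 1600x900 image.
K = np.array([[1000.0, 0.0, 800.0],
              [0.0, 1000.0, 450.0],
              [0.0, 0.0, 1.0]])

grid = polar_grid()
uv, valid = project_to_image(grid, K, T_cam_from_ego, image_size=(1600, 900))
print(grid.shape, uv.shape, int(valid.sum()))
```

In a sketch like this, each grid point with a true mask entry has a pixel location where image features could be sampled. Points behind the camera or outside the image are masked out, which is why several cameras are needed to cover the full ring.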

Methods

Train & Test

Please refer to get_started.md.

Result

3D object detection on nuScenes validation set

Model	Polar size	mAP	NDS	checkpoint
Ego3RT, ResNet101_DCN	80x256	37.5	45.0
Ego3RT, ResNet101_DCN	72x192	37.5	44.9	ego3rt_polar72x192_cart128x128.pth
Ego3RT, VoVNet	80x256	47.8	53.4

3D object detection on nuScenes test set

Model	Polar size	mAP	NDS
Ego3RT, ResNet101_DCN	80x256	38.9	44.3
Ego3RT, VoVNet	80x256	42.5	47.3

BEV segmentation on nuScenes validation set

Model	Polar size	Multitask	mIoU
Ego3RT, EfficientNet	80x256	no	55.5
Ego3RT, ResNet101_DCN	80x256	yes	46.2

License

MIT

Reference

@inproceedings{lu2022ego3rt,
  title={Learning Ego 3D Representation as Ray Tracing},
  author={Lu, Jiachen and Zhou, Zheyuan and Zhu, Xiatian and Xu, Hang and Zhang, Li},
  booktitle={European Conference on Computer Vision},
  year={2022}
}

Acknowledgement

Thanks to these previous open-source repositories:

MMDetection3D
DETR3D
Deformable DETR
