You can train the model as follows:
```bash
tools/dist_train.sh projects/configs/StreamPETR/stream_petr_r50_flash_704_bs2_seq_24e.py 8 --work-dir work_dirs/stream_petr_r50_flash_704_bs2_seq_24e/
```
Notes:
- We provide training configs for both sliding window and streaming video training. The results reported in our paper are trained with the sliding window, but sliding window training consumes huge GPU memory and training time, so we additionally provide a streaming video training manner (following SOLOFusion).
You can evaluate the detection model as follows:
```bash
tools/dist_test.sh projects/configs/StreamPETR/stream_petr_vov_flash_800_bs2_seq_24e.py work_dirs/stream_petr_vov_flash_800_bs2_seq_24e/latest.pth 8 --eval bbox
```
You can evaluate the tracking model as follows:
```bash
python nusc_tracking/pub_test.py --version v1.0-trainval --checkpoint {PATH_RESULTS.JSON} --data_root {PATH_NUSCENES}
```
The latency includes data processing, network forward (FP32), and post-processing. Note that `workers_per_gpu` may affect the measured speed because data processing time is included.
```bash
python tools/benchmark.py projects/configs/test_speed/stream_petr_r50_704_bs2_seq_428q_nui_speed_test.py
```
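For reference, `workers_per_gpu` lives in the standard mmdet3d dataloader config; the values below only illustrate where the knob sits and are not the exact settings of the speed-test config.

```python
# Dataloader settings in the config file (illustrative values only).
data = dict(
    samples_per_gpu=2,   # batch size per GPU (illustrative)
    workers_per_gpu=4,   # number of dataloader workers; changing this shifts
                         # the measured latency since data processing is timed
)
```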
You can generate the result json as follows:
```bash
./tools/dist_test.sh projects/configs/StreamPETR/stream_petr_vov_flash_800_bs2_seq_24e.py work_dirs/stream_petr_vov_flash_800_bs2_seq_24e/latest.pth 8 --format-only
```
You can visualize the 3D object detection as follows:
```bash
python3 tools/visualize.py
# please change the results_nusc.json path in the python file
```
Here we provide some training tricks that may further boost the performance of our model. In the future, we will try them and provide an improved baseline of our model.
- The training of streaming video converges relatively slowly (sliding window with 60 epochs == streaming video with 90 epochs), but it still reduces training time by roughly 4x. The results in our paper were trained early on with the sliding window (8-frame window size).
- To achieve SOTA results, we modified the loss weights and the Hungarian matching weights for bounding box regression. Specifically, we change the x, y weight from 1.0 to 2.0 (a config sketch is given after this list). We find this works well on sparse query based designs.
- The learning rate of the backbone has a significant impact for small models. For most 2D-pretrained backbones (e.g. R50-nuImages or V2-99-FCOS) and large backbones (e.g. ViT-Base), we suggest setting the backbone learning rate multiplier to 0.1. For small IN1K-pretrained models (e.g. R50-IN1K), 0.25 or 0.5 is better (a config sketch is given after this list).
- For small IN1K models, the results are not stable. Sync-BN can obtain stable results, at the cost of slightly longer training time. You can enable it by additionally setting `SyncBN=True` and changing the norm config to `norm_cfg=dict(type='BN2d', requires_grad=True), norm_eval=False`.
- When training longer (e.g. 60 epochs), 300+128 queries achieve results similar to 644+256 queries, which is friendlier to deployment.
- The dropout ratio for the Transformer may be sub-optimal.
- EMA may boost the performance.
- The `feedforward_channels` of the Transformer can be set smaller (e.g. 512); this improves the inference speed and has little impact on accuracy.
- Single frame detector pre-training.
- If your device does not support Flash attention, change the attention config to `dict(type='PETRMultiheadAttention', embed_dims=256, num_heads=8, dropout=0.1, fp16=True)`.
- Adjusting the learning rate:
  | Num_gpus * Batch_size | Learning Rate |
  |---|---|
  | 8 | 2e-4 |
  | 16 | 4e-4 |
  | 32 | 6e-4 |
  | 64 | TBA |
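As a concrete illustration of the x, y weight change mentioned above, here is a minimal sketch assuming a DETR3D/PETR-style head with a per-dimension `code_weights` list ordered `[x, y, z, w, l, h, ...]`; the field names and surrounding values are illustrative, not copied from a released config.

```python
# Illustrative sketch (not a released config): weight the x, y terms of the
# box regression twice as much as the other dimensions.
# Assumed order: [x, y, z, w, l, h, sin(yaw), cos(yaw), vx, vy]; velocity terms
# often use a smaller weight (e.g. 0.2) in DETR3D-style configs.
code_weights = [2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2]

model = dict(
    pts_bbox_head=dict(
        # ... existing head settings ...
        code_weights=code_weights,                        # per-dimension regression weights
        loss_bbox=dict(type='L1Loss', loss_weight=0.25),  # overall L1 loss weight
    ),
)
# If the assigner supports per-dimension weights, mirror the same change in the
# Hungarian matching cost so matching and loss stay consistent.
```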
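For the backbone learning-rate tip, a minimal sketch using MMCV's `paramwise_cfg` / `custom_keys` mechanism is shown below. It assumes the backbone parameters are registered under the name `img_backbone`; adjust the key to match your model.

```python
# Scale the backbone learning rate relative to the base lr via paramwise_cfg.
# Suggested multipliers: 0.1 for 2D-pretrained / large backbones,
# 0.25 or 0.5 for small ImageNet-1K pretrained backbones such as R50-IN1K.
optimizer = dict(
    type='AdamW',
    lr=2e-4,            # base learning rate (see the table above)
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys={
            'img_backbone': dict(lr_mult=0.1),  # e.g. 0.25 or 0.5 for R50-IN1K
        }),
)
```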
| Model | Setting | Pretrain | Lr Schd | NDS | mAP | Config | Download |
|---|---|---|---|---|---|---|---|
| PETR | R50 - 900q | ImageNet | 24ep | 34.9 | 30.9 | config | log |
| FocalPETR | R50 - 900q | ImageNet | 24ep | 36.6 | 33.1 | config | log |
| StreamPETR | R50 - 900q | ImageNet | 24ep | 47.6 | 37.5 | config | log |