Bird’s-eye View Prediction: Contrastive Predictive Coding pre-train Pix2Vox with YOLOv3

Code for the DS-GA-1008 Self-Supervised Learning (SSL) final project by Bofei Zhang, Can Cui, and Yuanxi Sun

Report: Bird’s-eye View Prediction: Contrastive Predictive Coding pre-train Pix2Vox with YOLOv3

Rank (out of 50 teams):

Detection: 9
Segmentation: 2
Overall: 4

Instruction

Step 1. CPC pretrain Pix2Vox

In this step, we use unlabelled data and Contrastive Predictive Coding (CPC) to pretrain a ResNet-50 encoder. The file ./self_supervised/config.py contains the model configuration. Adjust it as needed, then run:

python ./self_supervised/run.py
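
For readers unfamiliar with CPC, its training objective is the InfoNCE loss: each predicted embedding should score higher against its own target than against the other targets in the batch. The following is a minimal pure-Python sketch of that loss, not the code in ./self_supervised. The function name `info_nce_loss` and the dot-product scoring are assumptions made for illustration.

```python
import math


def info_nce_loss(predictions, targets):
    """Mean InfoNCE loss over a batch of embedding vectors.

    targets[i] is treated as the positive for predictions[i];
    every other row of targets acts as a negative.
    Hypothetical helper for illustration only.
    """
    n = len(predictions)
    total = 0.0
    for i in range(n):
        # Dot-product similarity between prediction i and every target.
        scores = [
            sum(p * t for p, t in zip(predictions[i], targets[j]))
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidate targets.
        top = max(scores)
        log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
        # Negative log-probability of picking the true target.
        total += log_denominator - scores[i]
    return total / n


aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is lower when each prediction matches its own target, which is the signal that drives the encoder during pretraining.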

Important details about config.py:

resnet_model, the encoder model of the feature extractor; default is resnet50, options are {resnet18, resnet50, resnet101, resnet152}
encoder_model, the encoder model of the seq2seq; default is LSTM, options are {LSTM, GRU, RNN}
embed_size, the embedding size of feature extractor
rnn_hidden_size, the hidden size of seq2seq
output_size, the output size of the seq2seq
rnn_n_layers, the number of layers of seq2seq (both encoder and decoder)
rnn_seq_len, the sequence length of seq2seq
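
As a rough illustration of how the options above fit together, here is a sketch of a validated configuration object. The field names and the two defaults (resnet50, LSTM) come from the list above. The class name `CPCConfig` and all numeric values are placeholders assumed for this example, and the real config.py may be organized differently.

```python
from dataclasses import dataclass

# Allowed values as listed in the option descriptions.
RESNET_CHOICES = ("resnet18", "resnet50", "resnet101", "resnet152")
RNN_CHOICES = ("LSTM", "GRU", "RNN")


@dataclass
class CPCConfig:
    """Hypothetical container mirroring the options in config.py."""

    resnet_model: str = "resnet50"  # documented default
    encoder_model: str = "LSTM"     # documented default
    embed_size: int = 256           # placeholder value
    rnn_hidden_size: int = 256      # placeholder value
    output_size: int = 256          # placeholder value
    rnn_n_layers: int = 1           # placeholder value
    rnn_seq_len: int = 4            # placeholder value

    def __post_init__(self):
        # Reject values outside the documented option sets early.
        if self.resnet_model not in RESNET_CHOICES:
            raise ValueError(f"unknown resnet_model: {self.resnet_model}")
        if self.encoder_model not in RNN_CHOICES:
            raise ValueError(f"unknown encoder_model: {self.encoder_model}")


config = CPCConfig(encoder_model="GRU")
print(config)
```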

Step 2. Fine Tune or train from scratch

The folder ./car_detection contains all the code needed to train a Pix2Vox model for the detection and segmentation tasks. To run it in your own environment, first configure the dataloader in ./car_detection/data.py. Then run main.py with the following command-line arguments:

-mc, --model-conifg, sets the model configuration. The best model uses the configuration pix2vox
-bs, --batch-size, batch size for training
-dm, yes or no. If yes, only one scene is added to the train/validation set, which is useful for debugging
-det, -seg, yes or no. Train the detection model or the segmentation model. Note: you can train both in a multi-task manner, but in practice this does not converge.
-ssl, yes or no. Whether to use the self-supervised pre-trained weights. You must configure the path to the pre-trained weights in ./car_detection/load_ssl.py before setting this to yes
-pt, yes or no. Whether to use ImageNet pre-trained weights for the ResNet encoder
-a, -g, float, weight parameters for the Focal Loss and weighted cross entropy
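
To make the yes/no convention concrete, the flags above could be parsed along the following lines. This is a sketch using the standard argparse module, not the parser in main.py. Only the short flags are taken from the list above, while the `dest` names, defaults, and the `yes_no` helper are assumptions for illustration.

```python
import argparse


def yes_no(value):
    """Convert the strings 'yes' and 'no' to a boolean."""
    lowered = value.lower()
    if lowered not in ("yes", "no"):
        raise argparse.ArgumentTypeError("expected 'yes' or 'no'")
    return lowered == "yes"


def build_parser():
    # dest names and defaults below are hypothetical.
    parser = argparse.ArgumentParser(description="Train Pix2Vox (sketch)")
    parser.add_argument("-mc", dest="model_config", default="pix2vox")
    parser.add_argument("-bs", dest="batch_size", type=int, default=2)
    parser.add_argument("-dm", dest="demo", type=yes_no, default=False)
    parser.add_argument("-det", dest="det", type=yes_no, default=False)
    parser.add_argument("-seg", dest="seg", type=yes_no, default=False)
    parser.add_argument("-ssl", dest="ssl", type=yes_no, default=False)
    parser.add_argument("-pt", dest="pretrained", type=yes_no, default=False)
    parser.add_argument("-a", dest="a_weight", type=float, default=None)
    parser.add_argument("-g", dest="g_weight", type=float, default=None)
    return parser


# Parse the same arguments used to train a segmentation model from scratch.
args = build_parser().parse_args(
    "-mc pix2vox -bs 2 -pt no -det no -seg yes -ssl no".split()
)
print(args.seg, args.det, args.batch_size)
```

Parsing yes/no into booleans up front keeps the training code free of string comparisons.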

To start training, run:

# segmentation model from scratch
python main.py -mc pix2vox -bs 2 -pt no -det no -seg yes -ssl no

# detection model from scratch
python main.py -mc pix2vox -bs 2 -pt no -det yes -seg no -ssl no

After training, use detect_pix.py for testing and figure generation.

References

We referred to https://github.com/eriklindernoren/PyTorch-YOLOv3 for the YOLOv3 implementation.

bofei5675/DS-GA-1008-SSL-Project
