Steering angle prediction for self-driving cars is one of the Udacity challenges. The main purpose of this project is to build two models, 3DCNN+LSTM and TransferLearning, that analyse and extract information from real-world driving videos and predict steering angles based on the actual road situation, so that autonomous cars can drive by themselves.
The data source is available on the Udacity GitHub.
Ⅰ. The dataset includes steering angles, speed and torque, along with images from the left, center and right cameras.
Ⅱ. Image size: 640*320
Data format: Rosbag (extraction tool from Mr. Ross Wightman)
For the 3DCNN+LSTM and TransferLearning models, we load images in different sizes. In , the following files can be found:
ConsecutiveBatchSampler.py & UdacityDataset.py
Convolution3D.py & TransferLearning.py
* Loading Size: [Batch_size, sequence_length, channels, height, width]
* Feeding Size: images in Batch_size * sequence_length * channels * 320 * 120
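The sequence-batched loading above can be sketched as follows. This is a minimal, hypothetical stand-in for the repo's `ConsecutiveBatchSampler.py` / `UdacityDataset.py` (the class name `SequenceDataset` and the dummy data are illustrative, not the actual implementation); it only shows how consecutive frames are grouped into clips of shape `[Batch_size, sequence_length, channels, height, width]`.

```python
# Hypothetical sketch of sequence batching for the 3DCNN+LSTM input.
import torch
from torch.utils.data import Dataset, DataLoader

class SequenceDataset(Dataset):
    """Groups consecutive frames into clips of `seq_len` frames."""
    def __init__(self, frames, angles, seq_len=5):
        self.frames = frames          # tensor [N, C, H, W]
        self.angles = angles          # tensor [N]
        self.seq_len = seq_len

    def __len__(self):
        # Number of full-length clips that fit in the frame sequence
        return len(self.frames) - self.seq_len + 1

    def __getitem__(self, idx):
        clip = self.frames[idx:idx + self.seq_len]   # [seq_len, C, H, W]
        angle = self.angles[idx + self.seq_len - 1]  # label of the last frame
        return clip, angle

# Dummy data: 20 frames of 3 * 320 * 120 images (sizes taken from the text above)
frames = torch.randn(20, 3, 320, 120)
angles = torch.randn(20)
loader = DataLoader(SequenceDataset(frames, angles, seq_len=5), batch_size=4)
clips, labels = next(iter(loader))
print(clips.shape)   # [Batch_size, sequence_length, channels, height, width]
```

Each clip is labelled with the steering angle of its last frame, so the model learns to predict the current angle from the recent driving history.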
Ⅰ. Inside the 3D convolution layers, residual connections are added to tackle the vanishing-gradient problem; when the features pass through the LSTM, its memory mechanism withdraws information from former images and outputs the integrated information to the following linear layers.
Ⅱ. With the time sequence, 3DCNN+LSTM takes video-type input data, and the LSTM can memorize driving history based on the former frames extracted by the 3DCNN layers.
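The architecture described in Ⅰ and Ⅱ can be sketched as below. This is a minimal toy version, not the repo's actual `Convolution3D.py` (layer sizes, the pooling shape, and the hidden width are assumptions); it only demonstrates the three ideas named above: 3D convolutions over the clip, a residual connection, and an LSTM feeding linear layers.

```python
# Minimal sketch of a 3D-CNN encoder with a residual connection, followed by
# an LSTM and a linear head that regresses the steering angle.
import torch
import torch.nn as nn

class CNN3DLSTM(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.conv1 = nn.Conv3d(3, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv3d(16, 16, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool3d((None, 4, 4))  # keep the time dim
        self.lstm = nn.LSTM(16 * 4 * 4, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)                # one steering angle

    def forward(self, x):                  # x: [B, S, C, H, W]
        x = x.permute(0, 2, 1, 3, 4)       # Conv3d expects [B, C, S, H, W]
        h = torch.relu(self.conv1(x))
        h = h + torch.relu(self.conv2(h))  # residual connection vs. vanishing gradients
        h = self.pool(h)                   # [B, 16, S, 4, 4]
        B, C, S, _, _ = h.shape
        h = h.permute(0, 2, 1, 3, 4).reshape(B, S, -1)  # [B, S, features]
        out, _ = self.lstm(h)              # LSTM integrates the former frames
        return self.head(out[:, -1])       # predict from the last time step

model = CNN3DLSTM()
pred = model(torch.randn(2, 5, 3, 64, 64))  # 2 clips of 5 frames each
print(pred.shape)                           # one angle per clip
```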
Reference: , Shuyang Du, Haoli Guo, Andrew Simpson, arXiv:1912.05440v1 [cs.CV], 11 Dec 2019, page 4, "Figure 3. Architecture used for transfer learning model"
* Loading Size: Batch_size * channels * height * width
* Feeding Size: images in Batch_size * channels * 224 * 224
Ⅰ. Instead of considering a time sequence, TransferLearning takes single frames as the input dataset and uses a pretrained ResNet50, whose former CNN layers extract more accurate information.
Below are the loss values of the two models at different training stages.
To visualize the outputs of our models, open
Visualization.ipynb
- Below is the attention map;
- Below are the kernel images:
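Kernel images like these come from the learned convolution weights. The snippet below is a minimal sketch of the idea (the actual plots live in `Visualization.ipynb`; the stand-in `conv1` layer here is illustrative, not the model's real first layer): grab the first layer's weight tensor and normalize each kernel into [0, 1] so it can be shown as an image.

```python
# Sketch: extract first-layer kernels and normalize them for display.
import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 16, kernel_size=5)   # stand-in for the model's first conv
kernels = conv1.weight.detach().clone()   # [out_channels, in_channels, kH, kW]

# Normalize each kernel independently to [0, 1]
k_min = kernels.amin(dim=(1, 2, 3), keepdim=True)
k_max = kernels.amax(dim=(1, 2, 3), keepdim=True)
kernels = (kernels - k_min) / (k_max - k_min + 1e-8)

print(kernels.shape)   # 16 RGB kernels of 5x5, ready for imshow
# e.g. plt.imshow(kernels[0].permute(1, 2, 0)) shows the first kernel
```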
Using this 3DCNN+LSTM model, you should get the following result: