Hyperparameter Tuning
=====================

Training
--------

I mistakenly set the dropout to 0.2. I had read somewhere that dropout should be lower for the input channels, but instead I applied the lower dropout to the weight vectors.

Learning rate: 0.001
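
For context, the runs below vary the depth and width of a maxout network. A minimal sketch of a maxout unit, assuming PyTorch (the framework and tensor shapes are my assumptions, not taken from the commit):

.. code-block:: python

   import torch
   import torch.nn as nn

   class Maxout(nn.Module):
       """Maxout unit: element-wise max over ``pieces`` parallel linear maps."""

       def __init__(self, in_features, out_features, pieces=2):
           super().__init__()
           self.out_features = out_features
           self.pieces = pieces
           self.linear = nn.Linear(in_features, out_features * pieces)

       def forward(self, x):
           z = self.linear(x)                              # (N, out * pieces)
           z = z.view(-1, self.pieces, self.out_features)  # (N, pieces, out)
           return z.max(dim=1).values                      # max over the pieces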

======  ==========  ==============  ===============  ==============  ===============  ============  ======
Epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
======  ==========  ==============  ===============  ==============  ===============  ============  ======
5       64          3               512              2               10               93.31         1.6068
5       64          3               1024             2               10               93.87         1.5971
5       64          4               2048             2               10               94.31         1.5901
5       32          4               2048             2               10               94.75         1.5656
======  ==========  ==============  ===============  ==============  ===============  ============  ======
* After removing batch normalization from the second maxout layer and adding normalization to the input (a sketch of this variant follows the table):
======  ==========  ==============  ===============  ==============  ===============  ============  ======
Epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
======  ==========  ==============  ===============  ==============  ===============  ============  ======
5       64          3               1024             2               10               94.38         1.5355
5       64          3               2048             2               10               94.65         1.5307
5       64          4               1024             2               10               94.33         1.5242
5       64          4               2048             2               10               94.33         1.5911
======  ==========  ==============  ===============  ==============  ===============  ============  ======
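
A hedged sketch of the variant above, reusing the ``Maxout`` unit from earlier: normalization on the (flattened) input, batch norm kept only on the first maxout block. The 784-dimensional input, the widths, and the dropout placement are illustrative, not from the commit:

.. code-block:: python

   model = nn.Sequential(
       nn.BatchNorm1d(784),   # normalization added to the input (784 = 28x28, assumed)
       Maxout(784, 1024),
       nn.BatchNorm1d(1024),  # batch norm kept on the first maxout block
       nn.Dropout(0.2),       # the 0.2 rate mentioned at the top
       Maxout(1024, 1024),    # second maxout block: batch norm removed
       nn.Dropout(0.2),
       nn.Linear(1024, 10),   # 10 output classes
   )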

Validation
----------

===============  ==========  ==============  ===============  ==============  ===============  ============  ======
Training epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
===============  ==========  ==============  ===============  ==============  ===============  ============  ======
5                64          3               512              2               10               94.38         1.5789
5                64          3               1024             2               10               95.08         1.5719
5                64          4               2048             2               10               95.36         1.5666
5                32          4               2048             2               10               95.53         1.5554
===============  ==========  ==============  ===============  ==============  ===============  ============  ======
* After removing batch normalization from the second maxout layer and adding normalization to the input (the same variant as above):
===============  ==========  ==============  ===============  ==============  ===============  ============  ======
Training epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
===============  ==========  ==============  ===============  ==============  ===============  ============  ======
5                64          3               1024             2               10               95.11         1.5243
5                64          3               2048             2               10               95.28         1.5216
5                64          4               1024             2               10               95.28         1.5234
5                64          4               2048             2               10               95.10         1.5683
===============  ==========  ==============  ===============  ==============  ===============  ============  ======

Since the accuracies and losses come out nearly the same in both cases, I simplified the network by removing normalization from the input layer and adding batch normalization back to the two maxout layers, as before. I kept the dropout at 0.5. A sketch of this configuration is shown below.
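
Under the same assumptions as before, the simplified configuration looks roughly like this (the 784-dimensional input and the widths are still illustrative):

.. code-block:: python

   model = nn.Sequential(
       Maxout(784, 2048),     # no input normalization any more
       nn.BatchNorm1d(2048),  # batch norm back on both maxout blocks
       nn.Dropout(0.5),
       Maxout(2048, 2048),
       nn.BatchNorm1d(2048),
       nn.Dropout(0.5),
       nn.Linear(2048, 10),
   )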

Training
--------

======  ==========  ==============  ===============  ==============  ===============  ============  ======
Epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
======  ==========  ==============  ===============  ==============  ===============  ============  ======
5       64          4               2048             2               10               91.74         1.6334
5       64          4               1024             2               10               90.77         1.6480
======  ==========  ==============  ===============  ==============  ===============  ============  ======

Validation
----------

===============  ==========  ==============  ===============  ==============  ===============  ============  ======
Training epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
===============  ==========  ==============  ===============  ==============  ===============  ============  ======
5                64          4               2048             2               10               94.08         1.5852
5                64          4               1024             2               10               93.52         1.5917
===============  ==========  ==============  ===============  ==============  ===============  ============  ======

I believe that with a little tweaking we can reach the above accuracies without much variance between training and validation. The next step was to increase the learning rate to 0.005; the change amounts to a single optimizer setting, sketched below.
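
A minimal sketch of the change, assuming plain SGD (the commit may use a different optimizer):

.. code-block:: python

   import torch

   # Raise the learning rate from 0.001 to 0.005; everything else unchanged.
   optimizer = torch.optim.SGD(model.parameters(), lr=0.005)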

Training
--------

======  ==========  ==============  ===============  ==============  ===============  ============  ======
Epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
======  ==========  ==============  ===============  ==============  ===============  ============  ======
5       64          4               2048             2               10               97.79         1.5060
5       64          4               1024             2               10               97.44         1.5107
======  ==========  ==============  ===============  ==============  ===============  ============  ======

Validation
----------

===============  ==========  ==============  ===============  ==============  ===============  ============  ======
Training epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
===============  ==========  ==============  ===============  ==============  ===============  ============  ======
5                64          4               2048             2               10               96.94         1.5097
5                64          4               1024             2               10               96.83         1.5108
===============  ==========  ==============  ===============  ==============  ===============  ============  ======

The final hyperparameters are the ones in the commit. The network was then trained further on the whole training dataset, with the following accuracy and loss; a sketch of this final stage follows the table.

Training with pretrained weights
--------------------------------

======  ==========  ==============  ===============  ==============  ===============  ============  ======
Epochs  Batch size  Layer 1 layers  Layer 1 neurons  Layer 2 layers  Layer 2 neurons  Accuracy (%)  Loss
======  ==========  ==============  ===============  ==============  ===============  ============  ======
5       64          4               2048             2               10               99.02         1.4827
======  ==========  ==============  ===============  ==============  ===============  ============  ======
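
A hedged sketch of that final stage: reload the tuned weights and continue on the full training set. The checkpoint path, the dataset name, and the loss function are illustrative choices, not taken from the commit:

.. code-block:: python

   from torch.utils.data import DataLoader

   model.load_state_dict(torch.load("maxout_tuned.pt"))              # path is illustrative
   loader = DataLoader(full_train_set, batch_size=64, shuffle=True)  # dataset assumed
   criterion = nn.CrossEntropyLoss()                                 # loss choice assumed

   model.train()
   for epoch in range(5):                  # 5 further epochs, batch size 64
       for x, y in loader:
           optimizer.zero_grad()
           loss = criterion(model(x.view(x.size(0), -1)), y)
           loss.backward()
           optimizer.step()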