I mistakenly set the dropout to 0.2. I had read somewhere that dropout should be lower for the input channels, but instead I applied the lower dropout to the weight vectors.
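For reference, a minimal NumPy sketch of inverted dropout, with a lighter rate (e.g. 0.2) for inputs and a heavier rate (e.g. 0.5) for hidden activations; the shapes and rates here are illustrative, not the network's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, rate, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale
    the survivors by 1 / (1 - rate) so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

x = np.ones((4, 8))
h_in = dropout(x, rate=0.2)    # lighter dropout, typical for inputs
h_hid = dropout(x, rate=0.5)   # heavier dropout, typical for hidden layers
```

At inference time (`training=False`) the input passes through unchanged, which is why the survivors are rescaled during training.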
Learning rate: 0.001
| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 3             | 512            | 2             | 10             | 93.31        | 1.6068 |
| 5      | 64         | 3             | 1024           | 2             | 10             | 93.87        | 1.5971 |
| 5      | 64         | 4             | 2048           | 2             | 10             | 94.31        | 1.5901 |
| 5      | 32         | 4             | 2048           | 2             | 10             | 94.75        | 1.5656 |
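A minimal NumPy sketch of a maxout forward pass, where I read "number of layers" in the tables as the number of affine pieces per maxout unit (an assumption on my part; weight shapes and initialization are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def maxout(x, W, b):
    """Maxout activation: compute one affine map per piece, then take
    the elementwise max over the pieces.
    x: (batch, in_dim), W: (pieces, in_dim, out_dim), b: (pieces, out_dim)."""
    z = np.einsum('ni,pio->pno', x, W) + b[:, None, :]
    return z.max(axis=0)

x = rng.standard_normal((5, 64))
W = rng.standard_normal((4, 64, 2048)) * 0.01   # 4 pieces, 2048 neurons
b = np.zeros((4, 2048))
h = maxout(x, W, b)                              # shape (5, 2048)
```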
After removing batch normalization from the second maxout layer and adding normalization to the input:
| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 3             | 1024           | 2             | 10             | 94.38        | 1.5355 |
| 5      | 64         | 3             | 2048           | 2             | 10             | 94.65        | 1.5307 |
| 5      | 64         | 4             | 1024           | 2             | 10             | 94.33        | 1.5242 |
| 5      | 64         | 4             | 2048           | 2             | 10             | 94.33        | 1.5911 |
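The batch-normalization step mentioned above can be sketched like this (training-mode batch statistics only; the running averages used at inference are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch to zero mean and unit
    variance, then apply the learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(1)
x = rng.standard_normal((64, 1024)) * 3 + 7      # a batch of 64 activations
y = batch_norm(x, gamma=np.ones(1024), beta=np.zeros(1024))
```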
Validation

| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 3             | 512            | 2             | 10             | 94.38        | 1.5789 |
| 5      | 64         | 3             | 1024           | 2             | 10             | 95.08        | 1.5719 |
| 5      | 64         | 4             | 2048           | 2             | 10             | 95.36        | 1.5666 |
| 5      | 32         | 4             | 2048           | 2             | 10             | 95.53        | 1.5554 |
After removing batch normalization from the second maxout layer and adding normalization to the input:

| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 3             | 1024           | 2             | 10             | 95.11        | 1.5243 |
| 5      | 64         | 3             | 2048           | 2             | 10             | 95.28        | 1.5216 |
| 5      | 64         | 4             | 1024           | 2             | 10             | 95.28        | 1.5234 |
| 5      | 64         | 4             | 2048           | 2             | 10             | 95.10        | 1.5683 |
Since the accuracies and losses come out nearly the same in both cases, I simplified the network by removing the normalization from the first layer and adding batch normalization back to two of the maxout layers, as before. I kept the dropout at 0.5.
Training

| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 4             | 2048           | 2             | 10             | 91.74        | 1.6334 |
| 5      | 64         | 4             | 1024           | 2             | 10             | 90.77        | 1.6480 |
Validation

| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 4             | 2048           | 2             | 10             | 94.08        | 1.5852 |
| 5      | 64         | 4             | 1024           | 2             | 10             | 93.52        | 1.5917 |
I believe that with a little tweaking we can achieve the above accuracies without much variance between training and validation.
Now, after increasing the learning rate to 0.005:
Training

| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 4             | 2048           | 2             | 10             | 97.79        | 1.5060 |
| 5      | 64         | 4             | 1024           | 2             | 10             | 97.44        | 1.5107 |
Validation

| Epochs | Batch size | Layer1 layers | Layer1 neurons | Layer2 layers | Layer2 neurons | Accuracy (%) | Loss   |
|--------|------------|---------------|----------------|---------------|----------------|--------------|--------|
| 5      | 64         | 4             | 2048           | 2             | 10             | 96.94        | 1.5097 |
| 5      | 64         | 4             | 1024           | 2             | 10             | 96.83        | 1.5108 |
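The effect of raising the learning rate from 0.001 to 0.005 can be intuited on a toy quadratic; this is only an illustration of step size, not a model of the actual network's loss surface:

```python
def descend(lr, steps=100, w0=5.0):
    """Plain gradient descent on f(w) = w**2 (gradient is 2*w)."""
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w
    return w

slow = descend(0.001)   # after 100 small steps, still far from the minimum at 0
fast = descend(0.005)   # the larger step size makes noticeably more progress
```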
Finally, the hyperparameters are as in the commit. The network was then trained further on the whole training dataset, with the following accuracies and losses.