Doubly Convolutional Neural Networks
Shuangfei Zhai, Yu Cheng, Weining Lu, Zhongfei Zhang
Full paper here: https://arxiv.org/abs/1610.09716
Environment:
- Dependencies: Theano and Lasagne (both bleeding-edge versions), and optionally cuDNN.
- To test the dependencies, run python main.py in bash; this should start training a tiny model on CIFAR-10.
More details about the code:
- The general usage is python main.py [-param value]
Possible parameters are:
-dataset: the dataset to use; currently supports cifar10 and cifar100. Default is cifar10.
-train_epochs: number of epochs to train, default is 100.
-patience: the patience for training; when the validation error stops decreasing for this number of epochs, the learning rate is halved. Default is 10.
-lr: the learning rate for adadelta, default is 1.
-filter_shape: specifies the shape of the CNN. More specifically, this is the shape of the physically allocated parameters; the actual layer shape is determined by the architecture (see -conv_type below).
-conv_type: {'standard', 'double', 'maxout'}.
If standard, a standard CNN is used; the layer shape is the same as filter_shape.
If double, the doubly convolutional CNN is used; the layer shape is determined by -kernel_size together with -kernel_pool_size (see the sketch after this parameter list).
If maxout, maxout is used (see Maxout Networks, Goodfellow et al.). This corresponds to performing feature pooling over channels; the pooling size is kernel_pool_size^2 (to be consistent with double cnn).
-kernel_size: the actual filter size used for double convolution; only odd sizes are supported.
-kernel_pool_size: how to pool the activations along the sub-filters of each large filter; -1 pools over all of them, which keeps the layer shape equal to filter_shape.
-batch_size: batch size of data, default is 200.
-load_model: 0 or 1. Default is 0.
-save_model: the name with which to save the model.
-train_on_valid: whether to fold the validation set into the training set. default 1.
-dropout_rate: the dropout rate used during training.
-learning_decay: the factor by which the learning rate is decayed when the validation error stops decreasing.
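For intuition, here is a minimal numpy sketch of what a doubly convolutional layer does (the function name is made up and this is not the Theano/Lasagne implementation in this repo): each physically allocated "meta" filter of the spatial size given by filter_shape is sliced into all overlapping kernel_size x kernel_size sub-filters, and every sub-filter acts as an ordinary convolution filter. With kernel_pool_size p, the activations of the sub-filters coming from the same meta filter are then max-pooled in groups of p^2; -1 pools over all of them.

```python
import numpy as np

def extract_subfilters(W, kernel_size):
    """Conceptual sketch: W (n_meta, c_in, z', z') -> (n_meta * m**2, c_in, k, k), m = z' - k + 1."""
    n_meta, c_in, zp, _ = W.shape
    k = kernel_size
    m = zp - k + 1  # sub-filter positions per spatial axis, assuming stride 1
    subs = [W[:, :, i:i + k, j:j + k]  # slide a k x k window over each meta filter
            for i in range(m) for j in range(m)]
    return np.concatenate(subs, axis=0)

# 128 meta filters of size 4x4 with kernel_size 3 yield 128 * 2*2 = 512 effective 3x3
# filters; kernel_pool_size -1 then max-pools the 4 activations per meta filter back
# into one, so such a layer ends up with 128 output channels again.
W = np.random.randn(128, 3, 4, 4).astype('float32')
print(extract_subfilters(W, 3).shape)  # (512, 3, 3, 3)
```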
Experiments I on cifar10 (the hyper-parameters listed are the optimal ones for each setting):
- python main.py -dataset cifar10 -conv_type standard -filter_shape 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 -save_model standard.npz -train_epochs=150
This should yield an error rate of around 9.8%.
- python main.py -dataset cifar10 -conv_type double -filter_shape 128,4,4 128,4,4 2,2 128,4,4 128,4,4 2,2 128,4,4 128,4,4 2,2 128,4,4 128,4,4 2,2 -kernel_size 3 -kernel_pool_size -1 -save_model doubleconv.npz -learning_decay 0.8 -dropout_rate 0.6 -train_epochs=150
This should produce a model with the same shape (w.r.t. #layers and #neurons per layer), and a corresponding error rate of ~8.6%.
- python main.py -dataset cifar10 -conv_type maxout -filter_shape 512,3,3 512,3,3 2,2 512,3,3 512,3,3 2,2 512,3,3 512,3,3 2,2 -kernel_pool_size 2 -save_model maxout.npz -learning_decay 0.7 -dropout_rate 0.6 -train_epochs=150
This should run a maxout network with an effective layer size of 128 (512 / 2^2), and a corresponding error rate of ~9.6%.
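The 512 -> 128 reduction comes from the channel-wise feature pooling mentioned in the -conv_type description. A small illustrative numpy sketch (not the repo's Lasagne implementation): activations are max-pooled over groups of kernel_pool_size^2 channels.

```python
import numpy as np

def maxout_channels(a, kernel_pool_size):
    """Illustrative maxout feature pooling: max over groups of kernel_pool_size**2 channels."""
    b, c, h, w = a.shape
    g = kernel_pool_size ** 2
    return a.reshape(b, c // g, g, h, w).max(axis=2)

a = np.random.randn(2, 512, 8, 8).astype('float32')
print(maxout_channels(a, 2).shape)  # (2, 128, 8, 8): 512 channels -> 128 effective channels
```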
Experiments II:
This set of experiments examines the effect of double convolution at different depths (first, second, third, and fourth block). Note: when -conv_type is double and the filter size is less than or equal to kernel_size, the corresponding layer falls back to a standard conv layer, as illustrated by the sketch below.
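The fall-back rule is simply a size check; the helper name below is hypothetical and only illustrates the condition:

```python
def is_double_layer(filter_size, kernel_size):
    # A layer is only doubly convolutional if the physical filter is strictly
    # larger than kernel_size; otherwise there are no sub-filters to extract.
    return filter_size > kernel_size

print(is_double_layer(4, 3))  # True  -> the 128,4,4 blocks below become double conv layers
print(is_double_layer(3, 3))  # False -> the 128,3,3 blocks stay standard conv layers
```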
- python main.py -dataset cifar10 -conv_type double -filter_shape 128,4,4 128,4,4 2,2 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 -kernel_size 3 -kernel_pool_size -1 -save_model doubleconv_first.npz
- python main.py -dataset cifar10 -conv_type double -filter_shape 128,3,3 128,3,3 2,2 128,4,4 128,4,4 2,2 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 -kernel_size 3 -kernel_pool_size -1 -save_model doubleconv_second.npz
- python main.py -dataset cifar10 -conv_type double -filter_shape 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 128,4,4 128,4,4 2,2 128,3,3 128,3,3 2,2 -kernel_size 3 -kernel_pool_size -1 -save_model doubleconv_third.npz
- python main.py -dataset cifar10 -conv_type double -filter_shape 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 128,3,3 128,3,3 2,2 128,4,4 128,4,4 2,2 -kernel_size 3 -kernel_pool_size -1 -save_model doubleconv_fourth.npz
Experiments III:
- python main.py -dataset cifar10 -conv_type double -filter_shape 16,6,6 16,6,6 2,2 16,6,6 16,6,6 2,2 16,6,6 16,6,6 2,2 -kernel_size 3 -kernel_pool_size 1 -save_model doubleconv3_1.npz
same #parameters, larger model
- python main.py -dataset cifar10 -conv_type double -filter_shape 4,10,10 4,10,10 2,2 4,10,10 4,10,10 2,2 4,10,10 4,10,10 2,2 -kernel_size 3 -kernel_pool_size 1 -save_model doubleconv3_2.npz
fewer #parameters, larger model
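A rough back-of-the-envelope check of these two claims, assuming the effective-channel count from the paper (n meta filters of size z' x z' with kernel_size z and no pooling give n * (z' - z + 1)^2 effective output channels) and comparing per-layer weight counts against a standard 128-channel 3x3 layer; the exact accounting may differ slightly from the paper's tables:

```python
def effective_channels(n_meta, meta_size, kernel_size):
    # output channels of a double conv layer with kernel_pool_size 1 (no pooling)
    return n_meta * (meta_size - kernel_size + 1) ** 2

def layer_params(c_in, n_meta, meta_size):
    # physically allocated weights of one conv layer (biases ignored)
    return c_in * n_meta * meta_size ** 2

# standard reference layer: 128 filters of 3x3 on 128 input channels
print(layer_params(128, 128, 3))     # 147456

# 16,6,6 with kernel_size 3: 256 effective channels feeding the next layer
print(effective_channels(16, 6, 3))  # 256
print(layer_params(256, 16, 6))      # 147456 -> same #parameters, larger model

# 4,10,10 with kernel_size 3: also 256 effective channels
print(effective_channels(4, 10, 3))  # 256
print(layer_params(256, 4, 10))      # 102400 -> fewer #parameters, larger model
```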