Successfully training approximations to full-rank matrices for efficiency in deep learning.

BayesWatch/deficient-efficient

Deficient Linear Transforms for Efficient Deep Learning

Compressed linear transforms that substitute for full-rank ones in deep learning. Swap the substitute convolutions into an existing WideResNet or DARTS network and train as normal. Details of the research are provided in the research log.

tl;dr

In a deep neural network, you can replace the matrix multiply using a weight matrix (a linear transform) with an alternative that uses fewer parameters, fewer mult-adds, or both.

However, this will only train if you scale the original weight decay used to train the network by the compression ratio.
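As a rough illustration of the weight decay rule, the sketch below counts parameters for a dense layer and for a low-rank factorisation of it, then scales a weight decay value by the resulting ratio. Everything in it is an assumption made for illustration: the low-rank substitute, the layer sizes, the base weight decay of 5e-4, the helper names, and the reading of "compression ratio" as compressed parameters divided by original parameters. None of it is taken from this repository's code.

```python
def dense_params(d_in, d_out):
    """Parameter count of a full d_in x d_out weight matrix (no bias)."""
    return d_in * d_out


def low_rank_params(d_in, d_out, rank):
    """Parameter count of a rank-`rank` factorisation: (d_in x rank) then (rank x d_out)."""
    return d_in * rank + rank * d_out


def scaled_weight_decay(original_wd, compressed, original):
    """Scale weight decay by the compression ratio.

    Assumption: ratio = compressed parameters / original parameters,
    so a more heavily compressed layer gets a smaller weight decay.
    """
    ratio = compressed / original
    return original_wd * ratio, ratio


# Hypothetical layer sizes and base weight decay, chosen only for the example.
d_in, d_out, rank = 640, 640, 32
original = dense_params(d_in, d_out)
compressed = low_rank_params(d_in, d_out, rank)
wd, ratio = scaled_weight_decay(5e-4, compressed, original)

print(f"original params:   {original}")
print(f"compressed params: {compressed}")
print(f"compression ratio: {ratio:.3f}")
print(f"scaled weight decay: {wd:.2e}")
```

The scaled value would then be passed as the optimizer's weight decay setting for the compressed network in place of the original one.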

WRN-28-10 on CIFAR-10

DARTS on CIFAR-10

WRN-50-2 on ImageNet

Citations

If you would like to cite this work, please cite our paper using the following bibtex entry:

@article{gray2019separable,
  author    = {Gavin Gray and
               Elliot J. Crowley and
               Amos Storkey},
  title     = {Separable Layers Enable Structured Efficient Linear Substitutions},
  journal   = {CoRR},
  volume    = {abs/1906.00859},
  year      = {2019},
  url       = {https://arxiv.org/abs/1906.00859},
  archivePrefix = {arXiv},
  eprint    = {1906.00859}
}

Acknowledgements

Based on: https://github.com/BayesWatch/pytorch-moonshine
