NTK-and-MF-examples

Description

This repo contains some examples of the mean-field (MF) regime [6] and the neural tangent kernel (NTK) regime [4,5], which are mathematical models of neural networks. Under different conditions, the dynamics of a neural network tend to follow one of them. This repo and [1] introduce the differences between the two regimes and try to illustrate the conditions under which each arises. For simplicity, we only consider the two-layer NN case, which is written as

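$$f(x) = \frac{\alpha}{m} \sum_{j=1}^{m} u_j \, h(\theta_j, x),$$

where input $x$, feature learning parameter $\theta_j$, importance parameter $u_j$, and scaling factor $\alpha$. Here $m$ is the number of hidden units and $h$ is the feature function of each unit.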
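As a concrete reference point, the two-layer model can be sketched in plain Python. This is a minimal sketch, not the code of this repo: it assumes the form f(x) = (alpha / m) * sum_j u_j * tanh(<theta_j, x>), and the tanh activation, the Gaussian initialization, and the names `two_layer_nn` and `init_params` are illustrative assumptions. The two calls at the end differ only in the scaling factor: alpha = sqrt(m) is the scaling commonly associated with NTK, and alpha = 1 the one commonly associated with MF.

```python
import math
import random


def two_layer_nn(x, thetas, us, alpha):
    """Assumed model: f(x) = (alpha / m) * sum_j u_j * tanh(<theta_j, x>)."""
    m = len(us)
    total = 0.0
    for theta_j, u_j in zip(thetas, us):
        pre_activation = sum(t * xi for t, xi in zip(theta_j, x))
        total += u_j * math.tanh(pre_activation)
    return alpha / m * total


def init_params(m, d, rng):
    """Draw theta_j in R^d and u_j in R i.i.d. from N(0, 1) (an assumed initialization)."""
    thetas = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    us = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return thetas, us


rng = random.Random(0)
m, d = 100, 3
thetas, us = init_params(m, d, rng)
x = [0.5, -1.0, 2.0]

f_ntk = two_layer_nn(x, thetas, us, alpha=math.sqrt(m))  # NTK-style scaling
f_mf = two_layer_nn(x, thetas, us, alpha=1.0)            # MF-style scaling
```

With identical parameters, the two outputs differ exactly by the factor sqrt(m), so the scaling factor controls how strongly a small change in the parameters changes the output.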

NTK

Our NTK part focuses on the principle of linearization, which is the core idea of the tangent kernel. When the trained NN stays close to its initialization, it can be analyzed in the NTK regime. This indeed happens when the number of hidden units $m$ and the scaling factor $\alpha$ are large.
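The linearization idea can be checked numerically on a toy model. The sketch below is an illustration under assumptions, not the notebook code in "./NTK/": it assumes a tanh two-layer network, differentiates only with respect to the feature parameters, and uses hypothetical helper names (`forward`, `grad_theta`, `linearized`). It compares the network output with its first-order Taylor expansion around the initialization for a small and a large parameter displacement.

```python
import math
import random


def forward(x, thetas, us, alpha):
    """Assumed model: f(x) = (alpha / m) * sum_j u_j * tanh(<theta_j, x>)."""
    m = len(us)
    return alpha / m * sum(
        u * math.tanh(sum(t * xi for t, xi in zip(th, x)))
        for th, u in zip(thetas, us)
    )


def grad_theta(x, thetas, us, alpha):
    """Gradient of f with respect to every theta_j (u is held fixed for brevity)."""
    m = len(us)
    grads = []
    for th, u in zip(thetas, us):
        z = sum(t * xi for t, xi in zip(th, x))
        scale = alpha / m * u * (1.0 - math.tanh(z) ** 2)
        grads.append([scale * xi for xi in x])
    return grads


def linearized(x, thetas0, thetas, us, alpha):
    """First-order Taylor expansion of f around thetas0, evaluated at thetas."""
    grads = grad_theta(x, thetas0, us, alpha)
    correction = sum(
        g * (t - t0)
        for g_j, th, th0 in zip(grads, thetas, thetas0)
        for g, t, t0 in zip(g_j, th, th0)
    )
    return forward(x, thetas0, us, alpha) + correction


rng = random.Random(0)
m, d = 200, 3
alpha = math.sqrt(m)
thetas0 = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
us = [rng.gauss(0.0, 1.0) for _ in range(m)]
direction = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
x = [0.5, -1.0, 2.0]


def linearization_error(step):
    """|f(theta0 + step * direction) - linear model| for a given displacement size."""
    moved = [
        [t0 + step * dv for t0, dv in zip(th0, dr)]
        for th0, dr in zip(thetas0, direction)
    ]
    return abs(forward(x, moved, us, alpha) - linearized(x, thetas0, moved, us, alpha))


err_small = linearization_error(0.01)  # parameters stay close to initialization
err_large = linearization_error(1.0)   # parameters move far from initialization
```

The gap between the network and its linear model grows with the displacement, which is why the NTK analysis relies on the parameters staying near their initial values.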

We also point out that other factors can change this distance, such as the learning rate, momentum, and even the initialization of the parameters. We further illustrate that the linear approximation fails for some practical NNs. Thus, some practical tricks may break the NTK regime and call for further investigation.
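To make the role of such training choices concrete, the following sketch measures how far the parameters travel from initialization for two learning rates. It is a toy experiment under stated assumptions (a tanh two-layer network, a single training example, plain gradient descent on the feature parameters only, and the hypothetical helper `relative_distance_after_gd`), not the experiment shipped in this repo.

```python
import math
import random


def forward(x, thetas, us, alpha):
    """Assumed model: f(x) = (alpha / m) * sum_j u_j * tanh(<theta_j, x>)."""
    m = len(us)
    return alpha / m * sum(
        u * math.tanh(sum(t * xi for t, xi in zip(th, x)))
        for th, u in zip(thetas, us)
    )


def relative_distance_after_gd(lr, steps, thetas0, us, alpha, x, y):
    """Run gradient descent on 0.5 * (f(x) - y)^2 over theta and
    return ||theta - theta0|| / ||theta0||."""
    m = len(us)
    thetas = [list(th) for th in thetas0]  # copy so thetas0 stays untouched
    for _ in range(steps):
        residual = forward(x, thetas, us, alpha) - y
        for th, u in zip(thetas, us):
            z = sum(t * xi for t, xi in zip(th, x))
            scale = residual * alpha / m * u * (1.0 - math.tanh(z) ** 2)
            for i, xi in enumerate(x):
                th[i] -= lr * scale * xi
    moved = math.sqrt(sum(
        (t - t0) ** 2
        for th, th0 in zip(thetas, thetas0)
        for t, t0 in zip(th, th0)
    ))
    base = math.sqrt(sum(t0 ** 2 for th0 in thetas0 for t0 in th0))
    return moved / base


rng = random.Random(0)
m, d = 100, 3
alpha = math.sqrt(m)
thetas0 = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
us = [rng.gauss(0.0, 1.0) for _ in range(m)]
x, y = [0.6, 0.0, 0.8], 1.0

dist_small_lr = relative_distance_after_gd(0.01, 5, thetas0, us, alpha, x, y)
dist_large_lr = relative_distance_after_gd(0.1, 5, thetas0, us, alpha, x, y)
```

For the same number of steps, the larger learning rate moves the parameters further from initialization, so the linear approximation is put under more strain.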

More detailed information is in [1,4,5] and "./NTK/".

MF

Unlike NTK, MF does not restrict the distance from initialization, but assumes that the hidden units (particles) are i.i.d. This relaxes the search space of each particle at the cost of ignoring the correlation between particles. More importantly, this property is also an idealization of practical NNs, consistent with tricks such as Dropout and Batch Normalization. However, the theoretical analysis of MF is still preliminary compared with that of NTK.

In this repo, we focus on the distributional change of particles and investigate the effective neurons of real NNs.
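The particle view can be illustrated with a small sketch. It is not taken from "./MF/MF.ipynb": it assumes a tanh two-layer network with scaling factor 1, standard Gaussian particles, and the hypothetical helpers `mf_output` and `sample_particles`. It shows two consequences of treating hidden units as i.i.d. samples from a distribution: the output depends only on the set of particles and not on their order, and the output concentrates as the number of particles grows.

```python
import math
import random
import statistics


def mf_output(x, particles):
    """Mean-field form: f(x) is the average of u * tanh(<theta, x>) over particles."""
    return statistics.fmean(
        u * math.tanh(sum(t * xi for t, xi in zip(theta, x)))
        for theta, u in particles
    )


def sample_particles(m, d, rng):
    """Draw m i.i.d. particles (theta_j, u_j); N(0, 1) is an illustrative assumption."""
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(d)], rng.gauss(0.0, 1.0))
        for _ in range(m)
    ]


rng = random.Random(0)
x = [0.5, -1.0, 2.0]

# 1) The output is a function of the empirical distribution: order does not matter.
particles = sample_particles(50, len(x), rng)
shuffled = particles[:]
rng.shuffle(shuffled)
same_output = math.isclose(
    mf_output(x, particles), mf_output(x, shuffled), abs_tol=1e-12
)


# 2) With more particles, the output fluctuates less across independent draws.
def spread_over_draws(m, trials=100):
    outputs = [mf_output(x, sample_particles(m, len(x), rng)) for _ in range(trials)]
    return statistics.pstdev(outputs)


spread_m10 = spread_over_draws(10)
spread_m1000 = spread_over_draws(1000)
```

Because the network is summarized by the distribution of its particles, tracking how that distribution changes during training is the natural object of study in the MF regime.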

More detailed information is in [1,2,3,6] and "./MF/MF.ipynb".

Repopulation

Feature repopulation is a consequence of the MF regime [2,3,7]. In particular, rather than learning in the tangent space (where the parameters of the trained NN stay very close to initialization), the MF regime moves the whole distribution of NN parameters.

We compare the features produced by the repopulated distribution with those produced by the initial distribution and find that the repopulated features are more effective.

Code is provided in "./Repopulation".

References

[1] C. Fang, H. Dong, and T. Zhang, "Mathematical Models of Overparameterized Neural Networks". Proceedings of the IEEE, 2021.

[2] C. Fang, H. Dong, and T. Zhang, "Over parameterized two-level neural networks can learn near optimal feature representations". arXiv preprint arXiv:1910.11508, 2019.

[3] C. Fang, J. D. Lee, P. Yang, and T. Zhang, "Modeling from features: a mean-field framework for over-parameterized deep neural networks". arXiv preprint arXiv:2007.01452, 2020.

[4] A. Jacot, F. Gabriel, and C. Hongler, "Neural tangent kernel: Convergence and generalization in neural networks". Advances in neural information processing systems, 2018.

[5] S. S. Du, J. D. Lee, H. Li, L. Wang, and X. Zhai, "Gradient descent finds global minima of deep neural networks". International Conference on Machine Learning, 2019.

[6] S. Mei, A. Montanari, and P.-M. Nguyen, "A mean field view of the landscape of two-layer neural networks". Proceedings of the National Academy of Sciences, vol. 115, no. 33, pp. E7665–E7671, 2018.

[7] W. Zhang, Y. Gu, C. Fang, J. Lee, and T. Zhang, "How to Characterize The Landscape of Overparameterized Convolutional Neural Networks". Advances in neural information processing systems, 2020.

Citations

The visualizations and illustrations in this repo came primarily out of research in the Statistics and Machine Learning Research Group at HKUST.

For a detailed description, you can refer to

Mathematical Models of Overparameterized Neural Networks

If you find it helpful, you can cite

@ARTICLE{fang2021mathematical,
  author = {Cong Fang and Hanze Dong and Tong Zhang},
  journal = {Proceedings of the IEEE},
  title = {Mathematical Models of Overparameterized Neural Networks},
  year = {2021}
}

Contact

If you encounter any problem with this repo, please describe it and contact:

Hanze Dong: A [AT] B, where A=hdongaj, B=ust.hk.