MNN is a bare-minimum yet sufficiently efficient neural network implementation that runs only on GPU. It serves as an educational reference for anyone who wants to inspect how neural networks work, as well as the math behind the scenes.
- Minimum: no complicated or bloated frameworks; operators are differentiated manually (sketched below); only a few source files, written from scratch in a crystal-clear way.
- Efficient: although it takes a simplistic approach, it expects no loss of efficiency: there is no low-granularity automatic differentiation (i.e., autodiff), hence no large graph of derivatives, and it runs (only) on GPU, where CuPy underpins all the efficient matrix manipulation.
- Educational: focused on core neural network knowledge, written in simple Python but with detailed in-source LaTeX comments that describe the math (e.g., derivations of the Jacobian matrices)!
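To make "manual operator differentiation" concrete, here is a minimal sketch in the spirit of MNN; the class names and the forward/backward interface below are illustrative assumptions, not MNN's actual API.

```python
# Illustrative sketch only -- not MNN's actual classes or interface.
import cupy as cp  # CuPy provides NumPy-like arrays on the GPU


class ReluSketch:
    def forward(self, x):
        # y_i = max(x_i, 0)
        self.mask = x > 0
        return x * self.mask

    def backward(self, grad_y):
        # The Jacobian dy/dx is diagonal with entries 1[x_i > 0],
        # so the chain rule reduces to an element-wise product.
        return grad_y * self.mask


class LinearSketch:
    def __init__(self, in_dim, out_dim):
        self.W = cp.random.randn(in_dim, out_dim) * cp.sqrt(2.0 / in_dim)
        self.b = cp.zeros(out_dim)

    def forward(self, x):
        # y = x W + b, with x of shape (batch, in_dim)
        self.x = x
        return x @ self.W + self.b

    def backward(self, grad_y):
        # Hand-derived gradients: dL/dW = x^T (dL/dy), dL/db sums over the
        # batch, and dL/dx = (dL/dy) W^T is passed to the previous layer.
        self.grad_W = self.x.T @ grad_y
        self.grad_b = grad_y.sum(axis=0)
        return grad_y @ self.W.T
```

Each layer caches whatever its backward pass needs during forward, so no derivative graph is built at runtime.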
This project is still under development. Although it already has a complete pipeline for solving an MNIST classification task, much exciting hands-on and well-documented code is yet to come. Hopefully, a Transformer module that can be trained in one or two days on a consumer GPU will arrive in the future!
python examples/datasets.py prepare_MNIST_dataset
python examples/mnist.py train \
--save_file=./data/mnist_model_ckpt.pkl \
--batch_size=1024 \
--epochs=60
python examples/mnist.py test ./data/mnist_model_ckpt.pkl
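The checkpoint written by the train command above is a plain pickle file, so it can be inspected directly; the snippet below only assumes the path used above and makes no claim about the checkpoint's internal structure (run it from the repository root so that any project-defined classes are importable).

```python
import pickle

# Assumes the checkpoint produced by the train command above; its internal
# layout is whatever examples/mnist.py chose to store.
with open("./data/mnist_model_ckpt.pkl", "rb") as f:
    ckpt = pickle.load(f)

print(type(ckpt))  # peek at what the training script saved
```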
Here is a recommended, ordered reading list of the MNN source code, so that you can pick up the knowledge smoothly by going through the comments and the linked code locations (a small numerical sketch follows the list):
- examples/mnist (only illustrating the pipeline code)
- SequentialLayers
- LinearLayer
- ReluLayer
- MSELossLayer
- SoftmaxLayer
- LogSoftmaxLayer
- NllLossLayer
- CrossEntropyLossLayer
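As a taste of what the last four items build up to, here is a self-contained numerical sketch (independent of MNN's actual classes) of the identity CrossEntropyLoss(x, t) = NllLoss(LogSoftmax(x), t):

```python
import cupy as cp


def log_softmax(x):
    # Subtracting the row-wise max keeps exp() numerically stable.
    shifted = x - x.max(axis=1, keepdims=True)
    return shifted - cp.log(cp.exp(shifted).sum(axis=1, keepdims=True))


def nll_loss(log_probs, targets):
    # Mean negative log-likelihood of the target class per sample.
    rows = cp.arange(log_probs.shape[0])
    return -log_probs[rows, targets].mean()


def cross_entropy(logits, targets):
    return nll_loss(log_softmax(logits), targets)


logits = cp.random.randn(4, 10)   # 4 samples, 10 classes
targets = cp.array([3, 1, 0, 7])  # ground-truth class indices
print(cross_entropy(logits, targets))           # identical values,
print(nll_loss(log_softmax(logits), targets))   # computed two ways
```

Fusing the two steps into a single cross-entropy operator is the usual choice in practice, since it simplifies both the numerics and the hand-derived gradient.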
This project is inspired by PyTorch and tinynn. Further inspiration may also be taken from other projects (to be listed).
MIT