
How to use


Overview

At the moment, the code base of the project is represented by a single Visual Studio (2019) solution, DeepLearning, consisting of 3 VS projects:

  1. DeepLearning, containing code of the actual machine learning-related algorithms;
  2. DeepLearningTest, containing testing functionality for the algorithms (via VSTest framework);
  3. NetrunnerConsole --- a standalone executable to run training of neural networks from the console.

All the functions, classes and methods have rather comprehensive summary sections. Most of the methods/components have dedicated test-suites in the testing project, which, besides the obvious validation/verification purposes, also provide examples of how the corresponding functionality is supposed to be invoked.

The solution requires CUDA SDK 11.7.

How to build a neural net

Currently there are 3 classes representing 3 types of layers that one can use to build a neural net:

  1. NLayer (see NLayer.h) - a fully connected neural layer;
  2. CLayer (see CLayer.h) - a convolution neural layer;
  3. PLayer (see PLayer.h) - a "pooling" layer.

The net itself is represented by the class Net (see Net.h).

NLayer

A fully connected layer. To instantiate it, one needs to provide 3 pieces of information: the number of input neurons, the number of output neurons and the type of activation function to use. For example, this line of code

NLayer(100, 10, ActivationFunctionId::SIGMOID)

constructs a fully connected neural layer with 100 input neurons and 10 output neurons, using "sigmoid" as its activation function. By default, the weights and biases of the layer are initialized with uniformly distributed pseudo-random floating point values from [-1, 1]. The constructor also provides 3 defaulted parameters allowing for more customized initialization. Besides the "sigmoid" activation, one can use the "hyperbolic tangent" (ActivationFunctionId::TANH), "linear rectifier" (ActivationFunctionId::RELU) and "soft-max" (ActivationFunctionId::SOFTMAX) activation functions (see ActivationFunction.h for details).
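
For instance, assuming the same constructor signature, a hidden layer with hyperbolic tangent activation and a soft-max output layer could be constructed as follows (the neuron counts here are arbitrary illustration values):

NLayer(784, 30, ActivationFunctionId::TANH)    // hidden layer, 784 -> 30 neurons
NLayer(30, 10, ActivationFunctionId::SOFTMAX)  // output layer, 30 -> 10 neurons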

CLayer

A convolution neural layer. To instantiate this type of layer, one should provide the input data (image) size (3d --- number of channels, height, width), the filter window size (2d --- height, width), the number of filters to use (which defines the number of channels in the output of the layer) and the type of activation function to use. For example, the following line of code

CLayer({1,28,28}, {5,5}, 20, ActivationFunctionId::RELU)

instantiates a convolution neural layer that takes a single-channel image of size 28x28 as its input and uses 20 filters (convolution kernels) of size 1x5x5, which implies that its output will have size 20x24x24. Optionally, one can specify the "paddings" and "strides" used during the convolution via a couple of defaulted constructor parameters, as sketched below.
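
A possible invocation with explicit paddings and strides is sketched below; the 3d form and the position of these two trailing parameters are an assumption inferred from the script format shown further down this page, and zero paddings with unit strides reproduce the default behavior:

CLayer({1, 28, 28}, {5, 5}, 20, ActivationFunctionId::RELU, {0, 0, 0}, {1, 1, 1})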

PLayer

In the current implementation, a "pool" layer allows one to perform a "min", "max" or "average" pooling operation. To instantiate it, one should specify 3 parameters: the 3d size of the input data (image), the 2d size of the "pool" window (which will be applied to each channel of the input data) and the "pool type" identifier (PoolTypeId::MAX, PoolTypeId::MIN, PoolTypeId::AVERAGE).
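
For example, the following line of code (mirroring the first pooling layer of the network discussed in the Net section below) constructs a layer that max-pools each of the 20 channels of a 24x24 input over 2x2 windows, halving the spatial resolution:

PLayer({20, 24, 24}, {2, 2}, PoolTypeId::MAX)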

Net

The following lines of code instantiate a neural network consisting of 6 layers, which (in terms of architecture) is equivalent to the convolutional network from Neural Networks and Deep Learning (Chapter 6).
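
The sketch below reconstructs those 6 layers from the script format shown further down this page; the layer constructors are the ones described above, while the list-style Net constructor used here is an assumption (see Net.h for the actual composition API):

// NOTE: the list-style construction of Net is an assumption, not the confirmed API
Net net{
    CLayer({1, 28, 28}, {5, 5}, 20, ActivationFunctionId::RELU),  // output: 20x24x24
    PLayer({20, 24, 24}, {2, 2}, PoolTypeId::MAX),                // output: 20x12x12
    CLayer({20, 12, 12}, {5, 5}, 40, ActivationFunctionId::RELU), // output: 40x8x8
    PLayer({40, 8, 8}, {2, 2}, PoolTypeId::MAX),                  // output: 40x4x4
    NLayer(40 * 4 * 4, 100, ActivationFunctionId::RELU),          // 640 -> 100 neurons
    NLayer(100, 10, ActivationFunctionId::SOFTMAX),               // 100 -> 10 neurons
};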

Once instantiated, the net can be trained by calling its method net.learn(...) and then used for predictions by calling net.act(...). In addition to that, a trained net can be saved to disk in binary format by calling (see MsgPackUtils.h for details)

MsgPack::save_to_file(net, <path to file>);

and then loaded from disk by calling

auto net = MsgPack::load_from_file<Net>(<path to file>);
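
Put together, a minimal save/load round trip might look as follows ("net.dat" is a hypothetical file name used purely for illustration):

MsgPack::save_to_file(net, "net.dat");                       // serialize the trained net
auto restored_net = MsgPack::load_from_file<Net>("net.dat"); // ...and restore it later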

There is an option to serialize the neural network's architecture in a script format, as well as to instantiate a neural net with the architecture described in a given script. For example, by calling

net.save_script(<path to text file>)

the architecture of the network introduced above will be saved to the given text file with the following text inside:

CONV;{1, 28, 28}{5, 5};20;RELU;{0, 0, 0}{1, 1, 1}
PULL;{20, 24, 24}{2, 2};MAX
CONV;{20, 12, 12}{5, 5};40;RELU;{0, 0, 0}{1, 1, 1}
PULL;{40, 8, 8}{2, 2};MAX
FULL;{40, 4, 4}{1, 1, 100};RELU
FULL;{}{1, 1, 10};SOFTMAX

As one can easily conclude, each line in the script above corresponds to a layer in the net. Take, for example, the first line: "CONV" refers to the layer type, "convolution"; "{1, 28, 28}" defines the size of the input image; "{5, 5}" determines the size of the convolution window; "20" represents the number of filters in the layer (i.e. the number of channels in the output image); "RELU" speaks for itself; and "{0, 0, 0}", "{1, 1, 1}" stand for the "paddings" and "strides" parameters, respectively. Everything else is pretty much self-explanatory, except perhaps "FULL", which is a nickname for NLayer (a fully connected neural layer).

By calling

net.try_load_from_script_file(<path to script>)

one can instantiate a net with the architecture encoded in the given text script on disk.
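
For example, a full cycle through the script format might look as follows ("architecture.txt" is a hypothetical path; the default-constructed Net and the boolean return value of try_load_from_script_file are assumptions suggested by the method's name):

net.save_script("architecture.txt");  // dump the architecture as a text script

Net restored_net;  // assumption: Net is default-constructible
if (!restored_net.try_load_from_script_file("architecture.txt"))
    throw std::runtime_error("failed to parse the architecture script");  // needs <stdexcept>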
