Understanding the inner workings of complex neural networks can be difficult. TensorBoard makes the common operations involved in running neural networks visible, which greatly helps people understand, debug, and optimize deep learning programs. To let MXNet users take advantage of TensorBoard, we developed MXBoard, a tool that logs MXNet data in a format TensorBoard recognizes. This demo shows how MXNet users can use MXBoard to train and understand neural networks in a more intuitive way.
Let's borrow the script for training the MNIST model from the Gluon example to monitor the training progress in TensorBoard. You can find the code in train_mnist.py.
Note that here we define the network using HybridSequential instead of Sequential as in the original Gluon example. This is because we want to plot the graph in TensorBoard, and MXBoard only accepts HybridBlocks for models built using Gluon interfaces.
Below, we highlight the snippets in the train() function where MXBoard comes into play.
- Define a SummaryWriter object for writing MXNet data to event files under the ./logs directory, flushing writes every five seconds so that updated results show up in TensorBoard promptly.
sw = SummaryWriter(logdir='./logs', flush_secs=5)
- For the first mini-batch of images in every epoch, display them in TensorBoard sequentially. We want to verify that these images are different, since we set shuffle=True in the DataLoader loading the training dataset.
if i == 0:  # i is the mini-batch index within the epoch
    sw.add_image('minist_first_minibatch', data.reshape((opt.batch_size, 1, 28, 28)), epoch)
- Plot the graph of the MNIST model. The model graph is available for logging once net.hybridize() and net.forward() have executed.
if epoch == 0:
    sw.add_graph(net)
- Plot the histograms of the gradients of parameters for every epoch.
grads = [i.grad() for i in net.collect_params().values()]
assert len(grads) == len(param_names)
# logging the gradients of parameters for checking convergence
for i, name in enumerate(param_names):
    sw.add_histogram(tag=name, values=grads[i], global_step=epoch, bins=1000)
- Plot cross entropy, training and validation accuracy curves.
sw.add_scalar(tag='cross_entropy', value=L.mean().asscalar(), global_step=global_step)
name, acc = metric.get()
print('[Epoch %d] Training: %s=%f' % (epoch, name, acc))
# logging training accuracy
sw.add_scalar(tag='train_acc', value=acc, global_step=epoch)
name, val_acc = test(ctx)
# logging the validation accuracy
print('[Epoch %d] Validation: %s=%f' % (epoch, name, val_acc))
sw.add_scalar(tag='valid_acc', value=val_acc, global_step=epoch)
- Remember to close the SummaryWriter once training finishes.
sw.close()
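The snippets above log at two different cadences: the cross entropy is indexed by global_step, while the image, histograms, and accuracies are indexed by epoch. The following is a minimal sketch of that placement only. RecordingWriter is a hypothetical in-memory stand-in, not part of MXBoard, and the per-batch increment of global_step, the tag dense0_weight, and all logged values are assumptions made for illustration.

```python
class RecordingWriter:
    """Hypothetical in-memory stand-in for a summary writer (not MXBoard)."""

    def __init__(self):
        self.scalars = []     # (tag, value, global_step)
        self.histograms = []  # (tag, values, global_step)
        self.closed = False

    def add_scalar(self, tag, value, global_step):
        self.scalars.append((tag, value, global_step))

    def add_histogram(self, tag, values, global_step):
        self.histograms.append((tag, list(values), global_step))

    def close(self):
        self.closed = True


def train(writer, num_epochs=2, batches_per_epoch=3):
    """Toy loop with no real model: it only shows where logging calls go."""
    global_step = 0  # assumed to count mini-batches across all epochs
    for epoch in range(num_epochs):
        for i in range(batches_per_epoch):
            # Placeholder loss; a real loop computes the cross entropy here.
            loss = 1.0 / (global_step + 1)
            # Per-batch scalar, indexed by a counter that never resets.
            writer.add_scalar('cross_entropy', loss, global_step)
            global_step += 1
        # Per-epoch summaries, indexed by the epoch number.
        fake_grads = [0.1 * (epoch + 1), -0.1 * (epoch + 1)]
        writer.add_histogram('dense0_weight', fake_grads, epoch)
        writer.add_scalar('train_acc', 0.5 + 0.1 * epoch, epoch)
    writer.close()
    return writer


sw = train(RecordingWriter())
loss_steps = [step for tag, _, step in sw.scalars if tag == 'cross_entropy']
acc_steps = [step for tag, _, step in sw.scalars if tag == 'train_acc']
print(loss_steps)  # [0, 1, 2, 3, 4, 5]
print(acc_steps)   # [0, 1]
```

Keeping the two step counters separate means the loss curve has one point per mini-batch, while the accuracy curves and histograms have one point per epoch.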
This section visualizes the filters and outputs of the first convolution layers in three ConvNets: Inception-BN, Resnet-152, and VGG16.
All three models are trained on the ImageNet dataset. The outputs are generated by applying the filters to an image selected from the validation dataset. The three pictures in each example are the original picture from the validation dataset, the filters of the first convolution layer, and the outputs of the convolution layer after applying the filters to the original image. Each filter in the middle picture corresponds to the output image at the same coordinate in the picture to its right. The smooth and well-formed patterns in the filters indicate that the models have been well trained. The gray-scale filters tend to extract the outlines of the objects, while the colorful filters focus on local features of the objects. Run the command
$ python plot_filter_and_output.py
to write filters and outputs to event files for visualization in TensorBoard.
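To make "applying a filter to an image" concrete, here is a minimal pure-Python sketch. It is not taken from plot_filter_and_output.py: the toy image and the hand-written Sobel-style filter are assumptions standing in for a real photo and a learned filter. It slides the filter over the image without flipping it, which is the operation deep learning frameworks usually call convolution.

```python
def correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1); return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Toy 5x6 gray-scale image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge filter, standing in for a learned one.
vertical_edge = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

response = correlate2d(image, vertical_edge)
for row in response:
    print(row)  # each row is [0, 4, 4, 0]
```

The response is nonzero only where the window straddles the dark-to-bright boundary, which is the sense in which such a filter extracts outlines.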
Embeddings are vector representations of real-world objects in a high-dimensional space. One can quantify the similarity between two objects by calculating the normalized inner product (the cosine similarity) of their embedding vectors. Furthermore, dimension-reduction algorithms such as PCA and t-SNE can collapse high-dimensional embedding vectors into 2D or 3D space so that their spatial positions can be visualized.
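As a small illustration of the normalized inner product, the sketch below computes it for made-up three-dimensional vectors. The vectors and their names are hypothetical. Real ConvNet embeddings have far more dimensions, but the formula is the same.

```python
import math


def cosine_similarity(u, v):
    """Inner product of u and v divided by the product of their lengths."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Made-up embeddings, purely for illustration.
husky = [3.0, 4.0, 0.0]
malamute = [6.0, 8.0, 0.0]  # same direction as husky, twice the length
teapot = [0.0, 0.0, 5.0]    # orthogonal to both

print(cosine_similarity(husky, malamute))  # 1.0
print(cosine_similarity(husky, teapot))    # 0.0
```

Because of the normalization, only the direction of each vector matters, not its length.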
While humans recognize images by telling apart shapes, colors, surroundings, and so on, ConvNets classify images via special codes generated by the last fully-connected layer just before the classifier, such as softmax. These codes can be taken as the embeddings of the objects, and we can plot them in TensorBoard.
Here we randomly selected 2,304 images from the ImageNet validation dataset.
Then we collected their embeddings generated by the pre-trained model Resnet-152 without the softmax layer, and wrote the embeddings to event files for visualization in TensorBoard.
Let's collapse the embedding data into 3D space using t-SNE, since it is known for preserving the local structure of relationships between objects. Other pre-trained models can be used to achieve similar results.
In the top-right corner of the TensorBoard GUI, you can enter an object type to search for matching objects in the canvas, and then zoom in to check whether similar objects are clustered. A well-trained ConvNet model should produce codes that enable t-SNE to place similar objects in the same neighborhood and preserve the relative distances between objects in the high-dimensional space. Here we searched for images of dogs and, after zooming in, observed that the dog images cluster together. To reproduce the result, type
$ python get_convnet_codes.py
to generate image embeddings. If your machine does not have an NVIDIA GPU, use the following command instead:
$ python get_convnet_codes.py --ctx cpu
Then type
$ python plot_convnet_embedding.py
to write embeddings into event files for visualization in TensorBoard.
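Besides zooming in on the canvas, the expectation that similar objects land in the same neighborhood can be checked numerically. The sketch below is one possible check and is not part of the demo scripts: the function name, the 2D points, and the labels are all hypothetical. It measures how often a point's nearest neighbor carries the same label and needs at least two points.

```python
import math


def nearest_neighbor_agreement(embeddings, labels):
    """Fraction of points whose nearest neighbor (Euclidean) shares their label."""
    hits = 0
    for i, point in enumerate(embeddings):
        best_j, best_dist = None, math.inf
        for j, other in enumerate(embeddings):
            if i == j:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(point, other)))
            if dist < best_dist:
                best_j, best_dist = j, dist
        if labels[best_j] == labels[i]:
            hits += 1
    return hits / len(embeddings)


# Made-up 2D points forming two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
clustered = ['dog', 'dog', 'dog', 'cat', 'cat', 'cat']
scrambled = ['dog', 'cat', 'dog', 'cat', 'dog', 'cat']

print(nearest_neighbor_agreement(points, clustered))  # 1.0
print(nearest_neighbor_agreement(points, scrambled) < 1.0)  # True
```

A score near 1.0 suggests that labels are grouped in the embedding space. Scores computed on a t-SNE projection describe the projection rather than the original high-dimensional codes.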