
Commit

First translation commit
JackEggie authored Mar 1, 2019
1 parent 3e63198 commit 2df4478
Showing 1 changed file with 63 additions and 63 deletions.
126 changes: 63 additions & 63 deletions TODO1/how-to-build-your-own-neural-network-from-scratch-in-python.md
> * Translator:
> * Proofreader:
# How to build your own Neural Network from scratch in Python

> A beginner’s guide to understanding the inner workings of Deep Learning
**Motivation:** As part of my personal journey to gain a better understanding of Deep Learning, I’ve decided to build a Neural Network from scratch without a deep learning library like TensorFlow. I believe that understanding the inner workings of a Neural Network is important to any aspiring Data Scientist.

This article contains what I’ve learned, and hopefully it’ll be useful for you as well!

## What’s a Neural Network?

Most introductory texts to Neural Networks bring up brain analogies when describing them. Without delving into brain analogies, I find it easier to simply describe Neural Networks as a mathematical function that maps a given input to a desired output.

Neural Networks consist of the following components:

* An **input layer**, **_x_**
* An arbitrary number of **hidden layers**
* An **output layer**, **_ŷ_**
* A set of **weights** and **biases** between each layer, **_W and b_**
* A choice of **activation function** for each hidden layer, **_σ_**. In this tutorial, we’ll use a Sigmoid activation function.

The diagram below shows the architecture of a 2-layer Neural Network (_note that the input layer is typically excluded when counting the number of layers in a Neural Network_).

![](https://cdn-images-1.medium.com/max/1600/1*sX6T0Y4aa3ARh7IBS_sdqw.png)

Architecture of a 2-layer Neural Network

Creating a Neural Network class in Python is easy.

```python
class NeuralNetwork:
    def __init__(self, x, y):
        self.input    = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)  # weights between the input layer and a hidden layer of 4 neurons
        self.weights2 = np.random.rand(4, 1)                    # weights between the hidden layer and the output layer
        self.y        = y
        self.output   = np.zeros(y.shape)
```
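The snippets below call `sigmoid` and `sigmoid_derivative` without showing their definitions. Here is a minimal sketch of these helpers, together with the NumPy import that the class above relies on, assuming the Sigmoid activation chosen earlier (note that `sigmoid_derivative` takes the already-activated value as its argument, which is how the backpropagation code further down uses it):

```python
import numpy as np

def sigmoid(z):
    # Sigmoid activation: squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(x):
    # derivative of the Sigmoid, expressed in terms of its output x = sigmoid(z)
    return x * (1.0 - x)
```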

**Training the Neural Network**

The output **_ŷ_** of a simple 2-layer Neural Network is:

![](https://cdn-images-1.medium.com/max/1600/1*E1_l8PGamc2xTNS87XGNcA.png)
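Written out, the formula shown above is (a reconstruction, assuming the Sigmoid activation σ is applied at both layers):

```latex
\hat{y} = \sigma\left(W_2 \, \sigma\left(W_1 x + b_1\right) + b_2\right)
```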

You might notice that in the equation above, the weights **_W_** and the biases **_b_** are the only variables that affect the output **_ŷ_**.

Naturally, the right values for the weights and biases determine the strength of the predictions. The process of fine-tuning the weights and biases from the input data is known as **training the Neural Network**.

Each iteration of the training process consists of the following steps:

* Calculating the predicted output **_ŷ_**, known as **feedforward**
* Updating the weights and biases, known as **backpropagation**

The sequential graph below illustrates the process.

![](https://cdn-images-1.medium.com/max/1600/1*CEtt0h8Rss_qPu7CyqMTdQ.png)
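In code, one training iteration therefore boils down to two method calls on the network object (a small sketch using the `feedforward` and `backprop` methods that will be added to the class in the next two sections):

```python
def train_step(nn):
    # one iteration of training: forward pass, then weight update
    nn.feedforward()
    nn.backprop()
```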

### Feedforward

As we’ve seen in the sequential graph above, feedforward is just a simple calculation, and for a basic 2-layer neural network, the output of the Neural Network is:

![](https://cdn-images-1.medium.com/max/1600/1*E1_l8PGamc2xTNS87XGNcA.png)

Let’s add a feedforward function in our Python code to do exactly that. Note that for simplicity, we have assumed the biases to be 0.

```python
class NeuralNetwork:
    # ... __init__ as defined above ...

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
```

However, we still need a way to evaluate the “goodness” of our predictions (i.e. how far off are our predictions?). The **Loss Function** allows us to do exactly that.

### Loss Function

There are many available loss functions, and the nature of our problem should dictate our choice of loss function. In this tutorial, we’ll use a simple **sum-of-squares error** as our loss function.

![](https://cdn-images-1.medium.com/max/1600/1*iNa1VLdaeqwUAxpNXs3jwQ.png)

That is, the sum-of-squares error is simply the sum of the squared differences between each predicted value and the actual value. The difference is squared so that we measure its magnitude regardless of sign.
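As a concrete sketch, the sum-of-squares error takes only a couple of lines of NumPy (the helper name `sum_of_squares_error` is illustrative; the article’s class never defines it as a separate function):

```python
import numpy as np

def sum_of_squares_error(y, y_hat):
    # sum over all samples of the squared difference between
    # the actual value y and the predicted value y_hat
    return np.sum((y - y_hat) ** 2)
```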

**Our goal in training is to find the best set of weights and biases that minimizes the loss function.**

### Backpropagation

Now that we’ve measured the error of our prediction (loss), we need to find a way to **propagate** the error back, and to update our weights and biases.

In order to know the appropriate amount to adjust the weights and biases by, we need to know the **derivative of the loss function with respect to the weights and biases**.

Recall from calculus that the derivative of a function is simply the slope of the function.
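If the calculus feels rusty, the slope can also be pictured as a finite difference (a small self-contained sketch, not part of the article’s code):

```python
def numerical_slope(f, x, eps=1e-6):
    # approximate the derivative (slope) of f at x with a central difference
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# example: the slope of f(x) = x**2 at x = 3 is close to 6
print(numerical_slope(lambda x: x ** 2, 3.0))
```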

![](https://cdn-images-1.medium.com/max/1600/1*3FgDOt4kJxK2QZlb9T0cpg.png)

Gradient descent algorithm

If we have the derivative, we can simply update the weights and biases by increasing/reducing them with it (refer to the diagram above). This is known as **gradient descent**.
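A minimal sketch of one such update step (the names `w`, `grad` and `learning_rate` are illustrative and not taken from the article’s code):

```python
def gradient_descent_step(w, grad, learning_rate=0.1):
    # move the weight a small step against the slope of the loss:
    # a positive slope means decreasing w lowers the loss, and vice versa
    return w - learning_rate * grad
```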

However, we can’t directly calculate the derivative of the loss function with respect to the weights and biases because the equation of the loss function does not contain the weights and biases. Therefore, we need the **chain rule** to help us calculate it.

![](https://cdn-images-1.medium.com/max/1600/1*7zxb2lfWWKaVxnmq2o69Mw.png)

Chain rule for calculating the derivative of the loss function with respect to the weights. Note that for simplicity, we have only displayed the partial derivative assuming a 1-layer Neural Network.
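Spelled out for a 1-layer network with pre-activation z = Wx + b (a reconstruction of the derivation shown above; note that the backpropagation code below computes 2·(y − ŷ) and **adds** the result to the weights, which folds the minus sign of the true gradient into the update direction):

```latex
\frac{\partial \,\mathrm{Loss}(y, \hat{y})}{\partial W}
  = \frac{\partial \,\mathrm{Loss}}{\partial \hat{y}}
    \cdot \frac{\partial \hat{y}}{\partial z}
    \cdot \frac{\partial z}{\partial W}
  = 2\,(\hat{y} - y) \cdot \sigma'(z) \cdot x,
  \qquad z = W x + b
```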

Phew! That was ugly, but it allows us to get what we needed: the derivative (slope) of the loss function with respect to the weights, so that we can adjust the weights accordingly.

Now that we have that, let’s add the backpropagation function into our Python code.

```python
class NeuralNetwork:
    # ... __init__ as defined above ...

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

def backprop(self):
        # application of the chain rule to find the derivative of the loss function with respect to weights2 and weights1
d_weights2 = np.dot(self.layer1.T, (2*(self.y - self.output) * sigmoid_derivative(self.output)))
d_weights1 = np.dot(self.input.T, (np.dot(2*(self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1)))

        # update the weights with the derivative (slope) of the loss function
self.weights1 += d_weights1
self.weights2 += d_weights2
```

For a deeper understanding of the application of calculus and the chain rule in backpropagation, I strongly recommend this tutorial by 3Blue1Brown.

Watch this [video](https://youtu.be/tIeHLnjs5U8)

## Putting it all together

Now that we have our complete Python code for doing feedforward and backpropagation, let’s apply our Neural Network on an example and see how well it does.

![](https://cdn-images-1.medium.com/max/1600/1*HaC4iILh2t0oOKi6S6FwtA.png)

Our Neural Network should learn the ideal set of weights to represent this function. Note that it isn’t exactly trivial for us to work out the weights just by inspection alone.

Let’s train the Neural Network for 1500 iterations and see what happens. Looking at the loss-per-iteration graph below, we can clearly see the loss **monotonically decreasing towards a minimum**. This is consistent with the gradient descent algorithm that we’ve discussed earlier.
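A sketch of what that training run could look like, reusing the `NeuralNetwork` class built above (the toy dataset values here are illustrative assumptions, not necessarily the exact ones shown in the images):

```python
import numpy as np

# illustrative toy dataset: 4 samples, 3 input features each, one target per sample
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
y = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(X, y)

for i in range(1500):
    nn.feedforward()
    nn.backprop()
    if i % 100 == 0:
        # sum-of-squares loss; it should decrease towards a minimum as training proceeds
        print(i, np.sum((nn.y - nn.output) ** 2))

print(nn.output)  # predictions after 1500 training iterations
```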

![](https://cdn-images-1.medium.com/max/1600/1*fWNNA2YbsLSoA104K3Z3RA.png)

Let’s look at the final prediction (output) from the Neural Network after 1500 iterations.

![](https://cdn-images-1.medium.com/max/1600/1*9oOlYhhOSdCUqUJ0dQ_KxA.png)

Predictions after 1500 training iterations

We did it! Our feedforward and backpropagation algorithm trained the Neural Network successfully, and the predictions converged on the true values.

Note that there’s a slight difference between the predictions and the actual values. This is desirable, as it prevents **overfitting** and allows the Neural Network to **generalize** better to unseen data.

## What’s Next?

Fortunately for us, our journey isn’t over. There’s still **much** to learn about Neural Networks and Deep Learning. For example:

* What other **activation functions** can we use besides the Sigmoid function?
* Using a **learning rate** when training the Neural Network (see the sketch after this list)
* Using **convolutions** for image classification tasks
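As a taste of the learning-rate idea above, the `backprop` method defined earlier could scale its update by a small factor instead of applying the raw derivative. A sketch only; the subclass name `NeuralNetworkWithLearningRate` and the parameter `lr` are assumptions, and the article’s own code applies the full update:

```python
class NeuralNetworkWithLearningRate(NeuralNetwork):
    def backprop(self, lr=0.1):
        # same derivatives as in the original backprop, but the weight updates
        # are scaled by a learning rate lr for smaller, more stable steps
        d_weights2 = np.dot(self.layer1.T, (2 * (self.y - self.output) * sigmoid_derivative(self.output)))
        d_weights1 = np.dot(self.input.T, (np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output),
                                                  self.weights2.T) * sigmoid_derivative(self.layer1)))

        self.weights1 += lr * d_weights1
        self.weights2 += lr * d_weights2
```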

I’ll be writing more on these topics soon, so do follow me on Medium and keep an eye out for them!

## Final Thoughts

I’ve certainly learnt a lot writing my own Neural Network from scratch.

Although Deep Learning libraries such as TensorFlow and Keras make it easy to build deep nets without fully understanding the inner workings of a Neural Network, I find that it’s beneficial for aspiring data scientists to gain a deeper understanding of Neural Networks.

This exercise has been a great investment of my time, and I hope that it’ll be useful for you as well!

> If you find any mistakes in this translation or places that could be improved, you are welcome to edit the translation at [掘金翻译计划](https://github.com/xitu/gold-miner) and open a PR, for which you can earn corresponding bonus points. The **permanent link to this article** at the beginning of the post is the MarkDown link to this article on GitHub.
