Update the demo code and the doc of varbase.backward. #2448

Merged (5 commits, Aug 27, 2020)
1 change: 0 additions & 1 deletion doc/fluid/api/dygraph.rst
@@ -5,7 +5,6 @@ fluid.dygraph
 .. toctree::
     :maxdepth: 1

-    dygraph/BackwardStrategy.rst
     dygraph/BatchNorm.rst
     dygraph/BilinearTensorProduct.rst
     dygraph/Conv2D.rst
12 changes: 0 additions & 12 deletions doc/fluid/api/dygraph/BackwardStrategy.rst

This file was deleted.

1 change: 0 additions & 1 deletion doc/fluid/api_cn/dygraph_cn.rst
@@ -8,7 +8,6 @@ fluid.dygraph
 .. toctree::
     :maxdepth: 1

-    dygraph_cn/BackwardStrategy_cn.rst
     dygraph_cn/BatchNorm_cn.rst
     dygraph_cn/BilinearTensorProduct_cn.rst
     dygraph_cn/Conv2D_cn.rst
49 changes: 0 additions & 49 deletions doc/fluid/api_cn/dygraph_cn/BackwardStrategy_cn.rst

This file was deleted.

15 changes: 6 additions & 9 deletions doc/fluid/api_cn/dygraph_cn/grad_cn.rst
@@ -5,24 +5,21 @@ grad

**Note: this API only supports dygraph (imperative) mode**

-   .. py:method:: paddle.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True, allow_unused=False, no_grad_vars=None, backward_strategy=None)
+   .. py:method:: paddle.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True, allow_unused=False, no_grad_vars=None)

For each `inputs`, computes the sum of the gradients of all `outputs` with respect to it.

Parameters:
-   - **outputs** (Variable|list(Variable)|tuple(Variable)) – output Variables of the graph used for gradient computation, or a list/tuple of output Variables.
-   - **inputs** (Variable|list(Variable)|tuple(Variable)) – input Variables of the graph used for gradient computation, or a list/tuple of input Variables. Each return value of this API is the gradient of one entry of `inputs`.
-   - **grad_outputs** (Variable|list(Variable|None)|tuple(Variable|None), optional) – initial gradient values of `outputs`. If `grad_outputs` is None, the initial gradient of every `outputs` is a Tensor of all ones. If `grad_outputs` is not None, its length must equal that of `outputs`; in that case, if the i-th element of `grad_outputs` is None, the initial gradient of the i-th `outputs` is a Tensor of all ones, and if the i-th element is a Variable, it is used as the initial gradient of the i-th `outputs`. Defaults to None.
+   - **outputs** (Tensor|list(Tensor)|tuple(Tensor)) – output Tensors of the graph used for gradient computation, or a list/tuple of output Tensors.
+   - **inputs** (Tensor|list(Tensor)|tuple(Tensor)) – input Tensors of the graph used for gradient computation, or a list/tuple of input Tensors. Each return value of this API is the gradient of one entry of `inputs`.
+   - **grad_outputs** (Tensor|list(Tensor|None)|tuple(Tensor|None), optional) – initial gradient values of `outputs`. If `grad_outputs` is None, the initial gradient of every `outputs` is a Tensor of all ones. If `grad_outputs` is not None, its length must equal that of `outputs`; in that case, if the i-th element of `grad_outputs` is None, the initial gradient of the i-th `outputs` is a Tensor of all ones, and if the i-th element is a Tensor, it is used as the initial gradient of the i-th `outputs`. Defaults to None.
    - **retain_graph** (bool, optional) – whether to retain the forward graph used for the gradient computation. If True, the forward graph is kept and the user can run backward over the same graph twice; if False, it is released. Defaults to None, which means it takes the value of `create_graph`.
    - **create_graph** (bool, optional) – whether to build a graph of the backward computation itself. If True, higher-order derivatives can be computed; if False, the backward graph is released after use. Defaults to False.
    - **only_inputs** (bool, optional) – whether to compute gradients for `inputs` only. If False, the gradients of all leaf variables in the graph are computed and accumulated; if True, only the gradients of `inputs` are computed. Defaults to True. only_inputs=False is still under development and not yet supported.
    - **allow_unused** (bool, optional) – whether to raise an error or return None when some `inputs` do not appear in the graph. If some `inputs` are unused in the graph (i.e. their gradients are None), an error is raised when allow_unused=False, and None is returned as their gradients when allow_unused=True. Defaults to False.
-   - **no_grad_vars** (Variable|list(Variable)|tuple(Variable)|set(Variable), optional) – variables whose gradients do not need to be computed. Defaults to None.
-   - **backward_strategy** (BackwardStrategy, optional) – the strategy used to compute gradients. See :ref:`cn_api_fluid_dygraph_BackwardStrategy`. Defaults to None.
+   - **no_grad_vars** (Tensor|list(Tensor)|tuple(Tensor)|set(Tensor), optional) – Tensors whose gradients do not need to be computed. Defaults to None.

-   Returns: a tuple of Variables whose length equals the number of Variables in `inputs`; the i-th returned Variable is the sum of the gradients of all `outputs` with respect to the i-th `inputs`.

-   Return type: tuple
+   Returns: tuple(Tensor), whose length equals the number of Tensors in `inputs`; the i-th returned Tensor is the sum of the gradients of all `outputs` with respect to the i-th `inputs`.
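
Since the example blocks are collapsed in this diff view, here is a minimal sketch of calling the updated `paddle.grad` signature, assuming paddle 2.x dygraph mode and using only the arguments documented above (not taken from the PR's collapsed example):

.. code-block:: python

    # Sketch: sum of gradients of y with respect to x.
    import numpy as np
    import paddle

    paddle.disable_static()
    x = paddle.to_tensor(np.ones([2, 2], np.float32))
    x.stop_gradient = False            # x must require gradients
    y = x * x + 3 * x                  # dy/dx = 2*x + 3

    # grad_outputs=None means an all-ones initial gradient for y.
    dx = paddle.grad(outputs=[y], inputs=[x])[0]
    print(dx.numpy())                  # 5.0 everywhere, since x == 1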

**Example 1**
.. code-block:: python
44 changes: 19 additions & 25 deletions doc/fluid/api_cn/fluid_cn/Variable_cn.rst
@@ -145,31 +145,28 @@ Variable

**Parameters:**

-   - **backward_strategy**: ( :ref:`cn_api_fluid_dygraph_BackwardStrategy` ) which :ref:`cn_api_fluid_dygraph_BackwardStrategy` to use when aggregating backward gradients
+   - **retain_graph** (bool, optional) – whether the backward computation graph is kept after the backward gradient update finishes (True keeps it). If you intend to add more Ops to the already-built graph after calling :code:`backward`, set :code:`retain_graph` to True so that previously computed gradients are preserved. Accordingly, setting :code:`retain_graph` to False reduces memory usage. Defaults to False.

Returns: None


**Example**
.. code-block:: python

-   import paddle.fluid as fluid
    import numpy as np
+   import paddle
+   paddle.disable_static()
    x = np.ones([2, 2], np.float32)
-   with fluid.dygraph.guard():
-       inputs2 = []
-       for _ in range(10):
-           tmp = fluid.dygraph.base.to_variable(x)
-           # If we do not set stop_gradient=False for the input tmp here, loss2 will
-           # not produce a gradient either, because nothing along this path requires gradients
-           tmp.stop_gradient=False
-           inputs2.append(tmp)
-       ret2 = fluid.layers.sums(inputs2)
-       loss2 = fluid.layers.reduce_sum(ret2)
-       backward_strategy = fluid.dygraph.BackwardStrategy()
-       backward_strategy.sort_sum_gradient = True
-       loss2.backward(backward_strategy)
+   inputs = []
+   for _ in range(10):
+       tmp = paddle.to_tensor(x)
+       # If we do not set stop_gradient=False for the input tmp here, loss will
+       # not produce a gradient either, because nothing along this path requires gradients
+       tmp.stop_gradient=False
+       inputs.append(tmp)
+   ret = paddle.sums(inputs)
+   loss = paddle.reduce_sum(ret)
+   loss.backward()
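
As a quick illustration of the new ``retain_graph`` argument documented above, here is a minimal sketch, assuming the same paddle 2.x dygraph setup as the example (not taken from the PR):

.. code-block:: python

    # Sketch: retain the forward graph so backward can run twice.
    import numpy as np
    import paddle

    paddle.disable_static()
    x = paddle.to_tensor(np.ones([2, 2], np.float32))
    x.stop_gradient = False
    loss = paddle.reduce_sum(x * x)

    loss.backward(retain_graph=True)   # keep the graph for a second pass
    loss.backward()                    # would fail if retain_graph were False
    print(x.gradient())                # gradients accumulate: 2*x per pass -> 4.0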

.. py:method:: gradient()

@@ -202,9 +199,7 @@ Variable
        inputs2.append(tmp)
    ret2 = fluid.layers.sums(inputs2)
    loss2 = fluid.layers.reduce_sum(ret2)
-   backward_strategy = fluid.dygraph.BackwardStrategy()
-   backward_strategy.sort_sum_gradient = True
-   loss2.backward(backward_strategy)
+   loss2.backward()
    print(loss2.gradient())

    # example2: returns a tuple of ndarray
@@ -248,9 +243,7 @@ Variable
        inputs2.append(tmp)
    ret2 = fluid.layers.sums(inputs2)
    loss2 = fluid.layers.reduce_sum(ret2)
-   backward_strategy = fluid.dygraph.BackwardStrategy()
-   backward_strategy.sort_sum_gradient = True
-   loss2.backward(backward_strategy)
+   loss2.backward()
    print(loss2.gradient())
    loss2.clear_gradient()
    print("After clear {}".format(loss2.gradient()))
@@ -351,6 +344,7 @@ Variable
.. code-block:: python

    import paddle.fluid as fluid
+   import numpy as np

with fluid.dygraph.guard():
value0 = np.arange(26).reshape(2, 13).astype("float32")
@@ -366,9 +360,9 @@ Variable
        out1.stop_gradient = True
    out = fluid.layers.concat(input=[out1, out2, c], axis=1)
    out.backward()
-   # you can see that here the parameters of linear become 0
-   assert (linear.weight.gradient() == 0).all()
-   assert (out1.gradient() == 0).all()
+   # you can see that here the parameter gradients of linear become None
+   assert linear.weight.gradient() is None
+   assert out1.gradient() is None
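
The behavior behind the updated asserts can be illustrated with a small sketch (assumes paddle 2.x dygraph mode; not taken from the PR): a branch with ``stop_gradient=True`` is cut out of autograd, so its ``gradient()`` stays None after ``backward()``.

.. code-block:: python

    # Sketch: gradients do not flow through a stopped branch.
    import paddle

    paddle.disable_static()
    a = paddle.to_tensor([1.0, 2.0, 3.0])
    a.stop_gradient = False
    b = a * 2
    b.stop_gradient = True         # block gradients through this branch
    c = a * 3
    out = paddle.reduce_sum(b + c)
    out.backward()

    assert b.gradient() is None    # the stopped branch receives no gradient
    print(a.gradient())            # only the c = a * 3 path contributes: 3.0 each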

.. py:attribute:: persistable
