From 5e186a5ac236d22b73fe0f2bd8f5db2603001505 Mon Sep 17 00:00:00 2001
From: nb312
Date: Sun, 22 Nov 2015 01:52:40 +0800
Subject: [PATCH] Shared-variables translation completed
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Shared-variables translation completed; please proofread.
---
 SOURCE/how_tos/variable_scope/index.md | 211 +++++++------------------
 1 file changed, 57 insertions(+), 154 deletions(-)

diff --git a/SOURCE/how_tos/variable_scope/index.md b/SOURCE/how_tos/variable_scope/index.md
index f96734c..42d3423 100755
--- a/SOURCE/how_tos/variable_scope/index.md
+++ b/SOURCE/how_tos/variable_scope/index.md
@@ -1,19 +1,9 @@
-# Sharing Variables
+# Sharing Variables
+
+You can create, initialize, save and load single variables in the way described in the [Variables HowTo](../../how_tos/variables/index.md). But when building complex models you often need to share large sets of variables, and you may want to initialize all of them in one place. This tutorial shows how to do that with `tf.variable_scope()` and `tf.get_variable()`.

-You can create, initialize, save and load single variables
-in the way described in the [Variables HowTo](../../how_tos/variables/index.md).
-But when building complex models you often need to share large sets of
-variables and you might want to initialize all of them in one place.
-This tutorial shows how this can be done using `tf.variable_scope()` and
-the `tf.get_variable()`.

-## The Problem
-
-Imagine you create a simple model for image filters, similar to our
-[Convolutional Neural Networks Tutorial](../../tutorials/deep_cnn/index.md)
-model but with only 2 convolutions (for simplicity of this example). If you use
-just `tf.Variable`, as explained in [Variables HowTo](../../how_tos/variables/index.md),
-your model might look like this.
+## The Problem
+
+Imagine you create a simple model for image filters, similar to our [Convolutional Neural Networks Tutorial](../../tutorials/deep_cnn/index.md) model, but with only 2 convolutions (for simplicity of this example). If you use just `tf.Variable`, as explained in the [Variables HowTo](../../how_tos/variables/index.md), your model might look like this.
 ```python
 def my_image_filter(input_images):
@@ -32,15 +22,8 @@ def my_image_filter(input_images):
     return tf.nn.relu(conv2 + conv2_biases)
 ```

-As you can easily imagine, models quickly get much more complicated than
-this one, and even here we already have 4 different variables: `conv1_weights`,
-`conv1_biases`, `conv2_weights`, and `conv2_biases`.
-
-The problem arises when you want to reuse this model. Assume you want to
-apply your image filter to 2 different images, `image1` and `image2`.
-You want both images processed by the same filer with the same parameters.
-You can call `my_image_filter()` twice, but this will create two sets
-of variables:
+As you can easily imagine, models quickly get much more complicated than this one, and even here we already have 4 different variables: `conv1_weights`, `conv1_biases`, `conv2_weights`, and `conv2_biases`.
+The problem arises when you want to reuse this model. Assume you want to apply your image filter to 2 different images, `image1` and `image2`. You want both images processed by the same filter with the same parameters. You can call `my_image_filter()` twice, but this will create two sets of variables:

 ```python
 # First call creates one set of variables.
@@ -48,9 +31,7 @@ result1 = my_image_filter(image1)
 # Another set is created in the second call.
 result2 = my_image_filter(image2)
 ```
-
-A common way to share variables is to create them in a separate piece of code
-and pass them to functions that use them. For example by using a dictionary:
+A common way to share variables is to create them in a separate piece of code and pass them to the functions that use them, for example by using a dictionary:

 ```python
 variables_dict = {
@@ -73,44 +54,30 @@ def my_image_filter(input_images, variables_dict):
 result1 = my_image_filter(image1, variables_dict)
 result2 = my_image_filter(image2, variables_dict)
 ```
+While convenient, creating variables like the above outside of the code breaks encapsulation:
+* The code that builds the graph must document the names, types, and shapes of the variables to create.
+* When the code changes, the callers may have to create more, fewer, or different variables.

-While convenient, creating variables like above,
-outside of the code, breaks encapsulation:
-
-* The code that builds the graph must document the names, types,
-  and shapes of variables to create.
-* When the code changes, the callers may have to create more, or less,
-  or different variables.
+One way to address the problem is to use classes to create a model, where the classes take care of managing the variables they need.
+A lighter solution, not involving classes, is the *Variable Scope* mechanism that TensorFlow provides, which makes it easy to share named variables while constructing a graph.

-One way to address the problem is to use classes to create a model,
-where the classes take care of managing the variables they need.
-For a lighter solution, not involving classes, TensorFlow provides
-a *Variable Scope* mechanism that allows to easily share named variables
-while constructing a graph.

+## Variable Scope Example

-## Variable Scope Example
-
-Variable Scope mechanism in TensorFlow consists of 2 main functions:
+The Variable Scope mechanism in TensorFlow consists of two main functions:

 * `tf.get_variable(<name>, <shape>, <initializer>)`:
-  Creates or returns a variable with a given name.
+  Creates or returns a variable with a given name.
 * `tf.variable_scope(<scope_name>)`:
-  Manages namespaces for names passed to `tf.get_variable()`.
+  Manages namespaces for names passed to `tf.get_variable()`.
+
+The function `tf.get_variable()` is used to get or create a variable instead of a direct call to `tf.Variable`. Instead of passing a value directly, as with `tf.Variable`, it uses an *initializer*. An initializer is a function that takes a shape and provides a tensor with that shape. Here are some initializers available in TensorFlow:

-The function `tf.get_variable()` is used to get or create a variable instead
-of a direct call to `tf.Variable`. It uses an *initializer* instead of passing
-the value directly, as in `tf.Variable`. An initializer is a function that
-takes the shape and provides a tensor with that shape. Here are some
-initializers available in TensorFlow:
+* `tf.constant_initializer(value)` initializes everything to the provided value,
+* `tf.random_uniform_initializer(a, b)` initializes uniformly from [a, b],
+* `tf.random_normal_initializer(mean, stddev)` initializes from the normal distribution with the given mean and standard deviation.

-* `tf.constant_initializer(value)` initializes everything to the provided value,
-* `tf.random_uniform_initializer(a, b)` initializes uniformly from [a, b],
-* `tf.random_normal_initializer(mean, stddev)` initializes from the normal
-  distribution with the given mean and standard deviation.
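To make the "an initializer is a function from a shape to a tensor" contract concrete, here is a minimal pure-Python sketch. These are illustrative stand-ins for the TensorFlow initializers named above, not the real implementations; plain nested lists stand in for tensors.

```python
# Pure-Python sketch of the initializer contract: each factory returns a
# function that maps a shape to a value of that shape. Stand-ins only,
# not the TensorFlow implementations.
import random

def constant_initializer(value):
    # Fill the given 2-D shape with `value`.
    def init(shape):
        rows, cols = shape
        return [[value] * cols for _ in range(rows)]
    return init

def random_uniform_initializer(a, b):
    # Fill the given 2-D shape with samples drawn uniformly from [a, b].
    def init(shape):
        rows, cols = shape
        return [[random.uniform(a, b) for _ in range(cols)] for _ in range(rows)]
    return init

# Variable creation only needs a shape and an initializer:
weights = constant_initializer(0.5)([2, 3])
print(weights)  # [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
```

Because the initializer is called only when a variable is actually created, the same factory can be reused for many variables of different shapes.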
+To see how `tf.get_variable()` solves the problem discussed before, let's refactor the code that created one convolution into a separate function, named `conv_relu`:

-To see how `tf.get_variable()` solves the problem discussed
-before, let's refactor the code that created one convolution into
-a separate function, named `conv_relu`:

 ```python
 def conv_relu(input, kernel_shape, bias_shape):
@@ -124,12 +91,7 @@ def conv_relu(input, kernel_shape, bias_shape):
         strides=[1, 1, 1, 1], padding='SAME')
     return tf.nn.relu(conv + biases)
 ```
-
-This function uses short names `"weights"` and `"biases"`.
-We'd like to use it for both `conv1` and `conv2`, but
-the variables need to have different names.
-This is where `tf.variable_scope()` comes into play:
-it pushes a namespace for variables.
+This function uses the short names `"weights"` and `"biases"`. We would like to use it for both `conv1` and `conv2`, but the variables need to have different names. This is where `tf.variable_scope()` comes into play: it pushes a namespace for variables.

 ```python
 def my_image_filter(input_images):
@@ -140,18 +102,15 @@ def my_image_filter(input_images):
         # Variables created here will be named "conv2/weights", "conv2/biases".
         return conv_relu(relu1, [5, 5, 32, 32], [32])
 ```
+Now, let's see what happens when we call `my_image_filter()` twice.

-Now, let's see what happens when we call `my_image_filter()` twice.

 ```
 result1 = my_image_filter(image1)
 result2 = my_image_filter(image2)
 # Raises ValueError(... conv1/weights already exists ...)
 ```
-
-As you can see, `tf.get_variable()` checks that already existing variables
-are not shared by accident. If you want to share them, you need to specify
-it by setting `reuse_variables()` as follows.
+As you can see, `tf.get_variable()` checks that already existing variables are not shared by accident. If you want to share them, you need to specify it by calling `reuse_variables()` as follows.

 ```
 with tf.variable_scope("image_filters") as scope:
@@ -159,48 +118,32 @@ with tf.variable_scope("image_filters") as scope:
     scope.reuse_variables()
     result2 = my_image_filter(image2)
 ```
+This is a good way to share variables: lightweight and safe.

-This is a good way to share variables, lightweight and safe.

-## How Does Variable Scope Work?
+## How Does Variable Scope Work?
-### Understanding `tf.get_variable()`
-
-To understand variable scope it is necessary to first
-fully understand how `tf.get_variable()` works.
-Here is how `tf.get_variable` is usually called.
+### Understanding `tf.get_variable()`
+
+To understand variable scope it is necessary to first fully understand how `tf.get_variable()` works. Here is how `tf.get_variable` is usually called.

 ```python
 v = tf.get_variable(name, shape, dtype, initializer)
 ```
+This call does one of two things depending on the scope it is called in. Here are the two options.

-This call does one of two things depending on the scope it is called in.
-Here are the two options.
-
-* Case 1: the scope is set for creating new variables, as evidenced by
-`tf.get_variable_scope().reuse == False`.
+* Case 1: the scope is set for creating new variables, as evidenced by `tf.get_variable_scope().reuse == False`.

-In this case, `v` will be a newly created `tf.Variable` with the provided
-shape and data type. The full name of the created variable will be set to
-the current variable scope name + the provided `name` and a check will be
-performed to ensure that no variable with this full name exists yet.
-If a variable with this full name already exists, the funtion will
-raise a `ValueError`. If a new variable is created, it will be
-initialized to the value `initializer(shape)`. For example:
+In this case, `v` will be a newly created `tf.Variable` with the provided shape and data type. The full name of the created variable is the current variable scope name + the provided `name`, and a check is performed to ensure that no variable with this full name exists yet. If a variable with this full name already exists, the function will raise a `ValueError`. If a new variable is created, it will be initialized to the value `initializer(shape)`. For example:

 ```python
 with tf.variable_scope("foo"):
     v = tf.get_variable("v", [1])
 assert v.name == "foo/v:0"
 ```
+* Case 2: the scope is set for reusing variables, as evidenced by `tf.get_variable_scope().reuse == True`.

-* Case 2: the scope is set for reusing variables, as evidenced by
-`tf.get_variable_scope().reuse == True`.
-
-In this case, the call will search for an already existing variable with
-name equal to the current variable scope name + the provided `name`.
-If no such variable exists, a `ValueError` will be raised. If the variable
-is found, it will be returned.
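The two cases above can be mimicked with a small pure-Python sketch — illustrative stand-ins only, not TensorFlow internals: a dictionary keyed by the full variable name, plus a `reuse` flag that selects between creating and looking up.

```python
# Sketch of tf.get_variable()'s two cases; pure Python, not TensorFlow
# internals. `store` maps full variable names to values.
def get_variable(store, scope, name, shape, initializer, reuse=False):
    full_name = scope + "/" + name if scope else name
    if not reuse:
        # Case 1 (reuse == False): create a new variable; the full name
        # must not be taken yet.
        if full_name in store:
            raise ValueError("Variable %s already exists" % full_name)
        store[full_name] = initializer(shape)
    else:
        # Case 2 (reuse == True): look up an existing variable; the full
        # name must already exist.
        if full_name not in store:
            raise ValueError("Variable %s does not exist" % full_name)
    return store[full_name]

store = {}
zeros = lambda shape: [0.0] * shape[0]
v = get_variable(store, "foo", "v", [1], zeros)                # creates "foo/v"
v1 = get_variable(store, "foo", "v", [1], zeros, reuse=True)   # returns it
print(v1 is v)  # True
```

Creating the same full name a second time without `reuse=True` raises `ValueError`, which mirrors the accidental-sharing check described above.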
For example:
+In this case, the call will search for an already existing variable with name equal to the current variable scope name + the provided `name`. If no such variable exists, a `ValueError` will be raised. If the variable is found, it will be returned. For example:

 ```python
 with tf.variable_scope("foo"):
@@ -210,13 +153,8 @@ with tf.variable_scope("foo", reuse=True):
     assert v1 == v
 ```

-### Basics of `tf.variable_scope()`
-
-Knowing how `tf.get_variable()` works makes it easy to understand variable
-scope. The primary function of variable scope is to carry a name that will
-be used as prefix for variable names and a reuse-flag to distinguish the two
-cases described above. Nesting variable scopes appends their names in a way
-analogous to how directories work:
+### Basics of `tf.variable_scope()`
+
+Knowing how `tf.get_variable()` works makes it easy to understand variable scope. The primary function of variable scope is to carry a name that will be used as a prefix for variable names, together with a reuse flag that distinguishes the two cases described above. Nesting variable scopes appends their names in a way analogous to how directories work:

 ```python
 with tf.variable_scope("foo"):
@@ -224,10 +162,8 @@ with tf.variable_scope("foo"):
         v = tf.get_variable("v", [1])
 assert v.name == "foo/bar/v:0"
 ```
+The current variable scope can be retrieved using `tf.get_variable_scope()`, and the `reuse` flag of the current variable scope can be set to `True` by calling `tf.get_variable_scope().reuse_variables()`:

-The current variable scope can be retrieved using `tf.get_variable_scope()`
-and the `reuse` flag of the current variable scope can be set to `True` by
-calling `tf.get_variable_scope().reuse_variables()`:

 ```python
 with tf.variable_scope("foo"):
@@ -236,20 +172,10 @@ with tf.variable_scope("foo"):
     v1 = tf.get_variable("v", [1])
 assert v1 == v
 ```
+Note that you *cannot* set the `reuse` flag to `False`. The reason is to make it possible to compose functions that create models. Imagine you write a function `my_image_filter(inputs)` as before. Someone calling the function in a variable scope with `reuse=True` would expect all inner variables to be reused as well.

-Note that you *cannot* set the `reuse` flag to `False`. The reason behind
-this is to allow to compose functions that create models. Imagine you write
-a function `my_image_filter(inputs)` as before. Someone calling the function
-in a variable scope with `reuse=True` would expect all inner variables to be
-reused as well.
Allowing to force `reuse=False` inside the function would break
-this contract and make it hard to share parameters in this way.
+Even though you cannot set `reuse` to `False` explicitly, you can enter a reusing variable scope and then exit it, going back to a non-reusing one. This can be done by passing `reuse=True` when opening a variable scope. Note also that, for the same reason as above, the `reuse` parameter is inherited, so when you open a reusing variable scope, all its sub-scopes will be reusing too.

-Even though you cannot set `reuse` to `False` explicitly, you can enter
-a reusing variable scope and then exit it, going back to a non-reusing one.
-This can be done using a `reuse=True` parameter when opening a variable scope.
-Note also that, for the same reason as above, the `reuse` parameter is
-inherited. So when you open a reusing variable scope, all sub-scopes will
-be reusing too.

 ```python
 with tf.variable_scope("root"):
@@ -268,14 +194,8 @@ with tf.variable_scope("root"):
     assert tf.get_variable_scope().reuse == False
 ```

-### Capturing variable scope
-
-In all examples presented above, we shared parameters only because their
-names agreed, that is, because we opened a reusing variable scope with
-exactly the same string. In more complex cases, it might be useful to pass
-a VariableScope object rather than rely on getting the names right.
-To this end, variable scopes can be captured and used instead of names
-when opening a new variable scope.
+### Capturing variable scope
+
+In all the examples presented above, we shared parameters only because their names agreed, that is, because we opened a reusing variable scope with exactly the same string. In more complex cases, it might be useful to pass a VariableScope object rather than rely on getting the names right. To this end, variable scopes can be captured and used instead of names when opening a new variable scope.

 ```python
 with tf.variable_scope("foo") as foo_scope:
@@ -288,10 +208,8 @@ with tf.variable_scope(foo_scope, reuse=True)
     assert v1 == v
     assert w1 == w
 ```
+When opening a variable scope using a previously existing scope, we jump out of the current variable-scope prefix to an entirely different one. This is fully independent of where we do it.

-When opening a variable scope using a previously existing scope
-we jump out of the current variable scope prefix to an entirely
-different one. This is fully independent of where we do it.

 ```python
 with tf.variable_scope("foo") as foo_scope:
@@ -303,17 +221,9 @@ with tf.variable_scope("bar")
     assert foo_scope2.name == "foo"  # Not changed.
 ```

-### Initializers in variable scope
+### Initializers in variable scope
+
+Using `tf.get_variable()` lets you write functions that create or reuse variables and that can be transparently called from outside. But what if we wanted to change the initializer of the created variables? Do we need to pass an extra argument to every function that creates variables? And what about the most common case, when we want to set the default initializer for all variables in one place, on top of all functions? To help with these cases, a variable scope can carry a default initializer. It is inherited by sub-scopes and passed to each `tf.get_variable()` call, but it is overridden if another initializer is specified explicitly.

-Using `tf.get_variable()` allows to write functions that create or reuse
-variables and can be transparently called from outside. But what if we wanted
-to change the initializer of the created variables? Do we need to pass an extra
-argument to every function that creates variables? What about the most common
-case, when we want to set the default initializer for all variables in one
-place, on top of all functions? To help with these cases, variable scope
-can carry a default initializer. It is inherited by sub-scopes and passed
-to each `tf.get_variable()` call. But it will be overridden if another
-initializer is specified explicitly.

 ```python
 with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
@@ -329,22 +239,16 @@ with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
         assert v.eval() == 0.2  # Changed default initializer.
 ```

-### Names of ops in `tf.variable_scope()`
-
-We discussed how `tf.variable_scope` governs the names of variables.
-But how does it influence the names of other ops in the scope?
-It is natural that ops created inside a variable scope should also
-share that name. For this reason, when we do `with tf.variable_scope("name")`,
-this implicitly opens a `tf.name_scope("name")`.
For example:
+### Names of ops in `tf.variable_scope()`
+
+We discussed how `tf.variable_scope` governs the names of variables. But how does it influence the names of other ops in the scope? It is natural that ops created inside a variable scope should also share that name. For this reason, when we do `with tf.variable_scope("name")`, this implicitly opens a `tf.name_scope("name")`. For example:

 ```python
 with tf.variable_scope("foo"):
     x = 1.0 + tf.get_variable("v", [1])
 assert x.op.name == "foo/add"
 ```
-
-Name scopes can be openend in addition to a variable scope, and then
-they will only affect the names of the ops, but not of variables.
+Name scopes can be opened in addition to a variable scope, and then they will only affect the names of the ops, but not of the variables.

 ```python
 with tf.variable_scope("foo"):
@@ -354,19 +258,18 @@ with tf.variable_scope("foo"):
 assert v.name == "foo/v:0"
 assert x.op.name == "foo/bar/add"
 ```
+When opening a variable scope using a captured object instead of a string, we do not alter the current name scope for ops.

-When opening a variable scope using a captured object instead of a string,
-we do not alter the current name scope for ops.

-## Examples of Use
+## Examples of Use
+
+Here are pointers to a few files that make use of variable scope. In particular, it is heavily used for [recurrent neural networks](https://zh.wikipedia.org/wiki/%E9%80%92%E5%BD%92%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C) and sequence-to-sequence models.

-Here are pointers to a few files that make use of variable scope.
-In particular, it is heavily used for recurrent neural networks
-and sequence-to-sequence models.

 File | What's in it?
 --- | ---
-`models/image/cifar10.py` | Model for detecting objects in images.
-`models/rnn/rnn_cell.py` | Cell functions for recurrent neural networks.
-`models/rnn/seq2seq.py` | Functions for building sequence-to-sequence models.
+`models/image/cifar10.py` | Model for detecting objects in images.
+`models/rnn/rnn_cell.py` | Cell functions for recurrent neural networks.
+`models/rnn/seq2seq.py` | Functions for building sequence-to-sequence models.
+
+Original: [Sharing Variables](http://www.tensorflow.org/how_tos/variable_scope/index.md) Translation: [nb312](https://github.com/nb312)