# Design Of Refactor Topology #1665
### Overview

* When registering a Layer, register not only the Layer's C++ type but also the Layer's metadata, and express that metadata in Protobuf.

> **Review comment:** Why is the metadata described with Protobuf?

* Generate each Layer's metadata with a global static function. The code generator reads this metadata from the Layer to generate the config parser (ConfigParser).
* Move neural-network shape inference (how large each parameter is, how large the output is) into the Paddle C++ Core.
### Layer metadata

Paddle registers metadata for **every kind of** Layer on the C++ side and declares that metadata as Protobuf messages.

There are two main pieces of metadata:

#### LayerDef

* `LayerDef` describes the metadata of each **kind** of Layer. It contains the Layer's type name, its documentation, the input types it can accept, its parameter types, and the Layer's other attributes. It does not include what types the Layer outputs.
* Note that this is **metadata**: a `LayerDef` describes one **kind** of `Layer`, not the concrete parameters of one particular `Layer`.
* Likewise, the `ArgumentDef` used inside `LayerDef` describes **the type of one kind of input argument**, not what one concrete input is; `AttributeDef` describes the **type** of an attribute (Attribute), not the attribute's concrete value.
* The LayerDef of a fully connected layer (FullyConnected, abbreviated FC below) might be:
```json
{
    "type": "fc",
    "description": "Fully Connected Layer is the simplest layer in neural network. ...",
    "inputs": [
        {
            "name": "input",
            "description": "The input of fully connected layer, could be several.",
            "data_type": ["Dense", "Sparse", "SparseInt", "Int"],
            "seq_nested_level": [0, 1, 2],
            "repeatable": true
        }
    ],
    "parameter_attr": [
        {
            "attributes": [{
                "name": "weight_decay",
                "type": "float",
                "description": "The weight decay rate of parameter, used to implement L2 Norm",
                "default_value": 0.0,
                "max_value": 1.0,
                "min_value": 0.0
            }, {
                "name": "gradient_clipping",
                "type": "float",
                "description": "The gradient clipping threshold",
                "default_value": 0.0,
                "min_value": 0.0
            }]
        }
    ],
    "bias_attr": {
        "attributes": [{
            "name": "weight_decay",
            "type": "float",
            "description": "The weight decay rate of parameter, used to implement L2 Norm",
            "default_value": 0.0,
            "max_value": 1.0,
            "min_value": 0.0
        }]
    },
    "layer_attr": [
        {
            "name": "dropout_rate",
            "type": "float",
            "description": "The dropout rate of this layer",
            "default_value": 0.0,
            "max_value": 1.0,
            "min_value": 0.0
        }
    ]
}
```

> **Review comment** (on `parameter_attr`): Can this inherit the base layer's metadata? That would simplify the description of `parameter_attr` and the like.
>
> **Author reply:** This Protobuf is generated in the Paddle C++ part, so although there is duplicated information here, the C++ side can use one shared function to generate the Protobuf data.
>
> **Review comment:** Understood. If we only modify or add some parameters of an already-defined Layer, is it enough to modify or add the corresponding metadata?
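The author's reply above says the duplicated attribute blocks can be produced by one shared C++ function. Here is a minimal sketch of what such a helper might look like, assuming generated Protobuf classes whose field names match the JSON above; the header name and the `AttributeDefs` wrapper message are assumptions for illustration, not Paddle's actual API:

```cpp
#include "LayerMeta.pb.h"  // hypothetical header for the generated messages

namespace paddle {

// Append the common "weight_decay" attribute to an attribute container, so
// parameter_attr and bias_attr need not duplicate it by hand.
void addWeightDecayAttr(AttributeDefs* attrs) {
  auto* attr = attrs->add_attributes();
  attr->set_name("weight_decay");
  attr->set_type("float");
  attr->set_description(
      "The weight decay rate of parameter, used to implement L2 Norm");
  attr->set_default_value(0.0);
  attr->set_max_value(1.0);
  attr->set_min_value(0.0);
}

}  // namespace paddle
```

The FC layer's metadata builder would then call `addWeightDecayAttr` once for its parameter attributes and once for its bias attributes, keeping the duplication in one place.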
#### LayerOutputType

* `LayerOutputType` describes what the concrete input and output types of one particular Layer are (not what the concrete values are). It is computed at runtime.

> **Review comment:** Need to consider the possibility of multiple outputs.
>
> **Reply:** The current LstmStepLayer actually has 2 outputs; the second output is obtained through another GetOutputLayer.
>
> **Reply:** Agreed that multiple outputs need to be supported, so that when configuring a network some of them can be connected to different layers.
>
> **Author reply:** Yes, so …

* The LayerOutputType of a particular FC Layer might be:

```json
{
    "type": "Dense",
    "size": 200,
    "seq_nested_level": 2
}
```

#### Protobuf definition of the Layer metadata

Below is the Protobuf definition of the Layer metadata.
```protobuf
enum DataType {
    // ... (definition elided in this diff) ...
}

message LayerDef {
    // ... (fields elided in this diff) ...
}

// Define the layer's output types by given input types.
message LayerOutputType {
    // Output name. Each Paddle Layer could have multiple outputs.
    optional string name = 1;

    // ... (remaining fields elided in this diff) ...
}
```

> **Review comment** (on the `name` field): Paddle currently has only one output per layer. My understanding is that a layer's inputs and outputs are the same type, `ArgumentDef`. Here I only see … ; I don't see what type the layer's output is. Is it also `ArgumentDef`?
>
> **Author reply:** The Layer's output is not specified in the Layer's metadata, because each Layer's output shape is computed while the config is parsed; it is not something the metadata can prescribe. That is why there is no `outputs` field in `LayerDef`.
### Exposing the LayerDef/LayerOutputType Protobuf from the C++ side

The basic idea:

* For every kind of Layer, Paddle fixes the names of two global functions by convention, derived from the Layer's name. For the FC Layer, for example, the global functions are named `__get_fc_layer_definition__` and `__get_fc_layer_output_type__`. Both are generated automatically by `REGISTER_LAYER`, as sketched below.
* When implementing each Layer, provide two static (`static`) member functions that implement these two global functions respectively.
* Besides returning the LayerOutputType, the second function also completes the **neural-network shape-inference** pass: at runtime it sets the ParameterSize, dynamically adds auxiliary inputs to the Layer, and so on.
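As a minimal sketch of the convention in the first bullet, such a macro could stitch the pieces together roughly as follows. The macro body is an assumption for illustration, not Paddle's actual `REGISTER_LAYER`; the real macro would also register the C++ class with the layer factory, and the static member names are taken from the FCLayer example below:

```cpp
// Hypothetical sketch: generate the two conventionally named global
// functions by forwarding to the static members each Layer implements.
#define REGISTER_LAYER(name, ClassName)                                  \
  LayerDef __get_##name##_layer_definition__() {                         \
    return ClassName::getLayerDefinition();                              \
  }                                                                      \
  std::vector<LayerOutputType> __get_##name##_layer_output_type__(       \
      const std::vector<LayerOutputType>& inputs, LayerConfig& self) {   \
    return ClassName::getLayerOutputType(inputs, self);                  \
  }
```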
For FCLayer, for example, a possible implementation is:
```cpp
// ... (beginning of the FCLayer example elided in this diff) ...
        .addDoc("FC Layer is fully connected. Blah blah blah...");
  }

  static std::vector<LayerOutputType> getLayerOutputType(
      const std::vector<LayerOutputType>& inputs, LayerConfig& self) {
    // self could be modified, for calculating parameter size, etc.
    LayerOutputType out;
    out.set_size(self.size());
    out.set_type(InputType::Dense);
    // ... (remaining lines elided in this diff) ...
  }
};

REGISTER_LAYER(fc, FCLayer);
```

> **Review comment** (on `getLayerOutputType`): need to check whether inputs are consistent (e.g., whether dimensions and types match).
![Config parsing flow](http://api.paddlepaddle.org/graphviz?dot=https://gist.githubusercontent.com/reyoung/0a3d7bfb44e45d61d7bd80b26ca18fbc/raw/4177e2ca56f0410a65338a089cf4e37b9bb87c93/gistfile1.txt)
1. Read the metadata (LayerDef) of every Layer from the Paddle Core.
1. Generate the parser, ConfigParser, from all the Layers' metadata (the LayerDefs).
    * How the parser is generated is a process each language defines for itself.
    * It can be an offline process: first write every Layer's LayerDef to a file, then let the other language read that file and generate code.
    * It can also be an online process: for a dynamically typed language such as Python, generating functions at runtime is simple enough that there is no need to generate code first and then create the functions.

> **Review comment:** Here the config_parser is generated before the user-defined network structure is read, so the metadata of every layer has to be traversed once.
>
> **Author reply:** That also works; this step could be lazy, i.e., when the user calls a function that Python cannot find, the code generator generates that function on demand. Still, for generating Sphinx documentation it probably makes more sense to generate the config_parser before parsing any config: only then can Sphinx discover all supported Layers and generate their documentation.

1. Use the ConfigParser to parse the user's config file `trainer_config.conf`.
    * At this stage the parser returns only a call graph, i.e., the calling relationships between Layers (a `Graph Protobuf`), not the real `ModelConfig`.
    * This Graph Protobuf is very simple; it only records which Layer was invoked and which Attributes were set.
1. Pass this call graph to the Paddle Core to generate the real `ModelConfig` (a sketch of this step follows the list).
    * For every entry in the `GraphProtobuf`, generate a LayerConfig.
    * Then call `getLayerOutputType` in sequence to obtain each Layer's output and complete the neural-network shape-inference pass, and pass the resulting LayerConfig on to the next Layer.

> **Review comment:** Is the shape-inference pass done when the GradientMachine is initialized?

> **Review comment:** Layer ==> layer
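As referenced in step 4 above, here is a minimal sketch of the driver loop in the Paddle Core. Every type and helper name used here (`GraphProtobuf`, `copyAttributes`, `getLayerOutputTypeFn`) is an assumption for illustration, not Paddle's actual API:

```cpp
#include <map>
#include <string>
#include <vector>

// Sketch of step 4: walk the parsed call graph, build one LayerConfig per
// node, and run shape inference layer by layer.
ModelConfig buildModelConfig(const GraphProtobuf& graph) {
  ModelConfig model;
  // Output types computed so far, keyed by layer name.
  std::map<std::string, LayerOutputType> outputs;

  for (const auto& node : graph.nodes()) {
    LayerConfig* self = model.add_layers();
    self->set_name(node.name());
    self->set_type(node.type());
    copyAttributes(node, self);  // attributes set in trainer_config.conf

    // Collect the already-computed output types of this node's inputs.
    std::vector<LayerOutputType> inputTypes;
    for (const auto& in : node.inputs()) {
      inputTypes.push_back(outputs.at(in));
    }

    // Dispatch to the registered __get_<type>_layer_output_type__ function.
    // It may modify *self: set parameter sizes, add auxiliary inputs, etc.
    std::vector<LayerOutputType> outTypes =
        getLayerOutputTypeFn(node.type())(inputTypes, *self);
    outputs[node.name()] = outTypes.front();  // single-output case, for brevity
  }
  return model;
}
```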