Merge pull request #1494 from HexToString/fix_doc_0.7.0

ysl_change doc and examples
TeslaZhao authored Nov 14, 2021
2 parents b28501d + 25f54a6 commit 1f35ae3
Showing 241 changed files with 992 additions and 1,208 deletions.
10 changes: 5 additions & 5 deletions doc/cpp_server/ABTEST_IN_PADDLE_SERVING_CN.md → doc/C++Serving/ABTest_CN.md
100644 → 100755
@@ -1,10 +1,10 @@
 # How to do A/B testing with Paddle Serving

-(Simplified Chinese|[English](./ABTEST_IN_PADDLE_SERVING.md))
+(Simplified Chinese|[English](./ABTest_EN.md))

 This document uses a text classification task on the IMDB dataset as an example to show how to build an A/B testing setup with Paddle Serving. The client/server structure used in the example is shown in the figure below.

-<img src="images/abtest.png" style="zoom:33%;" />
+<img src="../images/abtest.png" style="zoom:33%;" />

 Note: A/B testing only works in RPC mode, not WEB mode.

@@ -24,13 +24,13 @@ pip install Shapely
 ````
 You can run the command below directly to process the data.

-[python abtest_get_data.py](../python/examples/imdb/abtest_get_data.py)
+[python abtest_get_data.py](../../examples/C++/imdb/abtest_get_data.py)

 The Python code in the file processes the data in `test_data/part-0` and writes the processed results to the `processed.data` file.

 ### Start the server side

-The server-side service is started here [using Docker](RUN_IN_DOCKER_CN.md).
+The server-side service is started here [using Docker](../RUN_IN_DOCKER_CN.md).

 First start the BOW server, which listens on port `8000`:
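The launch commands themselves are collapsed in this diff view. As a rough sketch only — the `imdb_bow_model` directory name follows the IMDB example's naming and is an assumption, not verbatim repo content — starting the BOW server inside the container might look like:

```bash
# Hypothetical launch of the BOW model server on port 8000
# (model directory name assumed from the IMDB example).
python -m paddle_serving_server.serve --model imdb_bow_model --port 8000 &
```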

@@ -62,7 +62,7 @@ exit

 You can run the command below directly to make A/B test predictions.

-[python abtest_client.py](../python/examples/imdb/abtest_client.py)
+[python abtest_client.py](../../examples/C++/imdb/abtest_client.py)

```python
from paddle_serving_client import Client
# ...
```
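The full client script is collapsed above. Below is a minimal sketch of an A/B test client; the variant names, ports, weights, and client-config path are assumptions based on the IMDB example, and the `add_variant`/`need_variant_tag` usage reflects the paddle_serving_client API as I understand it, so verify it against the repo:

```python
# Sketch of an A/B test client; variant names, endpoints, weights and
# the client-config path are assumptions based on the IMDB example.
import numpy as np
from paddle_serving_client import Client

client = Client()
client.load_client_config("imdb_bow_client_conf/serving_client_conf.prototxt")
# Split traffic between the BOW server (:8000) and the LSTM server (:9000).
client.add_variant("bow", ["127.0.0.1:8000"], 50)
client.add_variant("lstm", ["127.0.0.1:9000"], 50)
client.connect()

with open("processed.data") as f:
    for line in f:
        word_ids, label = line.split(";")
        word_ids = [int(x) for x in word_ids.split(",")]
        feed = {
            "words": np.array(word_ids).reshape(len(word_ids), 1),
            "words.lod": [0, len(word_ids)],
        }
        # need_variant_tag=True also returns which variant served the
        # request, so accuracy can be aggregated per variant.
        fetch_map, tag = client.predict(
            feed=feed, fetch=["prediction"], need_variant_tag=True, batch=True)
```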
10 changes: 5 additions & 5 deletions doc/cpp_server/ABTEST_IN_PADDLE_SERVING.md → doc/C++Serving/ABTest_EN.md
100644 → 100755
@@ -1,10 +1,10 @@
 # ABTEST in Paddle Serving

-([Simplified Chinese](./ABTEST_IN_PADDLE_SERVING_CN.md)|English)
+([Simplified Chinese](./ABTest_CN.md)|English)

 This document uses a text classification task based on the IMDB dataset to show how to build an A/B Test framework with Paddle Serving. The structural relationship between the client and the servers in the example is shown in the figure below.

-<img src="images/abtest.png" style="zoom:25%;" />
+<img src="../images/abtest.png" style="zoom:25%;" />

 Note: A/B testing is only applicable to RPC mode, not web mode.

@@ -25,13 +25,13 @@ pip install Shapely

 You can directly run the following command to process the data.

-[python abtest_get_data.py](../python/examples/imdb/abtest_get_data.py)
+[python abtest_get_data.py](../../examples/C++/imdb/abtest_get_data.py)

 The Python code in the file processes the data in `test_data/part-0` and writes the results to the `processed.data` file.

### Start Server

-Here, we [use docker](RUN_IN_DOCKER.md) to start the server-side service.
+Here, we [use docker](../RUN_IN_DOCKER.md) to start the server-side service.

First, start the BOW server, which listens on port `8000`:

@@ -63,7 +63,7 @@ Before running, use `pip install paddle-serving-client` to install the paddle-se

You can directly use the following command to make an A/B test prediction.

-[python abtest_client.py](../python/examples/imdb/abtest_client.py)
+[python abtest_client.py](../../examples/C++/imdb/abtest_client.py)

[//file]:#abtest_client.py
``` python
from paddle_serving_client import Client
# ...
```
53 changes: 53 additions & 0 deletions doc/C++Serving/Benchmark_CN.md
@@ -0,0 +1,53 @@
# C++ Serving vs TensorFlow Serving performance comparison
# 1. Test environment and notes
1) GPU model: Tesla P4 (7611 MiB)
2) CUDA version: 11.0
3) Model: ResNet_v2_50
4) To test the effect of asynchronous batch merging, the test data uses batch=1
5) [Test code and dataset used](../../examples/C++/PaddleClas/resnet_v2_50)
6) In the figures below, blue is C++ Serving and gray is TF-Serving.
7) The line charts show QPS; larger values mean more requests handled per second, i.e. better performance.
8) The bar charts show average latency; larger values mean a single request takes longer, i.e. worse performance.

# 2. Synchronous mode
Both servers run in synchronous mode with default parameter settings.


Under default settings in synchronous mode, C++ Serving outperforms TF-Serving on both QPS and average latency.

<p align="center">
    <br>
<img src='../images/syn_benchmark.png'>
    <br>
</p>

| client_num | model_name | qps(samples/s) | mean(ms) | model_name | qps(samples/s) | mean(ms) |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | pd-serving | 111.336 | 89.787 | tf-serving | 84.632 | 118.13 |
| 30 | pd-serving | 165.928 | 180.761 | tf-serving | 106.572 | 281.473 |
| 50 | pd-serving | 207.244 | 241.211 | tf-serving | 80.002 | 624.959 |
| 70 | pd-serving | 214.769 | 325.894 | tf-serving | 105.17 | 665.561 |
| 100 | pd-serving | 235.405 | 424.759 | tf-serving | 93.664 | 1067.619 |
| 150 | pd-serving | 239.114 | 627.279 | tf-serving | 86.312 | 1737.848 |

# 3. Asynchronous mode
Both servers run in asynchronous mode with max batch = 32 and 2 asynchronous threads.


In asynchronous mode the two perform similarly at low concurrency, but once client concurrency reaches 100 the TF-Serving service times out outright (its QPS drops to 0 in the table below), while C++ Serving still returns results normally.

Comparing the synchronous and asynchronous results also shows that, when request batch sizes are small, asynchronous batch merging effectively raises QPS and lowers average latency.
<p align="center">
    <br>
<img src='../images/asyn_benchmark.png'>
    <br>
</p>

| client_num | model_name | qps(samples/s) | mean(ms) | model_name | qps(samples/s) | mean(ms) |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | pd-serving | 130.631 | 76.502 | tf-serving | 172.64 | 57.916 |
| 30 | pd-serving | 201.062 | 149.168 | tf-serving | 241.669 | 124.128 |
| 50 | pd-serving | 286.01 | 174.764 | tf-serving | 278.744 | 179.367 |
| 70 | pd-serving | 313.58 | 223.187 | tf-serving | 298.241 | 234.7 |
| 100 | pd-serving | 323.369 | 309.208 | tf-serving | 0 | - |
| 150 | pd-serving | 328.248 | 456.933 | tf-serving | 0 | - |
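For reference, asynchronous mode in this benchmark corresponds to launching the server with batch merging enabled. A sketch of such a launch, assuming the 0.7.0 flag names `--runtime_thread_num` (async worker threads) and `--batch_infer_size` (maximum merged batch) — both should be checked against the serving CLI documentation:

```bash
# Hypothetical async-mode launch; flag names and the model directory
# are assumptions, not verbatim benchmark configuration.
python -m paddle_serving_server.serve \
    --model resnet_v2_50_serving --port 9393 --gpu_ids 0 \
    --runtime_thread_num 2 --batch_infer_size 32
```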
File renamed without changes.
8 changes: 4 additions & 4 deletions doc/cpp_server/CREATING.md → doc/C++Serving/Creat_C++Serving_CN.md
100644 → 100755
@@ -75,9 +75,9 @@ service ImageClassifyService {

 #### 2.2.2 Example configuration

-For details of the server-side configuration, see [Serving-side configuration](SERVING_CONFIGURE.md).
+For details of the server-side configuration, see [Serving-side configuration](../SERVING_CONFIGURE_CN.md).

-The following configuration file chains ReaderOP, ClassifyOP, and WriteJsonOP into a single workflow (for concepts such as OP and workflow, see the [design document](C++DESIGN_CN.md)).
+The following configuration file chains ReaderOP, ClassifyOP, and WriteJsonOP into a single workflow (for concepts such as OP and workflow, see the [OP introduction](OP_CN.md) and the [DAG introduction](DAG_CN.md)).

 - Example configuration file (the file body is collapsed in this diff view; an illustrative sketch follows):
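Purely as an illustration of the shape such a file takes — the node names and exact field layout below are assumptions from memory of C++ Serving's workflow.prototxt, not the verbatim repo file:

```protobuf
# Hypothetical workflow.prototxt chaining ReaderOp -> ClassifyOp -> WriteJsonOp.
workflows {
  name: "workflow1"
  workflow_type: "Sequence"
  nodes {
    name: "image_reader_op"
    type: "ReaderOp"
  }
  nodes {
    name: "image_classify_op"
    type: "ClassifyOp"
    dependencies {
      name: "image_reader_op"
      mode: "RO"
    }
  }
  nodes {
    name: "write_json_op"
    type: "WriteJsonOp"
    dependencies {
      name: "image_classify_op"
      mode: "RO"
    }
  }
}
```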

@@ -310,7 +310,7 @@ api.thrd_finalize();
api.destroy();
```

-For a concrete implementation, see the example shipped with Paddle Serving: sdk-cpp/demo/ximage.cpp
+For a concrete implementation, see the example shipped with C++ Serving: sdk-cpp/demo/ximage.cpp

### 3.3 Linking

@@ -392,4 +392,4 @@ predictors {
}
}
```
-For detailed client-side configuration options, see [CLIENT CONFIGURATION](CLIENT_CONFIGURE.md)
+For detailed client-side configuration options, see [CLIENT CONFIGURATION](Client_Configure_CN.md)
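The surrounding `predictors` block is collapsed in this diff. As a rough illustration only — field names and values are assumptions from memory of sdk-cpp's predictors.prototxt and may not match the repo exactly:

```protobuf
# Hypothetical client-side predictor definition with a weighted A/B split.
predictors {
  name: "ximage"
  service_name: "baidu.paddle_serving.predictor.image_classification.ImageClassifyService"
  endpoint_router: "WeightedRandomRender"
  weighted_random_render_conf {
    variant_weight_list: "50|50"
  }
  variants {
    tag: "var1"
    naming_conf {
      cluster: "list://127.0.0.1:8010"
    }
  }
  variants {
    tag: "var2"
    naming_conf {
      cluster: "list://127.0.0.1:8011"
    }
  }
}
```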
10 changes: 5 additions & 5 deletions doc/cpp_server/SERVER_DAG_CN.md → doc/C++Serving/DAG_CN.md
100644 → 100755
@@ -1,6 +1,6 @@
 # Computation graph on the server

-(Simplified Chinese|[English](./SERVER_DAG.md))
+(Simplified Chinese|[English](DAG_EN.md))

 This document explains the concept of the computation graph on the server: how to define a computation graph with PaddleServing built-in operators, along with some examples of sequential execution logic.

@@ -9,7 +9,7 @@
 Deep neural networks usually have preprocessing steps on the input data and postprocessing steps on the inference scores. Since deep learning frameworks are now very flexible, preprocessing and postprocessing can be done outside the training computation graph. To preprocess input data and postprocess inference results on the server side, the corresponding computation logic has to be added on the server. Moreover, if a user wants to run inference on multiple models with the same input, the best approach is to run those inferences concurrently on the server from a single client request, which saves network overhead. For these two reasons, a directed acyclic graph (DAG) is the natural choice as the main computation method for server-side inference. An example DAG is shown below:

<center>
-<img src='images/server_dag.png' width = "450" height = "500" align="middle"/>
+<img src='../images/server_dag.png' width = "450" height = "500" align="middle"/>
</center>

 ## How to define a node

@@ -18,7 +18,7 @@

 PaddleServing ships with some predefined computation nodes. A very common computation graph is the simple reader-infer-response pattern, which covers most single-model inference scenarios. An example graph and the corresponding DAG definition code are shown below.
 <center>
-<img src='images/simple_dag.png' width = "260" height = "370" align="middle"/>
+<img src='../images/simple_dag.png' width = "260" height = "370" align="middle"/>
 </center>

``` python
# ...
```
@@ -47,10 +47,10 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po

 ### Nodes with multiple inputs

-The [model ensemble in Paddle Serving](./deprecated/MODEL_ENSEMBLE_IN_PADDLE_SERVING_CN.md) document gives an example containing multiple input nodes; the diagram and code are as follows.
+The [model ensemble in Paddle Serving](Model_Ensemble_CN.md) document gives an example containing multiple input nodes; the diagram and code are as follows.

 <center>
-<img src='images/complex_dag.png' width = "480" height = "400" align="middle"/>
+<img src='../images/complex_dag.png' width = "480" height = "400" align="middle"/>
 </center>

```python
# ...
```
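The DAG definition snippets are collapsed in this diff view. Below is a minimal sketch of the reader-infer-response sequence; the OpMaker/OpSeqMaker usage is quoted from memory of the paddle_serving_server API and should be verified against the repo's examples (the uci_housing model is taken from the command in the hunk header above):

```python
# Sketch of a reader -> infer -> response DAG on the server side.
import paddle_serving_server as serving

op_maker = serving.OpMaker()
read_op = op_maker.create('general_reader')          # parse the client request
general_infer_op = op_maker.create('general_infer')  # run model inference
response_op = op_maker.create('general_response')    # pack the response

op_seq_maker = serving.OpSeqMaker()
op_seq_maker.add_op(read_op)
op_seq_maker.add_op(general_infer_op)
op_seq_maker.add_op(response_op)

server = serving.Server()
server.set_op_sequence(op_seq_maker.get_op_sequence())
server.load_model_config("uci_housing_model")
server.prepare_server(workdir="workdir", port=9292, device="cpu")
server.run_server()
```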
10 changes: 5 additions & 5 deletions doc/cpp_server/SERVER_DAG.md → doc/C++Serving/DAG_EN.md
100644 → 100755
@@ -1,6 +1,6 @@
 # Computation Graph On Server

-([Simplified Chinese](./SERVER_DAG_CN.md)|English)
+([Simplified Chinese](./DAG_CN.md)|English)

 This document introduces the concept of the computation graph on the server: how to define a computation graph with PaddleServing built-in operators, along with examples of some sequential execution logic.

@@ -9,7 +9,7 @@ This document shows the concept of computation graph on server. How to define co
 Deep neural nets often have some preprocessing steps on input data, and postprocessing steps on model inference scores. Since deep learning frameworks are now very flexible, it is possible to do preprocessing and postprocessing outside the training computation graph. If we want to do input data preprocessing and inference result postprocessing on the server side, we have to add the corresponding computation logic on the server. Moreover, if a user wants to do inference with the same inputs on more than one model, the best way is to do the inference concurrently on the server side given only one client request, so that we can save some network overhead. For the above two reasons, it is natural to think of a Directed Acyclic Graph (DAG) as the main computation method for server inference. One example of a DAG is as follows:

<center>
-<img src='images/server_dag.png' width = "450" height = "500" align="middle"/>
+<img src='../images/server_dag.png' width = "450" height = "500" align="middle"/>
</center>

 ## How to define a Node

@@ -19,7 +19,7 @@ Deep neural nets often have some preprocessing steps on input data, and postproc
 PaddleServing has some predefined Computation Nodes in the framework. A very commonly used computation graph is the simple reader-inference-response mode, which can cover most single-model inference scenarios. An example graph and the corresponding DAG definition code are as follows.

 <center>
-<img src='images/simple_dag.png' width = "260" height = "370" align="middle"/>
+<img src='../images/simple_dag.png' width = "260" height = "370" align="middle"/>
 </center>

``` python
# ...
```
@@ -48,10 +48,10 @@ python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --po

 ### Nodes with multiple inputs

-An example containing multiple input nodes is given in [MODEL_ENSEMBLE_IN_PADDLE_SERVING](./deprecated/MODEL_ENSEMBLE_IN_PADDLE_SERVING.md). An example graph and the corresponding DAG definition code are as follows.
+An example containing multiple input nodes is given in [Model_Ensemble](Model_Ensemble_EN.md). An example graph and the corresponding DAG definition code are as follows.

 <center>
-<img src='images/complex_dag.png' width = "480" height = "400" align="middle"/>
+<img src='../images/complex_dag.png' width = "480" height = "400" align="middle"/>
 </center>

```python
# ...
```
7 changes: 3 additions & 4 deletions doc/cpp_server/ENCRYPTION_CN.md → doc/C++Serving/Encryption_CN.md
100644 → 100755
@@ -1,6 +1,6 @@
 # Encrypted model inference

-(Simplified Chinese|[English](ENCRYPTION.md))
+(Simplified Chinese|[English](Encryption_EN.md))

 Paddle Serving provides encrypted model inference; this document describes it in detail.

@@ -12,7 +12,7 @@ Paddle Serving provides encrypted model inference; this document describes it in

 An ordinary model and its parameters can be viewed as a string; by applying an encryption algorithm to them (with your key as the parameter), they become an encrypted model and parameters.

-We provide a simple demo for encrypting a model; see [`python/examples/encryption/encrypt.py`](../python/examples/encryption/encrypt.py).
+We provide a simple demo for encrypting a model; see [examples/C++/encryption/encrypt.py](../../examples/C++/encryption/encrypt.py).


 ### Start the encryption service
@@ -40,5 +40,4 @@ python -m paddle_serving_server.serve --model encrypt_server/ --port 9300 --use_


 ### Example of encrypted model inference
-For an example of encrypted model inference, see [`/python/examples/encryption/`](../python/examples/encryption/)
-
+For an example of encrypted model inference, see [examples/C++/encryption/](../../examples/C++/encryption/)
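The demo script itself is collapsed in this diff. Below is a rough sketch of what the encryption step looks like, assuming the `inference_model_to_serving` helper in paddle_serving_client.io and its `encryption` flag — both quoted from memory and worth verifying against encrypt.py:

```python
# Sketch of converting a trained model into an encrypted serving model.
# Function name, parameters, and paths are assumptions, not verbatim code.
from paddle_serving_client.io import inference_model_to_serving

inference_model_to_serving(
    dirname="./uci_housing_model",    # trained inference model directory
    serving_server="encrypt_server",  # output: encrypted server-side model
    serving_client="encrypt_client",  # output: client-side config
    encryption=True)                  # encrypt params and emit a key file
```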
7 changes: 3 additions & 4 deletions doc/cpp_server/ENCRYPTION.md → doc/C++Serving/Encryption_EN.md
100644 → 100755
@@ -1,6 +1,6 @@
 # MODEL ENCRYPTION INFERENCE

-([Simplified Chinese](ENCRYPTION_CN.md)|English)
+([Simplified Chinese](Encryption_CN.md)|English)

 Paddle Serving provides model encryption inference; this document shows the details.

@@ -12,7 +12,7 @@ We use symmetric encryption algorithm to encrypt the model. Symmetric encryption

 A normal model and its parameters can be understood as a string; by applying the encryption algorithm to them (with your key as the parameter), they become an encrypted model and parameters.

-We provide a simple demo to encrypt the model. See [python/examples/encryption/encrypt.py](../python/examples/encryption/encrypt.py)
+We provide a simple demo to encrypt the model. See [examples/C++/encryption/encrypt.py](../../examples/C++/encryption/encrypt.py)


### Start Encryption Service
@@ -40,5 +40,4 @@ Once the server gets the key, it uses the key to parse the model and starts the


 ### Example of Model Encryption Inference
-For an example of model encryption inference, see [`/python/examples/encryption/`](../python/examples/encryption/)
-
+For an example of model encryption inference, see [examples/C++/encryption/](../../examples/C++/encryption/)