Commit fa52ca2

Merge pull request #46 from PaddlePaddle/develop

Sync repo

TeslaZhao authored May 10, 2021
2 parents 6be73cb + 50bbbbe commit fa52ca2
Showing 21 changed files with 158 additions and 90 deletions.
@@ -59,7 +59,7 @@ message SimpleResponse { required int32 err_code = 1; }

message GetClientConfigRequest {}

-message GetClientConfigResponse { repeated string client_config_str_list = 1; }
+message GetClientConfigResponse { required string client_config_str = 1; }

service MultiLangGeneralModelService {
rpc Inference(InferenceRequest) returns (InferenceResponse) {}
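
The response message now carries a single text-format config string instead of a repeated list. From a client's point of view, the new handshake looks roughly like the sketch below; the stub class name follows standard gRPC codegen for the service above, while the module paths and endpoint address are assumptions (the pb2 module name matches the one used in python/paddle_serving_client/client.py further down).

```python
# Minimal sketch of calling GetClientConfig after this change.
# Assumptions: package paths and the endpoint address.
import grpc
from paddle_serving_client.proto import multi_lang_general_model_service_pb2 as pb2
from paddle_serving_client.proto import multi_lang_general_model_service_pb2_grpc as pb2_grpc

channel = grpc.insecure_channel("127.0.0.1:9393")
stub = pb2_grpc.MultiLangGeneralModelServiceStub(channel)
resp = stub.GetClientConfig(pb2.GetClientConfigRequest())
# Before this commit: resp.client_config_str_list (repeated string)
# After this commit:  resp.client_config_str (one text-format config string)
print(resp.client_config_str)
```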
2 changes: 1 addition & 1 deletion doc/COMPILE.md
@@ -153,7 +153,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DOPENCV_DIR=${OPENCV_DIR} \
--DWITH_OPENCV=ON
+-DWITH_OPENCV=ON \
-DSERVER=ON ..
make -j10
```
2 changes: 1 addition & 1 deletion doc/COMPILE_CN.md
@@ -152,7 +152,7 @@ cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR/ \
-DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
-DOPENCV_DIR=${OPENCV_DIR} \
--DWITH_OPENCV=ON
+-DWITH_OPENCV=ON \
-DSERVER=ON ..
make -j10
```
28 changes: 15 additions & 13 deletions doc/PADDLE_SERVING_ON_KUBERNETES.md
@@ -25,10 +25,10 @@ kubectl apply -f https://bit.ly/kong-ingress-dbless
The script is provided at `tools/generate_runtime_docker.sh`, and it is used as follows:

```bash
-bash tool/generate_runtime_docker.sh --env cuda10.1 --python 2.7 --serving 0.5.0 --paddle 2.0.0 --name serving_runtime:cuda10.1-py27
+bash tool/generate_runtime_docker.sh --env cuda10.1 --python 3.6 --serving 0.6.0 --paddle 2.0.1 --name serving_runtime:cuda10.1-py36
```

-This generates a runtime image with CUDA 10.1, Python 2.7, Serving 0.5.0, and Paddle 2.0.0. If you have other questions, run the following command for help information.
+This generates a runtime image with CUDA 10.1, Python 3.6, Serving 0.6.0, and Paddle 2.0.1. If you have other questions, run the following command for help information.

```
bash tools/generate_runtime_docker.sh --help
@@ -39,7 +39,7 @@ bash tools/generate_runtime_docker.sh --help
- paddle-serving-server, paddle-serving-client, paddle-serving-app, paddlepaddle; the exact versions can be checked in tools/runtime.dockerfile, which can also be customized if you have special requirements.
- the paddle-serving-server binary executable

-In other words, once the runtime image has been generated, we only need to move our code (if any) and our models into the image. The generated image is named `paddle_serving:cuda10.2-py37`
+In other words, once the runtime image has been generated, we only need to move our code (if any) and our models into the image. The generated image is named `paddle_serving:cuda10.2-py36`

### Add your code and models

@@ -50,8 +50,8 @@ bash tools/generate_runtime_docker.sh --help
For pipeline mode, we need to make sure that the model, program files, configuration files, and other dependencies can all run inside the image. We can therefore store our executable files under `/home/project`. Here we take `Serving/python/example/pipeline/ocr`, an OCR text recognition task, as an example.

```bash
-# Assume you already have a Serving runtime image named paddle_serving:cuda10.2-py37
-docker run --rm -dit --name pipeline_serving_demo paddle_serving:cuda10.2-py37 bash
+# Assume you already have a Serving runtime image named paddle_serving:cuda10.2-py36
+docker run --rm -dit --name pipeline_serving_demo paddle_serving:cuda10.2-py36 bash
cd Serving/python/example/pipeline/ocr
# get models
python -m paddle_serving_app.package --get_model ocr_rec
@@ -71,7 +71,7 @@ docker commit pipeline_serving_demo ocr_serving:latest
```
docker exec -it pipeline_serving_demo bash
cd /home/ocr
-python3.7 web_service.py
+python3.6 web_service.py
```

After entering the container and moving to the project directory, the remaining steps are similar to an ordinary code-debugging workflow.
@@ -83,8 +83,8 @@ python3.7 web_service.py
Web service mode is essentially similar to pipeline mode, so we take `Serving/python/examples/bert` as an example

```bash
-# Assume you already have a Serving runtime image named registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-py37
-docker run --rm -dit --name webservice_serving_demo registry.baidubce.com/paddlepaddle/serving:0.6.0-cpu-py27 bash
+# Assume you already have a Serving runtime image named registry.baidubce.com/paddlepaddle/serving:0.6.0-cuda10.2-py36
+docker run --rm -dit --name webservice_serving_demo registry.baidubce.com/paddlepaddle/serving:0.6.0-cpu-py36 bash
cd Serving/python/examples/bert
### download model
wget https://paddle-serving.bj.bcebos.com/paddle_hub_models/text/SemanticModel/bert_chinese_L-12_H-768_A-12.tar.gz
@@ -102,7 +102,7 @@ docker commit webservice_serving_demo bert_serving:latest
```bash
docker exec -it webservice_serving_demo bash
cd /home/bert
-python3.7 bert_web_service.py 9292
+python3.6 bert_web_service.py bert_seq128_model 9292
```

After entering the container and moving to the project directory, the remaining steps are similar to an ordinary code-debugging workflow.
@@ -118,14 +118,15 @@ Kubernetes cluster operations require `kubectl` to work with YAML files; here we provide
- pipeline OCR example

```bash
-sh tools/generate_k8s_yamls.sh --app_name ocr --image_name registry.baidubce.com/paddlepaddle/serving:k8s-pipeline-demo --workdir /home/ocr --command "python2.7 web_service.py" --port 9999
+sh tools/generate_k8s_yamls.sh --app_name ocr --image_name registry.baidubce.com/paddlepaddle/serving:k8s-pipeline-demo --workdir /home/ocr --command "python3.6 web_service.py" --port 9999
```

- web service BERT example

```bash
-sh tools/generate_k8s_yamls.sh --app_name bert --image_name registry.baidubce.com/paddlepaddle/serving:k8s-web-demo --workdir /home/bert --command "python2.7 bert_web_service.py 9292" --port 9292
+sh tools/generate_k8s_yamls.sh --app_name bert --image_name registry.baidubce.com/paddlepaddle/serving:k8s-web-demo --workdir /home/bert --command "python3.6 bert_web_service.py bert_seq128_model 9292" --port 9292
```
+**Note that app_name must match the function name in the URL. For example, the access URL for bert in this example is `https://127.0.0.1:9292/bert/prediction`, so app_name should be bert.**

Next we will see two YAML files, `k8s_serving.yaml` and `k8s_ingress.yaml`.

@@ -174,7 +175,7 @@ spec:
        workingDir: /home/ocr
        name: ocr
        command: ['/bin/bash', '-c']
-        args: ["python3.7 web_service.py"]
+        args: ["python3.6 bert_web_service.py bert_seq128_model 9292"]
        env:
        - name: NODE_NAME
          valueFrom:
@@ -216,7 +217,8 @@ spec:
Finally, we execute the following to start the containers and the API gateway.
```
-kubectl apply -f k8s_serving.yaml k8s_ingress.yaml
+kubectl apply -f k8s_serving.yaml
+kubectl apply -f k8s_ingress.yaml
```

Enter
3 changes: 3 additions & 0 deletions python/examples/bert/README_CN.md
@@ -94,4 +94,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "hello"}]
bash benchmark.sh bert_seq128_model bert_seq128_client
```
The log file of the performance test is profile_log_bert_seq128_model

+To change the parameters of the performance test case, edit the configuration in benchmark.sh.

Note: do not append a '/' to the bert_seq128_model and bert_seq128_client paths; this example must run on a GPU machine.
15 changes: 9 additions & 6 deletions python/examples/bert/benchmark.sh
@@ -17,27 +17,30 @@ sleep 5

#warm up
$PYTHONROOT/bin/python3 benchmark.py --thread 4 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
echo -e "import psutil\ncpu_utilization=psutil.cpu_percent(1,False)\nprint('CPU_UTILIZATION:', cpu_utilization)\n" > cpu_utilization.py
echo -e "import psutil\nimport time\nwhile True:\n\tcpu_res = psutil.cpu_percent()\n\twith open('cpu.txt', 'a+') as f:\n\t\tf.write(f'{cpu_res}\\\n')\n\ttime.sleep(0.1)" > cpu.py
for thread_num in 1 4 8 16
do
for batch_size in 1 4 16 64
do
job_bt=`date '+%Y%m%d%H%M%S'`
-nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_use.log 2>&1 &
+nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_memory_use.log 2>&1 &
nvidia-smi --id=0 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
+rm -rf cpu.txt
+$PYTHONROOT/bin/python3 cpu.py &
gpu_memory_pid=$!
$PYTHONROOT/bin/python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
-kill ${gpu_memory_pid}
-kill `ps -ef|grep used_memory|awk '{print $2}'`
+kill `ps -ef|grep used_memory|awk '{print $2}'` > /dev/null
+kill `ps -ef|grep utilization.gpu|awk '{print $2}'` > /dev/null
+kill `ps -ef|grep cpu.py|awk '{print $2}'` > /dev/null
echo "model_name:" $1
echo "thread_num:" $thread_num
echo "batch_size:" $batch_size
echo "=================Done===================="
echo "model_name:$1" >> profile_log_$1
echo "batch_size:$batch_size" >> profile_log_$1
-$PYTHONROOT/bin/python3 cpu_utilization.py >> profile_log_$1
job_et=`date '+%Y%m%d%H%M%S'`
-awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_use.log >> profile_log_$1
+awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "CPU_UTILIZATION:", max}' cpu.txt >> profile_log_$1
+awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_memory_use.log >> profile_log_$1
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$1
rm -rf gpu_use.log gpu_utilization.log
$PYTHONROOT/bin/python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
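
For readability, the sampler that the `echo -e` line above writes to `cpu.py` unescapes to the following script; it appends one system-wide CPU utilization reading to `cpu.txt` every 0.1 s until the benchmark loop kills it:

```python
# Contents of cpu.py as generated by the echo -e line in benchmark.sh.
import psutil
import time

while True:
    # System-wide CPU utilization (percent) since the previous call.
    cpu_res = psutil.cpu_percent()
    with open('cpu.txt', 'a+') as f:
        f.write(f'{cpu_res}\n')
    time.sleep(0.1)
```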
3 changes: 3 additions & 0 deletions python/examples/fit_a_line/README_CN.md
@@ -49,4 +49,7 @@ curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1
bash benchmark.sh uci_housing_model uci_housing_client
```
The log file of the performance test is profile_log_uci_housing_model

+To change the parameters of the performance test case, edit the configuration in benchmark.sh.

Note: do not append a '/' to the uci_housing_model and uci_housing_client paths; this example must run on a GPU machine.
11 changes: 9 additions & 2 deletions python/examples/fit_a_line/benchmark.py
@@ -30,28 +30,35 @@ def single_func(idx, resource):
            paddle.dataset.uci_housing.train(), buf_size=500),
        batch_size=1)
    total_number = sum(1 for _ in train_reader())
+    latency_list = []

    if args.request == "rpc":
        client = Client()
        client.load_client_config(args.model)
        client.connect([args.endpoint])
        start = time.time()
        for data in train_reader():
+            l_start = time.time()
            fetch_map = client.predict(feed={"x": data[0][0]}, fetch=["price"])
+            l_end = time.time()
+            latency_list.append(l_end * 1000 - l_start * 1000)
        end = time.time()
-        return [[end - start], [total_number]]
+        return [[end - start], latency_list, [total_number]]
    elif args.request == "http":
        train_reader = paddle.batch(
            paddle.reader.shuffle(
                paddle.dataset.uci_housing.train(), buf_size=500),
            batch_size=1)
        start = time.time()
        for data in train_reader():
+            l_start = time.time()
            r = requests.post(
                'http://{}/uci/prediction'.format(args.endpoint),
                data={"x": data[0]})
+            l_end = time.time()
+            latency_list.append(l_end * 1000 - l_start * 1000)
        end = time.time()
-        return [[end - start], [total_number]]
+        return [[end - start], latency_list, [total_number]]


start = time.time()
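
Each per-request latency is recorded in milliseconds, so `single_func` now returns the full latency distribution alongside the total wall time. A hypothetical aggregation helper (not part of this commit) shows how the extra return value could be consumed:

```python
# Hypothetical helper, not in this commit: summarize the latency_list
# that single_func now returns alongside the total elapsed time.
import numpy as np

def summarize_latency(latency_list):
    arr = np.asarray(latency_list, dtype=float)
    return {
        "mean_ms": float(arr.mean()),
        "p50_ms": float(np.percentile(arr, 50)),
        "p99_ms": float(np.percentile(arr, 99)),
    }
```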
55 changes: 55 additions & 0 deletions python/examples/fit_a_line/benchmark.sh
@@ -0,0 +1,55 @@
rm profile_log*
export CUDA_VISIBLE_DEVICES=0,1
export FLAGS_profile_server=1
export FLAGS_profile_client=1
export FLAGS_serving_latency=1

gpu_id=0
#save cpu and gpu utilization log
if [ -d utilization ];then
rm -rf utilization
else
mkdir utilization
fi
#start server
$PYTHONROOT/bin/python3 -m paddle_serving_server.serve --model $1 --port 9292 --thread 4 --gpu_ids 0,1 --mem_optim --ir_optim > elog 2>&1 &
sleep 5

#warm up
$PYTHONROOT/bin/python3 benchmark.py --thread 4 --batch_size 1 --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
echo -e "import psutil\nimport time\nwhile True:\n\tcpu_res = psutil.cpu_percent()\n\twith open('cpu.txt', 'a+') as f:\n\t\tf.write(f'{cpu_res}\\\n')\n\ttime.sleep(0.1)" > cpu.py
for thread_num in 1 4 8 16
do
for batch_size in 1 4 16 64
do
job_bt=`date '+%Y%m%d%H%M%S'`
nvidia-smi --id=0 --query-compute-apps=used_memory --format=csv -lms 100 > gpu_memory_use.log 2>&1 &
nvidia-smi --id=0 --query-gpu=utilization.gpu --format=csv -lms 100 > gpu_utilization.log 2>&1 &
rm -rf cpu.txt
$PYTHONROOT/bin/python3 cpu.py &
gpu_memory_pid=$!
$PYTHONROOT/bin/python3 benchmark.py --thread $thread_num --batch_size $batch_size --model $2/serving_client_conf.prototxt --request rpc > profile 2>&1
kill `ps -ef|grep used_memory|awk '{print $2}'` > /dev/null
kill `ps -ef|grep utilization.gpu|awk '{print $2}'` > /dev/null
kill `ps -ef|grep cpu.py|awk '{print $2}'` > /dev/null
echo "model_name:" $1
echo "thread_num:" $thread_num
echo "batch_size:" $batch_size
echo "=================Done===================="
echo "model_name:$1" >> profile_log_$1
echo "batch_size:$batch_size" >> profile_log_$1
job_et=`date '+%Y%m%d%H%M%S'`
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "CPU_UTILIZATION:", max}' cpu.txt >> profile_log_$1
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "MAX_GPU_MEMORY:", max}' gpu_memory_use.log >> profile_log_$1
awk 'BEGIN {max = 0} {if(NR>1){if ($1 > max) max=$1}} END {print "GPU_UTILIZATION:", max}' gpu_utilization.log >> profile_log_$1
rm -rf gpu_use.log gpu_utilization.log
$PYTHONROOT/bin/python3 ../util/show_profile.py profile $thread_num >> profile_log_$1
tail -n 8 profile >> profile_log_$1
echo "" >> profile_log_$1
done
done

#Divided log
awk 'BEGIN{RS="\n\n"}{i++}{print > "bert_log_"i}' profile_log_$1
mkdir bert_log && mv bert_log_* bert_log
ps -ef|grep 'serving'|grep -v grep|cut -c 9-15 | xargs kill -9
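
Each `awk` one-liner in the script reduces a monitoring log to a single maximum: it skips the first line (`NR>1`, e.g. a CSV header), tracks the largest value in column 1, and appends the result to the profile log. A Python equivalent of that reduction, for reference only:

```python
# Reference-only Python equivalent of the awk max-of-column-1 one-liners.
def max_of_first_column(path):
    max_val = 0.0
    with open(path) as f:
        for i, line in enumerate(f):
            if i == 0:  # mirrors awk's NR>1, which skips the header row
                continue
            fields = line.split()
            try:
                max_val = max(max_val, float(fields[0]))
            except (IndexError, ValueError):
                continue  # blank or non-numeric line; awk would treat it as 0
    return max_val

print("MAX_GPU_MEMORY:", max_of_first_column("gpu_memory_use.log"))
```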
35 changes: 5 additions & 30 deletions python/paddle_serving_client/client.py
@@ -554,15 +554,8 @@ def connect(self, endpoints):
        get_client_config_req = multi_lang_general_model_service_pb2.GetClientConfigRequest(
        )
        resp = self.stub_.GetClientConfig(get_client_config_req)
-        model_config_path_list = resp.client_config_str_list
-        file_path_list = []
-        for single_model_config in model_config_path_list:
-            if os.path.isdir(single_model_config):
-                file_path_list.append("{}/serving_server_conf.prototxt".format(
-                    single_model_config))
-            elif os.path.isfile(single_model_config):
-                file_path_list.append(single_model_config)
-        self._parse_model_config(file_path_list)
+        model_config_str = resp.client_config_str
+        self._parse_model_config(model_config_str)

    def _flatten_list(self, nested_list):
        for item in nested_list:
@@ -572,23 +565,10 @@ def _flatten_list(self, nested_list):
            else:
                yield item

-    def _parse_model_config(self, model_config_path_list):
-        if isinstance(model_config_path_list, str):
-            model_config_path_list = [model_config_path_list]
-        elif isinstance(model_config_path_list, list):
-            pass
-
-        file_path_list = []
-        for single_model_config in model_config_path_list:
-            if os.path.isdir(single_model_config):
-                file_path_list.append("{}/serving_client_conf.prototxt".format(
-                    single_model_config))
-            elif os.path.isfile(single_model_config):
-                file_path_list.append(single_model_config)
+    def _parse_model_config(self, model_config_str):
        model_conf = m_config.GeneralModelConfig()
-        f = open(file_path_list[0], 'r')
-        model_conf = google.protobuf.text_format.Merge(
-            str(f.read()), model_conf)
+        model_conf = google.protobuf.text_format.Merge(model_config_str,
+                                                       model_conf)
        self.feed_names_ = [var.alias_name for var in model_conf.feed_var]
        self.feed_types_ = {}
        self.feed_shapes_ = {}
@@ -598,11 +578,6 @@ def _parse_model_config(self, model_config_path_list):
            self.feed_shapes_[var.alias_name] = var.shape
            if var.is_lod_tensor:
                self.lod_tensor_set_.add(var.alias_name)
-        if len(file_path_list) > 1:
-            model_conf = m_config.GeneralModelConfig()
-            f = open(file_path_list[-1], 'r')
-            model_conf = google.protobuf.text_format.Merge(
-                str(f.read()), model_conf)
        self.fetch_names_ = [var.alias_name for var in model_conf.fetch_var]
        self.fetch_types_ = {}
        for i, var in enumerate(model_conf.fetch_var):
11 changes: 10 additions & 1 deletion python/paddle_serving_server/rpc_service.py
@@ -198,5 +198,14 @@ def GetClientConfig(self, request, context):
        #model_config_path_list is list right now.
        #dict should be added when graphMaker is used.
        resp = multi_lang_general_model_service_pb2.GetClientConfigResponse()
-        resp.client_config_str_list[:] = self.model_config_path_list
+        model_config_str = []
+        for single_model_config in self.model_config_path_list:
+            if os.path.isdir(single_model_config):
+                with open("{}/serving_server_conf.prototxt".format(
+                        single_model_config)) as f:
+                    model_config_str.append(str(f.read()))
+            elif os.path.isfile(single_model_config):
+                with open(single_model_config) as f:
+                    model_config_str.append(str(f.read()))
+        resp.client_config_str = model_config_str[0]
        return resp
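
Taken together with the client change above, the config handshake is now: the server reads each model's `serving_server_conf.prototxt` into a string and returns only the first one, and the client merges that text-format string straight into a `GeneralModelConfig` message instead of re-reading files from disk. A minimal sketch of the client-side parse; the proto module path is an assumption, while the `Merge` call mirrors `_parse_model_config` above:

```python
# Minimal sketch of parsing a text-format model config string.
# Assumption: the generated module path for GeneralModelConfig.
import google.protobuf.text_format
from paddle_serving_client.proto import general_model_config_pb2 as m_config

def parse_model_config(model_config_str):
    model_conf = m_config.GeneralModelConfig()
    google.protobuf.text_format.Merge(model_config_str, model_conf)
    feed_names = [var.alias_name for var in model_conf.feed_var]
    fetch_names = [var.alias_name for var in model_conf.fetch_var]
    return feed_names, fetch_names
```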
8 changes: 4 additions & 4 deletions tools/Dockerfile.cuda10.1-cudnn7.devel
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}

# Downgrade TensorRT
COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda10.1
RUN rm -rf /build_scripts

# git credential to skip password typing
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
make -j8 && make install && \
ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache

-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests

RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
8 changes: 4 additions & 4 deletions tools/Dockerfile.cuda10.2-cudnn8.devel
@@ -104,7 +104,7 @@ ENV PATH=usr/local/go/bin:/root/go/bin:${PATH}

# Downgrade TensorRT
COPY tools/dockerfiles/build_scripts /build_scripts
-RUN bash /build_scripts/install_trt.sh
+RUN bash /build_scripts/install_trt.sh cuda10.2
RUN rm -rf /build_scripts

# git credential to skip password typing
@@ -132,9 +132,9 @@ RUN wget https://paddle-ci.gz.bcebos.com/ccache-3.7.9.tar.gz && \
make -j8 && make install && \
ln -s /usr/local/ccache-3.7.9/bin/ccache /usr/local/bin/ccache

-RUN python3.8 -m pip install --upgrade pip requests && \
-    python3.7 -m pip install --upgrade pip requests && \
-    python3.6 -m pip install --upgrade pip requests
+RUN python3.8 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.7 -m pip install --upgrade pip==21.1.1 requests && \
+    python3.6 -m pip install --upgrade pip==21.1.1 requests

RUN wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \