
Deploying a model via pipeline mode raises a lod error: _get_bbox_result cannot return bbox_results #1902

Closed
ClassmateXiaoyu opened this issue Jan 4, 2023 · 6 comments


@ClassmateXiaoyu

Environment
CUDA 11.7
cuDNN 8.4.1
GPU: GTX 1070
Python 3.8.13
PaddlePaddle 2.4.1.post117
paddle-serving-server-gpu 0.9.0
paddle_serving_app 0.9.0

I trained a PPYOLOv2 model with PaddleX and converted the inference model to a serving model via python -m paddle_serving_client.convert --dirname  --model_filename  --params_filename  --serving_server serving_server --serving_client serving_client.
I noticed that the same model, deployed in two different ways, hits a lod error. Details:
1. When I deploy in pipeline mode, fetch_dict has no fetch_name.lod key. fetch_dict: {'save_infer_model/scale_0.tmp_1': array([[  0.        ,   0.85202295, 216.68979   ,  64.207535  ,        436.6143    , 332.37054   ]], dtype=float32)}.
With the lod information missing, the client-server call raises:
Traceback (most recent call last):
  File "/root/anaconda3/envs/paddle38/lib/python3.8/site-packages/paddle_serving_server/pipeline/error_catch.py", line 97, in wrapper
    res = func(*args, **kw)
  File "/root/anaconda3/envs/paddle38/lib/python3.8/site-packages/paddle_serving_server/pipeline/operator.py", line 1179, in postprocess_help
    postped_data, prod_errcode, prod_errinfo = self.postprocess(
  File "pipeline_web_service_linux.py", line 72, in postprocess
    self.img_postprocess(
  File "/root/anaconda3/envs/paddle38/lib/python3.8/site-packages/paddle_serving_app/reader/image_reader.py", line 426, in __call__
    bbox_result = self._get_bbox_result(image_with_bbox, fetch_name,
  File "/root/anaconda3/envs/paddle38/lib/python3.8/site-packages/paddle_serving_app/reader/image_reader.py", line 344, in _get_bbox_result
    lod = [fetch_map[fetch_name + '.lod']]
KeyError: 'save_infer_model/scale_0.tmp_1.lod'
Classname: Op._run_postprocess.<locals>.postprocess_help
FunctionName: postprocess_help
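A possible workaround for this KeyError (a sketch only, not an official fix; `add_missing_lod` is a hypothetical helper) is to reconstruct the missing `.lod` entry inside the pipeline op's postprocess before handing fetch_dict to the image postprocessor. For a single-image batch the lod is simply [0, num_boxes]:

```python
import numpy as np

def add_missing_lod(fetch_dict, fetch_name):
    # Hypothetical helper: if the pipeline response lacks the
    # '<fetch_name>.lod' key, reconstruct it for a single-image
    # batch as [0, num_boxes] so _get_bbox_result can index it.
    lod_key = fetch_name + '.lod'
    if lod_key not in fetch_dict:
        num_boxes = fetch_dict[fetch_name].shape[0]
        fetch_dict[lod_key] = np.array([0, num_boxes])
    return fetch_dict

# Example with the single-detection shape from this issue:
fetch_dict = {'save_infer_model/scale_0.tmp_1':
              np.zeros((1, 6), dtype=np.float32)}
fetch_dict = add_missing_lod(fetch_dict, 'save_infer_model/scale_0.tmp_1')
print(fetch_dict['save_infer_model/scale_0.tmp_1.lod'])  # [0 1]
```

This only holds for batch size 1; for multi-image batches the real per-image offsets would be needed.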

2. When I deploy in non-pipeline mode, fetch_map does contain the fetch_name.lod key. fetch_map: {'save_infer_model/scale_0.tmp_1': array([[0.0000000e+00, 6.3646980e-02, 5.2615891e+00, 1.2278875e+02,
        1.6876831e+02, 3.5357916e+02],
       [0.0000000e+00, 4.2369448e-02, 6.6680511e+01, 6.9318405e+01,
        6.0023975e+02, 5.3855756e+02],
       [0.0000000e+00, 1.8086428e-02, 1.2872772e+02, 1.4232706e+02,
        2.9876392e+02, 3.3751181e+02],
       [0.0000000e+00, 1.5854711e-02, 1.8734198e+02, 3.1824486e+01,
        3.5457477e+02, 1.9962274e+02],
       [0.0000000e+00, 1.5454855e-02, 2.1284140e+02, 1.9946268e+02,
        3.9645621e+02, 4.0698849e+02],
       [0.0000000e+00, 1.4058443e-02, 1.5301871e+02, 2.4853967e+02,
        3.2183228e+02, 4.3125073e+02],
       [0.0000000e+00, 1.2545503e-02, 1.1664839e+02, 2.4064153e+02,
        3.0317432e+02, 4.3767188e+02],
       [0.0000000e+00, 1.1161749e-02, 3.8942078e+01, 1.3401808e+02,
        1.8269760e+02, 3.4691406e+02],
       [0.0000000e+00, 1.0988280e-02, 1.4913477e+02, 1.8804048e+02,
        3.2895029e+02, 3.5642706e+02],
       [0.0000000e+00, 1.0884989e-02, 1.5156635e+02, 2.1480481e+02,
        3.2716016e+02, 3.9296497e+02]], dtype=float32), 'save_infer_model/scale_0.tmp_1.lod': array([ 0, 10])}.
The client-server call then works without errors and returns the prediction normally:
{'result': [{'bbox': [5.261589050292969, 122.78874969482422, 164.50672149658203, 231.79041290283203], 'category_id': 0, 'score': 0.06364697962999344}]}
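For context, the `.lod` array is PaddlePaddle's encoding of variable-length batches: entries i and i+1 give the row range of detections belonging to image i, so `array([ 0, 10])` above means all 10 rows belong to one image. A minimal sketch of how the postprocessor can use it to slice the (N, 6) [class_id, score, xmin, ymin, xmax, ymax] array (`split_by_lod` is a hypothetical name, not the library's API):

```python
import numpy as np

def split_by_lod(bboxes, lod):
    # lod holds cumulative row offsets: the boxes for image i are
    # rows lod[i]:lod[i+1] of the (N, 6) detection array
    # [class_id, score, xmin, ymin, xmax, ymax].
    return [bboxes[lod[i]:lod[i + 1]] for i in range(len(lod) - 1)]

bboxes = np.arange(60, dtype=np.float32).reshape(10, 6)  # 10 detections
lod = np.array([0, 10])                                  # one image
per_image = split_by_lod(bboxes, lod)
print(len(per_image), per_image[0].shape)  # 1 (10, 6)
```

Without the `.lod` key there is no way to know which rows belong to which image, which is exactly why `_get_bbox_result` fails in pipeline mode.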

Could the maintainers explain why the same model loses its lod information under one deployment mode but not the other, and how to handle this? Thanks!
@ClassmateXiaoyu
Author

OS: CentOS 7.9

@fanruifeng

Hi, has this problem been solved? I'm in the same situation right now.

@ClassmateXiaoyu
Author

Hi, has this problem been solved? I'm in the same situation right now.

Not solved yet; I haven't figured out a fix, and the maintainers haven't replied to this issue either.

@fanruifeng

OK, could you add me on QQ (1125729232) to discuss? Currently, when I deploy in non-pipeline mode, my server starts fine, but the client request returns an error: {'err_no': 10000, 'err_msg': 'Log_id: 10000 Raise_msg: transpose_0.tmp_0 ClassName: Op._run_postprocess..postprocess_help FunctionName: postprocess_help', 'key': [], 'value': [], 'tensors': []}

@wjplove8

#1635 (comment)

OK, could you add me on QQ (1125729232) to discuss? Currently, when I deploy in non-pipeline mode, my server starts fine, but the client request returns an error: {'err_no': 10000, 'err_msg': 'Log_id: 10000 Raise_msg: transpose_0.tmp_0 ClassName: Op._run_postprocess..postprocess_help FunctionName: postprocess_help', 'key': [], 'value': [], 'tensors': []}

Hi, has this problem been solved? I'm in the same situation right now.

@HuiHuiSun

Hi, has this problem been solved? I'm in the same situation right now.

Not solved yet; I haven't figured out a fix, and the maintainers haven't replied to this issue either.

Hi, has this problem been solved by now?

@paddle-bot paddle-bot bot closed this as completed Oct 29, 2024