Loading TorchScript model fails for Triton in DeepStream #2317

Closed
rbrigden opened this issue Dec 7, 2020 · 5 comments

rbrigden commented Dec 7, 2020

Description

I am trying to load a successfully exported TorchScript model in the Triton Inference Server that is packaged with DeepStream 5.0. Unfortunately, I receive this error:

Internal: load failed for libtorch model -> 'mymodel': version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at ../caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 4, but the maximum supported version for reading is 2. Your PyTorch installation may be too old.

Issues filed against PyTorch with this error seem to stem from mismatched PyTorch versions between the export and runtime environments: example
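As a quick diagnostic, the format version baked into the exported file can be read straight out of the archive. This is a sketch, not a stable API: a TorchScript .pt file is a zip archive, and the "version" entry it contains is an internal detail of torch.jit.save; "model.pt" below is a placeholder path.

import zipfile

# A TorchScript .pt file written by torch.jit.save is a zip archive that
# contains a "version" entry (e.g. "model/version") holding the
# serialization format version the loading libtorch must support.
with zipfile.ZipFile("model.pt") as zf:
    for name in zf.namelist():
        if name.endswith("version"):
            print(name, "->", zf.read(name).decode().strip())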

The PyTorch/torchvision versions in my training/export environment are:

torch==1.7.0
torchvision==0.8.1

I am using the NGC container

nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Based on the framework support matrix, it appears that the 20.09 release supports PyTorch 1.7.0.

Triton Information
What version of Triton are you using?

nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

Are you using the Triton container or did you build it yourself?

I am using the NGC container

To Reproduce
Steps to reproduce the behavior.

Export a TorchScript model using PyTorch 1.7.0 and adapt the Triton sample shipped in the container nvcr.io/nvidia/deepstream:5.0.1-20.09-triton at /opt/nvidia/deepstream/deepstream-5.0/sources/python/apps/deepstream-ssd-parser. A minimal export sketch is shown below.
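For reference, an export along those lines might look like this sketch (the Conv2d stand-in, shapes, and paths are placeholders, not the actual EfficientDet export):

import torch
import torch.nn as nn

# Stand-in module; the real network here is an EfficientDet variant.
model = nn.Conv2d(3, 8, kernel_size=3).eval()

# Trace with a representative input and save as TorchScript.
example = torch.randn(1, 3, 512, 512)
traced = torch.jit.trace(model, example)
traced.save("model.pt")  # goes under <model-repo>/efficientdet/1/model.pt

print(torch.__version__)  # record the export-side version, e.g. 1.7.0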

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).


  platform: "pytorch_libtorch"
  name: "efficientdet"
  max_batch_size: 4
  input [
    {
      name: "INPUT_0"
      data_type: TYPE_FP16
      dims: [ -1, 3, 512, 512 ]
    }
  ]
  output [
    {
      name: "OUTPUT_0"
      data_type: TYPE_FP16
      dims: [ 4, 49104, 4 ]
    },
    {
      name: "OUTPUT_1"
      data_type: TYPE_FP16
      dims: [ 4, 49104, 1 ]
    },
    {
      name: "OUTPUT_2"
      data_type: TYPE_INT64
      dims: [ 4, 49104, 1 ]
    }
  ]

CoderHam commented Dec 8, 2020

Please share your model repository structure. It looks like, instead of using numeric version subdirectories for the model, you placed 'mymodel' directly in the model directory. Please refer to the instructions here and reopen if needed.

@CoderHam CoderHam closed this as completed Dec 8, 2020

rbrigden commented Dec 8, 2020

@CoderHam I mount my model repository at /models in the container.

My model repo looks like this:

+-- efficientdet
|   +-- 1
|   |   +-- model.pt
|   +-- config.pbtxt

If it helps, my DeepStream config is

infer_config {
  unique_id: 5
  gpu_ids: [0]
  max_batch_size: 4
  backend {
    trt_is {
      model_name: "efficientdet"
      version: 1
      model_repo {
        root: "/models"
        log_level: 2
        strict_model_config: true
      }
    }
  }

  preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_NONE
    maintain_aspect_ratio: 0
    normalize {
      scale_factor: 1.0
      channel_offsets: [0, 0, 0]
    }
  }

  postprocess {
    labelfile_path: "labels.txt"
    other {}
  }

  extra {
    copy_input_to_host_buffers: false
  }

  custom_lib {
    path: "/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_infercustomparser.so"
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  interval: 0
}
output_control {
  output_tensor_meta: true
}

(Also, I am unable to re-open this issue as I am not a collaborator on this repo, so I do hope you see this)

@CoderHam CoderHam reopened this Dec 8, 2020

CoderHam commented Dec 8, 2020

Could you try running the model directly inside nvcr.io/nvidia/tritonserver:20.09-py3? Triton does not manage the DeepStream container, and this would help narrow down the issue. It looks to me like the model fails to load because of an older PyTorch version.
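A related sanity check (a sketch, assuming a Python environment whose torch version matches the backend under test, e.g. the NGC PyTorch 20.09 container): try to deserialize the file directly and see whether torch.jit.load accepts it.

import torch

print(torch.__version__)  # should match the backend you are testing against

# Placeholder path; point this at the file inside the model repository.
model = torch.jit.load("/models/efficientdet/1/model.pt", map_location="cpu")
print(type(model))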

msalehiNV commented

@rbrigden Since this appears to be a DeepStream-related issue, I've spoken to their team and they recommended that you post your issue on the DeepStream SDK board:
https://forums.developer.nvidia.com/c/accelerated-computing/intelligent-video-analytics/deepstream-sdk/15

rbrigden commented

Thank you @CoderHam and @msalehiNV. I haven't yet had a chance to test on the standalone Triton server, but will do that soon. I'll post an update here as well as make a post on the DeepStream SDK board.
