Problem accessing triton nodes endpoints when they are part of a graph #4198

Closed
saeid93 opened this issue Jun 30, 2022 · 0 comments · Fixed by #4207

saeid93 (Contributor) commented Jun 30, 2022

Describe the bug

As per my conversation with @adriangonz on Slack, there seems to be a bug in accessing the endpoints of inference graph nodes when some of those nodes are prepackaged Triton servers. For example, in a single-node setting the metadata of a Triton node is accessible, but the same endpoint is not accessible when the Triton node is part of an inference graph.

To reproduce

To compare the two situations, you can deploy the following single-node Triton server:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: gpt2
spec:
  name: default
  predictors:
  - graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: gs://seldon-models/triton/onnx_gpt2
      name: gpt2
      type: MODEL
    name: default
    replicas: 1
  protocol: kfserving
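
Assuming the manifest above is saved as gpt2-single.yaml (a hypothetical filename), it can be deployed with:

kubectl apply -f gpt2-single.yaml -n default

# check the predictor pod is Running before querying any endpoints
kubectl get pods -n default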

The metadata endpoint is then accessible at:

curl -s http://localhost:32000/seldon/default/gpt2/v2/models/gpt2 | jq .

output as expected (truncated here):

 {
  "name": "gpt2",
  "versions": [
    "1"
  ],
  "platform": "onnxruntime_onnx",
  "inputs": [
    {
      "name": "input_ids",
      "datatype": "INT32",

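The server-level v2 routes can be checked the same way; a quick sketch, assuming the executor proxies the standard v2 dataplane paths through the same ingress:

# server metadata and model readiness (standard v2 dataplane routes)
curl -s http://localhost:32000/seldon/default/gpt2/v2 | jq .
curl -s http://localhost:32000/seldon/default/gpt2/v2/models/gpt2/ready
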
However, inside the graph:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: gpt2
spec:
  protocol: v2
  predictors:
    - name: default
      graph:
        name: tokeniser-encoder
        children:
          - name: gpt2
            implementation: TRITON_SERVER
            modelUri: gs://seldon-models/triton/onnx_gpt2
            children:
              - name: tokeniser-decoder
      componentSpecs:
        - spec:
            containers:
              - name: tokeniser-encoder
                image: seldonio/gpt2-tokeniser:0.1.0
                env:
                  # Use always a writable HuggingFace cache location regardless of the user
                  - name: TRANSFORMERS_CACHE
                    value: /opt/mlserver/.cache
                  - name: MLSERVER_MODEL_NAME
                    value: "tokeniser-encoder"
              - name: tokeniser-decoder
                image: seldonio/gpt2-tokeniser:0.1.0
                env:
                  - name: SELDON_TOKENIZER_TYPE
                    value: "DECODER"
                  # Use always a writable HuggingFace cache location regardless of the user
                  - name: TRANSFORMERS_CACHE
                    value: /opt/mlserver/.cache
                  - name: MLSERVER_MODEL_NAME
                    value: "tokeniser-decoder"

The same endpoint, curl localhost:32000/seldon/default/gpt2/v2/models/gpt2, now results in {"error":"Model gpt2 not found"}.

Other combinations, such as curl localhost:32000/seldon/default/gpt2/gpt2/v2/models/gpt2, do not work either.
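
For anyone reproducing this, a couple of further hypothetical probes against the graph deployment (node names taken from the manifest above, set via MLSERVER_MODEL_NAME):

curl -s http://localhost:32000/seldon/default/gpt2/v2/models/tokeniser-encoder | jq .
curl -s http://localhost:32000/seldon/default/gpt2/v2/models/tokeniser-decoder | jq .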

Expected behaviour

The endpoints of intermediate Triton nodes should be accessible.
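
That is, a call along these lines should return the same metadata as in the single-node case, even though gpt2 is an intermediate node (hypothetical routing; the exact path is whatever the fix settles on):

curl -s http://localhost:32000/seldon/default/gpt2/v2/models/gpt2 | jq .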

Environment

  • Cloud Provider: Bare Metal
  • Kubernetes Cluster Version:
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.0-2+59bbb3530b6769", GitCommit:"59bbb3530b6769e4935a05ac0e13c9910c79253e", GitTreeState:"clean", BuildDate:"2022-05-13T06:41:13Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
  • Deployed Seldon System Images:
value: docker.io/seldonio/seldon-core-executor:1.14.0
image: docker.io/seldonio/seldon-core-operator:1.14.0
saeid93 added the bug label on Jun 30, 2022
ukclivecox self-assigned this on Jul 4, 2022