Problem accessing triton nodes endpoints when they are part of a graph #4198

Closed
saeid93 opened this issue Jun 30, 2022 · 0 comments · Fixed by #4207

saeid93 (Contributor) commented Jun 30, 2022

Describe the bug

As per my conversation with @adriangonz on Slack, there seems to be a bug in accessing the endpoints of inference graph nodes when some of those nodes are prepackaged Triton servers. For example, in a single-node setting the metadata of a Triton node is accessible, but the same endpoint is not accessible when the Triton node is part of an inference graph.

To reproduce

To compare the two situations, you can deploy the following single-node Triton server:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: gpt2
spec:
  name: default
  predictors:
  - graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: gs://seldon-models/triton/onnx_gpt2
      name: gpt2
      type: MODEL
    name: default
    replicas: 1
  protocol: kfserving
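
Assuming the manifest above is saved as gpt2-single.yaml (a hypothetical filename), it can be deployed with:

kubectl apply -f gpt2-single.yaml -n default

# check the predictor pod is Running before querying any endpoints
kubectl get pods -n default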

The metadata endpoint is then accessible at:

curl -s http://localhost:32000/seldon/default/gpt2/v2/models/gpt2 | jq .

output as expected (truncated here):

 {
  "name": "gpt2",
  "versions": [
    "1"
  ],
  "platform": "onnxruntime_onnx",
  "inputs": [
    {
      "name": "input_ids",
      "datatype": "INT32",

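The server-level v2 routes can be checked the same way; a quick sketch, assuming the executor proxies the standard v2 dataplane paths through the same ingress:

# server metadata and model readiness (standard v2 dataplane routes)
curl -s http://localhost:32000/seldon/default/gpt2/v2 | jq .
curl -s http://localhost:32000/seldon/default/gpt2/v2/models/gpt2/ready
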
However, inside the graph:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: gpt2
spec:
  protocol: v2
  predictors:
    - name: default
      graph:
        name: tokeniser-encoder
        children:
          - name: gpt2
            implementation: TRITON_SERVER
            modelUri: gs://seldon-models/triton/onnx_gpt2
            children:
              - name: tokeniser-decoder
      componentSpecs:
        - spec:
            containers:
              - name: tokeniser-encoder
                image: seldonio/gpt2-tokeniser:0.1.0
                env:
                  # Use always a writable HuggingFace cache location regardless of the user
                  - name: TRANSFORMERS_CACHE
                    value: /opt/mlserver/.cache
                  - name: MLSERVER_MODEL_NAME
                    value: "tokeniser-encoder"
              - name: tokeniser-decoder
                image: seldonio/gpt2-tokeniser:0.1.0
                env:
                  - name: SELDON_TOKENIZER_TYPE
                    value: "DECODER"
                  # Use always a writable HuggingFace cache location regardless of the user
                  - name: TRANSFORMERS_CACHE
                    value: /opt/mlserver/.cache
                  - name: MLSERVER_MODEL_NAME
                    value: "tokeniser-decoder"

The same endpoint, curl localhost:32000/seldon/default/gpt2/v2/models/gpt2, now results in {"error":"Model gpt2 not found"}.

Other combinations, such as curl localhost:32000/seldon/default/gpt2/gpt2/v2/models/gpt2, do not work either.
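
For anyone reproducing this, a couple of further hypothetical probes against the graph deployment (node names taken from the manifest above, set via MLSERVER_MODEL_NAME):

curl -s http://localhost:32000/seldon/default/gpt2/v2/models/tokeniser-encoder | jq .
curl -s http://localhost:32000/seldon/default/gpt2/v2/models/tokeniser-decoder | jq .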

Expected behaviour

The endpoints of intermediate Triton nodes should be accessible.
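
That is, a call along these lines should return the same metadata as in the single-node case, even though gpt2 is an intermediate node (hypothetical routing; the exact path is whatever the fix settles on):

curl -s http://localhost:32000/seldon/default/gpt2/v2/models/gpt2 | jq .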

Environment

  • Cloud Provider: Bare Metal
  • Kubernetes Cluster Version:
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.0-2+59bbb3530b6769", GitCommit:"59bbb3530b6769e4935a05ac0e13c9910c79253e", GitTreeState:"clean", BuildDate:"2022-05-13T06:41:13Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
  • Deployed Seldon System Images:
value: docker.io/seldonio/seldon-core-executor:1.14.0
image: docker.io/seldonio/seldon-core-operator:1.14.0
saeid93 added the bug label on Jun 30, 2022
ukclivecox self-assigned this on Jul 4, 2022