Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add runtime templates #189

Merged
merged 1 commit into from
May 30, 2024

Conversation

heyselbi
Copy link
Contributor

@heyselbi heyselbi commented Apr 9, 2024

Will close: https://issues.redhat.com/browse/RHOAIENG-860
and: https://issues.redhat.com/browse/RHOAIENG-3786

Will close: https://issues.redhat.com/browse/RHOAIENG-5525 --> adding vLLM template
Might partially close: https://issues.redhat.com/browse/RHOAIENG-4227 and https://issues.redhat.com/browse/RHOAIENG-4263 --> added the startupProbe as recommended

Will close: https://issues.redhat.com/browse/RHOAIENG-6596 --> added metrics annotations to all runtime templates

Description

We are migrating the runtime templates from odh-dashboard repo to model serving repo of odh-model-controller both for ODH and RHOAI. We are also moving the image tags from kustomization.yaml to params.env per DevOps request.

How Has This Been Tested?

This has been only tested within model serving. The test together with operator and odh-dashboard is in progress (not yet complete).

  1. In a cluster that has RHOAI or ODH deployed, use the following DSC:
spec:
  components:
    codeflare:
      managementState: Removed
    kserve:
      devFlags:
        manifests:
          - contextDir: config
            sourcePath: ''
            uri: 'https://github.com/heyselbi/odh-model-controller/tarball/move-templates'
      managementState: Managed
      serving:
        ingressGateway:
          certificate:
            type: SelfSigned
        managementState: Managed
        name: knative-serving
    trustyai:
      managementState: Removed
    ray:
      managementState: Removed
    kueue:
      managementState: Removed
    workbenches:
      managementState: Removed
    dashboard:
      devFlags:
        manifests:
          - contextDir: manifests
            sourcePath: ''
            uri: 'https://github.com/heyselbi/odh-dashboard/tarball/move-templates'
      managementState: Managed
    modelmeshserving:
      managementState: Removed
    datasciencepipelines:
      managementState: Removed
  1. Ensure that the pods have all reconciled and are running:
[selbi@fedora ~]$ oc get pods -n redhat-ods-applications
NAME                                         READY   STATUS    RESTARTS   AGE
kserve-controller-manager-79c79f64bb-w5f9k   1/1     Running   0          2d21h
odh-model-controller-7476ddbdfc-m7xmq        1/1     Running   0          7m6s
odh-model-controller-7476ddbdfc-sbkk5        1/1     Running   0          7m16s
odh-model-controller-7476ddbdfc-vft9c        1/1     Running   0          7m27s
rhods-dashboard-646ccf474d-6xskx             2/2     Running   0          6m47s
rhods-dashboard-646ccf474d-9rltt             2/2     Running   0          7m48s
rhods-dashboard-646ccf474d-gm6j2             2/2     Running   0          6m47s
rhods-dashboard-646ccf474d-pfhfq             2/2     Running   0          7m47s
rhods-dashboard-646ccf474d-zrc5x             2/2     Running   0          7m48s
  1. Make sure that odh-model-controller and rhods-dashboard have the images below (pr-*):
oc get pods -n redhat-ods-applications -o jsonpath="{..image}" | tr -s '[[:space:]]' '\n' |sort |uniq -c
      2 quay.io/modh/kserve-controller@sha256:3ae3440ce4bbd981efefed70a4f21b587f45faf43f6cc4b95a77d1423fa5c7b3
     10 quay.io/opendatahub/odh-dashboard:pr-2706
      6 quay.io/opendatahub/odh-model-controller:pr-189
     10 registry.redhat.io/openshift4/ose-oauth-proxy@sha256:ab112105ac37352a2a4916a39d6736f5db6ab4c29bad4467de8d613e80e9bb33

  1. On top right, select OpenShift AI
    image
  2. In Dashboard, check all runtimes show up
    image
  3. Check the runtime templates and check that it has the prometheus annotations
oc get template/tgis-grpc-serving-template -o yaml -n redhat-ods-applications | grep prometheus
      prometheus.io/path: /metrics
      prometheus.io/port: "3000"

Tests I performed:

  • The above test --> that runtime templates appear
  • vLLM (no dashboard): tested SR and ISVC following instructions here with CPUs (it errored saying no GPUs found) --> that vLLM template arguments work for model loading. ISVC used:
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata: 
  name: vllm-gpt2-openai
  namespace: vllm-gpt2
spec: 
  predictor: 
    model: 
      runtime: kserve-vllm
      modelFormat: 
        name: vLLM
      storageUri: pvc://vlmm-gpt2-claim/gpt2/
  • vLLM (no dashboard): tested SR and ISVC following instructions here with GPUs --> curl command was successful
  • vLLM: tested via Dashboard/UI successfully. Created a minio and uploaded gpt2 model. Created data connection via UI and followed instructions for acceleratorProfile enablement for GPUs to appear when selecting resources. Resulting curl command:
[selbi@fedora ~]$ curl -v https://gpt2-dashboard-test.apps.rosa.selbii.[hidden].com:443/v1/chat/completions -k -H "Content-Type: application/json"   -d '{
    "model": "gpt2",
    "messages": [
      {
        "role": "system",
        "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
      },
      {
        "role": "user",
        "content": "Compose a poem that explains the concept of recursion in programming."
      }
    ]
  }' | jq .
  % Total    % Received **[....]**
{
  "id": "cmpl-6e7206f4c93b4603b44b8aaa0b233b10",
  "object": "chat.completion",
  "created": 1716930382,
  "model": "gpt2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A friend of mine came over **[cut the response since it was too long]**"
      },
      "logprobs": null,
      "finish_reason": "length",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 32,
    "total_tokens": 1024,
    "completion_tokens": 992
  }
}

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Copy link
Contributor

openshift-ci bot commented Apr 9, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Contributor

openshift-ci bot commented Apr 9, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: heyselbi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Apr 9, 2024
@heyselbi heyselbi marked this pull request as ready for review April 9, 2024 18:05
@spolti
Copy link
Member

spolti commented Apr 10, 2024

Is there a jira for it?

@spolti
Copy link
Member

spolti commented Apr 10, 2024

Removing myself and Davide as reviewers and adding Sean and Daniele.

@spolti spolti requested review from Xaenalt and dtrifiro and removed request for davidesalerno and spolti April 10, 2024 00:58
config/base/params.env Outdated Show resolved Hide resolved
@heyselbi
Copy link
Contributor Author

/hold

Copy link
Contributor

@VedantMahabaleshwarkar VedantMahabaleshwarkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left suggestions for changes according to https://issues.redhat.com/browse/RHOAIENG-6596

config/runtimes/caikit-tgis-template.yaml Show resolved Hide resolved
config/runtimes/ovms-kserve-template.yaml Outdated Show resolved Hide resolved
config/runtimes/ovms-kserve-template.yaml Outdated Show resolved Hide resolved
config/runtimes/ovms-kserve-template.yaml Outdated Show resolved Hide resolved
config/runtimes/ovms-mm-template.yaml Outdated Show resolved Hide resolved
config/runtimes/ovms-mm-template.yaml Outdated Show resolved Hide resolved
config/runtimes/tgis-template.yaml Show resolved Hide resolved
@spolti
Copy link
Member

spolti commented Apr 30, 2024

We can skip the prometheus.io/scrape annotation.

@heyselbi
Copy link
Contributor Author

heyselbi commented May 2, 2024

@VedantMahabaleshwarkar could you check the changes are okay for you? Is it confirmed I can remove the scrape annotation?

VedantMahabaleshwarkar

This comment was marked as outdated.

@heyselbi
Copy link
Contributor Author

Initial testing is working with the DSC mentioned in the testing instructions.

Will now add the vLLM template and add @dtrifiro's suggestions.

Copy link

@lucferbux lucferbux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Testins instructions for the dashboard side are clear enough, just listing them in the serving runtime view should be enough.

You can add an extra step, which will be just check the list when a user deploys a model, to see if the serving runtime is there, but if the annotation is present in the Custom Serving Runtime section that should be the case.

config/runtimes/kustomization.yaml Outdated Show resolved Hide resolved
config/runtimes/ovms-mm-template.yaml Show resolved Hide resolved
@openshift-ci openshift-ci bot removed the lgtm label May 22, 2024
Copy link
Contributor

openshift-ci bot commented May 22, 2024

New changes are detected. LGTM label has been removed.

config/runtimes/vllm-template.yaml Outdated Show resolved Hide resolved
config/runtimes/vllm-template.yaml Outdated Show resolved Hide resolved
config/base/kustomization.yaml Show resolved Hide resolved
config/overlays/odh/params.yaml Outdated Show resolved Hide resolved
@andrewballantyne
Copy link
Member

FYI, when this merges, we need to remove the hold & merge the Dashboard dependant PR: opendatahub-io/odh-dashboard#2706

config/runtimes/tgis-template.yaml Show resolved Hide resolved
config/runtimes/vllm-template.yaml Outdated Show resolved Hide resolved
config/runtimes/vllm-template.yaml Show resolved Hide resolved
config/runtimes/tgis-template.yaml Show resolved Hide resolved
config/runtimes/tgis-template.yaml Show resolved Hide resolved
config/runtimes/tgis-template.yaml Show resolved Hide resolved
@heyselbi
Copy link
Contributor Author

/unhold

@heyselbi
Copy link
Contributor Author

All testing details and results are shared in the main PR body

@openshift-merge-bot openshift-merge-bot bot merged commit f3d7cfd into opendatahub-io:main May 30, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants