
Modify Accelerator support for kserve #2261

Merged: 1 commit merged into opendatahub-io:main on Dec 1, 2023

Conversation

lucferbux
Contributor

Description

Closes #2244

How Has This Been Tested?

Prerequisites

Enable accelerator support in your cluster. You can add this AcceleratorProfile to your cluster:

apiVersion: dashboard.opendatahub.io/v1
kind: AcceleratorProfile
metadata:
  name: migrated-gpu
  namespace: redhat-ods-applications
spec:
  displayName: Nvidia GPU
  enabled: true
  identifier: nvidia.com/gpu
  tolerations:
    - effect: NoSchedule
      key: nvidia.com/gpu
      operator: Exists

KServe Resource Creation

  1. Deploy a new KServe model, selecting the accelerator and a number of nodes
  2. Check the ServingRuntime spec. It should not contain tolerations or the nvidia.com/gpu requests/limits
  3. Check the InferenceService spec. It should contain tolerations under spec.predictor.tolerations and the gpu requests/limits under spec.predictor.model.resources, as in the sketch below
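
For reference, a rough sketch of what the resulting InferenceService could look like when the accelerator above is selected (the model name and the single-GPU request/limit are illustrative, not taken from this PR):

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model   # illustrative name
spec:
  predictor:
    tolerations:
      - effect: NoSchedule
        key: nvidia.com/gpu
        operator: Exists
    model:
      resources:
        requests:
          nvidia.com/gpu: '1'
        limits:
          nvidia.com/gpu: '1'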

KServe Resource Editing

  1. Edit the deployed model and remove the accelerator
  2. Check the ServingRuntime spec. It should stay the same except that the opendatahub.io/accelerator-name label is removed
  3. Check the InferenceService spec. It should no longer contain tolerations under spec.predictor.tolerations or the gpu requests/limits under spec.predictor.model.resources (see the sketch below)
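
For comparison, a minimal sketch of the predictor section after the accelerator has been removed; the tolerations block and the nvidia.com/gpu entries should be gone, and only the regular resources (values illustrative) remain:

spec:
  predictor:
    model:
      resources:
        requests:
          cpu: '1'       # illustrative values
          memory: 2Gi
        limits:
          cpu: '2'
          memory: 4Gi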

ModelMesh Resource Creation

  1. Create a new ModelMesh model server and select an accelerator
  2. Check the ServingRuntime spec. It should contain tolerations and gpu resources (see the sketch after this list)
  3. Deploy a model
  4. Check the InferenceService spec. It shouldn't contain any gpu resources
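
For reference, a rough sketch of a ModelMesh ServingRuntime carrying the accelerator settings (the runtime name, container name, image, and single-GPU request/limit are illustrative):

apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: example-model-server   # illustrative name
spec:
  tolerations:
    - effect: NoSchedule
      key: nvidia.com/gpu
      operator: Exists
  containers:
    - name: ovms                       # container name depends on the chosen runtime
      image: example-runtime-image     # illustrative
      resources:
        requests:
          nvidia.com/gpu: '1'
        limits:
          nvidia.com/gpu: '1'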

Test Impact

Covered all paths with unit testing

Request review criteria:

Self checklist (all need to be checked):

  • The developer has manually tested the changes and verified that the changes work
  • Commits have been squashed into descriptive, self-contained units of work (e.g. 'WIP' and 'Implements feedback' style messages have been removed)
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has added tests or explained why testing cannot be added (unit tests & storybook for related changes)

If you have UI changes:

  • Included any necessary screenshots or gifs if it was a UI change.
  • Included tags to the UX team if it was a UI/UX change (find relevant UX in the SMEs section).

After the PR is posted & before it merges:

  • The developer has tested their solution on a cluster by using the image produced by the PR to main

I haven't tested these changes on a proper GPU cluster; we should do that ASAP.

Member

@andrewballantyne andrewballantyne left a comment


Looking good, one question...

frontend/src/k8sTypes.ts (review thread resolved)
Member

@andrewballantyne andrewballantyne left a comment


Sounds good... type question was resolved. LGTM.

Contributor

openshift-ci bot commented Dec 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewballantyne, Xaenalt


@openshift-ci openshift-ci bot added the approved label Dec 1, 2023
@openshift-merge-bot openshift-merge-bot bot merged commit f5f08a0 into opendatahub-io:main Dec 1, 2023
6 checks passed
@lucferbux
Contributor Author

I updated the types of InferenceService and ServingRuntime since there was a mismatch with the CRDs, and handled the exceptions accordingly.

Successfully merging this pull request may close these issues.

[Bug]: Accelerator enablement in kserve is not working