
Dynamic Resource Allocation #1231

Open · 5 tasks
jonathan-innis opened this issue May 3, 2024 · 5 comments

Labels: kind/feature (Categorizes issue or PR as related to a new feature.), triage/accepted (Indicates an issue or PR is ready to be actively worked on.)

Comments

@jonathan-innis (Member) commented May 3, 2024

Description

What problem are you trying to solve?

If you haven't heard, there's a lot of buzz in the community about this thing called "Dynamic Resource Allocation" (DRA). Effectively, it's a change to the existing Kubernetes resource model that lets users select against node hardware surfaced through ResourceSlice objects associated with a node. Users create a ResourceClaim and select devices by their attributes using Common Expression Language (CEL) expressions.
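
To make the model concrete, here's a minimal Go sketch. The types are simplified stand-ins for illustration only, not the real `resource.k8s.io` API (whose shape is still changing across alpha releases): a ResourceSlice publishes a node's devices along with their attributes, and a ResourceClaim carries a CEL expression that selects devices by those attributes.

```go
package main

import "fmt"

// Simplified, illustrative shapes only -- not the real resource.k8s.io types.
// A ResourceSlice publishes the devices a node's driver exposes, each with a
// set of attributes; a ResourceClaim selects devices with a CEL expression
// evaluated against those attributes.
type Device struct {
	Name       string
	Attributes map[string]string
}

type ResourceSlice struct {
	NodeName string
	Driver   string
	Devices  []Device
}

type ResourceClaim struct {
	Name string
	// CEL expression evaluated per device.
	Selector string
}

func main() {
	slice := ResourceSlice{
		NodeName: "node-a",
		Driver:   "gpu.example.com",
		Devices: []Device{
			{Name: "gpu-0", Attributes: map[string]string{"vendor": "nvidia", "memory": "40Gi"}},
		},
	}
	claim := ResourceClaim{
		Name:     "training-gpu",
		Selector: `device.attributes["vendor"] == "nvidia" && device.attributes["memory"] == "40Gi"`,
	}
	fmt.Printf("slice %s/%s; claim %q selects with: %s\n", slice.NodeName, slice.Driver, claim.Name, claim.Selector)
}
```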

The proposal for this change is documented here; there is a ton of discussion of the use-cases and the implications throughout the Kubernetes project.

The change to the resource model is of particular importance to Karpenter, since we rely deeply on this resource model to know whether a pod is eligible to schedule against an instance type, which we can think of as a "theoretical" node. Effectively, Karpenter now needs to be aware of the ResourceSlice and ResourceClaim concepts to know which instance types have the hardware required to schedule a set of pods. As Karpenter schedules against these ResourceSlices, it needs to simulate a pod taking up that hardware and rule out an instance type once the hardware can no longer fit the pods scheduling against it.
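
As a rough sketch of that simulation, continuing with the hypothetical types above (this is not Karpenter's actual scheduler code): every claim carried by the pods must be satisfiable by the devices the instance type's expected ResourceSlices would expose, and each match consumes a device so later claims can no longer use it.

```go
// InstanceType stands in for Karpenter's notion of a launchable node shape;
// Slices holds the devices we expect the node to expose once it comes up.
type InstanceType struct {
	Name   string
	Slices []ResourceSlice
}

// simulate consumes one matching device per claim and returns false as soon
// as a claim cannot be satisfied -- the signal to rule this instance type
// out. The match predicate is where CEL evaluation plugs in (see the sketch
// after the task list below).
func simulate(it InstanceType, claims []ResourceClaim, match func(Device, ResourceClaim) bool) bool {
	var remaining []Device
	for _, s := range it.Slices {
		remaining = append(remaining, s.Devices...)
	}
	for _, c := range claims {
		satisfied := false
		for i, d := range remaining {
			if match(d, c) {
				remaining = append(remaining[:i], remaining[i+1:]...) // device is now taken
				satisfied = true
				break
			}
		}
		if !satisfied {
			return false
		}
	}
	return true
}
```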

This has some relation to #751, but I think we can decouple them for now. DRA only requires that we know what the resource model would look like if the node were to launch; it doesn't necessitate that we allow users to specify arbitrary resources.

CloudProviders can first-class a set of resources they know will appear in the ResourceSlices when the node comes up and hand those back in the GetInstanceTypes call for the scheduler to reason about. Solid use-cases for this are things like NVIDIA GPUs, whose hardware is well-known before the instance launches, or AWS's Inferentia accelerators.
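
A hypothetical sketch of what that first-classing could look like, again using the simplified types from the first sketch (the instance type, driver name, and attributes are illustrative; Karpenter's cloudprovider.InstanceType has no such field today):

```go
// wellKnownSlices returns the ResourceSlices a provider already knows an
// instance type will expose, so the scheduler can reason about the hardware
// before any node exists. A p4d.24xlarge, for example, always carries eight
// NVIDIA A100 GPUs.
func wellKnownSlices(instanceTypeName string) []ResourceSlice {
	switch instanceTypeName {
	case "p4d.24xlarge":
		devices := make([]Device, 8)
		for i := range devices {
			devices[i] = Device{
				Name:       fmt.Sprintf("gpu-%d", i),
				Attributes: map[string]string{"vendor": "nvidia", "model": "a100", "memory": "40Gi"},
			}
		}
		return []ResourceSlice{{Driver: "gpu.nvidia.com", Devices: devices}}
	default:
		return nil
	}
}
```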

Tasks

I want to build out a set of tasks that can be taken up to get a PoC for this working. Ideally, someone could build this out with Kwok, and then we could apply the same changes to the Azure and AWS providers.

  • Add ResourceSlice to the CloudProvider InstanceType model
  • Return ResourceSlices from the GetInstanceTypes() call in Kwok
  • Handle adding ResourceClaims to pod requirements
  • Handle ResourceClaim/ResourceSlice compatibility with CEL resolution
  • Handle simulating ResourceClaims against ResourceSlices through CEL (the tricky bit; see the sketch after this list)
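
For the CEL piece, here's a minimal standalone sketch using github.com/google/cel-go. The `device.attributes` variable shape is an assumption for illustration; the KEP defines its own CEL environment and variables, and a real implementation would cache compiled programs rather than rebuild the environment per call.

```go
package main

import (
	"fmt"
	"log"

	"github.com/google/cel-go/cel"
)

// matches compiles a claim's CEL selector and evaluates it against a single
// device's attribute map, returning whether the device satisfies the claim.
func matches(selector string, attrs map[string]string) (bool, error) {
	env, err := cel.NewEnv(
		// Expose the device to the expression as a dynamic map.
		cel.Variable("device", cel.MapType(cel.StringType, cel.DynType)),
	)
	if err != nil {
		return false, err
	}
	ast, iss := env.Compile(selector)
	if iss.Err() != nil {
		return false, iss.Err()
	}
	prg, err := env.Program(ast)
	if err != nil {
		return false, err
	}
	out, _, err := prg.Eval(map[string]any{
		"device": map[string]any{"attributes": attrs},
	})
	if err != nil {
		return false, err
	}
	matched, ok := out.Value().(bool)
	if !ok {
		return false, fmt.Errorf("selector did not evaluate to a bool")
	}
	return matched, nil
}

func main() {
	ok, err := matches(
		`device.attributes["vendor"] == "nvidia"`,
		map[string]string{"vendor": "nvidia", "memory": "40Gi"},
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("device matches claim:", ok) // true
}
```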

Working Group

Separately, if you are interested in attending the Working Group and contributing to other use-cases around DRA, the meeting log is here, and the official working group charter and meeting times are here.

The YouTube Playlist for previous meetings can also be found here.

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@jonathan-innis jonathan-innis added the kind/feature Categorizes issue or PR as related to a new feature. label May 3, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label May 3, 2024
@jonathan-innis (Member, Author):

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 3, 2024
@jonathan-innis (Member, Author) commented May 3, 2024

IMO, it makes a lot of sense to build out a staging/dra branch for the PoC work here. We can start building out the changes and collaborate on them without pulling them into the main branch. This is definitely going to be important since the DRA stuff is in beta and still in flux.

@uniemimu commented May 6, 2024

> This is definitely going to be important since the DRA stuff is in beta and still in flux.

DRA is alpha. DRA beta ETA is 1.32. Starting the work aligned with KEP 4381 makes sense.

@jonathan-innis (Member, Author):

Update: There is another KEP (probably the more up-to-date one) that proposes a bunch of changes in 1.31: kubernetes/enhancements#4709. I'd encourage folks who are interested to take a look and consider how it fits in with Karpenter's scheduling logic.

As @uniemimu called out, the current target is 1.32 for the API that is proposed in the KEP to go to beta.

@jonathan-innis (Member, Author) commented Jul 8, 2024

FYI: Anyone who is interested in developing this PoC can use the SIG-provided example driver for testing changes: https://github.com/kubernetes-sigs/dra-example-driver (structured-parameters branch)
