Criticality Discussion/ Improvements #213

Open
kfswain opened this issue Jan 21, 2025 · 1 comment

@kfswain
Collaborator

kfswain commented Jan 21, 2025

Criticality has been discussed quite a bit (e.g. here and here), as well as in the Inf-GW weekly meeting.

We recognize that this field may be imperfect, but without user feedback it's currently difficult to iterate in the right direction. To centralize the discussion, we are creating this issue.

@shaneutt
Member

One of my questions is about how Criticality functions during the entire lifecycle of models:

In practice, will operators have to update large groups of InferenceModels to accommodate the criticality of new InferenceModels? For example, let's say I have 60 models out there: 10 of them Critical, 30 Default, and 20 Sheddable. I have 20 new InferenceModels to deploy, which now need to be the only Critical ones. So... do I have to "downgrade" other existing models in order to achieve this? Is this example a reasonable one, and am I understanding correctly?

More essentially: Criticality is a specification of relationships across multiple resources to "rank" them, and any kind of cross-resource relationship in Kubernetes has the potential to introduce complexity. Do we see potential for ways (like the above) in which this could become painful for operators on clusters?
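
To make the 60-model example concrete, here is a rough sketch of the bookkeeping it implies. It does not use the project's API: the fleet is a plain dict of name to criticality, the model names and the `downgrades_needed` helper are hypothetical, and demoting to Default is an assumed choice. It only counts which existing objects an operator would have to edit so that the new models end up as the only Critical ones.

```python
from collections import Counter


def build_fleet():
    """Hypothetical fleet from the example: 10 Critical, 30 Default, 20 Sheddable."""
    fleet = {}
    for i in range(10):
        fleet[f"existing-critical-{i}"] = "Critical"
    for i in range(30):
        fleet[f"existing-default-{i}"] = "Default"
    for i in range(20):
        fleet[f"existing-sheddable-{i}"] = "Sheddable"
    return fleet


def downgrades_needed(fleet, new_critical_names, demote_to="Default"):
    """Return (updated fleet, names of existing models that had to be edited).

    Assumption: every currently Critical model is demoted to `demote_to`
    so that only the new models carry the Critical tier.
    """
    updated = dict(fleet)
    touched = []
    for name, tier in fleet.items():
        if tier == "Critical":
            updated[name] = demote_to
            touched.append(name)
    for name in new_critical_names:
        updated[name] = "Critical"
    return updated, touched


fleet = build_fleet()
new_models = [f"new-critical-{i}" for i in range(20)]
updated, touched = downgrades_needed(fleet, new_models)

# Number of pre-existing objects that needed an edit, and the resulting tier counts.
print(len(touched), dict(Counter(updated.values())))
```

Under these assumptions, every one of the 10 previously Critical models has to be touched in addition to creating the 20 new ones, which is the cross-resource cost the question is about.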
