Criticality has been discussed quite a bit, e.g. here and here, as well as in the Inf-GW weekly meeting.
We recognize that this field may be imperfect, but without user feedback it's currently difficult to iterate in the right direction. To centralize that discussion, we are creating this issue.
One of my questions is about how Criticality functions during the entire lifecycle of models:
In practice, will operators have to update large groups of InferenceModels to accommodate the criticality of new InferenceModels? For example, say I have 60 models deployed: 10 Critical, 30 Default, and 20 Sheddable. I now have 20 new InferenceModels to deploy which need to be the only Critical ones. Do I have to "downgrade" the existing Critical models to achieve this? Is this a reasonable example, and am I understanding correctly? (See the sketch below.)
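For concreteness, here is roughly what one such "downgrade" edit would look like. This is a minimal sketch, assuming an `InferenceModel` schema along the lines of the project's alpha API (`inference.networking.x-k8s.io`, with `modelName`, `criticality`, and `poolRef` fields); the exact enum values (e.g. `Default` vs. `Standard`) may differ by API version, and the model and pool names are hypothetical:

```yaml
# Hypothetical InferenceModel that was previously Critical and is being
# "downgraded" so the 20 newly deployed models can be the only Critical
# ones. Names (legacy-chat-model, shared-pool) are made up for illustration.
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: InferenceModel
metadata:
  name: legacy-chat-model
spec:
  modelName: legacy-chat-model
  # Changed from Critical. Each such edit touches a separate resource,
  # so re-ranking N existing models means N individual spec updates.
  criticality: Default
  poolRef:
    name: shared-pool
```

Note that nothing ties these edits together: if the rollout of the new Critical models is interleaved with the downgrades, the cluster passes through intermediate states where more (or fewer) models are Critical than intended.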
More fundamentally: Criticality specifies a relationship across multiple resources in order to "rank" them, and any cross-resource relationship in Kubernetes has the potential to introduce complexity. Do we see potential for ways (like the above) in which this could become painful for cluster operators?