Criticality has been discussed quite a bit, e.g. here and here, as well as in the Inf-GW weekly meeting.
We recognize that this field may be imperfect, but without user feedback it's currently difficult to iterate in the right direction. To centralize that discussion, we are creating this issue.
One of my questions is about how Criticality functions during the entire lifecycle of models:
In practice, will operators have to update large groups of InferenceModels to accommodate the criticality of new InferenceModels? For example, say I have 60 models deployed: 10 Critical, 30 Default, and 20 Sheddable. I now have 20 new InferenceModels to deploy which need to be the only Critical ones. Do I have to "downgrade" the existing Critical models to achieve this? Is this a reasonable example, and am I understanding correctly? (See the sketch below.)
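For concreteness, here is roughly what one such "downgrade" edit would look like. This is a minimal sketch, assuming an `InferenceModel` schema along the lines of the project's alpha API (`inference.networking.x-k8s.io`, with `modelName`, `criticality`, and `poolRef` fields); the exact enum values (e.g. `Default` vs. `Standard`) may differ by API version, and the model and pool names are hypothetical:

```yaml
# Hypothetical InferenceModel that was previously Critical and is being
# "downgraded" so the 20 newly deployed models can be the only Critical
# ones. Names (legacy-chat-model, shared-pool) are made up for illustration.
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: InferenceModel
metadata:
  name: legacy-chat-model
spec:
  modelName: legacy-chat-model
  # Changed from Critical. Each such edit touches a separate resource,
  # so re-ranking N existing models means N individual spec updates.
  criticality: Default
  poolRef:
    name: shared-pool
```

Note that nothing ties these edits together: if the rollout of the new Critical models is interleaved with the downgrades, the cluster passes through intermediate states where more (or fewer) models are Critical than intended.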
More fundamentally: Criticality specifies a relationship across multiple resources in order to "rank" them, and any cross-resource relationship in Kubernetes has the potential to introduce complexity. Do we see potential for ways (like the above) in which this could become painful for cluster operators?