Allow a way to specify extended resources for scale-from-zero scenario #132
I have thought of some possible solutions:
Upstream issue worth noting: kubernetes#1869
**Post grooming decision:** Specify the node template in the provider config section of the worker. From there, the corresponding extension will pick it up and populate the worker config, which contains the NodeTemplate. This will be used to generate the machine class. The CA code at the moment does not consider ephemeral storage in the scale-from-zero case. Inside:

```go
if len(filteredNodes) > 0 {
	klog.V(1).Infof("Nodes already existing in the worker pool %s", workerPool)
	baseNode := filteredNodes[0]
	klog.V(1).Infof("Worker pool node used to form template is %s and its capacity is cpu: %s, memory:%s", baseNode.Name, baseNode.Status.Capacity.Cpu().String(), baseNode.Status.Capacity.Memory().String())
	instance = instanceType{
		VCPU:             baseNode.Status.Capacity[apiv1.ResourceCPU],
		Memory:           baseNode.Status.Capacity[apiv1.ResourceMemory],
		GPU:              baseNode.Status.Capacity[gpu.ResourceNvidiaGPU],
		EphemeralStorage: baseNode.Status.Capacity[apiv1.ResourceEphemeralStorage],
		PodCount:         baseNode.Status.Capacity[apiv1.ResourcePods],
	}
} else {
	klog.V(1).Infof("Generating node template only using nodeTemplate from MachineClass %s: template resources-> cpu: %s,memory: %s", machineClass.Name, nodeTemplateAttributes.Capacity.Cpu().String(), nodeTemplateAttributes.Capacity.Memory().String())
	instance = instanceType{
		VCPU:   nodeTemplateAttributes.Capacity[apiv1.ResourceCPU],
		Memory: nodeTemplateAttributes.Capacity[apiv1.ResourceMemory],
		GPU:    nodeTemplateAttributes.Capacity["gpu"],
		// The number of pods per node depends on the CNI used and the kubelet maxPods config; the default is often 110.
		PodCount: resource.MustParse("110"),
	}
}
```

We need to fix this part to consider ephemeral storage in the scale-from-zero (`else`) branch as well.
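A minimal sketch of one way the `else` branch could be adjusted, assuming the `nodeTemplateAttributes.Capacity` taken from the MachineClass nodeTemplate also carries an `ephemeral-storage` entry and any extended resources; the `ExtendedResources` field is hypothetical and would need to be added to `instanceType` (continuing the snippet above):

```go
} else {
	klog.V(1).Infof("Generating node template only using nodeTemplate from MachineClass %s", machineClass.Name)
	instance = instanceType{
		VCPU:   nodeTemplateAttributes.Capacity[apiv1.ResourceCPU],
		Memory: nodeTemplateAttributes.Capacity[apiv1.ResourceMemory],
		GPU:    nodeTemplateAttributes.Capacity["gpu"],
		// Take ephemeral storage from the node template instead of dropping it.
		EphemeralStorage: nodeTemplateAttributes.Capacity[apiv1.ResourceEphemeralStorage],
		PodCount:         resource.MustParse("110"),
	}
	// Hypothetical: collect everything that is not a standard resource so the
	// generated node template can also satisfy pods requesting extended resources.
	instance.ExtendedResources = apiv1.ResourceList{}
	for name, quantity := range nodeTemplateAttributes.Capacity {
		switch name {
		case apiv1.ResourceCPU, apiv1.ResourceMemory, apiv1.ResourceEphemeralStorage, apiv1.ResourcePods, "gpu":
			continue
		default:
			instance.ExtendedResources[name] = quantity
		}
	}
}
```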
What would you like to be added:
There should be a mechanism in our autoscaler that lets the user specify any extended resources their nodes have, so that the autoscaler is aware of them and can take them into account when scaling a node group out from zero.
Why is this needed:
It has been noticed that the autoscaler currently cannot scale a node group from zero if a pod requests an extended resource. This happens because the nodeTemplate the autoscaler creates does not have the extended resources specified.
However, it is able to scale from one, because in that case the autoscaler can form the nodeTemplate from an existing node.
The AWS and Azure implementations of the autoscaler provide ways to specify such resources.
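For illustration, here is a pod requesting a made-up extended resource `example.com/foo` (the resource name is purely hypothetical). With the node group at zero, the node template generated from the MachineClass does not advertise this resource, so the simulated node cannot fit the pod and no scale-up happens:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Pod requesting an extended resource in addition to cpu/memory.
	pod := corev1.Pod{
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "example/app",
				Resources: corev1.ResourceRequirements{
					Requests: corev1.ResourceList{
						corev1.ResourceCPU:                     resource.MustParse("100m"),
						corev1.ResourceMemory:                  resource.MustParse("128Mi"),
						corev1.ResourceName("example.com/foo"): resource.MustParse("1"),
					},
				},
			}},
		},
	}
	// Unless the generated node template lists example.com/foo in its capacity,
	// the scale-from-zero simulation concludes the pod can never be scheduled there.
	fmt.Println(pod.Spec.Containers[0].Resources.Requests)
}
```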