Scale-out cannot be triggered by autoscaler for pods in the pipeline state #3000

wangyang0616 · 2023-07-27T02:38:31Z

What happened:
In a large-scale cluster, multiple pods can remain in the terminating state for a long time (for example, because of node faults, application faults, or delayed manual handling). A pod waiting to be scheduled stays in the pending state: Volcano places it in the pipeline state and waits for the terminating pods to release their resources. As a result, the pod cannot run for a long time.

With the autoscaler scale-out function enabled and configured to expand capacity based on pending pods in the cluster, scale-out is not triggered in this situation.

What you expected to happen:
During Volcano scheduling, if pending pods exist in the cluster and the autoscaler is enabled, nodes should be added automatically so that the pending pods can be scheduled, regardless of whether those pods are in the pipeline state.

How to reproduce it (as minimally and precisely as possible):

Install the autoscaler component and configure it to expand capacity based on pending pods.
Manually create a pod that stays in the terminating state.
Submit a pod that requests the same amount of resources as the terminating pod.

Anything else we need to know?:
When deciding whether to scale out, the autoscaler only considers pods that are in the pending state with the reason Unschedulable.
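
The check described above can be sketched roughly as follows. This is a minimal illustration, not the autoscaler's actual code: the `Pod` and `PodCondition` types are simplified, hypothetical stand-ins for the Kubernetes core/v1 types, `isUnschedulable` is a made-up helper name, and the shape of the pipelined pod (pending, with no Unschedulable condition) is an assumption made for the example.

```go
package main

import "fmt"

// PodCondition is a simplified, hypothetical stand-in for the
// Kubernetes core/v1 PodCondition type.
type PodCondition struct {
	Type   string
	Status string
	Reason string
}

// Pod is a simplified, hypothetical stand-in for the Kubernetes
// core/v1 Pod type, keeping only the fields this check needs.
type Pod struct {
	Name       string
	Phase      string
	Conditions []PodCondition
}

// isUnschedulable reports whether a pod is pending and carries a
// PodScheduled=False condition whose reason is Unschedulable.
func isUnschedulable(p Pod) bool {
	if p.Phase != "Pending" {
		return false
	}
	for _, c := range p.Conditions {
		if c.Type == "PodScheduled" && c.Status == "False" && c.Reason == "Unschedulable" {
			return true
		}
	}
	return false
}

func main() {
	// A pending pod explicitly marked Unschedulable by the scheduler.
	marked := Pod{
		Name:  "unschedulable-pod",
		Phase: "Pending",
		Conditions: []PodCondition{
			{Type: "PodScheduled", Status: "False", Reason: "Unschedulable"},
		},
	}
	// Assumption for illustration: a pipelined pod is pending but
	// has no condition with the Unschedulable reason.
	pipelined := Pod{Name: "pipelined-pod", Phase: "Pending"}

	for _, p := range []Pod{marked, pipelined} {
		fmt.Printf("%s: %v\n", p.Name, isUnschedulable(p))
	}
}
```

Under these assumptions, a pod that waits in the pipeline state without the Unschedulable reason never passes the check, which would explain why scale-out is not triggered.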

Environment:

Volcano Version: master
Kubernetes version (use kubectl version):
Cloud provider or hardware configuration:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

wangyang0616 added the kind/bug Categorizes issue or PR as related to a bug. label Jul 27, 2023

wangyang0616 mentioned this issue Jul 27, 2023

Fix: the pod pipeline status is incompatible with autoscaler capacity expansion #3001

Merged

volcano-sh-bot closed this as completed in #3001 Jul 27, 2023

william-wang added this to the v1.8 milestone Jun 12, 2024
