Scale-out cannot be triggered by autoscaler for pods in the pipeline state #3000

Closed
wangyang0616 opened this issue Jul 27, 2023 · 0 comments · Fixed by #3001
Labels
kind/bug Categorizes issue or PR as related to a bug.

What happened:
In a large-scale cluster, multiple pods can remain in the Terminating state for a long time (because of node faults, application faults, or delayed manual handling). A new pod waiting to be scheduled stays Pending: Volcano places it in the pipeline state and waits for the terminating pods to release their resources. As a result, the pod cannot run for a long time.

With the autoscaler's scale-out function enabled and configured to expand capacity based on pending pods in the cluster, scale-out is not triggered for such a pod.

What you expected to happen:
When pending pods exist in the cluster during Volcano scheduling and the autoscaler is enabled, nodes should be added automatically so that those pods can be scheduled, regardless of whether the pods are in the pipeline state.

How to reproduce it (as minimally and precisely as possible):

  1. Install the autoscaler component and configure it to scale out based on pending pods.
  2. Manually put a pod into the Terminating state.
  3. Schedule a new pod that requests the same amount of resources as the terminating pod.

Anything else we need to know?:
The autoscaler only considers a pod for scale-out when the pod is in the Pending state and its reason is Unschedulable.
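
To make that condition concrete, here is a minimal sketch of the filter in Python. It is not the autoscaler's actual code: the `Pod` and `PodCondition` classes and the `is_scale_out_candidate` function are hypothetical stand-ins, and the reason string on the pipelined pod is a placeholder, under the assumption that a pipelined pod carries some reason other than `Unschedulable`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PodCondition:
    # Hypothetical stand-in for a Kubernetes pod condition.
    type: str
    status: str
    reason: str = ""


@dataclass
class Pod:
    # Hypothetical stand-in holding only the fields the filter needs.
    name: str
    phase: str
    conditions: List[PodCondition] = field(default_factory=list)


def is_scale_out_candidate(pod: Pod) -> bool:
    """Return True if the pod is Pending and marked Unschedulable."""
    if pod.phase != "Pending":
        return False
    return any(
        c.type == "PodScheduled"
        and c.status == "False"
        and c.reason == "Unschedulable"
        for c in pod.conditions
    )


# A pod the scheduler could not place anywhere: matches the filter.
unschedulable = Pod(
    "needs-new-node",
    "Pending",
    [PodCondition("PodScheduled", "False", "Unschedulable")],
)

# Assumption: a pipelined pod is Pending but its reason is not Unschedulable.
# "PlaceholderPipelinedReason" is an invented value for illustration only.
pipelined = Pod(
    "waiting-on-terminating-pod",
    "Pending",
    [PodCondition("PodScheduled", "False", "PlaceholderPipelinedReason")],
)

for pod in (unschedulable, pipelined):
    print(pod.name, is_scale_out_candidate(pod))
```

Under these assumptions, only the first pod passes the filter, which is consistent with the behavior reported here: a pod stuck in the pipeline state stays Pending but never becomes a scale-out candidate.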

Environment:

  • Volcano Version: master
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others: