Deployment workload pending due to podgroup in Pending Phase #2143

Closed
wpeng102 opened this issue Apr 2, 2022 · 4 comments
Labels
area/controllers kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@wpeng102
Member

wpeng102 commented Apr 2, 2022

What happened:
The cluster has enough resources, but a Deployment workload scheduled by Volcano stays pending because its podgroup is stuck in the Pending phase.

What you expected to happen:
Volcano schedules the Deployment workload successfully.

How to reproduce it (as minimally and precisely as possible):

  1. Submit a Deployment workload that uses the Volcano scheduler.
  2. Update the Deployment's resource requests several times. Each update creates a new ReplicaSet (rs), and Volcano creates a podgroup for it:
root@w00564621-res-33064-1r3xs:~/peng/yaml# kubectl get rs
NAME                             DESIRED   CURRENT   READY   AGE
deploy-with-volcano-545bc5465    0         0         0       114m
deploy-with-volcano-584fc556b5   3         3         3       115m
deploy-with-volcano-646b8c6b4b   1         1         0       112m
deploy-with-volcano-64b96f7df6   0         0         0       122m
deploy-with-volcano-678d5f7586   0         0         0       121m
deploy-with-volcano-6b5bb766b4   0         0         0       112m
deploy-with-volcano-6c9b444795   0         0         0       120m
deploy-with-volcano-7bd985746b   0         0         0       115m

root@w00564621-res-33064-1r3xs:~/peng/yaml# kubectl get pg
NAME                                            AGE
podgroup-07237d09-a605-4149-ae85-a32e666b1f21   121m
podgroup-1b6548ec-30bc-4d1a-8869-357a059331d9   116m
podgroup-3bd4da40-1b0f-4e6f-ac4a-97730e79fdf1   115m
podgroup-476434f7-791e-4228-bf2f-b0ebb664d0bb   122m
podgroup-845df9b1-f2bf-46ed-90db-1f874866b4ad   112m
podgroup-c6dbc539-5cda-4671-8d18-74cf82c8ae59   122m
podgroup-f0175b8a-b3f8-4739-b738-ff02ec39caba   113m
  3. Many podgroups end up in the Inqueue phase. The overcommit plugin adds the MinResources of every Inqueue podgroup to inqueueResource, even when the podgroup has no tasks.

    if job.PodGroup.Status.Phase == scheduling.PodGroupInqueue && job.PodGroup.Spec.MinResources != nil {
        op.inqueueResource.Add(api.NewResource(*job.PodGroup.Spec.MinResources))
        continue
    }

  4. A newly submitted workload then stays pending, because the overcommit plugin considers all resources to be taken by the Inqueue podgroups above.
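
The accounting problem described in the steps above can be illustrated with a minimal Go sketch. This is not Volcano's code: `PodGroup`, `inqueueMilliCPU`, and `canEnqueue` are hypothetical simplified stand-ins, resources are reduced to CPU millicores, and the admission check (reserved Inqueue resources plus the new request must fit in the idle budget) is an assumption about how the plugin uses `inqueueResource`.

```go
package main

import "fmt"

// Phase is a simplified stand-in for the podgroup phases named in the issue.
type Phase string

const (
	Pending Phase = "Pending"
	Inqueue Phase = "Inqueue"
)

// PodGroup is a hypothetical, simplified type; it is not Volcano's PodGroup.
type PodGroup struct {
	Name        string
	Phase       Phase
	MinMilliCPU int64 // stand-in for Spec.MinResources (CPU only); 0 means unset
	Tasks       int   // pods that still belong to the group
}

// inqueueMilliCPU sums MinMilliCPU over every Inqueue podgroup.
// Like the quoted snippet, it does not look at Tasks, so empty
// podgroups are counted too.
func inqueueMilliCPU(pgs []PodGroup) int64 {
	var total int64
	for _, pg := range pgs {
		if pg.Phase == Inqueue && pg.MinMilliCPU > 0 {
			total += pg.MinMilliCPU
		}
	}
	return total
}

// canEnqueue reports whether a new podgroup requesting newMin fits into
// the idle budget once the Inqueue reservations are subtracted.
// This admission rule is an assumption made for the sketch.
func canEnqueue(idleMilliCPU int64, pgs []PodGroup, newMin int64) bool {
	return inqueueMilliCPU(pgs)+newMin <= idleMilliCPU
}

func main() {
	// Four podgroups left behind by scaled-down ReplicaSets.
	// None has tasks, yet each still reserves 1 CPU.
	stale := []PodGroup{
		{Name: "pg-old-1", Phase: Inqueue, MinMilliCPU: 1000, Tasks: 0},
		{Name: "pg-old-2", Phase: Inqueue, MinMilliCPU: 1000, Tasks: 0},
		{Name: "pg-old-3", Phase: Inqueue, MinMilliCPU: 1000, Tasks: 0},
		{Name: "pg-old-4", Phase: Inqueue, MinMilliCPU: 1000, Tasks: 0},
	}

	fmt.Println(inqueueMilliCPU(stale))        // 4000
	fmt.Println(canEnqueue(4000, stale, 1000)) // false: stale groups use the whole budget
	fmt.Println(canEnqueue(4000, nil, 1000))   // true: same cluster without stale groups
}
```

With 4 idle CPUs and no running pods in the stale groups, the new 1-CPU workload is still rejected, which matches the symptom reported here.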

@wpeng102 wpeng102 added the kind/bug Categorizes issue or PR as related to a bug. label Apr 2, 2022
@wpeng102
Member Author

wpeng102 commented Apr 2, 2022

Volcano should create one podgroup per Deployment instead of one podgroup per ReplicaSet.
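
One way to picture this proposal is to key the podgroup on the topmost owner of a pod rather than on its direct owner. The sketch below is purely illustrative: `Owner` and `podGroupKey` are hypothetical names, the owner chain is modelled with plain structs instead of Kubernetes owner references, and nothing here reflects how Volcano's controller is actually implemented.

```go
package main

import "fmt"

// Owner is a hypothetical, simplified stand-in for an owner reference chain.
type Owner struct {
	Kind   string
	Name   string
	Parent *Owner // nil when the object has no controlling owner
}

// podGroupKey walks from a pod's direct owner up to the topmost owner and
// returns "Kind/Name" for it. Pods from every ReplicaSet of one Deployment
// therefore map to the same key. It returns "" for a pod without an owner.
func podGroupKey(direct *Owner) string {
	if direct == nil {
		return ""
	}
	top := direct
	for top.Parent != nil {
		top = top.Parent
	}
	return top.Kind + "/" + top.Name
}

func main() {
	deploy := &Owner{Kind: "Deployment", Name: "deploy-with-volcano"}

	// ReplicaSet names taken from the `kubectl get rs` output in this issue.
	rsNames := []string{
		"deploy-with-volcano-545bc5465",
		"deploy-with-volcano-584fc556b5",
		"deploy-with-volcano-646b8c6b4b",
	}

	keys := map[string]bool{}
	for _, name := range rsNames {
		rs := &Owner{Kind: "ReplicaSet", Name: name, Parent: deploy}
		keys[podGroupKey(rs)] = true
	}

	fmt.Println(len(keys)) // 1: all ReplicaSets share one podgroup key
}
```

Under this scheme, repeated Deployment updates would reuse a single podgroup instead of leaving one behind per ReplicaSet.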

@k82cn k82cn added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/controllers and removed kind/bug Categorizes issue or PR as related to a bug. labels Apr 2, 2022
@william-wang william-wang added this to the roadmap milestone May 13, 2022
@stale

stale bot commented Aug 12, 2022

Hello 👋 Looks like there was no activity on this issue for last 90 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 12, 2022
@Thor-wl Thor-wl removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 19, 2022
@stale

stale bot commented Nov 22, 2022

Hello 👋 Looks like there was no activity on this issue for last 90 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 22, 2022
@stale

stale bot commented Jan 22, 2023

Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗

@stale stale bot closed this as completed Jan 22, 2023
4 participants