PG minresource has some problems after PR#1459 #2921

Closed
3 of 6 tasks
lowang-bh opened this issue Jun 17, 2023 · 5 comments · Fixed by #3057
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@lowang-bh
Member

lowang-bh commented Jun 17, 2023

What happened:

minReq := v1.ResourceList{}
podCnt := int32(0)
for _, task := range tasksPriority {
    for i := int32(0); i < task.Replicas; i++ {
        if podCnt >= job.Spec.MinAvailable {
            break
        }
        podCnt++
        pod := &v1.Pod{
            Spec: task.Template.Spec,
        }
        minReq = quotav1.Add(minReq, *util.GetPodQuotaUsage(pod))
    }
}
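
To make the effect of this loop concrete, here is a minimal sketch that replays it with simplified stand-in types. The `Task` struct and the `countedPods` helper are hypothetical and exist only for illustration; the real code works on the job spec and accumulates a `v1.ResourceList`. The sketch assumes the tasks are already sorted by priority.

```go
package main

import "fmt"

// Task is a hypothetical, simplified stand-in for a Volcano task spec.
type Task struct {
	Name         string
	Replicas     int32
	MinAvailable int32 // never read by the loop below, as in the quoted code
}

// countedPods replays the quoted loop and returns how many pods of each
// task would contribute to the PodGroup's minimum resource request.
func countedPods(tasks []Task, jobMinAvailable int32) map[string]int32 {
	counted := make(map[string]int32)
	podCnt := int32(0)
	for _, task := range tasks {
		for i := int32(0); i < task.Replicas; i++ {
			if podCnt >= jobMinAvailable {
				break
			}
			podCnt++
			counted[task.Name]++
		}
	}
	return counted
}

func main() {
	tasks := []Task{
		{Name: "master", Replicas: 5, MinAvailable: 2},
		{Name: "work", Replicas: 3, MinAvailable: 2},
	}
	// With job.MinAvailable = 4, all four counted pods come from "master".
	fmt.Println(countedPods(tasks, 4))
}
```

In this sketch the first task absorbs the whole job.MinAvailable budget, so the second task contributes nothing even though it declares its own MinAvailable.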

There are many cases.

  • job.MinAvailable != 0, task.MinAvailable == 0: (job.MinAvailable <= sum(task.Replicas))

    • sum up the tasks' replicas until pod count == job.MinAvailable
  • job.MinAvailable == 0, task.MinAvailable == 0: (job.MinAvailable == sum(task.Replicas))

    • sum up the tasks' replicas, or empty (done via the patchDefaultMinAvailable func in the admission webhook)
  • job.MinAvailable == 0, task.MinAvailable != 0: (job.MinAvailable == sum(task.MinAvailable) <= sum(task.Replicas))

    • sum up the tasks' MinAvailable (done via the patchDefaultMinAvailable func in the admission webhook)
    • calculate the min resource from the highest-priority role to the lowest-priority role, counting the first MinAvailable members of each role
  • job.MinAvailable != 0, task.MinAvailable != 0:

    • job.MinAvailable >= sum(task.MinAvailable)
      • TODO
    • job.MinAvailable < sum(task.MinAvailable)
      • TODO
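
The per-role calculation in the third case can be sketched as follows. This is an illustration only, not the Volcano implementation: `Resources`, `Task`, and `minResources` are hypothetical names, and resources are reduced to two integer fields instead of a `v1.ResourceList`.

```go
package main

import "fmt"

// Resources is a hypothetical, simplified per-pod request.
type Resources struct {
	CPUMilli int64 // CPU request in millicores
	MemoryMi int64 // memory request in MiB
}

// Task is a hypothetical, simplified stand-in for a Volcano task spec.
type Task struct {
	Name         string
	Replicas     int32
	MinAvailable int32
	Request      Resources
}

// minResources sums the requests of the first MinAvailable pods of every
// task. Tasks are assumed to be sorted from highest to lowest priority.
func minResources(tasks []Task) Resources {
	var total Resources
	for _, t := range tasks {
		n := t.MinAvailable
		if n > t.Replicas {
			// Never count more pods than the task can create.
			n = t.Replicas
		}
		total.CPUMilli += int64(n) * t.Request.CPUMilli
		total.MemoryMi += int64(n) * t.Request.MemoryMi
	}
	return total
}

func main() {
	perPod := Resources{CPUMilli: 1000, MemoryMi: 1024}
	tasks := []Task{
		{Name: "master", Replicas: 5, MinAvailable: 2, Request: perPod},
		{Name: "work", Replicas: 3, MinAvailable: 2, Request: perPod},
	}
	fmt.Printf("%+v\n", minResources(tasks))
}
```

Counting per task means every role with a non-zero MinAvailable is represented in the result, regardless of how many replicas the higher-priority roles have.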

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Volcano Version:
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@hwdef
Member

hwdef commented Jun 27, 2023

@shinytang6
Could you please check this?

@wangyang0616
Member

Would it be convenient to share the problem and the fix PR at this week's community meeting? @lowang-bh

@renwenlong-github

1. job.MinAvailable == 0

1.1 task.MinAvailable == 0

The webhook mutation sets job.MinAvailable = sum(task.Replicas)

1.2 task.MinAvailable != 0

The webhook mutation sets job.MinAvailable = sum(task.MinAvailable)

2. job.MinAvailable != 0

2.1 task.MinAvailable == 0

  • When there is only one task, this is fine: the webhook mutation sets task.MinAvailable = job.MinAvailable
  • When there are multiple tasks, webhook validation requires job.MinAvailable == sum(task.Replicas); otherwise, I don't know which tasks job.MinAvailable refers to
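
The two defaulting rules for an unset job.MinAvailable can be sketched as one function. This is a simplified illustration, not the webhook's actual code: `Task` and `defaultJobMinAvailable` are hypothetical, and the handling of a job that mixes set and unset task values is an assumption.

```go
package main

import "fmt"

// Task is a hypothetical, simplified stand-in for a Volcano task spec.
type Task struct {
	Name         string
	Replicas     int32
	MinAvailable int32 // 0 means "not set" in this sketch
}

// defaultJobMinAvailable returns the value job.MinAvailable would take under
// the two rules above. A non-zero job value is returned unchanged.
func defaultJobMinAvailable(jobMin int32, tasks []Task) int32 {
	if jobMin != 0 {
		return jobMin
	}
	var sum int32
	for _, t := range tasks {
		if t.MinAvailable != 0 {
			sum += t.MinAvailable
		} else {
			// Assumption: an unset task.MinAvailable falls back to Replicas.
			sum += t.Replicas
		}
	}
	return sum
}

func main() {
	unset := []Task{
		{Name: "master", Replicas: 5},
		{Name: "work", Replicas: 3},
	}
	set := []Task{
		{Name: "master", Replicas: 5, MinAvailable: 2},
		{Name: "work", Replicas: 3, MinAvailable: 2},
	}
	fmt.Println(defaultJobMinAvailable(0, unset)) // 8: sum of replicas
	fmt.Println(defaultJobMinAvailable(0, set))   // 4: sum of task minAvailable
}
```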
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: minavailable-job
spec:
  schedulerName: volcano
  minAvailable: 4
  tasks:
    - replicas: 5
      minAvailable: 2
      name: "master"
      template:
        metadata:
          name: master
        spec:
          containers:
            - image: nginx
              name: nginx
              resources:
                requests:
                  cpu: "1"
                  memory: "1Gi"
          restartPolicy: OnFailure
    - replicas: 3
      minAvailable: 2
      name: "work"
      template:
        metadata:
          name: web
        spec:
          containers:
            - image: nginx
              name: nginx
              resources:
                requests:
                  cpu: "1"
                  memory: "1Gi"
          restartPolicy: OnFailure

2.2 task.MinAvailable != 0

The user sets both job.MinAvailable and task.MinAvailable. Webhook validation requires job.MinAvailable to equal sum(task.MinAvailable)

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: minavailable-job
spec:
  schedulerName: volcano
  minAvailable: 5
  tasks:
    - replicas: 5
      minAvailable: 2
      name: "master"
      template:
        metadata:
          name: master
        spec:
          containers:
            - image: nginx
              name: nginx
              resources:
                requests:
                  cpu: "1"
                  memory: "1Gi"
          restartPolicy: OnFailure
    - replicas: 3
      minAvailable: 2
      name: "work"
      template:
        metadata:
          name: web
        spec:
          containers:
            - image: nginx
              name: nginx
              resources:
                requests:
                  cpu: "1"
                  memory: "1Gi"
          restartPolicy: OnFailure

With job.minAvailable: 5, who can tell which pods that number represents?
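
The validation rule stated for case 2.2 could look roughly like the check below. It is a sketch of the rule as proposed in this comment, not confirmed behaviour of the Volcano webhook; `Task` and `validateMinAvailable` are hypothetical names.

```go
package main

import "fmt"

// Task is a hypothetical, simplified stand-in for a Volcano task spec.
type Task struct {
	Name         string
	Replicas     int32
	MinAvailable int32
}

// validateMinAvailable rejects a job whose minAvailable differs from the sum
// of its tasks' minAvailable values, since the extra or missing pods could
// not be attributed to any particular task.
func validateMinAvailable(jobMin int32, tasks []Task) error {
	var sum int32
	for _, t := range tasks {
		sum += t.MinAvailable
	}
	if jobMin != sum {
		return fmt.Errorf("job.minAvailable (%d) must equal the sum of task.minAvailable (%d)", jobMin, sum)
	}
	return nil
}

func main() {
	tasks := []Task{
		{Name: "master", Replicas: 5, MinAvailable: 2},
		{Name: "work", Replicas: 3, MinAvailable: 2},
	}
	// minAvailable: 5 with task values 2 + 2 is rejected under this rule.
	fmt.Println(validateMinAvailable(5, tasks))
	// minAvailable: 4 matches the sum and passes.
	fmt.Println(validateMinAvailable(4, tasks))
}
```

Under this rule the manifest above, with minAvailable: 5 and two tasks of minAvailable: 2, would be rejected at admission time instead of leaving the fifth pod unattributed.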

@stale

stale bot commented Oct 15, 2023

Hello 👋 Looks like there was no activity on this issue for last 90 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 15, 2023
@lowang-bh
Member Author

/remove lifecycle

@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 18, 2023
4 participants