fix calculations of podgroup min resource #3057

Merged
merged 3 commits into from
Jun 13, 2024

Conversation

lowang-bh
Member

@lowang-bh lowang-bh commented Aug 15, 2023

commit 1: refactor jobinfo's calculation into a function.
commit 2: fix the calculation of the podgroup min resource and add test cases. Related design doc: docs about job's min resource #2945
commit 3: when jobMinAvailable is less than the total of the tasks' minAvailable, keep the original logic for calculating the podgroup minResource: sum up the resources of the first jobMinAvailable members

Also fixes #2921.
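
The calculation described in the commits can be sketched roughly as follows. This is an illustration, not the code from the PR: `TaskSpec`, `firstN`, and `CalcMinMilliCPU` are hypothetical names, only CPU is tracked, and the handling of the case where jobMinAvailable exceeds the sum of the tasks' minAvailable (topping up from leftover replicas in priority order) is an assumption.

```go
package main

import "sort"

// TaskSpec is a hypothetical, simplified stand-in for one task of a job.
type TaskSpec struct {
	Name         string
	Priority     int32 // higher value means higher priority
	Replicas     int32
	MinAvailable int32
	MilliCPU     int64 // per-pod CPU request in millicores
}

// firstN sums the requests of the first n pods, walking the
// priority-sorted tasks and taking every replica of each task in turn.
func firstN(sorted []TaskSpec, n int32) int64 {
	var sum int64
	for _, t := range sorted {
		for i := int32(0); i < t.Replicas && n > 0; i++ {
			sum += t.MilliCPU
			n--
		}
	}
	return sum
}

// CalcMinMilliCPU returns the CPU counted as the podgroup's min resource.
func CalcMinMilliCPU(jobMinAvailable int32, tasks []TaskSpec) int64 {
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]TaskSpec(nil), tasks...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})

	var totalTaskMin int32
	for _, t := range sorted {
		totalTaskMin += t.MinAvailable
	}

	// Commit 3: keep the original logic and sum the first jobMinAvailable pods.
	if jobMinAvailable < totalTaskMin {
		return firstN(sorted, jobMinAvailable)
	}

	// Otherwise count each task's own minimum first.
	var sum int64
	for _, t := range sorted {
		sum += int64(t.MinAvailable) * t.MilliCPU
	}
	// Assumption: any remainder is filled from leftover replicas by priority.
	remaining := jobMinAvailable - totalTaskMin
	for _, t := range sorted {
		for i := t.MinAvailable; i < t.Replicas && remaining > 0; i++ {
			sum += t.MilliCPU
			remaining--
		}
	}
	return sum
}
```

With two masters (1000m each, minAvailable 1, higher priority) and two workers (500m each, minAvailable 2), `CalcMinMilliCPU(2, tasks)` takes the first branch and counts the two masters, while `CalcMinMilliCPU(3, tasks)` counts one master plus two workers.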

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 15, 2023
@lowang-bh lowang-bh changed the title Cal pg minresource fix calculations of podgroup min Aug 15, 2023
@lowang-bh lowang-bh changed the title fix calculations of podgroup min fix calculations of podgroup min resource Aug 15, 2023
@lowang-bh
Member Author

lowang-bh commented Aug 15, 2023

/assign @wangyang0616 @hwdef @Yikun @Thor-wl @william-wang

Hi all, would you please take a look at this PR so we can discuss the remaining case (jobMinAvailable < sum(taskMinAvailable))? For example, with jobMinAvailable=2 and totalTaskMinAvailable=3, should we sum up the resources of at most 2 members or of 3 members as the job's min resource?

One idea is to validate jobMinAvailable against taskTotalMinAvailable. Currently the only validation is jobMinAvailable against totalReplicas.
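
A minimal sketch of what such a validation might look like, assuming the rule would be to reject a job whose jobMinAvailable is below the sum of its tasks' minAvailable. The comment does not specify the direction of the check, so that rule is an assumption, and `TaskMin` and `ValidateMinAvailable` are hypothetical names rather than Volcano's webhook API.

```go
package main

import "fmt"

// TaskMin is a hypothetical, simplified view of one task's counts.
type TaskMin struct {
	Name         string
	Replicas     int32
	MinAvailable int32
}

// ValidateMinAvailable checks jobMinAvailable against both the total
// replicas and the total of the tasks' minAvailable.
func ValidateMinAvailable(jobMinAvailable int32, tasks []TaskMin) error {
	var totalReplicas, totalTaskMin int32
	for _, t := range tasks {
		totalReplicas += t.Replicas
		totalTaskMin += t.MinAvailable
	}

	// The kind of check described as already present:
	// jobMinAvailable compared with totalReplicas.
	if jobMinAvailable > totalReplicas {
		return fmt.Errorf("job minAvailable %d is greater than total replicas %d",
			jobMinAvailable, totalReplicas)
	}

	// The proposed additional check (assumed direction).
	if jobMinAvailable < totalTaskMin {
		return fmt.Errorf("job minAvailable %d is less than the sum of task minAvailable %d",
			jobMinAvailable, totalTaskMin)
	}
	return nil
}
```

Under this rule the example above (jobMinAvailable=2, totalTaskMinAvailable=3) would be rejected at admission time, so the ambiguous case would never reach the min resource calculation.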

One more thing is worth noting: in the allocate action, tasks are allocated from high priority to low priority until jobMinAvailable tasks have been allocated; only then is the job ready to commit.
Consider this scenario: a queue with capacity 2C, and a job with jobMinAvailable=2 that has two masters, each requesting 1C, with taskMinAvailable=1, and two workers, each requesting 0.5C, with taskMinAvailable=2. The ideal allocation process is 1 master and then 2 workers. But the current allocation process allocates the 2 masters (high priority) first, which uses up the queue capacity, so the job cannot become ready. (Maybe we need to enqueue the tasks that count toward min available into the task queue first.)

	for taskID, minNum := range ji.TaskMinAvailable {
		if occupiedMap[taskID] < minNum {
			klog.V(4).Infof("Job %s/%s Task %s occupied %v less than task min avaliable", ji.Namespace, ji.Name, taskID, occupiedMap[taskID])
			return false
		}
	}
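
The scenario above can be reproduced with a small simulation. This is a simplified model, not scheduler code: `allocate` and `ready` are hypothetical helpers, only CPU is tracked, and `ready` combines the per-task check quoted above with a plain count against jobMinAvailable.

```go
package main

// allocate places pods in the given order, skipping any pod that no longer
// fits in the remaining queue capacity, and returns the per-task pod count.
// Each entry of order is the task name of one pod.
func allocate(order []string, milliCPU map[string]int64, capacity int64) map[string]int {
	occupied := map[string]int{}
	for _, taskID := range order {
		if milliCPU[taskID] <= capacity {
			capacity -= milliCPU[taskID]
			occupied[taskID]++
		}
	}
	return occupied
}

// ready reports whether the job-level and the per-task minimums are met.
func ready(occupied, taskMinAvailable map[string]int, jobMinAvailable int) bool {
	total := 0
	for _, n := range occupied {
		total += n
	}
	if total < jobMinAvailable {
		return false
	}
	// Same shape as the quoted check: every task must reach its own minimum.
	for taskID, minNum := range taskMinAvailable {
		if occupied[taskID] < minNum {
			return false
		}
	}
	return true
}
```

With `milliCPU = {"master": 1000, "worker": 500}`, `taskMinAvailable = {"master": 1, "worker": 2}`, and a capacity of 2000, the priority order `master, master, worker, worker` ends with two masters and no workers, so `ready` returns false. The order `master, worker, worker, master` fits one master and two workers, and `ready` returns true.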

@lowang-bh
Member Author

trigger CI

@lowang-bh lowang-bh closed this Aug 16, 2023
@lowang-bh lowang-bh reopened this Aug 16, 2023
@lowang-bh lowang-bh closed this Aug 17, 2023
@lowang-bh lowang-bh reopened this Aug 17, 2023
@volcano-sh-bot volcano-sh-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 26, 2023
@lowang-bh
Member Author

/assign @k82cn @kevin-wangzefeng

@lowang-bh
Member Author

/assign @Monokaix

@Monokaix
Member

Monokaix commented Feb 1, 2024

Hi, please resolve the code conflicts.

@volcano-sh-bot volcano-sh-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 5, 2024
@volcano-sh-bot volcano-sh-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 5, 2024
@Monokaix
Member

Monokaix commented Jun 6, 2024

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 6, 2024
@lowang-bh lowang-bh closed this Jun 13, 2024
@lowang-bh lowang-bh reopened this Jun 13, 2024
@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 13, 2024
Member

@william-wang william-wang left a comment

/approve

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 13, 2024
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: william-wang


@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 13, 2024
@volcano-sh-bot volcano-sh-bot merged commit 67fcdc3 into volcano-sh:master Jun 13, 2024
14 checks passed
@lowang-bh lowang-bh deleted the calPGMinresource branch June 13, 2024 12:36
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

PG minresource has some problems after PR#1459
10 participants