-
Notifications
You must be signed in to change notification settings - Fork 963
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pod Scheduling Readiness #3658
Pod Scheduling Readiness #3658
Conversation
Does this change have the same purpose as this PR #3612 ? |
Hi,I think there is no necessary to submit two PRs, just based the previous PR and update it OK,and please also update the design doc. |
Please refer to kube-scheduler to add a featuregate to enable/disable this feature. |
PR summary: Pod Scheduling gates is a new feature in K8S, which uses the
Future Works that are not covered in this PR:
|
Commit Author: /assign @ykcai-daniel |
/assign @Monokaix |
/assign @googs1025 |
/assign @lowang-bh @hwdef |
pkg/scheduler/api/job_info.go
Outdated
@@ -228,6 +230,17 @@ func (ti *TaskInfo) ClearLastTxContext() { | |||
ti.LastTransaction = nil | |||
} | |||
|
|||
// Return if the pod of a task is scheduling gated by checking if length of sch gates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments: if length of sch gates
is zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the typo. I will fix the commit
pkg/scheduler/api/job_info.go
Outdated
@@ -228,6 +230,17 @@ func (ti *TaskInfo) ClearLastTxContext() { | |||
ti.LastTransaction = nil | |||
} | |||
|
|||
// Return if the pod of a task is scheduling gated by checking if length of sch gates | |||
// When the Pod is not yet created or sch gates field not set, return false | |||
func (ti *TaskInfo) SchedulingGated() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about store this value in taskInfo, instead of a function calling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a field schGated
into taskInfo
@@ -142,7 +142,7 @@ func (cp *capacityPlugin) OnSessionOpen(ssn *framework.Session) { | |||
} | |||
|
|||
if job.PodGroup.Status.Phase == scheduling.PodGroupInqueue { | |||
attr.inqueue.Add(job.GetMinResources()) | |||
attr.inqueue.Add(job.DeductSchGatedResources(job.GetMinResources())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its better to do the calcalation in GetMinResources function. Let GetMinResources return the min resource to run the job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add this function into GetMinresources
have some downsides. The motivation of DeductSchGatedResources
is to deduct the resources of scheduling gated tasks in a job when calculating inqueued resources so that it will not block other jobs from being inqueued. Therefore, this function is will only be used in the queue-related plugins (proporation, capacity, overcommit).
On the other hand, GetMinResources
is a function that returns the minimum resource requirement of a Job, which should be an immutable quantity determined at the creation of the PodGroup. This quantity should not change regardless of scheduling gated tasks. Unlike DeductSchGatedResources, apart from inqueue resource calculation, GetMinResources is also used in other places, such as GetElasticResources
If we include DeductSchGatedResources
into GetMinresources
and tightly couple these two functions, this will bring some runtime information about the state of Pods into a function that should return immutable information, making it bug-prone when future developers use it.
@lowang-bh cc @Monokaix Pull request updated. Please review the changes. Thank you! |
pkg/scheduler/api/job_info.go
Outdated
|
||
// ti.SchGated is initialized in NewTaskInfo | ||
func (ti *TaskInfo) SchedulingGated() bool { | ||
return ti.SchGated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this filed be exported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this function, there is no need to export it. I will change it to schGated
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to use this filed instead of a function calling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an optional way, but seems there is no need to expose an extra filed of task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's reasonable.
@@ -282,18 +282,18 @@ func (cp *capacityPlugin) OnSessionOpen(ssn *framework.Session) { | |||
queue := ssn.Queues[queueID] | |||
// If no capability is set, always enqueue the job. | |||
if attr.realCapability == nil { | |||
klog.V(4).Infof("Capability of queue <%s> was not set, allow job <%s/%s> to Inqueue.", | |||
klog.V(3).Infof("Capability of queue <%s> was not set, allow job <%s/%s> to Inqueue.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change these log level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this log level when debugging the capcacity plugin. Will change is back
pkg/scheduler/api/job_info.go
Outdated
// If resource is less than gated resources, return zero; | ||
// Gated resources might be more than minResource when total rep of each task > minAvailable | ||
func (ji *JobInfo) DeductSchGatedResources(res *Resource) *Resource { | ||
result := res.Clone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can check whether res is zero to have a shortcut here, because in most case task is not gated.
For more details about the design of the PR, see #3581 . Please also merge the design doc. |
// If resource is less than gated resources, return zero; | ||
// Note: The purpose of this functionis to deduct the resources of scheduling gated tasks | ||
// in a job when calculating inqueued resources so that it will not block other jobs from being inqueued. | ||
func (ji *JobInfo) DeductSchGatedResources(res *Resource) *Resource { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should start with DeductSchGatedResources
to be consistent with func name.
pkg/scheduler/api/job_info.go
Outdated
// Note: The purpose of this functionis to deduct the resources of scheduling gated tasks | ||
// in a job when calculating inqueued resources so that it will not block other jobs from being inqueued. | ||
func (ji *JobInfo) DeductSchGatedResources(res *Resource) *Resource { | ||
result := res.Clone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems there is no need to clone res
because no one will also change it, and also we can change to:
schGatedResource := ji.GetSchGatedPodResources()
// Most jobs do not have any scheduling gated tasks, hence we add this short cut
if schGatedResource.IsEmpty() {
return res
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, we don't clone res
if scheduling gated resource is zero.
17bbc42
to
5bf6f62
Compare
9aee58a
to
64b16e1
Compare
return CreateDeploymentGated(ctx, name, rep, img, req, []v1.PodSchedulingGate{}) | ||
} | ||
|
||
// CreateDeployment creates a new deployment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CreateDeployment -》 CreateDeploymentGated
82cef03
to
6ab5464
Compare
Signed-off-by: ykcai-daniel <[email protected]>
6ab5464
to
757c682
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Monokaix The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Updated PR of #3612, a gated task no longer blocks the job. Do not use new state.