ignoring terminated pods in scaledown #3545
Conversation
Welcome @dbenque!
This mostly makes sense to me. I think my big concern is that we would mark a node for deletion while a pod was in the middle of terminating, but it seems like the deletion-timestamp logic should help us prevent that.
It seems like there are no tests in simulator/drain_test that enable this flag. Is it possible to test there as well? (Or did I just miss it?)
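For readers following along, here is a minimal sketch of the kind of deletion-timestamp check being discussed. The helper name isLongTerminating and the extra grace buffer are illustrative assumptions, not the actual cluster-autoscaler helper.

```go
package example

import (
	"time"

	apiv1 "k8s.io/api/core/v1"
)

// isLongTerminating reports whether a pod has been terminating for longer
// than its grace period plus an extra buffer. Sketch only: the real
// cluster-autoscaler helper and thresholds may differ.
func isLongTerminating(pod *apiv1.Pod, now time.Time, extraBuffer time.Duration) bool {
	if pod.DeletionTimestamp == nil {
		return false // pod is not being deleted at all
	}
	grace := int64(30) // Kubernetes default terminationGracePeriodSeconds
	if pod.Spec.TerminationGracePeriodSeconds != nil {
		grace = *pod.Spec.TerminationGracePeriodSeconds
	}
	deadline := pod.DeletionTimestamp.Add(time.Duration(grace)*time.Second + extraBuffer)
	return now.After(deadline)
}
```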
@@ -414,7 +414,7 @@ func (sd *ScaleDown) checkNodeUtilization(timestamp time.Time, node *apiv1.Node,
 		return simulator.ScaleDownDisabledAnnotation, nil
 	}

-	utilInfo, err := simulator.CalculateUtilization(node, nodeInfo, sd.context.IgnoreDaemonSetsUtilization, sd.context.IgnoreMirrorPodsUtilization, sd.context.CloudProvider.GPULabel())
+	utilInfo, err := simulator.CalculateUtilization(node, nodeInfo, sd.context.IgnoreDaemonSetsUtilization, sd.context.IgnoreMirrorPodsUtilization, sd.context.IgnorePodsThatShouldBeTerminated, sd.context.CloudProvider.GPULabel())
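As a rough illustration of what the extra flag does (under assumed names; this is not the actual simulator code), the utilization calculation can skip pods that already carry a deletion timestamp when summing up requests:

```go
package example

import (
	apiv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// sumCPURequests adds up the CPU requests that should count towards node
// utilization. When skipTerminated is true, pods that already have a
// deletion timestamp are ignored, so a node full of terminating pods can
// still be considered for scale-down. Sketch only: the real
// CalculateUtilization also handles memory, GPUs, mirror pods and
// DaemonSet pods.
func sumCPURequests(pods []*apiv1.Pod, skipTerminated bool) resource.Quantity {
	total := resource.Quantity{}
	for _, pod := range pods {
		if skipTerminated && pod.DeletionTimestamp != nil {
			continue // pod is on its way out; don't count it against the node
		}
		for _, c := range pod.Spec.Containers {
			if cpu, ok := c.Resources.Requests[apiv1.ResourceCPU]; ok {
				total.Add(cpu)
			}
		}
	}
	return total
}
```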
I do wonder if we should convert CalculateUtilization to take a context instead of continuing to grow the argument list? This is not a blocker for me, just curious what others think.
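A sketch of the refactor being floated here: folding the boolean knobs into a single options struct so the signature stops growing with every new flag. The names (UtilizationOptions and the commented-out signature) are hypothetical, not the refactor that was actually merged.

```go
package example

// UtilizationOptions groups the knobs that currently travel as separate
// boolean arguments; adding a new option no longer changes the signature.
type UtilizationOptions struct {
	IgnoreDaemonSetsUtilization      bool
	IgnoreMirrorPodsUtilization      bool
	IgnorePodsThatShouldBeTerminated bool
	GPULabel                         string
}

// CalculateUtilization would then accept the struct instead of a growing
// parameter list (hypothetical signature):
//
//	func CalculateUtilization(node *apiv1.Node, nodeInfo *schedulerframework.NodeInfo, opts UtilizationOptions) (UtilizationInfo, error)
```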
Yes, this is something I was thinking about as well, but I did not want to introduce too many changes in this PR.
Depending on what others think, I can make the modification in this PR, or refactor it in a follow-up PR.
I think your instinct not to introduce too many changes is a good one. It just struck me as odd while reading the code.
@elmiko, yes, the logic is preventing this because the pod should be …
Force-pushed from 84a2f9f to facb95a.
Hi, thanks for the contribution! I have some comments, but nothing major; the overall approach LGTM.
Force-pushed from f2a16ed to 9b53c40.
@towca thanks a lot for your review.
Force-pushed from 9b53c40 to 2c3c84c.
Thanks for addressing my comments, and sorry for the long review delay. The comments are mostly test-related; sorry for the amount, but I didn't focus on tests in the first pass.
Force-pushed from 4a2008d to 8eff34d.
@towca thanks for your second review.
Thanks so much for this contribution and for taking the time to improve some unrelated tests!
You have some gofmt errors that you need to resolve (see Travis). Could you also squash the commits into 1 while you're at it?
Force-pushed from 8eff34d to 2529b66.
@towca Thanks for your reviews. I have fixed the small gofmt problem and squashed the commits.
Thanks!
@towca: GitHub didn't allow me to assign the following users: for, approval. Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/approve
[APPROVALNOTIFIER] This PR is APPROVED.
This pull request has been approved by: dbenque, towca, vivekbagade. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
[cluster-autoscaler] Backporting #3545 to release 1.18
[cluster-autoscaler] Backporting #3545 to release 1.19
Fix proposal for #3407