fix(clusterstate): invalidate instance cache when scaling down #6337
Conversation
Force-pushed from 03b621f to 2166ddd.
/assign vadasambar
Thank you for the PR!
// NodeName is the name of the node to be deleted.
NodeName string
// Node is the node to be deleted.
Node *apiv1.Node
Do we need to use `Node` instead of `NodeName`? I see we are only using the name field of the node in the code (in this PR).
I use `Node` instead of `NodeName` so that the `scaleDownRequest` can be deleted when there are no instances on the cloud provider side. If we used `NodeName`, I would have to get the node object from a lister before calling the `HasInstance` function.
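(Not part of the PR — just to make that trade-off concrete, a minimal sketch of the `NodeName`-based alternative; the client-go node lister here is an illustrative stand-in, not necessarily what CA would actually wire in.)

```go
import (
	v1lister "k8s.io/client-go/listers/core/v1"

	"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
)

// Sketch only: keeping just NodeName on the request forces an extra lister
// lookup before the cloud provider check can run.
func hasCloudInstanceByName(provider cloudprovider.CloudProvider, nodes v1lister.NodeLister, nodeName string) (bool, error) {
	node, err := nodes.Get(nodeName) // extra lookup avoided by storing the node object on the request
	if err != nil {
		return false, err
	}
	return provider.HasInstance(node)
}
```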
// Delete the scaleDownRequest if there is no instance on the cloud provider side;
// otherwise, fall back to checking the delete time.
hasInstance, err := csr.cloudProvider.HasInstance(scaleDownRequest.Node)
if err == nil && !hasInstance {
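For context around the quoted hunk, the surrounding cleanup pass would look roughly like the sketch below (not the exact PR diff; the receiver, the `scaleDownRequests` slice, and the `ExpectedDeleteTime` field are paraphrased from how the snippet above is used):

```go
// Sketch of the scale-down request cleanup pass: drop a request as soon as the
// cloud provider no longer reports an instance for its node, instead of waiting
// for the expected delete time to pass.
func (csr *ClusterStateRegistry) cleanUpScaleDownRequests(currentTime time.Time) {
	remaining := csr.scaleDownRequests[:0]
	for _, req := range csr.scaleDownRequests {
		hasInstance, err := csr.cloudProvider.HasInstance(req.Node)
		if err == nil && !hasInstance {
			// Instance is already gone on the cloud provider side; the request
			// can be dropped right away.
			continue
		}
		if req.ExpectedDeleteTime.After(currentTime) {
			remaining = append(remaining, req)
		}
	}
	csr.scaleDownRequests = remaining
}
```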
👀
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: qianlei90, vadasambar
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/lgtm
@BigDarkClown @x13n can you please take a look at this PR? 🙏
@qianlei90 can you rebase this PR?
Force-pushed from 2166ddd to bfff113.
New changes are detected. LGTM label has been removed.
/hold
Since there's no linked issue, can you clarify what bug you're trying to fix with this change?
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
What type of PR is this?
/kind bug
What this PR does / why we need it:
After CA finishes scaling down, the instance cache still exists in the ClusterStateRegistry, causing CA to mistakenly believe that there are some unregistered nodes until the cache is refreshed (`CloudProviderNodeInstancesCacheRefreshInterval = 2 * time.Minute`). This PR invalidates the cache once the scale-down is completed.
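A minimal sketch of that intent (the `InvalidateNodeInstancesCacheEntry` helper and the `NodeGroup` field on the request are assumptions for illustration, not a quote of the PR diff):

```go
// Sketch: when the cloud provider confirms the instance is gone, drop the
// scale-down request and invalidate the per-node-group instance cache so CA
// stops reporting the deleted node as unregistered until the periodic
// (2-minute) refresh would have caught up anyway.
func (csr *ClusterStateRegistry) dropCompletedScaleDown(req ScaleDownRequest) bool {
	hasInstance, err := csr.cloudProvider.HasInstance(req.Node)
	if err != nil || hasInstance {
		return false // instance still present (or state unknown): keep the request
	}
	csr.InvalidateNodeInstancesCacheEntry(req.NodeGroup) // assumed helper and field names
	return true
}
```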
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: