fix: emit pod nomination events and add provisioning node tainted trigger controller #933

Merged
merged 19 commits into kubernetes-sigs:main from the emitAndEventFilter branch
Feb 2, 2024

Contributor

@njtran njtran commented Jan 10, 2024

Fixes #N/A

Description
This is a breaking change for users whose logging relies on messages emitted by the provisioning.trigger controller. The PR adds a second reconciler, so the provisioning trigger now consists of two controllers: provisioning.nodetrigger, which triggers the batching mechanism for any tainted node, and provisioning.podtrigger, which behaves the same as the former provisioning.trigger.
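The split into two triggers that feed one batching mechanism can be sketched as follows. This is a minimal, dependency-free illustration and not the code from this PR: the `Batcher`, `Pod`, and `Node` types, the `watchedTaint` key, and the rule that the pod trigger fires only for pods that cannot yet be scheduled are all assumptions made for the example.

```go
package main

import "fmt"

// Batcher is a hypothetical stand-in for the provisioner's batching
// mechanism. Repeated triggers coalesce into one pending signal.
type Batcher struct {
	trigger chan struct{}
}

func NewBatcher() *Batcher {
	return &Batcher{trigger: make(chan struct{}, 1)}
}

// Trigger requests a scheduling pass without blocking the caller.
func (b *Batcher) Trigger() {
	select {
	case b.trigger <- struct{}{}:
	default: // a trigger is already pending, nothing to do
	}
}

// Pending reports whether a trigger is waiting, and consumes it.
func (b *Batcher) Pending() bool {
	select {
	case <-b.trigger:
		return true
	default:
		return false
	}
}

// Pod and Node are simplified, hypothetical types for illustration.
type Pod struct {
	Name          string
	Provisionable bool // assumption: pending and not yet schedulable
}

type Node struct {
	Name   string
	Taints []string
}

// watchedTaint is a made-up taint key, not a real Karpenter constant.
const watchedTaint = "example.com/disruption"

// ReconcilePod plays the role of provisioning.podtrigger.
func ReconcilePod(b *Batcher, p Pod) {
	if p.Provisionable {
		b.Trigger()
	}
}

// ReconcileNode plays the role of provisioning.nodetrigger.
func ReconcileNode(b *Batcher, n Node) {
	for _, t := range n.Taints {
		if t == watchedTaint {
			b.Trigger()
			return
		}
	}
}

func main() {
	b := NewBatcher()
	ReconcilePod(b, Pod{Name: "web-0", Provisionable: true})
	ReconcileNode(b, Node{Name: "node-a", Taints: []string{watchedTaint}})
	fmt.Println("scheduling pass requested:", b.Pending())
}
```

Both reconcilers call the same `Trigger` method, so a tainted node and a pending pod arriving together still produce a single batch.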

This fixes an issue where successive disruption controller reconciliations did not consider the results of the previous (and still active) disruption command. As a result, Karpenter could disrupt a node that had recently been the target of a scheduling loop, when it should instead wait and consider another node for disruption.

How was this change tested?
make presubmit
kwok cloud provider

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 10, 2024
@njtran njtran marked this pull request as draft January 10, 2024 23:30
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 10, 2024
@coveralls

coveralls commented Jan 10, 2024

Pull Request Test Coverage Report for Build 7746983532

  • 60 of 119 (50.42%) changed or added relevant lines in 12 files are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.2%) to 80.435%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/controllers/disruption/drift.go | 4 | 5 | 80.0% |
| pkg/controllers/disruption/expiration.go | 4 | 5 | 80.0% |
| pkg/controllers/disruption/consolidation.go | 13 | 16 | 81.25% |
| pkg/controllers/disruption/emptynodeconsolidation.go | 5 | 8 | 62.5% |
| pkg/controllers/disruption/singlenodeconsolidation.go | 5 | 8 | 62.5% |
| pkg/controllers/provisioning/provisioner.go | 3 | 6 | 50.0% |
| pkg/controllers/disruption/helpers.go | 1 | 5 | 20.0% |
| pkg/controllers/disruption/multinodeconsolidation.go | 11 | 16 | 68.75% |
| pkg/controllers/provisioning/controller.go | 0 | 36 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| pkg/controllers/provisioning/controller.go | 1 | 0.0% |
| pkg/test/expectations/expectations.go | 2 | 94.92% |

Totals Coverage Status
Change from base Build 7737948211: -0.2%
Covered Lines: 7869
Relevant Lines: 9783

💛 - Coveralls

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 13, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 23, 2024
@njtran njtran marked this pull request as ready for review January 23, 2024 18:34
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 23, 2024
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 24, 2024
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 24, 2024
@njtran njtran changed the title fix: emit pod nomination events and add watch for tainted nodes fix: emit pod nomination events and add provisioning node tainted trigger controller Jan 28, 2024
Member

@jonathan-innis jonathan-innis left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 29, 2024
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 1, 2024
Member

@jonathan-innis jonathan-innis left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jonathan-innis, njtran

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jonathan-innis,njtran]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@njtran
Contributor Author

njtran commented Feb 2, 2024

/remove-hold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 2, 2024
@k8s-ci-robot k8s-ci-robot merged commit 49a70dc into kubernetes-sigs:main Feb 2, 2024
12 checks passed
@njtran njtran deleted the emitAndEventFilter branch February 2, 2024 17:21
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
4 participants