Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset all AdmissionChecks on Eviction to Pending #3323

Merged
merged 5 commits into from
Oct 28, 2024

Conversation

PBundyra
Copy link
Contributor

@PBundyra PBundyra commented Oct 25, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

Currently, when AdmissionCheck is in Retry state, the Workloads gets evicted and there's no actual retry. The changes improves that flow, changing all AdmissionChecks to Pending if one of them is in Retry state (unless there's a Rejected AdmissionCheck).

Which issue(s) this PR fixes:

Fixes #3322

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Workload is requeued with all AdmissionChecks set to Pending if there was an AdmissionCheck in Retry state.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels Oct 25, 2024
@PBundyra
Copy link
Contributor Author

/assign @mimowo

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 25, 2024
Copy link

netlify bot commented Oct 25, 2024

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 1451b4e
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/671f97c1d83c5100085ca0aa

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 25, 2024
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 28, 2024
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 28, 2024
pkg/workload/workload.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 28, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 28, 2024
@mimowo
Copy link
Contributor

mimowo commented Oct 28, 2024

I think we actually need to reset the checks regardless of the reason of eviction. For example, if the checks passed, but the workload is evicted due to WaitForPodsReady?

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 28, 2024
@mimowo
Copy link
Contributor

mimowo commented Oct 28, 2024

When deactivating (evicting with deactivation reason) a workload we reset the requeueState so that it is "clean" or re-activation:

if wl.Status.RequeueState != nil {
wl.Status.RequeueState = nil
updated = true
}

I think for consistency we should do the same with checks, but we can re-visit this as a follow up.

/lgtm
/approve
/hold cancel

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Oct 28, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 2c3dd1a32282d6a662b4c6c8192d1f63d446f7e9

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mimowo, PBundyra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 07352e4 into kubernetes-sigs:main Oct 28, 2024
16 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.9 milestone Oct 28, 2024
@tenzen-y
Copy link
Member

It seems that this is obviously bug.
/cherry-pick release-0.8

@k8s-infra-cherrypick-robot
Copy link
Contributor

@tenzen-y: #3323 failed to apply on top of branch "release-0.8":

Applying: Make AdmissionChecks Pending after one is in Retry state
Applying: Reduce number of patch requests
Using index info to reconstruct a base tree...
M	pkg/controller/core/workload_controller.go
M	pkg/scheduler/scheduler.go
M	pkg/workload/admissionchecks.go
M	pkg/workload/workload.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/workload/workload.go
CONFLICT (content): Merge conflict in pkg/workload/workload.go
Auto-merging pkg/workload/admissionchecks.go
Auto-merging pkg/scheduler/scheduler.go
CONFLICT (content): Merge conflict in pkg/scheduler/scheduler.go
Auto-merging pkg/controller/core/workload_controller.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0002 Reduce number of patch requests

In response to this:

It seems that this is obviously bug.
/cherry-pick release-0.8

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@PBundyra PBundyra changed the title Reset all Ready and Ready AdmissionChecks on Eviction as Pending Reset all AdmissionChecks on Eviction to Pending Oct 29, 2024
@mimowo
Copy link
Contributor

mimowo commented Oct 29, 2024

/cherry-pick release-0.8

I guess we could cherry-pick but then we also need to cherry-pick the forced-ownership PR which might be more that we want.

tenzen-y pushed a commit to tenzen-y/kueue that referenced this pull request Oct 31, 2024
…ernetes-sigs#3323)

* Make AdmissionChecks Pending after one is in Retry state

* Reduce number of patch requests

* Improve message, fix e2e test

* Improve naming and message on retry

* Reset check on different evictions
tenzen-y pushed a commit to tenzen-y/kueue that referenced this pull request Oct 31, 2024
…ernetes-sigs#3323)

* Make AdmissionChecks Pending after one is in Retry state

* Reduce number of patch requests

* Improve message, fix e2e test

* Improve naming and message on retry

* Reset check on different evictions
k8s-ci-robot pushed a commit that referenced this pull request Oct 31, 2024
…) (#3395)

* Make AdmissionChecks Pending after one is in Retry state

* Reduce number of patch requests

* Improve message, fix e2e test

* Improve naming and message on retry

* Reset check on different evictions

Co-authored-by: Patryk Bundyra <[email protected]>
PBundyra added a commit to PBundyra/kueue that referenced this pull request Nov 5, 2024
…ernetes-sigs#3323)

* Make AdmissionChecks Pending after one is in Retry state

* Reduce number of patch requests

* Improve message, fix e2e test

* Improve naming and message on retry

* Reset check on different evictions
@PBundyra PBundyra deleted the improve-ac-retry branch November 5, 2024 13:18
kannon92 pushed a commit to openshift-kannon92/kubernetes-sigs-kueue that referenced this pull request Nov 19, 2024
…ernetes-sigs#3323)

* Make AdmissionChecks Pending after one is in Retry state

* Reduce number of patch requests

* Improve message, fix e2e test

* Improve naming and message on retry

* Reset check on different evictions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Workload is stuck after one of the AdmissionCheck is in Retry state
5 participants