Automated cherry pick of #1838: WaitForPodsReady: Reset the requeueState while reconciling #1843

Member

@tenzen-y tenzen-y commented Mar 14, 2024

Cherry pick of #1838 on release-0.6.
#1838: WaitForPodsReady: Reset the requeueState while reconciling
For details on the cherry pick process, see the cherry pick requests page.

WaitForPodsReady: Fix a bug where the requeueState isn't reset.

@k8s-ci-robot k8s-ci-robot added this to the v0.6 milestone Mar 14, 2024
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 14, 2024
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 14, 2024

netlify bot commented Mar 14, 2024

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 41a8ac3
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/65f32d4a0bcbb10008222200

@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Mar 14, 2024
@tenzen-y tenzen-y force-pushed the automated-cherry-pick-of-#1838-upstream-release-0.6 branch from 0ac8d04 to 41a8ac3 Compare March 14, 2024 17:00
@tenzen-y
Member Author

/assign @alculquicondor

@tenzen-y
Member Author

2024-03-14T17:03:16.805955051Z ERROR controller-runtime.test-env envtest/server.go:317 unable to start the controlplane {"tries": 0, "error": "timeout waiting for process kube-apiserver to start"}

Failed to start the control-plane.

/test pull-kueue-test-integration-release-0-6

@tenzen-y
Member Author

2024-03-14T17:03:16.805955051Z ERROR controller-runtime.test-env envtest/server.go:317 unable to start the controlplane {"tries": 0, "error": "timeout waiting for process kube-apiserver to start"}

Failed to start the control-plane.

/test pull-kueue-test-integration-release-0-6

Wait, it seems that other errors happened:

2024-03-14T17:07:38.313021161Z ERROR controller/controller.go:329 Reconciler error {"controller": "rayjob", "controllerGroup": "ray.io", "controllerKind": "RayJob", "RayJob": {"name":"test-job","namespace":"core-l2gkb"}, "namespace": "core-l2gkb", "name": "test-job", "reconcileID": "7e6da7c5-2f19-4364-af73-462799c71413", "error": "Operation cannot be fulfilled on workloads.kueue.x-k8s.io "rayjob-test-job-715ab": StorageError: invalid object, Code: 4, Key: /registry/kueue.x-k8s.io/workloads/core-l2gkb/rayjob-test-job-715ab, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 609c5df0-dfe8-41e7-bd0e-94889424872f, UID in object meta: "}

@alculquicondor
Contributor

Precondition failed: UID in precondition:

I have seen this kind of error when the server is terminating, or when objects are being deleted while another API update call is still in flight.

Contributor

@alculquicondor alculquicondor left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 14, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 11619245f48f9536e7c42b494d6250ebea2880a6

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, tenzen-y

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alculquicondor,tenzen-y]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tenzen-y
Member Author

Precondition failed: UID in precondition:

I have seen this kind of error when the server is terminating, or when objects are being deleted while another API update call is still in flight.

I see. Thank you for letting me know.

@k8s-ci-robot k8s-ci-robot merged commit 3adac90 into kubernetes-sigs:release-0.6 Mar 14, 2024
14 checks passed
@alculquicondor
Contributor

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 14, 2024
@tenzen-y tenzen-y deleted the automated-cherry-pick-of-#1838-upstream-release-0.6 branch March 15, 2024 10:02
Ygnas pushed a commit to Ygnas/kueue that referenced this pull request Sep 3, 2024