Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestPatch is flaky #8478

Closed
killianmuldoon opened this issue Apr 5, 2023 · 7 comments · Fixed by #9914
Closed

TestPatch is flaky #8478

killianmuldoon opened this issue Apr 5, 2023 · 7 comments · Fixed by #9914
Labels
area/clusterclass Issues or PRs related to clusterclass area/testing Issues or PRs related to testing help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/flake Categorizes issue or PR as related to a flaky test. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@killianmuldoon
Copy link
Contributor

Which jobs are flaking?

Unit test runs on 1.4 and main.

Which tests are flaking?

sigs.k8s.io/cluster-api/internal/util/ssa.TestPatch

Since when has it been flaking?

This test has been flaking since it was introduced.

Testgrid link

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/periodic-cluster-api-test-release-1-4/1640193879221211136

Reason for failure (if possible)

I did a little bit of debugging on this one and found that sometimes the resourceVersion of the object being patched (but only for the MachineTests in my reproduction) is bumped along with the time signature in the managed fields, but there is no other change on the object.

This results in the ssa call not being cached, as it checks the resourceVersion, unexpectedly.

Anything else we need to know?

No response

Label(s) to be applied

/kind flake
/area testing

@k8s-ci-robot k8s-ci-robot added kind/flake Categorizes issue or PR as related to a flaky test. area/testing Issues or PRs related to testing needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 5, 2023
@fabriziopandini
Copy link
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 5, 2023
@chrischdi
Copy link
Member

k8s-triage search query

@killianmuldoon
Copy link
Contributor Author

/help

@k8s-ci-robot
Copy link
Contributor

@killianmuldoon:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jul 12, 2023
@killianmuldoon killianmuldoon added the area/clusterclass Issues or PRs related to clusterclass label Nov 1, 2023
@chrischdi
Copy link
Member

chrischdi commented Dec 21, 2023

Example:

Failed
=== RUN   TestPatch/Test_patch_with_Machine
    testing.go:1471: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test
--- FAIL: TestPatch/Test_patch_with_Machine (0.17s)

Permalink to check occurence.

Reproducible (not 100%) locally using:

export KUBEBUILDER_ASSETS=/path/to/my/kubebuilder-arm64-1.28.0
go test -race -count 120 -failfast -shuffle on -timeout 30s -run '^TestPatch$' sigs.k8s.io/cluster-api/internal/util/ssa

@chrischdi
Copy link
Member

chrischdi commented Dec 21, 2023

First analysis:

if err := c.Patch(ctx, modifiedUnstructured, client.Apply, patchOptions...); err != nil {

seems to patch the object again.
The only changes I was able to see was the bump of the .metadata.resourceVersion and timestamp inside the managed fields.

@chrischdi
Copy link
Member

Maybe related: kubernetes/kubernetes#118519

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/clusterclass Issues or PRs related to clusterclass area/testing Issues or PRs related to testing help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/flake Categorizes issue or PR as related to a flaky test. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants