Bug 1744245: fix e2e failure #1001
@tkashem: This pull request references Bugzilla bug 1737081, which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
7f61d32 to 07cbe84
Can we pull in @galletti94's test case and get both in this PR?
Back-to-back delete and recreate of a Subscription object causes the operator install to fail.

How to reproduce:
- Create a CatalogSource object.
- Create a Subscription that refers to the CatalogSource above.
- Wait for the operator to install successfully.
- Update the CatalogSource.
- Wait for the CatalogSource to become healthy.
- Delete the Subscription object (from above).
- Create the Subscription object (no time delay between delete and create).

Delete and Create can be done one after another; there is no need to make them concurrent.

The operator install will fail, and the Subscription status will have the error condition `ReferencedInstallPlanNotFound`. The new InstallPlan object created by OLM gets deleted by GC.

Root cause:
- OLM uses a lister to get the list of Subscription(s) in a given namespace and sets the relevant Subscription(s) found in the list as owner of the InstallPlan object(s).
- Because the lister uses a cache, it will return a deleted Subscription until the cache is synced.
- The new InstallPlan object may get an owner ref that points to the deleted Subscription.
- GC garbage collects the deleted Subscription and consequently deletes the new InstallPlan.
- The Subscription reconciler reports that the new InstallPlan object is missing and moves the Subscription to a Failed state.

The API audit log has entries that validate that GC is rightfully "deleting" the new InstallPlan object.

Fix:
- For now, use a direct non-cached client to retrieve the list of Subscriptions.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1744245
Jira: https://jira.coreos.com/browse/OLM-1245
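The race described above can be sketched with a toy model in plain Go. This is not OLM or client-go code: `APIServer`, `CachedLister`, `newInstallPlan`, `collectable`, and `scenario` are hypothetical names, and the object names and UIDs (`etcd`, `install-abc`, `uid-1`, `uid-2`) are made up for illustration. The sketch assumes that a recreated Subscription gets a new UID and that the garbage collector removes a dependent once none of its owner UIDs exist any more.

```go
package main

import "fmt"

// Subscription and InstallPlan are simplified stand-ins (assumption),
// modelling only the fields needed to show the race.
type Subscription struct{ Name, UID string }

type InstallPlan struct {
	Name      string
	OwnerUIDs []string
}

// APIServer is a toy source of truth, keyed by Subscription name.
type APIServer struct{ subs map[string]Subscription }

// ListDirect models a non-cached read: it always reflects current state.
func (a *APIServer) ListDirect() []Subscription {
	out := make([]Subscription, 0, len(a.subs))
	for _, s := range a.subs {
		out = append(out, s)
	}
	return out
}

// CachedLister returns a snapshot that is only refreshed by Sync.
type CachedLister struct {
	api      *APIServer
	snapshot []Subscription
}

func (c *CachedLister) Sync()                { c.snapshot = c.api.ListDirect() }
func (c *CachedLister) List() []Subscription { return c.snapshot }

// newInstallPlan sets every listed Subscription as an owner of the plan.
func newInstallPlan(name string, owners []Subscription) InstallPlan {
	ip := InstallPlan{Name: name}
	for _, s := range owners {
		ip.OwnerUIDs = append(ip.OwnerUIDs, s.UID)
	}
	return ip
}

// collectable reports whether the toy GC would delete the plan:
// it has owners, and none of the owner UIDs exist any more.
func collectable(api *APIServer, ip InstallPlan) bool {
	live := map[string]bool{}
	for _, s := range api.subs {
		live[s.UID] = true
	}
	for _, uid := range ip.OwnerUIDs {
		if live[uid] {
			return false
		}
	}
	return len(ip.OwnerUIDs) > 0
}

// scenario deletes and recreates a Subscription back to back, then
// creates an InstallPlan whose owners come from the stale cache or
// from a direct list. It returns true if the plan would be collected.
func scenario(useCache bool) bool {
	api := &APIServer{subs: map[string]Subscription{
		"etcd": {Name: "etcd", UID: "uid-1"},
	}}
	lister := &CachedLister{api: api}
	lister.Sync()

	// Delete and recreate: same name, new UID. The cache is not yet synced.
	delete(api.subs, "etcd")
	api.subs["etcd"] = Subscription{Name: "etcd", UID: "uid-2"}

	owners := api.ListDirect()
	if useCache {
		owners = lister.List()
	}
	return collectable(api, newInstallPlan("install-abc", owners))
}

func main() {
	fmt.Println("cached lister, plan collected:", scenario(true))  // true
	fmt.Println("direct list, plan collected:", scenario(false)) // false
}
```

With the stale snapshot, the new plan is owned by `uid-1`, which no longer exists, so the toy GC collects it. With the direct list, the plan is owned by `uid-2` and survives. The cost of the direct read in a real cluster is an extra API call per sync, which is presumably why the fix is described as "for now".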
07cbe84 to 71df193
Do you want both fixes in the same PR? This might delay other blocked PR(s). Can we get this merged to unblock folks, and then focus on the issue @galletti94 is working on?
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: ecordell, tkashem.
@tkashem: This pull request references Bugzilla bug 1744245, which is invalid:
/bugzilla refresh
@ecordell: This pull request references Bugzilla bug 1744245, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.
@tkashem: All pull requests linked via external trackers have merged. Bugzilla bug 1744245 has been moved to the MODIFIED state.