Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update catalog if image changes #816

Merged
merged 6 commits into from
May 7, 2019

Conversation

ecordell
Copy link
Member

No description provided.

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 16, 2019
@ecordell
Copy link
Member Author

/retest

@njhale njhale changed the title Udpate catalog if image changes Update catalog if image changes Apr 16, 2019
@jpeeler
Copy link

jpeeler commented Apr 17, 2019

/lgtm
/retest

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 17, 2019
@cuppett
Copy link

cuppett commented Apr 17, 2019

/test e2e-aws-olm

@ecordell
Copy link
Member Author

/retest

@ecordell
Copy link
Member Author

/hold
I have additional tests that are based on #820, so holding until that goes in.

/retest
(to test console tests)

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 23, 2019
@ecordell
Copy link
Member Author

/test e2e-aws-console-olm

@ecordell
Copy link
Member Author

/retest

@ecordell
Copy link
Member Author

/retest

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 24, 2019
@ecordell
Copy link
Member Author

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2019
@ecordell
Copy link
Member Author

ecordell commented Apr 24, 2019 via email

@ecordell
Copy link
Member Author

/retest

5 similar comments
@ecordell
Copy link
Member Author

/retest

@ecordell
Copy link
Member Author

/retest

@ecordell
Copy link
Member Author

/retest

@ecordell
Copy link
Member Author

/retest

@ecordell
Copy link
Member Author

/retest

@ecordell
Copy link
Member Author

ecordell commented May 4, 2019

/retest

@ecordell
Copy link
Member Author

ecordell commented May 4, 2019 via email

@ecordell
Copy link
Member Author

ecordell commented May 5, 2019

/retest

1 similar comment
@ecordell
Copy link
Member Author

ecordell commented May 5, 2019

/retest

@ecordell
Copy link
Member Author

ecordell commented May 6, 2019

/test unit

@ecordell
Copy link
Member Author

ecordell commented May 6, 2019

/test e2e-aws-olm

ecordell added 3 commits May 6, 2019 17:55
this was causing the operatorgroup test to hang
cleanup function was hanging, so explicitly clean up here
@openshift-ci-robot openshift-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 6, 2019
Copy link
Member Author

@ecordell ecordell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left comments explaining these changes

@@ -25,5 +25,8 @@ metadata:
rbac.authorization.k8s.io/aggregate-to-view: "true"
rules:
- apiGroups: ["operators.coreos.com"]
resources: ["clusterserviceversions", "catalogsources", "installplans", "subscriptions", "packagemanifests"]
resources: ["clusterserviceversions", "catalogsources", "installplans", "subscriptions", "operatorgroups"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the ui issue where project admins can't view the olm ui.

@@ -323,22 +323,6 @@ func (o *Operator) handleDeletion(obj interface{}) {
return
}

// Check the registry server
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this - all we need is the requeue that's done right below, and the main catalog sync takes care of this.

@@ -429,17 +413,22 @@ func (o *Operator) syncCatalogSources(obj interface{}) (syncError error) {
logger.Debug("catsrc configmap state good, checking registry pod")
}

reconciler := o.reconciler.ReconcilerForSource(catsrc)
if reconciler == nil {
srcReconciler := o.reconciler.ReconcilerForSource(catsrc)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name was shadowing the package name

// TODO: Add failure status on catalogsource and remove from sources
return fmt.Errorf("no reconciler for source type %s", catsrc.Spec.SourceType)
}

healthy, err := srcReconciler.CheckRegistryServer(catsrc)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the bulk of the feature change :) everything else is fixing flakes in the tests.

}
}

a.copyQueueIndexer.Enqueue(outCSV)
if !outCSV.IsUncopiable() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check for uncopiable before we enqueue, so we don't have to check on the dequeue

@@ -1234,11 +1190,11 @@ func (a *Operator) updateInstallStatus(csv *v1alpha1.ClusterServiceVersion, inst
}

// parseStrategiesAndUpdateStatus returns a StrategyInstaller and a Strategy for a CSV if it can, else it sets a status on the CSV and returns
func (a *Operator) parseStrategiesAndUpdateStatus(csv *v1alpha1.ClusterServiceVersion) (install.StrategyInstaller, install.Strategy, install.Strategy) {
func (a *Operator) parseStrategiesAndUpdateStatus(csv *v1alpha1.ClusterServiceVersion) (install.StrategyInstaller, install.Strategy) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findIntermediatesForDeletion was the only thing that needed to have both strategies.

ownedRole.SetOwnerReferences([]metav1.OwnerReference{ownerutil.NonBlockingOwner(targetCSV)})
if err := ownerutil.AddOwnerLabels(ownedRole, targetCSV); err != nil {
targetRole := ownedRole.DeepCopy()
targetRole.SetResourceVersion("0")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the resourceversion being set was causing creates to fail

honestly not sure why this was a flake and not just a flat out failure. I meant to go back and see why this ever worked.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me think we need some kind of "SafeDeepCopy".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea - like a CopyForCopy vs. CopyForUpdate

@@ -242,7 +242,7 @@ func (a *Operator) requirementStatus(strategyDetailsDeployment *install.Strategy
}

// permissionStatus checks whether the given CSV's RBAC requirements are met in its namespace
func (a *Operator) permissionStatus(strategyDetailsDeployment *install.StrategyDetailsDeployment, ruleChecker install.RuleChecker, csvNamespace string) (bool, []v1alpha1.RequirementStatus, error) {
func (a *Operator) permissionStatus(strategyDetailsDeployment *install.StrategyDetailsDeployment, ruleChecker install.RuleChecker, targetNamespace, serviceAccountNamespace string) (bool, []v1alpha1.RequirementStatus, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another bug - the targetnamespace and serviceaccount namespace are different for multinamespace. again, not sure why this flaked instead of failing always.

(it looks like it would fail always, but resync would fix, so any test with MultiNamespace would be very slow)


aCSV := newCSV(csvName, opGroupNamespace, "", semver.MustParse("0.0.0"), []apiextensions.CustomResourceDefinition{mainCRD}, nil, namedStrategy)
createdCSV, err := crc.OperatorsV1alpha1().ClusterServiceVersions(opGroupNamespace).Create(&aCSV)
require.NoError(t, err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the lack of ownerlabels meant the copying would fail

@@ -369,24 +369,28 @@ func cleanupOLM(t *testing.T, namespace string) {
var err error
err = waitForEmptyList(func() (int, error) {
res, err := crc.OperatorsV1alpha1().ClusterServiceVersions(namespace).List(metav1.ListOptions{FieldSelector: nonPersistentCSVFieldSelector})
t.Logf("%d %s remaining", len(res.Items), "csvs")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saw some cases where namespace cleanup failed, this helps to find those instances.

@ecordell
Copy link
Member Author

ecordell commented May 6, 2019

/test e2e-aws-console-olm

@ecordell
Copy link
Member Author

ecordell commented May 7, 2019

/retest

5 similar comments
@ecordell
Copy link
Member Author

ecordell commented May 7, 2019

/retest

@ecordell
Copy link
Member Author

ecordell commented May 7, 2019

/retest

@ecordell
Copy link
Member Author

ecordell commented May 7, 2019

/retest

@ecordell
Copy link
Member Author

ecordell commented May 7, 2019

/retest

@ecordell
Copy link
Member Author

ecordell commented May 7, 2019

/retest

@@ -914,7 +903,7 @@ func (o *Operator) createInstallPlan(namespace string, subs []*v1alpha1.Subscrip
func (o *Operator) requeueSubscription(name, namespace string) {
// we can build the key directly, will need to change if queue uses different key scheme
key := fmt.Sprintf("%s/%s", namespace, name)
o.subQueue.AddRateLimited(key)
o.subQueue.Add(key)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this change made?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't really need to rate limit our internal requeues - otherwise we can end up DOSing ourselves. In general we should only requeue when we have work to do

@tkashem
Copy link
Collaborator

tkashem commented May 7, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 7, 2019
@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alecmerdler, ecordell, jpeeler, tkashem

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alecmerdler,ecordell,jpeeler]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -418,6 +428,10 @@ func (a *Operator) ensureTenantRBAC(operatorNamespace, targetNamespace string, c
return err
}

if len(ownedRoles) == 0 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes this worth checking, but not the targetCSV? Perhaps this PR obsoletes #837, but the above was the biggest cache issue I saw.

@ecordell
Copy link
Member Author

ecordell commented May 7, 2019

/retest

@ecordell ecordell mentioned this pull request May 7, 2019
@openshift-ci-robot
Copy link
Collaborator

@ecordell: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws-console-olm 0061ece link /test e2e-aws-console-olm

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@ecordell ecordell merged commit 7c22f02 into operator-framework:master May 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants