Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve and unpack operators from bundle images #1081

Closed
wants to merge 9 commits into from

Conversation

jpeeler
Copy link

@jpeeler jpeeler commented Oct 18, 2019

No description provided.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 18, 2019
@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpeeler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 18, 2019
Copy link
Member

@njhale njhale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks promising! Here's the feedback/questions I have so far:

pkg/controller/registry/resolver/generation.go Outdated Show resolved Hide resolved
"github.com/operator-framework/operator-lifecycle-manager/pkg/api/apis/operators/v1alpha1"
"github.com/operator-framework/operator-registry/pkg/registry"
)

// Generation represents a set of operators and their required/provided API surfaces at a point in time.
type Generation interface {
AddOperator(o OperatorSurface) error
AddPendingOperator(l LaunchBundleImageInfo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other methods take an interface as input, which is a classic go idiom. Is there some way we can use OperatorSurface without changing its API (e.g. tack new fields on OperatorSourceInfo, for which there is already a getter)? If not, does it make sense to define a new interface?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will circle back to this.

pkg/controller/registry/resolver/generation.go Outdated Show resolved Hide resolved
pkg/controller/registry/resolver/evolver.go Outdated Show resolved Hide resolved
pkg/controller/registry/resolver/generation.go Outdated Show resolved Hide resolved
pkg/controller/registry/resolver/evolver.go Outdated Show resolved Hide resolved
pkg/controller/registry/resolver/querier.go Outdated Show resolved Hide resolved
pkg/controller/registry/resolver/resolver.go Show resolved Hide resolved
@njhale njhale changed the title WIP: (crude WIP) WIP: Resolve and unpack operators from bundle images Oct 28, 2019
@jpeeler jpeeler force-pushed the installplan-bundle branch from b192cb9 to 1f749ec Compare October 29, 2019 15:28
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 29, 2019
@jpeeler jpeeler force-pushed the installplan-bundle branch from 1f749ec to c4d3879 Compare October 29, 2019 15:41
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 29, 2019
@jpeeler jpeeler force-pushed the installplan-bundle branch from c4d3879 to 6b025da Compare October 31, 2019 16:47
@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 31, 2019
@jpeeler jpeeler force-pushed the installplan-bundle branch from 6b025da to 7a35d3b Compare November 1, 2019 16:51
@@ -132,12 +137,14 @@ func NewOperator(ctx context.Context, kubeconfigPath string, clock utilclock.Clo
client: crClient,
lister: lister,
namespace: operatorNamespace,
resolver: resolver.NewOperatorsV1alpha1Resolver(lister, crClient),
resolver: resolver.NewOperatorsV1alpha1Resolver(lister, crClient, opClient.KubernetesInterface(), operatorNamespace),
Copy link
Member

@ecordell ecordell Nov 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used for launching the bundle image (LaunchBundleImage).

pkg/controller/operators/catalog/operator.go Outdated Show resolved Hide resolved
return nil, nil, nil, err
}
for _, ip := range ips {
for _, lookup := range ip.Status.BundleLookups {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm missing why this is needed - can't you just return the whole set of pending operators as BundleLookups that the installplan should unpack?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have a TODO here to investigate improving. Without it install plans get duplicated due to the next loop doing a resolve (again) and repeating. The only way that I can see to avoid this is by checking all the install plans for the image.

for bundleImageInfo := range gen.PendingOperators() {
// TODO: switch image to standalone image, but this image can be used upstream as well
// change to use configmapRegistryImage
//configmap, job, err := configmap.LaunchBundleImage(r.kubeclient, bundleImageInfo.image, "quay.io/openshift/origin-operator-registry:latest", r.operatorNamespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to clarify our discussion last night, you should use the actual variable "configmapRegistryImage" from pkg/catalogoperator.go NewOperator, not quay.io/openshift/origin-operator-registry:latest

@@ -68,6 +69,29 @@ func (r *ResourceQueueSet) Requeue(namespace, name string) error {
return fmt.Errorf("couldn't find queue for resource")
}

// TODO: this may not actually be required if the requeue is done on the namespace rather than the installplan
// RequeueAfter requeues the resource in the set with the given name and namespace (just like Requeue), but only does so after duration has passed
func (r *ResourceQueueSet) RequeueAfter(namespace, name string, duration time.Duration) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we watch Jobs and requeue owner installplans when we see them?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's a good idea and is something I was considering.

// change to use configmapRegistryImage
//configmap, job, err := configmap.LaunchBundleImage(r.kubeclient, bundleImageInfo.image, "quay.io/openshift/origin-operator-registry:latest", r.operatorNamespace)

configmap, job, err := configmap.LaunchBundleImage(r.kubeclient, bundleImageInfo.image, "quay.io/jpeeler/bundle-init-image:latest", r.operatorNamespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where do these get cleaned up?

(this is not necessarily blocking, we can follow up with GC)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could return the pending operators from the resolver without launching them?

then when we create the installplan, write the "bundlelookups" into it

then when we process the installplan, launch jobs for any bundlelookups that don't have corresponding jobs?

this would also mean that when we launch, we have an installplan, so we can add ownerrefs, so we can watch jobs and requeue based on ownerref 😄

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct that they aren't yet

@jpeeler jpeeler force-pushed the installplan-bundle branch from 7a35d3b to ff053aa Compare November 2, 2019 01:12
@jpeeler
Copy link
Author

jpeeler commented Nov 2, 2019

This isn't merge-able 😩

The e2e currently has exactly double steps, which seems like resources are getting added from the bundle (which in this case has a CSV) and the bundle image.

Code generation has not been done, unit tests aren't updated.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 18, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 18, 2019
@jpeeler
Copy link
Author

jpeeler commented Dec 18, 2019

/hold
This isn't quite ready yet, but is much closer now. I tried to properly integrate the ideas from all of these PRs here as well:

#1194
#1105
#1108

This is a blocking PR that needs to be merged first (but could possibly change due to feedback received here):
operator-framework/operator-registry#137

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 18, 2019
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 19, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 20, 2019
@jpeeler jpeeler changed the title WIP: Resolve and unpack operators from bundle images Resolve and unpack operators from bundle images Dec 20, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 20, 2019
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 20, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2019
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2019
Jeff Peeler and others added 9 commits December 21, 2019 07:55
Specifically, pulls in v1.5.5.
This extracts bundle data from an image and uses that data to populate
the installplan. (The code here depends on later commits.)
WIP: this is currently using a local custom image.
Instead use image that is pullable from quay. Test data has been updated
to include image building files.
The test data for generating this custom image uses a go script to
generate and populate an operator registry database that references a
bundle image.
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2019
@jpeeler
Copy link
Author

jpeeler commented Dec 21, 2019

/retest

@openshift-ci-robot
Copy link
Collaborator

@jpeeler: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws b192cb9 link /test e2e-aws
ci/prow/e2e-aws-upgrade b192cb9 link /test e2e-aws-upgrade
ci/prow/e2e-gcp 9707998 link /test e2e-gcp
ci/prow/e2e-aws-olm 9707998 link /test e2e-aws-olm

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@@ -1033,6 +1044,96 @@ func (o *Operator) createInstallPlan(namespace string, subs []*v1alpha1.Subscrip
return reference.GetReference(res)
}

func (o *Operator) checkBundleLookups(plan *v1alpha1.InstallPlan) (bool, error) {
for _, bundleLookup := range plan.Status.BundleLookups {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Over the course of our work on OLM, we've developed an idiomatic way of reconciling resources:

  1. Build state from cluster
  2. Inform cluster action from state; or
  3. Update resource status from state

I think that we can run into trouble if we jump straight to 2. without first performing 1., or by coupling 2. and 3.

In this case, do you think we can:

  • build the state before actioning on it; i.e. Check for ConfigMaps/Jobs before creating them
  • update the InstallPlan object once per reconciliation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good plan - we just need a deterministic way to name the configmaps/jobs from the installplan. InstallPlanName + image name would make sense, or a hash of that info.

@@ -85,6 +87,7 @@ type Operator struct {
catalogSubscriberIndexer map[string]cache.Indexer
clientAttenuator *scoped.ClientAttenuator
serviceAccountQuerier *scoped.UserDefinedServiceAccountQuerier
bundleLoader *configmap.BundleLoader
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in an ideal world this would be an interface. It would be nice if we had other options than unpacking into a configmap in the future. But we can do that as a separate refactoring after this merges.

csvProvidedAPIsIndexer: map[string]cache.Indexer{},
catalogSubscriberIndexer: map[string]cache.Indexer{},
serviceAccountQuerier: scoped.NewUserDefinedServiceAccountQuerier(logger, crClient),
clientAttenuator: scoped.NewClientAttenuator(logger, config, opClient, crClient),
bundleLoader: configmap.NewBundleLoader(),
configmapRegistryImage: configmapRegistryImage,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want a separate variable for "bundleUnpackerImage". In OpenShift this will end up the same image (registry image) but they do different things and there's no reason they should have to be the same.

@@ -1033,6 +1044,96 @@ func (o *Operator) createInstallPlan(namespace string, subs []*v1alpha1.Subscrip
return reference.GetReference(res)
}

func (o *Operator) checkBundleLookups(plan *v1alpha1.InstallPlan) (bool, error) {
for _, bundleLookup := range plan.Status.BundleLookups {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good plan - we just need a deterministic way to name the configmaps/jobs from the installplan. InstallPlanName + image name would make sense, or a hash of that info.

}

// extract data from configmap and write to install plan
bundle, err := o.bundleLoader.Load(configmap)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense but is not critical, IMO. Installing via installplan steps ensures we install via a well-tested path. But, this would be a great followup and will enable installplans to install larger sets of objects.

@njhale njhale mentioned this pull request Jan 6, 2020
5 tasks
@jpeeler
Copy link
Author

jpeeler commented Jan 10, 2020

Superseded by #1215, which has now merged.

@jpeeler jpeeler closed this Jan 10, 2020
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 10, 2020
@openshift-ci-robot
Copy link
Collaborator

@jpeeler: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants