"Descope" OLM delivered operators #2437

Open
njhale opened this issue Nov 9, 2021 · 1 comment
Labels
  • kind/api-change: Categorizes issue or PR as related to adding, removing, or otherwise changing an API.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@njhale
Member

njhale commented Nov 9, 2021

Feature Request

Note: This issue mostly consists of select snippets from a document @ecordell drafted a while back. I've curated the important bits to frame the problem for further discussion.

Scoping, Descoping, What?

In short, when we talk about "scope" in OLM, we mean how OLM handles the privileges granted to an operator and its users with respect to the namespaces in which an admin configures it to be installed; i.e. the opinionated behavior of RBAC generation around ClusterServiceVersions, their InstallModes, and OperatorGroups.

Note: see the OperatorGroup docs for more details.

Problem

APIs in a Kubernetes cluster are cluster-scoped. They are visible via discovery to any user who wishes to see them. Even operators that agree on a particular GVK may have differences of opinion about how those objects should be admitted to a cluster, or how conversion between API versions should happen.

With Operator Framework, we want to build an ecosystem of high-quality operators that can be re-used across different projects, whether they’re in the same cluster or not. But re-using operators compounds the scoping problems within a cluster: it increases the likelihood that more than one “opinion” about an API exists in the cluster.

History

When OLM was first written, CRDs defined only the existence of a GVK in a cluster. Operators developed for OLM could only be installed in a single namespace and watch that namespace. This delivered on the self-service, operational-encoding story of operators: the same operator could be installed in every namespace of a cluster.

Privilege escalation became a concern: since operators run with a service account in a namespace, anyone with the ability to create workloads in that namespace could escalate to the permissions of the operator. This made service provider/consumer relationships a difficult sell for operators in OLM.

At the same time, CRDs continued to add features. With version schemas and admission and conversion webhooks, CRDs no longer simply registered a global name for a type, and operators in separate namespaces had lots of options to interfere with one another if they shared the same CRD. OLM also expanded to support APIServices in addition to operators based on CRDs, and so required a notion of cluster-wide operators.

To address these concerns, a notion of scoping operators was introduced via the OperatorGroup object. An OperatorGroup would specify a set of namespaces within a cluster in which all operators installed would share the same scope. OLM would ensure that only one operator within a namespace owned a particular CRD to avoid collision problems, and more installation options were provided to allow separating operators from their managed workloads.
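
To make the collision rule above concrete, here is a minimal sketch in Python. It is a simplified model for discussion, not OLM's actual implementation: `ScopedOperator`, `find_collisions`, and the operator, namespace, and CRD names are all hypothetical, and the sketch assumes a collision means two operators own the same CRD while sharing at least one target namespace.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ScopedOperator:
    """Hypothetical model of an operator installed under an OperatorGroup."""
    name: str
    target_namespaces: frozenset  # namespaces selected by its OperatorGroup
    owned_crds: frozenset         # names of the CRDs the operator owns


def find_collisions(operators):
    """Return (operator, operator, namespace, crd) tuples for every case
    where two operators own the same CRD in the same namespace."""
    collisions = []
    for a, b in combinations(operators, 2):
        shared_namespaces = a.target_namespaces & b.target_namespaces
        shared_crds = a.owned_crds & b.owned_crds
        for ns in sorted(shared_namespaces):
            for crd in sorted(shared_crds):
                collisions.append((a.name, b.name, ns, crd))
    return collisions


# Illustrative data only; none of these installs come from the issue.
CRD = "etcdclusters.etcd.database.coreos.com"
etcd_a = ScopedOperator("etcd-a", frozenset({"team-a", "team-shared"}), frozenset({CRD}))
etcd_b = ScopedOperator("etcd-b", frozenset({"team-b", "team-shared"}), frozenset({CRD}))
etcd_c = ScopedOperator("etcd-c", frozenset({"team-c"}), frozenset({CRD}))

collisions = find_collisions([etcd_a, etcd_b, etcd_c])
```

In this model `etcd-a` and `etcd-b` collide only in `team-shared`, while `etcd-c` can coexist with both because its scope does not overlap theirs. Note that the CRD itself is still cluster-scoped, which is the tension the rest of this issue is about.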

Proposal

Entirely remove the notion of scoping from OLM; i.e. "descope".

This means that:

  • Only one operator that provides a given API -- e.g. via CRD or APIService -- may be installed in a cluster at any one time
  • OLM stops being opinionated about how privileges are granted to operators and users; i.e. OperatorGroups, InstallModes, and today's generated RBAC are deprecated and removed
  • Operator authors and admins use more traditional means -- e.g. (cluster)Roles/RoleBindings -- to declare cluster privileges for both operators and their users
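
As a rough illustration of the first point above, the check below treats API ownership as cluster-wide rather than per-namespace. It is only a sketch under stated assumptions: `conflicting_apis` is a hypothetical helper, an API is identified by its (group, version, kind) tuple, and the operator names are made up. The real rule would be defined by the enhancement proposal.

```python
def conflicting_apis(installed, candidate_apis):
    """Return {gvk: current_provider} for every API the candidate wants to
    provide that an installed operator already provides.

    installed:      dict mapping operator name -> set of (group, version, kind)
    candidate_apis: set of (group, version, kind) the new operator provides
    """
    provider_of = {}
    for operator, apis in installed.items():
        for gvk in apis:
            provider_of[gvk] = operator
    return {gvk: provider_of[gvk] for gvk in candidate_apis if gvk in provider_of}


# Illustrative cluster state; the operator names are assumptions.
PROMETHEUS = ("monitoring.coreos.com", "v1", "Prometheus")
ETCD = ("etcd.database.coreos.com", "v1beta2", "EtcdCluster")

installed = {"prometheus-operator": {PROMETHEUS}}

second_prometheus = conflicting_apis(installed, {PROMETHEUS})  # conflict
etcd_operator = conflicting_apis(installed, {ETCD})            # no conflict
```

An empty result means the install can proceed; a non-empty one names the operator that already provides each contested API, with no reference to namespaces at all.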

It does not mean that:

  • Every operator needs to have permission to do its job in every namespace
  • Every user in a cluster needs to have permission to use the operator's APIs
  • Only one controller pod needs to run for that API in a cluster
  • Only one controller can be installed to manage an API (i.e. ingress-style)
  • More sophisticated privilege generation cannot be used; e.g. FeatureBinding
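
One way to see why descoping does not force cluster-wide privileges: a RoleBinding that references a ClusterRole grants that role's rules only inside the RoleBinding's own namespace. The sketch below builds such manifests as plain Python dictionaries. The `role_bindings_for` helper and every name passed to it are hypothetical; only the `rbac.authorization.k8s.io/v1` manifest fields are standard Kubernetes.

```python
def role_bindings_for(cluster_role, service_account, sa_namespace, namespaces):
    """Build one RoleBinding manifest per namespace, each binding the
    operator's ServiceAccount to a shared ClusterRole in that namespace only."""
    return [
        {
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            "metadata": {
                "name": f"{cluster_role}-{service_account}",
                "namespace": ns,  # permissions apply only in this namespace
            },
            "roleRef": {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "ClusterRole",
                "name": cluster_role,
            },
            "subjects": [
                {
                    "kind": "ServiceAccount",
                    "name": service_account,
                    "namespace": sa_namespace,  # where the operator itself runs
                }
            ],
        }
        for ns in namespaces
    ]


# Hypothetical operator running in "operators", allowed to act in two namespaces.
bindings = role_bindings_for(
    cluster_role="etcd-operator",
    service_account="etcd-operator",
    sa_namespace="operators",
    namespaces=["team-a", "team-b"],
)
```

Serializing each dictionary to YAML or JSON would give manifests an admin could apply directly. The same pattern, with User or Group subjects, could limit which users may use the operator's APIs in which namespaces.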

Design!?

The specifics of how we will achieve descoping will need an enhancement proposal to be made clear. Such a proposal will, at minimum, need to cover:

  • Deprecation of related APIs; e.g. OperatorGroups, ClusterServiceVersions, etc.
  • Migration of existing operator content; i.e. how does an author define a scoped -> descoped operator upgrade, and what does a cluster admin need to do?
  • If/how existing "scope user stories" can be achieved after "descoping"; e.g. config management
  • How does Operator Discovery work without a first-class concept of scoping?

@njhale added the kind/feature, kind/api-change, and priority/important-longterm labels on Nov 9, 2021
@bitscuit

Is there any proposal yet or draft of one?
