
Document how Capsule integrates in a Flux GitOps based environment #528

Closed · bsctl opened this issue Mar 14, 2022 · 18 comments · Fixed by #609
Labels: documentation (Improvements or additions to documentation)

Comments

@bsctl (Member) commented Mar 14, 2022

Describe the feature

Document how Capsule integrates in a Flux GitOps based environment.

What would the new user story look like?

The cluster admin can learn how to configure a Flux GitOps based environment with Capsule.

Expected behaviour

A detailed user guide is provided in the documentation.

bsctl added and then removed the blocked-needs-validation (Issue need triage and validation) label on Mar 14, 2022
@oliverbaehler (Collaborator)

Comment for Assignment

@bsctl (Member, Author) commented Mar 15, 2022

Thanks @oliverbaehler, really appreciated.

@maxgio92 (Collaborator) commented Jun 7, 2022

Hey @bsctl, I was working on exactly the same use case here.
I then came across the issue @oliverbaehler just discussed in #582.

My PoC here shows the patch and list verb permissions the GitOps reconciler would need.

Tenant self-service reconciliation

TL;DR: The Tenant Owner needs permission to patch (and also list) cluster-scoped Kubernetes resources/objects too.
The Flux kustomize-controller will in the end kubectl apply the desired state and, as in the PoC, it will do so impersonating the Tenant Owner user.

More details on #582 - I recommend reading it.

Enter Capsule Proxy

I see that this use case is perfect for Capsule Proxy. This way, tenant resource control and isolation would happen one step earlier in the request flow, as the request would be proxied instead of just validated.

Then, Capsule Proxy would provide the Tenant Owner with a tenant-scoped view of cluster-scoped resources. From this point on, the Tenant Owner would be safe to list and patch Namespaces, as they could do so only on their own.

Also, no further complexity would be introduced.

Flux + Capsule + Capsule Proxy

In a nutshell, in order to reconcile the desired state of the tenant - which would ideally be declared and versioned on Git by the Tenant Owner itself - the Kustomize reconciler would impersonate the Tenant Owner SA user and communicate with the Capsule Proxy, so that it can also operate on cluster-scoped resources (e.g. Namespace-as-a-Service) - and nothing more!

The tenant can configure it through the Kustomization spec.kubeConfig field.

For example, a Tenant:

apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
  name: dev-team
spec:
...
  owners:
  - name: system:serviceaccount:dev-team:gitops-reconciler
    kind: ServiceAccount

would declare its Namespaces, through some Kustomization like this:

apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: namespaces
  namespace: dev-team
spec:
  kubeConfig:
    secretRef:
      name: capsule-proxy
      key: kubeconfig
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: dev-team
  path: ./staging/namespaces
  prune: true

where the capsule-proxy Secret's kubeconfig would contain the Tenant Owner SA (dev-team:gitops-reconciler) token, the Capsule Proxy svc endpoint, and its CA certificate.
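
A minimal sketch of what such a Secret could look like; the server URL is an assumption about where Capsule Proxy is exposed, and the CA and token values are placeholders:

apiVersion: v1
kind: Secret
metadata:
  name: capsule-proxy
  namespace: dev-team
stringData:
  kubeconfig: |
    apiVersion: v1
    kind: Config
    clusters:
    - name: capsule-proxy
      cluster:
        # assumed in-cluster Capsule Proxy Service endpoint
        server: https://capsule-proxy.capsule-system.svc:9001
        # base64-encoded CA certificate of the Capsule Proxy (placeholder)
        certificate-authority-data: <BASE64_CA_CERT>
    users:
    - name: gitops-reconciler
      user:
        # token of the dev-team:gitops-reconciler ServiceAccount (placeholder)
        token: <SA_TOKEN>
    contexts:
    - name: capsule-proxy
      context:
        cluster: capsule-proxy
        user: gitops-reconciler
    current-context: capsule-proxy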

The missing piece

As of now, the missing piece in this idea is having the kubeConfig Secret automatically in place together with the Tenant Owner ServiceAccount, and kept up to date with its token, since kubeConfig expects a secretRef field.

from here.

WIP

/cc @prometherion

@oliverbaehler (Collaborator)

Yeah, that's what our current setup looks like. We have shell-operator in place, which dumps kubeconfigs from ServiceAccounts into Secrets that point at the Capsule Proxy's internal URL. We could add something like that to the Capsule Proxy.

You will still need to work around #582, since you introduce a huge security hole when just allowing patch privileges.

@maxgio92 (Collaborator) commented Jun 7, 2022

TL;DR what is missing in this scenario is:

  1. a controller to ensure and update Tenant Owners' kubeconfig Secrets
  2. (in order to disable --default-service-account) a controller to ensure the SA field on Flux Kustomizations that don't have spec.kubeConfig set, i.e. when not using Capsule Proxy - e.g. this can be done with Kyverno policies (see the sketch after this list)
  3. Grant Global Namespace Patch Rights #582, for which there is PR feat: grant Global Patch Rights #584
  4. (optional) enforcing tenant (Kustomization) reconciliation through Capsule Proxy (ensure Kustomization.spec.kubeConfig)
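
For point 2, a minimal sketch of a Kyverno mutate rule; the gitops-reconciler default ServiceAccount name and the emptiness check on spec.kubeConfig are assumptions, not a hardened policy:

---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: flux-kustomization-default-sa
spec:
  validationFailureAction: enforce
  background: false
  rules:
    - name: default-service-account
      match:
        all:
          - resources:
              kinds:
                - Kustomization
      preconditions:
        all:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]

          # Only mutate when no kubeConfig is set, i.e. when not going through Capsule Proxy
          - key: "{{ request.object.spec.kubeConfig || '' }}"
            operator: Equals
            value: ""
      mutate:
        patchStrategicMerge:
          spec:
            # Does not overwrite an explicitly set ServiceAccount (assumed default name)
            +(serviceAccountName): gitops-reconciler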

About Flux

In the end we should also be able to apply Flux's multi-tenancy lockdown features:

  • --no-cross-namespace-refs=true - prevents cross-namespace Flux resource references: we can enable it.
    This way, Source-type (e.g. GitRepository) and Reconciliation-type (e.g. Kustomization) resources need to be placed in the same Namespace, and we need to leverage spec.targetNamespace on Reconciliation resources in order to apply to another Namespace of the same Tenant.

except for:

  • --default-service-account=<name> - ensures a default ServiceAccount that the Reconciliation-type controllers (Kustomize and Helm) will impersonate to reconcile the desired state: we cannot enable it.
    This is because we pass a kubeconfig for Tenants' Reconciliation resources, and if the ServiceAccount is specified too, the identity specified in the kubeconfig will then try to impersonate it, which we don't need. For this reason, we need point 2 (above) to close the vulnerabilities we open by not enabling this Flux lockdown feature.
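
For reference, a minimal sketch of how --no-cross-namespace-refs could be set by patching the Reconciliation-type controllers in the flux-system kustomization.yaml; the standard Flux bootstrap layout (gotk-components.yaml, gotk-sync.yaml) is assumed:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  # Append the lockdown flag to the controllers' arguments;
  # --default-service-account is deliberately not set, as explained above
  - patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --no-cross-namespace-refs=true
    target:
      kind: Deployment
      name: "(kustomize-controller|helm-controller)"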

@oliverbaehler (Collaborator)

Hi @maxgio92, I have tried the approach of allowing Flux within tenants, meaning that cross-references between namespaces in the same tenant, or to namespaces marked as public, are allowed. But we don't have the time to maintain such policies over time, and we are also moving towards Argo, so I guess we won't need them anymore (they weren't used in production yet). Maybe it's something useful. See the following policies:

helmrelease.policy.yaml

---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: flux-helmrelease-cross-reference
  annotations:
    policies.kyverno.io/title: Flux Helmrelease Cross Reference
    policies.kyverno.io/category: Flux
    policies.kyverno.io/subject: HelmRelease
    policies.kyverno.io/description: >-
      Disallows cross-namespace references of HelmRelease resources.
spec:
  validationFailureAction: enforce
  background: false
  rules:

    # Defaults all namespace attributes to the namespace the Helmrelease is installed into.
    # Does not overwrite if set
    - name: HelmRelease Default Namespaces
      preconditions:
        any:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]
      match:
        all: 
          - resources:
              kinds:
                - HelmRelease
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      mutate:
        patchStrategicMerge:
          spec:
            +(targetNamespace): "{{request.object.metadata.namespace}}"
            +(storageNamespace): "{{request.object.metadata.namespace}}"
            # sourceRef of a HelmRelease lives under spec.chart.spec
            +(chart):
              +(spec):
                +(sourceRef):
                  +(namespace): "{{request.object.metadata.namespace}}"

    # Disallow Source References 
    # Unless in Public Namespace or same Tenant
    - name: helmrelease-source-cross-reference
      context:

        # Load Global Configuration
        - name: global
          configMap:
            name: kyverno-global-config
            namespace: kyverno-system

        # Get All Public Namespaces
        - name: public_namespaces
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.labels.\"{{global.data.public_identifier_label}}\" == '{{global.data.public_identifier_value}}'].metadata.name | []" 

        # Get Tenant information from source namespace
        # Defaults to a character, which can't be a label value
        - name: source_tenant
          apiCall:
            urlPath: "/api/v1/namespaces/{{request.object.metadata.namespace}}"
            jmesPath: "metadata.labels.\"{{global.data.tenant_identifier_label}}\" | '?'"

        # Get Tenant information from destination namespace
        # Returns Array with Tenant Name or Empty
        - name: destination_tenant
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.name == '{{request.object.spec.chart.spec.sourceRef.namespace}}'].metadata.labels.\"{{global.data.tenant_identifier_label}}\""

      preconditions:
        all:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]
        any: 

          # Source is not Self-Reference  
          - key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
            operator: NotEquals
            value: "{{request.object.metadata.namespace}}"

          # Source not in Public Namespaces
          - key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
            operator: NotIn
            value: "{{public_namespaces}}"

          # Source not in Destination
          - key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
            operator: NotIn
            value: "{{destination_tenant}}"

      match:
        all: 
          - resources:
              kinds:
                - HelmRelease
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      validate:
        message: "Can not use namespace {{request.object.spec.chart.spec.sourceRef.namespace}} as source reference!"
        deny: {}

kustomization.policy.yaml

---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: flux-kustomization-cross-reference
  annotations:
    policies.kyverno.io/title: Flux Kustomization Cross Reference
    policies.kyverno.io/category: Flux
    policies.kyverno.io/subject: Kustomization
    policies.kyverno.io/description: >-
      Disallows cross-namespace references of Kustomization resources.
spec:
  validationFailureAction: enforce
  background: false
  rules:
    - name: flux-kustomization-defaults
      preconditions:
        any:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]
      match:
        all: 
          - resources:
              kinds:
                - Kustomization
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      mutate:
        patchStrategicMerge:
          spec:
            +(targetNamespace): "{{request.object.metadata.namespace}}"
            +(sourceRef):
              +(namespace): "{{request.object.metadata.namespace}}"

    # Disallow Source References 
    # Unless in Public Namespace or same Tenant
    - name: kustomization-source-cross-reference
      context:

        # Load Global Configuration
        - name: global
          configMap:
            name: kyverno-global-config
            namespace: kyverno-system

        # Get All Public Namespaces
        - name: public_namespaces
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.labels.\"{{global.data.public_identifier_label}}\" == '{{global.data.public_identifier_value}}'].metadata.name | []" 

        # Get Tenant information from source namespace
        # Defaults to a character, which can't be a label value
        - name: source_tenant
          apiCall:
            urlPath: "/api/v1/namespaces/{{request.object.metadata.namespace}}"
            jmesPath: "metadata.labels.\"{{global.data.tenant_identifier_label}}\" | '?'"

        # Get Tenant information from destination namespace
        # Returns Array with Tenant Name or Empty
        - name: destination_tenant
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.name == '{{request.object.spec.sourceRef.namespace}}'].metadata.labels.\"{{global.data.tenant_identifier_label}}\""

      preconditions:
        all:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]

          - key: "{{request.object.spec.sourceRef.namespace}}"
            operator: NotIn
            value: "{{public_namespaces}}"

          - key: "{{request.object.spec.targetNamespace}}"
            operator: NotIn
            value: "{{destination_tenant}}"

      match:
        all: 
          - resources:
              kinds:
                - Kustomization
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      validate:
        message: "Can not use namespace as source reference, namespace must be public or within tenant!"
        deny: {}

global.config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: kyverno-global-config
  namespace: kyverno-system
data:
  #
  ## Flux Configurations
  #

  public_identifier_label: "company.com/public"
  public_identifier_value: "true"


  #
  ## Tenant Configurations
  #

  ## tenant_identifier_label
  #    Label which is used to select the tenant name
  tenant_identifier_label: "capsule.clastix.io/tenant"

@maxgio92 (Collaborator) commented Jun 9, 2022

Thank you @oliverbaehler!

@prometherion (Member)

2. (in order to disable --default-service-account) a controller to ensure the SA field on Flux Kustomizations that don't have spec.kubeConfig set, i.e. when not using Capsule Proxy - e.g. this can be done with Kyverno policies

Could we make this optional for now and use the Kyverno policies shared by @oliverbaehler, so we can close this with the docs update?

prometherion added this to the v0.1.2 milestone on Jun 9, 2022
@oliverbaehler (Collaborator) commented Jun 11, 2022

@prometherion If we want to propose these, I would take some time to harden them and verify that they work as expected, so that we don't create a security hole.

My intention was to share reconciler resources which are common (e.g. the Bitnami Helm repository living in a public namespace). But I don't know if I am still a fan of this idea, since these policies rely on API calls, which decrease cluster performance (API requests).

@prometherion (Member)

It depends on the timing, since I'd like to publish v0.1.2 before the next community call, expected in 2 weeks.

Do you think that's feasible according to your availability?

My intention was to share reconciler resources which are common (e.g. the Bitnami Helm repository living in a public namespace). But I don't know if I am still a fan of this idea, since these policies rely on API calls, which decrease cluster performance (API requests).

I'm missing the context here, could you elaborate a bit more?

@maxgio92 (Collaborator) commented Jun 13, 2022

@prometherion TL;DR: without point 2, a Tenant could bypass Capsule and modify system resources (e.g. in the kube-system Namespace) or other Tenants' resources.

Point 2 is important to avoid Tenant resources being reconciled with cluster-admin privileges.
This would happen because we can't enforce reconciliation with impersonation of the default SA of the same Namespace as the Kustomization (i.e. reachable with the Kustomize controller flag --default-service-account=<name>); instead, by default no impersonation is done, so reconciliation would happen with the default Kustomize controller's SA and its cluster-admin privileges on the whole cluster.

@oliverbaehler I'm going to test the PoC with the new patch verb support.
In any case, I'd avoid providing support for cross-namespace Flux CR references and instead, as stated above, leverage targetNamespace of Flux reconcile-type resources to choose where to apply. PS: at least for this Capsule release :-)

I'll keep you posted.

@maxgio92 (Collaborator) commented Jun 13, 2022

I wrote down these test definitions (a declarative probe sketch follows the list):

  • A tenant should not be able to write other tenants' Namespaces (e.g. try to re-assign a Namespace)
  • A tenant should not be able to read other tenants' namespaced resources (e.g. reference other tenants' Source-type Flux CRs like GitRepository/HelmRepository, or Kustomization dependsOn)
  • A tenant should not be able to write other tenants' namespaced resources
  • A tenant should not be able to read non-tenant namespaced resources (e.g. reference the Flux "root" Source-type CR like GitRepository)
  • A tenant should not be able to write non-tenant namespaced resources (e.g. create resources in admin Namespaces)
  • A tenant should not be able to read non-tenant cluster-level resources, except for their own Namespaces
  • A tenant should not be able to write non-tenant cluster-level resources, except for (their/new) Namespaces (e.g. try to bind cluster-admin at cluster level to their owner SA)
  • A tenant should not be able to read their own namespaced resources through cross-namespace references
  • A tenant should be able to reconcile only through Capsule Proxy; without it, LIST on cluster-level resources is not permitted through direct API server communication
  • A platform admin should be able to reconcile their config without kubeConfig: either with an explicitly declared SA or with the default one
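
As a minimal sketch, the first check could be probed with a SubjectAccessReview; the tenant, Namespace, and SA names are assumed from the earlier example, and note this only evaluates RBAC authorization - Capsule's webhook-based checks only fire on an actual request:

apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  # the Tenant Owner ServiceAccount user under test (assumed name)
  user: system:serviceaccount:dev-team:gitops-reconciler
  resourceAttributes:
    verb: patch
    version: v1
    resource: namespaces
    # a Namespace assumed to belong to another tenant
    name: other-team-ns

Creating it with kubectl create -f sar.yaml -o yaml returns status.allowed for the given user and attributes.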

@maxgio92 (Collaborator)

TL;DR: the last two points can be achieved without policies, by also enabling the default SA impersonation feature of Flux's multi-tenancy lockdown.
What is needed is the privilege for the Tenant Owner to impersonate itself, plus support in Capsule Proxy for requests with impersonation headers (a minimal RBAC sketch follows below).

Moreover, with this approach we remove the dependency on a further policy engine.
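
A minimal sketch of the RBAC that could grant the Tenant Owner ServiceAccount the privilege to impersonate itself; names are assumed from the dev-team example above:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gitops-reconciler-self-impersonator
rules:
  - apiGroups: [""]
    resources: ["users"]
    verbs: ["impersonate"]
    # restrict impersonation to the SA's own user name (assumed)
    resourceNames: ["system:serviceaccount:dev-team:gitops-reconciler"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gitops-reconciler-self-impersonator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gitops-reconciler-self-impersonator
subjects:
  - kind: ServiceAccount
    name: gitops-reconciler
    namespace: dev-team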

@maxgio92 (Collaborator) commented Jul 19, 2022

Update: all the points are achieved.
I updated the second-to-last point by removing the requirement that spec.kubeConfig must not be empty. This is not required now; the important part is that Capsule Proxy enables list operations for the Tenant GitOps Reconciler.

Even though it's not so elegant, the Tenant GitOps Reconciler communicates with the Capsule Proxy impersonating itself (the ServiceAccount's User), because we need to set mandatory default ServiceAccount impersonation on all Reconciliation-type (Kustomization, HelmRelease) Flux CRs, for security reasons (more on this in the points above).

Nonetheless, this is transparent for the Tenant, who can omit spec.serviceAccountName and just specify spec.kubeConfig.

For this reason, impersonation support has been introduced into Capsule Proxy (see projectcapsule/capsule-proxy#215).

I'm going to prepare documentation for this scenario and propose some automations that could improve the UX for these GitOps-managed multi-tenancy scenarios.

@bsctl (Member, Author) commented Jul 20, 2022

I wrote down these test definitions: ....

@maxgio92 Are these tests automated in e2e?

@maxgio92 (Collaborator)

@bsctl no, I'm not sure it would make much sense, as they wouldn't test Capsule itself but rather a use-case integration with an external project.

@bsctl (Member, Author) commented Jul 20, 2022

@maxgio92 you're right, my bad.

@maxgio92 (Collaborator)

Hey @oliverbaehler, we released the guide for this scenario: https://capsule.clastix.io/docs/guides/flux2-capsule 🥳

If you take a look and notice any possible improvements or corrections, please let me know :-)

In any case, thank you a lot for the value you put into this! 🙏🏻
