
Tenant Owner Reference on namespaces is lost after restoring from Velero backup #312

Closed
bsctl opened this issue Jul 1, 2021 · 1 comment · Fixed by #320

Comments


bsctl commented Jul 1, 2021

Bug description

The Tenant Owner Reference on namespaces is lost after restoring from backup using Velero.

[Velero requires cluster admin permissions, so the restore is performed by the cluster admin and not by the tenant owner. Strictly speaking, it's not a bug, since it works by design. However, we should address this in some way, because it is a requirement from project adopters.]

How to reproduce

  1. As cluster admin, create a tenant
apiVersion: capsule.clastix.io/v1alpha1
kind: Tenant
metadata:
  name: oil
spec:
  owner:
    name: alice
    kind: User
  2. As tenant owner, alice, create one or more namespaces in the tenant
$ kubectl --as alice --as-group capsule.clastix.io create ns oil-production
$ kubectl --as alice --as-group capsule.clastix.io create ns oil-development
$ kubectl --as alice --as-group capsule.clastix.io create ns oil-marketing
  3. The namespaces correctly have the Owner Reference set
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    capsule.clastix.io/allowed-registries: docker.io,quay.io
    capsule.clastix.io/ingress-classes: cmp,haproxy
    capsule.clastix.io/storage-classes: standard
    scheduler.alpha.kubernetes.io/node-selector: pool=cmp
  labels:
    capsule.clastix.io/tenant: oil
    name: oil-marketing
  name: oil-marketing
  ownerReferences:
  - apiVersion: capsule.clastix.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: oil
spec:
  finalizers:
  - kubernetes
  4. The Tenant state is correctly updated
$ kubectl get tnt oil
NAME   NAMESPACE QUOTA   NAMESPACE COUNT   OWNER NAME   OWNER KIND   NODE SELECTOR    AGE
oil    9                 3                 alice        User         {"pool":"cmp"}   12h
  5. As cluster admin, take a backup of the tenant, using Velero for example

$ velero create backup oil --include-namespaces='oil-production,oil-development,oil-marketing'

  6. As tenant owner, delete a namespace
$ kubectl --as alice --as-group capsule.clastix.io delete ns oil-marketing
  7. As cluster admin, restore from the previous backup
$ kubectl get ns -l capsule.clastix.io/tenant=oil
NAME              STATUS   AGE
oil-development   Active   11h
oil-production    Active   11h

$ velero create restore --from-backup oil

$ kubectl get ns -l capsule.clastix.io/tenant=oil
NAME              STATUS   AGE
oil-development   Active   11h
oil-marketing     Active   11m
oil-production    Active   11h
  8. The restored namespace is missing the Tenant Owner Reference
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    capsule.clastix.io/allowed-registries: docker.io,quay.io
    capsule.clastix.io/ingress-classes: cmp,haproxy
    capsule.clastix.io/storage-classes: standard
    scheduler.alpha.kubernetes.io/node-selector: pool=cmp
  labels:
    capsule.clastix.io/tenant: oil
    name: oil-marketing
  name: oil-marketing
spec:
  finalizers:
  - kubernetes

and the Tenant namespace count is not updated

$ kubectl get tnt oil
NAME   NAMESPACE QUOTA   NAMESPACE COUNT   OWNER NAME   OWNER KIND   NODE SELECTOR    AGE
oil    9                 2                 alice        User         {"pool":"cmp"}   12h

Expected behavior

After restore, the namespaces should keep the Tenant Owner Reference even if the restore is made by cluster admin and not by tenant owner.
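For reference, a restored namespace such as oil-marketing would be expected to carry the same ownerReferences block shown in step 3 (values taken from that example, with the uid omitted as in the original output):

  ownerReferences:
  - apiVersion: capsule.clastix.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: oil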

Logs

If applicable, please provide logs of capsule.

In a standard stand-alone installation of Capsule,
you'd get this by running kubectl -n capsule-system logs deploy/capsule-controller-manager.

Additional context

  • Capsule version: v0.1.0-rc2
  • Helm Chart version: (helm list -n capsule-system)
  • Kubernetes version: (kubectl version)
@bsctl bsctl added bug Something isn't working blocked-needs-validation Issue need triage and validation high-priority Feature Request with high-priority labels Jul 1, 2021

prometherion commented Jul 1, 2021

Velero dropping the OwnerReferences is reasonable, although not ideal from the developer experience perspective.

Capsule relies entirely on the OwnerReference field to bind Namespace resources to their Tenant, and changing this approach would require a huge refactoring, since it would mean re-engineering the solution; besides, this is a limitation of Velero itself.

To avoid this issue, I guess we could easily create a small utility, outside of the Capsule Controller Manager code-base, that patches the Namespace resources with the desired OwnerReferences.

We could get the list of Tenants, extracting their names and UIDs. With the name, we just need to get the list of Namespace resources (kubectl get namespaces -l capsule.clastix.io/tenant=${TENANT_NAME}) and patch each of them (kubectl patch namespaces ${NAMESPACE} --type=json -p '[{"op": "add", "path": "/metadata/ownerReferences", "value": [{"apiVersion": "capsule.clastix.io/v1alpha1", "blockOwnerDeletion": true, "controller": true, "kind": "Tenant", "name": "${TENANT_NAME}", "uid": "${TENANT_UID}"}]}]').

This bash script could be executed by the cluster administrator after restoring a Velero backup; a sketch follows below.
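A minimal sketch of such a script, under the assumptions above (bash, cluster-admin kubectl access, the tnt short name and the capsule.clastix.io/v1alpha1 API used earlier in this issue); it is illustrative, not an existing Capsule tool:

#!/usr/bin/env bash
# Sketch: re-attach the Tenant OwnerReference to tenant namespaces after a Velero restore.
set -euo pipefail

for TENANT_NAME in $(kubectl get tnt -o jsonpath='{.items[*].metadata.name}'); do
  # Each Tenant's UID is needed to build a valid ownerReference.
  TENANT_UID=$(kubectl get tnt "${TENANT_NAME}" -o jsonpath='{.metadata.uid}')
  for NAMESPACE in $(kubectl get namespaces -l "capsule.clastix.io/tenant=${TENANT_NAME}" -o jsonpath='{.items[*].metadata.name}'); do
    # Re-add the ownerReference that Velero dropped on restore.
    kubectl patch namespace "${NAMESPACE}" --type=json -p "[{\"op\": \"add\", \"path\": \"/metadata/ownerReferences\", \"value\": [{\"apiVersion\": \"capsule.clastix.io/v1alpha1\", \"blockOwnerDeletion\": true, \"controller\": true, \"kind\": \"Tenant\", \"name\": \"${TENANT_NAME}\", \"uid\": \"${TENANT_UID}\"}]}]"
  done
done

Running it once after velero create restore --from-backup oil completes should restore the bindings, and the Tenant namespace count should then be reconciled by the controller.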

@prometherion prometherion added documentation Improvements or additions to documentation enhancement New feature or request and removed blocked-needs-validation Issue need triage and validation bug Something isn't working labels Jul 1, 2021