Problem backing up cluster-scoped and namespaced resources together #5119

Closed
half-life666 opened this issue Jul 14, 2022 · 8 comments

@half-life666
Contributor

What steps did you take and what happened:

  1. Take a backup with one namespace and one cluster-scoped resource, e.g., storageclasses or clusterroles
  2. If the user wants to back up PVC/PV and the data on the PV, they have to specify persistentvolumes in includedResources
    Below is an example YAML:
spec:
  defaultVolumesToRestic: true
  hooks: {}
  includeClusterResources: true
  includedNamespaces:
  - mongo
  includedResources:
  - persistentvolumeclaims
  - persistentvolumes
  - clusterroles.rbac.authorization.k8s.io
  metadata: {}
  snapshotVolumes: false
  storageLocation: xxx
  ttl: 24h0m0s
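
For context, the spec above is only a fragment of the Backup custom resource. A minimal sketch of the complete object follows; the metadata.name is a hypothetical placeholder and velero is assumed to be the install namespace:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: mongo-backup   # hypothetical name
  namespace: velero    # assumed install namespace
spec:
  defaultVolumesToRestic: true
  hooks: {}
  includeClusterResources: true
  includedNamespaces:
  - mongo
  includedResources:
  - persistentvolumeclaims
  - persistentvolumes
  - clusterroles.rbac.authorization.k8s.io
  metadata: {}
  snapshotVolumes: false
  storageLocation: xxx
  ttl: 24h0m0s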

This setup can back up the PVC/PV from the specified namespace; however, it introduces a few issues:

  1. It backs up all the persistentvolumes in the cluster, while the user only wants the PVC/PV from that namespace
  2. At restore, a PV which does not have an associated PVC cannot actually be restored, because velero expects to dynamically provision the PV when the corresponding PVC (from the backup) is created, and that PVC does not exist in the backup
  3. The restored items count will be wrong
    Because velero believes the PV will be "dynamically provisioned" by a later PVC creation, it accumulates the "itemsRestored" count and logs "Restored x items out of an estimated total of y"

What did you expect to happen:
Under this setup, the backup and restore cause confusion for the user: the backup details include every PV in the cluster, yet at restore the PVs that have no PVC are not restored, and the "itemsRestored" count is wrong. I think it would be clearer if:

  • When PVC/PV are specified in "includedResources" and namespaces are specified, the backup only backs up the PVC/PV in those namespaces

This would avoid the restore issues mentioned above.

The following information will help us better understand what's going on:

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle and attach it to this issue; for more options, refer to velero debug --help

If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)

  • kubectl logs deployment/velero -n velero
  • velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
  • velero backup logs <backupname>
  • velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
  • velero restore logs <restorename>
time="2022-07-07T10:53:24Z" level=info msg="Dynamically re-provisioning persistent volume because it doesn't have a snapshot and its reclaim policy is Delete." logSource="pkg/restore/restore.go:1095" restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Restored 4 items out of an estimated total of 7 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:664" name=pvc-2227d67e-0b72-4fbd-a2fe-e8cf74a3192e namespace= progress= resource=persistentvolumes restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Dynamically re-provisioning persistent volume because it doesn't have a snapshot and its reclaim policy is Delete." logSource="pkg/restore/restore.go:1095" restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Restored 5 items out of an estimated total of 7 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:664" name=pvc-45d44079-4c2a-4adf-8197-195d5a2cc81f namespace= progress= resource=persistentvolumes restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Dynamically re-provisioning persistent volume because it doesn't have a snapshot and its reclaim policy is Delete." logSource="pkg/restore/restore.go:1095" restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Restored 6 items out of an estimated total of 7 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:664" name=pvc-b7b8f55d-2fcd-4329-93bc-e1e004290e09 namespace= progress= resource=persistentvolumes restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Getting client for /v1, Kind=PersistentVolumeClaim" logSource="pkg/restore/restore.go:878" restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Executing item action for persistentvolumeclaims" logSource="pkg/restore/restore.go:1128" restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Executing AddPVFromPVCAction" cmd=/velero logSource="pkg/restore/add_pv_from_pvc_action.go:44" pluginName=velero restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Adding PV pvc-2227d67e-0b72-4fbd-a2fe-e8cf74a3192e as an additional item to restore" cmd=/velero logSource="pkg/restore/add_pv_from_pvc_action.go:66" pluginName=velero restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Skipping persistentvolumes/pvc-2227d67e-0b72-4fbd-a2fe-e8cf74a3192e because it's already been restored." logSource="pkg/restore/restore.go:986" restore=qiming-backend/mongo-pv-75kns-5cps2

time="2022-07-07T10:53:24Z" level=info msg="Resetting PersistentVolumeClaim mongo-ce/mongo-mongodb for dynamic provisioning" logSource="pkg/restore/restore.go:1207" restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Attempting to restore PersistentVolumeClaim: mongo-mongodb" logSource="pkg/restore/restore.go:1233" restore=qiming-backend/mongo-pv-75kns-5cps2
time="2022-07-07T10:53:24Z" level=info msg="Restored 8 items out of an estimated total of 8 (estimate will change throughout the restore)" logSource="pkg/restore/restore.go:664" name=mongo-mongodb namespace=mongo-ce progress= resource=persistentvolumeclaims restore=qiming-backend/mongo-pv-75kns-5cps2

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Velero version (use velero version): v1.7.0
  • Velero features (use velero client config get features):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
@reasonerjt
Contributor

I think there are 2 aspects to this problem:

  1. We should find a way for the user not to back up additional PVs
  2. We need to clarify whether it is a valid use case for velero that a user may want to back up a PV that will not be mounted to any pod during restore.

@sseago
Collaborator

sseago commented Jul 18, 2022

@reasonerjt Being mounted by pods isn't the issue. It's only with Restic backups that the PVC/PV must be mounted to a pod for backup/restore to work. For snapshots/CSI, no pods are necessary. The issue above isn't a missing Pod for a PVC+PV but that there's a PV backed up without the corresponding PVC.

@reasonerjt
Contributor

@half-life666
After reading through the "expected behavior" in the OP:

When PVC/PV are specified in "includedResources", and namespaces are specified, the backup should only backup the PVC/PV in namespace

I believe this would be fixed after the proposal in #5333 is implemented, right? In this case, by adding clusterroles to the ClusterScopedResources, we make sure the clusterroles AND PVs associated with the PVCs are included.

@reasonerjt reasonerjt added this to the v1.11 milestone Jan 10, 2023
@half-life666
Contributor Author

@reasonerjt
Right

@sseago
Collaborator

sseago commented Jan 10, 2023

@reasonerjt Hmm. So there's one thing that's still not covered by that proposal. Whether we're talking about the new design in #5333 or the current design, with cluster-scoped resources we can either include all resources of the specified types by setting includeClusterResources to true (i.e. all PVs and ClusterRoles), or we can include only relevant resources (leaving includeClusterResources at the default/nil value), but that won't include ClusterRoles. If we want to include everything from certain cluster-scoped resources (ClusterRoles) but only relevant resources from others (PVs), there's nothing in the #5333 proposal which will implement this. It could be worked around by doing one backup which includes PVs and leaves includeClusterResources nil, and doing another backup which only includes clusterroles and sets includeClusterResources to true, as sketched below.
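
A minimal sketch of that two-backup workaround under the current API; the backup names are hypothetical:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: mongo-pvs            # hypothetical name
  namespace: velero
spec:
  includedNamespaces:
  - mongo
  includedResources:
  - persistentvolumeclaims
  - persistentvolumes
  # includeClusterResources left unset (nil): only the PVs bound to
  # PVCs in the included namespaces are picked up
---
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: all-clusterroles     # hypothetical name
  namespace: velero
spec:
  includeClusterResources: true
  includedResources:
  - clusterroles.rbac.authorization.k8s.io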

@reasonerjt
Contributor

As we discussed earlier, with includeClusterResources and excludeClusterResources unset, velero will back up the relevant resources (the PVs bound to the PVCs in certain namespaces).
Hence, the user will need to add clusterroles to the cluster-scoped include list from the #5333 design, and velero will back up all clusterroles in addition to the relevant PVs.
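
For illustration, a sketch of how this could look once the #5333 filters exist. The field names below (includedNamespaceScopedResources, includedClusterScopedResources) are taken from that design as it later shipped, so treat them as an assumption against the version you are running:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: mongo-combined       # hypothetical name
  namespace: velero
spec:
  includedNamespaces:
  - mongo
  includedNamespaceScopedResources:
  - persistentvolumeclaims
  includedClusterScopedResources:
  - clusterroles.rbac.authorization.k8s.io
  # the PVs bound to the included PVCs are backed up as related items,
  # without pulling in every other PV in the cluster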

@blackpiglet
Contributor

@half-life666
#5333 also addresses this issue.
Could you check whether the design works with your scenario?

@reasonerjt reasonerjt assigned blackpiglet and unassigned reasonerjt Feb 6, 2023
@blackpiglet
Contributor

Fixed by #5838
