
Velero restored pods' schedule behaviors are different from Kubernetes scheduler #6945

Open
Lyndon-Li opened this issue Oct 12, 2023 · 4 comments
Assignees
Labels
Needs triage We need discussion to understand problem and decide the priority Restore

Comments

@Lyndon-Li
Contributor

Lyndon-Li commented Oct 12, 2023

During restore, Velero removes the objects' ownerReferences field, including on Pod objects. However, a Pod's ownerReferences significantly affect the Kubernetes scheduler: for StatefulSet/ReplicaSet pods, the scheduler spreads the pods as evenly as possible across nodes, and it applies this strategy to StatefulSet/ReplicaSet pods only.
If a pod's ownerReferences are removed, the scheduler has no way to identify the pod as part of a StatefulSet/ReplicaSet, so the spreading strategy is not applied to pods restored by Velero.

The consequence from the user's perspective is that when restoring a StatefulSet/ReplicaSet/Deployment, the pods will likely be scheduled onto the same node instead of being spread evenly across nodes, and pod spread heavily impacts the quality of HA.
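A simplified sketch (not actual kube-scheduler or Velero code; node and pod names are made up) of why the default spreading logic needs ownerReferences: pods are grouped by their controller's UID, and the scheduler prefers the node with the fewest pods from the same group. With the ownerReference stripped, every node scores the same and pods can pile onto one node:

```python
from collections import Counter

def preferred_node(pod, nodes, scheduled_pods):
    """Pick the node with the fewest pods sharing this pod's controller."""
    owner = pod.get("ownerReference")  # e.g. the owning ReplicaSet's UID
    if owner is None:
        # No ownerReference (as after a Velero restore): every node
        # looks equally good, so all pods land on the first node.
        return nodes[0]
    counts = Counter(
        p["node"] for p in scheduled_pods
        if p.get("ownerReference") == owner
    )
    return min(nodes, key=lambda n: counts.get(n, 0))

nodes = ["node-a", "node-b", "node-c"]

# With ownerReferences intact, three replicas spread across three nodes:
scheduled = []
for name in ["web-0", "web-1", "web-2"]:
    pod = {"name": name, "ownerReference": "rs-1234"}
    pod["node"] = preferred_node(pod, nodes, scheduled)
    scheduled.append(pod)
print([p["node"] for p in scheduled])  # ['node-a', 'node-b', 'node-c']

# With ownerReferences stripped, all three land on the same node:
scheduled = []
for name in ["web-0", "web-1", "web-2"]:
    pod = {"name": name}
    pod["node"] = preferred_node(pod, nodes, scheduled)
    scheduled.append(pod)
print([p["node"] for p in scheduled])  # ['node-a', 'node-a', 'node-a']
```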

@Lyndon-Li Lyndon-Li self-assigned this Oct 12, 2023

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.

@cdtzabra

This results in orphaned pods that are not attached to their ReplicaSets.

And for StatefulSet apps, there is the issue of attaching a PVC to the right pods.

@reasonerjt reasonerjt added the Needs triage We need discussion to understand problem and decide the priority label Mar 4, 2024
@reasonerjt
Contributor

reasonerjt commented Mar 4, 2024

This results in orphaned pods that are not attached to their ReplicaSets.

I believe the controller should be able to reconcile, so that eventually the ReplicaSet's criteria are met?

And for StatefulSet apps, there is the issue of attaching a PVC to the right pods.

Would you mind opening another issue to elaborate on this? I don't think Velero can be used as a deployer to "redeploy" any application during restore out of the box, but if it's specific to StatefulSets, there may be something that can be enhanced.

@cdtzabra

cdtzabra commented Mar 4, 2024

Hi @reasonerjt

No, the controller doesn't do anything for file-system restores with Restic or Kopia (`--default-volumes-to-fs-backup`).

If you want the PVC/PV data, you have to restore the pods. However, if, in the same restore, you also keep the Deployment/StatefulSet/etc. (with or without the ReplicaSet), new pods will be spawned that are unable to use the PVCs, since the PVCs are still attached to the restored pods.

The only workaround in this case is to delete the restored pods (since they don't belong to the ReplicaSet and are therefore orphaned) to free up the PVCs.

I migrated from Velero/Restic to Velero/Kopia, but the behavior is still the same.

As a result, I had to divide my restore (from cluster x to cluster y) into several steps:

  1. Restore only the pods + PVCs.
  2. Check that the data is restored, then delete the pods.
  3. Restore the rest of the workload without the pods.
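The steps above could be sketched with the velero CLI roughly as follows (backup/restore names, the namespace, and the label selector are placeholders; the `--include-resources`/`--exclude-resources` flags are standard `velero restore create` options):

```shell
# 1. Restore only the pods and their volumes first.
velero restore create step1-pods-pvc \
  --from-backup my-backup \
  --include-resources pods,persistentvolumeclaims,persistentvolumes

# 2. Once the volume data is back, delete the orphaned pods to free the PVCs.
kubectl delete pods -n my-namespace -l app=my-app

# 3. Restore the remaining workload objects without the pods.
velero restore create step3-workload \
  --from-backup my-backup \
  --exclude-resources pods
```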

some related issues:
