Velero helm release marks backup "failed" even after backing up all items successfully #216
@tapanhalani Also, please make sure the Velero CRD manifests are up-to-date by running `velero install --crds-only --dry-run -o yaml | kubectl apply -f -`, since the …
@jenting Thank you for your response, but the issue persists even after changing the name from "azure" to "default". Also, the "Checking for existing backup locations ready to be verified" log is an info log, whereas the "Backup failed" log is an error log; the former seems to be expected periodically (perhaps to check whether any new backup location was added and needs to be verified), whereas the latter is an error. I tried debugging with "debug" level logs, but there is no additional information on the reason for this "backup failed" error.
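For reference, a minimal sketch of how the debug level can be enabled through the chart; the `configuration.logLevel` value is an assumption based on the Velero helm chart, not something quoted in this thread:

```sh
# Hypothetical: raise the Velero server log level via the helm chart
# (configuration.logLevel is assumed to be the relevant chart value).
helm upgrade velero vmware-tanzu/velero \
  --namespace velero \
  --reuse-values \
  --set configuration.logLevel=debug
```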
Regarding the CRDs, I deleted and re-installed them as part of the helm release itself, using the flag …
I think this process should install all the up-to-date CRDs (see the sketch below).
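The exact flag is truncated above; as a sketch, assuming the chart's `upgradeCRDs` value is what was meant (an assumption, not confirmed in the thread):

```sh
# Hypothetical: let the chart re-apply up-to-date CRDs during the upgrade
# (upgradeCRDs is assumed to be the chart value referred to above).
helm upgrade velero vmware-tanzu/velero \
  --namespace velero \
  --reuse-values \
  --set upgradeCRDs=true
```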
Could you please run …
@jenting Here is the output of the above command:
On another note, I had some backups (with status "failed", but done to completion) from version 2.11.0 of the helm chart already existing in the cluster. When I upgraded the release to 2.14.8, I had to re-install the CRDs along with it, which deleted those backups from the k8s cluster (though not from the Azure storage account and the Azure volume-snapshot RG, which still held the k8s backups and volume snapshots). So when velero 2.14.8 started, it found those pre-existing backups and tried to sync them with the k8s cluster, but got the following error:
Upon checking in the Azure portal, the blob "velero-mybackup-20210218050043" exists in the storage account configured for the Azure Velero plugin, yet it still seems inaccessible to Velero. Is this indicative of a permissions issue in Azure?
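One way to check whether the credentials can actually see the blob is to list the container with the Azure CLI; a sketch, with the storage account, container, and blob prefix as placeholders:

```sh
# Hypothetical check: list backup blobs in the same storage account/container
# that the Velero Azure plugin is configured against (names are placeholders).
az storage blob list \
  --account-name <velero-storage-account> \
  --container-name <velero-container> \
  --prefix backups/velero-mybackup-20210218050043 \
  --output table
```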
@jenting I have managed to resolve the above issue after raising the resource limits of my Velero instance from 50m CPU / 200Mi memory to 1 CPU / 2Gi memory, as per the findings in vmware-tanzu/velero#1856.
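For anyone hitting the same thing, a sketch of the corresponding chart values; the `resources` value path for the Velero deployment is an assumption based on the chart, and only the limits mentioned above are set:

```sh
# Hypothetical: raise the Velero server's resource limits via the chart,
# matching the 1 CPU / 2Gi figures mentioned above (value paths are assumed).
helm upgrade velero vmware-tanzu/velero \
  --namespace velero \
  --reuse-values \
  --set resources.limits.cpu=1 \
  --set resources.limits.memory=2Gi
```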
Thank you @tapanhalani, good to know.
What steps did you take and what happened:
I have installed the Velero helm chart in an AKS cluster and configured the Azure plugin with it for backups and volume snapshots. The backups start according to the configured schedule and, according to the Velero logs, all k8s objects are backed up successfully and snapshots of the volumes are created successfully in the Azure RG, but the backup object in k8s is itself marked as failed.
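For context, a minimal sketch of this kind of setup; the chart values, plugin image tag, and all names/paths below are assumptions based on the Velero chart and Azure plugin docs, not copied from this report:

```sh
# Hypothetical install sketch: Velero chart with the Azure object-store/snapshot
# plugin; container, resource group, and storage account names are placeholders.
helm install velero vmware-tanzu/velero \
  --namespace velero --create-namespace \
  --set configuration.provider=azure \
  --set configuration.backupStorageLocation.bucket=<container-name> \
  --set configuration.backupStorageLocation.config.resourceGroup=<rg-name> \
  --set configuration.backupStorageLocation.config.storageAccount=<storage-account> \
  --set initContainers[0].name=velero-plugin-for-microsoft-azure \
  --set initContainers[0].image=velero/velero-plugin-for-microsoft-azure:v1.1.0 \
  --set initContainers[0].volumeMounts[0].mountPath=/target \
  --set initContainers[0].volumeMounts[0].name=plugins \
  --set-file credentials.secretContents.cloud=<path-to-azure-credentials-file>
```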
What did you expect to happen:
Since the backup completed successfully, I expect the backup object to be marked "successful". If there is an error in the backup process, then there should be a clearer error message (or maybe I am missing something somewhere).
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
Here are the final Velero logs that mark completion of the backup (and the subsequent failure, for an unclear reason):
Here is the output of the `kubectl describe backup` command (just the spec and status sections):

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
Environment:
- Helm version (`helm version`): 3.5.2
- Chart version (`helm list -n <YOUR NAMESPACE>`): velero 2.14.8
- Kubernetes version (`kubectl version`): 1.19.0