Describe the problem/challenge you have
We are trying to build an alert that fires per schedule if the last backup failed. The current metric velero_backup_last_successful_timestamp only exposes the timestamp of the last successful backup per schedule. It is difficult to write the alert with this metric, because all we need to know is whether the last backup for each schedule succeeded or failed.
Describe the solution you'd like
We would like to have a metric like velero_backup_last_status, of metric type gauge.
The metric would return 1 or 0 depending on the success or failure of the last backup, and would expose the schedule as a label. As of now, only successes are exposed, via velero_backup_last_successful_timestamp. The missing detail is whether the last backup attempt succeeded or failed, which the new metric can expose.
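To make the request concrete, here is a minimal sketch of the gauge semantics we have in mind, written in plain Go with only the standard library. It is not Velero code: the `lastStatus` type and its `record` and `exposition` methods are hypothetical names for illustration, and the mapping of 1 to success and 0 to failure is our assumption about how the values would be assigned.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// lastStatus maps a schedule name to the outcome of its most recent backup.
// Assumption: 1 means success, 0 means failure. The metric name and label
// are the ones proposed in this issue, not an existing Velero metric.
type lastStatus map[string]int

// record stores the outcome of the latest backup attempt for a schedule,
// overwriting any earlier value (gauge semantics).
func (s lastStatus) record(schedule string, succeeded bool) {
	if succeeded {
		s[schedule] = 1
		return
	}
	s[schedule] = 0
}

// exposition renders the gauge in the Prometheus text exposition format,
// one sample per schedule, sorted by schedule name. %q is adequate for the
// simple schedule names used here.
func (s lastStatus) exposition() string {
	names := make([]string, 0, len(s))
	for name := range s {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	b.WriteString("# TYPE velero_backup_last_status gauge\n")
	for _, name := range names {
		fmt.Fprintf(&b, "velero_backup_last_status{schedule=%q} %d\n", name, s[name])
	}
	return b.String()
}

func main() {
	s := lastStatus{}
	s.record("daily", true)
	s.record("weekly", false)
	s.record("weekly", true) // a later successful run overwrites the failure
	s.record("twice-daily", false)
	fmt.Print(s.exposition())
}
```

With a gauge shaped like this, an alert expression along the lines of `velero_backup_last_status == 0` could fire per schedule and resolve on its own once the next run for that schedule succeeds.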
Anything else you would like to add:
We have submitted a PR upstream (#5397), which can be extended further once this metric is exposed by Velero. It would then be able to fire an alert if a backup failed for a specific schedule.
The alert should stop firing as soon as the backup is created for that schedule. Any suggestions are welcome on how to correctly approach this as we have multiple schedules creating backups twice a day, daily, and weekly.
Environment:
Velero version (use velero version): v1.8.1
Kubernetes version (use kubectl version): 1.20.10
Kubernetes installer & version:
Cloud provider or hardware configuration: AWS
OS (e.g. from /etc/os-release): Ubuntu 20.04.2 LTS (kernel version : 5.8.0-1041-aws)
Vote on this issue!
This is an invitation to the Velero community to vote on issues. You can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.
👍 for "The project would be better with this feature added"
👎 for "This feature will not enhance the project in a meaningful way"
I think the idea of adding a velero_backup_last_status metric with a schedule label is helpful, especially for scheduled backups. Currently Velero lacks a metric that shows the status of a specific backup.