Volume snapshotter plugins are used by Velero to take snapshots of persistent volume contents. Depending on the underlying storage system, those snapshots may be available to use immediately, they may be uploaded to stable storage internally by the plugin, or they may need to be uploaded after the snapshot has been taken. We would like Velero to continue on to the next part of the backup as quickly as possible, but we would also like the backup not to be marked as complete until it is usable. Eventually, we would also like to bring uploads under the control of Velero and allow the user to make decisions about the ultimate destination of backup data independent of the storage system they're using.
AWS - AWS snapshots return quickly, but are then uploaded in the background and cannot be used until EBS moves the data into S3 internally.
vSphere - The vSphere plugin takes a local snapshot and then the vSphere plugin uploads the data to S3. The local snapshot is usable before the upload completes.
Restic - Does not go through the volume snapshot path. Restic backups will block Velero progress until completed.
- Enable monitoring of operations that continue after snapshotting operations have completed
- Keep non-usable backups (upload/persistence has not finished) from appearing as completed
- Minimize change to volume snapshot and BackupItemAction plugins
- Unification of BackupItemActions and VolumeSnapshotters
In this model, movement of the snapshot to stable storage is under the control of the snapshot plugin. Decisions about where and when the snapshot gets moved to stable storage are not directly controlled by Velero. This is the model for the current VolumeSnapshot plugins.
In this model, the snapshot is moved to external storage under the control of Velero. This enables Velero to move data between storage systems. This also allows backup partners to use Velero to snapshot data and then move the data into their backup repository.
Velero currently has backup phases "InProgress" and "Completed". The backup moves to the Completed phase when all of the volume snapshots have completed and the Kubernetes metadata has been written into the object store. However, the actual data movement may be happening in the background after the backup has been marked "Completed". The backup is not actually a stable backup until the data has been persisted properly. In some cases (e.g. AWS) the backup cannot be restored from until the snapshots have been persisted.
Once the snapshots have been taken, however, it is possible for additional backups to be made without interference. Waiting until all data has been moved before starting the next backup will slow the progress of the system without adding any actual benefit to the user.
A new backup phase, "Uploading", will be introduced. When a backup has entered this phase, Velero is free to start another backup. The backup will remain in the "Uploading" phase until all data has been successfully moved to persistent storage. The backup will not fail once it reaches this phase; it will continuously retry moving the data. If the backup is deleted (cancelled), the plugins will attempt to delete the snapshots and stop the data movement, although this may not be possible with all storage systems.
When a backup request is initially created, it is in the "New" phase.
The next state is either "InProgress" or "FailedValidation".
If the backup request is incorrectly formed, it goes to the "FailedValidation" phase and terminates.
When work on the backup begins, it moves to the "InProgress" phase. It remains in the "InProgress" phase until all pre/post execution hooks have been executed, all snapshots have been taken and the Kubernetes metadata and backup info is safely written to the object store plugin.
In the current implementation, Restic backups will move data during the "InProgress" phase. In the future, it may be possible to combine a snapshot with a Restic (or equivalent) backup, which would allow for data movement to be handled in the "Uploading" phase.
The next phase is either "Completed", "Uploading", "Failed" or "PartiallyFailed". Backups which would have a final phase of "Completed" or "PartiallyFailed" may move to the "Uploading" state. A backup which will be marked "Failed" will go directly to the "Failed" phase. Uploads may continue in the background for snapshots that were taken by a "Failed" backup, but their progress will not be monitored or updated. When a "Failed" backup is deleted, all snapshots will be deleted and at that point any uploads still in progress should be aborted.
The "Uploading" phase signifies that the main part of the backup, including snapshotting has completed successfully and uploading is continuing. In the event of an error during uploading, the phase will change to UploadingPartialFailure. On success, the phase changes to Completed. The backup cannot be restored from when it is in the Uploading state.
The "UploadingPartialFailure" phase signifies that the main part of the backup, including snapshotting has completed, but there were partial failures either during the main part or during the uploading. The backup cannot be restored from when it is in the UploadingPartialFailure state.
When a backup has had fatal errors it is marked as "Failed". This backup cannot be restored from.
The "Completed" phase signifies that the backup has completed, all data has been transferred to stable storage and the backup is ready to be used in a restore. When the Completed phase has been reached it is safe to remove any of the items that were backed up.
The "PartiallyFailed" phase signifies that the backup has completed and at least part of the backup is usable. Restoration from a PartiallyFailed backup will not result in a complete restoration but pieces may be available.
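The phase descriptions above can be read as a small state machine. The following is a minimal sketch of that reading, not Velero code: the `allowedTransitions` table and `CanTransition` helper are hypothetical names introduced here, and the InProgress to UploadingPartialFailure edge is inferred from the statement that partial failures may occur during the main part of the backup.

```go
package main

import "fmt"

// BackupPhase mirrors the phase names used in this document.
type BackupPhase string

const (
	PhaseNew                     BackupPhase = "New"
	PhaseFailedValidation        BackupPhase = "FailedValidation"
	PhaseInProgress              BackupPhase = "InProgress"
	PhaseUploading               BackupPhase = "Uploading"
	PhaseUploadingPartialFailure BackupPhase = "UploadingPartialFailure"
	PhaseCompleted               BackupPhase = "Completed"
	PhasePartiallyFailed         BackupPhase = "PartiallyFailed"
	PhaseFailed                  BackupPhase = "Failed"
)

// allowedTransitions is an illustrative table (an assumption of this sketch).
// Phases without an entry are terminal.
var allowedTransitions = map[BackupPhase][]BackupPhase{
	PhaseNew: {PhaseInProgress, PhaseFailedValidation},
	PhaseInProgress: {
		PhaseCompleted, PhaseUploading, PhaseUploadingPartialFailure,
		PhasePartiallyFailed, PhaseFailed,
	},
	PhaseUploading:               {PhaseCompleted, PhaseUploadingPartialFailure},
	PhaseUploadingPartialFailure: {PhasePartiallyFailed},
}

// CanTransition reports whether the table permits moving from one phase to another.
func CanTransition(from, to BackupPhase) bool {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(PhaseInProgress, PhaseUploading))
	fmt.Println(CanTransition(PhaseUploading, PhaseFailed))
}
```

Keeping the transitions in a table makes it easy to see that a backup never fails outright once it reaches an uploading phase.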
When a BackupAction is executed, any SnapshotItemAction or VolumeSnapshot plugins will return snapshot IDs. The plugin should be able to provide status on the progress for the snapshot and handle cancellation of the upload if the snapshot is deleted. If the plugin is restarted, the snapshot ID should remain valid.
When all snapshots have been taken and Kubernetes resources have been persisted to the ObjectStorePlugin the backup will either have fatal errors or will be at least partially usable.
If the backup has fatal errors it will move to the "Failed" state and finish. If a backup fails, the upload will not be cancelled but it will not be monitored either. For backups in any phase, all snapshots will be deleted when the backup is deleted. Plugins will cancel any data movement and remove snapshots and other associated resources when the VolumeSnapshotter DeleteSnapshot method or DeleteItemAction Execute method is called.
Velero will poll the plugins for status on the snapshots when the backup exits the "InProgress" phase and has no fatal errors.
If any snapshots are not complete, the backup will move to either Uploading or UploadingPartialFailure or Failed.
Post-snapshot operations may take a long time and Velero and its plugins may be restarted during this time. Once a backup has moved into the Uploading or UploadingPartialFailure phase, another backup may be started.
While in the Uploading or UploadingPartialFailure phase, the snapshots and backup items will be periodically polled. When all of the snapshots and backup items have reported success, the backup will move to the Completed or PartiallyFailed phase, depending on whether the backup was in the Uploading or UploadingPartialFailure phase.
The Backup resources will not be written to object storage until the backup has entered a final phase: Completed, Failed or PartiallyFailed.
InProgress backups will not have a velero-backup.json present in the object store. During reconciliation, backups which do not have a velero-backup.json object in the object store will be ignored.
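The reconciliation rule can be sketched as a filter over object keys. This is only an illustration: the function name `reconcilableBackups` and the key layout `backups/<backup-name>/<file>` are assumptions of this sketch, not something this design specifies.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// metadataFile is the object whose presence marks a backup as finished.
const metadataFile = "velero-backup.json"

// reconcilableBackups returns the sorted names of backups that have a
// velero-backup.json object. Keys are assumed (hypothetically) to look like
// "backups/<backup-name>/<file>"; anything else is ignored.
func reconcilableBackups(keys []string) []string {
	var names []string
	for _, key := range keys {
		parts := strings.Split(key, "/")
		if len(parts) == 3 && parts[0] == "backups" && parts[2] == metadataFile {
			names = append(names, parts[1])
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	keys := []string{
		"backups/nightly-1/velero-backup.json",
		"backups/nightly-1/nightly-1.tar.gz",
		"backups/nightly-2/nightly-2.tar.gz", // no metadata object yet, so ignored
	}
	fmt.Println(reconcilableBackups(keys))
}
```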
type UploadProgress struct {
completed bool // True when the operation has completed, either successfully or with a failure
err error // Set when the operation has failed
itemsCompleted, itemsToComplete int64 // The number of items that have been completed and the items to complete
// For a disk, an item would be a byte and itemsToComplete would be the
// total size to transfer (may be less than the size of a volume if
// performing an incremental) and itemsCompleted is the number of bytes
// transferred. On successful completion, itemsCompleted and itemsToComplete
// should be the same
started, updated time.Time // When the upload was started and when the last update was seen. Not all
// systems retain when the upload was begun, return Time 0 (time.Unix(0, 0))
// if unknown.
}
A new method will be added to the VolumeSnapshotter interface (details depending on plugin versioning spec)
UploadProgress(snapshotID string) (UploadProgress, error)
UploadProgress will report the current status of a snapshot upload. This should be callable at any time after the snapshot has been taken. In the event a plugin is restarted, if the snapshotID continues to be valid it should be possible to retrieve the progress.
The returned error is set if there is an issue retrieving progress. If the snapshot has encountered an error during the upload, the error should be returned in UploadProgress and the returned error should be nil.
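To illustrate the difference between the two error channels, here is a minimal sketch of how a caller might interpret the result of UploadProgress. The struct is copied from the definition above; the `summarize` helper and the `errNotFound` value are hypothetical names used only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// UploadProgress is a local copy of the struct defined in this document.
type UploadProgress struct {
	completed                       bool
	err                             error
	itemsCompleted, itemsToComplete int64
	started, updated                time.Time
}

// errNotFound is a hypothetical error used only for this example.
var errNotFound = errors.New("snapshot not found")

// summarize distinguishes a failure to retrieve progress (the returned error)
// from a failure of the upload itself (the err field inside UploadProgress).
func summarize(p UploadProgress, err error) string {
	switch {
	case err != nil:
		return "progress unavailable: " + err.Error()
	case p.err != nil:
		return "upload failed: " + p.err.Error()
	case p.completed:
		return "upload complete"
	case p.itemsToComplete > 0:
		return fmt.Sprintf("uploading: %d%%", 100*p.itemsCompleted/p.itemsToComplete)
	default:
		return "uploading: total size unknown"
	}
}

func main() {
	fmt.Println(summarize(UploadProgress{itemsCompleted: 50, itemsToComplete: 200}, nil))
	fmt.Println(summarize(UploadProgress{completed: true, err: errNotFound}, nil))
	fmt.Println(summarize(UploadProgress{}, errNotFound))
}
```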
Currently CSI snapshots and the Velero Plugin for vSphere are implemented as BackupItemAction plugins. The majority of BackupItemAction plugins do not take snapshots or upload data, so rather than modify BackupItemAction we introduce a new plugin type, SnapshotItemAction. SnapshotItemAction will be used in place of BackupItemAction for the CSI snapshots and the Velero Plugin for vSphere and will return a snapshot ID in addition to the item itself.
The SnapshotItemAction plugin identifier as well as the Item and Snapshot ID will be stored in <backup-name>-itemsnapshots.json.gz. When checking for progress, this info will be used to select the appropriate SnapshotItemAction plugin to query for progress.
NotApplicable should only be returned if the SnapshotItemAction plugin should not be handling the item. If the SnapshotItemAction plugin should handle the item but, for example, the item/snapshot ID cannot be found to report progress, an UploadProgress struct with the error set appropriately (in this case NotFound) should be returned.
// SnapshotItemAction is an actor that snapshots an individual item being backed up (it may also do other
// operations on the item that is returned).
type SnapshotItemAction interface {
// AppliesTo returns information about which resources this action should be invoked for.
// A SnapshotItemAction's Execute function will only be invoked on items that match the returned
// selector. A zero-valued ResourceSelector matches all resources.
AppliesTo() (ResourceSelector, error)
// Execute allows the ItemAction to perform arbitrary logic with the item being backed up,
// including mutating the item itself prior to backup. The item (unmodified or modified)
// should be returned, along with an optional slice of ResourceIdentifiers specifying
// additional related items that should be backed up.
Execute(item runtime.Unstructured, backup *api.Backup) (updatedItem runtime.Unstructured, snapshotID string,
additionalItems []ResourceIdentifier, err error)
// Progress reports the upload progress of the snapshot identified by the input.
Progress(input *SnapshotItemProgressInput) (UploadProgress, error)
}
// SnapshotItemProgressInput contains the input parameters for the SnapshotItemAction's Progress function.
type SnapshotItemProgressInput struct {
// Item is the item that was stored in the backup
Item runtime.Unstructured
// SnapshotID is the snapshot ID returned by SnapshotItemAction
SnapshotID string
// Backup is the representation of the backup resource processed by Velero.
Backup *velerov1api.Backup
}
No changes to the existing format are introduced by this change. A <backup-name>-itemsnapshots.json.gz file will be added that contains the items and snapshot IDs returned by SnapshotItemAction. Also, the creation of the velero-backup.json object will not occur until the backup moves to one of the terminal phases (Completed, PartiallyFailed, or Failed). Reconciliation should ignore backups that do not have a velero-backup.json object.
The cluster that is creating the backup will have the Backup resource present and will be able to manage the backup before the backup completes.
If the Backup resource is removed (e.g. Velero is uninstalled) before a backup completes and writes its velero-backup.json object, the other objects in the object store for the backup will be effectively orphaned. This can currently happen but the current window is much smaller.
The itemsnapshots file is similar to the existing <backup-name>-itemsnapshots.json.gz. Each snapshot taken via SnapshotItemAction will have a JSON record in the file. Exact format TBD.
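Since the exact format is still to be decided, the following is only a hypothetical sketch of what a record might carry, based on the three pieces of information this document says are stored: the plugin identifier, the item, and the snapshot ID. The `ItemSnapshotRecord` type, its field names, and its JSON keys are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// ItemSnapshotRecord is a hypothetical record layout, not a decided format.
type ItemSnapshotRecord struct {
	PluginName string `json:"pluginName"` // which SnapshotItemAction to query for progress
	ItemName   string `json:"itemName"`   // the item that was snapshotted
	SnapshotID string `json:"snapshotID"` // ID returned by the plugin's Execute
}

// encodeRecords writes the records as gzip-compressed JSON.
func encodeRecords(records []ItemSnapshotRecord) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(records); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeRecords reads records back from gzip-compressed JSON.
func decodeRecords(data []byte) ([]ItemSnapshotRecord, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var records []ItemSnapshotRecord
	if err := json.NewDecoder(zr).Decode(&records); err != nil {
		return nil, err
	}
	return records, nil
}

func main() {
	in := []ItemSnapshotRecord{{PluginName: "example.io/csi", ItemName: "pvc-data", SnapshotID: "snap-123"}}
	data, err := encodeRecords(in)
	if err != nil {
		panic(err)
	}
	out, err := decodeRecords(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```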
For systems such as EBS, a snapshot is not available until the storage system has transferred the snapshot to stable storage. CSI snapshots expose the readyToUse state that, in the case of EBS, indicates that the snapshot has been transferred to durable storage and is ready to be used. The CSI BackupItemProgress.Progress method will poll that field and when completed, return completion.
The Velero Plugin for vSphere uploads snapshots to S3 in the background. It is also a BackupItemAction plugin; it will check the status of the Upload records for the snapshot and return progress.
The backup workflow remains the same until we get to the point where the velero-backup.json object is written.
At this point, we will queue the backup to a finalization goroutine. The next backup may then begin. The finalization routine will run across all of the volume snapshots and call the UploadProgress method on each of them. It will then run across all items and call BackupItemProgress.Progress for any that match with a BackupItemProgress.
If all snapshots and backup items have finished uploading (either successfully or with a failure), the backup will move to the appropriate terminal phase and upload the velero-backup.json object to the object store, at which point the backup is complete.
If any of the snapshots or backup items are still being processed, the phase of the backup will be set to the appropriate phase (Uploading or UploadingPartialFailure). If any of the upload progress checks returns an error, the phase will move to UploadingPartialFailure. The backup will then be requeued and rechecked after some time has passed.
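One pass of the finalization routine could look roughly like the sketch below. It is an illustration under assumptions, not the implementation: `progressCheck` is a hypothetical stand-in for a single call to VolumeSnapshotter.UploadProgress or to an item-level Progress method, collapsed to a completed flag and an error, and `checkBackup` is a name invented here.

```go
package main

import (
	"errors"
	"fmt"
)

type BackupPhase string

const (
	PhaseUploading               BackupPhase = "Uploading"
	PhaseUploadingPartialFailure BackupPhase = "UploadingPartialFailure"
	PhaseCompleted               BackupPhase = "Completed"
	PhasePartiallyFailed         BackupPhase = "PartiallyFailed"
)

// progressCheck is a hypothetical stand-in for one plugin progress query.
type progressCheck func() (completed bool, err error)

// Example checks used to exercise the sketch.
func finished() (bool, error)     { return true, nil }
func stillRunning() (bool, error) { return false, nil }
func failedUpload() (bool, error) { return true, errors.New("upload failed") }

// checkBackup runs every check once. It returns the phase the backup should
// be in afterwards and whether the backup must be requeued for another pass.
// current is expected to be Uploading or UploadingPartialFailure.
func checkBackup(current BackupPhase, checks []progressCheck) (BackupPhase, bool) {
	phase := current
	allDone := true
	for _, check := range checks {
		completed, err := check()
		if err != nil {
			// Any error downgrades the backup to the partial-failure track.
			phase = PhaseUploadingPartialFailure
		}
		if !completed {
			allDone = false
		}
	}
	if !allDone {
		return phase, true // requeue and recheck later
	}
	if phase == PhaseUploadingPartialFailure {
		return PhasePartiallyFailed, false
	}
	return PhaseCompleted, false
}

func main() {
	phase, requeue := checkBackup(PhaseUploading, []progressCheck{finished, stillRunning})
	fmt.Println(phase, requeue)
	phase, requeue = checkBackup(phase, []progressCheck{finished, finished})
	fmt.Println(phase, requeue)
}
```

Returning the requeue decision alongside the phase keeps the polling cadence a concern of the caller rather than of the check itself.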
On restart, the Velero server will scan all Backup resources. Any Backup resources which are in the InProgress phase will be moved to the Failed phase. Any Backup resources in the Uploading or UploadingPartialFailure phase will be treated as if they have been requeued: their progress will be checked, and the backup will be requeued or moved to a terminal phase as appropriate.
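The restart rule reduces to a small decision per Backup resource, sketched below. The function name `phaseAfterRestart` is hypothetical, and phases that the restart description does not mention are simply left unchanged in this sketch.

```go
package main

import "fmt"

type BackupPhase string

const (
	PhaseInProgress              BackupPhase = "InProgress"
	PhaseUploading               BackupPhase = "Uploading"
	PhaseUploadingPartialFailure BackupPhase = "UploadingPartialFailure"
	PhaseCompleted               BackupPhase = "Completed"
	PhaseFailed                  BackupPhase = "Failed"
)

// phaseAfterRestart returns the phase a backup should have after the server
// restarts and whether its upload progress should be checked again.
func phaseAfterRestart(p BackupPhase) (next BackupPhase, requeue bool) {
	switch p {
	case PhaseInProgress:
		// The in-memory backup work was lost, so the backup cannot finish.
		return PhaseFailed, false
	case PhaseUploading, PhaseUploadingPartialFailure:
		// Plugins keep snapshot IDs valid across restarts, so polling can resume.
		return p, true
	default:
		return p, false
	}
}

func main() {
	for _, p := range []BackupPhase{PhaseInProgress, PhaseUploading, PhaseCompleted} {
		next, requeue := phaseAfterRestart(p)
		fmt.Println(p, "->", next, requeue)
	}
}
```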
VolumeSnapshotter new plugin APIs
BackupItemProgress new plugin interface
New backup phases
Defer uploading velero-backup.json
AWS EBS plugin UploadProgress implementation
Upload monitoring
Implementation of <backup-name>-itemsnapshots.json.gz file
Restart logic
Change in reconciliation logic to ignore backups that have not completed
CSI plugin BackupItemProgress implementation
vSphere plugin BackupItemProgress implementation (vSphere plugin team)
Futures are here for reference; they may change radically when actually implemented.
Some storage systems have the ability to provide different levels of protection for snapshots. These are termed "Fragile" and "Durable". Currently, Velero expects snapshots to be Durable (they should be able to survive the destruction of the cluster and the storage it is using). In the future we would like the ability to take advantage of snapshots that are Fragile. For example, vSphere snapshots are Fragile (they reside in the same datastore as the virtual disk). The Velero Plugin for vSphere uses a vSphere local/fragile snapshot to get a consistent snapshot, then uploads the data to S3 to make it Durable. In the current design, upload progress will not be complete until the snapshot is ready to use and Durable. It is possible, however, to restore data from a vSphere snapshot before it has been made Durable, and this is a capability we'd like to expose in the future. Other storage systems implement this functionality as well. We will be moving the control of the data movement from the vSphere plugin into Velero.
Some storage systems, such as EBS, are only capable of creating Durable snapshots. There is no usable intermediate Fragile stage.
For a Velero backup, users should be able to specify whether they want a Durable backup or a Fragile backup (Fragile backups may consume fewer resources, be quicker to restore from and are suitable for things like backing up a cluster before upgrading software). We can introduce three snapshot states: Creating, Fragile and Durable. A snapshot would be created with a desired state, Fragile or Durable. When the snapshot reaches the desired or a higher state (e.g. the request was for Fragile but the snapshot went to Durable, as on EBS), the snapshot would be considered complete.
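The "desired or higher state" rule can be expressed by ordering the three states. This is a sketch of the future idea only; the `SnapshotState` type and the `snapshotComplete` helper are hypothetical names, and nothing here is an existing Velero API.

```go
package main

import "fmt"

// SnapshotState orders the proposed states from least to most protected.
type SnapshotState int

const (
	StateCreating SnapshotState = iota
	StateFragile
	StateDurable
)

// snapshotComplete reports whether a snapshot has reached the desired state
// or a higher one (for example, Durable when only Fragile was requested).
func snapshotComplete(current, desired SnapshotState) bool {
	return current >= desired
}

func main() {
	// EBS-style system: the request was for Fragile but the snapshot is Durable.
	fmt.Println(snapshotComplete(StateDurable, StateFragile))
	// vSphere-style system: the local snapshot exists but is not yet uploaded.
	fmt.Println(snapshotComplete(StateFragile, StateDurable))
}
```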