Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Volume Snapshot Data Movement Design #5968

Merged
merged 1 commit into from
May 24, 2023

Conversation

Lyndon-Li
Copy link
Contributor

Add the design for Volume Snapshot Data Movement

@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch from d80d1b6 to c45a735 Compare March 8, 2023 04:47
@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch 2 times, most recently from 0913403 to c909c1d Compare March 8, 2023 04:53
@codecov-commenter
Copy link

codecov-commenter commented Mar 8, 2023

Codecov Report

Merging #5968 (d58d187) into main (94fec66) will decrease coverage by 0.19%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #5968      +/-   ##
==========================================
- Coverage   39.94%   39.75%   -0.19%     
==========================================
  Files         254      256       +2     
  Lines       22361    23237     +876     
==========================================
+ Hits         8932     9239     +307     
- Misses      12773    13300     +527     
- Partials      656      698      +42     
Impacted Files Coverage Δ
pkg/backup/backup.go 52.58% <0.00%> (-26.30%) ⬇️
pkg/backup/item_collector.go 54.01% <0.00%> (-5.87%) ⬇️
...kupitemaction/v2/restartable_backup_item_action.go 61.64% <0.00%> (-3.58%) ⬇️
pkg/controller/download_request_controller.go 68.22% <0.00%> (-2.08%) ⬇️
...k/backupitemaction/v2/backup_item_action_server.go 31.29% <0.00%> (-1.25%) ⬇️
internal/hook/item_hook_handler.go 87.11% <0.00%> (-0.81%) ⬇️
pkg/controller/backup_controller.go 55.15% <0.00%> (-0.73%) ⬇️
pkg/persistence/object_store.go 52.63% <0.00%> (-0.56%) ⬇️
pkg/cmd/server/server.go 6.14% <0.00%> (-0.49%) ⬇️
pkg/backup/request.go 100.00% <0.00%> (ø)
... and 11 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch from c909c1d to 5dfc612 Compare March 8, 2023 05:07
@Lyndon-Li Lyndon-Li self-assigned this Mar 8, 2023
@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch 7 times, most recently from 35241d3 to 7b44c0c Compare March 10, 2023 03:38
@Lyndon-Li Lyndon-Li marked this pull request as ready for review March 10, 2023 03:51
@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch 3 times, most recently from f3b503e to 8dc8fa6 Compare March 14, 2023 09:05
For backup, we intend to create an extensive architecture for various snapshot types, snapshot accesses and various data accesses. For example, the snapshot specific operations are isolated in Data Mover Plugin and Exposer. In this way, we only need to change the two modules for variations. Likely, the data access details are isolated into uploaders, so different uploaders could be plugged into the workflow smoothly.

For restore, we intend to create a generic workflow that could for all backups. This means the restore is backup source independent. Therefore, for example, we can restore a CSI snapshot backup to another cluster with no CSI facilities or a CSI driver the same as the source cluster.
We still have the Exposer module for restore and it is to expose the target volume to the data path. Therefore, we still have the flexibility to introduce different ways to expose the target volume.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, in the diagram, it looks like the data mover controller should be responsible of creating the PV, this has to be clarified, i.e. the data mover provider will handle provisioning the PV and velero will not do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the PV is created by the data mover. This is clarified in the Create Target PV section, I have added one more line to clarify that Velero should not create the PV if data movement restore is involved.


## Components
**Velero**: Velero controls the backup/restore workflow, it calls BIA/RIA V2 to backup/restore an object that involves data movement, specifically, a PVC or a PV.
**BIA/RIA V2**: BIA/RIA V2 are the protocols between Velero and the data mover plugins. They support asynchronized operations so that Velero backup/restore is not marked as completion until the data movement is done and in the meantime, Velero is free to process other backups during the data movement.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we need to separate the BIA v2 and DMP.

Categorize CSI plugin as a Data Mover Plugin doesn't sound quite right to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here BIA/RIA V2 means the interface/protocol and framework. And Data Mover Plugin means the module that implements the interface.
CSI plugin may not be the only Data Mover Plugin, for example, volume snapshotter could be integrated with data movement in future as another Data Mover Plugin.

Below is the restore workflow:
![restore-workflow.png](restore-workflow.png)

## Components
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we separate the components that are only relevant to internal DM, which are not interesting to other DM providers.

Like Node-Agent, Exposer, VGDP, Uploader, they seem to fail within the scope of Velero internal Data Mover.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should make clear of the components that are built-in DM specific. I have modified the doc to clarify this.

@reasonerjt
Copy link
Contributor

reasonerjt commented Mar 14, 2023

I wish to suggest we put a section to introduce the higher level of the interaction between data movement controller and velero, and then in a separate section we may zoom into the details of the internal data mover, such that other data mover providers will focus on the first section and the contract is more clarified.

@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch 2 times, most recently from f6c8b5a to 4e4950a Compare March 14, 2023 10:56
@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch from 4e4950a to 800de5f Compare March 20, 2023 03:51
@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch 4 times, most recently from 325b710 to 6f827fc Compare March 29, 2023 04:21
@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch 3 times, most recently from 354a2e9 to 07fef22 Compare April 10, 2023 04:10

**Acquire Object Lock**
**Release Object Lock**
There are multiple instances of Data Uploader Controllers and when a DUCR is created, there should be only one of the instances handle the CR.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Data Uploader Controllers should be implemented in the Node-Agent, so the CR should be handled by the Node-Agent pod that shares the same node as the uploading volume, then there is only one candidate, and there is no need to have the lock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the Data Uploader Controller starts to process the DUCR once it is created (its phase is New or ""), then the controller creates the backupPVC, so it means the early phases before the backupPV is provisioned need the lock to make a consensus of which controller is responsible to create the backupPVC.

@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch 2 times, most recently from 38b511a to df32c43 Compare April 20, 2023 11:55
@Lyndon-Li
Copy link
Contributor Author

@shubham-pampattiwar
After a discussion, we think the workflow for the data mover backup deletion is too complicated, since a DataUploadDelete CRD and controller need to be involved.
Therefore, I modified the workflow in the design, please help to review it and let us know for any concerns or suggestions.

@reasonerjt @ywk253100 Please also help to review the same.

@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch from df32c43 to 8e5a6a4 Compare April 20, 2023 12:00
@shubham-pampattiwar
Copy link
Collaborator

@Lyndon-Li Can we not just DeleteItemAction plugin for DataUploadCRs ? so whenever a backup is deleted the CRs also get deleted.

@Lyndon-Li
Copy link
Contributor Author

@shubham-pampattiwar

Can we not just DeleteItemAction plugin for DataUploadCRs ? so whenever a backup is deleted the CRs also get deleted.

Firstly, The aim here is to delete the backup data stored in the backup repo. And only the specific DM knows where the backup data is and how to delete it. So, the question is how do we let the DM know the backup is to be deleted.
Secondly, we cannot let the controller to monitor the deletion event of DataUpload CRs, because kubectl delete DataUpload -n velero could also generate the same event, but we don't want delete the backup data once some user run this command by mistake.

Therefore, Velero needs a private mechanism to notify the DM when it handling a deletebackup request, then the DM does its own work to delete the backup data

@shubham-pampattiwar
Copy link
Collaborator

@shubham-pampattiwar

Can we not just DeleteItemAction plugin for DataUploadCRs ? so whenever a backup is deleted the CRs also get deleted.

Firstly, The aim here is to delete the backup data stored in the backup repo. And only the specific DM knows where the backup data is and how to delete it. So, the question is how do we let the DM know the backup is to be deleted. Secondly, we cannot let the controller to monitor the deletion event of DataUpload CRs, because kubectl delete DataUpload -n velero could also generate the same event, but we don't want delete the backup data once some user run this command by mistake.

Therefore, Velero needs a private mechanism to notify the DM when it handling a deletebackup request, then the DM does its own work to delete the backup data

@Lyndon-Li Thanks for the elaborate explanation, I think I understand now, got confused with the deletion of in-cluster CRs vs Data in backup repository. The modified delete workflow looks sane to me 👍

@Lyndon-Li Lyndon-Li force-pushed the velero-data-movement-design branch from 8e5a6a4 to dd40f7b Compare May 16, 2023 10:43
Copy link
Contributor

@reasonerjt reasonerjt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's approve it.

If we find minor changes required during the implementation, let's make sure they are also reflected in incremental change to this doc.

**Node-Agent**: Node-Agent is an existing Velero module that will be used to host VBDM.
**Exposer**: Exposer is to expose the snapshot/target volume as a path/device name/endpoint that are recognizable by Velero generic data path. For different snapshot types/snapshot accesses, the Exposer may be different. This isolation guarantees that when we want to support other snapshot types/snapshot accesses, we only need to replace with a new Exposer and keep other components as is.
**Velero Generic Data Path (VGDP)**: VGDP is the collective of modules that is introduced in [Unified Repository design][1]. Velero uses these modules to finish data transmission for various purposes. In includes uploaders and the backup repository.
**Uploader**: Uploader is the module in VGDP that reads data from the source and writes to backup repository for backup; while read data from backup repository and write to the restore target for restore. At present, only file system uploader is supported. In future, the block level uploader will be added. For file system uploader, only Kopia uploader will be used, Restic will not be integrated with VBDM.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restic will not be integrated with VBDM.

We wanna highlight this, b/c it means if user wanna use data mover kopia is THE uploader for all fs-level backup.


**Velero**: Velero controls the backup/restore workflow, it calls BIA/RIA V2 to backup/restore an object that involves data movement, specifically, a PVC or a PV.
**BIA/RIA V2**: BIA/RIA V2 are the protocols between Velero and the data mover plugins. They support asynchronized operations so that Velero backup/restore is not marked as completion until the data movement is done and in the meantime, Velero is free to process other backups during the data movement.
**Data Mover Plugin (DMP)**: DMP implement BIA/RIA V2 and it invokes the corresponding data mover by creating the DataUpload/DataDownload CRs. DMP is also responsible to take snapshot of the source volume, so it is a snapshot type specific module. For CSI snapshot data movement, the CSI plugin could be extended as a DMP, this also means that the CSI plugin will fully implement BIA/RIA V2 and support some more methods like Progress, Cancel, etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is trivial but theoretically DMP may not have to be responsible to take snapshot.

For example, a developer may create a BIA plugin A to take snapshot X and return it as an additional item.
Then there's BIA plugin B to handle the snapshot X and move it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requirement could be met technically, the implementation will be like:

  • The DMP we are talking about in this design exposes a BIA for a PVC, which takes snapshot of the PVC and the submit a DUCR to launch a DM
  • Another kind of plugin (we call it DMP or not) could exposes a BIA for a VS (wherever the VS comes from), then this kind of DMP doesn't take snapshot but directly launch a DM

The current design doesn't cover the second case, if required, we can add in an incremental design.

Besides ```additionalItem``` (as the 2nd return value), Execute method will return one more resource list called ```itemToUpdate```, which means the items to be updated and persisted when the async operation completes. For details, visit [general progress monitoring design][2].
Specifically, this mechanism will be used to persist DUCR into the persisted backup data, in another words, DUCR will be returned as ```itemToUpdate``` from Execute method. DUCR contains all the information the restore requires, so during restore, DUCR will be extracted from the backup data.
Additionally, in the same way, a DMP could add any other items into the persisted backup data.
Execute method also returns the ```operationID``` which uniquely identifies the asynchronized operation. This ```operationID``` is generated by plugins. The [general progress monitoring design][2] doesn't restrict the format of the ```operationID```, for Velero CSI plugin, the ```operationID``` is a combination of the backup CR UID and the source PVC (represented by the ```item``` parameter) UID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this means there's re-try for the upload within one backup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the progress monitoring design doesn't have a retry mechanism itself, so whether or not a retry happens is decided by the DM. At present, VBDM doesn't have retry mechanism.

Copy link
Collaborator

@shubham-pampattiwar shubham-pampattiwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ! Thank you @Lyndon-Li

@Lyndon-Li Lyndon-Li merged commit 3ad091d into vmware-tanzu:main May 24, 2023


[1]: ../unified-repo-and-kopia-integration/unified-repo-and-kopia-integration.md
[2]: ../general-progress-monitoring.md
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@Lyndon-Li Lyndon-Li Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing this.
I tried to add the Implemented infix, but finally realized that we cannot do this. At the release time of 1.12, we will also move this design to the Implemented folder. We will do the same for unified-repo-and-kopia-integration folder (we have missed to do so in the release of 1.11), then all the links will be fixed.

Therefore, let's leave it as is for now, and it will be fixed itself at the release time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants