Volume Snapshot Data Movement Design #5968
Codecov Report
@@ Coverage Diff @@
## main #5968 +/- ##
==========================================
- Coverage 39.94% 39.75% -0.19%
==========================================
Files 254 256 +2
Lines 22361 23237 +876
==========================================
+ Hits 8932 9239 +307
- Misses 12773 13300 +527
- Partials 656 698 +42
For backup, we intend to create an extensible architecture for various snapshot types, snapshot accesses and data accesses. For example, the snapshot-specific operations are isolated in the Data Mover Plugin and the Exposer, so only these two modules need to change for variations. Likewise, the data access details are isolated in uploaders, so different uploaders can be plugged into the workflow smoothly.

For restore, we intend to create a generic workflow that works for all backups. This means the restore is independent of the backup source. Therefore, for example, we can restore a CSI snapshot backup to another cluster with no CSI facilities or with a CSI driver the same as the source cluster.
We still have the Exposer module for restore, and its role is to expose the target volume to the data path. Therefore, we still have the flexibility to introduce different ways to expose the target volume.
In particular, in the diagram, it looks like the data mover controller should be responsible for creating the PV. This has to be clarified, i.e. the data mover provider will handle provisioning the PV and Velero will not do it.
Yes, the PV is created by the data mover. This is clarified in the Create Target PV section, and I have added one more line to clarify that Velero should not create the PV if data movement restore is involved.

## Components
**Velero**: Velero controls the backup/restore workflow. It calls BIA/RIA V2 to back up/restore an object that involves data movement, specifically, a PVC or a PV.
**BIA/RIA V2**: BIA/RIA V2 are the protocols between Velero and the data mover plugins. They support asynchronous operations, so a Velero backup/restore is not marked as completed until the data movement is done and, in the meantime, Velero is free to process other backups during the data movement.
Not sure if we need to separate the BIA v2 and DMP. Categorizing the CSI plugin as a `Data Mover Plugin` doesn't sound quite right to me.
Here BIA/RIA V2 means the interface/protocol and framework. And Data Mover Plugin means the module that implements the interface.
CSI plugin may not be the only Data Mover Plugin, for example, volume snapshotter could be integrated with data movement in future as another Data Mover Plugin.
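To illustrate the split described in this reply (BIA/RIA V2 as the asynchronous protocol, a Data Mover Plugin as one module implementing it), here is a minimal Go sketch. It is not the real Velero plugin interface: `AsyncItemAction`, `fakeDataMoverPlugin` and all method signatures are simplified, hypothetical stand-ins chosen only to show the "start, poll, cancel" shape.

```go
package main

import (
	"errors"
	"sync"
)

// AsyncItemAction is a hypothetical, simplified stand-in for the BIA/RIA V2
// protocol: Execute starts the work and returns at once with an operation ID,
// Progress is polled later, and Cancel aborts the operation.
type AsyncItemAction interface {
	Execute(item string) (operationID string, err error)
	Progress(operationID string) (completed bool, err error)
	Cancel(operationID string) error
}

// fakeDataMoverPlugin is a toy module that implements the protocol.
// Other plugins could implement the same interface independently.
type fakeDataMoverPlugin struct {
	mu   sync.Mutex
	done map[string]bool // operation ID -> completed?
}

func newFakeDataMoverPlugin() *fakeDataMoverPlugin {
	return &fakeDataMoverPlugin{done: map[string]bool{}}
}

// Execute registers the operation and returns without waiting for it.
func (p *fakeDataMoverPlugin) Execute(item string) (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	id := "op-" + item
	p.done[id] = false
	return id, nil
}

// Progress reports whether a previously started operation has completed.
func (p *fakeDataMoverPlugin) Progress(id string) (bool, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	completed, ok := p.done[id]
	if !ok {
		return false, errors.New("unknown operation: " + id)
	}
	return completed, nil
}

// Cancel forgets the operation; unknown IDs are reported as errors.
func (p *fakeDataMoverPlugin) Cancel(id string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, ok := p.done[id]; !ok {
		return errors.New("unknown operation: " + id)
	}
	delete(p.done, id)
	return nil
}

// finish simulates the data mover reporting that the data movement is done.
func (p *fakeDataMoverPlugin) finish(id string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.done[id] = true
}
```

A caller would invoke `Execute` once per PVC, continue with other work, and mark the backup as completed only after `Progress` reports completion for every returned operation ID.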
Below is the restore workflow:
![restore-workflow.png](restore-workflow.png)

## Components
I suggest we separate the components that are only relevant to the internal DM, which are not interesting to other DM providers.
Node-Agent, Exposer, VGDP and Uploader, for example, seem to fall within the scope of the Velero internal Data Mover.
Yes, we should make clear which components are specific to the built-in DM. I have modified the doc to clarify this.
I suggest we add a section that introduces the higher-level interaction between the data movement controller and Velero, and then, in a separate section, zoom into the details of the internal data mover. That way, other data mover providers can focus on the first section and the contract is clearer.
**Acquire Object Lock**
**Release Object Lock**
There are multiple instances of Data Uploader Controllers and, when a DUCR is created, only one of the instances should handle the CR.
The Data Uploader Controllers should be implemented in the Node-Agent, so the CR should be handled by the Node-Agent pod that shares the same node as the uploading volume, then there is only one candidate, and there is no need to have the lock.
Actually, the Data Uploader Controller starts to process the DUCR as soon as it is created (its phase is New or ""), and the controller then creates the backupPVC. This means the early phases, before the backupPV is provisioned, need the lock to reach consensus on which controller is responsible for creating the backupPVC.
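The "only one controller instance handles the CR" requirement discussed in this thread can be sketched in Go as a first-claimant-wins lock. This is only an illustration under stated assumptions: `crStore` is a hypothetical in-memory stand-in for wherever the lock state would really live, and the names `acquire`, `release` and `raceForCR` are invented for this sketch; the thread does not specify the actual locking mechanism.

```go
package main

import "sync"

// crStore is a hypothetical in-memory stand-in for shared lock state.
// It records which controller instance currently owns each CR.
type crStore struct {
	mu     sync.Mutex
	owners map[string]string // CR name -> owning controller
}

func newCRStore() *crStore {
	return &crStore{owners: map[string]string{}}
}

// acquire claims the CR for the controller. It succeeds for the first
// claimant and for the current owner, and fails for everyone else.
func (s *crStore) acquire(cr, controller string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if owner, held := s.owners[cr]; held {
		return owner == controller
	}
	s.owners[cr] = controller
	return true
}

// release frees the CR, but only if the caller is the current owner.
func (s *crStore) release(cr, controller string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.owners[cr] != controller {
		return false
	}
	delete(s.owners, cr)
	return true
}

// raceForCR lets several controller instances try to claim the same CR
// concurrently and returns how many of them won the claim.
func raceForCR(s *crStore, cr string, controllers []string) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	winners := 0
	for _, c := range controllers {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if s.acquire(cr, name) {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}(c)
	}
	wg.Wait()
	return winners
}
```

However many instances race for a newly created DUCR, exactly one claim succeeds, and that instance would go on to create the backupPVC.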
@shubham-pampattiwar @reasonerjt @ywk253100 Please also help to review the same.
@Lyndon-Li Can we not just use a DeleteItemAction plugin for DataUpload CRs, so that whenever a backup is deleted the CRs also get deleted?
Firstly, the aim here is to delete the backup data stored in the backup repo. Only the specific DM knows where the backup data is and how to delete it. So the question is how we let the DM know that the backup is to be deleted. Therefore, Velero needs a private mechanism to notify the DM when it is handling a
@Lyndon-Li Thanks for the elaborate explanation, I think I understand now. I got confused between the deletion of in-cluster CRs and the deletion of data in the backup repository. The modified delete workflow looks sane to me 👍
Signed-off-by: Lyndon-Li <[email protected]>
Let's approve it.
If we find minor changes required during the implementation, let's make sure they are also reflected in incremental change to this doc.
**Node-Agent**: Node-Agent is an existing Velero module that will be used to host VBDM.
**Exposer**: The Exposer exposes the snapshot/target volume as a path/device name/endpoint that is recognizable by the Velero generic data path. For different snapshot types/snapshot accesses, the Exposer may be different. This isolation guarantees that when we want to support other snapshot types/snapshot accesses, we only need to replace the Exposer and can keep the other components as is.
**Velero Generic Data Path (VGDP)**: VGDP is the collection of modules introduced in the [Unified Repository design][1]. Velero uses these modules to finish data transmission for various purposes. It includes uploaders and the backup repository.
**Uploader**: The Uploader is the module in VGDP that reads data from the source and writes it to the backup repository for backup, and reads data from the backup repository and writes it to the restore target for restore. At present, only the file system uploader is supported. In future, the block level uploader will be added. For the file system uploader, only the Kopia uploader will be used; Restic will not be integrated with VBDM.
> Restic will not be integrated with VBDM.
We wanna highlight this, b/c it means if user wanna use data mover kopia is THE uploader for all fs-level backup.
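The isolation that the Exposer description aims for (swap the Exposer, keep the rest of the data path unchanged) can be sketched in Go as follows. Everything here is hypothetical and only illustrative: the `Exposer` interface, the `AccessPoint` type, the two example implementations and the directory names are assumptions, not definitions taken from the design.

```go
package main

import (
	"errors"
	"path"
)

// AccessPoint describes how the data path reaches an exposed volume.
type AccessPoint struct {
	Kind  string // "path", "device" or "endpoint"
	Value string
}

// Exposer is a hypothetical interface: each snapshot type/access method
// would provide its own implementation.
type Exposer interface {
	Expose(volume string) (AccessPoint, error)
}

// hostPathExposer exposes a volume as a file system path under root.
type hostPathExposer struct{ root string }

func (e hostPathExposer) Expose(volume string) (AccessPoint, error) {
	if volume == "" {
		return AccessPoint{}, errors.New("volume name is empty")
	}
	return AccessPoint{Kind: "path", Value: path.Join(e.root, volume)}, nil
}

// blockDeviceExposer exposes a volume as a device name under devDir.
type blockDeviceExposer struct{ devDir string }

func (e blockDeviceExposer) Expose(volume string) (AccessPoint, error) {
	if volume == "" {
		return AccessPoint{}, errors.New("volume name is empty")
	}
	return AccessPoint{Kind: "device", Value: path.Join(e.devDir, volume)}, nil
}

// describeAccess plays the role of the generic data path: it depends only
// on the Exposer interface, never on a concrete implementation.
func describeAccess(e Exposer, volume string) (string, error) {
	ap, err := e.Expose(volume)
	if err != nil {
		return "", err
	}
	return ap.Kind + ":" + ap.Value, nil
}
```

Because `describeAccess` only sees the interface, supporting another snapshot type in this sketch means adding one more `Exposer` implementation and nothing else.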

**Velero**: Velero controls the backup/restore workflow. It calls BIA/RIA V2 to back up/restore an object that involves data movement, specifically, a PVC or a PV.
**BIA/RIA V2**: BIA/RIA V2 are the protocols between Velero and the data mover plugins. They support asynchronous operations, so a Velero backup/restore is not marked as completed until the data movement is done and, in the meantime, Velero is free to process other backups during the data movement.
**Data Mover Plugin (DMP)**: DMP implements BIA/RIA V2 and invokes the corresponding data mover by creating the DataUpload/DataDownload CRs. DMP is also responsible for taking a snapshot of the source volume, so it is a snapshot type specific module. For CSI snapshot data movement, the CSI plugin could be extended as a DMP. This also means that the CSI plugin will fully implement BIA/RIA V2 and support some more methods like Progress, Cancel, etc.
This is trivial but theoretically DMP may not have to be responsible to take snapshot.
For example, a developer may create a BIA plugin A to take snapshot X and return it as an additional item.
Then there's BIA plugin B to handle the snapshot X and move it.
This requirement could be met technically. The implementation would be like this:
- The DMP we are talking about in this design exposes a BIA for a PVC, which takes a snapshot of the PVC and then submits a DUCR to launch a DM
- Another kind of plugin (whether we call it a DMP or not) could expose a BIA for a VS (wherever the VS comes from); this kind of DMP doesn't take a snapshot but directly launches a DM
The current design doesn't cover the second case. If required, we can add it in an incremental design.
Besides ```additionalItem``` (as the 2nd return value), the Execute method will return one more resource list called ```itemToUpdate```, which means the items to be updated and persisted when the async operation completes. For details, visit [general progress monitoring design][2].
Specifically, this mechanism will be used to persist the DUCR into the persisted backup data. In other words, the DUCR will be returned as ```itemToUpdate``` from the Execute method. The DUCR contains all the information the restore requires, so during restore, the DUCR will be extracted from the backup data.
Additionally, in the same way, a DMP could add any other items into the persisted backup data.
The Execute method also returns the ```operationID```, which uniquely identifies the asynchronous operation. This ```operationID``` is generated by plugins. The [general progress monitoring design][2] doesn't restrict the format of the ```operationID```. For the Velero CSI plugin, the ```operationID``` is a combination of the backup CR UID and the source PVC (represented by the ```item``` parameter) UID.
So this means there's re-try for the upload within one backup?
Actually, the progress monitoring design doesn't have a retry mechanism itself, so whether or not a retry happens is decided by the DM. At present, VBDM doesn't have a retry mechanism.
LGTM ! Thank you @Lyndon-Li
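For readers following the `operationID` discussion in this thread, here is a small Go sketch of combining a backup CR UID with a source PVC UID into one identifier and splitting it back. The design text only says the ID is "a combination" of the two UIDs; the `.` separator and the helper names `makeOperationID` and `splitOperationID` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"strings"
)

// makeOperationID joins the two UIDs with a "." separator.
// The separator is an assumption; the design does not prescribe a format.
func makeOperationID(backupUID, pvcUID string) string {
	return backupUID + "." + pvcUID
}

// splitOperationID recovers both UIDs from an ID built by makeOperationID.
// It relies on the UIDs themselves not containing ".".
func splitOperationID(id string) (backupUID, pvcUID string, err error) {
	parts := strings.SplitN(id, ".", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", errors.New("malformed operation ID: " + id)
	}
	return parts[0], parts[1], nil
}
```

Since the same backup and the same PVC always yield the same ID in this sketch, a plugin could look up an in-flight operation from the ID alone, without keeping extra state.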

[1]: ../unified-repo-and-kopia-integration/unified-repo-and-kopia-integration.md
[2]: ../general-progress-monitoring.md
Broken link; now at https://github.com/vmware-tanzu/velero/blob/main/design/Implemented/general-progress-monitoring.md (missing `implemented/` infix)
Thanks for noticing this.
I tried to add the `Implemented` infix, but finally realized that we cannot do this. At the release time of 1.12, we will also move this design to the `Implemented` folder. We will do the same for the `unified-repo-and-kopia-integration` folder (we missed doing so in the release of 1.11), then all the links will be fixed.
Therefore, let's leave it as is for now, and it will fix itself at release time.
Add the design for Volume Snapshot Data Movement