Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define itemoperations.json format and update DownloadRequest API. #5752

Merged
merged 1 commit into from
Jan 19, 2023

Conversation

sseago
Copy link
Collaborator

@sseago sseago commented Jan 9, 2023

This is to support uploading async operation metadata to object storage to support progress monitoring.

Thank you for contributing to Velero!

Please add a summary of your change

Does your change fix a particular issue?

Fixes #5494, #5495

Please indicate you've done the following:

  • Accepted the DCO. Commits without the DCO will delay acceptance.
  • [x ] Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
  • Updated the corresponding documentation in site/content/docs/main.

This is to support uploading async operation metadata to
object storage to support progress monitoring.

Signed-off-by: Scott Seago <[email protected]>
@sseago sseago force-pushed the itemoperations-json branch from 3d7bf42 to 70b4238 Compare January 9, 2023 23:19
Status OperationStatus `json:"status"`
}

type BackupOperationSpec struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also applied to RestoreOperation
Do we need to sync this itemoperations.json for backup sync? Consider below cases:

  1. If we install velero in another cluster, then during backup sync, this json should not be synced, because the plugin cannot recover the async operation in this new cluster?
  2. If we install velero again in the same cluster, if the data mover CR is processed by velero's modules, for example, node-agent, I don't think the (formerly InProgress)CR could be re-queued and recovered, then should we sync this json?
  3. If we install velero again in the same cluster, if the data mover CR is processed by a plugging module from 3rd vendors out of velero namespace, the CR processing may not be affected by the velero reinstallation, then we should sync this json?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think there are two questions here:

  1. Do we sync the data?
  2. Does velero attempt to process backups synced (calling progress on operations,etc.)?

These are separate questions -- I think we want the data because it will be displayed in velero describe backup to give full status information on a prior backup -- i.e. it's PartiallyFailed and one operation from this list ended in error. That being said, these json files in object storage are not actually synced to the cluster. Only the kubernetes CRs are synced to the cluster. Things like the itemoperations.json are pulled from the cluster as needed via DownloadRequests, so there's not actually a concern about syncing this data file to the cluster.

But you are correct that velero should not attempt to reconcile operations on a synced backup, Backups pulled from object storage should be considered complete. Currently we don't write a backup to object storage until it's marked complete. With this new feature, we will be writing backups to object storage earlier -- when moving to WaitingForPluginOperations phase. I think the solution here is that when the implementation of the controller logic is added for operation monitoring, we update the logic in the backup sync controller and we don't sync backups that are not in a terminal state -- i.e. if we find 5 backups in the BSL, 2 with phase Completed, 2 with phase PartiallyFailed, and 1 with phase WaitingForPluginOperations, we only sync the first four. The last one won't get synced into remote clusters until it reaches a terminal state.
@shubham-pampattiwar @dymurray does the above look right to you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the json is only used to provide information to velero describe backup, then we don't need to sync it in backup sync, we only pull it on demand and discard it when we finish.

Calling BIA V2 in a backup sync scenarios will make things complicated(as the cases I listed above) and in most cases the call would fail. So personally I agree with the idea that we don't sync the backup until it reaches to a terminal state

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lyndon-Li Yes. We never pull things like itemoperations or logs in sync. Backup sync only pulls down the backup json to recreate the velero backup resource -- so I think we're already on the same page here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 @sseago @Lyndon-Li We will keep this discussion in mind when making the controller changes.

ResourceIdentifier string "json:resourceIdentifier"

// OperationID returned by the BIA plugin
OperationID string "json:operationID"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also applied to RestoreOperation
During the processing of the BIA V2 plugin, it may generate some internal info that are critical for keeping the workflow, let's call it context.
If we want the plugin to recover from the restart, the context must be recovered first, so the current info in BackupOperationSpec is not enough. The consequence is that even though the plugin could recover partial to know that who it is and who it is working for, it cannot recover its work, so it has fail the next call (i.e., Progress) to tell the caller that the async operation cannot continue.
On the other hand, this context is internal to the specific plugin. It is meaningless to the outside world.

How to we cope with this problem? Should we add a plugin specific section in the BackupOperationSpec?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. That's a good question. So in most cases, the operation ID will actually correspond to a CR created and processed by an external controller. For example, a datamover plugin operating on a VolumeSnapshotContent object will create a datamover CR and the operation ID will just be an object reference to that. In the event of a plugin restart, that id is all we need, since the progress method will simply GET the datamover CR and report status from that. If it is an operation internal to the plugin, then we can't necessarily assume that the operation is even recoverable, so an internal plugin async operation probably just errors out on a crash -- calling progress in this case will result in the plugin not being able to find the operation ID, so it will report an error which will let velero know to add an error to the backup/restore and move to partial failure. I don't think adding a plugin-specific config map or something similar here will really solve the problem, and in addition, it would be challenging to find a way to return data of unknown type/size via the protobuf GRPC framework. I think this is ultimately a design consideration on the part of plugin authors -- if you want plugin-crash-resilient longrunning operations, implement them in an external controller, communicating with the plugin via CRs. Implementing async operations as a goroutine or something similar inside the plugin process will be inherently more fragile.
@shubham-pampattiwar @dymurray any thoughts here?

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally understand your concern. On the other hand, I don't know which way is more reasonable for our async operation model. Let's map this to a case in the real life:

  • Person A asks Person B to do a business in a place
  • Since the place is far away, B decides to rent a car to go there. So renting a car is purely an internal operation and the key of the car could be an internal info, a.k.a. the context
  • B needs to return the car after the business is done but before he reports the completion to A

If we don't allow B to have any internal state and context, things have to be like this:

  • Person A asks Person B to do a business in a place
  • B has to ask Person C to do this because he cannot rent a car and keep the key
  • A cannot communicate with C directly because C is the personal agent of B
  • For anything, A has to ask B and B then asks C

Is this what we want? How do we define B's role as it looks very thin?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing in the approved design that would account for augmenting the api with plugin-specific data structures in response. This won't be needed for any of the currently-identified use cases (datamover, etc.). SInce it's not in the approved design and adding it to the GRPC API might be challenging, I would suggest putting anything like that off for a future v3 API update if and when we identify a use case that requires it.

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indeed affects one of the two proposals in the current Velero built-in data movement design, let's discuss this in the community meeting.

If we decide to drop the proposal and take the other one, I have no other concern for postposing this to v3 API, just one thing to mention:
In the v2 API, it is not possible for the plugins to support restart-recovery. It doesn't like that the implementation could decide whether and how it could support restart-recovery.
Still take the above A, B, C for example, because of the v2 API, B cannot keep internal state and context, so C has to do this. However, image how C does this, impossible -- In order to do this, C has to persist something or let other counterpart to keep something:

  • If C persists something, it means we need to develop another mechanism something like the json, it is very complicated and unnecessary
  • If C lets counterpart to keep something, the counterpart need to have the same lifecycle with the task(or in another words, the owner of the task), that is A. Unfortunately, A cannot communicate with C

Then going to the data mover case, we can say that, once the module that hosts the SnapshotBackup/Restore controller restarts, the existing backup/restore will fail.
This is the same as the existing PodVolume backup/restore, but because of the v2 API restriction, we cannot do anything better.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think we have 2 questions to answer here:

  1. do we need to add plugin-specific information to the interface as defined in the previously-approved designs
  2. If yes to above, when is it needed.

Either way, if we need 1), someone will need to submit a design proposal for these additions. Since we don't even know for sure whether the internal datamover needs this, we're not in a position to implement and merge a PR based on this right now. This work can be done in follow-on PRs as modifications to the implementation of the previously-approved design.

As for 2) -- when we need this, if the internal datamover needs this, then we may want to get it into v2 API -- but the difference between getting it in v2 vs. v3 API is just whether we intend to merge it for Velero 1.11 vs. waiting for 1.12. Either way, it should be done as follow-on PRs since holding up current implementation of an approved design for some additional features that haven't been fully defined yet would endanger the release schedule. Lets implement the current design as agreed while we start iterating on what we want to add, if anything. Even if we decide we need it for 1.11, we're not going to save any time by holding current PRs while we iterate over the next design phase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on the plan, let's see if the bult-in data movement needs it after the discussion.
And as we discussed in the meeting, if we need to add another change to the current BIA/RIA v2 interface, let's discuss and see if V3 is really required or we add the same to v2


// OperationID returned by the BIA plugin
OperationID string "json:operationID"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also applied to RestoreOperation
Except the plugin restart case, what else is the persisted json used for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will also be used in velero describe to give user information on the operations associated with a backup/restore, and it may also be used on subsequent reconciles -- i.e. backup 1 is WaitingForPluginOperations, so velero executes backup 2. When it later goes back to reconcile backup 1, it will pull this file down and check status for any unfinished backup. We don't necessarily need to keep every unfinished backup/restore in memory until the longstanding operations are done when Velero isn't otherwise processing the backup/restore.

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I say that - the plugin restart doesn't use this json?
Previously, I thought the plugin requires the info in the json in order to recovery the connection with the backup, the operationID and the CR.
By your explanations and after further thinking, I think this is not required:

  • Velero doesn't know the restart of the plugin server, so it calls the BIA V2 operation with the same OperationID and backup name
  • A new plugin server is started by the plugin framework
  • The plugin server now has the OperationID and the backup name
  • From the backup name, it can find the CR, because when creating the CR the plugin can add something like an annotation to the CR to indicate that when backup is the CR for

So the json is not used at all. Am I correct?

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. backup 1 is WaitingForPluginOperations, so velero executes backup 2. When it later goes back to reconcile backup 1, it will pull this file down and check status for any unfinished backup

This idea sounds good. But let's think one more step:
In this case, after backup 1 goes to WaitingForPluginOperations and Velero starts to execute backup 2, most probably, the plugin client to backup 1 will be closed at this time. Then the plugin server will also close.
It means, this comes to a plugin restart case, so it comes to the question - does the plugin recovery after restart must succeed? If the answer is no, then when Velero reconcile backup 1 later, it cannot call any BIA V2 operation at this time, because the call would return a failure.

Another question is, who is going to update the status in the file since Velero has started to execute backup 2?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lyndon-Li That's right. The plugin itself never needs access to this json. Rather Velero's controllers need the json content in order to know which plugins to poll for progress updates. The plugin only ever needs information from Velero as defined by the plugin API -- i.e. for the Execute call, it uses the Execute input args, for the Progress call it uses the defined Progress input args.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Velero will update the itemoperations.json status each time is runs through the list of outstanding operations and calls Progress on them for a given backup. So in the above example, backup1 completes the main backup process and initial round of Progress calls, updates itemoperations along with the rest of the backup (backup json, log, etc.). Then it runs through backup2. Then it reconciles backup1 again, making progress calls on the outstanding operations, and if there are any status changes, updates the itemoperations json file again via the object store API.

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the plugin server only restarts when the container restarts

Let me further test this by debugging.

Copy link
Contributor

@Lyndon-Li Lyndon-Li Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it reconciles backup1 again, making progress calls on the outstanding operations, and if there are any status changes, updates the itemoperations json file again via the object store API

Thanks for the explanation.
As far as my understanding, frequently persisting the json will involve many network activities and further more, this may not be friendly to the future backup repository features.
If the purpose is for saving memory only, I would suggest we keep the json for every backup in memory, this will not take too much memory because the in-parallel backup count is limited and a single json is with limited size. For example, let say each json is 10KB and we have 1000 in-parallel backups, the total memory usage is around 10 MB, which is totally acceptable.
Moreover, even though we have 1000 in-parallel backups, I believe we will not have all of them in parallel in the async operations. The data movement will take CPU, memory and network bandwidth, having too many async operations in parallel will not speed up them but none of them could complete in a reasonable time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lyndon-Li That makes sense. It may be worth holding off on certain json updates, keeping things in memory for performance reasons. None of that affects this PR, though -- the interface for uploading/downloading this json if/when we do want it is unchanged. We will definitely upload it the first time we update the backup (after initial processing and first pass at operation progress). This will ensure that the full list of operations is persisted so that velero can recover from a restart after this point). If we decide not to update the backup store on every intermediate reconcile, then we can implement the in-memory cache you propose above, and on any subsequent reconcile (i.e. after initial backup run), we check for in-memory progress struct first. if found, we use it, else pull it from object store. At the end of reconcile, if we're completing the backup, we upload it and remove it from the in-memory list of current backups, and if the backup is still not done, we just update in-memory store and move on.

The two main consequences of the above:

  1. we still preserve resilience to Velero restarts -- if a backup has updated operation progress that's not uploaded, on the first reconcile post-restart, Velero will have to make progress calls for operations that had previously been marked completed in-memory but not persisted -- that's OK, though. The operation will still be complete, so the result should be the same.
  2. It's possible that velero describe will report out-of-date information. I'm not sure whether client-side describe calls will be able to access the in-velero-pod-memory progress data. One way around this would be to modify the DownloadRequest controller and in the case of DownloadRequests for a backup/restore's item operations, that controller could force updating the BSL json data. We can work through that in the context of the describe implemetation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good, thanks. For any other details, let's discuss in the implementation PR of backup controller and download request controller.

@Lyndon-Li Lyndon-Li self-assigned this Jan 16, 2023
@weshayutin
Copy link
Contributor

woot, thank you for the collaboration and reviews :)

Copy link
Collaborator

@shubham-pampattiwar shubham-pampattiwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shubham-pampattiwar shubham-pampattiwar merged commit d14879f into vmware-tanzu:main Jan 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement <backup-name>-itemoperations.json.gz file
4 participants