feat(backup): 4. meta server send backup_request to replica #1112
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -64,24 +64,28 @@ struct configuration_restore_request | |
9:optional string restore_path; | ||
} | ||
|
||
// meta -> replica | ||
struct backup_request | ||
{ | ||
1:dsn.gpid pid; | ||
2:policy_info policy; | ||
3:string app_name; | ||
4:i64 backup_id; | ||
1:dsn.gpid pid; | ||
2:string app_name; | ||
3:i64 backup_id; | ||
4:backup_status status; | ||
5:string backup_provider_type; | ||
// user specified backup_path. | ||
5:optional string backup_path; | ||
6:optional string backup_root_path; | ||
} | ||
|
||
struct backup_response | ||
{ | ||
1:dsn.error_code err; | ||
2:dsn.gpid pid; | ||
3:i32 progress; // the progress of the cold_backup | ||
4:string policy_name; | ||
5:i64 backup_id; | ||
6:i64 checkpoint_total_size; | ||
3:i64 backup_id; | ||
4:backup_status status; | ||
5:optional dsn.error_code checkpoint_err; | ||
6:optional dsn.error_code upload_err; | ||
Why are two error_code fields needed? Wouldn't one error_code plus a hint be enough?

Good idea. The meta server can distinguish the error by its backup_status.

OK, what is the difference between

So I think that just

With your design, the sender needs to judge the response as follows:

    if (rpc == ok) {
        if (resp.err == ok) {
            if (resp.checkpoint_err == ok) {
                /** if you define `more clear err code as you say`, maybe need: **/
                /** if (err_a == ...) {
                        if (err_b == ...)
                    } **/
            }
        }
    }

This logic is also redundant. I don't recommend this design. You can refer to the original RPC definition, which I think is elegant.

Actually, bulk load still separates rpc_error, download_error and ingestion_err; you can refer to the structure
And in my current design there won't be as many if conditions as in your example. In my design, it is just like
If I combine error and checkpoint_upload_err into one, the code will be like:
I think it is okay to have two errors to distinguish different failures: only one field is redundant, and it makes the structure clearer. In my previous design the checkpoint error and the upload error were also separated, so I have already compromised on my logic :-)

Well, let's wait for others' opinions. If they have no objections, I am willing to accept it as a reasonable design.

@acelyc111 Please give us your suggestion about this comment~ @foreverneverer has a different opinion from mine, and we cannot persuade each other.

If I didn't miss anything, I didn't find where
||
7:optional i32 upload_progress; | ||
8:optional i64 checkpoint_total_size; | ||
} | ||
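The thread on `upload_err` turns on how much branching the sender needs when the response carries several error fields. The following is only a minimal C++ sketch of one way to keep that branching flat with early returns. It assumes a simplified `error_code` enum and a plain struct in place of the generated Thrift type; `classify` and both enum values are hypothetical names, not code from this PR.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in: the real dsn::error_code is not shown in this diff.
enum class error_code { ERR_OK, ERR_FAILED };

// Only the error-related fields of the proposed backup_response.
struct backup_response {
    error_code err = error_code::ERR_OK;       // 1: request-level error
    std::optional<error_code> checkpoint_err;  // 5: optional
    std::optional<error_code> upload_err;      // 6: optional
};

// Flat classification: one early return per failure source, no nested ifs.
// An unset optional field is treated as "no error" (an assumption of this sketch).
std::string classify(error_code rpc_err, const backup_response &resp)
{
    if (rpc_err != error_code::ERR_OK) {
        return "rpc failed";
    }
    if (resp.err != error_code::ERR_OK) {
        return "request rejected";
    }
    if (resp.checkpoint_err.value_or(error_code::ERR_OK) != error_code::ERR_OK) {
        return "checkpoint failed";
    }
    if (resp.upload_err.value_or(error_code::ERR_OK) != error_code::ERR_OK) {
        return "upload failed";
    }
    return "ok";
}
```

Whether the separate fields are worth the extra surface remains the open question of the thread; the sketch only shows that separate fields do not by themselves force nested conditions.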
|
||
// clear all backup resources (including backup contexts and checkpoint dirs) of this policy. | ||
|
Will you use another RPC code? Is there any problem if a user tries to control the cluster with an old version of the shell tool?

This RPC is not sent from the shell tool to the meta server. It goes from the meta server to the replica server, so it won't cause that control problem.

OK, thanks.
I meant how to keep compatibility: if the meta servers are on the new version and the replica servers are on the old version, what will happen if we ask the cluster to do a backup? Is it necessary to add a new RPC code for the new implementation?

If the meta server is on the new version and the replica server is on the old version, I don't consider that a compatible configuration. I think it is dangerous for the meta server and the replica server to run different versions, because in a new version, especially a feature release, the meta server will send many new RPCs that the replica server cannot recognize. It is not necessary to add a new RPC code just for compatibility in the new-meta, old-replica case.

Rolling update is a common case that many users have to face. We do not guarantee that every feature works well in this case, but we should at least avoid a server crash.
If we rewrite the RPC message, it would be better to add a new RPC code and leave the old one as deprecated.

Rolling update is a common case, but updating the meta server first is not recommended. We recommend updating the replica servers first, so it is rare for the meta server to be on the new version while the replica servers are on the old one.

No matter which one is on the new version, won't the new replica server still crash if it receives an old backup request from an old-version meta server?
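The suggestion to add a new RPC code and leave the old one deprecated can be illustrated with a toy dispatcher. This is a hedged sketch in C++, not the project's RPC engine: `rpc_dispatcher`, the code names `RPC_BACKUP_V1` and `RPC_BACKUP_V2`, and the error values are all hypothetical. It shows the idea that a message under an unknown or deprecated code is answered with an error instead of being decoded against the wrong struct layout.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical result codes, not the real dsn::error_code values.
enum class error_code { ERR_OK, ERR_HANDLER_NOT_FOUND, ERR_NOT_SUPPORTED };

// Toy dispatcher keyed by RPC code name; the payload is an opaque string here.
class rpc_dispatcher {
public:
    using handler = std::function<error_code(const std::string &payload)>;

    void register_handler(const std::string &code, handler h)
    {
        handlers_[code] = std::move(h);
    }

    // Unknown codes are rejected with an error; the payload is never decoded.
    error_code dispatch(const std::string &code, const std::string &payload) const
    {
        auto it = handlers_.find(code);
        if (it == handlers_.end()) {
            return error_code::ERR_HANDLER_NOT_FOUND;
        }
        return it->second(payload);
    }

private:
    std::map<std::string, handler> handlers_;
};

// A new-version replica: serves the new code and keeps the old code registered
// as deprecated, so an old meta server gets a clean refusal.
rpc_dispatcher make_new_replica()
{
    rpc_dispatcher d;
    d.register_handler("RPC_BACKUP_V2",
                       [](const std::string &) { return error_code::ERR_OK; });
    d.register_handler("RPC_BACKUP_V1",
                       [](const std::string &) { return error_code::ERR_NOT_SUPPORTED; });
    return d;
}

// An old-version replica only knows the old code.
rpc_dispatcher make_old_replica()
{
    rpc_dispatcher d;
    d.register_handler("RPC_BACKUP_V1",
                       [](const std::string &) { return error_code::ERR_OK; });
    return d;
}
```

With this shape, a mixed-version cluster during a rolling update fails the backup request with an error in either direction: a new meta server talking to an old replica gets a handler-not-found error, and an old meta server talking to a new replica gets a not-supported error.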