Change: Let user define Snapshot and how to Send/Receive the Snapshot #600
/// The snapshot data.
pub data: C::SD,
This looks good |
@drmingdrmer I just updated this with what I think would be a more useful way of handling snapshots. The main thought is that the raft engine itself should let the user of the library send the snapshot however they want to. The main downside I see is that currently the chunks of the snapshot can fail early if some aspect of the vote changes, but this could be put on the user to handle. I believe this branch's changes simplify the snapshot concept in the engine and give the user more freedom to do as they wish with the snapshot process. |
I do not quite get this: if a vote change causes the snapshotting to shut down, it has to be dealt with by openraft. Such a task cannot be left to the application. If the major change is to define snapshot chunk with |
@drmingdrmer I've gone through and updated the tests where appropriate now. |
It looks like you removed the snapshot streaming entirely. How does an application implement streaming if the snapshot is very large? |
it really becomes a question for the user of the api. The On the receiving the client needs to aggregate the full snapshot before applying raft. But that again can be optimized by the user for their use case. |
If The |
yep thats right. you would not want to just serialize the |
I also think this pr needs some work in relation to the API change. It's just that I'm not sure what the project wants in terms of that. I know from my experience and what I've read from other issues it looks like the engine should not be what defines how the snapshot is sent between nodes, since what defines a user's snapshot can vary so greatly between implementations. Please let me know any suggestions or feedback! |
One of my concerns is that an application has to understand raft protocol very well to define its own RPC APIs. This is why raft-protocol RPCs have to be defined by |
Could you expand on the
Technically all of these can be done in the current branch without these changes. |
Every time enter No matter what
Yes.
This will block
This blocks `RaftCore` too. |
Sorry, do you mean the local The blocking would occur at 3. Since 3 is in the local replication task, my understanding is that the local |
The remote
Sending data from the leader to followers is already done in other tasks. Step 3 won't block. |
Step 4 isn't callable until step 3 has completed. So the full snapshot would already be on the remote client when step 4 is called. I know above it was referenced that |
Let me update the rocks and mem examples to show what I mean. |
If For a stream
I'm not quite sure what
Did you mean to let the leader send multiple chunks, and let the follower buffer all the chunks and then let the follower re-build a I think it works but introduces some complexity, such as the receiving peer has to watch raft vote changes so that it can be canceled. |
Right the
Yep
That's a great point about the vote change. |
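To make the follower-side buffering idea above concrete, here is a minimal sketch, not taken from the PR or from openraft. `ChunkBuffer`, `ReceiveError`, and the use of a plain `u64` as the vote are all hypothetical simplifications: the follower appends chunks in order and drops everything it has buffered as soon as a chunk arrives under a different vote.

```rust
/// Why a buffered transfer was aborted (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum ReceiveError {
    /// The chunk was sent under a different vote than the transfer started with.
    VoteChanged { expected: u64, got: u64 },
    /// The chunk does not start where the buffer currently ends.
    OffsetMismatch { expected: u64, got: u64 },
}

/// Follower-side buffer that collects chunks until the snapshot is complete.
/// The vote is simplified to a single number for illustration.
pub struct ChunkBuffer {
    vote: u64,
    buf: Vec<u8>,
}

impl ChunkBuffer {
    pub fn new(vote: u64) -> Self {
        Self { vote, buf: Vec::new() }
    }

    /// Append one chunk. Returns `Ok(Some(bytes))` once the last chunk
    /// (`done == true`) has arrived, and `Ok(None)` while more are expected.
    pub fn receive(
        &mut self,
        vote: u64,
        offset: u64,
        chunk: &[u8],
        done: bool,
    ) -> Result<Option<Vec<u8>>, ReceiveError> {
        if vote != self.vote {
            // Cancel the transfer: discard everything buffered so far.
            self.buf.clear();
            return Err(ReceiveError::VoteChanged { expected: self.vote, got: vote });
        }
        let expected = self.buf.len() as u64;
        if offset != expected {
            return Err(ReceiveError::OffsetMismatch { expected, got: offset });
        }
        self.buf.extend_from_slice(chunk);
        if done {
            // Hand the complete snapshot bytes to the caller and reset the buffer.
            Ok(Some(std::mem::take(&mut self.buf)))
        } else {
            Ok(None)
        }
    }
}
```

Even in this reduced form, the receiver has to carry vote and offset bookkeeping, which is the extra complexity the comment above points out.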
Any more feedback on this idea? |
Such an abstraction leaves too many things to do for the application developers. Application developers should spend as little time as possible on understanding a framework. As I recall one of the openraft application developers believes the So I'd try not to introduce complexity for application developers if possible. |
Sounds like this change doesn't make much sense as is then. I'll close the PR. |
I still need to do the pr cleanup below.
The goal of this PR is to update the snapshotting process to be more customizable. The first thought was to make the `snapshot` be broken into a `stream` and a `sink`, which is what this PR currently shows. But looking at it more, I am curious if even this isn't quite right. I know the paper goes over the RPC for InstallSnapshot and that the Raft engine should process the chunks of the snapshot. But is this really necessary? Couldn't the RPC be simplified to be
Where `C::SD` is the snapshot data, which can be any struct. And have the user of the API handle how it should be sent and received. This would be much more flexible and remove all the logic in the raft_core around building the snapshot parts, since the user will have already done that in the best way for their use case.
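As an illustration of the shape being proposed, here is a rough sketch of an InstallSnapshot request that carries application-defined snapshot data instead of engine-managed chunks. Only the `data: C::SD` field comes from the PR; `TypeConfig`, `SnapshotMeta`, `KvConfig`, `accept_snapshot`, and the `u64` vote are hypothetical stand-ins and not openraft's actual API.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the application's type configuration.
pub trait TypeConfig {
    /// Application-defined snapshot data; can be any struct.
    type SD;
}

/// Hypothetical snapshot metadata, reduced to two fields for illustration.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SnapshotMeta {
    pub last_log_index: u64,
    pub snapshot_id: String,
}

/// Sketch of a simplified request: no offset or "done" fields, because
/// chunking (if any) is left to the application.
pub struct InstallSnapshotRequest<C: TypeConfig> {
    /// The sender's vote, simplified to a single number (assumption).
    pub vote: u64,
    pub meta: SnapshotMeta,
    /// The snapshot data.
    pub data: C::SD,
}

/// Example application config whose snapshot is a plain key-value map.
pub struct KvConfig;

impl TypeConfig for KvConfig {
    type SD = BTreeMap<String, String>;
}

/// Receiver side: reject requests sent under a stale vote, otherwise hand
/// the snapshot data over to be installed.
pub fn accept_snapshot<C: TypeConfig>(
    current_vote: u64,
    req: InstallSnapshotRequest<C>,
) -> Option<C::SD> {
    if req.vote < current_vote {
        None
    } else {
        Some(req.data)
    }
}
```

With this shape the engine only checks the vote and passes `data` through; how a large `C::SD` is streamed or reassembled is entirely the application's concern, which is the trade-off debated in the conversation above.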