Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to evaluate learner progress #15

Closed
jingyih opened this issue Apr 7, 2019 · 15 comments
Closed

How to evaluate learner progress #15

jingyih opened this issue Apr 7, 2019 · 15 comments

Comments

@jingyih
Copy link
Owner

jingyih commented Apr 7, 2019

Update (5/7/2019): final design #15 (comment)

Let's discuss how to evaluate learner progress before promoting a learner member.

Here is the relevant part from design doc as reference:

Only accept promote request if and only if: The learner node is in a healthy state. The learner is in sync with leader or the delta is within the threshold (e.g. the number of entries to replicate to learner is less than 1/10 of snapshot count, which means it is less likely that even after promotion leader would not need send snapshot to the learner).

So far I have gathered suggestions from @gyuho and @xiang90, but have not yet come up with a complete solution.

Locally, learner knows its own committed index. But it does not have information on leader's raft log(?). So learner needs to query leader in order to understand its 'progress'. Do we need a new raft message type for the query? Maybe use read index query to get leader's committed index and compare it with its own?

After learner finds its 'progress', we can expose it via Ready.SoftState so that etcdserver can use it.

Please let me know your comments / suggestion.

cc @gyuho @xiang90 @WIZARD-CXY @jpbetz

@jingyih
Copy link
Owner Author

jingyih commented Apr 8, 2019

Regarding the idea of using read index query to get leader's committed index, it is not straightforward to implement. Every ReadState resulted from a read index query will end up been consumed by the linearizable read loop. So in order for this idea to work, we probably need to pass additional flag to read index request context. So that later at the output of raft ready channel, we can distinguish the ReadState, bypass the linearizable read loop, and handle it separately. Any thoughts?

Ref:
Current request context for read index query:

if err := s.r.ReadIndex(cctx, ctxToSend); err != nil {

ctxToSend := make([]byte, 8)
id1 := s.reqIDGen.Next()
binary.BigEndian.PutUint64(ctxToSend, id1)

We need something like this:

type ctxToSend struct {
  reqID        uint64
  someFlagName bool // when unset, send the corresponding ReadState to r.readStateC; otherwise send to a different channel to get leader's committed index
}

@jingyih
Copy link
Owner Author

jingyih commented Apr 8, 2019

If you have concerns or better ideas, please let me know. Otherwise I will try to come up with a prototype in the next couple days.

@WIZARD-CXY
Copy link

I'm not an expert on this, but what about this one?
see func newReady(r *raft, prevSoftSt *SoftState, prevHardSt pb.HardState)
we can get r.prs and r.learnPrs in newReady and set it to rd.SoftState it will then be sent to node.readyc then we will get infos from Node.Ready() channel in the case rd := <-r.Ready(): of func (r *raftNode) start(rh *raftReadyHandler) . we can define a raftReadyHandler function to update the learner progress or peer progress just like updateCommittedIndex to update the commit index.

@WIZARD-CXY
Copy link

WIZARD-CXY commented Apr 10, 2019

@jingyih I don't know if I explain clearly above, or I can code a prototype. I think my approach is simple and more native than the read index query approach.

@jingyih
Copy link
Owner Author

jingyih commented Apr 10, 2019

@WIZARD-CXY
My understanding is that ONLY leader tracks the progress of other nodes and actively maintains its r.prs and r.learnerPrs. So a follower node's r.prs does not have information about the leader. On the other hand, the member promote request can be sent to any member.

@jingyih
Copy link
Owner Author

jingyih commented Apr 10, 2019

Now that I thought about it. The method I proposed will not work. Because a random node who receives the member promote call does not have info on learner node, nor does it have proper ways to retrieve it.

@jingyih
Copy link
Owner Author

jingyih commented Apr 10, 2019

So basically a node can either query the progress in raft layer, or in etcdserver layer. If we want the query to happen in etcdserver layer, we need: A) expose progress of leader and learner via rd.SoftState, as suggested by @WIZARD-CXY, and we need to define a way for nodes to retrieve these information from leader (maybe via HTTP?). I like this approach because it decouples etcdserver and raft more cleanly, as compared to query leader and learner progress in raft layer.

@jingyih
Copy link
Owner Author

jingyih commented Apr 11, 2019

Talked to @WIZARD-CXY offline, he is going to prototype this. Regarding retrieving information from leader via HTTP, here are some references:

  1. As an example, this is the code for retrieving members information from peer using HTTP client. Note that in our case we need to identify which peer is leader and only retrieve information from leader.

    func GetClusterFromRemotePeers(lg *zap.Logger, urls []string, rt http.RoundTripper) (*membership.RaftCluster, error) {

  2. How the request is handled.

    func NewPeerHandler(lg *zap.Logger, s etcdserver.ServerPeer) http.Handler {

@WIZARD-CXY
Copy link

Thanks for the info @jingyih. I will prototype this one and send a pr asap. Meanwhile @xiang90 can u take a look?

@jingyih
Copy link
Owner Author

jingyih commented Apr 11, 2019

Let's start with the first step, which is exposing progress via ready softstate.

We can probably avoid the second step which is retrieving info from leader. On client side we may write some helper function to determine the leader endpoint, and only send member promote request to leader.

@WIZARD-CXY
Copy link

WIZARD-CXY commented Apr 12, 2019

good idea

@WIZARD-CXY
Copy link

@jingyih After a closer look, I find sth interesting, check this function raftStatus in package etcdserver. It already have the progress infos if it is a leader. Haha, 踏破铁鞋无觅处 得来全不费工夫.

@WIZARD-CXY
Copy link

@jingyih https://github.com/jingyih/etcd/pull/8/files see place where I mentioned you, we can use raftStatus there.

@jingyih
Copy link
Owner Author

jingyih commented Apr 13, 2019

@jingyih After a closer look, I find sth interesting, check this function raftStatus in package etcdserver. It already have the progress infos if it is a leader. Haha, 踏破铁鞋无觅处 得来全不费工夫.

Nice!

@jingyih
Copy link
Owner Author

jingyih commented May 7, 2019

The final design is:

  1. Leader has necessary information to evaluate learner progress. The information is exposed via raftStatus() function:

    raftStatus func() raft.Status

  2. etcd server will forward member promote request to leader.

Implemented in #19 and #31.

@jingyih jingyih closed this as completed May 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants