Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: raft learners should be returned after applyConfChange #9087

Closed
absolute8511 opened this issue Jan 3, 2018 · 4 comments
Closed

raft: raft learners should be returned after applyConfChange #9087

absolute8511 opened this issue Jan 3, 2018 · 4 comments
Labels

Comments

@absolute8511
Copy link
Contributor

absolute8511 commented Jan 3, 2018

I reviewed some codes for raft learner pr #8751 and found there may be an issue which can affect the ApplyConfChange in the raft node.

// ApplyConfChange applies config change to the local node.
// Returns an opaque ConfState protobuf which must be recorded
// in snapshots. Will never return nil; it returns a pointer only
// to match MemoryStorage.Compact.
ApplyConfChange(cc pb.ConfChange) *pb.ConfState

Since we add new Learners to pb.ConfState, we should return the pb.ConfState with Learners in the raft.

	case cc := <-n.confc:
		if cc.NodeID == None {
			r.resetPendingConf()
			select {
			case n.confstatec <- pb.ConfState{Nodes: r.nodes()}:
			case <-n.done:
			}
			break
		}
		switch cc.Type {
		case pb.ConfChangeAddNode:
			r.addNode(cc.NodeID)
		case pb.ConfChangeAddLearnerNode:
			r.addLearner(cc.NodeID)
		case pb.ConfChangeRemoveNode:
			// block incoming proposal when local node is
			// removed
			if cc.NodeID == r.id {
				propc = nil
			}
			r.removeNode(cc.NodeID)
		case pb.ConfChangeUpdateNode:
			r.resetPendingConf()
		default:
			panic("unexpected conf type")
		}
		select {
		case n.confstatec <- pb.ConfState{Nodes: r.nodes()}:
		case <-n.done:
		}

However, from the code above, we return all the nodes and learners written into the Nodes and leave the Learners empty. Is that by designed or a potential bug?

A possible test case for this as below:

func TestNodeProposeAddLearnerNode(t *testing.T) {
	ticker := time.NewTicker(time.Millisecond * 100)
	defer ticker.Stop()
	n := newNode()
	s := NewMemoryStorage()
	r := newTestRaft(1, []uint64{1}, 10, 1, s)
	go n.run(r)
	n.Campaign(context.TODO())
	stop := make(chan struct{})
	done := make(chan struct{})
	applyConfChan := make(chan struct{})
	go func() {
		defer close(done)
		for {
			select {
			case <-stop:
				return
			case <-ticker.C:
				n.Tick()
			case rd := <-n.Ready():
				s.Append(rd.Entries)
				t.Logf("raft: %v", rd.Entries)
				for _, ent := range rd.Entries {
					if ent.Type == raftpb.EntryConfChange {
						var cc raftpb.ConfChange
						cc.Unmarshal(ent.Data)
						state := n.ApplyConfChange(cc)
						if len(state.Learners) == 0 || state.Learners[0] != cc.ReplicaID {
							t.Fatalf("apply conf change should return new added learner: %v", state.String())
						}
						
						if len(state.Nodes) != 1 {
							t.Fatalf("add learner should not change the nodes: %v", state.String())
						}
						t.Logf("apply raft conf %v changed to: %v", cc, state.String())
						applyConfChan <- struct{}{}
					}
				}
				n.Advance()
			}
		}
	}()
	cc := raftpb.ConfChange{Type: raftpb.ConfChangeAddLearnerNode, ReplicaID: 2}
	n.ProposeConfChange(context.TODO(), cc)
	<-applyConfChan
	close(stop)
	<-done
}
@xiang90
Copy link
Contributor

xiang90 commented Jan 5, 2018

@absolute8511 looks like a bug. can you fix it and add the test?

@xiang90
Copy link
Contributor

xiang90 commented Jan 5, 2018

/cc @siddontang

@siddontang
Copy link
Contributor

Thanks @absolute8511

This is a bug here, but we have not used learner yet.

Can you file an PR to fix it?

@xiang90
Copy link
Contributor

xiang90 commented Jan 18, 2018

this is fixed by #9116

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants