Questions about log probe #72

Closed
kikimo opened this issue Jun 4, 2023 · 3 comments
Labels: question (Further information is requested)

Comments

@kikimo
Contributor

kikimo commented Jun 4, 2023

Look at the following piece of code. The test at the end of this snippet makes raft send up to maxMsgSize bytes of entries even when it is in probe state, i.e. pr.State == tracker.StateProbe:

raft/raft.go, lines 565 to 585 in 3e6cb62:

func (r *raft) maybeSendAppend(to uint64, sendIfEmpty bool) bool {
    pr := r.prs.Progress[to]
    if pr.IsPaused() {
        return false
    }
    lastIndex, nextIndex := pr.Next-1, pr.Next
    lastTerm, errt := r.raftLog.term(lastIndex)
    var ents []pb.Entry
    var erre error
    // In a throttled StateReplicate only send empty MsgApp, to ensure progress.
    // Otherwise, if we had a full Inflights and all inflight messages were in
    // fact dropped, replication to that follower would stall. Instead, an empty
    // MsgApp will eventually reach the follower (heartbeats responses prompt the
    // leader to send an append), allowing it to be acked or rejected, both of
    // which will clear out Inflights.
    if pr.State != tracker.StateReplicate || !pr.Inflights.Full() {
        ents, erre = r.raftLog.entries(nextIndex, r.maxMsgSize)
    }

I have two questions:

  1. Is this expected behaviour (sending up to maxMsgSize bytes of entries while raft is in probe state)?
  2. If it is expected, why don't we send just one entry, or an empty message, while pr.State == tracker.StateProbe? Since in probe state an append message is very likely to be rejected, sending one entry or none might accelerate the probing process (see the sketch after this list).
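For concreteness, here is a minimal standalone sketch of the alternative that question 2 suggests. This is not etcd-raft code: progressState, entry, and entriesToSend are invented for illustration, and real entry selection in raft.go also deals with terms, errors, and inflight limits.

package main

import "fmt"

type progressState int

const (
    stateProbe progressState = iota
    stateReplicate
)

type entry struct {
    Index uint64
    Data  []byte
}

// entriesToSend picks the entries for one append message. While probing, it
// sends at most one entry, so a likely rejection wastes little bandwidth; once
// the follower is known to match, it batches up to maxBytes as usual.
func entriesToSend(all []entry, st progressState, maxBytes uint64) []entry {
    if len(all) == 0 {
        return nil
    }
    if st == stateProbe {
        return all[:1]
    }
    var size uint64
    var out []entry
    for _, e := range all {
        size += uint64(len(e.Data))
        if len(out) > 0 && size > maxBytes {
            break
        }
        out = append(out, e)
    }
    return out
}

func main() {
    ents := []entry{{1, []byte("a")}, {2, []byte("bb")}, {3, []byte("ccc")}}
    fmt.Println(len(entriesToSend(ents, stateProbe, 1024)))     // 1
    fmt.Println(len(entriesToSend(ents, stateReplicate, 1024))) // 3
}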

@ahrtr @pavelkalinnikov

@pav-kv
Contributor

pav-kv commented Jun 6, 2023

  1. Is this expected behaviour (sending up to maxMsgSize bytes of entries while raft is in probe state)?

Yes. This behaviour is "correct" either way. But there are options to save some bandwidth, as you point out, depending on assumptions. I think the current strategy optimistically assumes that the first probe will succeed, and in this case we will save one roundtrip of latency.

I can think of 2 cases when this probing happens:

  1. A stale follower went offline for a while, and since then there have been a few leadership changes and log suffix overwrites. In this case the first append message in the probing state is unlikely to succeed, and there will be a few round trips before the appends stabilize.
  2. During a normal leadership run there was a network hiccup and one append message got lost. The leader will eventually get a rejection, but it will probably recover the flow of appends with a single probing message.

  • If it is expected, why don't we send just one entry, or an empty message, while pr.State == tracker.StateProbe? Since in probe state an append message is very likely to be rejected, sending one entry or none might accelerate the probing process.

What you're suggesting would be best for case (1). The current strategy is better for case (2) in terms of replication latency. There is no option that is obviously always better; it's a trade-off.

It's hard (but maybe not impossible) to distinguish between cases (1) and (2) on the leader side in order to make this decision dynamic, so we stick with the optimistic approach, I guess. I don't have data, though, to support the argument that the optimistic approach is best on average; I think it largely depends on the deployment.
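To illustrate the kind of dynamic policy hinted at here, a hedged sketch follows. It is not part of etcd-raft and every name in it (probePolicy, maxEntriesForProbe, batchLimit) is invented: stay optimistic with a full-size first probe, and fall back to single-entry probes only after a rejection, which roughly favours case (2) first and then case (1).

package main

import "fmt"

type probePolicy struct {
    rejections int // consecutive rejected probes for this follower
}

// maxEntriesForProbe returns how many entries to attach to the next probing
// append: optimistic batching until the first rejection, then one at a time.
func (p *probePolicy) maxEntriesForProbe(batchLimit int) int {
    if p.rejections == 0 {
        return batchLimit // case (2): a lost message, the first probe likely succeeds
    }
    return 1 // case (1): stale follower, expect several rejections
}

func (p *probePolicy) onReject() { p.rejections++ }
func (p *probePolicy) onAccept() { p.rejections = 0 }

func main() {
    var p probePolicy
    fmt.Println(p.maxEntriesForProbe(64)) // 64: optimistic first probe
    p.onReject()
    fmt.Println(p.maxEntriesForProbe(64)) // 1: back off after a rejection
}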

@pav-kv
Contributor

pav-kv commented Jun 6, 2023

Btw, see the related broader-scope issue #64: this and similar user/workload/deployment-dependent flow-control aspects could be delegated to the upper layer rather than hardcoded in the raft package.
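For reference, one flow-control knob already sits at the application boundary: if I read the package correctly, the r.maxMsgSize used in the snippet above comes from Config.MaxSizePerMsg, so a deployment that prefers cheap probes can simply configure it low. A minimal sketch of setting it (values are arbitrary; module path assumed to be go.etcd.io/raft/v3):

package main

import "go.etcd.io/raft/v3"

func main() {
    storage := raft.NewMemoryStorage()
    c := &raft.Config{
        ID:              1,
        ElectionTick:    10,
        HeartbeatTick:   1,
        Storage:         storage,
        MaxSizePerMsg:   64 * 1024, // caps the entry bytes attached to one append, probes included
        MaxInflightMsgs: 256,
    }
    n := raft.StartNode(c, []raft.Peer{{ID: 1}})
    defer n.Stop()
}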

@kikimo
Contributor Author

kikimo commented Jun 7, 2023

Thanks for your reply, closing this issue.

@kikimo kikimo closed this as completed Jun 7, 2023