etcdserver: initial read index implementation #6212
Conversation
on my local machine, there is a 500% improvement. base s-read:
[benchmark output omitted: before / after, with patch / without patch]

Great. +2x faster!
@gyuho Is this the same setup as https://github.com/coreos/etcd/blob/master/Documentation/op-guide/performance.md? In the perf doc, we use 1000 clients, not 100 clients.
@xiang90 It's with slower machines.
For reference, here are new test results with 8-CPU, 16GB-memory machines. Linearizable reads are now almost the same as serializable reads.
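For anyone who wants to reproduce this kind of comparison, here is a minimal Go sketch using clientv3. The endpoint, key, and iteration count are placeholders rather than the exact setup above; the only difference between the two timed loops is WithSerializable(), which skips the read-index round trip.

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/coreos/etcd/clientv3"
)

// timeGets issues n Gets for the same key and returns the elapsed time.
func timeGets(cli *clientv3.Client, n int, opts ...clientv3.OpOption) time.Duration {
    start := time.Now()
    for i := 0; i < n; i++ {
        if _, err := cli.Get(context.Background(), "foo", opts...); err != nil {
            panic(err)
        }
    }
    return time.Since(start)
}

func main() {
    cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
    if err != nil {
        panic(err)
    }
    defer cli.Close()

    // Linearizable (default) reads go through the read-index path.
    fmt.Println("linearizable:", timeGets(cli, 1000))
    // Serializable reads answer from the local store immediately.
    fmt.Println("serializable:", timeGets(cli, 1000, clientv3.WithSerializable()))
}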
ok := make(chan struct{})

select {
case s.readwaitc <- ok:
I'm wary of this pattern. Would it be possible to do something like

func (s *EtcdServer) notifyLinearRead() <-chan struct{} {
    rlock()
    defer runlock()
    return s.notifyNextLinearRead
}
not really. we need one chan per read.
or do you want the server side to return the chan instead of creating the chan here?
why does it need a channel per read? the logic looks a lot like a barrier...
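For what it's worth, here is a minimal sketch of the barrier-style alternative being suggested, with invented names (server, readNotifyCh, advanceLinearRead): all pending readers share one channel, and closing it releases the whole round.

package main

import "sync"

type server struct {
    mu           sync.RWMutex
    readNotifyCh chan struct{}
}

// notifyLinearRead hands every pending reader the same channel; the
// channel being closed acts as the barrier release.
func (s *server) notifyLinearRead() <-chan struct{} {
    s.mu.RLock()
    defer s.mu.RUnlock()
    return s.readNotifyCh
}

// advanceLinearRead runs once a read index is confirmed: it installs a
// fresh channel for the next round and releases the current waiters.
func (s *server) advanceLinearRead() {
    s.mu.Lock()
    old := s.readNotifyCh
    s.readNotifyCh = make(chan struct{})
    s.mu.Unlock()
    close(old) // wake everyone blocked on this round
}

func main() {
    s := &server{readNotifyCh: make(chan struct{})}
    done := make(chan struct{})
    go func() {
        <-s.notifyLinearRead() // a reader waits at the barrier
        close(done)
    }()
    s.advanceLinearRead()
    <-done
}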
clock-independent version
~10% slower than the clock-dependent one, but acceptable. If the latency between leader and follower is high, though, the latency difference can be significant.
Hi @xiang90
@mitake Yeah, this is etcd's implementation of read index.
@gyuho I see, thanks!
(force-pushed 8d14c11 to b13a23a)
@heyitsanthony all fixed. PTAL. I still need to add backward compatibility for this feature.
result, err := s.processInternalRaftRequest(ctx, pb.InternalRaftRequest{Range: r})
if err != nil {
    return nil, err
}
var resp *pb.RangeResponse
why shuffle this around?
nm, I see: it falls through from linearizable to serializable once it gets the notify.
The approach looks OK, but the synchronization/error handling is a little iffy.
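To make the fall-through concrete, here is a rough self-contained sketch (the types and linearizableReadNotify are simplified stand-ins, not the PR's code): a linearizable Range blocks on the read-index notify and then serves from the local kv exactly like a serializable one.

package main

import (
    "context"
    "fmt"
)

type rangeRequest struct {
    key          string
    serializable bool
}

type server struct {
    kv        map[string]string
    readReady chan struct{} // closed once the read-index round completes
}

// linearizableReadNotify blocks until the current read-index round is
// confirmed, or the caller gives up.
func (s *server) linearizableReadNotify(ctx context.Context) error {
    select {
    case <-s.readReady:
        return nil
    case <-ctx.Done():
        return ctx.Err()
    }
}

// Range: the linearizable path waits for the notify, then falls through
// to the same local read a serializable request performs directly.
func (s *server) Range(ctx context.Context, r *rangeRequest) (string, error) {
    if !r.serializable {
        if err := s.linearizableReadNotify(ctx); err != nil {
            return "", err
        }
    }
    return s.kv[r.key], nil
}

func main() {
    s := &server{kv: map[string]string{"foo": "bar"}, readReady: make(chan struct{})}
    close(s.readReady) // pretend the round already finished
    v, err := s.Range(context.Background(), &rangeRequest{key: "foo"})
    fmt.Println(v, err)
}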
I will try with more testing + benchmarks.
@heyitsanthony All fixed. I made some minor changes, and the performance for the unbuffered case improved, somewhat magically. I do not really understand why, but the result seems to match the simple benchmark you wrote. The test assumes that we can almost always batch the concurrent requests together. The unbuffered version can still be slower than the other two solutions when some requests reach the etcd server at slightly different timestamps and miss the batch (basically, if the requests end up divided into N groups, it consumes N times the resources). But the difference is not huge for most cases, so I guess we should just go with the simplest solution for now.
If this approach looks good, I will resolve the conflicts and fix tests.
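A minimal sketch of the batching assumption discussed above, with invented names (readServer, confirmRead): readers that arrive while one ReadIndex round is in flight share that round's result instead of each paying a quorum round trip.

package main

import (
    "fmt"
    "sync"
)

type readServer struct {
    readwaitc chan struct{} // reader -> loop: "I need a fresh read index"
    mu        sync.Mutex
    notify    chan struct{} // closed when the current round completes
}

// linearizableReadLoop batches waiters: one confirmRead call (standing
// in for a raft ReadIndex quorum round trip) releases every reader that
// joined the round.
func (s *readServer) linearizableReadLoop(confirmRead func()) {
    for range s.readwaitc {
        s.mu.Lock()
        nc := s.notify
        s.notify = make(chan struct{}) // channel for the next round
        s.mu.Unlock()

        confirmRead() // one quorum round trip for the whole batch
        close(nc)     // release every reader in this round
    }
}

// read joins the current round, makes sure a wakeup is pending, and
// blocks until its round's channel is closed.
func (s *readServer) read() {
    s.mu.Lock()
    nc := s.notify
    s.mu.Unlock()
    select {
    case s.readwaitc <- struct{}{}:
    default: // a wakeup is already pending; piggyback on it
    }
    <-nc
}

func main() {
    s := &readServer{readwaitc: make(chan struct{}, 1), notify: make(chan struct{})}
    go s.linearizableReadLoop(func() {}) // real confirmRead elided
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func() { defer wg.Done(); s.read() }()
    }
    wg.Wait()
    fmt.Println("all reads served")
}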
    }
}

func (nc *notifier) close(err error) {
notify?
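From the visible fragment, the notifier appears to pair a channel with an error so waiters learn both that the round finished and how it ended; a small sketch under that assumption (field names guessed), using the suggested notify naming:

package main

import "fmt"

// notifier lets many waiters block on one result: closing c releases
// them, and err records how the round ended.
type notifier struct {
    c   chan struct{}
    err error
}

func newNotifier() *notifier {
    return &notifier{c: make(chan struct{})}
}

// notify records the result and releases everyone blocked on nc.c.
func (nc *notifier) notify(err error) {
    nc.err = err
    close(nc.c)
}

func main() {
    nc := newNotifier()
    go nc.notify(nil) // the read-index round completed without error
    <-nc.c
    fmt.Println("read may proceed, err =", nc.err)
}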
@xiang90 approach looks good. Thanks!
(force-pushed e1cb8e1 to b06b789)
@heyitsanthony All fixed. PTAL.
a few final nits
    return nc.err
case <-ctx.Done():
    return ctx.Err()
case <-s.done:
s.stopping
This is not a routine the etcd server created; it is a per-request routine, so I assume we should use s.done?
ok, it doesn't really make a difference
for !timeout && !done {
    select {
    case rs = <-s.r.readStateC:
        if !bytes.Equal(rs.RequestCtx, ctx) {
done = bytes.Equal(rs.RequestCtx, ctx)
?
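Standalone, the suggested form could look like this (readState and the timeout plumbing are simplified stand-ins for the diff's types): a mismatched read state keeps the loop waiting, and a match ends it.

package main

import (
    "bytes"
    "fmt"
    "time"
)

type readState struct{ RequestCtx []byte }

func main() {
    readStateC := make(chan readState, 2)
    readStateC <- readState{RequestCtx: []byte("other")} // someone else's read
    readStateC <- readState{RequestCtx: []byte("mine")}  // ours

    ctx := []byte("mine") // the id this request registered with raft
    var rs readState
    timeout, done := false, false
    for !timeout && !done {
        select {
        case rs = <-readStateC:
            // the suggested one-liner replaces the if/continue dance
            done = bytes.Equal(rs.RequestCtx, ctx)
        case <-time.After(time.Second):
            timeout = true
        }
    }
    fmt.Println("done:", done, "timeout:", timeout)
}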
func TestCtlV3Elect(t *testing.T) { testCtl(t, testElect) }
func TestCtlV3Elect(t *testing.T) {
    for i := 0; ; i++ {
        fmt.Println(i)
stray debugging output?
@@ -86,6 +95,31 @@ type Authenticator interface {
}

func (s *EtcdServer) Range(ctx context.Context, r *pb.RangeRequest) (*pb.RangeResponse, error) {
also support l-read Txn?
Use read index to achieve l-read.
@heyitsanthony Can we improve Txn in another pull request? It is more complicated than Range: we need to add some code to detect read-only Txns. Then, if the txn is serializable, it can access the local kv immediately; otherwise it must wait for the linearizable notify.
@xiang90 ok, we can defer the txn work, but it needs to go in before 3.1.
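A sketch of the read-only detection step described above, with simplified request types (the real pb.TxnRequest carries RequestOp unions): a txn may take the read path only if every op in both branches is a range.

package main

import "fmt"

// op is a stand-in for pb.RequestOp: isRange marks a read.
type op struct{ isRange bool }

// txnRequest is a stand-in for pb.TxnRequest's success/failure branches.
type txnRequest struct {
    success []op
    failure []op
}

// isTxnReadonly: any non-range op forces the raft proposal path.
func isTxnReadonly(r *txnRequest) bool {
    for _, branch := range [][]op{r.success, r.failure} {
        for _, o := range branch {
            if !o.isRange {
                return false
            }
        }
    }
    return true
}

func main() {
    ro := &txnRequest{success: []op{{isRange: true}}, failure: []op{{isRange: true}}}
    rw := &txnRequest{success: []op{{isRange: false}}}
    fmt.Println(isTxnReadonly(ro), isTxnReadonly(rw)) // true false
}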
lgtm
Will this improvement be backported to v2.3.x?
@ericpai no
The actual read-index implementation in raft still depends on the clock, but this is pretty much what I would expect in etcdserver.
@gyuho Can you please run a benchmark for this and see if there is any perf improvement?
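For readers new to the mechanism, here is a schematic sketch of the read-index flow the PR builds on (types simplified; the heartbeat quorum check is elided): the leader captures its commit index as the read index, confirms it still leads, and the server serves the read once the state machine has applied up to that index.

package main

import (
    "bytes"
    "fmt"
)

// readState mirrors the (index, request-id) pair raft reports back.
type readState struct {
    index      uint64
    requestCtx []byte
}

type node struct {
    commitIndex uint64
    applied     uint64
    readStateC  chan readState
}

// requestReadIndex models the leader side: capture the commit index
// and, after a heartbeat quorum confirms leadership (elided), publish it.
func (n *node) requestReadIndex(rctx []byte) {
    n.readStateC <- readState{index: n.commitIndex, requestCtx: rctx}
}

// waitRead models the server side: find our read state, then wait for
// the apply index to catch up before reading locally.
func (n *node) waitRead(rctx []byte) uint64 {
    for rs := range n.readStateC {
        if !bytes.Equal(rs.requestCtx, rctx) {
            continue // a read state for someone else's request
        }
        if n.applied < rs.index {
            // the real server blocks on an apply wait here until the
            // state machine reaches rs.index
        }
        return rs.index
    }
    return 0
}

func main() {
    n := &node{commitIndex: 7, applied: 7, readStateC: make(chan readState, 1)}
    id := []byte("req-1")
    n.requestReadIndex(id)
    fmt.Println("safe to read locally at index", n.waitRead(id))
}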