-
Notifications
You must be signed in to change notification settings - Fork 726
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pd-client: no need to watch leader #327
pd-client: no need to watch leader #327
Conversation
Since PD server can proxy requests to the leader, PD client don't need to watch the leader any more. Now PD client first try to connect to the leader, if it fails, it will connect to any other PD servers temporarily, and retry to connect to the leader later. Reference Issues: #293
"github.com/pingcap/pd/pkg/rpcutil" | ||
) | ||
|
||
func MustRPCCall(c *C, conn net.Conn, request *pdpb.Request) *pdpb.Response { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not very meaningful to add comment for these test wrappers, and I have filtered out testutil
from golint
.
@@ -55,7 +61,7 @@ endif | |||
mv vendor _vendor/vendor | |||
|
|||
clean: | |||
# clean unix socket | |||
# clean unix socket |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alignment
case <-w.quit: | ||
return | ||
// Add default schema "http://" to addrs. | ||
urls := make([]string, 0, len(w.addrs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract these to a function
|
||
// We can't connect to the leader now, try to reconnect later. | ||
if !isLeader { | ||
time.AfterFunc(reconnectPDTimeout, func() { conn.Close() }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seem that we may meet data race here for connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer using a channel here, and handling this in the following select.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, for reconnect, I think we can first try to connect leader, if success, then we can close origin connection and use the leader connection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The document says that "Multiple goroutines may invoke methods on a Conn simultaneously".
I actually create another goroutine to retry to connect to the leader in the first implementation, but that seems a bit complicated. Maybe I should try to simplify that.
When we can't connect to the leader, we will connect to any other servers temporarily. Meanwhile, we delegate a goroutine to keep trying to connect to the leader. Once the delegated goroutine connected to the leader, it sends the connection to the channel, and we can use the leader connection to replace the old one.
net.Conn | ||
wg sync.WaitGroup | ||
quit chan struct{} | ||
C chan *conn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C is a little short.
|
||
func (c *conn) Close() { | ||
c.Conn.Close() | ||
close(c.quit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any risk of panic if we use c.Conn in goroutine loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it, which goroutine loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only want to make sure if we close conn first, then close quit channel. Is it OK for us to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The c.quit
has nothing to do with the c.Conn
, just to notify the connectLeader
goroutine to quit, seems safe to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
if err == nil { | ||
c := newConn(conn) | ||
c.wg.Add(1) | ||
go c.connectLeader(urls, reconnectPDTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we enter here multi times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will enter this only when we connect to other servers first.
The created goroutine will return if it connects to the leader or the original connection is closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may still meet the leader coroutine doesn't return but current connection breaks and goes to RECONNECT again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's OK, whenever the connection Close()
, it will notify the leader goroutine to return.
That's why I wrap the whole thing into a struct.
return &conn{ | ||
Conn: c, | ||
quit: make(chan struct{}), | ||
ConnChan: make(chan *conn), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a buffer cap 1 here.https://github.com/pingcap/pd/pull/327/files#diff-67a3b07cdf6db518644ac1cb6ffc974cR104 may block
LGTM |
1 similar comment
LGTM |
Since PD server can proxy requests to the leader,
PD client don't need to watch the leader any more.
Now PD client first try to connect to the leader,
if it fails, it will connect to any other PD servers temporarily,
and retry to connect to the leader later.
Update:
When we can't connect to the leader, we will connect to any
other servers temporarily. Meanwhile, we delegate a
goroutine to keep trying to connect to the leader. Once the
delegated goroutine connected to the leader, it sends the
connection to the channel, and we can use the leader
connection to replace the old one.
Reference Issues: #293