Using the Statistics info in ETCD support using all nodes of the cluster #9
Comments
Just looking at this. |
Thoughts right now.
Later, once we figure out where the calls to this new 'method' go, we can use that to decide where it belongs in the code. I'm guessing it's when we get a 'cannot connect' type exception and start connecting to a new node: once we have connected, we re-up the list. But I really think step one is just keeping the initial list of options and seeing how that goes. We can make it smarter once this is baked and we are happy. |
Ok. So list of nodes in the constructor and the ability to adjust that list afterwards? |
My initial concern is less about 'load' and more about when a failure is On Thu, May 1, 2014 at 10:01 AM, David Smith [email protected]:
|
Agreed. I'm some of the way to this already. Needs a bit of work. |
:) On Thu, May 1, 2014 at 10:04 AM, David Smith [email protected]:
|
Any thoughts on testing this? |
Not yet. I will as I write the code, but the big tests would be something like.
|
Current tests require one etcd node. Will this require multiple? |
or do something more mocky? |
The response back from a server that is down is slow. If that server is first in the list, then every request is very slow. It gets worse if multiple servers are down. |
So we could have a 'composite' EtcdClient to start with, something like 'FailOverEtcdClient'. You can seed it with multiple nodes, and it keeps an ordered list or queue. We peek the top one and use that. If it ever fails, we keep popping and requeuing until we find one that works. Then whenever we find new ones we just 'enqueue' them. Later we can add more sophisticated policies like 'slowness' and such. That keeps the core EtcdClient code nice and simple too. |
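For illustration, the peek / pop-and-requeue / enqueue idea above could look roughly like the sketch below. It is written in Python only to keep it short and language-neutral; the project's own code is .NET. `FailOverEtcdClient` is the name suggested above, but `NodeUnavailable`, the `send` callback, and the `get` method are assumptions made up for this sketch, not the project's API.

```python
from collections import deque
from typing import Callable, Deque, Iterable


class NodeUnavailable(Exception):
    """Hypothetical error raised when a node cannot be reached."""


class FailOverEtcdClient:
    """Keeps an ordered queue of node URLs and always tries the head first."""

    def __init__(self, nodes: Iterable[str], send: Callable[[str, str], str]):
        # `send(node_url, key)` stands in for the real single-node client call
        # (an assumption for this sketch).
        self._nodes: Deque[str] = deque(nodes)
        self._send = send

    def enqueue(self, node: str) -> None:
        """Add a newly discovered node to the back of the queue."""
        if node not in self._nodes:
            self._nodes.append(node)

    def get(self, key: str) -> str:
        # Try each known node at most once per call.
        for _ in range(len(self._nodes)):
            node = self._nodes[0]  # peek the head of the queue
            try:
                return self._send(node, key)
            except NodeUnavailable:
                # Pop the failed head and requeue it at the back.
                self._nodes.rotate(-1)
        raise NodeUnavailable("no reachable etcd node")
```

A node that answers stays at the head, so later calls go straight to it; a node that fails drops to the back and is only retried after every other node has been tried.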
I've got something kind of working and I'm looking at unit testing |
Nope. You are ahead of me here, but my guess is that the gossip protocol On Fri, May 2, 2014 at 10:40 AM, David Smith [email protected]:
|
Yep. Thread.Sleep does it again. I'll commit that now :) My understanding is Raft isn't a gossip protocol? |
derp. just that it's distributed, and it may need time to percolate. :) On Fri, May 2, 2014 at 11:35 AM, David Smith [email protected]:
|
Answered by the coreos guys. |
I'm looking at spinning up etcd instances as part of testing. |
I forgot to reply about the composite EtcdClient. The approach I am taking is to have EtcdClient accept a params array of URIs in its constructor. I've got a Cluster class that EtcdClient initialises. This has a list of nodes to iterate over and a method to demote (needs a less loaded name) a node if you think it's iffy. makeXRequest then just loops over the nodes until it succeeds, demoting any nodes that fail. |
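As a rough illustration of the Cluster approach described above: the sketch below is in Python rather than the project's .NET, and is not the code from the actual commit. `Cluster` and `demote` come from the comment; `make_request`, the `call` callback, and the use of `ConnectionError` as the failure type are assumptions for this sketch.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


class Cluster:
    """Ordered list of node URIs; nodes nearer the front are tried first."""

    def __init__(self, *uris: str):
        if not uris:
            raise ValueError("at least one node URI is required")
        self._nodes: List[str] = list(uris)

    def nodes(self) -> List[str]:
        # Return a snapshot so demotions during iteration do not skip nodes.
        return list(self._nodes)

    def demote(self, uri: str) -> None:
        """Move a suspect node to the end of the list."""
        if uri in self._nodes:
            self._nodes.remove(uri)
            self._nodes.append(uri)


def make_request(cluster: Cluster, call: Callable[[str], T]) -> T:
    """Try `call(uri)` on each node in order, demoting nodes that fail."""
    last_error = None
    for uri in cluster.nodes():
        try:
            return call(uri)
        except ConnectionError as exc:  # assumed failure type for this sketch
            cluster.demote(uri)
            last_error = exc
    raise ConnectionError("all etcd nodes failed") from last_error
```

Because a failed node is moved to the end rather than removed, a server that comes back up is still reachable later; it just stops being the first thing every request waits on.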
Here is what I have so far |
re: etcd binary - i'd be ok with that. a |
put another interface on the EtcdClient and have it implement it 'explicitly' then have the 'modules' take the client in as that interface. That way they will see the method but nothing else will. |
nice. like that. |
problem with tools directory is that the binary needed is dependent on the OS
i find that windows people tend to need more help than the unix folks. On Wed, May 7, 2014 at 8:36 AM, David Smith [email protected]:
|
My new plan is to use vagrant to spin up 3 machines with known IP addresses and then manipulate them over ssh (where necessary) as part of the tests. |
+1-d On Thu, May 8, 2014 at 8:02 AM, David Smith [email protected]
|
The goal here is that if server A is down and we know that there are also servers B and C, we can fall back to those.