This repository has been archived by the owner on Feb 27, 2023. It is now read-only.
Commit
Signed-off-by: yunfeiyangbuaa <[email protected]>
1 parent 7bfa08f, commit eeb9bbb
Showing 17 changed files with 817 additions and 26 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
# Supernode High Availability | ||
|
||
This document describes the design and usage of supernode HA. | ||
|
||
## Two Goals to Achieve | ||
|
||
To implement HA, we must achieve the following two goals: | ||
|
||
- Leader election: If the active supernode breaks down, the HA implementation should elect a new active supernode from the standby supernodes. It must also avoid the split-brain problem. | ||
- Active and standby node synchronization: Because the supernode is stateful, we must keep the standby supernode’s state consistent with the active supernode’s state; otherwise the standby supernode cannot take over the current work after being activated. | ||
|
||
## Tool Introduction | ||
|
||
We can use a distributed key-value store such as etcd, ZooKeeper, or Consul to achieve leader election. Such stores are already used in many HA implementations, for example in Hadoop and Spark. | ||
|
||
Let’s focus on etcd because we will use it to build our design. etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to | ||
be accessed by a distributed system or cluster of machines. It gracefully handles leader elections during network partitions and can tolerate machine failure, even in the leader node. | ||
|
||
## HA Design | ||
|
||
The diagram below illustrates how HA is implemented. | ||
|
||
![ha_design.png](../images/supernode_ha.png) | ||
|
||
### Leader Election | ||
|
||
Every supernode has four statuses: init, standby, active, and kill_itself. Only the ha_mgr can change the status. If a supernode does not use HA, it changes from init to active directly. | ||
If the supernode uses HA, it starts up through the following steps: | ||
|
||
- First: Every supernode tries to acquire a distributed lock (go.etcd.io/etcd/clientv3.Txn) in etcd. On success, etcd returns a unique key that exists as long as the caller holds the lock. The supernode must keep a lease (go.etcd.io/etcd/clientv3.Lease) alive by periodically sending keep-alive messages; otherwise the lease expires, and every resource attached to it (such as key-values and locks) expires with it. The lock is held until Unlock is called on the key or the lease associated with the owner expires. Multiple candidates request the same lock, each attaching its own lease ID to the request. The candidate that gets the lock becomes the leader. Once the leader loses its connection to the etcd server, it can no longer renew its lease, so its lock is revoked. Another candidate with a valid lease then gets the lock and becomes the leader. | ||
- Second: Every supernode learns the election result, that is, which supernode is active. If the supernode itself is active, it changes its status from init to active. Otherwise, it does the following two things: | ||
  - Change its status from init to standby. | ||
  - Create an etcd watch on the lock owned by the active supernode, so that it is notified when the active supernode loses the lock and can take part in the next election. | ||
- Third: A standby supernode creates a lease in etcd and stores its info (IP and standby listen port) as a key-value pair. If it breaks down, the information disappears from etcd when the lease expires. | ||
- Fourth: The active supernode creates an etcd watch on these keys, so it knows every standby supernode's info and standby supernodes can join the HA system dynamically. If a standby supernode breaks down, the active supernode is notified through the etcd lease and watch as soon as the lease expires. | ||
|
||
### Active and Standby Node Synchronization | ||
|
||
We implement status synchronization as follows: | ||
|
||
- First: Every supernode has two listen ports, one for active and one for standby. An active supernode uses the active port to serve dfget. A standby supernode opens only the standby port, to receive requests from the active supernode. Dfget knows only the active supernode, so standby supernodes are unreachable for dfget. | ||
- Second: The active supernode receives dfget's requests and sends all of them to the standby supernodes. The standby supernodes process these requests just as the active supernode does, so their state stays the same as the active supernode's. | ||
- Third: If the active supernode breaks down, dfget tries to ping every supernode it knows. After a standby supernode takes over and opens its active port, dfget succeeds and continues its previous work. | ||
|
||
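The synchronization and failover steps above can be sketched as a small in-memory simulation. This is an illustration under simplifying assumptions, not the supernode's code: the `Supernode` struct, its fields, and the functions `HandleDfget`, `Promote`, and `FindActive` are hypothetical, the "state" is reduced to a per-task request counter, and function calls stand in for the network traffic between the active and standby ports.

```go
package main

import "fmt"

// Supernode is a hypothetical, simplified model of one supernode.
type Supernode struct {
	Name     string
	Alive    bool
	Active   bool           // true when the active port is serving dfget
	Tasks    map[string]int // taskID -> number of requests processed
	Standbys []*Supernode   // only used while this node is active
}

func NewSupernode(name string) *Supernode {
	return &Supernode{Name: name, Alive: true, Tasks: map[string]int{}}
}

// apply mutates local state for one dfget request.
func (s *Supernode) apply(taskID string) { s.Tasks[taskID]++ }

// HandleDfget models the active port: apply the request locally, then
// replay the same request on every live standby.
func (s *Supernode) HandleDfget(taskID string) error {
	if !s.Alive || !s.Active {
		return fmt.Errorf("%s: active port is not open", s.Name)
	}
	s.apply(taskID)
	for _, sb := range s.Standbys {
		if sb.Alive {
			sb.apply(taskID)
		}
	}
	return nil
}

// Promote opens the active port on a node after it wins the election.
func (s *Supernode) Promote(standbys []*Supernode) {
	s.Active = true
	s.Standbys = standbys
}

// FindActive models dfget probing every supernode it knows and
// returning the first one whose active port answers.
func FindActive(nodes []*Supernode) *Supernode {
	for _, n := range nodes {
		if n.Alive && n.Active {
			return n
		}
	}
	return nil
}

func main() {
	a, b := NewSupernode("a"), NewSupernode("b")
	a.Promote([]*Supernode{b})
	nodes := []*Supernode{a, b}

	_ = FindActive(nodes).HandleDfget("task-1")

	a.Alive = false // the active supernode breaks down
	b.Promote(nil)  // b wins the next election

	next := FindActive(nodes)
	_ = next.HandleDfget("task-1")
	fmt.Println(next.Name, next.Tasks["task-1"])
}
```

Because `b` replayed the first request while it was standby, it continues from the same count after the takeover. The sketch ignores questions a real implementation has to answer, such as what happens when forwarding to a standby fails.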
### Virtual IP | ||
|
||
Dfget can use the supernode cluster's address, instead of each supernode's IP, to download files. | ||
|
||
We can use nginx to implement this. This work is in progress. | ||
|
||
## Usage | ||
|
||
We can use these CLI options to configure supernode HA: | ||
|
||
``` | ||
-H, --use-ha set whether to use supernode HA | ||
--etcd-address strings if you use supernode HA,you should set the etcd address to implement ha (default [127.0.0.1:2379]) | ||
--standby-port int if you use supernode HA,you should set the standby port to implement ha (default 8003) | ||
``` | ||
|
||
We can use these commands to deploy a supernode HA cluster: | ||
|
||
```sh | ||
supernode --home-dir /home/admin/supernode --port=8002 --download-port=8001 --advertise-ip=127.0.0.1 -H --standby-port 8003 --etcd-address 127.0.0.1:2379 | ||
supernode --home-dir /home/admin/supernode --port=8004 --download-port=8001 --advertise-ip=127.0.0.2 -H --standby-port 8005 --etcd-address 127.0.0.1:2379 | ||
supernode --home-dir /home/admin/supernode --port=8006 --download-port=8001 --advertise-ip=127.0.0.3 -H --standby-port 8007 --etcd-address 127.0.0.1:2379 | ||
``` |
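To make the defaults and the relationship between the three HA options concrete, here is a small sketch that parses them with Go's standard `flag` package. It is a stand-in for illustration only: `HAConfig`, `addrList`, and `ParseHAFlags` are hypothetical names, the port-range check is an assumption rather than documented behavior, and the supernode's real flag handling may differ. The sketch defines only the three HA options, so other flags from the commands above (such as `--home-dir` or `--port`) would be rejected by it.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"strings"
)

// HAConfig is a hypothetical holder for the three HA options listed above.
type HAConfig struct {
	UseHA       bool
	EtcdAddress []string
	StandbyPort int
}

// addrList lets --etcd-address accept a comma-separated list of endpoints.
type addrList []string

func (a *addrList) String() string { return strings.Join(*a, ",") }

func (a *addrList) Set(v string) error {
	*a = strings.Split(v, ",")
	return nil
}

// ParseHAFlags parses args using the defaults shown in the help text.
func ParseHAFlags(args []string) (*HAConfig, error) {
	cfg := &HAConfig{}
	addrs := addrList{"127.0.0.1:2379"}

	fs := flag.NewFlagSet("supernode", flag.ContinueOnError)
	fs.BoolVar(&cfg.UseHA, "H", false, "shorthand for --use-ha")
	fs.BoolVar(&cfg.UseHA, "use-ha", false, "set whether to use supernode HA")
	fs.Var(&addrs, "etcd-address", "etcd endpoints, comma separated")
	fs.IntVar(&cfg.StandbyPort, "standby-port", 8003, "standby listen port")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	cfg.EtcdAddress = addrs

	// Assumed sanity check: the standby port only matters when HA is on.
	if cfg.UseHA && (cfg.StandbyPort < 1 || cfg.StandbyPort > 65535) {
		return nil, errors.New("standby port must be in 1..65535")
	}
	return cfg, nil
}

func main() {
	cfg, err := ParseHAFlags([]string{
		"-H", "--standby-port", "8005", "--etcd-address", "127.0.0.1:2379",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *cfg)
}
```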