ParaRC can run in standalone
mode, which we can test the parallel repair without the integration to a distributed storage system, and a HDFS3 integration
mode, which we can test parallel repair with Hadoop-3.3.4. We first introduce the standalone mode, and then the HDFS3 integration mode.
To run ParaRC in the standalone mode, we need to finish the following steps:
- Prepare a cluster in Alibaba Cloud
- Compile ParaRC
- Prepare configuration files for each machine
- Generate the MLP in the offline
- Generate a stripe of blocks
- Start ParaRC
- Test parallel repair
To run ParaRC in the HDFS3 integration mode, we need to finish the following steps:
-
Prepare a cluster in Alibaba Cloud *
-
Compile ParaRC
-
Deploy Hadoop-3.3.4 with OpenEC *
-
Prepare configuration files for each machine *
-
Generate the MLP in the offline
-
Generate a stripe of blocks in Hadoop-3.3.4 *
-
Start ParaRC
-
Test parallel repair
Note that the steps marked with * in the HDFS3 integration mode are different with those in the standalone mode.
Machine | Number | Alibaba Machine Type | IP |
---|---|---|---|
PRS Generator | 1 | ecs.r7.2xlarge | 192.168.0.1 |
Controller | 1 | ecs.r7.xlarge | 192.168.0.2 |
Agent | 15 | ecs.r7.xlarge | 192.168.0.3; 192.168.0.4; 192.168.0.5; 192.168.0.6; 192.168.0.7; 192.168.0.8; 192.1;68.0.9; 192.168.0.10; 192.168.0.11; 192.168.0.12; 192.168.0.13; 192.168.0.14; 192.168.0.15; 192.168.0.16; 192.168.0.17; |
Among the 15 agents, the last one (192.168.0.17) serves as the new node that replaces a failed node.
In each machine, we create a default username called pararc
.
We need to install some third party libraries to compile and run ParaRC.
-
isa-l library
$> git clone https://github.com/01org/isa-l.git $> cd isa-l/ $> ./autogen.sh $> ./configure $> make $> sudo make install
-
cmake
$> sudo apt-get install cmake
Please download the source code in /home/pararc/ParaRC
. Then compile the source code by :
$> cd /home/pararc/ParaRC
$> ./compile.sh
Each machine requires a configuration file /home/pararc/ParaRC/conf/sysSettings.xml
. We show the configruation parameters in the following table:
Parameters | Description | Example |
---|---|---|
prsgenerator.addr | The IP address of the PRS Generator. | 192.168.0.1 |
controller.addr | The IP address of the controller. | 192.168.0.2; |
agents.addr | The IP address of all agents. | 192.168.0.3; 192.168.0.4; 192.168.0.5; 192.168.0.6; 192.168.0.7; 192.168.0.8; 192.1;68.0.9; 192.168.0.10; 192.168.0.11; 192.168.0.12; 192.168.0.13; 192.168.0.14; 192.168.0.15; 192.168.0.16. |
fullnode.addr | The IP address of the new node. | 192.168.0.17; |
controller.thread.num | The number of threads in the controller. | 20 |
agent.thread.num | The number of threads in each agent. | 20 |
cmddist.thread.num | The number of threads to distribute the commands. | 10 |
local.addr | The local IP address of a machine. | 192.168.0.2 for the controller; 192.168.0.3 for the first agent. |
block.directory | The directory to store blocks. | /home/pararc/ParaRC/blkDir for standalone mode; /home/pararc/hadoop-3.3.4-src/hadoop-dist/target/hadoop-3.3.4/dfs/data/current for hdfs-3 integration mode. |
stripestore.directory | The directory to store stripe metadata. | /home/pararc/ParaRC/stripeStore |
tradeoffpoint.directory | The directory to store the MLP generated by PRS Generator offline. | /home/pararc/ParaRC/tradeoffPoint |
pararc.mode | The mode of ParaRC | standalone for standalone mode; hdfs3 for hdfs-3 integration mode. |
Here is a sample of the configuration file in the controller in /home/pararc/ParaRC/conf/sysSettings.xml
<setting>
<attribute><name>prsgenerator.addr</name><value>192.168.0.1</value></attribute>
<attribute><name>controller.addr</name><value>192.168.0.2</value></attribute>
<attribute><name>agents.addr</name>
<value>192.168.0.3</value>
<value>192.168.0.4</value>
<value>192.168.0.5</value>
<value>192.168.0.6</value>
<value>192.168.0.7</value>
<value>192.168.0.8</value>
<value>192.168.0.9</value>
<value>192.168.0.10</value>
<value>192.168.0.11</value>
<value>192.168.0.12</value>
<value>192.168.0.13</value>
<value>192.168.0.14</value>
<value>192.168.0.15</value>
<value>192.168.0.16</value>
</attribute>
<attribute><name>fullnode.addr</name>
<value>192.168.0.17</value>
</attribute>
<attribute><name>controller.thread.num</name><value>20</value></attribute>
<attribute><name>agent.thread.num</name><value>20</value></attribute>
<attribute><name>cmddist.thread.num</name><value>10</value></attribute>
<attribute><name>local.addr</name><value>192.168.0.2</value></attribute>
<attribute><name>block.directory</name><value>/home/lixl/ParaRC/blkDir</value></attribute>
<attribute><name>stripestore.directory</name><value>/home/lixl/ParaRC/stripeStore</value></attribute>
<attribute><name>tradeoffpoint.directory</name><value>/home/lixl/ParaRC/tradeoffPoint</value></attribute>
<attribute><name>pararc.mode</name><value>standalone</value></attribute>
</setting>
For configuration files in agents, please also prepare a configuration file under /home/pararc/ParaRC/conf
.
To generate MLP for an (n, k) MSR code, we need to do:
- Run
GenMLP
forn
times to generate MLP to repair the n blocks in a stripe in thePRS Generator
. - Prepare a MLP file that records the MLPs for future access in the
Controller
To generate a MLP, please run the following command in the PRS Generator:
$> ./GenMLP [code] [n] [k] [w] [repairIdx]
-
code
- The code that we generate MLP for.
- We can set it as
Clay
for Clay codes.
-
n
- The erasure coding parameter
n
- For example,
n=14
for (14,10) Clay code.
- The erasure coding parameter
-
k
- The erasure coding parameter k
- For example,
k=10
for (14,10) Clay code.
-
w
- The sub-packetization level.
- For example,
w=256
for (14,10) Clay code.
-
repairIdx
- The index of the block that we repair in a stripe.
- For example,
repairIdx = 0
when we generate MLP to repair the first block in a stripe.
-
Example
./GenMLP Clay 14 12 256 0
After running this command, we will get a string, which represents the MLP we generate to repair block with index 0
. The string specifies the color of a intermediate vertex that represents a intermediate sub-block or a repaired sub-block.
After we generate n MLPs, we generate a MLP file, which is a .xml
file in the tradeoffpoint.directory
. The parameters in the MLP file are as follows:
Parameter | Description | Example |
---|---|---|
code | Type of an erasure code. | Clay |
ecn | Erasure coding parameter n. | 14 |
eck | Erasure coding parameter k. | 10 |
ecw | Sub-packetization level. | 256 |
digits | The number of digits that represents a color. For example, as n = 14 , there are 14 possible colors in total, such that we need to represent each color by a two-digit number, from 0 to 13 . |
2 |
point | A MLP to repair index i is represented as i:coloring string , where i is the index of a block, and coloring string is the coloring result for each intermediate vertex. |
0: 0000......1301 |
Please refer to the sample MLP files we generate for the (14,10) Clay code in /home/pararc/ParaRC/tradeoffPoint
.
To generate a stripe of blocks, we need to do:
- Run
GenData
to generate n blocks erasure-coded by a MSR code in theController
. - Distribute the n blocks to n different agents.
- Generate a metadata file to record the metadata information of the stripe in
stripeStore.directory
.
Here is the command to generate a stripe of blocks:
$> ./GenData [code] [n] [k] [w] [stripeidx] [blockbytes] [subpktbytes]
- code
- The name of an erasure code.
- For example,
Clay
for Clay codes.
- n
- The erasure coding parameter n.
- For example,
n=14
for (14,10) Clay code.
- k
- The erasure coding parameter k.
- For example,
k=10
for (14,10) Clay code.
- w
- The sub-packetization level
- For example,
w=256
for (14,10) Clay code.
- stripeidx
- The index of the stripe.
- For example,
stripeidx=0
for the first stripe we generate.
- blockbytes
- The size of a block in bytes.
- For example,
blockbytes=268435456
for the block size of 256 MiB.
- subpktbytes
- The size of a sub-packet in bytes.
- For example,
subpktbytes=65536
for the sub-packet size of 64 KiB.'
For example, we generate a stripe of (14,10) Clay-coded stripe as follows:
$> ./GenData Clay 14 10 256 0 268435456 65536
Now we have a stripe of blocks in block.directory
:
$> ls blkDir
stripe-0-0 stripe-0-1 stripe-0-10 stripe-0-11 stripe-0-12 stripe-0-13 stripe-0-2 stripe-0-3 stripe-0-4 stripe-0-5 stripe-0-6 stripe-0-7 stripe-0-8 stripe-0-9
The name of a block follows the format of stripe-stripeidx-blockidx
. For example, stripe-0-9
specifies that this block is in stripe 0
, and it is with the block index 9
.
Then, we distribute blocks to 14 different agents:
$> for i in {0..13}
> do
> nodeid=`expr $i + 1`
> scp stripe-0-$i agent$nodeid:/home/pararc/ParaRC/blkDir
> done
Now, the 14 agents, from agent1 to agent14, all have a block in block.directory
.
Finally, we generate the metadata file Clay-0.xml
, meaning that this stripe is erasure-coded by Clay code, and the stripe index is 0
. The parameters in a metadata file is as follows:
Parameters | Description | Example |
---|---|---|
code | The name of the erasure code. | Clay |
ecn | The erasure coding parameter n. | 14 for (14,10) Clay code. |
eck | The erasure coding parameter k. | 10 for (14,10) Clay code. |
ecw | The sub-packetization level. | 256 for (14,10) Clay code. |
stripename | The name of a stripe. | Clay-0 in our example. |
blocklist | The list of blocks in the stripe, including the name of each block, and IP of corresponding node that stores each block. | stripe-0-0:192.168.0.3 ...... |
blockbytes | The size of a block in bytes. | 268435456 for 256 MIB. |
subpktbytes | The size of a sub-packet in bytes. | 65536 for 64 KiB sub-packet. |
Here is the sample metadata file Clay-0.xml
:
<stripe>
<attribute><name>code</name><value>Clay</value></attribute>
<attribute><name>ecn</name><value>14</value></attribute>
<attribute><name>eck</name><value>10</value></attribute>
<attribute><name>ecw</name><value>256</value></attribute>
<attribute><name>stripename</name><value>Clay-0</value></attribute>
<attribute><name>blocklist</name>
<value>stripe-0-0:192.168.0.3</value>
<value>stripe-0-1:192.168.0.4</value>
<value>stripe-0-2:192.168.0.5</value>
<value>stripe-0-3:192.168.0.6</value>
<value>stripe-0-4:192.168.0.7</value>
<value>stripe-0-5:192.168.0.8</value>
<value>stripe-0-6:192.168.0.9</value>
<value>stripe-0-7:192.168.0.10</value>
<value>stripe-0-8:192.168.0.11</value>
<value>stripe-0-9:192.168.0.12</value>
<value>stripe-0-10:192.168.0.13</value>
<value>stripe-0-11:192.168.0.14</value>
<value>stripe-0-12:192.168.0.15</value>
<value>stripe-0-13:192.168.0.16</value>
<value></value>
</attribute>
<attribute><name>blockbytes</name><value>268435456</value></attribute>
<attribute><name>subpktbytes</name><value>65536</value></attribute>
</stripe>
Run the following command in the controller to start ParaRC.
$> python script/start.py
In the new node, we can test the parallel repair to repair a block:
$> ./DistClient degradeRead [blockname] [method]
- blockname
- The name of a block
- For example, "stripe-0-0" in our example to repair the first block
- method
dist
for parallel repair;conv
for conventional centralized repair
For example, we run the following command to test the parallel repair:
$> ./DistClient degradeRead stripe-0-0 dist
Note that to test for different method, please re-start the system to avoid the case that data are cached in each agent.
To test for full-node recovery, please prepare multiple stripes when generate blocks. Then run the following command to repair all the blocks in a failed node:
$> ./DistClient nodeRepair [ip] [code] [method]
- ip
- The ip of the failed node
- For example,
192.168.0.3
for agent1
- code
- The erasure code for stripes
- For example,
Clay
- method
dist
for parallel repair;conv
for conventional centralized repair
For example, we run the following command to repair agent1
$> ./DistClient nodeRepair 192.168.0.3 Clay dist
The HDFS3 integration of ParaRC is built with OpenEC atop Hadoop-3.3.4. Note that only the steps marked with * are different with those in standalone mode. We focus on the steps marked with * in this part.
-
Prepare a cluster in Alibaba Cloud *
-
Compile ParaRC
-
Deploy Hadoop-3.3.4 with OpenEC *
-
Prepare configuration files for each machine *
-
Generate the MLP in the offline
-
Generate a stripe of blocks in Hadoop-3.3.4 *
-
Start ParaRC
-
Test parallel repair
We can use the same cluster we applied in the standalone mode. The following tables shows how we deploy run ParaRC with Hadoop-3.3.4, OpenEC.
ParaRC | OpenEC | Hadoop |
---|---|---|
PRS Generator | - | - |
Controller | Controller | NameNode |
Agents | Agents | DataNodes |
In the source code of ParaRC, we have a patch for OpenEC:
$> ls openec-pararc-patch
hdfs3-integration openec-patch
- hdfs3-integration
- Source code patch and installation scripts for Hadoop-3.3.4
- openec-patch
- Source code patch for OpenEC
We first deploy Hadoop-3.3.4 in the cluster, then follows OpenEC.
We first download the source code of Hadoop-3.3.4 in /home/pararc/hadoop-3.3.4-src
in the NameNode
. Then we install the pararc patch with Hadoop-3.3.4:
$> cd /home/pararc/ParaRC/openec-pararc-patch/hdfs3-integration
$> ./install.sh
After running the install.sh
, we copy the pararc patch to the source code of Hadoop-3.3.4 and compile the source code of Hadoop-3.3.4.
We follow the document of the configuration for Hadoop-3.0.0 in OpenEC to configure Hadoop-3.3.4. We show the difference here:
-
As the default username is
pararc
, please change accordingly in configuration files. -
Please use the IPs generated in Alibaba Cloud in your account when configuring Hadoop-3.3.4
-
In
hdfs-site.xml
Parameter Description Example dfs.blocksize The size of a block in bytes. 268435456 for block size with 256 MiB. oec.pktsize The size of a packet in bytes. Note that for a MSR code with sub-packetization level w, the size of a packet is w times the size of a sub-packet. 16777216 for (14,10) Clay code, where w=256, sub-packet size is 64 KiB.
After we generate all the configuration files, please follow the document for Hadoop-3.0.0 provided in OpenEC to deploy Hadoop-3.3.4 in the cluster.
We first download the source code of OpenEC in /home/pararc/openec
in the Controller
. Then we install the ParaRC patch with OpenEC.
$> cd /home/pararc/ParaRC/openec-pararc-patch/openec-patch
$> cp -r ./* /home/pararc/openec/src
$> cd /home/pararc/openec/
$> cmake . -DFS TYPE:STRING=HDFS3
$> make
Now we have built the source code of OpenEC with ParaRC patch.
We follow the document of the configuration in OpenEC with the following differences:
-
ec.policy
Parameter Description Example ecid Unique id for an erasure code. clay_14_10 class Class name of erasure code implementation. Clay n Parameter n for the erasure code. 14 k Parameter k for the erasure code. 10 w Sub-packetization level. 256 opt Optimization level for OpenEC . We do not use optimization in OpenEC. -1 -
As the default username is
pararc
, please change accordingly in configuration files. -
Please use the IPs generated in Alibaba Cloud in your account when configuring OpenEC.
After we generate the configuration files, please follow the document in OpenEC to deploy OpenEC in the cluster.
In the configuration files of ParaRC, we have the following differences compared with the configuration in the standalone mode.
Parameters | Description | Example |
---|---|---|
block.directory | The directory to store blocks. | /home/pararc/hadoop-3.3.4-src/hadoop-dist/target/hadoop-3.3.4/dfs/data/current for hdfs-3 integration mode. |
We prepare a script script/gendata.py
to generate blocks in Hadoop-3.3.4.
The usage of this script is:
$> python script/gendata.py [nStripes] [code] [n] [k] [w]
-
nStripes
- The number of stripes to generate.
-
code
- The name of the erasure code.
-
n
- The erasure coding parameter n.
-
k
- The erasure coding parameter k.
-
w
- The sub-packetization level.
-
Example
$> python script/gendata.py 1 Clay 14 10 256
After running this script, we generate a stripe of blocks as well as corresponding metadata file.
Finally, we follow the steps in standalone mode to test the parallel repair.