Please remove the unnecessary log file and the unnecessary empty lines. Test-specific information, such as hostnames, should also be removed. Please consider adding some comments to make the code easier to read, and add a README.md explaining how to use these scripts.
num=${CASES[$size]}
dir="${size}_${num}"
ssh ${REMOTE_NAMENODE} "hdfs dfs -mkdir /${dir}"
hadoop jar /root/rui/hadoop-3.2.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.2.0-SNAPSHOT-tests.jar TestDFSIO -write -nrFiles $(($num)) -size ${size}
Please generalize this command so that it does not depend on a local path or a specific version, e.g.: hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO ...
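A minimal sketch of one way to do this, assuming `HADOOP_HOME` points at the Hadoop installation. The helper name `find_dfsio_jar` is hypothetical and is not part of the scripts under review:

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve the jobclient tests jar without
# hard-coding the Hadoop version or an absolute local path.
find_dfsio_jar() {
  local hadoop_home="$1"
  local jar
  # The glob matches whichever Hadoop version is installed.
  for jar in "${hadoop_home}"/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar; do
    if [ -f "${jar}" ]; then
      echo "${jar}"
      return 0
    fi
  done
  echo "no jobclient tests jar found under ${hadoop_home}" >&2
  return 1
}

# Intended usage inside the test loop (not executed here):
#   hadoop jar "$(find_dfsio_jar "${HADOOP_HOME}")" TestDFSIO -write -nrFiles "${num}" -size "${size}"
```

Failing early when no jar matches avoids passing the unexpanded glob pattern to `hadoop jar`.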
8c6ad31 to 9439983
supports/ec-performance-test/config
Outdated
# The namenode's hostname of src cluster
SRC_NODE=

# The namenode's hostname for remote HDFS cluster
Can this be removed, using SRC_NODE instead?
num=${CASES[$size]}
dir="${size}_${num}"
ssh ${REMOTE_NAMENODE} "hdfs dfs -mkdir /${dir}"
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.2.0-SNAPSHOT-tests.jar TestDFSIO -write -nrFiles $(($num)) -size ${size}
We can use hadoop-mapreduce-client-jobclient-*-tests.jar instead.
supports/ec-performance-test/config
Outdated
# The hosts require dropping cache, e.g., HOSTS="host1 host2"
HOSTS=

# the number of mapper for distcp, e.g., MAPPER_NUM="30 60 90"
Please add a comment: for the sake of fair comparison, make sure this value is consistent with the overall number of cmdlet executors of SSM.
Please add to README.md any notes you think are necessary for testing:
- The reason why we set a large time interval in the SSM rule.
- For the sake of fair comparison, DistCP and SSM should have the same parallelism, i.e., the number of mappers for DistCP should be consistent with the overall number of executors for SSM (smart.cmdlet.executors * num_smart_agent). The parallelism is 90 in our test, so this value should be specified in the DistCP command. For SSM, given that there are 9 smart agents, smart.cmdlet.executors should be set to 10 in smart-default.xml.
- The cache should be dropped after each round of testing to avoid any impact from the OS cache. This operation can be included in your test scripts.
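To make the parallelism and cache notes concrete, here is a rough sketch, assuming the 9 smart agents from our test and a space-separated `HOSTS` list as in the config file. The function names and the `RUNNER` override are hypothetical, and writing to /proc/sys/vm/drop_caches requires root on each Linux host:

```shell
#!/usr/bin/env bash
# Overall SSM parallelism = smart.cmdlet.executors * number of smart agents.
ssm_parallelism() {
  local executors_per_agent="$1"
  local num_agents="$2"
  echo $(( executors_per_agent * num_agents ))
}

# Drop the OS page cache on every host passed as an argument.
# RUNNER is a hypothetical override: it defaults to ssh, and
# RUNNER=echo prints the commands instead of running them.
drop_cache_on_hosts() {
  local runner="${RUNNER:-ssh}"
  local host
  for host in "$@"; do
    "${runner}" "${host}" "sync; echo 3 > /proc/sys/vm/drop_caches"
  done
}
```

With 10 executors per agent and 9 agents, `ssm_parallelism 10 9` yields 90, which can then be passed to DistCP through its `-m` option. Calling `drop_cache_on_hosts ${HOSTS}` after each round keeps the OS cache from skewing the next run.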