From d9b6d18acd397758c052efa041720328542b3d51 Mon Sep 17 00:00:00 2001
From: Nick Dokos
Date: Thu, 11 Aug 2022 00:08:11 -0400
Subject: [PATCH 1/3] Fix port requirements in pbench-uperf doc

PBENCH-868

pbench-uperf starts with a server port of 20010, not 20000, so we fix
the doc to agree with the implementation.

The doc also does not mention that the local ports on each system need
to be open for uperf to operate successfully. Add that information,
along with how to find the local port range on a system and how to use
`firewall-cmd' to open it.
---
 agent/bench-scripts/pbench-uperf.md | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/agent/bench-scripts/pbench-uperf.md b/agent/bench-scripts/pbench-uperf.md
index 3bd923e12f..8d4a75555e 100644
--- a/agent/bench-scripts/pbench-uperf.md
+++ b/agent/bench-scripts/pbench-uperf.md
@@ -91,16 +91,28 @@ To temporarily disable (this may give security folks heartburn):
     # systemctl stop firewalld
     # systemctl stop iptables
 
-To temporarily enable port under firewalld use:
+To temporarily enable a port under firewalld use:
 
-    # firewall-cmd --add-port=20000/tcp
+    # firewall-cmd --add-port=20010/tcp
 
-Where "20000" is the default port pbench-uperf will use for the uperf server.
-If you are using multiple servers, then starting with port 20000, pbench-uperf
+Where "20010" is the default port pbench-uperf will use for the uperf server.
+If you are using multiple servers, then starting with port 20010, pbench-uperf
 will use ports in increments of 10. E.g. for 3 client / server pairs, ports
-ports 20000, 20010, and 20020 will be used. Be sure you open those ports on
+20010, 20020, and 20030 will be used. Be sure you open those ports on
 the remote systems ahead of time.
 
+You will also need to open the local (ephemeral) ports on each system. You can
+find the range of the local ports using this command:
+
+    # sysctl net.ipv4.ip_local_port_range
+
+The default range is 32768-60999. You can use the same firewall-cmd command to
+open a range of ports:
+
+    # firewall-cmd --add-port=32768-60999/tcp
+
+and similarly for UDP (if needed).
+
 ## syntax
 
 Important test parameters are listed in their long form but there is also a short form available with **-h** :

From f15a47b30c78ecf610bd762155e49b3c6e0b8a64 Mon Sep 17 00:00:00 2001
From: Nick Dokos
Date: Wed, 10 Aug 2022 23:36:08 -0400
Subject: [PATCH 2/3] Delete trailing whitespace

---
 agent/bench-scripts/pbench-uperf.md | 86 +++++++++++++++++++-------------------
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/agent/bench-scripts/pbench-uperf.md b/agent/bench-scripts/pbench-uperf.md
index 8d4a75555e..2e04bab1ff 100644
--- a/agent/bench-scripts/pbench-uperf.md
+++ b/agent/bench-scripts/pbench-uperf.md
@@ -1,14 +1,14 @@
-This page documents how pbench-uperf can help you with network performance testing. Before covering the specific command-line syntax, 
+This page documents how pbench-uperf can help you with network performance testing. Before covering the specific command-line syntax,
 we discuss why you would use this tool and methods for utilizing it.
 
 # network performance
 
-Network performance has a huge impact on performance of distributed storage, 
-but is often not given the attention it deserves 
-during the planning and installation phases of the cluster lifecycle. 
+Network performance has a huge impact on performance of distributed storage,
+but is often not given the attention it deserves
+during the planning and installation phases of the cluster lifecycle.
 
-The purpose of pbench-uperf is to characterize the capacity 
-of your entire network infrastructure to support the desired level of traffic 
+The purpose of pbench-uperf is to characterize the capacity
+of your entire network infrastructure to support the desired level of traffic
 induced by distributed storage, using multiple network connections in parallel.
 After all, that's what your distributed storage will be doing with the network.
 
@@ -18,33 +18,33 @@ sets of hosts, only 2 hosts at a time.
 
 # examples motivating network testing
 
-The two most common hardware problems impacting distributed storage are, 
-not surprisingly, disk drive failures and network failures. 
+The two most common hardware problems impacting distributed storage are,
+not surprisingly, disk drive failures and network failures.
 Some of these failures do not cause hard errors and are more or less silent,
-but instead cause performance degradation. 
-For example, with a bonded network interface containing two physical network interfaces, 
-if one of the physical interfaces fails (either port on NIC/switch, or cable), 
-then the bonded interface will stay up, but will have less performance 
-(how much less depends on the bonding mode). 
-Another error would be failure of an 10-GbE Ethernet interface to 
-autonegotiate speed to 10-Gbps -- 
-sometimes network interfaces auto-negotiate to 1-Gbps instead. 
-If the TCP connection is experiencing a high rate of packet loss 
+but instead cause performance degradation.
+For example, with a bonded network interface containing two physical network interfaces,
+if one of the physical interfaces fails (either port on NIC/switch, or cable),
+then the bonded interface will stay up, but will have less performance
+(how much less depends on the bonding mode).
+Another error would be failure of an 10-GbE Ethernet interface to
+autonegotiate speed to 10-Gbps --
+sometimes network interfaces auto-negotiate to 1-Gbps instead.
+If the TCP connection is experiencing a high rate of packet loss
 or is not tuned correctly, it may not reach the full network speed supported by the hardware.
 
-So why run parallel netperf sessions instead of just one? 
-There are a variety of network performance problems 
-relating to network topology (the way in which hosts are interconnected), 
-particularly network switch and router topology, that only manifest when 
-several pairs of hosts are attempting to transmit traffic 
-across the same shared resource, 
-which could be a trunk connecting top-of-rack switches or 
-a blade-based switch with insufficient bandwidth to switch backplane, for example. 
+So why run parallel netperf sessions instead of just one?
+There are a variety of network performance problems
+relating to network topology (the way in which hosts are interconnected),
+particularly network switch and router topology, that only manifest when
+several pairs of hosts are attempting to transmit traffic
+across the same shared resource,
+which could be a trunk connecting top-of-rack switches or
+a blade-based switch with insufficient bandwidth to switch backplane, for example.
 Individual netperf/iperf sessions will not find these problems, but **pbench-uperf** will.
 
-This test can be used to simulate flow of data through a distributed filesystem, 
-for example. If you want to simulate 4 Gluster clients, call them c1 through c4, 
-writing large files to a set of 2 servers, call them s1 and s2, 
+This test can be used to simulate flow of data through a distributed filesystem,
+for example. If you want to simulate 4 Gluster clients, call them c1 through c4,
+writing large files to a set of 2 servers, call them s1 and s2,
 you can specify these (sender, receiver) pairs (we'll see how in a second):
 
     (c1,s1), (c2, s2), (c3, s1), (c4, s2)
 
@@ -57,11 +57,11 @@ Finally, if you want to simulate a mixed read-write workload, use these pairs:
 
     (c1,s1), (c2, s2), (c3, s1), (c4, s2), (s1, c1), (s2, c2), (s1, c3), (s2, c4)
 
-More complicated flows can model behavior of non-native protocols, 
-where a cluster node acts as a proxy server - 
-it is a server (for non-native protocol) and a client (for native protocol). 
-For example, such protocols often induce full-duplex traffic 
-which can stress the network differently than unidirectional in/out traffic. 
+More complicated flows can model behavior of non-native protocols,
+where a cluster node acts as a proxy server -
+it is a server (for non-native protocol) and a client (for native protocol).
+For example, such protocols often induce full-duplex traffic
+which can stress the network differently than unidirectional in/out traffic.
 For example, try adding this set of flows to preceding flow:
 
     (s1, s2),.(s2, s3),.(s3, s4),.(s4, s1)
 
@@ -72,9 +72,9 @@ Use the command:
 
     # pbench-uperf -h
 
-You typically run pbench-uperf from a head node or test driver that has password-less ssh access 
-to the set of machines being tested. 
-The hosts running the test do not need ssh access to each other -- 
+You typically run pbench-uperf from a head node or test driver that has password-less ssh access
+to the set of machines being tested.
+The hosts running the test do not need ssh access to each other --
 they only have to allow password-less ssh access from the head node.
 
 ## firewalls
@@ -129,13 +129,13 @@ Important test parameters are listed in their long form but there is also a shor
 
 **FIXME** - not all parameters documented yet.
 
-For high network speeds, multiple uperf instances per node must be used to harness enough CPU power 
+For high network speeds, multiple uperf instances per node must be used to harness enough CPU power
 to drive the network interface to full speed.
 If your test duration is not high enough, you may start to see errors caused by a high standard
 deviation in test results.
 
-The client and server lists should be the same length. 
-pbench-uperf will create a uperf session from clients[k] to servers[k], 
+The client and server lists should be the same length.
+pbench-uperf will create a uperf session from clients[k] to servers[k],
 where clients[k] is the k'th client in the --clients list, and
 servers[k] is the k'th server in the --servers list.
 
@@ -143,18 +143,18 @@ servers[k] is the k'th server in the --servers list.
 
 # results
 
 There are 2 basic forms of performance results:
 
-* throughput -- how much work is done in a unit of time? 
+* throughput -- how much work is done in a unit of time?
   * for **stream** test: Gbit/sec
     * response time is not measurable
   * for **rr** test: exchanges/sec
 
 response time -- average time between beginning of request send and end of response receive
 
-The latter **rr** test is probably most important for understanding what to expect from distributed storage clients, 
-where read and write requests have to be acknowledged. 
+The latter **rr** test is probably most important for understanding what to expect from distributed storage clients,
+where read and write requests have to be acknowledged.
 
-For example, if you have a 100-Gbit network with a round trip time of 1 millisec and a message size of 1 Mbit, 
+For example, if you have a 100-Gbit network with a round trip time of 1 millisec and a message size of 1 Mbit,
 you can transmit the message in 10 microsec. but you can't get a response for 100 times that long!
 Network utilization can be derived from pbench **sar** Mbit/sec results on the network interfaces.
 Scalability can be derived from running a series of these tests with varying numbers of network interfaces and hosts,
-keeping the ratio of threads to interfaces constant. 
+keeping the ratio of threads to interfaces constant.

From c09a3490b32d230acd8446dcc46ef414afcd81fe Mon Sep 17 00:00:00 2001
From: Nick Dokos
Date: Thu, 11 Aug 2022 11:49:05 -0400
Subject: [PATCH 3/3] Fix typos

---
 agent/bench-scripts/pbench-uperf.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/agent/bench-scripts/pbench-uperf.md b/agent/bench-scripts/pbench-uperf.md
index 2e04bab1ff..f27c232bda 100644
--- a/agent/bench-scripts/pbench-uperf.md
+++ b/agent/bench-scripts/pbench-uperf.md
@@ -64,7 +64,7 @@ For example, such protocols often induce full-duplex traffic
 which can stress the network differently than unidirectional in/out traffic.
 For example, try adding this set of flows to preceding flow:
 
-    (s1, s2),.(s2, s3),.(s3, s4),.(s4, s1)
+    (s1, s2), (s2, s3), (s3, s4), (s4, s1)
 
 # how to run it
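
Putting the commands documented in the first patch together, opening everything
uperf needs on one system can look like the following sketch. It assumes the
defaults described in the doc -- three client/server pairs starting at port
20010 and a local port range of 32768-60999 -- so substitute the range reported
by sysctl and the number of pairs in your own --clients/--servers lists, and
include the UDP line only if your test actually uses UDP:

    # firewall-cmd --add-port=20010/tcp
    # firewall-cmd --add-port=20020/tcp
    # firewall-cmd --add-port=20030/tcp
    # firewall-cmd --add-port=32768-60999/tcp
    # firewall-cmd --add-port=32768-60999/udp

Because no `--permanent' option is given, these are runtime-only changes that
disappear on the next firewalld reload or reboot, which matches the
"temporarily enable" wording in the doc.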