Fix port requirements in pbench-uperf doc #2969

Merged 3 commits, Aug 15, 2022
110 changes (61 additions, 49 deletions) to agent/bench-scripts/pbench-uperf.md
This page documents how pbench-uperf can help you with network performance testing. Before covering the specific command-line syntax,
we discuss why you would use this tool and methods for utilizing it.

# network performance

Network performance has a huge impact on performance of distributed storage,
but is often not given the attention it deserves
during the planning and installation phases of the cluster lifecycle.

The purpose of pbench-uperf is to characterize the capacity
of your entire network infrastructure to support the desired level of traffic
induced by distributed storage, using multiple network connections in parallel.
After all, that's what your distributed storage will be doing with the network.

sets of hosts, only 2 hosts at a time.

# examples motivating network testing

The two most common hardware problems impacting distributed storage are,
not surprisingly, disk drive failures and network failures.
Some of these failures do not cause hard errors and are more or less silent,
but instead cause performance degradation.
For example, with a bonded network interface containing two physical network interfaces,
if one of the physical interfaces fails (either port on NIC/switch, or cable),
then the bonded interface will stay up, but will have less performance
(how much less depends on the bonding mode).
Another error would be failure of a 10-GbE Ethernet interface to
autonegotiate speed to 10 Gbps --
sometimes network interfaces auto-negotiate to 1 Gbps instead.
If the TCP connection is experiencing a high rate of packet loss
or is not tuned correctly, it may not reach the full network speed supported by the hardware.

So why run parallel netperf sessions instead of just one?
There are a variety of network performance problems
relating to network topology (the way in which hosts are interconnected),
particularly network switch and router topology, that only manifest when
several pairs of hosts are attempting to transmit traffic
across the same shared resource,
which could be, for example, a trunk connecting top-of-rack switches or
a blade-based switch with insufficient bandwidth to its backplane.
Individual netperf/iperf sessions will not find these problems, but **pbench-uperf** will.

This test can be used to simulate flow of data through a distributed filesystem,
for example. If you want to simulate 4 Gluster clients, call them c1 through c4,
writing large files to a set of 2 servers, call them s1 and s2,
you can specify these (sender, receiver) pairs (we'll see how in a second):

(c1,s1), (c2, s2), (c3, s1), (c4, s2)
Finally, if you want to simulate a mixed read-write workload, use these pairs:

(c1,s1), (c2, s2), (c3, s1), (c4, s2), (s1, c1), (s2, c2), (s1, c3), (s2, c4)

More complicated flows can model behavior of non-native protocols,
where a cluster node acts as a proxy server --
it is a server (for the non-native protocol) and a client (for the native protocol).
Such protocols often induce full-duplex traffic,
which can stress the network differently than unidirectional in/out traffic.
To model this, try adding this set of flows to the preceding set:

(s1, s2), (s2, s3), (s3, s4), (s4, s1)

# how to run it

Use the command:

# pbench-uperf -h

You typically run pbench-uperf from a head node or test driver that has password-less ssh access
to the set of machines being tested.
The hosts running the test do not need ssh access to each other --
they only have to allow password-less ssh access from the head node.
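The head-node access described above can be set up with `ssh-copy-id`. A minimal sketch, assuming an ssh keypair already exists on the head node; the hostnames are the placeholder clients and servers used in this document's examples:

```shell
# Grant the head node password-less ssh access to every test host.
# c1..c4, s1, s2 are the placeholder hostnames from this document.
for h in c1 c2 c3 c4 s1 s2; do
    ssh-copy-id "$h"
done
```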

## firewalls
To temporarily disable (this may give security folks heartburn):
# systemctl stop firewalld
# systemctl stop iptables

To temporarily enable a port under firewalld use:

    # firewall-cmd --add-port=20010/tcp

Where "20010" is the default port pbench-uperf will use for the uperf server.
If you are using multiple servers, then starting with port 20010, pbench-uperf
will use ports in increments of 10. E.g. for 3 client / server pairs,
ports 20010, 20020, and 20030 will be used. Be sure you open those ports on
the remote systems ahead of time.
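The port pattern above (20010 plus 10 per pair) can be scripted. A sketch, with the `firewall-cmd` call left commented out so you can review the ports first:

```shell
# Compute the uperf control ports for N client/server pairs,
# following the 20010 + 10*k pattern described above.
npairs=3
for k in $(seq 0 $((npairs - 1))); do
    port=$((20010 + 10 * k))
    echo "${port}/tcp"
    # firewall-cmd --add-port=${port}/tcp   # run this on each uperf server
done
```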

You will also need to open all the local ports on each system. You can find out
the range of the local ports using this command:

# sysctl net.ipv4.ip_local_port_range

The default range is 32768-60999. You can use the same firewall-cmd command to open a range
of ports:

# firewall-cmd --add-port=32768-60999/tcp

and similarly for udp (if needed).
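The two steps above can be combined: read the kernel's local port range and build the matching `firewall-cmd` argument from it. A sketch (the fallback value is only a guard for systems where `sysctl` is unavailable):

```shell
# Derive the firewall-cmd port-range argument from the kernel's
# local port range; sysctl -n prints two numbers, e.g. "32768 60999".
range=$(sysctl -n net.ipv4.ip_local_port_range 2>/dev/null || echo "32768 60999")
low=$(echo $range | awk '{print $1}')
high=$(echo $range | awk '{print $2}')
echo "firewall-cmd --add-port=${low}-${high}/tcp"
```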

## syntax

Important test parameters are listed in their long form but there is also a short form available with **-h**:

**FIXME** - not all parameters documented yet.

For high network speeds, multiple uperf instances per node must be used to harness enough CPU power
to drive the network interface to full speed.

If your test duration is too short, you may see errors caused by a high standard deviation in the test results.

The client and server lists should be the same length.
pbench-uperf will create a uperf session from clients[k] to servers[k],
where clients[k] is the k'th client in the --clients list, and
servers[k] is the k'th server in the --servers list.
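For the 4-client, 2-server scenario from the earlier example, the pairing rule gives c1&rarr;s1, c2&rarr;s2, c3&rarr;s1, c4&rarr;s2. A hypothetical invocation sketch; only **--clients** and **--servers** are named in this document, so check `pbench-uperf -h` for the exact, current option list:

```shell
# Pair clients[k] with servers[k]: (c1,s1), (c2,s2), (c3,s1), (c4,s2).
# Hostnames are the placeholders used throughout this document.
pbench-uperf \
    --clients=c1,c2,c3,c4 \
    --servers=s1,s2,s1,s2
```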

# results

There are 2 basic forms of performance results:

* throughput -- how much work is done in a unit of time?
  * for **stream** test: Gbit/sec; response time is not measurable
  * for **rr** test: exchanges/sec
* response time -- average time between the beginning of a request send and the end of the response receive

The latter **rr** test is probably most important for understanding what to expect from distributed storage clients,
where read and write requests have to be acknowledged.
For example, if you have a 100-Gbit network with a round-trip time of 1 millisec and a message size of 1 Mbit,
you can transmit the message in 10 microsec, but you can't get a response for 100 times that long!
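The arithmetic in that example can be checked directly with shell arithmetic:

```shell
# Back-of-envelope check: a 1-Mbit message on a 100-Gbit/s link takes
# 10 microseconds to transmit, while the 1-ms round trip is 100x longer.
msg_bits=1000000            # 1 Mbit
link_bps=100000000000       # 100 Gbit/s
tx_us=$(( msg_bits * 1000000 / link_bps ))   # transmit time in microseconds
rtt_us=1000                                  # 1-ms round trip
echo "transmit=${tx_us}us rtt=${rtt_us}us ratio=$(( rtt_us / tx_us ))"
```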

Network utilization can be derived from pbench **sar** Mbit/sec results on the network interfaces.

Scalability can be derived from running a series of these tests with varying numbers of network interfaces and hosts,
keeping the ratio of threads to interfaces constant.