CODES Tutorial: Synthetic Workload Development
OK, we're going to assume now that you're at the very least familiar with PDES, ROSS, and the general overview of CODES (all discussed in the first three tutorials). If you aren't familiar with these topics or haven't read those tutorials yet, it is recommended that you do that before proceeding.
An old example of a simple synthetic workload generator can be found in /doc/example/example.c in the CODES repository. This file has comments that describe various parts of a complete workload.
Let's say that you were performing research on various technologies for laundering clothes. You have compiled a small collection of various types of laundry machines to wash clothes. Your task is to find the most promising new laundry machine in your collection so that future laundry machine engineers can benefit from the insight you've gained.
It's important for you to be able to compare these washers fairly and without bias. You don't want to recommend one washer over the others just because you liked its color - that won't do anything to help future washing machine designs clean clothes better. A simple test would be to arrange several identical loads of laundry, load each washer with the same items of clothing at the same level of soiledness, and then judge how well each washing machine performed. You'd get metrics like how long it took, how clean the clothes were, whether any clothes were damaged in the process, etc. Because you gave each machine the same amount of work, you'd be able to make an apples-to-apples comparison and thus gain the valuable insight you required.
CODES workloads are like a load of laundry for the washer that is a CODES network model. If we give different network models the exact same workload, then we will be able to have an apples-to-apples comparison of how each network model behaved.
A synthetic workload generator in CODES is simply a ROSS PDES program that supplies traffic to the provided network. In terms of the Mail model described in the ROSS/PDES tutorial, the workload generator would be like the people within a household that put letters in their household mailbox to be sent along the mail system to people in other households - or, in CODES land, inject packets onto the network to be routed to other workload LPs on other terminals.
The simplest form of a CODES workload is a Synthetic Workload Generator. This type of workload just creates packets with programmed destinations; there's no real substance to these packets and they generally don't represent any real-world application communication pattern. An example type of traffic that could easily be implemented in a synthetic workload generator is Uniform Random Traffic. This is where each workload LP has a set number of messages that it will generate, each with a destination chosen - uniformly at random - from the pool of available terminals in the network.
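For a taste of what that looks like in code, here is a minimal sketch of the destination pick at the heart of a uniform random generator. It uses ROSS's tw_rand_integer() (which is inclusive of both bounds) and the num_nodes and s->svr_id names that appear later in this tutorial:
/* Pick a destination uniformly at random from every server except ourselves:
   adding an offset in [1, num_nodes - 1] modulo num_nodes can never land
   back on our own ID. */
tw_lpid offset = tw_rand_integer(lp->rng, 1, num_nodes - 1);
tw_lpid local_dest = (s->svr_id + offset) % num_nodes;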
But a synthetic workload generator can be more complicated - as arbitrarily complicated as the developer wishes, if they have the drive and know-how to develop it! A slightly more complicated synthetic workload generator would be a Ping-Pong Traffic generator. This is very similar to an ordinary uniform random traffic generator: each workload LP picks another workload LP at random and sends a PING message to it. But additional behavior is written into the generator: when a PING message is received, the receiving LP must send a PONG message back; and when a PONG message is received, the LP sends another PING message back - until it has sent some set number of PING messages. At the end of the simulation, the number of PONG messages received by a workload LP must equal the number of PING messages it sent, and vice versa.
While this ping-pong generator isn't really much more than a simple random traffic generator with more steps, it does provide an example of how logic can be encoded into the workload generator to create more complex workloads.
So let's actually create the ping-pong generator.
As mentioned earlier, a synthetic workload generator is just a ROSS PDES program that uses some CODES functions to inject packets into a provided network. So we'll lay it out much like we would any other ROSS program.
We'll have forward event handlers, reverse event handlers, LP state for our workload LPs (which we'll also refer to as servers), a message struct that houses the information passed between workload LPs, and functions for setup/finalization.
It's important to keep in mind that there's more than one way to skin a cat: there are many different ways one could implement a ping-pong traffic generator. Some are more concise than this example, but this tutorial is written to be as clear as possible about what is happening.
The first thing we need to do is enumerate the types of messages that our workload generator will be passing around. We'll obviously have PING and PONG messages. But we'll also need messages that the server LPs send to themselves during initialization so they know to send the first PING message. These messages are going to be called KICKOFF messages.
So let's create an enum to make classification of these events simple:
enum svr_event
{
KICKOFF,
PING,
PONG
};
The next thing we need is the structure that houses the information our workload LPs will pass between each other. This structure would contain something like: the ID of the server that sent it, the type of message being sent, helpers for reverse computation, and perhaps a payload value encoded to add some substance to the messages we're carrying.
This struct would look like:
struct svr_msg
{
enum svr_event svr_event_type; //KICKOFF, PING, or PONG
int sender_id; //ID of the sender workload LP to know who to send a PONG message back to
int payload_value; //Some value that we will encode as an example
model_net_event_return event_rc; //helper to encode data relating to CODES rng usage
};
So we have the message struct and an enumeration of the types of events that the workload LPs will send; next we need somewhere for each LP to store information that applies exclusively to itself. This is a structure that contains all state corresponding to a particular workload LP. It would contain information like: the ID of the server, whether it has completed its sent-message quota or not, a sum of all of the received payload values to print out at the end... anything that would be valuable for the particular server LP to know during the simulation.
struct svr_state
{
int svr_id; /* the ID of this server */
int ping_msg_sent_count; /* PING messages sent */
int ping_msg_recvd_count; /* PING messages received */
int pong_msg_sent_count; /* PONG messages sent */
int pong_msg_recvd_count; /* PONG messages received */
tw_stime start_ts; /* time that this LP started sending requests */
tw_stime end_ts; /* time that this LP ended sending requests */
int payload_sum; /* the running sum of all payloads received */
};
Each LP in ROSS needs an initialization function; this sets up the LP state and schedules any self-starting events (KICKOFF). Our initialization function would look like:
static void svr_init(svr_state * s, tw_lp * lp)
{
//Initialize State
s->ping_msg_sent_count = 0;
s->ping_msg_recvd_count = 0;
s->pong_msg_sent_count = 0;
s->pong_msg_recvd_count = 0;
s->start_ts = 0.0;
s->end_ts = 0.0;
s->svr_id = codes_mapping_get_lp_relative_id(lp->gid, 0, 0); /* turns the LP Global ID into the server ID */
s->payload_sum = 0;
//Now we create and send a self KICKOFF message - this is a PDES coordination event and thus doesn't need to be injected into the connected network
//so we won't use model_net_event(), that's reserved for stuff we want to send across the network
/* Set a time from now when this message is to be received by the recipient (self in this case). Add some tiny random noise to help avoid event ties (different events with the same timestamp). The lookahead value is required for conservative mode execution to work; it prevents scheduling a new event within the lookahead window */
tw_stime kickoff_time = g_tw_lookahead + (tw_rand_unif(lp->rng) * .0001);
tw_event *e;
svr_msg *m;
e = tw_event_new(lp->gid, kickoff_time, lp); //ROSS method to create a new event
m = tw_event_data(e); //Gives you a pointer to the data encoded within event e
m->svr_event_type = KICKOFF; //Set the event type so we can know how to classify the event when received
tw_event_send(e); //ROSS method to send off the event e with the encoded data in m
}
This initializes everything the LP needs during the simulation and sends the first KICKOFF message to itself to start the simulation.
Now that we've sent our first events, we need a function that will classify received events/messages and call different functions depending on said event type.
Here's what the top-level event handler would look like. It has a switch statement that looks at the message's event type and then calls a different function depending on that type. This is a catch-all for all events that the server could receive.
static void svr_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
switch (m->svr_event_type)
{
case KICKOFF:
handle_kickoff_event(s, b, m, lp);
break;
case PING:
handle_ping_event(s, b, m, lp);
break;
case PONG:
handle_pong_event(s, b, m, lp);
break;
default:
tw_error(TW_LOC, "\n Invalid message type %d ", m->svr_event_type);
break;
}
}
When we receive a KICKOFF event, we need to send a PING message to a random destination. Within it will be encoded a payload value, a random number between 1 and 10. We'll change some LP state here too by setting the start time and incrementing the number of PING messages sent. You'll notice that even though this event handler is "sending a message", it doesn't use tw_event_new() or tw_event_send() like we did when we sent our KICKOFF event. That's because the KICKOFF event was just a PDES coordination message that the LP sent to itself to trigger sending a PING message on the network. We don't want our KICKOFF event passed on the network; it was just a PDES pattern we used to make the simulation do what we wanted. model_net_event() is a CODES wrapper around the ROSS events that CODES uses to inject the message into the network - the PING server message that we created will be packaged in with it.
This model_net_event() function call is long and complicated. The important parts are setting the destination LP ID, the PAYLOAD_SZ value that represents how big the event you're sending should be treated as, the size of the server message we're bundling with it, and the pointer to the server message itself. The size of the encoded server message is not considered by CODES when the packet is being routed through the attached network - what is passed in for PAYLOAD_SZ is the exact number of bytes that the message will be treated as when it is being injected into the network.
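For reference, here is the shape of that call with each argument annotated. The parameter meanings are paraphrased from the model-net API; check model-net.h in the CODES source for the authoritative signature:
model_net_event_return model_net_event(
    int net_id,                  // which configured network to inject into
    char const * category,       // a label used for grouping statistics (e.g. "test")
    tw_lpid final_dest_lp,       // global LP ID of the destination server
    uint64_t message_size,       // bytes the network treats this message as (PAYLOAD_SZ)
    tw_stime offset,             // delay before injection, measured from now
    int remote_event_size,       // size of the server message bundled for the receiver
    void const * remote_event,   // pointer to that server message (our svr_msg)
    int self_event_size,         // size of an optional event delivered back to the sender
    void const * self_event,     // pointer to that optional self event (NULL here)
    tw_lp *sender);              // the sending LP
With that in mind, here's the full KICKOFF handler: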
static void handle_kickoff_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
s->start_ts = tw_now(lp); //the time when we're starting this LP's work is NOW
svr_msg * ping_msg = malloc(sizeof(svr_msg)); //allocate memory for new message
tw_lpid local_dest = -1; //ID of a server, relative to only servers
tw_lpid global_dest = -1; //ID of a server LP relative to ALL LPs
//We want to make sure we're not accidentally picking ourselves
local_dest = tw_rand_integer(lp->rng, 1, num_nodes - 2);
local_dest = (s->svr_id + local_dest) % num_nodes;
//local_dest is now a number [0,num_nodes) but is assuredly not s->svr_id
assert(local_dest >= 0);
assert(local_dest < num_nodes);
assert(local_dest != s->svr_id);
ping_msg->sender_id = s->svr_id; //encode our server ID into the new ping message
ping_msg->svr_event_type = PING; //set it to type PING
ping_msg->payload_value = tw_rand_integer(lp->rng, 1, 10); //encode a random payload value to it from [1,10]
codes_mapping_get_lp_info(lp->gid, group_name, &group_index, lp_type_name, &lp_type_index, NULL, &rep_id, &offset); //gets information from CODES necessary to get the global LP ID of a server
global_dest = codes_mapping_get_lpid_from_relative(local_dest, group_name, lp_type_name, NULL, 0);
s->ping_msg_sent_count++;
m->event_rc = model_net_event(net_id, "test", global_dest, PAYLOAD_SZ, 0.0, sizeof(svr_msg), (const void*)ping_msg, 0, NULL, lp);
}
When we receive a PING event, then we want to send a PONG event back to the server that sent the PING message. We also want to change some state: we'll increment the ping message received count, we'll add the encoded payload value to our running sum, and we'll increment the pong message sent count too as we're sending a PONG message back to the original sender.
static void handle_ping_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
s->ping_msg_recvd_count++; //increment the counter for ping messages received
int original_sender = m->sender_id; //this is the server we need to send a PONG message back to
s->payload_sum += m->payload_value; //increment our running sum of payload values received
svr_msg * pong_msg = malloc(sizeof(svr_msg)); //allocate memory for new message
pong_msg->sender_id = s->svr_id;
pong_msg->svr_event_type = PONG;
// only ping messages contain a payload value - not every value in a message struct must be utilized by all messages!
codes_mapping_get_lp_info(lp->gid, group_name, &group_index, lp_type_name, &lp_type_index, NULL, &rep_id, &offset); //gets information from CODES necessary to get the global LP ID of a server
tw_lpid global_dest = codes_mapping_get_lpid_from_relative(original_sender, group_name, lp_type_name, NULL, 0);
s->pong_msg_sent_count++;
m->event_rc = model_net_event(net_id, "test", global_dest, PAYLOAD_SZ, 0.0, sizeof(svr_msg), (const void*)pong_msg, 0, NULL, lp);
}
When we receive a PONG event, then we'll need to send a PING event back unless we've reached our pre-set number of PINGS to send. Again, we're going to change some state by incrementing the pong received count and ping sent count (if we end up sending the ping).
static void handle_pong_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
s->pong_msg_recvd_count++; //increment the counter for pong messages received
if(s->ping_msg_sent_count >= num_msgs) //if we've sent enough ping messages, then we stop and don't send any more
{
b->c1 = 1; //flag that we didn't really do anything in this event so that if this event gets reversed, we don't over-aggressively revert state or RNGs
return;
}
//Now we need to send another ping message back to the sender of the pong
int pong_sender = m->sender_id; //this is the sender of the PONG message that we want to send another PING message to
svr_msg * ping_msg = malloc(sizeof(svr_msg)); //allocate memory for new message
ping_msg->sender_id = s->svr_id; //encode our server ID into the new ping message
ping_msg->svr_event_type = PING; //set it to type PING
ping_msg->payload_value = tw_rand_integer(lp->rng, 1, 10); //encode a random payload value to it
codes_mapping_get_lp_info(lp->gid, group_name, &group_index, lp_type_name, &lp_type_index, NULL, &rep_id, &offset); //gets information from CODES necessary to get the global LP ID of a server
tw_lpid global_dest = codes_mapping_get_lpid_from_relative(pong_sender, group_name, lp_type_name, NULL, 0);
s->ping_msg_sent_count++;
m->event_rc = model_net_event(net_id, "test", global_dest, PAYLOAD_SZ, 0.0, sizeof(svr_msg), (const void*)ping_msg, 0, NULL, lp);
}
Implementing the reverse handlers for this simple workload is, for the most part, straightforward. As a reminder, CODES operates on top of ROSS, which allows for optimistic parallel execution: LPs can process events whenever they arrive without worrying about synchronization; instead, ROSS rolls the simulation back when a causality error is discovered. ROSS is also deterministic, which means that it will generate the same result regardless of whether it is run sequentially or in parallel.
But there's no free lunch: we still need to tell ROSS (and CODES) what to do when we need to undo an event that was processed in error. This means we need to revert any LP state that was changed as well as reverse any RNG calls that we made. That way, when the simulation resumes moving forward, it will operate as if it had never had to reverse in the first place.
All we did to our LP state during the KICKOFF event was increment our ping sent counter, so we just need to decrement it here. We also need to use a reverse RNG method for each RNG call we made. We picked a random destination and a random payload value, so we call tw_rand_reverse_unif() twice here to undo them. We don't need to match the reverse RNG type to the forward one (unif to unif or integer to integer); tw_rand_reverse_unif() will happily reverse the RNG call performed by tw_rand_integer(), etc. We also created a new model-net event for packet injection; this is a CODES method that makes some RNG calls deep inside of it, and those need to be rolled back as well via model_net_event_rc2().
static void handle_kickoff_rev_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
tw_rand_reverse_unif(lp->rng); //reverse the rng call for getting a local_dest
tw_rand_reverse_unif(lp->rng); //reverse the rng call for creating a payload value;
s->ping_msg_sent_count--; //undo the increment of the ping_msg_sent_count in the server state
model_net_event_rc2(lp, &m->event_rc); //undo any model_net_event calls encoded into this message
}
We can do similar things to our PING reverse handler. All we changed to our state was incrementing our ping message received counter and our running payload sum. We didn't do any RNG calls this time and so all that's left is undoing the model net event call.
static void handle_ping_rev_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
s->ping_msg_recvd_count--; //undo the increment of the counter for ping messages received
s->payload_sum -= m->payload_value; //undo the increment of the payload sum
model_net_event_rc2(lp, &m->event_rc); //undo any model_net_event calls encoded into this message
}
In our PONG reverse handler, we undo our pong received counter increment, but then we have this conditional: if (b->c1). This is a feature of ROSS. It provides a 32-bit bitfield in each message so that information can be encoded during the forward event and used to skip long, complex logic during the reverse. In our forward event handler, we checked if(s->ping_msg_sent_count >= num_msgs); if that was true, we flipped the c1 bit in the bitfield to 1 and immediately returned. Here, the flag b->c1 is a signal to the reverse handler that the forward event returned without doing anything else, so during reverse computation we know not to undo anything else.
static void handle_pong_rev_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
s->pong_msg_recvd_count--; //undo the increment of the counter for pong messages received
if (b->c1) //if we flipped the c1 flag in the forward event
return; //then we don't need to undo any rngs or state change
tw_rand_reverse_unif(lp->rng); //undo the rng for the new payload value
s->ping_msg_sent_count--;
model_net_event_rc2(lp, &m->event_rc); //undo any model_net_event calls encoded into this message
}
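For completeness, the reverse-event dispatcher mirrors the forward one. A sketch of what it might look like (the completed file in the repository contains the real thing):
static void svr_rev_event(svr_state * s, tw_bf * b, svr_msg * m, tw_lp * lp)
{
    switch (m->svr_event_type)
    {
        case KICKOFF:
            handle_kickoff_rev_event(s, b, m, lp);
            break;
        case PING:
            handle_ping_rev_event(s, b, m, lp);
            break;
        case PONG:
            handle_pong_rev_event(s, b, m, lp);
            break;
        default:
            tw_error(TW_LOC, "\n Invalid message type %d ", m->svr_event_type);
            break;
    }
}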
The bitfield doesn't have to be used to signal an early return; it can be used any time you want to simplify the logic of a reverse handler. For example, imagine a forward event handler with 5 nested conditionals: you'd only need to flip a bit in the bitfield to signal that the innermost condition was reached, so that during reverse computation you just check for that bit and know that the forward event handler hit the innermost conditional in that complicated logic.
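As a hypothetical illustration (some_condition and some_counter are made-up names, not part of this generator), the forward handler flags the branch it took and the reverse handler keys off that bit:
//Hypothetical forward fragment: record which branch actually ran
if (some_condition)
{
    b->c2 = 1; //remember we took this branch
    s->some_counter++;
    tw_rand_integer(lp->rng, 0, 9); //an RNG call made only on this branch
}
//Matching reverse fragment: the bit tells us exactly what to undo
if (b->c2)
{
    s->some_counter--;
    tw_rand_reverse_unif(lp->rng);
}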
Once the simulation has finished, ROSS has every LP execute a finalize function. In this we can do any wrapping up or statistics calculations that we'd want to output. For example, we could have it print out statistics regarding how many PINGS and PONGS were sent or received, that running payload sum, etc...
So here we're going to set the end time to be now. This isn't necessarily an accurate portrayal of when this LP actually finished its work - that moment would be in the handling of our PONG messages, once we know we've sent our last PING. Setting the end time during finalize introduces a bit of extra time into the mix, as the finalize step won't be performed until ALL LPs have finished handling all of their events. So with this calculation performed in finalize, a single straggler could impact the calculated end timestamp considerably. Where you'd put something like this depends entirely on your definition of "end time".
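If you did want to capture the end time at the moment the quota was met, one way (a sketch only, not what the tutorial file does) is to record it in the existing b->c1 early-return branch of handle_pong_event() and stash the old value in the message so the reverse handler can restore it; saved_end_ts here is a hypothetical extra field on svr_msg:
//in handle_pong_event(), the early-return branch becomes:
if(s->ping_msg_sent_count >= num_msgs)
{
    b->c1 = 1; //flag the early return for the reverse handler
    m->saved_end_ts = s->end_ts; //hypothetical svr_msg field: stash the old value for reversal
    s->end_ts = tw_now(lp); //record the moment this LP met its PING quota
    return;
}
//and in handle_pong_rev_event(), the early return restores it:
if (b->c1)
{
    s->end_ts = m->saved_end_ts; //restore the stashed end time
    return;
}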
We can calculate the total number of messages sent by combining two of our counters, calculate the total number of bytes that were simulated using the PAYLOAD_SZ value, compute the time the LP spent doing work, and print it all out.
static void svr_finalize(svr_state * s, tw_lp * lp)
{
s->end_ts = tw_now(lp);
int total_msgs_sent = s->ping_msg_sent_count + s->pong_msg_sent_count;
int total_msg_size_sent = PAYLOAD_SZ * total_msgs_sent;
tw_stime time_in_seconds_sent = ns_to_s(s->end_ts - s->start_ts);
printf("Sever LPID:%llu svr_id:%d sent %d bytes in %f seconds, PINGs Sent: %d; PONGs Received: %d; PINGs Received: %d; PONGs Sent %d; Payload Sum: %d\n", (unsigned long long)lp->gid, s->svr_id, total_msg_size_sent,
time_in_seconds_sent, s->ping_msg_sent_count, s->pong_msg_recvd_count, s->ping_msg_recvd_count, s->pong_msg_sent_count, s->payload_sum);
}
ROSS provides an easy way for you to add command line arguments to a model. We define an opt value array like this:
const tw_optdef app_opt [] =
{
TWOPT_GROUP("Model net synthetic traffic " ),
TWOPT_UINT("num_messages", num_msgs, "Number of PING messages to be generated per terminal "),
TWOPT_UINT("payload_sz",PAYLOAD_SZ, "size of the message being sent "),
TWOPT_CHAR("lp-io-dir", lp_io_dir, "Where to place io output (unspecified -> no output"),
TWOPT_UINT("lp-io-use-suffix", lp_io_use_suffix, "Whether to append uniq suffix to lp-io directory (default 0)"),
TWOPT_END()
};
and then in the main function of our workload we call tw_opt_add(app_opt) so that ROSS knows how to parse what's fed in as command line arguments. This tutorial workload model has options to change the number of PING messages that are sent by each server, the size of the payload that each PING and PONG message occupies on the network, and ways to configure the LP-IO directory that network models and workload generators can use to output more complicated output files.
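Those options bind to global variables declared near the top of the file. A sketch of what those globals might look like (names match their uses elsewhere in this tutorial; the default values are illustrative):
static int net_id = 0;                    // ID of the network model returned by model_net_configure()
static int num_nodes = 0;                 // number of server LPs, filled in from the config in main()
static unsigned int num_msgs = 20;        // PING messages each server sends (--num_messages)
static unsigned int PAYLOAD_SZ = 2048;    // bytes each PING/PONG occupies on the network (--payload_sz)
static char lp_io_dir[256] = {'\0'};      // LP-IO output directory (--lp-io-dir)
static unsigned int lp_io_use_suffix = 0; // append a unique suffix to the LP-IO dir? (--lp-io-use-suffix)
static int do_lp_io = 0;                  // set in main() when lp_io_dir is given
static lp_io_handle io_handle;            // handle used by lp_io_prepare()/lp_io_flush()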
The workload generator is the entry point for a CODES simulation, so in this file we also expect a main method. Most of this can generally be left as is. This is where you do any necessary configuration loading to initialize global static variables that will be used in other parts of the workload generator (like num_nodes in this example).
int main(int argc, char **argv)
{
int nprocs;
int rank;
int num_nets;
int *net_ids;
tw_opt_add(app_opt);
tw_init(&argc, &argv);
codes_comm_update();
if(argc < 2)
{
printf("\n Usage: mpirun <args> --sync=1/2/3 -- <config_file.conf> ");
MPI_Finalize();
return 0;
}
MPI_Comm_rank(MPI_COMM_CODES, &rank);
MPI_Comm_size(MPI_COMM_CODES, &nprocs);
configuration_load(argv[2], MPI_COMM_CODES, &config);
model_net_register();
svr_add_lp_type();
codes_mapping_setup();
net_ids = model_net_configure(&num_nets);
net_id = *net_ids;
free(net_ids);
/* 1 day of simulation time is drastically huge but it will ensure
that the simulation doesn't try to end before all packets are delivered */
g_tw_ts_end = s_to_ns(24 * 60 * 60);
//Load configuration type stuff here
num_nodes = codes_mapping_get_lp_count("MODELNET_GRP", 0, "nw-lp", NULL, 1); //get the number of nodes so we can use this value during the simulation
assert(num_nodes);
//All pre-run code should be done prior to this point
if(lp_io_dir[0])
{
do_lp_io = 1;
int flags = lp_io_use_suffix ? LP_IO_UNIQ_SUFFIX : 0;
int ret = lp_io_prepare(lp_io_dir, flags, &io_handle, MPI_COMM_CODES);
assert(ret == 0 || !"lp_io_prepare failure");
}
tw_run();
if (do_lp_io){
int ret = lp_io_flush(io_handle, MPI_COMM_CODES);
assert(ret == 0 || !"lp_io_flush failure");
}
model_net_report_stats(net_id);
tw_end();
return 0;
}
There are still some small parts here and there that are necessary to complete a whole, working workload generator for CODES. These are mostly ways to tell CODES and ROSS how to access the functions that we've written, plus information about the LPs defined in our workload generator. To see them all, you can view the completed ping-pong workload generator in the /doc/example/tutorial-synthetic-ping-pong.c file in the repository.
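The core of that wiring is a tw_lptype struct that points ROSS at our handlers, registered with CODES under the LP name used in the config file. A sketch in the style of other CODES examples (consult the completed file for the authoritative version):
tw_lptype svr_lp = {
    (init_f) svr_init,        // called once per LP before the simulation starts
    (pre_run_f) NULL,         // optional pre-run hook, unused here
    (event_f) svr_event,      // forward event handler
    (revent_f) svr_rev_event, // reverse event handler
    (commit_f) NULL,          // optional commit hook, unused here
    (final_f) svr_finalize,   // called once per LP after the simulation ends
    (map_f) codes_mapping,    // CODES-provided LP-to-PE mapping function
    sizeof(svr_state),        // size of the per-LP state ROSS should allocate
};

static void svr_add_lp_type()
{
    lp_type_register("nw-lp", &svr_lp); // "nw-lp" matches the LP name in the config file
}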
The completed, working ping-pong synthetic traffic generator has been added to the repo in /doc/example/tutorial-synthetic-ping-pong.c. It's even been included in the build process, so you can run it right now!
Let's just assume that you're in your CODES build directory so I don't have to use absolute paths:
mpirun -n 4 ./bin/tutorial-synthetic-ping-pong --synch=3 -- ./doc/example/tutorial-ping-pong.conf
This will instantiate a 1D Dragonfly Dally network specified in the configuration file, with 72 total server LPs, one per terminal on the network. Each server LP will send a total of 20 PING messages to a random destination on the network. We'll get output like:
*** START PARALLEL OPTIMISTIC SIMULATION WITH SUSPEND LP FEATURE ***
Set num_servers per router 2, servers per injection queue per router 2, servers per node copy queue per node 1, num nics 2
GVT #79: simulation 100% complete, max event queue size 75 (GVT = MAX).
AVL tree size: 2
*** END SIMULATION ***
Server LPID:0 svr_id:0 sent 186368 bytes in 0.000232 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 60; PONGs Sent 71; Payload Sum: 297
Server LPID:1 svr_id:1 sent 40960 bytes in 0.000227 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
Server LPID:45 svr_id:18 sent 40960 bytes in 0.000230 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
Server LPID:46 svr_id:19 sent 155648 bytes in 0.000231 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 40; PONGs Sent 56; Payload Sum: 229
Server LPID:90 svr_id:36 sent 40960 bytes in 0.000224 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
Server LPID:91 svr_id:37 sent 90112 bytes in 0.000224 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 20; PONGs Sent 24; Payload Sum: 123
Server LPID:135 svr_id:54 sent 40960 bytes in 0.000226 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
Server LPID:136 svr_id:55 sent 88064 bytes in 0.000225 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 20; PONGs Sent 23; Payload Sum: 126
Server LPID:50 svr_id:20 sent 114688 bytes in 0.000225 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 20; PONGs Sent 36; Payload Sum: 116
Server LPID:51 svr_id:21 sent 40960 bytes in 0.000222 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
Server LPID:95 svr_id:38 sent 98304 bytes in 0.000231 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 20; PONGs Sent 28; Payload Sum: 113
Server LPID:96 svr_id:39 sent 40960 bytes in 0.000223 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
Server LPID:140 svr_id:56 sent 233472 bytes in 0.000226 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 60; PONGs Sent 94; Payload Sum: 316
Server LPID:141 svr_id:57 sent 159744 bytes in 0.000223 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 40; PONGs Sent 58; Payload Sum: 212
Server LPID:100 svr_id:40 sent 143360 bytes in 0.000237 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 40; PONGs Sent 50; Payload Sum: 210
Server LPID:101 svr_id:41 sent 40960 bytes in 0.000224 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
Server LPID:145 svr_id:58 sent 40960 bytes in 0.000219 seconds, PINGs Sent: 20; PONGs Received: 20; PINGs Received: 0; PONGs Sent 0; Payload Sum: 0
.
.
.
Average number of hops traversed 3.544445 average chunk latency 4.851079 us maximum chunk latency 10.262621 us avg message size 2048.000000 bytes finished messages 2880 finished chunks 2880
ADAPTIVE ROUTING STATS: 2385 chunks routed minimally 495 chunks routed non-minimally completed packets 2880
Total packets generated 2880 finished 2880 Locally routed- same router 0 different-router 440 Remote (inter-group) 2440
We can analyze this output to see that we had a total of 2880 packets generated. That checks out: 72 servers each sent 20 PINGs (1440 messages), and every PING was answered by a PONG, for 2880 messages total; in this case each message fit in a single packet. If we increased the payload size beyond the configured packet size, each message would be broken up across multiple packets.
If we wanted to increase the intensity of this workload we only need to increase the payload size so that messages require more packets to be sent across the network at any given time.
mpirun -n 4 ./bin/tutorial-synthetic-ping-pong --synch=3 --payload_sz=8192 -- ./doc/example/tutorial-ping-pong.conf
This will run the exact same simulation but with a larger payload size (2 packets required per PING/PONG message). This causes greater congestion on the network, and we can observe that packets experience different hop counts and latencies!
.
.
.
Average number of hops traversed 4.024827 average chunk latency 11.006624 us maximum chunk latency 25.119147 us avg message size 8192.000000 bytes finished messages 2880 finished chunks 5760
ADAPTIVE ROUTING STATS: 3289 chunks routed minimally 2471 chunks routed non-minimally completed packets 5760
Total packets generated 5760 finished 5760 Locally routed- same router 0 different-router 880 Remote (inter-group) 4880