Optimizing Centralized Federated Execution #264

byeonggiljun · 2023-08-22T09:43:32Z

This issue follows from this discussion and draws mostly on @edwardalee's description there. Because that discussion is very long, I'm writing this issue as a more readable summary.

Motivation

Suppose that the Sender triggers at 100 ms intervals but only occasionally sends an output message, say, on average, every few seconds. Currently, the Sender sends LTC, ABS, and NET messages every 100 ms, even when they carry no useful information. In the visualization below (from @ChadliaJerad), the RTI is on the left, the Sender in the middle, and the Receiver on the right. Every message is redundant because the Receiver has nothing to do with those messages (its next event tag is 2 s, which is the timeout value). Eliminating those messages would reduce network overhead.

Solution

A new message type, Next Downstream Tag (NDT), is proposed to resolve this inefficiency. When the RTI receives a NET from a downstream federate, it should notify upstream federates with an NDT message. Federates should maintain an ndt_queue (sorted by tag) that keeps track of the NDT messages received from the RTI. Whenever an upstream federate completes a tag g, it checks the ndt_queue; if no output is being produced, it sends LTC(g) (and NET) iff g >= peek(ndt_queue).

NET Handling Mechanism

RTI Side

When the RTI receives a NET(g_d), it sends NDT(g_d) to upstream federates that have not yet completed g_d. As a further performance optimization, the RTI may decide to only send NDT to federates that produce a lot of LTC and NET messages without producing output.
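The RTI-side rule above can be sketched as follows. This is only an illustration, not the RTI's actual code: `FederateInfo`, `on_net`, and the `(time, microstep)` tuple used for tags are hypothetical names chosen for this example, and the sketch assumes that "upstream" means directly connected upstream federates.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical tag representation: (time, microstep), compared lexicographically.
Tag = Tuple[int, int]


@dataclass
class FederateInfo:
    """Hypothetical per-federate record kept by the RTI."""
    fed_id: int
    upstream: List[int] = field(default_factory=list)  # IDs of direct upstream federates (assumption)
    completed: Tag = (-1, 0)  # last tag for which an LTC was received


def on_net(federates: Dict[int, FederateInfo], sender_id: int, g_d: Tag) -> List[Tuple[int, Tag]]:
    """Return the (federate_id, tag) pairs for which the RTI would send NDT(g_d)."""
    ndts = []
    for up_id in federates[sender_id].upstream:
        # Only notify upstream federates that have not yet completed g_d.
        if federates[up_id].completed < g_d:
            ndts.append((up_id, g_d))
    return ndts
```

The optional optimization mentioned above (only notifying federates that produce many silent LTC and NET messages) is not modeled here.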

Federate Side

When an upstream federate receives an NDT(g_d), it should

Push the tag g_d onto the ndt_queue.
If output is being produced or g_d <= g, send LTC(g_d) and the proper NET so that the RTI can give a grant to downstream federates.
Pop the ndt_queue until peek(ndt_queue) > g.

A federate doesn’t have to send ABS, NET, or LTC at the tag g if g < peek(ndt_queue). Of course, NET and LTC should be sent if there is any actual output.
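A minimal sketch of the federate-side behavior described above, under stated assumptions. `UpstreamFederate`, `on_ndt`, `on_tag_complete`, and `outbox` are hypothetical names for this example; tags are `(time, microstep)` tuples; ABS messages are not modeled; and an empty ndt_queue is assumed to mean that no downstream federate is waiting, which the text above does not specify.

```python
import heapq
from typing import List, Optional, Tuple

Tag = Tuple[int, int]  # hypothetical tag representation: (time, microstep)


class UpstreamFederate:
    def __init__(self) -> None:
        self.ndt_queue: List[Tag] = []            # min-heap, so ndt_queue[0] is peek()
        self.last_completed: Optional[Tag] = None  # the tag g most recently completed
        self.outbox: List[Tuple[str, Tag]] = []    # messages that would go to the RTI

    def _pop_up_to(self, g: Tag) -> None:
        # Pop the queue until peek(ndt_queue) > g.
        while self.ndt_queue and self.ndt_queue[0] <= g:
            heapq.heappop(self.ndt_queue)

    def on_ndt(self, g_d: Tag, next_event_tag: Tag) -> None:
        """Handle NDT(g_d) received from the RTI."""
        heapq.heappush(self.ndt_queue, g_d)
        g = self.last_completed
        if g is not None:
            if g_d <= g:
                # The downstream tag is already behind us: report right away.
                self.outbox.append(("LTC", g_d))
                self.outbox.append(("NET", next_event_tag))
            self._pop_up_to(g)

    def on_tag_complete(self, g: Tag, produced_output: bool, next_event_tag: Tag) -> None:
        """Called when the federate finishes executing tag g."""
        self.last_completed = g
        head = self.ndt_queue[0] if self.ndt_queue else None
        # Send only if there is real output or a downstream federate needs tag g.
        if produced_output or (head is not None and g >= head):
            self.outbox.append(("LTC", g))
            self.outbox.append(("NET", next_event_tag))
        self._pop_up_to(g)
```

With the scenario from the Motivation section (using milliseconds as the time unit), a federate that receives NDT((2000, 0)) and then completes tags (100, 0) through (1900, 0) without output sends nothing; it sends LTC and NET only on completing (2000, 0).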

Things to discuss

How do we efficiently look up which federates to send an NDT to?
How can we handle a federation with a cyclic dependency between federates? Do we just break the cycle at the sender of the NET in response to which an NDT should be sent?

TODOs

RTI

Add a command line argument for turning on the NDT messages.
When receiving a NET at g, send an NDT to upstream federates that have not completed the tag g.
(For further optimization) Discuss how to avoid sending unnecessary NDTs and implement the solution.

Federate

Create an ndt_queue to manage NDTs.
Eliminate unnecessary NET, LTC, and ABS messages based on information from the ndt_queue.


hokeun · 2023-08-22T21:09:35Z

Very nice summary, thanks @byeong-gil ! I just have a minor suggestion. How about you add the remaining tasks with checkboxes at the end of the issue description above to keep track of this work for everyone?

byeonggiljun self-assigned this Aug 31, 2023

byeonggiljun added the enhancement and federated labels Aug 31, 2023

byeonggiljun linked a pull request Aug 31, 2023 that will close this issue

Draft: Introduce the next downstream tag (NDT) to optimize the communication in centralized federated execution #176

Closed

12 tasks

byeonggiljun mentioned this issue Oct 24, 2023

Safely skipping redundant port absent messages in the centralized federated execution #295

Open

byeonggiljun mentioned this issue Jan 23, 2024

Draft: Introduce Next Downstream Tag message (NDT) to optimize the communication in centralized federated execution #337

Closed

byeonggiljun mentioned this issue Feb 5, 2024

Downstream next event tag (DNET), a new signal for more efficient centralized federated execution #349

Open
