-
I think this design could be very interesting and promising: use compile-time analysis to exploit the program structure for maximizing parallelism and data locality, and then use runtime scheduling for efficient and fast resource allocation. As some relevant prior work shows, such a combined offline-online scheduling approach can achieve significant improvements (e.g., https://www.cs.ucr.edu/~dtrip003/publication/Website_WireFrame_Micro2017.pdf). A few high-level comments:
-
If I understood correctly, Edward already mentioned that the data structure used in …, but there is a risk that a more static scheduler would assign …
-
@zitaofang proposed a small instruction set for the workers, so that each static schedule can be represented in the form of a program. The instruction set contains 3 operations:
Consider the original example.
The representation for this schedule would be
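The three operations themselves do not appear above. Purely as a hypothetical illustration of the general idea (this is not @zitaofang's actual instruction set), a static schedule could be encoded as one small program per worker:

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical 3-operation worker "instruction set"; illustrative only, not the
# actual proposal referenced above.
@dataclass
class Exec:        # run a specific reaction
    reaction: str

@dataclass
class WaitFor:     # block until another worker has finished a reaction
    reaction: str

@dataclass
class AdvanceTag:  # this worker is done with the current logical instant
    pass

Instruction = Union[Exec, WaitFor, AdvanceTag]

# One program per worker; together they encode one static schedule.
worker_programs: List[List[Instruction]] = [
    [Exec("CF1"), Exec("C1"), AdvanceTag()],       # worker 0
    [WaitFor("CF1"), Exec("CF2"), AdvanceTag()],   # worker 1
]
```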
-
Uploading a preliminary project report here:
-
This proposal aims to explore a quasi-static scheduler for LF.
A quasi-static scheduler performs scheduling both at compile time and at runtime.
Potential advantages (to be verified)
Approach
The approach is broken down into two parts: a compile-time algorithm and a runtime algorithm.
Running example: `SleepingBarber.lf`
For simplicity, let's assume that there is only 1 customer instead of 20.
Compile-time algorithm
Step 0: Construct a precedence graph, which the compiler already does.
The precedence graph tells us the order in which reactions are processed when they are enabled at a particular logical instant.
Step 1: Construct a counterfactual causality (CC) graph from the LF program, with nodes being reactions and edges being counterfactual causal relations. We also label the edges with logical delays.
For example, CF1 has an arrow back to itself because of the logical action `next` it can schedule. The presence of the logical action `next` establishes a counterfactual causal relation because, without a previous invocation of CF1 scheduling `next`, the next invocation of CF1 would not occur. Besides actions, the same counterfactual reasoning can be applied to reactions related by connections and timers as well.
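As an illustrative sketch (not part of the original post), such a delay-labeled CC graph could be represented as an adjacency list; the reaction names and the 1-second delay on CF1's self-loop are assumptions based on the running example:

```python
from collections import defaultdict

# Minimal sketch of a counterfactual-causality (CC) graph: nodes are reactions,
# and each edge carries the logical delay introduced by the action, connection,
# or timer that relates them. Names and delays below are assumptions, not the
# actual SleepingBarber graph.
class CCGraph:
    def __init__(self):
        self.edges = defaultdict(list)  # src reaction -> list of (dst reaction, delay in ns)

    def add_edge(self, src, dst, delay_ns=0):
        self.edges[src].append((dst, delay_ns))

    def zero_delay_successors(self, node):
        """Reactions that can be triggered within the same logical instant."""
        return [dst for dst, delay in self.edges[node] if delay == 0]

cc = CCGraph()
cc.add_edge("CF1", "CF1", delay_ns=1_000_000_000)  # CF1 schedules `next` with a 1 s delay
cc.add_edge("CF1", "C1")                           # zero-delay connection (assumed)
cc.add_edge("C1", "B1")                            # zero-delay connection (assumed)

print(cc.zero_delay_successors("CF1"))  # ['C1']
```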
Step 2: From the CC graph (not the precedence graph), generate a mapping from schedulable events to subgraphs of the CC graph, with each subgraph being a group of reactions that could get triggered at the same instant w.r.t. the order of counterfactual causality (denoted by the arrows).
Step 3a: Augment the above subgraphs with arrows from the precedence graph (from Step 0).
Step 3b: For any two adjacent nodes, if there is a longer path between them, remove the direct edge (i.e., take the transitive reduction of the graph).
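A small sketch of Step 3b (my illustration, assuming the augmented subgraph is a DAG stored as an adjacency map):

```python
# Step 3b sketch: drop a direct edge u->v if v is still reachable from u through a
# longer path (transitive reduction of a DAG). The graph is a dict mapping each node
# to the set of its direct successors.
def transitive_reduction(graph):
    def reachable(src, dst, skip_edge):
        # DFS that ignores the edge currently being tested.
        stack, seen = [src], set()
        while stack:
            n = stack.pop()
            for m in graph[n]:
                if (n, m) == skip_edge:
                    continue
                if m == dst:
                    return True
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return False

    reduced = {n: set(succs) for n, succs in graph.items()}
    for u in graph:
        for v in list(graph[u]):
            if reachable(u, v, skip_edge=(u, v)):
                reduced[u].discard(v)
    return reduced

# Hypothetical augmented subgraph: A->B, B->C, plus a redundant direct edge A->C.
g = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
print(transitive_reduction(g))  # {'A': {'B'}, 'B': {'C'}, 'C': set()}
```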
Step 4: For each mapping, generate an N-worker schedule table, where N is the number of worker threads used.
Here, let's assume that we have 2 worker threads.
Note that each schedule here is a schedule for a logical instant.
Note that the structure of the LF program naturally presents an opportunity for a certain degree of parallelism.
The degree to which we can parallelize is limited, however.
Notice that here, even if we have 10 worker threads, at most 2 of them will run in parallel.
It would be very useful to have a systematic way to calculate, for a specific LF program, the optimal number of workers that can maximize parallelism.
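On that last point, one simple (and admittedly rough) estimate, offered here as my own sketch rather than part of the proposal, is the peak width of the per-instant reaction DAG under an ASAP schedule:

```python
from collections import defaultdict

# Sketch: estimate how many workers an ASAP schedule of one logical instant can keep
# busy at once, by computing the width of each "level" of the per-instant reaction DAG
# (level = longest path from a source; assumes a DAG). Each level is an antichain, so
# the peak level width is a lower bound on the DAG's true width.
def peak_parallelism(dag):  # dag: node -> set of direct successors
    indegree = defaultdict(int)
    for u, succs in dag.items():
        for v in succs:
            indegree[v] += 1
    level = {u: 0 for u in dag if indegree[u] == 0}
    frontier = list(level)
    while frontier:
        nxt = []
        for u in frontier:
            for v in dag[u]:
                level[v] = max(level.get(v, 0), level[u] + 1)
                indegree[v] -= 1
                if indegree[v] == 0:
                    nxt.append(v)
        frontier = nxt
    width = defaultdict(int)
    for lvl in level.values():
        width[lvl] += 1
    return max(width.values())

# Hypothetical per-instant DAG with a 2-wide middle layer.
dag = {"CF1": {"C1", "CF2"}, "C1": {"B1"}, "CF2": {"B1"}, "B1": set()}
print(peak_parallelism(dag))  # 2: a third worker would sit idle at this instant
```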
Now we are ready to proceed to the runtime algorithm.
Runtime algorithm
Step 1: Create a struct variable called `current_schedule` and a priority queue called `pending_events`, with each element being a list of events to be present at some future time t.
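A minimal sketch of these two pieces of state (the shapes are my assumption, not the actual runtime structs), with `pending_events` kept as a min-heap ordered by time:

```python
import heapq

# pending_events: a priority queue ordered by logical time, where each entry holds the
# list of events that become present at that time. current_schedule: the static schedule
# chosen for the instant being executed. Both shapes are assumptions for illustration.
pending_events = []      # heap of (time_ns, [event names])
current_schedule = None  # filled in by the table lookup in Step 2

def schedule_event(time_ns, event):
    """Add an event at time_ns, reusing the entry for that time if one exists.
    (A real implementation would index entries by time instead of scanning.)"""
    for t, events in pending_events:
        if t == time_ns:
            events.append(event)
            return
    heapq.heappush(pending_events, (time_ns, [event]))

schedule_event(0, "startup")
print(pending_events)  # [(0, ['startup'])]
```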
Step 2: Remove all events present at time t from `pending_events`, perform a lookup on the N-worker schedule table, and update `current_schedule`.

In this case, we pop the list of signals for t = 0 off `pending_events`, observe that it only contains `startup`, look for the static schedule corresponding to `startup` being present, and set `current_schedule` to that static schedule.
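A sketch of Step 2 under the same assumptions, with a hypothetical schedule table keyed by the set of present events:

```python
import heapq

# Hypothetical N-worker schedule table produced at compile time: the key is the set of
# events present at a logical instant, the value is one reaction list per worker.
# The entries below are assumptions for illustration only.
schedule_table = {
    frozenset({"startup"}): [["CF1", "C1", "B1"], ["CF2"]],
    frozenset({"CustomerFactory.next"}): [["CF1"], []],
}

def advance_to_next_instant(pending_events):
    """Step 2: pop everything at the earliest time and look up its static schedule."""
    time_ns, present = heapq.heappop(pending_events)
    return time_ns, schedule_table[frozenset(present)]

pending_events = [(0, ["startup"])]
t, current_schedule = advance_to_next_instant(pending_events)
print(t, current_schedule)  # 0 [['CF1', 'C1', 'B1'], ['CF2']]
```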
Step 3: Execute a schedule for a particular logical instant. Skip a reaction invocation from the schedule if its upstream reactions do not produce outputs. Append the generated events to the corresponding linked lists in `pending_events`.

In this example, we proceed with executing the static schedule for t = 0.

Assume that `CF1` has just finished and `CustomerFactory.next` is scheduled 1 second later. The state of the data structures at this moment looks like this:

Assume that C1 does not produce an output that triggers CF3. As soon as this is known, CF3 in the static schedule will be marked as "to be skipped."

Assume that, at this point, B2 finishes and schedules `Barber.done` at t = 500 msec. The state of the data structure becomes:

The multi-threaded runtime finishes the current schedule and proceeds to the next step.
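A sketch of the skip rule in Step 3 (my illustration; the upstream table, reaction names, and event names are all assumptions):

```python
# Step 3 sketch: a worker walks its slice of current_schedule, skips a reaction whose
# upstream reaction (within the same instant) produced no output, and pushes any newly
# scheduled events into pending_events. All names and tables below are assumptions.
upstream = {"C1": "CF1", "CF3": "C1", "B1": "C1"}  # hypothetical intra-instant triggers

def run_worker(worker_schedule, produced_output, invoke, now_ns, schedule_event):
    for reaction in worker_schedule:
        trigger = upstream.get(reaction)
        if trigger is not None and not produced_output.get(trigger, False):
            continue  # this invocation is "to be skipped"
        made_output, new_events = invoke(reaction)
        produced_output[reaction] = made_output
        for delay_ns, event in new_events:
            schedule_event(now_ns + delay_ns, event)

def invoke(reaction):
    # Fake reaction bodies: CF1 schedules `next` 1 s later and produces an output;
    # C1 produces no output, so CF3 will be skipped.
    if reaction == "CF1":
        return True, [(1_000_000_000, "CustomerFactory.next")]
    return (False, []) if reaction == "C1" else (True, [])

pending = []
produced = {}
run_worker(["CF1", "C1", "CF3"], produced, invoke,
           now_ns=0, schedule_event=lambda t, e: pending.append((t, e)))
print(pending)   # [(1000000000, 'CustomerFactory.next')]
print(produced)  # {'CF1': True, 'C1': False}; CF3 is absent because it was skipped
```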
Step 4: Repeat Step 2 until there are no more events to process.
Since we have two pending events from executing the previous schedule, we pop off all events at t = 500 msec, advance time to t = 500 msec, and repeat Step 2.
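Putting Steps 2 through 4 together, the top-level loop might look roughly like this (reusing the shapes assumed in the sketches above):

```python
import heapq

# Sketch of the overall runtime loop (Steps 2-4). It runs until pending_events is empty;
# at that point a real runtime would schedule and handle shutdown.
def run(pending_events, schedule_table, execute_instant):
    while pending_events:
        time_ns, present = heapq.heappop(pending_events)            # Step 2: pop earliest time
        current_schedule = schedule_table[frozenset(present)]       # Step 2: table lookup
        execute_instant(time_ns, current_schedule, pending_events)  # Step 3: run / skip / append
        # Step 4: loop back to Step 2 until no events remain.

run([(0, ["startup"])],
    {frozenset({"startup"}): [["CF1", "C1", "B1"], ["CF2"]]},
    execute_instant=lambda t, sched, pending: print("executing", sched, "at t =", t))
```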
This runtime algorithm proceeds until all `pending_events` are handled. At this point, the `shutdown` event is scheduled, properly shutting down the execution.