From 70cd44ccc68347f920cd1721ed266f09a5d2dd5e Mon Sep 17 00:00:00 2001
From: Ryo Onodera
Date: Sat, 13 Apr 2024 16:15:53 +0900
Subject: [PATCH] Document not-chosen approach in detail

---
 unified-scheduler-pool/src/lib.rs | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/unified-scheduler-pool/src/lib.rs b/unified-scheduler-pool/src/lib.rs
index 53fdc9bededbd6..1b524ddc69b1a9 100644
--- a/unified-scheduler-pool/src/lib.rs
+++ b/unified-scheduler-pool/src/lib.rs
@@ -571,7 +571,7 @@ impl<S: SpawnableScheduler<TH>, TH: TaskHandler> ThreadManager<S, TH> {
         // performance degradation.
         //
         // Overall, while this is merely a heuristic, it's effective and adaptive while not
-        // vulnerable.
+        // vulnerable, merely reusing existing information without any additional runtime cost.
         //
         // One known caveat, though, is that this heuristic is employed under a sub-optimal
         // setting, considering scheduling is done in real-time. Namely, prioritization enforcement
@@ -585,6 +585,30 @@ impl<S: SpawnableScheduler<TH>, TH: TaskHandler> ThreadManager<S, TH> {
         // Finally, note that this optimization should be combined with biased select (i.e.
         // `select_biased!`), which isn't for now... However, consistent performance improvement is
         // observed just with this priority queuing alone.
+        //
+        // Alternatively, more faithful prioritization could be realized by checking the blocking
+        // statuses of all addresses immediately before sending a task to the handlers. This would
+        // prevent the false negatives of the heuristic approach (i.e. the last task of a run
+        // doesn't actually need to be handled with the higher priority). Note that this is the
+        // only improvement over the heuristic. That's because the underlying information asymmetry
+        // between the two approaches doesn't exist for any other case, assuming no look-ahead:
+        // idle tasks are always unblocked by definition, and other blocked tasks are always
+        // correctly classified as blocked by the very existence of the last blocked task.
+        //
+        // On the other hand, the faithful approach incurs a considerable overhead: O(N), where N
+        // is the number of locked addresses in a task, adding to the current bare-minimum
+        // complexity of O(2*N) for scheduling and descheduling combined; this means a 1.5x
+        // increase. Furthermore, it doesn't work nicely in practice with a real-time streamed
+        // scheduler. That's because linearized runs can appear intermittent when viewed with little
+        // or no look-back, even though they actually form far longer runs over a longer time span.
+        // Such access patterns are very common, given the existence of well-known hot accounts.
+        //
+        // Thus, intentionally allowing these false positives of the heuristic approach actually
+        // helps extend the logical prioritization session across such invisible longer runs: as
+        // long as the last task of the current run is still being handled by the handlers, there
+        // is a chance that yet another blocking task arrives to confirm and further extend the
+        // tentative prioritization. Consequently, this also helps alleviate the heuristic's known
+        // caveat for the first task of a linearized run, which is described above.
         let (mut blocked_task_sender, blocked_task_receiver) =
             chained_channel::unbounded::<Task, SchedulingContext>(context.clone());
         let (idle_task_sender, idle_task_receiver) = unbounded::<Task>();
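
For illustration only, here is a minimal, hypothetical Rust sketch of the two-queue prioritization the patched comment describes: a task observed to be blocking a linearized run is routed through a dedicated high-priority channel, idle tasks go through a normal one, and the handler always prefers the high-priority channel. This is not the unified-scheduler-pool implementation; `Task` is a stand-in type, the `chained_channel` machinery and `SchedulingContext` are omitted, and the bias is approximated with `try_recv()` rather than `select_biased!`.

use crossbeam_channel::{unbounded, Receiver, TryRecvError};

struct Task {
    index: usize,
}

// Drain both queues, always preferring previously blocked tasks. A production handler
// would block on `select!` (ideally biased) instead of spinning; this demo assumes the
// senders are already dropped, so `try_recv()` only ever returns Ok or Disconnected.
fn handler_loop(blocked_task_receiver: Receiver<Task>, idle_task_receiver: Receiver<Task>) {
    loop {
        // Biased step: if a previously blocked task is ready, handle it first.
        if let Ok(task) = blocked_task_receiver.try_recv() {
            println!("handling prioritized (previously blocked) task {}", task.index);
            continue;
        }
        match idle_task_receiver.try_recv() {
            Ok(task) => println!("handling idle task {}", task.index),
            Err(TryRecvError::Empty) => continue,
            Err(TryRecvError::Disconnected) => break,
        }
    }
}

fn main() {
    let (blocked_task_sender, blocked_task_receiver) = unbounded::<Task>();
    let (idle_task_sender, idle_task_receiver) = unbounded::<Task>();

    // Scheduler side: per the heuristic, only the tail of a linearized run (the task that
    // was actually observed to be blocked) goes through the high-priority channel.
    idle_task_sender.send(Task { index: 1 }).unwrap();
    idle_task_sender.send(Task { index: 2 }).unwrap();
    blocked_task_sender.send(Task { index: 3 }).unwrap();

    drop(blocked_task_sender);
    drop(idle_task_sender);

    // Prints task 3 first, then tasks 1 and 2.
    handler_loop(blocked_task_receiver, idle_task_receiver);
}

Running this prints task 3 before tasks 1 and 2, mirroring how a previously blocked task jumps ahead of the idle ones under the heuristic.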