From 70cd44ccc68347f920cd1721ed266f09a5d2dd5e Mon Sep 17 00:00:00 2001
From: Ryo Onodera
Date: Sat, 13 Apr 2024 16:15:53 +0900
Subject: [PATCH] Document not-chosen approach in detail

---
 unified-scheduler-pool/src/lib.rs | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/unified-scheduler-pool/src/lib.rs b/unified-scheduler-pool/src/lib.rs
index 53fdc9bededbd6..1b524ddc69b1a9 100644
--- a/unified-scheduler-pool/src/lib.rs
+++ b/unified-scheduler-pool/src/lib.rs
@@ -571,7 +571,7 @@ impl<S: SpawnableScheduler<TH>, TH: TaskHandler> ThreadManager<S, TH> {
         // performance degradation.
         //
         // Overall, while this is merely a heuristic, it's effective and adaptive while not
-        // vulnerable.
+        // vulnerable, merely reusing existing information without any additional runtime cost.
         //
         // One known caveat, though, is that this heuristic is employed under a sub-optimal
         // setting, considering scheduling is done in real-time. Namely, prioritization enforcement
@@ -585,6 +585,30 @@ impl<S: SpawnableScheduler<TH>, TH: TaskHandler> ThreadManager<S, TH> {
         // Finally, note that this optimization should be combined with biased select (i.e.
         // `select_biased!`), which isn't for now... However, consistent performance improvement is
         // observed just with this priority queuing alone.
+        //
+        // Alternatively, more faithful prioritization could be realized by checking the blocking
+        // statuses of all addresses immediately before sending a task to the handlers. This would
+        // prevent the false negatives of the heuristic approach (i.e. the last task of a run
+        // doesn't actually need to be handled with the higher priority). Note that this is the
+        // only improvement over the heuristic. That's because the underlying information asymmetry
+        // between the two approaches doesn't exist for any other case, assuming no look-ahead:
+        // idle tasks are always unblocked by definition, and other blocked tasks are always
+        // correctly classified as blocked by the very existence of the last blocked task.
+        //
+        // On the other hand, the faithful approach incurs a considerable overhead: O(N), where N
+        // is the number of locked addresses in a task, adding to the current bare-minimum
+        // complexity of O(2*N) for scheduling and descheduling combined; this means a 1.5x
+        // increase. Furthermore, it doesn't work nicely in practice with a real-time streamed
+        // scheduler. That's because linearized runs can appear intermittent when viewed with little
+        // or no look-back, even though they actually form far longer runs over a longer time span.
+        // Such access patterns are very common, given the existence of well-known hot accounts.
+        //
+        // Thus, intentionally allowing these false positives of the heuristic approach actually
+        // helps extend the logical prioritization session across such invisible longer runs: as
+        // long as the last task of the current run is still being handled by the handlers, there
+        // is a chance that yet another blocking task arrives to confirm and further extend the
+        // tentative prioritization. Consequently, this also helps alleviate the heuristic's known
+        // caveat for the first task of a linearized run, which is described above.
         let (mut blocked_task_sender, blocked_task_receiver) =
             chained_channel::unbounded::<Task, SchedulingContext>(context.clone());
         let (idle_task_sender, idle_task_receiver) = unbounded::<Task>();
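
For illustration only, here is a minimal, hypothetical Rust sketch of the two-queue prioritization the patched comment describes: a task observed to be blocking a linearized run is routed through a dedicated high-priority channel, idle tasks go through a normal one, and the handler always prefers the high-priority channel. This is not the unified-scheduler-pool implementation; `Task` is a stand-in type, the `chained_channel` machinery and `SchedulingContext` are omitted, and the bias is approximated with `try_recv()` rather than `select_biased!`.

use crossbeam_channel::{unbounded, Receiver, TryRecvError};

struct Task {
    index: usize,
}

// Drain both queues, always preferring previously blocked tasks. A production handler
// would block on `select!` (ideally biased) instead of spinning; this demo assumes the
// senders are already dropped, so `try_recv()` only ever returns Ok or Disconnected.
fn handler_loop(blocked_task_receiver: Receiver<Task>, idle_task_receiver: Receiver<Task>) {
    loop {
        // Biased step: if a previously blocked task is ready, handle it first.
        if let Ok(task) = blocked_task_receiver.try_recv() {
            println!("handling prioritized (previously blocked) task {}", task.index);
            continue;
        }
        match idle_task_receiver.try_recv() {
            Ok(task) => println!("handling idle task {}", task.index),
            Err(TryRecvError::Empty) => continue,
            Err(TryRecvError::Disconnected) => break,
        }
    }
}

fn main() {
    let (blocked_task_sender, blocked_task_receiver) = unbounded::<Task>();
    let (idle_task_sender, idle_task_receiver) = unbounded::<Task>();

    // Scheduler side: per the heuristic, only the tail of a linearized run (the task that
    // was actually observed to be blocked) goes through the high-priority channel.
    idle_task_sender.send(Task { index: 1 }).unwrap();
    idle_task_sender.send(Task { index: 2 }).unwrap();
    blocked_task_sender.send(Task { index: 3 }).unwrap();

    drop(blocked_task_sender);
    drop(idle_task_sender);

    // Prints task 3 first, then tasks 1 and 2.
    handler_loop(blocked_task_receiver, idle_task_receiver);
}

Running this prints task 3 before tasks 1 and 2, mirroring how a previously blocked task jumps ahead of the idle ones under the heuristic.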