
threadpool: throttled big group #1778

Merged
merged 28 commits into master on Apr 24, 2017

Conversation

Member

@AndreMouche AndreMouche commented Apr 18, 2017

Hi all,

This PR adds a new threadpool which tries to throttle each group's concurrency to a specified number when the pool is busy.

Each task carries a group_id attribute identifying the group it belongs to. When a thread asks for a new task to run, the pool schedules according to the following rules:

  1. Find the groups whose number of running tasks is smaller than group_concurrency_on_busy.
  2. If more than one group meets the first point, run the task that arrived first.

If no group meets the first point, choose according to the following rules:

  1. Select the group with the fewest running tasks.
  2. If more than one group meets the first point, choose the one whose front task arrived first (i.e. with the minimum task ID).
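
The selection rules above can be sketched in a few lines. This is a simplified, hypothetical view: `running`, `front_task_id`, and `pick_group` are illustrative names, not types from the PR itself (the real implementation keeps a heap plus a waiting queue).

```rust
use std::collections::HashMap;

// Hypothetical, simplified scheduler state for illustration only.
fn pick_group(
    running: &HashMap<u64, usize>,     // group_id -> number of running tasks
    front_task_id: &HashMap<u64, u64>, // group_id -> id of its oldest waiting task
    group_concurrency_on_busy: usize,
) -> Option<u64> {
    // Rule set 1: among groups under the busy limit, pick the one
    // whose front task arrived first (smallest id).
    let under_limit = front_task_id
        .iter()
        .map(|(&g, &id)| (g, id))
        .filter(|&(g, _)| running.get(&g).copied().unwrap_or(0) < group_concurrency_on_busy)
        .min_by_key(|&(_, id)| id)
        .map(|(g, _)| g);
    if under_limit.is_some() {
        return under_limit;
    }
    // Rule set 2: otherwise pick the group with the fewest running
    // tasks, breaking ties by the smallest front task id.
    front_task_id
        .iter()
        .map(|(&g, &id)| (running.get(&g).copied().unwrap_or(0), id, g))
        .min()
        .map(|(_, _, g)| g)
}

fn main() {
    let running: HashMap<u64, usize> = [(1, 2), (2, 1)].into_iter().collect();
    let front: HashMap<u64, u64> = [(1, 10), (2, 11)].into_iter().collect();
    // Busy limit 2: group 2 is the only group under the limit.
    assert_eq!(pick_group(&running, &front, 2), Some(2));
    // Busy limit 1: no group is under the limit; group 2 runs the fewest tasks.
    assert_eq!(pick_group(&running, &front, 1), Some(2));
    println!("ok");
}
```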

@BusyJay @zhangjinpeng1987 @disksing PTAL

use std::marker::PhantomData;

pub struct Task<T> {
// The task's number in the pool.Each task has a unique number,
Contributor

space after period


impl<T> Ord for Task<T> {
fn cmp(&self, right: &Task<T>) -> Ordering {
self.id.cmp(&right.id).reverse()
Contributor

reverse ordering?

Member Author
@AndreMouche AndreMouche Apr 18, 2017

We reverse the ordering here because the heap pops the largest item first, while we need to pop the task with the smallest id first. @andelf
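
The same min-heap effect can be seen with a standalone example: `std::cmp::Reverse` gives the behavior that the PR gets by reversing `Ord` on `Task`.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn main() {
    // BinaryHeap is a max-heap; wrapping ids in `Reverse` (or reversing
    // `Ord`, as the PR does) makes it pop the smallest id first.
    let mut heap = BinaryHeap::new();
    for id in [3u64, 1, 2] {
        heap.push(Reverse(id));
    }
    assert_eq!(heap.pop(), Some(Reverse(1)));
    assert_eq!(heap.pop(), Some(Reverse(2)));
    assert_eq!(heap.pop(), Some(Reverse(3)));
    println!("ok");
}
```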

BigGroupThrottledQueue {
group_concurrency: HashMap::default(),
waiting_queue: HashMap::default(),
pending_tasks: BinaryHeap::new(),
Contributor

consistency between new() and default().
maybe new() is better.

group_concurrency: HashMap::default(),
waiting_queue: HashMap::default(),
pending_tasks: BinaryHeap::new(),
group_concurrency_on_busy: group_concurrency_on_busy,
Contributor

s/on/when/

Member Author

Since on shares the same meaning as when here, and on is shorter, I prefer on here. @andelf

}
}

// Try push into pending. Return none on success,return Some(task) on failed.
Contributor

logically wrong usage of Option. You may use Result<(), ...> instead.

btw, is this thread-safe?

Member Author

Only one thread can own the lock of BigGroupThrottledQueue at a time. @andelf
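
A minimal sketch of the Result-based signature suggested above, with the queue behind a Mutex as described in the reply. The `ThrottledQueue`, `PushError`, and `capacity` names here are illustrative, not the PR's actual types (though the PR later adopted a similar `PushError`).

```rust
use std::collections::BinaryHeap;
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
struct PushError<T>(T); // returned instead of Some(task) on failure

struct ThrottledQueue {
    pending: Mutex<BinaryHeap<u64>>, // task ids, simplified
    capacity: usize,                 // hypothetical limit for this demo
}

impl ThrottledQueue {
    // Err(PushError(task)) instead of Some(task) on failure: the Result
    // makes the failure case explicit rather than overloading Option.
    fn try_push(&self, id: u64) -> Result<(), PushError<u64>> {
        // Only one thread can own the lock at a time.
        let mut pending = self.pending.lock().unwrap();
        if pending.len() >= self.capacity {
            return Err(PushError(id));
        }
        pending.push(id);
        Ok(())
    }
}

fn main() {
    let q = ThrottledQueue { pending: Mutex::new(BinaryHeap::new()), capacity: 1 };
    assert!(q.try_push(1).is_ok());
    assert_eq!(q.try_push(2), Err(PushError(2)));
    println!("ok");
}
```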

}

/// `ThreadPool` is used to execute tasks in parallel.
/// Each task would be pushed into the pool,and when a thread
Contributor

space after comma

}

ThreadPool {
meta: meta.clone(),
Contributor

is clone() necessary?

// return false when get stop msg
#[inline]
fn wait(&self) -> bool {
// try to receive notify
Contributor

notification


fn run(&mut self) {
// start the worker.
// loop break on receive stop message.
Contributor

breaks

Contributor

receiving

// loop break on receive stop message.
while self.wait() {
// handle task
// since `tikv` would be down on any panic happens,
Contributor

do not format tikv since it's not a function, variable or type

for (group_id, tasks) in &self.waiting_queue {
let front_task_id = tasks[0].id;
assert!(self.group_concurrency.contains_key(group_id));
let count = self.group_concurrency[group_id];
Contributor

can we ensure the group_id exists here?

Member Author

Yes. A task is pushed into pending_tasks only when group_id is not in self.group_concurrency or self.group_concurrency[group_id] < group_concurrency_on_busy, so we can ensure the group_id exists here. @siddontang

}

#[inline]
fn pop_task_from_waiting_queue(&mut self) -> Option<Task<T>> {
Contributor

pop_from_waiting_queue

pub trait ScheduleQueue<T> {
fn pop(&mut self) -> Option<Task<T>>;
fn push(&mut self, task: Task<T>);
fn finish(&mut self, group_id: T);
Member

What does finish mean here?


// each thread has a worker.
struct Worker<Q, T> {
job_rever: Arc<Mutex<Receiver<bool>>>,
Member

Use Mutex + Condvar instead.
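
A minimal sketch of the Mutex + Condvar pattern suggested here, as a replacement for a channel of wakeup messages. The `Shared` struct and field names are illustrative, not the PR's actual code.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

struct Shared {
    queue: Mutex<VecDeque<u64>>, // pending task ids, simplified
    cond: Condvar,
}

fn main() {
    let shared = Arc::new(Shared { queue: Mutex::new(VecDeque::new()), cond: Condvar::new() });
    let worker_shared = Arc::clone(&shared);
    let worker = thread::spawn(move || {
        let mut guard = worker_shared.queue.lock().unwrap();
        // wait() atomically releases the lock and sleeps; re-check the
        // condition in a loop to handle spurious wakeups.
        while guard.is_empty() {
            guard = worker_shared.cond.wait(guard).unwrap();
        }
        guard.pop_front().unwrap()
    });
    shared.queue.lock().unwrap().push_back(42);
    shared.cond.notify_one();
    assert_eq!(worker.join().unwrap(), 42);
    println!("ok");
}
```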

}
}

struct ThreadPoolMeta<Q, T> {
Member

TaskPool seems more appropriate to me.

// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.

@Wenting0905 Wenting0905 Apr 18, 2017

The paragraph "Unless required ... under the License" is divided across several lines.

Member Author

We divide it into several lines to keep each line from getting too long. It's fine in source code. @Wenting0905


oh i see

// group_id => running_num+pending num. It means there may
// `group_concurrency[group_id]` tasks of the group are running.
group_concurrency: HashMap<T, usize>,
// max num of threads each group can run on when pool is busy.


The max number of threads that each group can run when the pool is busy.

// `group_concurrency[group_id]` tasks of the group are running.
group_concurrency: HashMap<T, usize>,
// max num of threads each group can run on when pool is busy.
// each value in group_concurrency shouldn't bigger than this value.


Each value in 'group_concurrency' shouldn't be bigger than this value.

}
let group_id = group_id.unwrap();
let task = self.pop_from_waiting_queue_with_group_id(&group_id);
// update group_concurrency since current task is going to run.


since the current task

let task = waiting_tasks.pop_front().unwrap();
(waiting_tasks.is_empty(), task)
};
// if waiting tasks for group is empty, remove from waiting_tasks.


remove it

task
}

// pop_group_id_from_waiting_queue returns next task's group_id.


returns the next

});
}

// push 2 txn3 into pool, each need 2*sleep_duration.


push 2 txn3 into pool and each needs 2*sleep_duration.

}

// txn11,txn12,txn13,txn14,txn21,txn22,txn31,txn32
// first 4 task during [0,sleep_duration] should be


first 4 tasks


// txn11,txn12,txn13,txn14,txn21,txn22,txn31,txn32
// first 4 task during [0,sleep_duration] should be
// {txn11,txn12,txn21,txn22}.Since txn1 finished before than txn2,


  1. space after period
  2. Since txn1 is finished before txn2,

// txn11,txn12,txn13,txn14,txn21,txn22,txn31,txn32
// first 4 task during [0,sleep_duration] should be
// {txn11,txn12,txn21,txn22}.Since txn1 finished before than txn2,
// 4 task during [sleep_duration,2*sleep_duration] should be


4 tasks

fn test_fair_group_queue() {
let max_pending_task_each_group = 2;
let mut queue = BigGroupThrottledQueue::new(max_pending_task_each_group);
// push 4 group1 into queue


delete one space before 4

}
while let Some(t) = self.threads.pop() {
if let Err(e) = t.join() {
return Err(format!("{:?}", e));
Member

What about other threads?
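
One way to address this question (join every thread even after a failure, and report the first error afterwards, rather than returning early and leaving the rest detached) might look like the following sketch; `join_all` is a hypothetical helper, not the PR's code.

```rust
use std::thread;

// Hypothetical helper: join all handles even if one fails, then
// report the first failure (if any).
fn join_all(threads: Vec<thread::JoinHandle<()>>) -> Result<(), String> {
    let mut first_err = None;
    for t in threads {
        if let Err(e) = t.join() {
            // keep only the first error, but keep joining
            first_err.get_or_insert(format!("{:?}", e));
        }
    }
    match first_err {
        Some(e) => Err(e),
        None => Ok(()),
    }
}

fn main() {
    let threads = vec![thread::spawn(|| {}), thread::spawn(|| {})];
    assert!(join_all(threads).is_ok());
    println!("ok");
}
```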

@AndreMouche
Member Author

PTAL

// The task's number in the pool. Each task has a unique number,
// and it's always bigger than preceding ones.
id: u64,
// the task's group_id.
Member

Group which the task belongs to.

waiting_queue: HashMap<T, VecDeque<Task<T>>>,
// group_id => running_num+pending num. It means there may
// `group_concurrency[group_id]` tasks of the group are running.
group_concurrency: HashMap<T, usize>,
Member

s/group_concurrency/group_concurrency_stat

while let Some(task) = self.get_next_task() {
// handle task
// since tikv would be down when any panic happens,
// we do't need to process panic case here.
Member

s/do't/don't/

// handle task
// since tikv would be down when any panic happens,
// we do't need to process panic case here.
task.task.call_box(());
Member

Prefer (task.task)().
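
For context: on the Rust of that era, calling a boxed closure needed workarounds like the FnBox/call_box pattern. In modern Rust (1.35+), a `Box<dyn FnOnce()>` can be invoked directly, which is what the `(task.task)()` form expresses. A standalone sketch (the names are illustrative):

```rust
fn run_task(task: Box<dyn FnOnce() -> u64>) -> u64 {
    // Modern Rust can invoke a boxed FnOnce directly; no call_box needed.
    task()
}

fn main() {
    let task: Box<dyn FnOnce() -> u64> = Box::new(|| 40 + 2);
    assert_eq!(run_task(task), 42);
    println!("ok");
}
```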

// we do't need to process panic case here.
task.task.call_box(());
self.on_task_finished(&task.group_id);
self.task_count.fetch_sub(1, AOrdering::SeqCst);
Member

What's A?

builder = builder.name(name.clone());
let tasks = task_pool.clone();
let task_num = task_count.clone();
let thread = builder.spawn(move || {
Member

Chain L306, L307 with L310.


fn pop(&mut self) -> Option<Task<T>> {
if let Some(task) = self.pending_tasks.pop() {
let count = self.group_concurrency.entry(task.group_id.clone()).or_insert(0);
Member

Seems always exist.

let mut next_group = None;
for (group_id, tasks) in &self.waiting_queue {
let front_task_id = tasks[0].id;
assert!(self.group_concurrency.contains_key(group_id));
Member

Unnecessary.

// (group_id,count,task_id) the best current group's info with it's group_id,
// running tasks count, front task's id in waiting queue.
let mut next_group = None;
for (group_id, tasks) in &self.waiting_queue {
Member

Use Iterator::min instead.


pub struct Task<T> {
// The task's number in the pool. Each task has a unique number,
// and it's always bigger than preceding ones.
Contributor

s/task number/task id/
It would be fine to use just the task id.

}
}

impl<T> Eq for Task<T> {}
Contributor

Why is the implementation empty?


// `BigGroupThrottledQueue` tries to throttle group's concurrency to
// `group_concurrency_on_busy` when it's busy.
// When one worker asks a task to run, it schedules in the following way:
// 1. Find out which group has a running number that is smaller than
Contributor

Please take a look at these comments. @Wenting0905


LGTM

// more than `group_concurrency_on_busy`), the rest of the group's tasks
// would be pushed into `waiting_queue[group_id]`
waiting_queue: HashMap<T, VecDeque<Task<T>>>,
// group_id => running_num+pending num. It means there may
Contributor

there may?

Member Author

Yes. This is a subtle optimization to improve the efficiency of scheduling the next task. A plain implementation would need only waiting_queue, but then we would always have to iterate over all the groups in waiting_queue to find the optimal task.
In this implementation, we add a pending_heap. When a new task comes in:

  1. If the total number of the group's tasks that are in pending_heap or running is smaller than group_concurrency_on_busy, push it into the pending heap.
  2. Otherwise, push the task into waiting_queue.

And when we try to get a new task to run:

  1. If the heap is not empty, pop the front task.
  2. Otherwise, find the optimal task in waiting_queue according to our rules.

Here group_concurrency[group_id] stores the total number of the group's tasks which are in pending_tasks (the pending heap) or running. It also means there may be group_concurrency[group_id] tasks of the group running. @hhkbp2
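
The two-queue scheme described above can be sketched as follows. This is a deliberately simplified model (task ids only, `Sketch` is a hypothetical name, and rule-based selection from waiting_queue is elided):

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap, VecDeque};

// Simplified sketch of the pending-heap + waiting-queue scheme.
struct Sketch {
    pending_heap: BinaryHeap<Reverse<(u64, u64)>>, // (task_id, group_id), smallest id first
    waiting_queue: HashMap<u64, VecDeque<u64>>,    // group_id -> parked task ids
    group_concurrency: HashMap<u64, usize>,        // running + pending count per group
    group_concurrency_on_busy: usize,
}

impl Sketch {
    fn push(&mut self, group_id: u64, task_id: u64) {
        let count = self.group_concurrency.entry(group_id).or_insert(0);
        if *count < self.group_concurrency_on_busy {
            // Under the busy limit: go straight into the pending heap.
            *count += 1;
            self.pending_heap.push(Reverse((task_id, group_id)));
        } else {
            // Over the limit: park the task in the per-group queue.
            self.waiting_queue.entry(group_id).or_default().push_back(task_id);
        }
    }

    fn pop(&mut self) -> Option<(u64, u64)> {
        // Pop from the heap first; selecting from waiting_queue by the
        // scheduling rules is elided in this sketch.
        self.pending_heap.pop().map(|Reverse((task, group))| (group, task))
    }
}

fn main() {
    let mut s = Sketch {
        pending_heap: BinaryHeap::new(),
        waiting_queue: HashMap::new(),
        group_concurrency: HashMap::new(),
        group_concurrency_on_busy: 2,
    };
    s.push(1, 10);
    s.push(1, 11);
    s.push(1, 12); // third task of group 1 exceeds the limit of 2
    assert_eq!(s.pop(), Some((1, 10)));
    assert_eq!(s.waiting_queue[&1], vec![12]);
    println!("ok");
}
```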

Contributor

Sorry for the crudeness of the previous comment. :)
It was meant to point out that there may be a syntax error in the comment. Try revising it like
"It means at most group_concurrency[group_id] tasks of the group may be running."

@Wenting0905

LGTM

}

// `BigGroupThrottledQueue` tries to throttle group's concurrency to
// `group_concurrency_on_busy` when is busy.


when it's busy

.map(|(group_id, waiting_queue)| {
(self.group_concurrency[group_id], waiting_queue[0].id, group_id)
})
.min();
Contributor

FYI, it could use a data structure like a SkipList/Heap, which is fast for insertion, deletion, and ordered access, to track the relationship of (lowest concurrency, low id) -> group_id and avoid the iteration.

Member

Don't make it more complicated, ordermap can satisfy the need.
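
The tuple-based `.min()` in the quoted code works because Rust tuples compare lexicographically, so mapping each group to (running_count, front_task_id, group_id) picks the least-loaded group with ties broken by the oldest task. A standalone check:

```rust
fn main() {
    // (running_count, front_task_id, group_id): tuples compare field
    // by field, so min() gives the least-loaded group, ties broken by
    // the smallest front task id.
    let groups = vec![(2usize, 10u64, 1u64), (1, 12, 2), (1, 11, 3)];
    let best = groups.into_iter().min();
    assert_eq!(best, Some((1, 11, 3)));
    println!("ok");
}
```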

// `group_concurrency_on_busy`(which means the number of on-going tasks is
// more than `group_concurrency_on_busy`), the rest of the group's tasks
// would be pushed into `waiting_queue[group_id]`
waiting_queue: HashMap<T, VecDeque<Task<T>>>,
Member

Maybe we should rename waiting_queue to big_task_waiting_queue. I am always confused by pending_tasks and waiting_queue.

Member

Maybe queue1 queue2 is more clear.


#[inline]
fn pop_from_waiting_queue_with_group_id(&mut self, group_id: &T) -> Task<T> {
let (waiting_tasks_is_empty, task) = {
Member

s/waiting_tasks_is_empty/empty_after_pop

}

#[inline]
fn pop_from_waiting_queue_with_group_id(&mut self, group_id: &T) -> Task<T> {
Member

s/with/by/


struct TaskPool<Q, T> {
next_task_id: u64,
tasks: Q,
Member

s/tasks/task_queue

}
}


Member

remove blank line

pub fn execute<F>(&mut self, group_id: T, job: F)
where F: FnOnce() + Send + 'static
{
self.task_count.fetch_add(1, AtomicOrdering::SeqCst);
Member

Move this line after L309.

}
if let Some(task) = task_pool.pop_task() {
// to reduce lock's time.
task_pool.on_task_started(&task.group_id);
Member

Please add a comment on why on_task_started is called here and not before L387.

-> Worker<Q, T> {
Worker {
task_pool: task_pool,
task_count: task_count,
Member

Seems task_count is not used, can we drop it?

Member Author

task_count is needed when one task is finished. @zhangjinpeng1987

group_concurrency: HashMap::new(),
waiting_queue: HashMap::new(),
pending_tasks: BinaryHeap::new(),
group_concurrency_on_busy: group_concurrency_on_busy,
Member

s/group_concurrency_on_busy/group_concurrency_limit

Member

Who is busy?

if statistics.total() >= self.group_concurrency_limit {
return Err(PushError(task));
}
statistics.queue1_count += 1;
Member

Move this line below L144

}

#[inline]
fn pop_from_waiting_queue_by_group_id(&mut self, group_id: &T) -> Task<T> {
Member

pop_from_queue2_by_group_id

// Try push into high priority queue. Return none on success,return PushError(task) on failed.
#[inline]
fn try_push_into_high_pri_queue(&mut self, task: Task<T>) -> Result<(), PushError<Task<T>>> {
let statistics = self.group_statistics
Member

let mut statistics.. ?

// If the value of `group_statistics[group_id]` is not big enough, pop
// a task from `low_pri_queue[group_id]` and push it into `high_pri_queue`.
let group_task = self.pop_from_low_pri_queue_by_group_id(group_id);
assert!(self.try_push_into_high_pri_queue(group_task).is_ok());
Member

unwrap

@zhangjinpeng87
Member

LGTM @hhkbp2 PTAL again.


struct GroupStatisticsItem {
running_count: usize,
high_pri_queue_count: usize,
Contributor

s/pri/priority/
Use full name unless it's really too long.

Contributor

= =

@hhkbp2
Contributor

hhkbp2 commented Apr 24, 2017

LGTM

@AndreMouche AndreMouche merged commit 2f116ef into master Apr 24, 2017
@AndreMouche AndreMouche deleted the shirly/big_group_throttled_threadpool branch April 24, 2017 15:00

8 participants