Task scheduling
Stefano Belforte edited this page Jun 24, 2024 · 8 revisions
- prevent one (or a few) users from monopolizing resources
- give new users a chance to get some work through, even if slowly, when resources are saturated
- introduce a new task status before NEW: WAITING
- crab submit puts tasks in WAITING
- Option 1
  - a new TW-like component handles tasks in WAITING, moving them to NEW "a bit at a time"
  - pros: no changes to current TW
  - cons: a new service to start/manage
- Option 2
  - in TW MasterWorker introduce a `selectWork` step before `lockWork` which moves tasks from WAITING to NEW
  - pros: only run TW, like now; all code stays together
  - cons: TW becomes more complex; bugs in new code may make everything crash
Stefano favours Option 2
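
To make Option 2 more concrete, here is a minimal in-memory sketch of a selection step running before the locking step. It is not the TaskWorker code: the real `lockWork` talks to the REST/database, while this sketch only uses a Python list. The names `Task`, `release_all`, `select_work`, `lock_work`, `main_loop_iteration` and the `LOCKED` status are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    user: str
    status: str = "WAITING"  # assumption: crab submit creates tasks in WAITING


# A scheduling policy receives the WAITING tasks and returns those to release.
Policy = Callable[[List[Task]], List[Task]]


def release_all(waiting: List[Task]) -> List[Task]:
    """Initially trivial (transparent) policy: release every WAITING task."""
    return list(waiting)


def select_work(tasks: List[Task], policy: Policy = release_all) -> int:
    """Move the tasks chosen by the policy from WAITING to NEW."""
    waiting = [t for t in tasks if t.status == "WAITING"]
    selected = policy(waiting)
    for task in selected:
        task.status = "NEW"
    return len(selected)


def lock_work(tasks: List[Task], limit: int) -> List[Task]:
    """Stand-in for the existing step: claim up to `limit` NEW tasks."""
    claimed = [t for t in tasks if t.status == "NEW"][:limit]
    for task in claimed:
        task.status = "LOCKED"  # placeholder status, not taken from TW
    return claimed


def main_loop_iteration(tasks: List[Task], iteration: int,
                        select_every: int = 1, limit: int = 10) -> List[Task]:
    """One pass of the main loop: selection first, then locking."""
    # Selection may run at a lower frequency than the loop itself.
    if iteration % select_every == 0:
        select_work(tasks)
    return lock_work(tasks, limit)
```

Keeping the policy as a separate callable means the first version can be fully transparent (`release_all`) and a smarter algorithm can be swapped in later without touching the loop.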
- modify list of task statuses in code and documentation
- have an initial transparent `selectWork` which finds all tasks in WAITING and moves them to NEW. Reuse code (20 lines!) from `lockWork`, but make it call an (initially trivial) external scheduling method. Insert it at the beginning of this loop https://github.com/dmwm/CRABServer/blob/f88a9b98f80f9a38e3180ed0a5b2b193a75ee5c0/src/python/TaskWorker/MasterWorker.py#L417-L419 but it should be possible to have it act at a lower frequency
- modify handling of submission in REST to put tasks in WAITING. Change https://github.com/dmwm/CRABServer/blob/f88a9b98f80f9a38e3180ed0a5b2b193a75ee5c0/src/python/CRABInterface/DataWorkflow.py#L178
- modify crab status to properly inform user when task is in WAITING
- add a waiting reason in the message field in the task table
- improve `selectWork` by adding a simple algorithm which, e.g., picks tasks in round robin among all users (possibly achievable for a Summer Student)
- further improve by adding knowledge of how many resources a user is using now and some fair share algorithm. For this we need to differentiate tasks in progress vs. completed (i.e. Dagman running or not) in the Task table
- add pruning of tasks queue: users able to kill, us/TW able to "give it a cut"
- report queue status back to user and possibly refuse further submissions
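
The two scheduling improvements listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: waiting tasks arrive as `(user, task_name)` pairs in submission order, the number of running tasks per user is available as a plain dictionary, and the function names `round_robin` and `fair_share` as well as the `max_release` parameter are hypothetical, not existing CRAB code.

```python
from collections import OrderedDict, deque
from typing import Dict, List, Tuple


def _per_user_queues(waiting: List[Tuple[str, str]]) -> "OrderedDict[str, deque]":
    """Group waiting tasks by user, preserving submission order."""
    queues: "OrderedDict[str, deque]" = OrderedDict()
    for user, task in waiting:
        queues.setdefault(user, deque()).append(task)
    return queues


def round_robin(waiting: List[Tuple[str, str]], max_release: int) -> List[str]:
    """Release up to max_release tasks, taking one task per user per pass."""
    queues = _per_user_queues(waiting)
    released: List[str] = []
    while queues and len(released) < max_release:
        for user in list(queues):
            if len(released) >= max_release:
                break
            released.append(queues[user].popleft())
            if not queues[user]:
                del queues[user]  # this user has nothing left waiting
    return released


def fair_share(waiting: List[Tuple[str, str]], running: Dict[str, int],
               max_release: int) -> List[str]:
    """Release tasks favouring users with the fewest tasks already in progress."""
    queues = _per_user_queues(waiting)
    # Load = tasks in progress plus tasks released during this call.
    load = {user: running.get(user, 0) for user in queues}
    released: List[str] = []
    while queues and len(released) < max_release:
        # min() returns the first minimum, so ties go to the earliest submitter.
        user = min(queues, key=lambda u: load[u])
        released.append(queues[user].popleft())
        load[user] += 1
        if not queues[user]:
            del queues[user]
    return released


waiting = [("alice", "a1"), ("alice", "a2"), ("alice", "a3"),
           ("bob", "b1"), ("carol", "c1")]
print(round_robin(waiting, max_release=4))
print(fair_share(waiting, {"alice": 5}, max_release=3))
```

With round robin, a user who submitted many tasks cannot block later submitters; the fair share variant additionally pushes back users who already have many tasks in progress, which is why it needs the in-progress vs. completed distinction mentioned above.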