
Broader verification of scheduler correctness #168

Closed
SteVwonder opened this issue Jun 9, 2016 · 21 comments

@SteVwonder
Member

In #159, @dongahn mentioned:

do you have an idea as to how we can test correctness of various scheduling algorithms for CTS-1 testing? Perhaps we can bring in some well-known workloads w/ known schedules?

Opening as a separate issue to isolate the discussion here.

@SteVwonder
Member Author

We currently have a very simple test using the simulator that launches 10 jobs and verifies their schedule under FCFS and EASY. So the framework is there to do the testing (assuming the simulator is still functioning under master, which is a separate issue that I'll be addressing this summer), but now we need some more extensive tests. I will look into this to see what I can find.

At minimum, we should be able to find emulated/simulated schedulers that run EASY, Conservative, and the hybrid algorithms. We could then throw the same set of jobs at both the Flux implementation and the other implementation and compare their schedules.

At best, there is (as @dongahn mentioned) a set of workloads and associated schedules that already exist.

@dongahn
Member

dongahn commented Jun 28, 2016

Agreed.

@SteVwonder
Member Author

@dongahn and I spoke on the phone today, and the idea of using SLURM's scheduler to verify Flux's scheduler came up. We are unsure of the exact scheduling policies that SLURM supports. @lipari's slides (specifically slide 34) make it seem that 4 years ago only EASY backfilling was supported by SLURM. @lipari, do you know what scheduling policies are currently supported by SLURM? Does it now support conservative & hybrid backfilling?

@dongahn
Member

dongahn commented Aug 12, 2016

Thanks @SteVwonder.

Context for @lipari: an easy verification mechanism for our backfill policy implementations would go a long way toward reaching our CTS-1 scheduler milestone.

Verifying FCFS is a no-brainer -- we can look at each job's starting-time timestamp within lwj.$jobid and make sure they are ordered. But I have no idea how easily I can verify the correctness of the backfill algorithms. I would like to cover EASY, CONSERVATIVE, and Hybrid with a relatively deep reservation depth (currently set to 100).
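
As a rough illustration of that FCFS check, here is a minimal Python sketch. It assumes the starting-time timestamps have already been pulled out of the lwj.$jobid entries into a plain dict keyed by job id; that input format is illustrative, not flux-sched's actual API.

```python
# Minimal sketch of the FCFS check described above: given each job's start
# time (assumed to have been extracted from the lwj.<jobid> entries
# beforehand), verify that jobs started in submission (jobid) order.

def verify_fcfs(start_times):
    """start_times: dict mapping integer jobid -> starting-time timestamp."""
    ordered = sorted(start_times.items())   # ascending jobid = submission order
    violations = []
    for (jid_a, t_a), (jid_b, t_b) in zip(ordered, ordered[1:]):
        if t_b < t_a:                        # a later submission started earlier
            violations.append((jid_a, jid_b))
    return violations

# Example: verify_fcfs({1: 100.0, 2: 110.0, 3: 105.0}) -> [(2, 3)],
# i.e., job 3 started before job 2, which pure FCFS would not allow.
```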

@lipari
Contributor

lipari commented Aug 12, 2016

The Slurm settings to enable backfill behavior which will serve as a reference are:

  • PriorityType=basic
  • SchedulerType=sched/backfill
  • SchedulerParameters=bf_...

The SchedulerParameters settings are where you will tune the backfill behavior from conservative to liberal. Look at the slurm.conf man page to see all the options. Also take a look at vulcan to see the bf_ settings we use in production.

@lipari
Contributor

lipari commented Aug 12, 2016

On further review, it looks like there is not a setting to enable EASY backfill. Even looking at the newer Slurm on vulcan, I don't see a setting which will violate the conservative rule of not scheduling a lower priority job if it will delay the start of any higher priority job.

@SteVwonder
Member Author

Thanks @lipari for the information. So it seems that we can use the current version of SLURM to validate our Conservative backfilling algorithm, but we cannot use it to test Hybrid or EASY.

I wonder if there is another scheduler with a simulator that we can test with.

@dongahn
Member

dongahn commented Aug 12, 2016

Do we have well-known large enough workloads (e.g., ESP) and expected schedules under different scheduling policies?

Perhaps we should de-couple CTS-1 scheduler milestone and correctness verification efforts.

In particular, I am using core-level scheduling (i.e., submitting jobs only with their requirements for cores, not nodes), in which case verifying correctness might be even harder.

Perhaps we should build the necessary confidence in correctness using existing workloads and expected schedules (e.g., ESP) and ensure that our schedules match the golden orders. Then we can assume that this orthogonal effort validates the correctness of our algorithms, and for the CTS-1 scheduler milestone we just focus on performance/scalability characteristics?

@SteVwonder
Member Author

SteVwonder commented Aug 12, 2016

@dongahn, that approach sounds reasonable to me.

I just found another scheduler simulator, Alea, built on top of GridSim, that seems promising:

I will look for some workloads with "golden orders".

@dongahn
Member

dongahn commented Aug 12, 2016

@SteVwonder: you might want to ping Suraj. I heard he was trying to use a similar simulator for grid computing for his dissertation and was pretty unhappy with its capabilities and software quality.

@SteVwonder
Member Author

@dongahn: will do that now

Also, just another thought, if we can come up with some invariant that each scheduling algorithm must follow, we could perform a test that the invariant holds before/after each iteration. For example, we could ensure that the start time of each reserved job only decreases (and never increases) after each iteration. This is a more involved approach, but if we pick a strong enough invariant/set of invariants, then our algorithm would be provably correct for a given set of inputs as opposed to empirically correct.
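
A minimal sketch of that invariant check follows, assuming the reserved jobs' start times can be snapshotted after each scheduler iteration into plain dicts; the snapshot mechanism and format are assumptions here, not existing flux-sched functionality.

```python
# Sketch of the proposed invariant: across consecutive scheduler iterations,
# a reserved job's start time should only move earlier (or stay the same),
# never later. Snapshots are assumed to be dicts of jobid -> reserved start
# time, one per iteration.

def check_reservation_invariant(snapshots):
    """snapshots: list of dicts (one per scheduler iteration)."""
    violations = []
    for i, (prev, curr) in enumerate(zip(snapshots, snapshots[1:]), start=1):
        for jobid, prev_start in prev.items():
            if jobid in curr and curr[jobid] > prev_start:
                # the reservation moved later between iteration i-1 and i
                violations.append((i, jobid, prev_start, curr[jobid]))
    return violations
```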

@dongahn
Member

dongahn commented Aug 12, 2016

Oh. This is an excellent idea!

@SteVwonder
Member Author

One other thought: It would be useful for both the formal verification and debugging to dump the state of the resrc after every scheduler iteration. Maybe we can add a flag to sched that would save out the resrc every iteration for later analysis?

@dongahn
Member

dongahn commented Aug 13, 2016

@SteVwonder: there is already a debug routine that can print out the resrc state. All we need is to add a sched load option to turn this on at an appropriate point. Note that this will dump lots of data -- in particular at large scale. Not sure how easy it is to do a formal verification using this data though...

@dongahn
Member

dongahn commented Aug 13, 2016

FYI -- the function is here

@SteVwonder
Member Author

SteVwonder commented Aug 13, 2016

That printing function is nice for human parsing, but not as nice for programmatic parsing. If we serialize the resrc out to JSON (which there is already a function for), then it will be easy to work with in Python. An invariant like "the reservation time of a job should be non-increasing over time" shouldn't be that hard to code up (I'm willing to do so). I just don't know if that is a strong enough invariant.

And I agree, size could be an issue for a sufficiently large system. I think it should be doable for ~100 nodes and several thousand jobs.
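
One possible shape for the parsing side, assuming the serialized resrc tree is a nested JSON object; the field names used here ("children", "reservations", "starttime") and the dump file name are hypothetical stand-ins for whatever the real serialization function emits.

```python
# Sketch of walking a (hypothetical) serialized resrc tree dumped per
# iteration and collecting jobid -> earliest reserved start time, to feed
# an invariant check like the one sketched earlier.
import json

def collect_reservations(node, out=None):
    if out is None:
        out = {}
    for jobid, info in node.get("reservations", {}).items():
        start = info.get("starttime")
        if start is not None:
            # keep the earliest reserved start time seen for each job
            out[jobid] = min(out.get(jobid, start), start)
    for child in node.get("children", []):
        collect_reservations(child, out)
    return out

# with open("resrc_iter_0042.json") as f:   # hypothetical per-iteration dump
#     snapshot = collect_reservations(json.load(f))
```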

@dongahn
Member

dongahn commented Aug 13, 2016

@SteVwonder: agreed. Does this function help?

https://github.com/flux-framework/flux-sched/blob/master/resrc/resrc_tree.c#L147

Once a JSON object is built from the top resrc, one can use the JSON stringify function to dump it?

e.g., json_object_to_json_string_ext (o, JSON_C_TO_STRING_PRETTY)

@dongahn
Member

dongahn commented Aug 13, 2016

@SteVwonder: BTW, unlike under an emulation/simulation, it just dawned on me that scheduling under a real system is not necessarily deterministic. Even if we try to control a job's running time tightly (using sleep, etc.), system noise will make it pretty difficult to control the order of resource and job events.

Say job 1, using 2 nodes, is supposed to end 10 seconds earlier than job 2, which would then allow the highest-priority reserved job (job 3, requesting 2 nodes) to run. But system noise can delay the freeing of job 1's resources, so that job 2, using 1 node, gets deallocated first. Then, instead of job 3, job 4, which only requests 1 node, may get backfilled. This kind of thing can change the schedule for the same workload from run to run. And it will be pretty prevalent for a stress-test workload.

I think this really motivates fully validating under emulation, and complementing that with some best-effort, invariant-based solution if we need additional assurance about the scheduler's behavior under a real system.

@dongahn
Member

dongahn commented Aug 13, 2016

Another Saturday-morning idea while jogging... what's deterministic is:

given a totally ordered sequence of job and resource events, each changing the current resource state, scheduling must be deterministic. So one additional idea is to not only gather the workload (jobs) but also collect these state-changing events. These could then be fed into a validator for a correctness check?
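
One possible shape for such a validator, sketched under the assumption that a totally ordered log of job submissions and resource releases is available; the event format and the reference policy (plain FCFS over a fixed node count) are illustrative only.

```python
# Sketch of the replay idea: feed the recorded, totally ordered event stream
# through a reference model of the target policy and compare the resulting
# start order with what the real scheduler produced.

def replay_fcfs(events, total_nodes):
    """events: time-ordered list of ("submit", jobid, nnodes) or ("free", jobid)."""
    free = total_nodes
    queue = []        # pending (jobid, nnodes), in submission order
    running = {}      # jobid -> nnodes
    start_order = []

    def try_start():
        nonlocal free
        # strict FCFS: only the head of the queue may start, no backfilling
        while queue and queue[0][1] <= free:
            jobid, nnodes = queue.pop(0)
            free -= nnodes
            running[jobid] = nnodes
            start_order.append(jobid)

    for ev in events:
        if ev[0] == "submit":
            queue.append((ev[1], ev[2]))
        elif ev[0] == "free":
            free += running.pop(ev[1])
        try_start()
    return start_order
```

The validator would then assert that this replayed start order matches the start order observed in the real run that produced the event log; a backfilling policy would need a richer reference model than the plain FCFS shown here.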

@dongahn
Member

dongahn commented Aug 13, 2016

Final thought. If HPC doesn't already have a paper or software in this area, this could land a pretty nice publication. Essentially, it is becoming increasingly difficult to validate whether an algorithm, or an implementation of that algorithm, correctly enforces the target scheduling policy, in particular in the midst of more constraints to consider. We could propose a conceptual model that leads to a scheduler/RM-agnostic validator and show the results using Flux as a proof by demonstration.

@dongahn
Member

dongahn commented Mar 31, 2018

The traditional emulator and other test cases currently verify the correctness.

@dongahn dongahn closed this as completed Mar 31, 2018