Integrate resource-match with the new scheduler interface #468

Closed

dongahn opened this issue Jun 11, 2019 · 17 comments
Comments

dongahn (Member) commented Jun 11, 2019

Had a discussion with @grondo yesterday. Though the scheduler interface isn't where he wants it to be, we thought it would be a good idea to start doing the integration work sooner rather than later. Once #467 is merged, I plan to take a crack at this.

dongahn (Member, Author) commented Jun 11, 2019

@grondo: could you post some pointers as to which files I should start taking a look at? Thanks.

grondo (Contributor) commented Jun 11, 2019

The helper functions for scheduler integration were developed by @garlick and can be found in src/common/libschedutil.

For example uses, check out src/modules/sched-simple and t/job-manager/sched-dummy.c.

We had planned to abstract libschedutil into a scheduler-specific module interface to simplify scheduler development; however, that work has been pushed off due to other priorities. For now it might be easiest to just copy libschedutil into flux-sched. Once we transition to a more polished scheduler interface, we could remove that convenience library from flux-sched.
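If it helps, here is a rough, untested sketch of the underlying handshake those helpers wrap: a module that registers handlers for the job-manager's alloc and free requests. The topic strings and the empty response payloads below are assumptions for illustration only; check sched-simple and sched-dummy.c for the authoritative usage.

    /* Rough sketch only -- not sched-simple's actual code.  The topic
     * strings and empty response payloads are assumptions; libschedutil
     * wraps this request/response handshake for you.
     */
    #include <flux/core.h>

    static void alloc_cb (flux_t *h, flux_msg_handler_t *mh,
                          const flux_msg_t *msg, void *arg)
    {
        /* Decode the request, run the match/allocate logic, then respond.
         * (The real response carries an allocation payload, omitted here.)
         */
        if (flux_respond (h, msg, NULL) < 0)
            flux_log_error (h, "alloc: flux_respond");
    }

    static void free_cb (flux_t *h, flux_msg_handler_t *mh,
                         const flux_msg_t *msg, void *arg)
    {
        /* Release the job's resources, then respond. */
        if (flux_respond (h, msg, NULL) < 0)
            flux_log_error (h, "free: flux_respond");
    }

    static const struct flux_msg_handler_spec htab[] = {
        { FLUX_MSGTYPE_REQUEST, "sched.alloc", alloc_cb, 0 },
        { FLUX_MSGTYPE_REQUEST, "sched.free",  free_cb,  0 },
        FLUX_MSGHANDLER_TABLE_END,
    };

    int mod_main (flux_t *h, int argc, char **argv)
    {
        flux_msg_handler_t **handlers = NULL;
        int rc = -1;

        if (flux_msg_handler_addvec (h, htab, NULL, &handlers) < 0)
            return -1;
        rc = flux_reactor_run (flux_get_reactor (h), 0);
        flux_msg_handler_delvec (handlers);
        return rc;
    }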

dongahn (Member, Author) commented Jun 14, 2019

Thanks @grondo. This will be my next priority.

dongahn (Member, Author) commented Jun 16, 2019

I looked at some of the suggested files. They look great. My feeling, though, is that the scheduler loop service in flux-sched will have to be much more complex, and we will need to manage this complexity very carefully. Note that the original sched was pretty complex, with various scheduler-parameter and queueing-policy variations (plus an embedded emulator, which isn't an issue for this round), and we should use that experience to design this better. My proposal is to make a top-level subdirectory called sched, in which we use a strategy similar to resource:

  • Build up abstractions using C++ classes such as class sched_t (see the sketch after this list);
  • Build a command-line utility in sched/utilities (e.g., % sched>) that uses these abstractions so that we can test and debug more comprehensively from the CLI (as resource-query does for resource matching);
  • Build a sched-loop module as a thin layer atop the sched classes in sched/modules.
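To sketch what I have in mind (hypothetical code only; none of these class or method names exist in flux-sched yet, sched_t included), the abstraction layer could look roughly like this:

    // Hypothetical sketch only -- class and method names are placeholders,
    // not existing flux-sched interfaces.
    #include <cstdint>
    #include <memory>
    #include <string>
    #include <utility>

    struct job_t {
        uint64_t id;
        std::string jobspec;
    };

    // A queueing policy (e.g., fcfs, backfill) would be a pluggable strategy.
    class queue_policy_t {
    public:
        virtual ~queue_policy_t () = default;
        virtual int insert (std::shared_ptr<job_t> job) = 0;
        virtual int run_sched_loop () = 0;  // invoke resource matching per policy
    };

    // sched_t ties a queue policy to the resource-match infrastructure; it
    // would be wrapped both by a CLI utility (sched/utilities) and by the
    // thin sched-loop broker module (sched/modules).
    class sched_t {
    public:
        explicit sched_t (std::unique_ptr<queue_policy_t> policy)
            : m_policy (std::move (policy)) {}
        int alloc (std::shared_ptr<job_t> job) { return m_policy->insert (job); }
        int schedule () { return m_policy->run_sched_loop (); }
    private:
        std::unique_ptr<queue_policy_t> m_policy;
    };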

I also talked with @garlick and @grondo and I will copy libschedutil from flux-core to flux-sched in the corresponding location.

dongahn (Member, Author) commented Jun 16, 2019

@garlick and @grondo: I assume you won't have queueing policies other than fcfs at the job-manager level, correct? So essentially alloc requests will be issued in fcfs order from the job-manager (though priority and such can change this order).

This is fine, but I thought I should double-check.

When users want any out-of-order policy, like backfilling, at the flux-sched level, I assume sched will have to use "unlimited" to replicate the entire job queue.

Also, initially, the scheduler loop trigger events will be "alloc" and "free" only, since we have no resource events yet (e.g., additional resources joining, or some resources detected to be down and/or excluded).

garlick (Member) commented Jun 16, 2019

Correct on both counts. If this turns out to be too simplistic, let's talk.

dongahn (Member, Author) commented Jun 17, 2019

Functionality-wise, this seems okay.

So I will start to design based on these assumptions. If this turns out to be a problem, I will call for a discussion.

BTW, there are things that the current interface solves pretty nicely for me: no need for finite state machines, no need to deal with individual events, and an easy-to-implement resilience scheme.

But I realize I will probably still have to implement performance-optimization techniques like queue depth and delay scheduling at the scheduler level. This is fine.

But I will see if there are opportunities to implement those at the core level, where they could benefit all schedulers.
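To illustrate the queue-depth idea (again, a hypothetical sketch, not existing flux-sched code), the scheduler loop would only consider the first N pending jobs per pass instead of walking the whole queue:

    // Hypothetical sketch of queue-depth limiting -- not existing flux-sched code.
    #include <cstddef>
    #include <deque>

    struct pending_job_t { unsigned long long id; };

    // Stand-in for a call into resource-match; always fails here for illustration.
    static bool try_match (const pending_job_t &)
    {
        return false;
    }

    // Consider at most 'queue_depth' jobs per scheduling loop so that one
    // pass over a huge queue does not dominate scheduling latency.
    void run_sched_loop (std::deque<pending_job_t> &pending, std::size_t queue_depth)
    {
        std::size_t considered = 0;
        for (auto it = pending.begin ();
             it != pending.end () && considered < queue_depth; ++considered) {
            if (try_match (*it))
                it = pending.erase (it);  // allocated: remove from pending queue
            else
                ++it;                     // skip; a backfill policy may keep going
        }
    }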

SteVwonder (Member) commented

But I realize I will probably still have to implement performance optimization techniques like queue depth

Yeah. It would be awesome if the sched / job-manager handshake could be extended beyond just 1 and unlimited queue-depth to also support an arbitrary N.

dongahn (Member, Author) commented Jun 26, 2019

@garlick or @SteVwonder: I see from sched-dummy.c that the module load option is now --opt=ABC using optparse, as opposed to opt=ABC, which I have been using.

Did we decide to require this style of option passing for modules at this point across the board? I am implementing this part of the new qmanager service and couldn't remember which format is our requirement.

dongahn (Member, Author) commented Jun 26, 2019

@garlick: My rc1 script for qmanager currently fails because flux-core loads sched-simple by default. I can get around that by unloading sched-simple, if present, before loading qmanager in its rc1 script. Does that sound like a reasonable short-term solution?

For the long haul, though, it seems we would need a way to query whether a conflicting module has been loaded so that, if so, it can be unloaded first.
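Concretely, something like this in qmanager's rc1 is what I have in mind (the grep check is just one way to detect whether the module is loaded):

    # rc1 sketch: make way for qmanager if sched-simple is already loaded
    if flux module list | grep -q sched-simple; then
        flux module remove sched-simple
    fi
    flux module load qmanager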

garlick (Member) commented Jun 27, 2019 via email

dongahn (Member, Author) commented Jun 28, 2019

@garlick: A quick question. When you submit a jobspec with a 1h duration at this point, like

flux job submit test.t60.json

and the scheduler responds to the alloc request, does the job-manager issue the free request right away? I am seeing my free callback being called, but I wasn't sure whether this is because I'm doing something wrong or it is just expected.

dongahn (Member, Author) commented Jun 28, 2019

@garlick: also I unload sched-simple in my rc1 script for qmanager, but I'm getting the following error when I exit out of my flux instance.

2019-06-28T06:41:10.936714Z broker.err[0]: rc3: flux-module: cmb.rmmod[0] sched-simple: No such file or directory
flux-broker: module 'qmanager' was not cleanly shutdown

Any insight?

garlick (Member) commented Jun 28, 2019

Did we decide to require this style of option passing for modules at this point across the board? I am implementing this part of the new qmanager service and couldn't remember which format is our requirement.

As discussed in the meeting, it's not required, but it is easier. If using optparse, just watch out: in modules, argv[0] is the first argument, not argv[1] as in a normal program (so you need to pass argv - 1, argc + 1).
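Something along these lines (the --policy option is made up for illustration; the argc/argv shift is the part to note):

    /* Sketch of option parsing in a module's mod_main using flux-core's
     * liboptparse.  The "policy" option is hypothetical; the key detail is
     * shifting argc/argv so optparse sees the "program name" slot it expects.
     */
    #include <flux/core.h>
    #include <flux/optparse.h>

    static struct optparse_option opts[] = {
        { .name = "policy", .key = 'p', .has_arg = 1, .arginfo = "NAME",
          .usage = "Select queueing policy (illustrative option)" },
        OPTPARSE_TABLE_END,
    };

    int mod_main (flux_t *h, int argc, char **argv)
    {
        optparse_t *p;
        const char *policy;

        if (!(p = optparse_create ("qmanager")))
            return -1;
        if (optparse_add_option_table (p, opts) != OPTPARSE_SUCCESS)
            goto error;
        /* Module argv[0] is the first argument (there is no program-name
         * slot), hence argc + 1 / argv - 1 as noted above.
         */
        if (optparse_parse_args (p, argc + 1, argv - 1) < 0)
            goto error;
        policy = optparse_get_str (p, "policy", "fcfs");
        flux_log (h, LOG_INFO, "queue-policy=%s", policy);
        optparse_destroy (p);
        return 0;
    error:
        optparse_destroy (p);
        return -1;
    }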

Does the job-manager issue the free request right away?

After execution completes, and execution always completes quickly because the actual launch isn't implemented yet. There is a way to simulate execution of the full duration (with a sleep in the exec system); see t2400-job-exec-test.t.

I unload sched-simple in my rc1 script for qmanager, but I'm getting the following error when I exit out of my flux instance.

Modules are normally loaded in rc1 and unloaded in rc3, so maybe you need to provide an rc3 script also? They are not automatically unloaded.
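In other words, something like this pairing (module name per your qmanager work; illustrative only):

    # rc1: load the scheduler module
    flux module load qmanager

    # rc3: unload it again at shutdown
    flux module remove qmanager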

dongahn (Member, Author) commented Jun 28, 2019

Modules are normally loaded in rc1 and unloaded in rc3, so maybe you need to provide an rc3 script also? They are not automatically unloaded.

I do have one. I will take a look at it again, though.

BTW, if the module (sched-simple) loaded by its rc1 was unloaded by others (as in this case), presumably doing another unload by its rc3 script wouldn't lead to this error, would it?

dongahn (Member, Author) commented Jun 28, 2019

After execution completes, and execution always completes quickly because the actual launch isn't implemented yet. There is a way to simulate execution of the full duration (with a sleep in the exec system); see t2400-job-exec-test.t.

This should be very useful!

dongahn (Member, Author) commented Jul 11, 2019

PR #481 resolved this.

dongahn closed this as completed Jul 11, 2019