
qmanager to integrate with the new exec system #481

Merged: 15 commits merged into flux-framework:master from the qmanager branch on Jul 10, 2019

Conversation

@dongahn (Member) commented Jun 28, 2019

This PR does the following:

  • Define the high-level resource API so that our resource infrastructure can be better used by other module users and CLI users. This will also facilitate the creation of bindings for other languages (such as Go).
  • Implement this API for module users.
  • Incorporate schedutil from flux-core into flux-sched.
  • Introduce the new queuing policy interface. I expect that many classical queuing policies can be implemented by deriving from this class and overriding its run_sched_loop interface. Basic queuing operations are already supported by the base classes using C++ STL containers. To make certain queuing operations efficient, I decided to use std::map keyed by monotonically increasing queuing time instead of std::list (see the sketch after this list).
  • Implement the FCFS policy derived class and add a skeleton EASY policy class with no implementation as a placeholder. Note that the queuing policy interface is designed to be used by future CLI users as well as module users -- useful for testing.
  • Use these primitives to implement the first baseline version of qmanager, which provides the scheduling-loop service integrating both the new execution system within flux-core and the resource match service within flux-sched.
  • Fix a few RV1 compatibility issues, including a libjobspec fix.
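A minimal sketch of the std::map-based queuing idea described above (illustrative only; pending_queue_t and its members are hypothetical names, not the actual flux-sched classes):

#include <cstdint>
#include <map>

// Keying the pending queue by a monotonically increasing "queuing time"
// keeps jobs in arrival (FCFS) order while still allowing O(log n)
// removal of an arbitrary job by key -- something std::list cannot offer.
class pending_queue_t {
public:
    uint64_t insert (uint64_t jobid) {
        uint64_t t = m_tick++;          // key encodes arrival order
        m_pending[t] = jobid;
        return t;
    }
    int remove (uint64_t t) {           // e.g., on job cancellation
        return m_pending.erase (t) ? 0 : -1;
    }
    bool pop_front (uint64_t &jobid) {  // next job to schedule
        if (m_pending.empty ())
            return false;
        jobid = m_pending.begin ()->second;
        m_pending.erase (m_pending.begin ());
        return true;
    }
private:
    uint64_t m_tick = 0;
    std::map<uint64_t, uint64_t> m_pending;  // queuing time -> jobid
};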

Resolves issues #480, #468, #471, #483, and #477.

@dongahn changed the title from "WIP: skeleton qmanager to integrate the scheduler with resource" to "WIP: skeleton qmanager to integrate the scheduler with new exec system" on Jun 28, 2019
@codecov-io commented Jun 28, 2019

Codecov Report

Merging #481 into master will decrease coverage by 1.42%.
The diff coverage is 61.79%.


@@            Coverage Diff             @@
##           master     #481      +/-   ##
==========================================
- Coverage   76.31%   74.89%   -1.43%     
==========================================
  Files          45       60      +15     
  Lines        5535     6074     +539     
==========================================
+ Hits         4224     4549     +325     
- Misses       1311     1525     +214
Impacted Files Coverage Δ
resource/hlapi/bindings/c++/reapi_cli_impl.hpp 0% <0%> (ø)
qmanager/policies/queue_policy_easy_impl.hpp 0% <0%> (ø)
qmanager/policies/queue_policy_easy.hpp 0% <0%> (ø)
qmanager/policies/queue_policy_fcfs.hpp 100% <100%> (ø)
resource/libjobspec/jobspec.hpp 100% <100%> (ø) ⬆️
qmanager/policies/queue_policy_fcfs_impl.hpp 100% <100%> (ø)
qmanager/policies/base/queue_policy_base.hpp 100% <100%> (ø)
resource/writers/match_writers.cpp 93.71% <100%> (+0.11%) ⬆️
resource/traversers/dfu_impl.hpp 100% <100%> (ø) ⬆️
src/common/libschedutil/free.c 100% <100%> (ø)
... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3ed3da7...a906913.

@dongahn force-pushed the qmanager branch 2 times, most recently from 888b3fe to d35cf64, on July 4, 2019 09:24
@dongahn changed the title from "WIP: skeleton qmanager to integrate the scheduler with new exec system" to "qmanager to integrate with the new exec system" on Jul 4, 2019
@dongahn force-pushed the qmanager branch 5 times, most recently from 5a74cb9 to a805ed2, on July 4, 2019 18:58
@dongahn requested review from SteVwonder, garlick and grondo on July 4, 2019 19:13
@dongahn (Member Author) commented Jul 4, 2019

@SteVwonder, @garlick and maybe @grondo: this PR has reached a reasonable stopping point for review. The test coverage went down a bit because this is intermediate work ahead of our end-of-July milestone and, as such, contains a fair amount of placeholder code.

@garlick (Member) commented Jul 4, 2019

Just some general comments, initially:

  • You have two modules named qmanager and resource. Should they be named with a common prefix to indicate that they are part of a single scheduler implementation, like sched-full-qmanager and sched-full-resource? (substitute something more creative and cool sounding for "full" :-)
  • Your qmanager jobmanager_hello_cb needs to be filled in so that resources that are already allocated when the module loads can be marked allocated.
  • Your qmanager jobmanager_exception_cb needs to be filled in so that flux job cancel works for jobs that have an outstanding alloc request.
  • Functions in qmanager.cpp look a bit chatty, with LOG_INFO messages for every message handled. Perhaps LOG_DEBUG?
  • Have you tried running this end to end with flux-core?
  • I see a ctx->queue->insert (job) call that isn't checked for failure. What happens to the broker if this hits a C++ exception, or am I missing how exceptions are handled?

@dongahn (Member Author) commented Jul 4, 2019

Thank you for the quick review. Note that this is preliminary work, so some of the logic is only a placeholder. More will come as part of the next sprint.

> You have two modules named qmanager and resource. Should they be named with a common prefix to indicate that they are part of a single scheduler implementation, like sched-full-qmanager and sched-full-resource? (substitute something more creative and cool sounding for "full" :-)

I can certainly do this. I was initially a bit ambivalent because resource can be used as a standalone service independent of qmanager. I will create an issue to have a bit more discussion. But it doesn't have to be a part of this PR, does it?

> Your qmanager jobmanager_hello_cb needs to be filled in so that resources that are already allocated when the module loads can be marked allocated.

Yes. This will be done as part of the end-of-July milestone, and it will be further utilized for resilience, as logged in a scheduler resiliency ticket, later on.

> Your qmanager jobmanager_exception_cb needs to be filled in so that flux job cancel works for jobs that have an outstanding alloc request.

Same as above.

> Functions in qmanager.cpp look a bit chatty, with LOG_INFO messages for every message handled. Perhaps LOG_DEBUG?

Ok. Good feedback. I will change some of the messages to LOG_DEBUG as part of this PR.

> Have you tried running this end to end with flux-core?

Yes, I did some manual testing, and the test case demonstrates this.

> I see a ctx->queue->insert (job) call that isn't checked for failure. What happens to the broker if this hits a C++ exception, or am I missing how exceptions are handled?

I will double-check. I generally don't want to raise an exception unless necessary. And I think qmanager also needs a top-level exception catch clause. (Will add that.)

@garlick (Member) commented Jul 4, 2019

> I think qmanager also needs a top-level exception catch clause. (Will add that.)

That sounds like the right thing!

All good on filling in stuff and considering the name later as far as I'm concerned.

@dongahn (Member Author) commented Jul 4, 2019

Thanks @garlick. I should say that the new RPC-based scheduler interface and infrastructure made my job far easier! Great work.

@dongahn (Member Author) commented Jul 5, 2019

> I will double-check. I generally don't want to raise an exception unless necessary. And I think qmanager also needs a top-level exception catch clause. (Will add that.)

I am using an STL map and a few of its methods: erase, insert, find and empty. The only methods that can throw an exception are erase and find when called with a key (rather than an iterator) as the argument, and even then they merely propagate an exception thrown by the comparator.

Now, I just use the default comparator, std::less<Key>, which doesn't throw exceptions. So all of these should be pretty much exception-free. But there could be std::bad_alloc exceptions and the like, which can be thrown at object-creation time, so I will wrap the entire mod_main with

try { /* ... mod_main body ... */ }
catch (std::exception &e) {
    // log and fail gracefully rather than crash the broker
}
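For concreteness, a minimal sketch of such a top-level catch, assuming the standard flux-core module entry point (the body shown is only a placeholder):

#include <exception>
#include <flux/core.h>

extern "C" int mod_main (flux_t *h, int argc, char **argv)
{
    try {
        // create the qmanager context, register the callbacks,
        // and run the reactor ...
        return 0;
    }
    catch (std::exception &e) {
        // keep a stray exception (e.g., std::bad_alloc) from
        // unwinding into the broker
        flux_log_error (h, "%s", e.what ());
        return -1;
    }
}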

dongahn added 4 commits July 8, 2019 14:20
Add initial support for the high-level resource API.
Two use cases:

1. The future queue manager support will need to
interact with both match RPCs (when used in a service module)
and CLIs (when used in a command-line-based tester).
We plan to hide this software complexity by
making the queue manager class templated and
instantiating it with different resource API
types (module vs. CLI).

2. @cmisale's project needs to layer our resource
infrastructure with Go, as required by Kubernetes.
The current low-level C++ API set doesn't serve
this case well. Instead, high-level APIs
with C bindings will significantly help with
that effort.

The hlapi/bindings/c++ directory contains the main
code. So that it can be used with templated classes,
our C++ API is a header-file-only solution.

The hlapi/bindings/c directory contains the C APIs.
They are simply wrappers around the C++ APIs: while
the APIs themselves are C, their implementation is
C++ (a sketch of this wrapping pattern follows below).

While this adds the necessary structure and API
definitions for module APIs (with RPC) and CLIs,
we only have the module implementation. In other
words, we only have placeholders for the CLI APIs
in both the C and C++ bindings.
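As an illustration of the wrapping pattern described above (all names here are hypothetical, not the actual hlapi symbols): the C-visible API hands out an opaque handle whose implementation is a C++ object.

#include <new>

class reapi_module_t {                 // C++ implementation class
public:
    int match_allocate (const char *jobspec) {
        // would issue a match RPC to the resource module ...
        return jobspec ? 0 : -1;
    }
};

struct reapi_handle_t {                // opaque struct for C callers
    reapi_module_t impl;
};

extern "C" reapi_handle_t *reapi_create (void)
{
    return new (std::nothrow) reapi_handle_t ();
}

extern "C" int reapi_match_allocate (reapi_handle_t *h, const char *jobspec)
{
    if (!h)
        return -1;
    return h->impl.match_allocate (jobspec);
}

extern "C" void reapi_destroy (reapi_handle_t *h)
{
    delete h;
}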
@dongahn (Member Author) commented Jul 8, 2019

OK. Rebased onto upstream/master and then pushed.

@dongahn (Member Author) commented Jul 8, 2019 via email

@SteVwonder (Member) left a comment

Thanks @dongahn. Generally LGTM! A few in-line comments below.

}


std::map<uint64_t, flux_jobid_t>::iterator queue_policy_base_impl_t::
SteVwonder (Member):
Can you add a comment noting that the return value is the next element in the queue on success (and the current element when unsuccessful)? It makes sense after looking up the return semantics of std::map::erase, but being mostly unfamiliar with the C++ STL, this behavior was initially surprising to me.
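For reference, a minimal illustration of the std::map::erase return semantics in question (C++11 and later; a standalone example, not the flux-sched code):

#include <cstdint>
#include <map>

int main ()
{
    std::map<uint64_t, uint64_t> q { {0, 100}, {1, 101}, {2, 102} };
    // erase (iterator) returns an iterator to the element following
    // the erased one, so a caller can keep walking the queue safely.
    auto next = q.erase (q.find (1));
    return (next->first == 2) ? 0 : 1;  // next now points at key 2
}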

dongahn (Member Author):
Great point! I somehow responded to this via an email. Will do.

@SteVwonder (Member) commented Jul 9, 2019:
> I somehow responded to this via an email.

That was my fault. I accidentally posted it as a top-level comment, hit delete, and then re-did it as a review comment. I forgot that comments spawn emails. Sorry about that.

* Boolean indicating if you want to use the
* allocated job queue or not. This affects the
* alloced_pop method.
* \return 0 on success; -1 on error.
SteVwonder (Member):
In one of the implementations, run_sched_loop returns rc1 + rc2. Maybe update this documentation to say < 0 on error?

dongahn (Member Author):
Yes, I will do this. But as I said before, this is a loose end to tighten in the next step. Sorry for the somewhat WIP nature of this, but I favor two-week sprints over month-long sprints :-//
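For illustration, the return-value documentation above might be revised along these lines (suggested wording only):

 * \return 0 on success; < 0 on error. Implementations may
 *         accumulate per-queue return codes (e.g., rc1 + rc2),
 *         so callers should test for rc < 0 rather than rc == -1.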

@SteVwonder (Member) commented:

Oh, one other question I had while reading through the code: the need for duplicating all of the interfaces for CLI support wasn't clear to me. It'll probably become clear once there is an implementation filled out, but in the meantime, can you briefly explain your plans for the CLI piece? Will it still use RPCs etc. to interface with the resource module, or are you planning on running the resource-query (or similar) tool(s) as subprocesses of the qmanager?

@dongahn (Member Author) commented Jul 9, 2019

@SteVwonder: I pushed some more commits to address all of your review comments. This revision has changes addressing both your comments and @garlick's. So, if Travis turns green and you are okay with the latest changes, I will squash the latest commits, and then this PR should be good to go. Thanks!

dongahn added 11 commits July 9, 2019 17:05
Add the base queue-policy interface in qmanager/policies/base.
(Flux::queue_manager namespace).

Add two policy source files (FCFS and EASY)
into qmanager/policies in the Flux::queue_manager::detail
namespace. Provide an implementation for the FCFS queuing policy.

These are header-file-only solutions because some
of the core classes are templated with respect to the
high-level resource API types (see the sketch after
these commit notes).
Also make the value type of rank within the rlite writer
a string, to match the RV1 spec.
Also add qmanager support to the sched sharness script.
Add support for the upcoming RFC 14 & 24 revisions.

Adjust resource's traverser accordingly.
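A minimal sketch of the templating pattern that forces the header-file-only layout mentioned above (hypothetical names; the real classes live in qmanager/policies):

// Because the policy class is templated on the high-level resource
// API type, its full definition must live in a header so it can be
// instantiated for each API flavor (module RPCs vs. CLI).
template<class reapi_t>
class queue_policy_fcfs_t {
public:
    int run_sched_loop (void *h) {
        // walk the pending queue in FCFS order, requesting matches
        // through whichever resource API this class was built with ...
        return reapi_t::match_allocate (h, "jobspec placeholder");
    }
};

// One possible instantiation target (illustrative):
struct reapi_module_t {
    static int match_allocate (void *h, const char *jobspec) {
        return (h && jobspec) ? 0 : -1;  // would send an RPC in reality
    }
};

using module_fcfs_t = queue_policy_fcfs_t<reapi_module_t>;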
@SteVwonder (Member) left a comment

Thanks @dongahn! LGTM.

@dongahn (Member Author) commented Jul 10, 2019

OK. I squashed those later commits. Once Travis turns green, this is good to merge, @SteVwonder. Thanks.

@grondo (Contributor) left a comment

@dongahn, I didn't have any real comments on a quick perusal, besides a couple of log messages that probably aren't needed and might clog up the logs.

if (flux_msg_get_userid (msg, &userid) < 0)
return;

flux_log (h, LOG_INFO, "alloc requested by user (%u).", userid);
grondo (Contributor):

I'd probably remove this informational message before merging. Since all alloc requests will come from the instance owner, this message will be the same for every job request.

if (flux_msg_get_userid (msg, &userid) < 0)
return;

flux_log (h, LOG_INFO, "free requested by user (%u).", userid);
grondo (Contributor):

Similar to the above, this informational message will likely not add much to the logs.

@dongahn (Member Author) commented Jul 10, 2019

@grondo: Thanks. Actually, I was concerned about these per-job messages myself. I plan to go over the logging messages across both qmanager and resource in a later PR. Given that this is an intermediate PR, can this go in as is, with a later PR addressing the issue for all of them?

@grondo (Contributor) commented Jul 10, 2019

Fine with me, I didn't consider them mandatory changes.

@dongahn (Member Author) commented Jul 10, 2019

Thanks @grondo and @SteVwonder!

@SteVwonder SteVwonder merged commit ef69e63 into flux-framework:master Jul 10, 2019
@dongahn (Member Author) commented Jul 10, 2019

Yeah! Thanks!
