Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add systemd unit file, add mini auth protocol to local connector #995

Merged
merged 22 commits into from
Mar 2, 2017

Conversation

garlick
Copy link
Member

@garlick garlick commented Feb 28, 2017

This work was peeled off of #980. It adds a prototype systemd unit file and build logic for configuring its install location, which gets us part way to #967. With this it's possible to

$ sudo useradd -u 9999 flux
$ ./configure  --with-systemdsystemunitdir=/etc/systemd/system --localstatedir=/var
$ make
$ sudo make install DESTDIR=/
$ sudo systemctl start flux

Then flux commands like sudo -u flux flux dmesg work (as instance owner) but trying that as another user gets you something like

$ flux dmesg
flux: flux_open: Operation not permitted

There's a small tweak to the local connector here to return a single byte response on the wire indicating success or failure of authentication so the client, which probably sends a request first thing, gets a sensible error instead of EPIPE for the unceremonious broker side close.

Finally, ensure that the local connector socket is mode 777 so anyone can attempt to connect.

Add systemd unit file that starts a broker for the system instance,
broker.rundir to /var/run/flux.  Runs broker as user "flux" with
initial program "sleep inf".
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.03%) to 76.137% when pulling 3f306c0 on garlick:systemd into 1a1ab10 on flux-framework:master.

@codecov-io
Copy link

codecov-io commented Feb 28, 2017

Codecov Report

Merging #995 into master will increase coverage by 0.23%.
The diff coverage is 68.76%.

@@            Coverage Diff             @@
##           master     #995      +/-   ##
==========================================
+ Coverage    75.9%   76.13%   +0.23%     
==========================================
  Files         152      151       -1     
  Lines       25923    25989      +66     
==========================================
+ Hits        19676    19788     +112     
+ Misses       6247     6201      -46
Impacted Files Coverage Δ
src/broker/broker.c 65.78% <ø> (-0.05%)
src/cmd/builtin/proxy.c 72.63% <33.33%> (+0.03%)
src/modules/connector-local/local.c 67.28% <55%> (-6.26%)
src/broker/modservice.c 83.21% <69.56%> (+19.17%)
src/connectors/local/local.c 88.54% <75%> (+0.49%)
src/cmd/flux-module.c 85.03% <80%> (-3.6%)
src/common/libflux/message.c 83.57% <0%> (-0.27%)
src/common/libflux/handle.c 87% <0%> (+0.53%)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1a1ab10...2e26e33. Read the comment docs.

@garlick garlick mentioned this pull request Feb 28, 2017
6 tasks
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) to 76.111% when pulling 91ba09e on garlick:systemd into 1a1ab10 on flux-framework:master.

@grondo
Copy link
Contributor

grondo commented Feb 28, 2017

It isn't clear from the new autoconf macro if systemd is enabled by default and the (verbose) --with-systemdsystemunitdir is only needed to override the unitdir path found in pkg-config, or if one could use --wihtout-systemdsystemunitdir to disable systemd unit file installation. It would seem preferable to have a simple --with-systemd and --without-systemd, but maybe that isn't useful in practice?

BTW, not that I'm suggest we switch to it, but a quick search turned up this alternate implementation of a systemd.m4, which to me looks a bit more complete.

Sorry if you already researched this topic and arrived at the current macro as the best choice.

@garlick
Copy link
Member Author

garlick commented Feb 28, 2017

Ah, that one does look well thought out and complete. I'll go ahead and switch - the one I proposed is just one I have used before on other projects, and I think I used it because Fedora packaging guidelines led me there (by what route I don't recall).

@grondo
Copy link
Contributor

grondo commented Feb 28, 2017

Oh, ok, perhaps the long name is required by some RPM macros or something. Sorry about the diversion.

@garlick
Copy link
Member Author

garlick commented Feb 28, 2017

The long name mirrors the name of a directory in systemd.pc

systemdsystemunitdir=/lib/systemd/system

On ubuntu 16.04 LTS, daemon(7), under Installing Systemd Service Files recommends similar m4:

Packages using autoconf(1) are recommended to use a configure script excerpt like the following to determine the unit installation path during source configuration:

   PKG_PROG_PKG_CONFIG
   AC_ARG_WITH([systemdsystemunitdir],
        [AS_HELP_STRING([--with-systemdsystemunitdir=DIR], [Directory for systemd service files])],,
        [with_systemdsystemunitdir=auto])
   AS_IF([test "x$with_systemdsystemunitdir" = "xyes" -o "x$with_systemdsystemunitdir" = "xauto"], [
        def_systemdsystemunitdir=$($PKG_CONFIG --variable=systemdsystemunitdir systemd)

        AS_IF([test "x$def_systemdsystemunitdir" = "x"],
      [AS_IF([test "x$with_systemdsystemunitdir" = "xyes"],
       [AC_MSG_ERROR([systemd support requested but pkg-config unable to query systemd package])])
       with_systemdsystemunitdir=no],
      [with_systemdsystemunitdir="$def_systemdsystemunitdir"])])
   AS_IF([test "x$with_systemdsystemunitdir" != "xno"],
         [AC_SUBST([systemdsystemunitdir], [$with_systemdsystemunitdir])])
   AM_CONDITIONAL([HAVE_SYSTEMD], [test "x$with_systemdsystemunitdir" != "xno"])

This snippet allows automatic installation of the unit files on systemd machines, and optionally allows their installation even on machines lacking systemd. (Modification of this snippet for the user unit directory is left as an exercise for the reader.)

I'm thinking maybe it is OK as proposed after all?

@grondo
Copy link
Contributor

grondo commented Feb 28, 2017

Yes sorry about that

Problem: local connector allows connect to succeed, then
drops the connection abrubtly when authentication fails.
On the client end, flux_open() succeeds but the next
operation such as a flux_send() will fail with EPIPE.
This error is not particularly helpful for the user.

Add a single byte authentication result to the local wire
protocol.  Return 0 on success or errno on auth failure,
so that on failure, the client flux_open() will get EPERM.
Switch the one json-c user over to flux_request_decodef()
and drop the shortjson.h include.
Problem: messages with decoding problems are forwarded
to the broker.

If a request cannot be decoded, drop it and log the error.
Clean up the error paths in the internal request handling code.
Problem: module calls functions that exit on error like
xstrdup, xasprintf, xzmalloc, and oom.  When module is a
broker thread, the broker exits unnecessarily on these errors.

Rework error paths to aovid exit on error idiom.
@garlick
Copy link
Member Author

garlick commented Mar 1, 2017

I added some cleanup in the connector-local module, and a test to make sure the local connector auth failure returns EPERM to the user.

To make that work without requiring the ability to sudo to another user, I added a way to get/set an integer's worth of debug flags in modules (the need for something along these lines was discussed briefly in #813). There's a command flux-debug that can be used within tests to get/set debug flags per comms module, and then there's a builtin method for all modules that allows a debug mask stored in the handle under flux::debug_flags to be manipulated.

So to test the auth failure, the test calls

$ flux debug --set 1 connector-local    # enable one shot failure
$ flux comms info   # fails with EPERM
flux-comms: flux_open: Operation not permitted

In the auth code there's a check that looks like this:

    int *debug_flags = flux_aux_get (h, "flux::debug_flags");
    if (debug_flags && (*debug_flags & 1)) {
        flux_log (h, LOG_ERR, "connect by uid=%d pid=%d denied by debug flag",
                  c->ucred.uid, (int)c->ucred.pid);
        *debug_flags &= ~1; // one shot
        errno = EPERM;
        goto error;
    }

I'm not sure if this will be useful in all cases but it seemed like a fairly unobtrusive way to implement this test and establish a simple way for similar tests to be added in other places.

Add a new builtin service method that can manipulate
integer flags stored in the handle under the
"flux::debug_flags" key with flux_aux_set().
Modules may access the value with flux_aux_get().

This provides a simple way for any module to implement
test flags that can be enabled during testing, e.g.
to cover error paths.
@grondo
Copy link
Contributor

grondo commented Mar 1, 2017

That's nice!

Did you consider making flux debug a subcommand of flux module, or otherwise pick a more esoteric name? The only reason I think of this is because flux debug seems like prime command name real estate, and I could imagine a future someone would want a flux debug user or sysadmin tool, and then we'd have to change a bunch of test scripts...

@garlick
Copy link
Member Author

garlick commented Mar 1, 2017

Hmm, flux-module might be a better fit. Good idea.

@garlick
Copy link
Member Author

garlick commented Mar 2, 2017

Rebased with flux-debug folded into flux-module as flux module debug

I also converted flux-module to use liboptparse, and folded in flux-comms-stats as flux module stats.

Still todo: update flux-module(1) and add test coverage for flux module stats (there never was any).

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.09%) to 76.07% when pulling afd8d07 on garlick:systemd into 1a1ab10 on flux-framework:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 76.351% when pulling 1051455 on garlick:systemd into 1a1ab10 on flux-framework:master.

@garlick
Copy link
Member Author

garlick commented Mar 2, 2017

I think this might be ready for merge. Just added coverage for flux module stats and squashed down some incremental change.

I feel OK about the level of testing here - I was dinged by coveralls on having no test coverage for the changes to connector-local to handle ENOMEM, but that's a hard case...

@garlick
Copy link
Member Author

garlick commented Mar 2, 2017

Oh except I need a flux-module(1) update. Stand by.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 76.351% when pulling e69858c on garlick:systemd into 1a1ab10 on flux-framework:master.

@grondo
Copy link
Contributor

grondo commented Mar 2, 2017

I was dinged by coveralls on having no test coverage for the changes to connector-local to handle ENOMEM, but that's a hard case...

Yeah, we don't have a viable way to test error conditions so I usually take that into account when reviewing code coverage results. I'll take a more detailed peek at this but looks good to me so far.

garlick added 10 commits March 2, 2017 16:57
Problem: flux comms-stats is a client for builtin
comms module services.  It should really be part of flux module.

Add flux module stats subcommand, remove flux-comms-stats, and
update sharness test users.
Add support for manipulating module debug_flags via messages.
If debug_flags bit 1 is set, simulate one auth failure.
A test can set this bit and then attempt to connect
to verify that EPERM is returned to flux_open().
Problem: the $rundir/broker.pid file written out by the
broker is necessarily flawed as a mechanism for checking
if a broker is running and is no longer used by anything.

Don't create the broker pid file.
Eliminate dead code for constructing path to broker.pid file.
Eliminate code to create broker.pid file as the
local connector no longer expects it and it serves
no other purpose.
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 76.333% when pulling 6a6c7ae on garlick:systemd into 1a1ab10 on flux-framework:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 76.415% when pulling 2e26e33 on garlick:systemd into 1a1ab10 on flux-framework:master.

@grondo
Copy link
Contributor

grondo commented Mar 2, 2017

This looks good to me, is it ready now?

@garlick
Copy link
Member Author

garlick commented Mar 2, 2017

Sure.

@grondo grondo merged commit 2094ede into flux-framework:master Mar 2, 2017
@garlick garlick deleted the systemd branch March 2, 2017 22:09
@grondo grondo mentioned this pull request Mar 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants