Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

broker: add config file boot method #1320

Merged
merged 17 commits into from
Feb 21, 2018
Merged

Conversation

garlick
Copy link
Member

@garlick garlick commented Jan 18, 2018

I'm putting this PR up early to get any feedback since I'm about to go offline for the rest of the day. It's not ready for merging yet (needs some hand testing on a cluster, as well as a sharness test).

This adds a new boot method for brokers where size, rank, session-id, and overlay endpoints are taken from a TOML config file rather than exchanged with PMI. The idea is to allow a multi-node system instance to bootstrap via systemd. Such a system instance still has many hurdles to overcome to be useful. As of now, it will not function unless all nodes listed in the config file are up, and it has no ability to recover its state if something goes wrong, or even survive an orderly shutdown.

Other than pulling boot_pmi.c out of broker.c and creating another boot method example, I didn't tackle any of the refactoring proposed as necessary in #1236. That problem to me seems co-mingled with the challenge of supporting multiple overlay topologies, since much of the current boot code is specific to TBON. Anyway maybe having this second example method on hand will help.

This pulls in the TOML config file support from flux-security. Since that still may undergo some change, I didn't try to introduce a general flux config file. Instead, this is just a config file for the boot process, and it does not contain any shortcuts that would make configuring a cluster easier. I would expect it to evolve:

session-id = "hype"

# if commented out, TBON is used for event distribution
# mcast-endpoint = "eth0;230.0.0.1:8500"

# TBON endpoints array is indexed by rank
tbon-endpoints = [
    "tcp://192.168.1.1:8500",  # rank 0
    "tcp://192.168.1.2:8500", # rank 1
    "tcp://192.168.1.3:8500", # rank 2
]

Each node is expected to have an identical config file, and finds its own endpoint address and that of its parent from the tbon-endpoints array.

The systemd unit file will need to be modified to run the broker something like this:

flux broker -Sboot.method=config -Sboot.config_file=/etc/flux/flux.conf

(I haven't tried that yet and maybe we will actually want systemd to start the broker with flux start)

Feedback welcome.

@coveralls
Copy link

coveralls commented Jan 18, 2018

Coverage Status

Coverage increased (+0.2%) to 78.87% when pulling 6e767a1 on garlick:toml_config into 6a37017 on flux-framework:master.

@codecov-io
Copy link

codecov-io commented Jan 18, 2018

Codecov Report

Merging #1320 into master will increase coverage by 0.25%.
The diff coverage is 80.38%.

@@            Coverage Diff             @@
##           master    #1320      +/-   ##
==========================================
+ Coverage    78.3%   78.55%   +0.25%     
==========================================
  Files         156      162       +6     
  Lines       28407    29606    +1199     
==========================================
+ Hits        22245    23258    +1013     
- Misses       6162     6348     +186
Impacted Files Coverage Δ
src/common/libflux/message.c 81.13% <ø> (-0.36%) ⬇️
src/common/libutil/timestamp.c 100% <100%> (ø)
src/broker/boot_pmi.c 52.71% <52.71%> (ø)
src/common/libutil/ipaddr.c 60.52% <55.88%> (-9.48%) ⬇️
src/broker/broker.c 77.49% <78.94%> (+5.15%) ⬆️
src/common/libutil/tomltk.c 82.06% <82.06%> (ø)
src/common/libtomlc99/toml.c 84.83% <84.83%> (ø)
src/broker/overlay.c 74.14% <85.71%> (-0.07%) ⬇️
src/broker/boot_config.c 86.58% <86.58%> (ø)
src/common/libutil/cf.c 89.82% <89.82%> (ø)
... and 23 more

@morrone
Copy link
Contributor

morrone commented Jan 18, 2018

I didn't tackle any of the refactoring proposed as necessary in #1236. That problem to me seems co-mingled with the challenge of supporting multiple overlay topologies, since much of the current boot code is specific to TBON.

This does in fact complete #1236. I am not at all clear on what you think was "proposed as necessary in #1236". You seem to be assuming intentions that are nowhere present in the work statement.

I think it is necessary to point out too that you did not open a detailed ticket explaining your approach to solving the problem before creating this code and making a pull request. But you did demand that of me.

I am not suggesting that you should be doing anything different in this work. I am just pointing out that if you had allowed me to work the same way that you are working, then the work would already have been complete and you wouldn't have needed to spend your own time on this current coding effort.

@garlick
Copy link
Member Author

garlick commented Jan 19, 2018

Towards the top of #1236 you mentioned modularization that was independent of the bootstrap method. I kept that to a minimum in this PR.

I tried to add enough of a high level description of the bootstrap method in #1236 before starting so you would have an idea of high level plan and scope and could comment. A new issue wasn't necessary.

@garlick garlick force-pushed the toml_config branch 7 times, most recently from 7c81bf4 to 3e34a0e Compare January 30, 2018 18:52
@garlick garlick force-pushed the toml_config branch 2 times, most recently from 01f3bc0 to 3d0e84c Compare February 6, 2018 21:36
@grondo
Copy link
Contributor

grondo commented Feb 7, 2018

@garlick, I see what might be a problem in Travis here. The sharness --chain-lint tests are not compatible in Travis with sending processes into the background for some reason. You might try the following patch and see if this fixes the current failure?

diff --git a/t/t0013-config-file.t b/t/t0013-config-file.t
index cebca9f..764db08 100755
--- a/t/t0013-config-file.t
+++ b/t/t0013-config-file.t
@@ -80,7 +80,7 @@ test_expect_success 'start size=1 with private config file' '
 # size=2 boot from config file
 #
 
-test_expect_success 'start size=2 with private config files' '
+test_expect_success NO_CHAIN_LINT 'start size=2 with private config files' '
        flux broker -Sboot.method=config \
                -Sboot.config_file=${TCONFDIR}/priv2.1.conf &
        run_timeout 5 flux broker -Sboot.method=config \

@garlick
Copy link
Member Author

garlick commented Feb 7, 2018

Thanks! Trying it on my test branch.

@garlick
Copy link
Member Author

garlick commented Feb 7, 2018

That was it! Thanks.

@grondo
Copy link
Contributor

grondo commented Feb 17, 2018

There's a bit of libtomlc99 unicode support code that is not covered by testing (also notice this in flux-security). I wonder if we should just remove that code or at least make sure it works...?

I'm running through this PR on my desktop now.
Just curious, what is the difference between overlay_bind() and overlay_connect() (there is some duplicate code in these functions that could be moved into its own static helper function.)

Otherwise, so far seems to check out ok.

Import TOML parser library and TAP test from flux-security.

Originally:
  https://github.com/cktan/tomlc99
  cktan/tomlc99@a0b45225

Plus fix for a segfaulting test:
  cktan/tomlc99@01ecb88d
  cktan/tomlc99#3

TAP test replaces upstream test.  It runs the BurntSushi test
inputs through the parser and checks for expected pass/fail.
See also: https://github.com/BurntSushi/toml-test

Add 'tar-pax' option to AM_INIT_AUTOMAKE so that long filenames
of test input files can be tarred by the dist target.
Import TOML utility library and unit test from flux-security.
Import prototype config class and unit test from flux-security.
Problem: create_rundir() is called both within boot_pmi(),
and after bootstrap.

The call within boot_pmi() was added so that the session-id
obtained from PMI could be used in the directory name,
and the rundir could be used in ipc:// socket paths.
However the session-id is no longer used in the path, so we
can make single call before bootstrap.
Problem: ipaddr_getprimary() calls log_err_exit() on error.

Change error handling behavior.  Have it return an error string
in an optional error buffer, and return -1 on failure, 0 on success.

Update broker user.
Add function to return argz list of addresses of all local
network interfaces, in stringified IP/IPv6 form.
Problem: no test coverage for ipaddr.c

Add a trivial TAP test.
Problem: the default values fo tbon.endpoint and mcast.endpoint
are not appropriate for boot methods other than PMI.

Move the code for ensuring that these attrs are set into
boot_pmi().  Refactor some repeated code into a helper function.
If a "boot.method" attribute is set to "config", bootstrap
overlay parameters from a config file instead of PMI.

If it is set to PMI or is unset, the previous PMI boot
method is executed.

If config is selected, the attribute "boot.config_file"
attribute must point to a TOML configuration file.
See boot_config.h.
Problem: PMI code and its helpers are confusing to isolate
and understand when they live in broker.c with unrelated code.

Move boot_pmi() and its helper functions to boot_pmi.[ch]
Add a config file that can be used to boot a single-node Flux
configuration.  Modify the systemd unit file so that it uses
this config file to start the local session.
Problem: fatal bootstrap errors sometimes cause segfaults
when libzmq atexit handlers are called.

The libzmq worker thread and zauth actor are started
when flux_sec_comms_init() is called.  Defer this until
overlay_bind() or overlay_connect() is called for the
first time.

This should also make sharness tests that cover fatal bootstrap
errors run a bit faster.
Problem: sharness tests that run a broker under run_timeout()
cannot actually terminate the broker if the timer runs out.

Unblock SIGALRM in the broker.
Problem: four subtests of t1105-proxy.t fail when
run on a system that is also running a Flux system
instance.

These tests use the construct
  FLUX_URI=value VAR=value ... test_must_fail command

It looks like the test_must_fail shell function
actually does not transmit the environment variable
settings to "command", and since FLUX_URI is unset
by the test script, the command attempts to connect
to the system instance and succeeds.

This example, which prints nothing and exits with a
code of 1, demonstrates the problem with the construct:

 #!/bin/sh
 foo() {
     "$@"
 }
 X=1 foo printenv X

Solution: run the command directly and test its exit
code, instead of running it with test_must_fail().
@garlick
Copy link
Member Author

garlick commented Feb 21, 2018

Rebased on current master and resynced the libtomlc99 stuff from flux-security, after flux-framework/flux-security#54 improved test coverage there.

On overlay_bind() vs overlay_connect(): The former calls zmq_bind() on endpoints that will be operating in passive mode, and the latter calls zmq_connect() on endpoints that need to actively connect somewhere.

overlay_bind() also configures zmq security in the "server role", while overlay_connect() configures itt in the "client role". The server role runs the ZAP actor that checks authentication. (Every broker actually runs one of those and has sockets operating in both roles).

Finally, overlay_bind() translates wildcard ports in endpoint URI's into actual ports. boot_pmi.c must call overlay_bind() part way through so it can obtain concrete URI's to share via the PMI KVS. boot_config.c doesn't need to do that becuse it uses a configured port number and has no mechanism to share a random one.

The overlay module does need some refactoring in general, and also to support multiple overlay topologies, but that's a topic for another PR I think.

@grondo
Copy link
Contributor

grondo commented Feb 21, 2018

Thanks. To be clear, regarding refactoring I was only talking about this block of code, duplicated between overlay_bind() and overlay_connect(), seems like the sec_initialized should be handled in a separate function:

    if (!ov->sec_initialized) {
        if (flux_sec_comms_init (ov->sec) < 0) {
            log_msg ("flux_sec_comms_init: %s", flux_sec_errstr (ov->sec));
            goto done;
        }
        ov->sec_initialized = true;
    }

E.g. move this block to overlay_sec_init (overlay_t *ov) and reduce code in each function to

    if (overlay_sec_init (ov) < 0)
        goto done;

I only noticed this because of the duplicated error handling code that is not covered by tests.

@garlick
Copy link
Member Author

garlick commented Feb 21, 2018

Ah, OK that makes sense! I'll fix.

Problem: duplicate code for initializing security
context exists in overlay_connect() and overlay_bind().

Factor out to shared private function.
@grondo
Copy link
Contributor

grondo commented Feb 21, 2018

Thanks, looks good to me. After travis reports back is this ready for merging?

@garlick
Copy link
Member Author

garlick commented Feb 21, 2018 via email

@grondo
Copy link
Contributor

grondo commented Feb 21, 2018

Looks good!

@garlick
Copy link
Member Author

garlick commented Feb 21, 2018

OK to merge IMHO.

@grondo grondo merged commit f7247dd into flux-framework:master Feb 21, 2018
@garlick garlick deleted the toml_config branch February 21, 2018 20:30
@grondo grondo mentioned this pull request May 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants