Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flux-broker: all ranks appear to open log-filename attribute #2581

Closed
grondo opened this issue Dec 18, 2019 · 3 comments
Closed

flux-broker: all ranks appear to open log-filename attribute #2581

grondo opened this issue Dec 18, 2019 · 3 comments

Comments

@grondo
Copy link
Contributor

grondo commented Dec 18, 2019

While quickly working through flux start example in flux-framework/flux-workflow-examples#40, I noticed that it appears setting the log-filename attribute via -S on the broker commandline causes the logfile to be opened by all ranks, instead of just rank 0 as is documented in manual and code.

The issue appears to be that the log-file is opened when the attribute is set, and the attribute is being set before logbuf->rank, so all broker ranks are rank 0.

To verify, I used printf debugging:

diff --git a/src/broker/log.c b/src/broker/log.c
index 4982cb4..c5324d5 100644
--- a/src/broker/log.c
+++ b/src/broker/log.c
@@ -312,6 +312,8 @@ static int logbuf_set_filename (logbuf_t *logbuf, const char
     FILE *f;
     if (logbuf->rank > 0)
         return 0;
+    fprintf (stderr, "%ld: rank=%d: opening %s\n",
+            (long) getpid(), logbuf->rank, destination);
     if (!(f = fopen (destination, "a")))
         return -1;
     if (!(filename = strdup (destination))) {
@@ -689,6 +691,8 @@ int logbuf_initialize (flux_t *h, uint32_t rank, attr_t *att
     flux_log_set_appname (h, "broker");
     flux_log_set_redirect (h, logbuf_append_redirect, logbuf);
     logbuf->h = h;
+    fprintf (stderr, "%ld: logbuf_initialize: rank=%d\n",
+                     (long) getpid(), rank);
     logbuf->rank = rank;
     flux_aux_set (h, "flux::logbuf", logbuf, logbuf_finalize);
     return 0;
$ src/cmd/flux start -s 4 -o,-S,log-filename=out
flux-start: warning: setting --bootstrap=selfpmi due to --size option
31474: rank=0: opening out
31472: rank=0: opening out
31473: rank=0: opening out
31477: rank=0: opening out
31474: logbuf_initialize: rank=2
31472: logbuf_initialize: rank=0
31473: logbuf_initialize: rank=1
31477: logbuf_initialize: rank=3

I verified the following small change fixes the issue:

diff --git a/src/broker/log.c b/src/broker/log.c
index 4982cb4..b1f4e15 100644
--- a/src/broker/log.c
+++ b/src/broker/log.c
@@ -680,6 +680,7 @@ int logbuf_initialize (flux_t *h, uint32_t rank, attr_t *att
     logbuf_t *logbuf = logbuf_create ();
     if (!logbuf)
         goto error;
+    logbuf->rank = rank;
     if (logbuf_register_attrs (logbuf, attrs) < 0)
         goto error;
     if (fake_rank (h, rank) < 0)
@@ -689,7 +690,6 @@ int logbuf_initialize (flux_t *h, uint32_t rank, attr_t *att
     flux_log_set_appname (h, "broker");
     flux_log_set_redirect (h, logbuf_append_redirect, logbuf);
     logbuf->h = h;
-    logbuf->rank = rank;
     flux_aux_set (h, "flux::logbuf", logbuf, logbuf_finalize);
     return 0;
 error:

However, I'm not sure if some other initialization should also be moved up in the function (e.g. logbuf->h isn't set until the end), and also the broker rank attribute is set here (via a function called fake_rank) and I'm not sure why. Code copied below for reference:

static int fake_rank (flux_t *h, uint32_t rank)
{
    char buf[16];
    snprintf (buf, sizeof (buf), "%u", rank);
    return flux_attr_set_cacheonly (h, "rank", buf);
}

int logbuf_initialize (flux_t *h, uint32_t rank, attr_t *attrs)
{
    logbuf_t *logbuf = logbuf_create ();
    if (!logbuf)
        goto error;
    logbuf->rank = rank;
    if (logbuf_register_attrs (logbuf, attrs) < 0)
        goto error;
    if (fake_rank (h, rank) < 0)
        goto error;
    if (flux_msg_handler_addvec (h, htab, logbuf, &logbuf->handlers) < 0)
        goto error;
    flux_log_set_appname (h, "broker");
    flux_log_set_redirect (h, logbuf_append_redirect, logbuf);
    logbuf->h = h;
    flux_aux_set (h, "flux::logbuf", logbuf, logbuf_finalize);
    return 0;
error:
    logbuf_destroy (logbuf);
    /* FIXME: need logbuf_unregister_attrs() */
    return -1;
}
@garlick
Copy link
Member

garlick commented Dec 18, 2019

Ugh, sorry.

Good catch on the file opening problem.

It would not hurt to move up the logbuf->h assignment also IMHO, whether or not it has any effect.

The fake_rank() call should not be needed. The same thing happens in broker.c in create_dummyattrs() which is called right after logbuf_initialize(). I don't any reason why this needs to be set earlier than that. (I dropped it and all the tests pass).

@grondo
Copy link
Contributor Author

grondo commented Dec 18, 2019

Want to just throw the fix into your current work? Or if you'd like I could make the world's tiniest PR. (I guess I could add a test to make it more worthwhile)

@garlick
Copy link
Member

garlick commented Dec 18, 2019

Sure, I'll add it to the PR I am working on.

garlick added a commit to garlick/flux-core that referenced this issue Dec 19, 2019
Problem: when the broker is started with the
'log-filename' attribute set on the command line,
the log file is opened from all broker ranks.
This does not match the documentation or the code.

Move initialization of logbuf->rank in logbuf_initialize()
to before logbuf_register_attrs() is called.  Now the test
for rank 0 is only true on rank 0.

Move the logbuf->h initialization too for consistency.

Fixes flux-framework#2581
@mergify mergify bot closed this as completed in f0e398d Dec 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants