
WeeklyTelcon_20201123

Geoffrey Paulsen edited this page Jan 19, 2021 · 12 revisions

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Webex)

  • NOT-YET-UPDATED

Release Branches

Review v4.0.x Milestones v4.0.5

  • No v4.0 rc this week.

Issue #8246: ROMIO/Lustre -

  • Thought 4.0.x was on track for an rc, but the RMs now want a better understanding of the Lustre problem. Need to refresh ROMIO to 3.3.2. Lots of changes between 3.3 and 3.3.2.
    • Pretty large delta for a release branch.
    • Want to get a better understanding of what's going on before another rc.
    • It may be that the right thing to do is put this on 4.1.x instead of 4.0.x.
  • Thinking of putting all unit tests for ROMIO into IBM folder. It might help catch this issue earlier.
  • This is the highest priority for the RMs. Howard will start testing the new ROMIO this week to see if it fixes the issue.

Issue #8217: Memory leaks -

  • Do we have a PR on this?
    • Asked the creator of the ticket; we don't think he has created a PR yet.
  • Would be easy for us to create the PR.
    • Howard/Geoff Paulsen will try to do this patch next week.
  • Almost all of this patch will apply to 4.1.x as well.

Issue #8252:

  • Thomas Naughton found an issue with UCX in OSU benchmark. Issue opened.

Review v4.1.x Milestones v4.1.0

  • Was close to an rc. Had the tarballs ready, but #8246 is holding it up now.
    • If upgrading ROMIO is part of the solution, it is best to put it in now.
  • Other than that, the RMs believe they have everything ready for an rc.
  • Going to go ahead and release an rc anyway, so please test it!
    • Not going to lose anything if the RMs do another rc with the new ROMIO.
  • Ralph is still getting a flood of warnings on v4.1.x.
    • Jeff Squyres will take a look again.

Review v5.0.0 Milestones v5.0.0

  • No updates from the RMs. Haven't met in a couple of weeks due to conflicting schedules.
  • Ralph has updated PMIx/PRRTE pointers.

Master

MTT master failures:

  • MTT compile failures with Clang.
  • Invalid window failures.
    • Jeff Squyres will ask Nathan Hjelm. These are happening because OSC pt2pt is gone.
  • Attribute tests reporting an invalid communicator.
  • Other than that, MTT looks pretty clean on master.

Other misc issues

  • Jeff: Docs issue
    • Sphinx / ReadTheDocs / RST going well. READMEs done. Working on FAQ. Man pages will come later (waiting for students to finish their part).
    • Doing some minor restructuring.
      • We could really use a definitive list in the README section (i.e., near the top of the docs) about:
        • What Operating Systems are supported
        • What Network stacks are supported
        • What versions of 3rd-party libraries are supported:
          • PMIx
          • PRRTE
          • hwloc
          • libevent
  • Jeff/George: State of the State Of the Union
  • Howard: ROMIO issue: some problem with UCX...?
  • Howard: some other smaller random issues
  • ECP update
    • George said he asked, and has not gotten an answer yet. Will re-ping.
    • Ralph would ideally want to do PMIx the same way.
    • Not a done deal in ECP land - was querying interest. If ECP doesn't happen, at worst would be a delay. No downside to trying.
    • Don't want to go too much later than January. If we start getting to Feb/March/April, that makes Supercomputing in November a little more difficult.
    • If ECP decides not to go with this, can do a stand-alone OMPI webinar in January.