WeeklyTelcon_20200310
Geoffrey Paulsen edited this page Mar 17, 2020
·
1 revision
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoffrey Paulsen (IBM)
- Jeff Squyres (Cisco)
- Austen Lauria (IBM)
- Harumi Kuno (HPE)
- Joseph Schuchart
- Joshua Ladd (Mellanox)
- Michael Heinz (Intel)
- Ralph Castain (Intel)
- Todd Kordenbrock (Sandia)
- William Zhang (AWS)
- Akshay Venkatesh (NVIDIA)
- Artem Polyakov (Mellanox)
- Edgar Gabriel (UH)
- Josh Hursey (IBM)
- Noah Evans (Sandia)
- sbreyer
- Howard Pritchard (LANL)
- Brendan Cunningham (Intel)
- Thomas Naughton (ORNL)
- Scott Breyer (Sandia?)
- Nathan Hjelm (Google)
- Charles Shereda (LLNL)
- David Bernhold (ORNL)
- George Bosilca (UTK)
- Matthew Dosanjh (Sandia)
- Brandon Yates (Intel)
- Erik Zeiske
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Xin Zhao (Mellanox)
- mohan (AWS)
-
MTT -
- If you change your MTT setup to start PRRTE at the beginning of the session and then just use prun:
- Run times can be cut in half or more.
- This is good, but also need to test mpirun wrapper.
- Cisco is converting half of MPI installs to use prrte/prun
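The session pattern described above (start PRRTE once, launch every test through prun, shut down at the end) could be sketched roughly as follows. This is only an illustration and not the actual MTT configuration: the function `run_session` and its `runner` parameter are hypothetical names, and the `prte --daemonize`, `prun -n`, and `pterm` command lines are assumptions that should be checked against the installed PRRTE version.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

# A runner takes a command line and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_session(tests: Iterable[Sequence[str]],
                nprocs: int = 2,
                runner: Runner = _run) -> List[int]:
    """Start one PRRTE DVM, launch every test through prun, then stop the DVM."""
    # Assumed invocation: start the DVM once for the whole session.
    if runner(["prte", "--daemonize"]) != 0:
        raise RuntimeError("could not start the PRRTE DVM")
    results = []
    try:
        for test in tests:
            # Each test reuses the running DVM instead of paying startup cost again.
            results.append(runner(["prun", "-n", str(nprocs), *test]))
    finally:
        # Assumed invocation: terminate the DVM even if a launch raised.
        runner(["pterm"])
    return results
```

Passing a custom `runner` makes it possible to dry-run the sequence without PRRTE installed. The mpirun wrapper path mentioned above is not covered by this sketch and would still need its own runs.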
-
OMPI master submodule pointers are set up to track PMIx and PRRTE master.
- Jeff discussed an idea for integration with PRRTE: putting a particular string in a PRRTE PR would automatically open an Open MPI PR to update the PRRTE submodule once that PRRTE PR is merged to PRRTE master.
Blockers All Open Blockers
Review v3.0.x Milestones v3.0.6
Review v3.1.x Milestones v3.1.6
- Michael is interested in a schedule.
- Jeff is considering a release this week.
- Fix-segv PR: Austen didn't hit the segv on v3.0/v3.1, so deferring.
Review v4.0.x Milestones v4.0.3
- v4.0.4 in the works.
- No Schedule yet.
- Jeff is looking at a PMIx issue (something with dstore) and is working with Ralph on it.
- Schedule:
- Feature Freeze: End of April
- Release: End of June
- Austen took an initial pass at the issues and is starting a Google Sheets document of v5.0 features.
- Today we went through all of the items in the Google Sheets document (https://docs.google.com/spreadsheets/d/1OXxoxT9P_YLtepHg6vsW3-vp4pdzGQgyknNbkzenYvw/edit#gid=0), which were taken from the face-to-face wiki.
- Josh Ladd led us to gather owners and a status for each of the various tasks. Not all were in attendance so we did the best we could. We can update after we get more information.
- Biggest thing on master is prrte.
- Issues are being found and fixed.
- Cisco MTT is failing due to -np.
- Might see more failures on master tonight.
- Maybe mid-late summer. No discussion
- scale-testing, PRs have to opt-into it.
Review Master Master Pull Requests
- CI testing only checks that the build succeeds and that the test ran; it doesn't test HOW it ran.
- Environment setup can be a bit different.
- For example, no permissions in /tmp. A test might pass on one machine and fail on another without /tmp permissions.
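One way to surface this kind of environment difference early is to probe the temporary directory before the tests start. The following is a minimal sketch, not part of the existing CI; the helper name `tmp_is_usable` is hypothetical.

```python
import tempfile


def tmp_is_usable(path=None):
    """Return True if a file can be created and removed in the temp directory."""
    # Default to the system temp directory (typically /tmp on Linux).
    directory = path or tempfile.gettempdir()
    try:
        # Creating a real file exercises the same permissions a test would need.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True
```

A CI job could call `tmp_is_usable()` up front and report a clear setup error, rather than letting a later test fail for reasons unrelated to the change under test.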