-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update config_machines.xml
and corresponding files
#173
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things to change for now.
@xylar Sorry about the comment confusion, I actually had several that I wanted your feedback on. I just re-wrote them so you could see them (instead of including them, accidentally, in a review). Would you mind looking at the other questions when you get a chance? |
ffea3a4
to
6866c38
Compare
Little progress update: Just added what I think is the last of the changes from the E3SM update, and I am about to start adding all the other missing environment variables. |
11c9380
to
74adc57
Compare
Everything looks good. Let's see how testing with the |
@xylar I forgot to finish and push the last bit of the |
@xylar Just set up the conda env on |
@xylar gnu run of the pr suite on pm passed all tests, intel run failed the 10km threads test but passed all others. I will take a look at the environment variables again in the morning and see if there's anything that might have screwed up threading in particular. Here is the log file from the failed run: |
@altheaden, great! It seems like it might be worth having me explore whether commenting back in one or more of these flags makes things pass for that thread test:
In the meantime, I think we can proceed. We will just want to make a Polaris issue reporting that that particular test on that particular machine and compiler is not producing bit-for-bit results as expected. It's a little too early to add that report but if we see the same thing right before we're ready to merge the branch that updates to Polaris 0.4.0-alpha.1, we will post an issue and link to it in the comments about testing that version of Polaris. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@altheaden, in my testing today I noticed 2 things that we need to fix.
3663ae9
to
13c8563
Compare
@xylar After the if-statement change, the intel test had the same results as I got last night. |
@altheaden, yes, that's what I saw, too. I also reran after commenting in the OMP* environment variables and I still see small differences. So this is a bug we're going to have to report to MPAS-Ocean developers to investigate. |
Testing@altheaden and I have used this branch to build Polaris' spack environment for:
We have run the Polaris We ran the Omega CTests on Chrysalis with intel and gnu, and on Perlmutter-GPU with gnugpu. This required #174 and some modifications to Polaris that will be included in the next update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is amazing, @altheaden. Very nice work figuring all of this out and being so thorough.
In this PR, I updated the
config_machines.xml
file, and likewise updated the module names and versions inchrysalis_gnu_openmpi.yaml
. I also added some missing environment variables in the various.csh
and.sh
files forchrysalis
,pm-cpu
, andpm-gpu
.Checklist
Testing
comment in the PR documents testing used to verify the changes