Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update albedo initialization for identical order of operations #303

Merged
merged 1 commit into from
Mar 6, 2020

Conversation

apcraig
Copy link
Contributor

@apcraig apcraig commented Mar 6, 2020

PR checklist

  • Short (1 sentence) summary of your PR:
    fix albedo initialization which resulted in one non-exact restart
  • Developer(s):
    apcraig
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    Tested on conrad full suite with 4 compilers, all pass, all bit-for-bit (results)
    Tested on conrad full suite with 4 compilers with iso1, all tests pass, a few tests fail regression due to compiler optimization related to iso1 change (not this PR) results
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on CICE or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:

With the iso1 changes, one test was failing exact restart on one machine and compiler. After some debugging, it turns out to be an order of operations issue in initialization of albedos that has been there forever but never resulted in problems with exact restart. For whatever reason, this one test on one compiler on one machine picked it up finally. It is fixed and the answers are also bit-for-bit with prior results.

I also fixed initialization of nt_bgc_S when solve_zsal is off. It was not uniquely defined but this did not seem to matter to runs. Still, it should be set.

I added some diagnostics for restart read and write to more easily confirm identical fields written and read.

I modified how the ice_diag files are saved to the log directory by attaching the run timestamp to the filenames. This was long overdue. Prior, runs in the same directory (including restart tests) just overwrote the files on each run. We may want to eventually also change the way we save baselines or rename files in the run directory as this feature is only implemented in the copy of the ice_diag files to the case log directory.

I removed an OMP directive in the albedo initialization as it will not materially help performance since the code is run once. It's also a trivially cheap piece of code.

@apcraig apcraig requested review from eclare108213 and dabail10 March 6, 2020 01:31
@apcraig
Copy link
Contributor Author

apcraig commented Mar 6, 2020

It would be good to get this fix into the repo by Friday evening for weekend testing. We may also want to update CICE prior to the weekend if we can as well. Will see how reviews go.

Copy link
Contributor

@dabail10 dabail10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there are some leftover print statements. Otherwise, it looks fine.

@@ -339,7 +339,8 @@ subroutine read_restart_field(nu,work,ndim)

minw = minval(work)
maxw = maxval(work)
write(nu_diag,*) minw, maxw
sumw = sum(work)
write(nu_diag,*) subname, minw, maxw, sumw

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still want the print statements here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were there before, and I've found them to be handy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were there before and I just extended the feature a bit. One of the first things I wanted to check was whether all the fields were read as written and whether they were identical. It's helpful to have the same diagnostics written when writing a restart as when reading it, and the sum is a new diagnostic that provides something like a checksum value of the field.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thank you @apcraig
There's no such thing as "bug-free code"... it's surprising this one hasn't popped up before.

@@ -339,7 +339,8 @@ subroutine read_restart_field(nu,work,ndim)

minw = minval(work)
maxw = maxval(work)
write(nu_diag,*) minw, maxw
sumw = sum(work)
write(nu_diag,*) subname, minw, maxw, sumw

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were there before, and I've found them to be handy.

@apcraig
Copy link
Contributor Author

apcraig commented Mar 6, 2020

Just one other comment. The problem, in some ways, is that the same code is implemented twice. I fixed it so they would compute fields with the same order of operations. But the right thing to do would be to pull out this code and create a subroutine with it where both parts of the model are calling it. This is all in the icepack driver, so it wouldn't affect icepack columnphysics. I didn't do that as it's a bit more intrusive, I thought I'd just keep the fix as simple as possible. But it's something that could or maybe should be done.

@apcraig apcraig merged commit 6a6bb3f into CICE-Consortium:master Mar 6, 2020
lettie-roach pushed a commit to lettie-roach/Icepack that referenced this pull request Oct 18, 2022
* serial fix for cheyenne

* fix hobart compile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants