Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add long term exact restart test and fixed biogeog + no competition tests to fates suite #1849

Merged
merged 28 commits into from
Oct 15, 2022

Conversation

glemieux
Copy link
Collaborator

@glemieux glemieux commented Sep 12, 2022

Description of changes

Adds two 12 month ERS tests to the fates suite along with a new test mod that combines the no competition and fixed biogeography modes as a 30 day ERS test.

Specific notes

The long term ERS test is being added to avoid missing issues related to NGEET/fates#897 in which mid-year restarts were not bfb.

While we include a fates satellite phenology test that exercises fixed biogeography + no competition modes by definition, we should provide additional test coverage for testing these two modes together without being driven by satellite data. This combination is a valid scientific mode in its own right and exercises different code paths.

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):
resolves #1839
resolves #1817
resolves #1551

Are answers expected to change (and if so in what way)? Answers are not expected to change.

Any User Interface Changes (namelist or namelist defaults changes)? No

Testing performed, if any:

TBD

NOTE: Be sure to check your coding style against the standard
(https://github.com/ESCOMP/ctsm/wiki/CTSM-coding-guidelines) and review
the list of common problems to watch out for
(https://github.com/ESCOMP/CTSM/wiki/List-of-common-problems).

@glemieux
Copy link
Collaborator Author

@ekluzek @rgknox what do you think about truncating some of these names? I have run into problems where the length of the testmod name for the reduced complexity modes has caused build errors when I run a single test with a long-ish --testid-base added. The NoComp + FBG testmod name truncation that I came up with is not great 😒

@glemieux glemieux marked this pull request as ready for review September 14, 2022 22:07
@glemieux
Copy link
Collaborator Author

@ekluzek @rgknox what do you think about truncating some of these names? I have run into problems where the length of the testmod name for the reduced complexity modes has caused build errors when I run a single test with a long-ish --testid-base added. The NoComp + FBG testmod name truncation that I came up with is not great unamused

Per a discussion at the ctsm software meeting a consensus on the truncated naming convention was achieved. ColdDef has been truncated to Cold and ReducedComplex has been removed entirely.

@glemieux
Copy link
Collaborator Author

Per @ekluzek suggestion, I've merged in #1827 into my PR branch and run aux_clm testing against the ctsm5.1.dev108 baseline. All non-fates testmods pass as expected on cheyenne and izumi, with one exception: SMS_D_Ld1_PS.f09_g17.I1850Clm50BgcSpinup.cheyenne_intel.clm-cplhist fails during the run. I am trying to resubmit that now.

The fates test mods all have BFAIL and NLCOMP differences due to the name changes in all the test mods. I manually used cprnc to check against the baselines for all the fates testmods and they all have b4b results or only differ in the FIELDLIST. The output for these checks are in the top of the cheyenne directory. Test locations:

  • cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr1827-1849
  • izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr1827-1849-2

@glemieux
Copy link
Collaborator Author

I just noticed I missed that two fates tests are failing for other reasons:

  • ERP_P144x2_Ld30.f45_f45_mg37.I2000Clm50FatesRs.cheyenne_intel.clm-mimicsFatesCold
  • ERP_P36x2_Ld30.f45_f45_mg37.I2000Clm51FatesSpCruRsGs.cheyenne_intel.clm-FatesColdSatPhen

Looking into these as well.

@glemieux
Copy link
Collaborator Author

Per @ekluzek suggestion, I've merged in #1827 into my PR branch and run aux_clm testing against the ctsm5.1.dev108 baseline. All non-fates testmods pass as expected on cheyenne and izumi, with one exception: SMS_D_Ld1_PS.f09_g17.I1850Clm50BgcSpinup.cheyenne_intel.clm-cplhist fails during the run. I am trying to resubmit that now.

I guess I missed that this is actually an expected failure:

<test name="SMS_D_Ld1_PS.f09_g17.I1850Clm50BgcSpinup.cheyenne_intel.clm-cplhist">
<phase name="RUN">
<status>FAIL</status>
<issue>#1844</issue>
</phase>
</test>

Resubmitting it resulted in it passing however.

@glemieux
Copy link
Collaborator Author

glemieux commented Sep 20, 2022

I just noticed I missed that two fates tests are failing for other reasons:

  • ERP_P144x2_Ld30.f45_f45_mg37.I2000Clm50FatesRs.cheyenne_intel.clm-mimicsFatesCold
  • ERP_P36x2_Ld30.f45_f45_mg37.I2000Clm51FatesSpCruRsGs.cheyenne_intel.clm-FatesColdSatPhen

The ERP SatPhen test is failing the restart in FATES_TVEG and FATES_TVEG24 only starting at t_index 18 an 19 respectively. The reason this is showing up now and not in the ctsm5.1.dev108 baseline is due to the fixes that @adrifoster put in place with #1827. Prior to these fixes, the user_nl_clm for this test would end with clearing the history tapes and only including the variable set in FatesColdDef testmod which doesn't include either of the above variables (whereas the current fix does include them by calling exclude and avoiding clearing the history tapes).

Testing a version of this test as an ERS without the 36x2 shows that only FATES_TVEG is not b4b on the restart. As I'm currently working on another exact restart issue, I'm inclined to think these might be related to NGEET/fates#897. Additionally, this could suggest that there is a problem with FATES_TVEG24 on the threaded restart for some reason. I'm going to test both variation of those (i.e. ERP_Ld30, and ERS_P36x2_Ld30).

@glemieux
Copy link
Collaborator Author

Testing a version of this test as an ERS without the 36x2 shows that only FATES_TVEG is not b4b on the restart. ... Additionally, this could suggest that there is a problem with FATES_TVEG24 on the threaded restart for some reason. I'm going to test both variation of those (i.e. ERP_Ld30, and ERS_P36x2_Ld30).

All the other variations result in only FATES_TVEG as being non-b4b. Not sure what it says about FATES_TVEG24 being not b4b in the ERP_P36x2 test only.

@glemieux
Copy link
Collaborator Author

glemieux commented Sep 24, 2022

The issue causing the non-b4b fates SatPhen testmod in part has been posted at NGEET/fates#908 and is due to a fates-side restart variable not being set properly for no comp modes.

The FATES_TVEG24 variable is still not b4b for this test mod.

Fixes for NEON cases

Two sets of fixes for NEON cases:

(1) Fixes for leap year handling when there are discrepancies between
the model and the DATM forcing data in terms of whether the current year
is a leap year. This involved updates to the share and cdeps externals
from Jim Edwards.

(2) Other fixes for NEON cases from Will Wieder, as documented in
ESCOMP#1860
@glemieux
Copy link
Collaborator Author

glemieux commented Oct 5, 2022

Fix for the non-b4b SatPhen testmod (NGEET/fates#908 and NGEET/fates#911) is forthcoming in NGEET/fates#914. As such, this PR will be updated with a new fates tag once integrated on the fates-side.

@glemieux
Copy link
Collaborator Author

Fates-side PR NGEET/fates#914 fix has been integrated. I will update the externals for this PR to point to the fix tag sci.1.59.7_api.24.1.0. As such, this will result in non-b4b fates testsmods, although the name changes in the PR will obscure that fact. Retesting aux_clm.

@glemieux
Copy link
Collaborator Author

Testing aux_clm on izumi and cheyenne are complete. All tests are b4b with the exception of expected fails and expected DIFFs due to the fates update. I spot check a few tests that overlap with the fates suite regression tests and the DIFFs appear consistent, particularly the model day output updates from sci.1.59.4_api.24.1.0.

The test locations are here:

  • izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr1827-1849-pr914
  • cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr1827-1849-pr914 and /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr1827-1849-pr914-2

Note that the second cheyenne location is the test for the mimicsFatesColdDef test which initially failed (in the first folder location) to create the test case due to having a mismatch in the testlist and testmod directory names.

@ekluzek will this go before or after #1735?

@ekluzek
Copy link
Collaborator

ekluzek commented Oct 13, 2022

@glemieux if you are ready to go we should plan on it coming in now. It can go either way. We have scheduled #1735 as first right now, but this can go first instead. We are at a space where whatever is ready should go next.

@glemieux
Copy link
Collaborator Author

Thanks @ekluzek. I'll update the changelog.

@glemieux
Copy link
Collaborator Author

glemieux commented Oct 14, 2022

The izumi fates suite tests are showing DIFFs consistent and expected with the DIFFs from the aux_clm test mod review (after running cprnc manually against the baseline). That said, issue NGEET/fates#701 has cropped back up in the hydro test. I'm currently planning on simply reopening the issue and adding it back into the expected fails list to be resolved with a future PR.
Folder location: /scratch/cluster/glemieux/ctsm-tests/tests_pr1827-1849-pr914-2

@glemieux
Copy link
Collaborator Author

I'm still waiting on two tests to clear the queue for the cheyenne fates test suite. Otherwise all tests are passing with BFAILS and DIFFs as expected.
Folder location: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr1827-pr1849-pr914-fates-2

@glemieux
Copy link
Collaborator Author

The last two test passed through the queue. No unexpected issues. This should be good to go now.

@ekluzek ekluzek self-assigned this Oct 15, 2022
@ekluzek ekluzek added the FATES API update Changes to the FATES version that also REQUIRE an API change in CTSM label Oct 15, 2022
@glemieux
Copy link
Collaborator Author

glemieux commented Oct 15, 2022

@ekluzek per our google meet conversation here is the test I ran for the mimicsFatesColdDef to enable a baseline comparison with ctsm5.1.dev111: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr1827-1849-pr914-2. The results are non b4b as I accidentally stated in the call; they do however match expected differences due to the fates externals update. Side note: I had been generating my baselines as ctsm5.1.dev113 just in case #1735 was going to come in prior, and have since renamed the directory to ctsm5.1.dev112.

I've renamed the testmod to match the new name convention via 84f768d. I re-ran the testmod to make sure the naming was consistent and would generate the correct name in the ctsm5.1.dev112 baseline; it built and ran successfully. Folder is here: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr1827-1849-pr914-mimics

@ekluzek ekluzek merged commit 616905b into ESCOMP:master Oct 15, 2022
@ekluzek ekluzek deleted the fates-tests-ers-fbg_nocomp branch October 15, 2022 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FATES API update Changes to the FATES version that also REQUIRE an API change in CTSM
Projects
Status: Done (non release/external)
4 participants