
change soil moisture initialization logic for FATES configurations #1962

Merged
10 commits merged into ESCOMP:master on Apr 5, 2023

Conversation

ckoven
Contributor

@ckoven ckoven commented Mar 7, 2023

This PR changes the logic for soil moisture initialization to initialize with wetter soils (75% of saturated water content, as opposed to 15% of absolute water content) for all FATES configurations. The rationale for this is that in FATES-nocomp simulations, @JessicaNeedham was finding very high initial mortality rates in some seasonal tropical forest regions, and she traced it back to the initial soil moisture killing off plants before the ecosystem could get established. More info at NGEET/fates#994.

Description of changes

This soil moisture value was already in use for FATES-Hydro; this PR just updates it to be used for all FATES configurations. I've run it and the code does the intended thing, which is to make soils considerably wetter across the world during the initial period of a cold-start spinup.
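Roughly, the change amounts to the following (a minimal sketch, not the literal diff; the subroutine name is made up for illustration, while h2osoi_vol, watsat, and use_fates follow CTSM naming conventions):

! Sketch of the cold-start soil moisture initialization change (illustrative only)
subroutine init_h2osoi_vol_sketch(use_fates, nlevsoi, watsat, h2osoi_vol)
  use shr_kind_mod, only : r8 => shr_kind_r8
  logical , intent(in)  :: use_fates       ! any FATES configuration active?
  integer , intent(in)  :: nlevsoi         ! number of hydrologically active soil layers
  real(r8), intent(in)  :: watsat(:)       ! volumetric water content at saturation (porosity)
  real(r8), intent(out) :: h2osoi_vol(:)   ! initial volumetric soil water (mm3/mm3)
  integer :: j
  do j = 1, nlevsoi
     if (use_fates) then
        ! new behavior: 75% of saturation for all FATES configurations, as FATES-Hydro already did
        h2osoi_vol(j) = 0.75_r8 * watsat(j)
     else
        ! previous (and still non-FATES) behavior: a fixed, fairly dry 15% absolute value
        h2osoi_vol(j) = 0.15_r8
     end if
  end do
end subroutine init_h2osoi_vol_sketch

The key difference is that the FATES branch scales with each layer's porosity rather than using a single dry value everywhere.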

The thing I am curious about is whether there are any longer-term implications to the soil moisture, in particular in the permafrost region. I haven't tested this for a long period yet, but I wonder how much the initial soil moisture gets locked in for permafrost soil layers, and thus how sensitive they are to initial conditions even after long spinup? @swensosc and @dlawrenncar probably know the answer to this.

Specific notes

Contributors other than yourself, if any: @JessicaNeedham

CTSM FATES Issues Fixed (include github issue #): NGEET/fates#994

Are answers expected to change (and if so in what way)? Yes, in all non-hydro FATES configurations (the change to initial soil moisture is already the case for FATES-hydro and so wouldn't change anything there)

Any User Interface Changes (namelist or namelist defaults changes)? No

Testing performed, if any:

@JessicaNeedham has done more complete testing in E3SM-FATES to show that it fixes the initial high mortality bias. I have run this code in CLM-FATES and it does increase soil moisture as intended. I have not run it through any test suites.

@ekluzek ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Mar 7, 2023
@wwieder
Contributor

wwieder commented Mar 9, 2023

We're wondering whether we can / should keep the HLM and FATES initialization the same to avoid confusion for users. This seems like it warrants a more science-focused discussion with both FATES and HLM teams.

@wwieder wwieder removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Mar 9, 2023
@ekluzek
Collaborator

ekluzek commented Mar 9, 2023

We also wondered if this shouldn't be a namelist item of some sort to make it more obvious what the starting state is being set to.

@ckoven
Contributor Author

ckoven commented Mar 9, 2023

Right. In principle the initial soil moisture is another hidden parameter that could be brought into the PPE infrastructure. In email discussions with @dlawrenncar and @swensosc, it does seem like changing the initial amount will lead to changes that persist over the long term in the permafrost soil thermal and hydrologic dynamics. So if the goal is to use consistent values between FATES and non-FATES configurations, that sensitivity would need to be explored. The very dry values we are using now are leading to problems in FATES runs; since non-FATES configurations don't really represent mortality, they aren't really sensitive to the issue.

@swensosc proposed one possibility of initializing at field capacity, so a fixed matric potential rather than volumetric water content. That would seem like a good long-term solution to me, to initialize to some uniform matric potential, with a default value of field capacity, but also make the initial matric potential a PPE parameter that could be explored. But again, that would lead to changes in behavior as compared to the current simulations.
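To make that concrete, something along these lines could work, assuming the Clapp-Hornberger retention parameters the model already carries (sucsat, bsw, watsat); the subroutine name and the specific field-capacity value are illustrative only, and nothing like this is implemented in this PR:

! Hypothetical sketch: initialize each layer at a uniform matric potential and
! invert the Clapp-Hornberger retention curve to get the volumetric water content.
subroutine init_h2osoi_from_psi_sketch(nlevsoi, psi_init, sucsat, bsw, watsat, h2osoi_vol)
  use shr_kind_mod, only : r8 => shr_kind_r8
  integer , intent(in)  :: nlevsoi
  real(r8), intent(in)  :: psi_init        ! target matric potential (mm, negative); e.g. about -3400 mm (~ -33 kPa, a common field-capacity convention)
  real(r8), intent(in)  :: sucsat(:)       ! magnitude of saturated matric potential (mm, positive)
  real(r8), intent(in)  :: bsw(:)          ! Clapp-Hornberger "b" exponent
  real(r8), intent(in)  :: watsat(:)       ! porosity
  real(r8), intent(out) :: h2osoi_vol(:)   ! initial volumetric soil water (mm3/mm3)
  integer :: j
  do j = 1, nlevsoi
     ! Clapp-Hornberger: psi = -sucsat * (theta/watsat)**(-b)
     ! inverted:         theta = watsat * (psi / (-sucsat))**(-1/b)
     h2osoi_vol(j) = watsat(j) * ( psi_init / (-sucsat(j)) )**( -1.0_r8 / bsw(j) )
     h2osoi_vol(j) = min( h2osoi_vol(j), watsat(j) )   ! never exceed saturation
  end do
end subroutine init_h2osoi_from_psi_sketch

The initial matric potential (psi_init here) would then be the natural knob to expose as a PPE parameter.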

@glemieux
Collaborator

We're wondering whether we can / should keep the HLM and FATES initialization the same to avoid confusion for users. This seems like it warrants a more science-focused discussion with both FATES and HLM teams.

Per the ctsm software meeting discussion today, we agreed to bring in this PR as is and continue the broader science-focused discussion in a new repo issue/discussion. I'll use this PR as a vehicle to bring fates up to the latest API as well.

@glemieux glemieux self-assigned this Mar 23, 2023
Update externals and minor fixes

Main change is to update externals to cesm2_3_alpha12c-ish.

Doing this exposed a few issues that are also fixed here.

Also, reduce GU_LULCC tests down to a single test.
@glemieux
Collaborator

Testing the fates suite on Cheyenne is underway. I'll follow this up with testing on izumi and the aux_clm suite tomorrow assuming the fates results look as expected.

@glemieux
Collaborator

glemieux commented Mar 28, 2023

Almost all fates tests are passing, with DIFFs on all tests except the Hydro test mods, which are B4B; this is all as @ckoven expected.

That said, one of our longer, higher-resolution exact restart tests (ERS_Lm13.f10_f10_mg37.I2000Clm50Fates.cheyenne_gnu) is barely failing to complete due to hitting the wallclock limit. Previous tests show that this test was already skirting the limit, and with more cohorts surviving, the computation time for fates has increased, pushing the test beyond the wallclock limit. For the short term, I'm going to slightly increase the wallclock to see if the result passes as expected. For the long term, I think it might be good to revisit this particular test's overlap with others to see if we should change it, since it's so long.

@glemieux
Collaborator

glemieux commented Mar 28, 2023

That said, one of our longer, higher-resolution exact restart tests (ERS_Lm13.f10_f10_mg37.I2000Clm50Fates.cheyenne_gnu) is barely failing to complete due to hitting the wallclock limit. Previous tests show that this test was already skirting the limit, and with more cohorts surviving, the computation time for fates has increased, pushing the test beyond the wallclock limit. For the short term, I'm going to slightly increase the wallclock to see if the result passes as expected.

Re-running this test with a greater wall clock allowed it to complete. It ran as expected taking 42.6 minutes.

Collaborator

nice

Collaborator

@rgknox rgknox left a comment

I can't think of anything else this needs. Thanks for also updating the FATES tag and associated parameter file.

@glemieux
Collaborator

aux_clm regression testing on cheyenne is complete. All fates tests returned DIFFs as expected since we are updating to the latest fates science tag.

One test passed although it was expected to fail: FUNITCTSM_P1x1.f10_f10_mg37.I2000Clm50Sp.cheyenne_intel

There was one unexpected failure, FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel, which failed to initialize during the SHAREDLIB_BUILD step. I'm not sure what happened here. Have you ever seen this @ekluzek?

Folder on cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_dev121-aux
Izumi testing coming soon.

@ekluzek
Collaborator

ekluzek commented Mar 30, 2023

@glemieux odd that the UNIT tests worked for you, but they had been failing for @billsacks. Maybe CISL fixed something on the system? Or is it a user-dependent thing?

On the other issue it's failing in this set of commands...

. /glade/scratch/glemieux/ctsm-tests/tests_dev121-aux/FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel.GC.dev121-aux_int/.env_mach_specific.sh; module unload python; module load conda; conda activate ctsm_pylib && python3 /glade/u/home/glemieux/ctsm/tools/modify_input_files/fsurdat_modifier /glade/scratch/glemieux/ctsm-tests/tests_dev121-aux/FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel.GC.dev121-aux_int/modify_fsurdat.cfg

Try running each of those commands separately on the command line and see what fails. You might also just

./py_env_create
conda activate ctsm_pylib
cd python
make test

and see what happens...

@glemieux
Collaborator

Thanks @ekluzek. It looks like everything worked as expected:

(ctsm_pylib) ✔  ~/ctsm/python [modinitsoilwater {ckoven/modinitsoilwater } ↓·1|⚑ 12] 
11:46 $ make test
python3 ./run_ctsm_py_tests  --unit
.....................................................................................................................................................................
----------------------------------------------------------------------
Ran 165 tests in 0.988s

OK
python3 ./run_ctsm_py_tests  --sys
...............Setting resource.RLIMIT_STACK to -1 from (307200000, -1)
Done converting /glade/scratch/glemieux/tmp_otnm7it/scrip.nc
.
----------------------------------------------------------------------
Ran 16 tests in 12.257s

OK

Maybe this was a glitch of some sort? Should I try and manually run ./case.build for that test?

@ekluzek
Collaborator

ekluzek commented Mar 30, 2023

Possibly what happened was that you hadn't run py_env_create before, so you needed to run it by hand? Could that be the case?

We have a long list of things that happen in that line in the script. Possibly the script should do them separately, so you know which one fails.

@glemieux
Collaborator

Oh, I didn't realize we needed to run py_env_create prior to run_sys_tests -s aux_clm. Loading the ctsm_pylib env and then re-running ./case.build in the test dir worked. I was able to then submit and that test ran successfully and returned B4B.

@ekluzek
Collaborator

ekluzek commented Mar 30, 2023

Yep, this is a new thing that you need to do. We probably need more documentation and advice about the need to do it.

@glemieux
Collaborator

I'm running into an issue with the nag compiler tests on izumi. I'm getting the following error:

Error: /home/glemieux/ctsm/src/fates/main/FatesHistoryInterfaceMod.F90, line 2478: Too many continuation lines

We've definitely increased the number of variables in the association statement since the last fates science tag update. For comparison I was able to create a baseline tag for fates using fates-sci.1.61.0_api.25.0.0-ctsm5.1.dev120, which is the current tag that ctsm is pointing to in the externals. I've never seen this before. Have you @rgknox, @ekluzek?

@ekluzek
Collaborator

ekluzek commented Mar 30, 2023

@glemieux yes I saw this with the CN matrix work. There is a limit for nag on how many continuation lines you can have in those associate statements. What we had to do for CN matrix was to use more than one associate statement, nested one inside the other. It's annoying but not that much different; it just adds a couple of end associate statements at the close of the subroutine.
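Schematically, the workaround looks like this (illustrative names only, not the actual FatesHistoryInterfaceMod code):

! Illustrative only: split one huge ASSOCIATE into two nested ones so that no
! single statement exceeds the NAG continuation-line limit.
subroutine history_update_sketch(a, b, c, d)
  use shr_kind_mod, only : r8 => shr_kind_r8
  real(r8), intent(inout) :: a(:), b(:), c(:), d(:)
  associate( &              ! first half of what used to be one long association list
       hio_a => a, &
       hio_b => b )
    associate( &            ! second half, nested inside the first
         hio_c => c, &
         hio_d => d )
      ! ... subroutine body uses hio_a ... hio_d exactly as before ...
      hio_a(:) = hio_b(:) + hio_c(:) * hio_d(:)
    end associate           ! one extra "end associate" at the close of the routine
  end associate
end subroutine history_update_sketch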

@glemieux
Collaborator

I'll be testing NGEET/fates#1009 and updating the externals pointer to the integrated fix for the above issue prior to continuing with this PR.

@billsacks
Member

Regarding the FUNIT tests: Brian Vanderwende in CISL made a system change that should get them working again. He said:

Jim is right that 4.9.0 added a libxml2 dependency that has been problematic on Cheyenne (as there is no "libxml2.so" on the compute image - though there is "libxml2.so.2"). We account for this after 4.9.0 and so using 4.9.1 should work fine as Jim mentions.

That said, I've put a hack in that should hopefully fix the issue for 4.9.0 as well. Let me know how it goes for you.

So I have closed #1972. @glemieux I opened a PR with some changes that should be merged now that this is resolved: #1978 . Can you either (a) cherry-pick the single commit on that branch into the branch you're working on here, or (b) give me the go-ahead to merge #1978 when it won't interfere with your work?

@glemieux
Collaborator

glemieux commented Apr 3, 2023

Thanks for the heads up @billsacks. I'll cherry-pick your commit into this PR.

billsacks and others added 2 commits April 3, 2023 16:50
- Revert "Add Fortran unit tests as a separate test to run". This
reverts commit 2ee9024.

- Revert "Add FUNITCTSM test to expected fails list". This reverts
commit 3cd1351.
This tag includes a fix to avoid line continuation issues seen on izumi.
@glemieux
Collaborator

glemieux commented Apr 4, 2023

Testing on Cheyenne is complete. All expected non-fates tests pass B4B for the aux_clm suite.

Folders on Cheyenne:

/glade/u/home/glemieux/scratch/ctsm-tests/tests_dev121-aux-1

UPDATE: waiting on fates suite test results. All non-hydro tests should be non-bfb.

@glemieux
Collaborator

glemieux commented Apr 4, 2023

Testing on izumi is complete. As expected, all non-fates tests are b4b for the aux_clm suite. All non-hydro tests are not b4b, as expected, for the fates suite.

Folders on Izumi:

/scratch/cluster/glemieux/ctsm-tests/tests_dev121-aux_clm
/scratch/cluster/glemieux/ctsm-tests/tests_dev121-fates-1

@glemieux
Collaborator

glemieux commented Apr 5, 2023

fates suite regression tests are complete on Cheyenne. All expected tests PASS. The hydro testmods are the only B4B tests, as expected.

Folder on cheyenne:

/glade/u/home/glemieux/scratch/ctsm-tests/tests_dev120-121-fates-65-3

@ekluzek this should be good to go now.

@ekluzek ekluzek merged commit 0b98af4 into ESCOMP:master Apr 5, 2023
@ekluzek ekluzek deleted the modinitsoilwater branch April 5, 2023 19:35
@dlawrenncar
Contributor

dlawrenncar commented Apr 5, 2023 via email

@ekluzek
Collaborator

ekluzek commented Apr 5, 2023

@dlawrenncar yes it's only for FATES and is considered a temporary solution. This just allows them to run for now.

We've had some discussion about better long-term solutions, but they will require some study, and likely someone identified to work on this. See the comments near the top of this PR. We thought this should be discussed at an upcoming CTSM science meeting.
