Failures in waccmx_offline test #550

Closed
billsacks opened this issue Oct 29, 2018 · 4 comments
Labels: bfb (bit-for-bit); testing (additions or changes to tests)

Comments

@billsacks
Member

Brief summary of bug

On release-clm5.0, ERS_D_Ln9_P480x3.f19_g16.I2000Clm50SpGs.cheyenne_intel.clm-waccmx_offline fails its restart comparison because some ROF coupling fields differ between the base and restart runs, in both directions (rof2lnd and lnd2rof).

General bug information

CTSM version you are using: release-clm5.0.09-43-gde4a134 (but it looks like the same error occurred in @ekluzek's testing of release-clm5.0.09)

Does this bug cause significantly incorrect results in the model's science? No

Configurations affected: Tests with the waccmx_offline testmod

Details of bug

I think the problem is that this test restarts in the middle of a ROF coupling interval, since ROF is set to couple every 3 hours.

I tried increasing the length of this test to 5 hours, to get a restart at 3 hours. However, this failed with a water balance error at time step 16.
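The timing argument can be checked with a little arithmetic. The sketch below assumes the ERS test writes its restart just past the midpoint of the run (`stop_n // 2 + 1`), which is consistent with the 5-hour run restarting at hour 3. The step length is not stated in this issue, so `HYPOTHETICAL_STEP_SECONDS` is purely illustrative; any step shorter than 36 minutes leads to the same conclusion for the 9-step run.

```python
# Sketch only: the restart placement rule and the step length are assumptions,
# not values taken from the test definition.

ROF_COUPLING_SECONDS = 3 * 3600  # "ROF is set to couple every 3 hours"


def restart_point(stop_n: int) -> int:
    """Assumed ERS rule: write the restart just past the midpoint of the run."""
    return stop_n // 2 + 1


def on_rof_boundary(elapsed_seconds: int) -> bool:
    """True when the elapsed time is a whole number of ROF coupling intervals."""
    return elapsed_seconds % ROF_COUPLING_SECONDS == 0


# Lh5: 5 hours long, so the restart lands at hour 3, on a ROF boundary.
lh5_restart_seconds = restart_point(5) * 3600
print(lh5_restart_seconds, on_rof_boundary(lh5_restart_seconds))

# Ln9: 9 steps long, so the restart lands at step 5.
# HYPOTHETICAL_STEP_SECONDS is illustrative; the issue does not state the step length.
HYPOTHETICAL_STEP_SECONDS = 1800
ln9_restart_seconds = restart_point(9) * HYPOTHETICAL_STEP_SECONDS
print(ln9_restart_seconds, on_rof_boundary(ln9_restart_seconds))
```

Under these assumptions the Lh5 restart falls exactly on a ROF coupling boundary, while the Ln9 restart falls inside the first coupling interval.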

Important output or errors that show the problem

Failure in ERS_D_Ln9_P480x3.f19_g16.I2000Clm50SpGs.cheyenne_intel.clm-waccmx_offline

These are the fields that differ between the base and restart runs, from cpl.hi:

 RMS x2l_Flrr_volr                    5.1452E-03            NORMALIZED  1.7545E+01
 RMS x2l_Flrr_volrmch                 5.0652E-03            NORMALIZED  3.9215E+01
 RMS r2x_Forr_rofl                    7.6880E-05            NORMALIZED  7.3214E+01
 RMS r2x_Forr_rofi                    1.4318E-03            NORMALIZED  2.5776E+02
 RMS r2x_Flrr_volr                    3.7171E-03            NORMALIZED  3.0539E+01
 RMS r2x_Flrr_volrmch                 3.6666E-03            NORMALIZED  6.8436E+01
 RMS x2r_Flrl_rofsur                  1.4240E-05            NORMALIZED  2.2272E+01
 RMS x2r_Flrl_rofgwl                  3.7447E-04            NORMALIZED  6.6007E+01
 RMS x2r_Flrl_rofsub                  5.3266E-05            NORMALIZED  8.4397E+00
 RMS x2r_Flrl_rofi                    3.7446E-04            NORMALIZED  6.6517E+01

Failure in ERS_D_Lh5_P480x3.f19_g16.I2000Clm50SpGs.cheyenne_intel.clm-waccmx_offline

381: WARNING:  water balance error  nstep=           16  local indexc=        26329
381:  errh2o=   1.333529328132217E-003
381: clm model is stopping - error is greater than 1e-5 (mm)
381: nstep                 =           16
381: errh2o                =   1.333529328132217E-003
381: forc_rain             =   0.000000000000000E+000
381: forc_snow             =   2.621327578972475E-003
381: total_plant_stored_h2o_col =   0.000000000000000E+000
381: endwb                 =    2939.21392352162
381: begwb                 =    2939.21403792467
381: qflx_evap_tot         =  -1.890526273460123E-005
381: qflx_irrig            =   0.000000000000000E+000
381: qflx_surf             =   0.000000000000000E+000
381: qflx_h2osfc_surf      =   0.000000000000000E+000
381: qflx_qrgwl            =   0.000000000000000E+000
381: qflx_drain            =   4.088165222360376E-003
381: qflx_drain_perched    =   0.000000000000000E+000
381: qflx_flood            =   0.000000000000000E+000
381: qflx_ice_runoff_snwcp =   0.000000000000000E+000
381: qflx_ice_runoff_xs    =   0.000000000000000E+000
381: qflx_glcice_dyn_water_flux =   0.000000000000000E+000
381: qflx_snwcp_discarded_ice =   0.000000000000000E+000
381: qflx_snwcp_discarded_liq =   0.000000000000000E+000
381: qflx_rootsoi_col(1:nlevsoil)  =   0.000000000000000E+000
381:  1.421355802127227E-009  3.793750179443353E-009  9.152129867992451E-010
381:  4.943154857210362E-009 -1.481497918064600E-006  8.185368947298590E-007
381:  5.760884802171176E-007  7.720118732749495E-008  2.137550711239529E-009
381: -2.382258233445312E-009 -5.645953750444636E-010 -7.672929666410538E-011
381: -4.088472451680095E-011 -5.456406648993166E-011 -7.223628298214421E-011
381: -9.247471395432471E-011 -1.152682409312011E-010 -1.406578126772176E-010
381:  0.000000000000000E+000
381: clm model is stopping
381: calling getglobalwrite with decomp_index=        26329  and clmlevel= column
381: local  column   index =        26329
381: ERROR: get_proc_bounds ERROR: Calling from inside  a threaded region

Note that, before this, there were water balance warnings up through time step 3, but then nothing between time steps 3 and 16.
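As a sanity check on the log above, the reported `errh2o` can be reproduced from the other logged terms. This is a simplified sketch, not the model's actual balance check: it assumes the logged flux terms are already amounts per time step in mm (the arithmetic only closes under that reading), and it leaves out every term that is logged as zero.

```python
# Values copied from the nstep=16 log; zero-valued terms are omitted.
begwb = 2939.21403792467
endwb = 2939.21392352162
forc_snow = 2.621327578972475e-03
qflx_evap_tot = -1.890526273460123e-05
qflx_drain = 4.088165222360376e-03
logged_errh2o = 1.333529328132217e-03

# Assumed simplified balance: storage change minus (inputs - outputs).
# Inputs: snowfall. Outputs: evaporation (negative here) and drainage.
net_flux = forc_snow - qflx_evap_tot - qflx_drain
errh2o = (endwb - begwb) - net_flux

print(f"reconstructed errh2o = {errh2o:.6e}")
print(f"logged errh2o        = {logged_errh2o:.6e}")
```

The reconstructed value agrees with the logged one to within the precision of the printed `begwb` and `endwb`, so snowfall, evaporation, and drainage are the only terms contributing to the imbalance at this column.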

@billsacks added the "bug" (something is working incorrectly) label on Oct 29, 2018
@billsacks
Member Author

There are really two different issues here (the restart mismatch in the ROF fields and the water balance error), which perhaps should be split and dealt with separately.

@ekluzek I'm not sure what restart time you intended for this test: did you deliberately want a restart very shortly into the test? In that case, a solution could be to set the ROF coupling frequency for this test to match ATM_NCPL. Either way, we should still determine the cause of the water balance error.

@ekluzek
Collaborator

ekluzek commented Nov 14, 2018

Fixed in release-clm5.0.12

@ekluzek closed this as completed on Nov 14, 2018
@billsacks added the "testing" (additions or changes to tests) and "tag: simple bfb" labels and removed the "bug" (something is working incorrectly) label on Dec 30, 2019
@billsacks
Member Author

Reopening: It looks like this was fixed on the release branch (in release-clm5.0.12) but the fix hasn't made it to master.

@billsacks
Member Author

This fix has come to master: ERS_D_Ln9_P480x3.f19_g16.I2000Clm50SpGs.cheyenne_intel.clm-waccmx_offline has been replaced with ERS_D_Ld5_P480x3.f19_g16.I2000Clm50SpGs.cheyenne_intel.clm-waccmx_offline.

@samsrabin samsrabin added simple bfb bit-for-bit labels Aug 8, 2024