Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read from netcdf #25

Merged
merged 16 commits into from
Feb 28, 2022
Merged

Read from netcdf #25

merged 16 commits into from
Feb 28, 2022

Conversation

mnlevy1981
Copy link
Collaborator

@mnlevy1981 mnlevy1981 commented Feb 3, 2022

Add support for reading forcing / grid information from a netCDF file. This implementation adds feisty/testcase/from_netcdf.py, though I think @matt-long suggested putting this code in the driver (or maybe in core/?) instead. This is a draft because there are still several updates to make:

  • Finish Timestep benthic prey with fish #24 and merge it in (this branch is based off the branch in that PR because I needed the updated time-stepping)
  • Clean up time axis in forcing: currently, this branch expects time to run from 0 to 364, so if the forcing file begins with time=1 then the interpolation results in nans.
  • Add support for cyclic forcing: I tried a 20 year run, but (similar to the last issue) the forcing fields are all nan starting at time=365

Tested against Colleen's test_locs3.m matlab script
instead of filling a yaml file with printed values, matlab can write a netcdf
file with output so I updated the matlab-comparison notebook to read this
output when doing the comparison.
For columns deeper than cutoff, want t_frac_pelagic = 0, not 1
Formula fixed in 32a71bc is repeated in test_process.py...
combined forcing and biomass into the same netcdf file, so needed to clean up
the matlab-comparison notebook. Also did a first pass at low-hanging fruit for
performance (adding .data to lots of xarray objects) to get 6x speed-up.
Accidentally appended .data to a non-xarray object but didn't execute that line
of code in my driver; pytest caught it.
@mnlevy1981 mnlevy1981 mentioned this pull request Feb 17, 2022
Modified matlab code to generate forcing with units "days since 0001-01-01
00:00:00", so xarray converts it to cftime.DatetimeNoLeap object. This required
modifying the driver to use these same units for time, which then required
updating one of the tests in the driver as well as the time dimension in the
idealized forcing testcase.
New test initializes two instances of the idealized test case, one starting on
0001-01-01 and one starting on 0002-01-01, runs both for a single time step,
and expects biomass to be identical between the two. Currently this test fails,
because the forcing is only defined for year 1 and using xr.DataArray.interp()
in year 2 returns a nan which then propogates throughout the biomass array.

Proper implementation of an option for cyclic interpolation should allow this
test to pass.
Also cleaned up some numpy warnings that were coming about when
fish.size_class_bnds_ratio ** (1.0 - (death / somatic_growth_potential)) was
causing double-precision overflow (and a similar divide-by-0 error in the
matlab-comparison notebook)

Lastly, realized that I needed the nc-time-axis package for making plots now
that we are using cftime objects
@mnlevy1981
Copy link
Collaborator Author

As of 05d14d0, the python code offers a mechanism for cyclic forcing (and even tests it as part of the pytest suite via test_cyclic_interpolation() in tests/test_driver.py). It's pretty kludgy at this point, though -- it assumes that forcing is available for the entirety of year 0001, and then drops the current year from the model run when computing the current forcing (treating YYYY-MM-DD as 0001-MM-DD). More work will need to be done to support things like monthly forcing fields rather than daily, but perhaps that can wait for a later PR?

If putting off the improvements to forcing can wait, the last task remaining before opening up this PR will be moving the infrastructure for reading from netCDF out of testcases/ and into driver.py. @matt-long and I talked about this a couple weeks ago, and it will probably result in a fairly big refactor of the simulation class

I renamed simulate_testcase() -> config_testcase(), and also copied the
function to config_from_netcdf(). Then I removed support for "from_netcdf" as
options in config_testcase(), and moved the functions in
testcase/from_netcdf.py into driver.py.

There's still some cleanup to do in driver.py, though.
Added start_date to docstring for both config_testcase() and
config_from_netcdf(); also added tuple to list of valid types for start_date in
the simulation constructor.
config_testcase() will determine whether to set cyclic_forcing=True based on
which forcing profile is being used; config_from_netcdf() takes an argument
(default is False).

Also added two new tests that run a single day, and needed to shoehorn in a way
to make the forcing actually be cyclic so this commit contains some kludgy code
that will be cleaned up in coming commits.
More descriptive name for the class
Note that there is a namespace clash between offline_driver (the class) and
offline_driver (the module); the only reason we import the module is to test
the _read_settings() routine so I import the module as offline_driver_mod and
use that to access _read_settings.
Moved the code to make forcing Dataset cyclic into utils/cyclic_forcing.
@mnlevy1981 mnlevy1981 marked this pull request as ready for review February 26, 2022 19:01
@matt-long
Copy link
Contributor

Thanks @mnlevy1981, this looks great.

@matt-long matt-long merged commit 58f2f08 into marbl-ecosys:main Feb 28, 2022
@mnlevy1981 mnlevy1981 deleted the read_from_netcdf branch January 17, 2023 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants