Create csv file and user-mod directories for PLUMBER2 sites #2137
Comments
Start and end dates would be an encoded string. It might as well be in the same format that we use for RUN_STARTDATE so YYYY-MM-DD, and then also the START_TOD (Time Of Day). So I'll change the above to add extra columns for TOD. |
I have an example csv file here:
The first few lines are:
,Site,Lat,Lon,pft1,pft1-%,pft2,pft2-%,start_year,end_year,RUN_STARTDATE,START_TOD,ATM_NCPL
start_year and end_year are the starting and ending years for the datm (in mct this was DATM_CLMNCEP_YR_ALIGN and DATM_CLMNCEP_YR_END). As suggested, I've also included RUN_STARTDATE and START_TOD. ATM_NCPL is required to set the time step of the model to match the time step of the atmospheric forcing (either 1/2 or 1 hour). |
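As a rough sketch of how a script might read this header, the snippet below parses a csv with the proposed columns and derives ATM_NCPL from the forcing time step. The data row is made up for illustration, the function names are hypothetical, and it assumes ATM_NCPL counts coupling intervals per day (so 1/2 hour gives 48 and 1 hour gives 24).

```python
import csv
import io

SECONDS_PER_DAY = 86400

# Header copied from the comment above; the data row is invented for illustration.
SAMPLE = (
    ",Site,Lat,Lon,pft1,pft1-%,pft2,pft2-%,start_year,end_year,"
    "RUN_STARTDATE,START_TOD,ATM_NCPL\n"
    "0,XX-Xxx,50.0,5.0,15,100.0,-999,-999,2005,2014,2004-12-31,82800,48\n"
)


def atm_ncpl(forcing_step_seconds):
    """Coupling intervals per day for a given forcing time step (assumed per-day)."""
    if forcing_step_seconds <= 0 or SECONDS_PER_DAY % forcing_step_seconds != 0:
        raise ValueError("forcing step must divide a day evenly")
    return SECONDS_PER_DAY // forcing_step_seconds


def read_sites(text):
    """Return one dict per site, keyed by the column names in the header."""
    return list(csv.DictReader(io.StringIO(text)))


sites = read_sites(SAMPLE)
print(sites[0]["Site"], sites[0]["RUN_STARTDATE"], sites[0]["ATM_NCPL"])
print(atm_ncpl(1800), atm_ncpl(3600))
```

Note that the unnamed first column ends up under the empty-string key, so a reader of the real file may want to ignore it or give it a name.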
Thanks for putting this together @olyson, this is great. From the discussion this morning, I suggested adding a column for the start and end year, but we thought it's easy enough to pull out of the start and end date. So based on that, I'd remove those two columns. Also I assume the end date would be some arbitrary date in the end year, so it would be good to have it added in. And maybe the end date should be called STOP_DATE and STOP_TOD? I think I like naming the columns for the variables that have an XML variable name (RUN_STARTDATE, etc.) for them, so I like that change. For generic crop sites, it seems it would be better to put a supported CFT code for it, so that you could run it both as generic crop and also with the prognostic crop model. That just gives you more options. Using -999 as fillvalue makes sense to me and will be easy to parse for both human and computer. |
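A minimal sketch of how the -999 fill value could be handled when reading the PFT columns, assuming the fill marks an absent second PFT. The function name and the example row are hypothetical; only the column names and the fill value come from the thread.

```python
FILL = -999  # fill value proposed in the thread


def site_pfts(row):
    """Return [(pft_index, percent), ...] for one csv row, skipping fill entries."""
    pairs = []
    for key in ("pft1", "pft2"):
        pft = int(row[key])
        pct = float(row[key + "-%"])
        if pft == FILL or pct == FILL:
            continue  # no PFT recorded in this slot
        pairs.append((pft, pct))
    return pairs


# Invented row: a single dominant PFT and a filled second slot.
example = {"pft1": "15", "pft1-%": "100.0", "pft2": "-999", "pft2-%": "-999"}
print(site_pfts(example))
```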
The other thing I wonder is whether there should be some comments at the start of the file to explain the file format. |
One question I have about this is how this will get put into CDEPS. CDEPS could respond to this csv file, and/or process it to update its XML files. Or the dates could just show up in the user-mod directories in the shell_commands file. Since MCT is being deprecated, we should just set this up for NUOPC, so the XML variables could be handled the same way they are for NEON, with this in the shell_commands file: ./xmlchange DATM_YR_ALIGN=2018,DATM_YR_END=2021,DATM_YR_START=2018 |
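If the years come from the csv file, the shell_commands line could be generated along these lines. This is only a sketch: the function name is hypothetical, and it assumes DATM_YR_ALIGN defaults to the start year, as in the NEON example above.

```python
def datm_xmlchange(start_year, end_year, align_year=None):
    """Build the xmlchange line for the DATM year variables."""
    if end_year < start_year:
        raise ValueError("end_year must not precede start_year")
    if align_year is None:
        align_year = start_year  # assumption: align to the first forcing year
    return (
        f"./xmlchange DATM_YR_ALIGN={align_year},"
        f"DATM_YR_END={end_year},DATM_YR_START={start_year}"
    )


print(datm_xmlchange(2018, 2021))
```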
The start_year at least, which is used in my script, as you've done with NEON, to set DATM_YR_START, can be different from the year encoded in RUN_STARTYEAR, because we are starting at GMT corresponding to local midnight. So I think both of those (start_year and RUN_STARTYEAR) are needed. |
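To illustrate why the two can differ, here is a small sketch that converts local midnight on 1 January of start_year to a UTC date and time of day in seconds. The function name and the utc_offset_hours input are hypothetical (the csv header above has no such column), so treat this as an illustration of the date arithmetic rather than what the script does.

```python
from datetime import datetime, timedelta


def utc_start(start_year, utc_offset_hours):
    """Return (RUN_STARTDATE, START_TOD) for local midnight on 1 Jan of start_year."""
    local_midnight = datetime(start_year, 1, 1)
    # Local time = UTC + offset, so subtract the offset to get UTC.
    utc = local_midnight - timedelta(hours=utc_offset_hours)
    tod_seconds = utc.hour * 3600 + utc.minute * 60 + utc.second
    return utc.date().isoformat(), tod_seconds


# A site east of Greenwich starts in the previous calendar year in UTC.
print(utc_start(2002, 10))
# A site west of Greenwich starts later on the same UTC date.
print(utc_start(2002, -5))
```

For the site at UTC+10, the year in the start date is 2001 while start_year stays 2002, which is the case where both columns are needed.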
@olyson we should perhaps finalize some of these decisions at our hackathon. But let me just set some of this up for more discussion. With NEON we wanted the stop date/time, so that we could run over the entire period for transient cases. For spinup cases you can only run over whole years, but it's good to have both so that you can run over the entire period for transient. Also, you can either calculate DATM_YR_START and DATM_YR_END in the script that creates the csv file, or you can calculate them in the script that reads the csv file. Either one is valid, and we can talk more about whether there are any pros and cons on Wednesday. On crops: for NEON the pft index uses the full range of 78 PFTs. There are only a couple of crop sites, but they are labeled as the specific crop (so 19, which is spring wheat). Since it's labeled as spring wheat, you can do simulations with it either as generic crop (with use_crop turned off) or as spring wheat (when use_crop is on). That just gives users the flexibility to run it either way. I would hope that we could investigate what specific crops are at the sites, so we could run either way. Two other things. One is that for NEON we made all the surface datasets 78pft to make them easier to handle, rather than having some 78pft and others 16pft. The other is that we'll want to set up surface datasets for FATES. For NEON we did that by having mixed PFTs that used the PFT mix from the 1-degree grid cell. So we probably want to do the same thing here. For FATES we'd just do the non-crop sites, and the files would be 16pft. |
We decided this morning in the group hackathon, that starting with 16pft surface datasets makes sense and running over only full years. This will allow FATES to use the same fsurdat files. At a later date we could expand to 78pft files, but this gets us a good start that works for most cases in the simplest way. So FATES-SP mode would use the mix of PFT's set for each site. For full FATES it'll use the PFT mix in the FATES parameter file. For FATES we would eventually want either a FATES parameter file for each site, or some files to use FATES tools to modify the parameter file for each site. But, that can be a future development. |
@olyson also had some infrastructure to handle spinups, taking into account that the data is Gregorian while the spinup is no_leap. We think this is handled now in NUOPC, so we will try it out for a case to see if we still need that infrastructure. Hopefully we don't, but we can add it in if needed. |
We need to add canopy top and bottom heights to the csv file as these are site-specific and used in SP mode - I'll do that. |
@wwieder , I've added canopy top and bottom heights to the csv file: /glade/work/oleson/ctsm_PLUMBERcsv/tools/site_and_regional/PLUMBER2_sites.csv So you could try it in your script. /glade/u/home/wwieder/CTSM/tools/site_and_regional/subset_data_single_point For example, the BE-Lon site. So maybe I have something wrong set in the csv file for those sites. Did you get any error reporting on those? |
Suggested usermod_dirs for PLUMBER2 are here. Right now I'm making this in a notebook (in kind of a hacky way), but we can convert this to a python script if we want to keep it. @olyson, when I try manually creating the surface dataset for a crop site
I get the following error from .subset_data.py:
|
@wwieder I think that's a bug in subset_data. 15 and 16 are the generic crop rainfed and irrigated. So it should allow for 16. So we should fix this in subset data. |
it's easy to fix for pft 16, but breaking for pft 15 for some reason? |
I made the following changes in on line 18 and line 185
This works for dompft =16, but fails with dompft = 15 and I don't understand why? |
This if statement doesn't look right to me. Notice it's using MAX_PFT and NOT NAT_PFT. And it's also making sure num_pft is less than max_dom_pft as well as MAX_PFT. Also, since the inequalities are < rather than <=, it looks to me that either NAT_PFT should be 17 or the inequalities should be changed. I think the num_pft < max_dom_pft part is where it fails for 15. |
Thanks for looking at this Erik. I'm not sure I'm understanding the logic in the code or what you're suggesting I change, but when I set NAT_PFT=17 The error from line 408 in the code states: leaving NAT_PFT=16 creates a similar error. |
OK, setting the --crop flag almost has things working correctly, except that PCT_CFT isn't getting set correctly for a grid with 100% PFT=16
|
Fixed by commenting out error flags on lines 197 & 211 of python/ctsm/site_and_regional/single_point_case.py and without using the --crop flag. We likely need to do some testing to make sure this is working OK for NEON or other configurations? |
@wwieder literally what you are doing is removing the abort on error for two different error checks. So that is "safe" to do in that it isn't going to mess up anything that is already working. So you don't need to test other cases. It's not "safe" in that you removed two error checks. So it's like going around without seat belts. Doing so doesn't limit your travel; it just causes problems if something goes wrong. This is one of the things I'd like us to get better at, and it illustrates the principle of error checks. To get them right you need to do some testing, and it's best to have a test case where you can validate that it dies under the right conditions. Then that test can be modified if it wasn't correct. It also makes it more obvious whether the check is right by looking at the test rather than trying to parse the code in your head. What sometimes happens when we don't test error checks is that the logic is wrong and it causes problems, so we remove the error check. But then you don't have a sensible error check to know what to do when something goes wrong. The second error check you removed has to do with mixing crop and natural-veg types. I think there might be a reason for this that's embedded into subset_data, so we should be more cautious about removing it. |
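A small sketch of what "test that it dies under the right conditions" could look like. This is not the check in single_point_case.py: the function, the constant name, and the inclusive bounds are hypothetical, chosen only so that indices 15 and 16 (the generic crop types mentioned above) are accepted on a 16pft dataset and 17 is rejected.

```python
MAX_PFT_16PFT = 16  # hypothetical constant: highest valid index on a 16pft dataset


def check_dom_pft(dom_pft, max_pft=MAX_PFT_16PFT):
    """Abort with ValueError unless 0 <= dom_pft <= max_pft (inclusive bounds)."""
    if not 0 <= dom_pft <= max_pft:
        raise ValueError(f"dom_pft {dom_pft} is outside the range 0..{max_pft}")
    return dom_pft


def dies_on(value):
    """Return True if the check aborts for this value."""
    try:
        check_dom_pft(value)
    except ValueError:
        return True
    return False


# The test pins down the boundary, so a < vs <= mistake shows up immediately.
assert not dies_on(0)
assert not dies_on(15)
assert not dies_on(16)
assert dies_on(17)
assert dies_on(-1)
print("error-check tests passed")
```

Writing the boundary cases as explicit assertions makes the intended behaviour readable from the test, without having to work through the inequality in the code.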
I like the analogy, @ekluzek . The error checks were made for one application (NEON & generic single point) and now we're using them for something different (Plumber). To carry the analogy a bit further, maybe this is similar to a bicycle vs. skiing helmet. My 'fixes' for plumber are basically undoing the buckle that holds either helmet on. On Weds. we can decide how fail safe we need these error checks to be. |
OK, I have code that:
Remaining todos include:
These may be easier once I figure out how to contribute to Keith's development branch... but for now code modifications are in |
Additional todo: The PLUMBER2 usermods will need to include the LAI streams files. |
@olyson, so PLUMBER2 has LAI stream files for each site? Or do you mean to use the global LAI streams files? A technical thing with that is that the LAI streams can only be used for SP or FATES-SP mode. You could trigger that in the user-mods by querying the compset. We do that sort of thing in the NEON user-mods. |
@ekluzek Yes it has an LAI stream file for each site. And yes, my script only uses it for SP mode, it turns it off in BGC mode. |
Another PLUMBER2-specific thing is that we set baseflow_scalar = 0 in user_nl_clm for "wetland" sites. The list of these sites is: |
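A sketch of how a usermods script might add this setting. The function name and the site identifiers are placeholders, not the actual wetland list; only the namelist line itself comes from the comment above.

```python
def user_nl_clm_lines(site, wetland_sites):
    """Return the extra user_nl_clm lines for a site (empty list if none apply)."""
    lines = []
    if site in wetland_sites:
        lines.append("baseflow_scalar = 0")
    return lines


# Placeholder identifiers, used only to exercise the function.
WETLANDS = {"SITE-A", "SITE-B"}
print(user_nl_clm_lines("SITE-A", WETLANDS))
print(user_nl_clm_lines("SITE-C", WETLANDS))
```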
I wrote a script to compare the new surface datasets that Will created with those used in our original PLUMBER2 submission for all of the sites. As Will mentioned, differences in PCT_NAT_PFT show up for cropland sites because of how PCT_NAT_PFT is handled. Any other differences are small and are due to rounding of some of the original field values in the csv file. |
I believe this issue was covered by #2485 , so I am closing it now. |
As part of our tower-site hackathon, a task is to create a csv file and user-mod directories for the PLUMBER2 sites that can be added to CLM. This is part of the work in #1487. @olyson has a script that can create the user-mod directories, so likely that will be used for the initial version of this work.
The NEON csv file has a header as follows...
We propose the csv file for PLUMBER2 should look like this:
start_date is in YYYY-MM-DD form. We'll assume runs start on Jan 1 at 0 UT and run for full years, ending on Dec 31 at the last time step of the day.