ESMF returns garbage data after CORDEX regridding #772
From stepping into the source code of |
Hi @thomascrocker , @valeriupredoi you've been looking quite a bit into the regridding lately. Any thoughts? |
Thanks for the reply. I added some nco commands to my data preparation code (which needs to run anyway to convert the data from pp format to CF compliant netCDF format) so that any longitude points (and bounds) < 0 have 360 added to them, forcing longitude into the range 0 to 360. After doing that area extraction worked as expected. I then spent some time stepping through the source code in
Stepping through the code, at this point in execution (line 186), the data in src_field is fine, but the data returned in regr_field is all zeros. I've never used ESMPy before, so I may well have missed something obvious in the setup of src_field and dst_field due to my inexperience with it. (I normally use Iris when working with regular grids, and CDO if I have to use an irregular one.)
One thing that might be significant is the co-ordinate system of this dataset. Since it's from a regional model, the main dim coords for the variable (rlon and rlat) are defined on a rotated grid (i.e. with a different north pole location), and the longitude and latitude coords are aux coords that span both these dimensions. Because the rotated pole is in a different location, the rotated longitudes of this data span the meridian, and therefore include values greater than 360. One last thing I noticed when working with the debugger was that in construction of the target grid, if the target grid is being specified from a string specification (e.g. "2.5x2.5"), then the code assumes that the co-ordinate system of the target grid should match that of the source grid. ESMValCore/esmvalcore/preprocessor/_regrid.py Line 317 in 6594471
In my use case this behaviour is undesirable, since the source grid has a rotated pole co-ordinate system, but the grid specified ("2.5x2.5") is a global grid which one would assume to be on a regular unrotated co-ordinate system. This is probably relevant to issue #493. To get around this in case it was an issue, I tried explicitly specifying the grid from a GCM to regrid to in my recipe (HadGEM2-ES in this case), but I still saw the same behaviour of the regridded data all being set to zero. Hope this information is helpful. I can upload a sample file if that would help? The one I've been using is around 100 MB in size, just let me know where to upload to. |
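For anyone who wants to reproduce the longitude fix from this comment without nco, here is a minimal NumPy sketch of the same idea: add 360 to any longitude point (and bound) below zero so that everything lies in the range 0 to 360. The function name wrap_longitudes and its arguments are hypothetical, and this is not the nco command actually used in the data preparation.

```python
import numpy as np


def wrap_longitudes(lon, lon_bounds=None):
    """Shift negative longitudes (and optionally their bounds) by +360.

    Hypothetical helper for illustration only; values that are already
    non-negative are returned unchanged.
    """
    lon = np.asarray(lon, dtype=float)
    wrapped = np.where(lon < 0.0, lon + 360.0, lon)
    if lon_bounds is None:
        return wrapped
    lon_bounds = np.asarray(lon_bounds, dtype=float)
    wrapped_bounds = np.where(lon_bounds < 0.0, lon_bounds + 360.0, lon_bounds)
    return wrapped, wrapped_bounds
```

Note that a cell straddling 0 degrees would end up with bounds such as [359.5, 0.5] under this simple rule, so it is probably only safe for domains that do not cross the prime meridian.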
@thomascrocker cheers for raising this and thanks for the very detailed debugging and problem-tracing! One suggestion would be to run with |
@valeriupredoi I tried with
I'm not running on JASMIN, I have a local install. My recipe file (and debug output) are attached in the zip in my original post. I've also attached my config files here: Do you have access to the MASS service on JASMIN? My processed RCM data is stored in MASS. If I understand MASS permissions correctly (there's a non-zero chance I don't....) it should be read accessible to any MASS user.
One other thing. I've been attempting to regrid the data from my high resolution (~12km grid) RCM to a coarser GCM scale grid. This uses the RCM as the source grid, and the GCM as the destination. Since ESMValTool examines the lat and lon coords in the source grid and sees they are irregular, it opts for ESMPy to handle regridding. Earlier I tried the other way around, i.e. regridding GCM data from the GCM grid onto the RCM grid. This time, ESMValTool sees that the lat and lon of the source grid are regularly spaced, and uses iris for the regridding. However, this also fails, in the case of the area_weighted scheme, with the message:
As far as I'm aware this is a known limitation of iris. I.e. area weighted regridding is not currently supported for differing coordinate systems.
Looking at the data, the CMIP5 GCM data does not have a coordinate system specified. My RCM data does have the coordinate system specified on the rotated coords (rlon and rlat), but not on the standard lat and lon coords. |
I've not gotten any further with this yet, since I've been focussing on a couple of other bits of work. However in the meantime I have got my JASMIN access sorted again. @valeriupredoi Could you let me know how to access the installations there? I can then transfer my data to JASMIN and we'll be singing from the same hymn sheet at least. |
friendly ping @valeriupredoi |
blast! this fell off the radar big way, sorry. @thomascrocker on JASMIN just do |
@thomascrocker - right - am afraid without the actual data I can't do much; it would be great if you could put the data (or a sample of it) on a shared group workspace, eg esmeval where we store the OBS data - could you maybe do that and let me know pls (you'll have to apply for membership of that gws but I'll grant it right away), after that I will attempt to run a test 🍺 Quick question - is this a dataset that will eventually be on ESGF? If so, we need to talk - the official cmorization is done via CDDS and that's what you guys should be running 👍 |
Hi @valeriupredoi thanks very much for looking into this. I've applied for access to the esmeval workspace so will upload some data once it's approved. On the ESGF question: no, it won't be ending up there (at least not in the short term). The budget for the project is very limited and hosting on ESGF isn't part of it. However, I have been doing my cmorization using an adapted version of the CDDS suites that are being used internally to prep data from other projects for upload to ESGF. |
@thomascrocker just approved your membership for esmeval 👍 If you could tell me where you put the data that'd be awesome! About CDDS - good stuff, let me know if you want to use the Jasmin installation, I am managing the project on Jasmin 🍺 |
OK, I've dropped some monthly data in: I've also uploaded the custom CMOR tables that I've been using (these are the ones that were used by the suite I used to CMORize my data) into:
Finally, here is the entry I used in my config-developer.yml file for reading the data in recipes:
Hopefully that should be sufficient for you to read in the data and attempt to regrid it to a regular lat/lon grid. Let me know if you have any issues or questions. Cheers |
great! cheers, mate - I'll have a stab at testing it tomorrow 👍 🍺 |
@thomascrocker here's what I found out: I tried running a recipe (which I post below), and immediately ran into this CMOR checker error:
Note that this is not such a serious issue since if one runs with
but the code will not crash as it's supposed to do when there are serious issues. Note that this should be fixed since the default setting for cmor checks does crash and spits out this error, and we want data to ideally be compliant to the default level. But anyway, we can run with relaxed for now... OK, now the recipe:

---
documentation:
  description: |
    Check 0 values from regrid
  authors:
    - predoi_valeriu

preprocessors:
  regridp:
    regrid:
      target_grid: 1x1
      scheme: linear

diagnostics:
  BerkeleyEarth:
    description: Antigua check
    variables:
      pr:
        preprocessor: regridp
        additional_datasets:
          - {dataset: MOHC-HadREM3-GA7-05, domain: CEFAS_Antigua, project: CORDEX, driver: MOHC-HadGEM2-ES, mip: mon, exp: historical, ensemble: r1i1p1, rcm_version: v1,
             start_year: 2000, end_year: 2008}
    scripts: null

you can use this recipe and use the CORDEX project right out the box, not needing to add projects to the config-developer file nor paths to custom cmor dirs; see the config-user:

rootpath:
  CORDEX: /group_workspaces/jasmin4/esmeval/gh_issue772/mi-ba795
drs:
  CORDEX: default

and it'll work right away! Also note that your custom cmor tables are almost identical to the ones in esmvaltool (CORDEX ones), bar a couple more vars in your tables, but not affecting the current tests. Now, on to the issue at hand - why the regridded data is 0: here's what I found:
    def regridder(src):
        """Regrid 2d for irregular grids."""
        res = get_empty_data(dst_rep.shape, src.dtype)
        data = src.data
        if np.ma.is_masked(data):
            data = data.data
        src_field.data[...] = data.T
        print("source data", np.mean(src_field.data))
        regr_field = field_regridder(src_field, dst_field)
        print("regridder field data", np.mean(regr_field.data))
        res.data[...] = regr_field.data[...].T
        res.mask[...] = dst_mask
        return res

the source data is O(1e-5) and the regr data is O(1e-300) - so it means that the ESMF regridder:

    field_regridder = ESMF.Regrid(src_mask_values=np.array([1]),
                                  dst_mask_values=np.array([1]),
                                  **regridding_arguments)

is spitting out garbage; note that this is what I got for a regrid on a 1x1 degree grid. Increasing the target grid to 10x10 or 20x20 etc does not improve the supertiny result numbers, nor does multiplying the source data by any number of orders of magnitude. It is as if the ESMF regridder hits an overflow and returns garbage instead of actual values (1.3e-322 for each data point). I am not familiar with ESMF though, but @zklaus is and maybe he can help? 🍺 |
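A quick way to flag the kind of output described above (every point at something like 1.3e-322) is to test whether all values are zero or subnormal, i.e. smaller in magnitude than the smallest normal float64. This is only a diagnostic sketch: the helper name looks_like_garbage is hypothetical, and it does not say anything about why the regridder produced those values.

```python
import numpy as np


def looks_like_garbage(values):
    """Return True if every value is zero or a subnormal float64.

    Hypothetical sanity check: physically meaningful fields rarely
    consist only of values below the smallest normal double.
    """
    values = np.asarray(values, dtype=np.float64)
    smallest_normal = np.finfo(np.float64).tiny  # roughly 2.2e-308
    return bool(np.all(np.abs(values) < smallest_normal))
```

Calling it on regr_field.data right after the field_regridder call would separate "tiny but real" results from output that was never filled with sensible floats.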
also just noticed that some of my findings have already been listed by @thomascrocker in an earlier comment - meh, good thing we reached the same conclusion 😁 |
It might be a good idea to send @zklaus an email if you want a reaction from him, he hasn't been very actively reading the notifications from the ESMValTool project recently.. |
@valeriupredoi thanks for taking a look and good to see that we both came to the same conclusion at least. |
I will be emailing Klaus in 2min |
Alright, had a look, here is what I found:
|
good stuff @zklaus 👍 - what do you recommend - implementing a check on the CS and subsequent treatment depending on it or this is something that will work out of the box in iris 3+ - also do you think this is worth plopping on to SciTools GitHub? 🍺 |
The coordinate system issue has to be fixed regardless of this regridding problem, I think. It is a bit hairy though for two reasons: One is that a general approach with The regridding should be taken care of with the new regridding project. But perhaps an interim fix is sensible? |
@thomascrocker, @valeriupredoi, I put up PR #865 that should fix the regridding issue (at least I was able to run your recipe and data with it). Could you give it a test run? |
Thanks for developing a fix so quickly.. Looks like there is a problem though, I found that while the regridding appears to work correctly, it has the side effect of doing something to the latitude and longitude co-ordinates such that Iris can't read them.. This is what a print(cube) in python returns on the regrid cube produced by the processor
However, it looks like the coords are still there in the netcdf file
Could it be that the grid_mapping attribute needs removing from the variable? |
Yes, that does seem to do the trick. Will have a look how it snuck through... |
Turns out this is a resurfacing of #479. |
Hello, I'm attempting to use ESMValTool with data from the Met Office HadGEM3-RA regional model.
ESMValTool_issue_files.zip
I have successfully run my model output through an in house CMORization / mip_convert suite and added project lines to read the files to a custom config-developer.yml file (using the same CMOR tables as for CORDEX). My simple recipe and diagnostic run without any errors, and I've also not noticed any relevant warnings. However, something is going wrong with the regridding steps (and if I remove regridding steps from the recipe, I get problems with area extraction, as I'll explain later).
I set save_intermediary_cubes to true in my config_user.yml in order to examine the state of the data as it moves through the preprocessor. Regardless of the regridding scheme chosen, the data created immediately after the regridding step (in 05_regrid.nc) has all been set to zero, across every single grid point. (The data prior to this step is still intact.) I wondered if this is perhaps related to this open pull request #185 (If so, is the code under review likely to fix my problem?)
I also tried removing the regridding steps from my recipe entirely. In this case although again my recipe ran without any warnings or errors, the output was not as expected and I noticed something goes wrong at the point that the region extraction is carried out. This time the data immediately after the region extraction has all been set to 1e20 (masking?). Again, the preprocessed data immediately prior to this step is fine.
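On the "masking?" question: 1e20 is NumPy's default fill value for floating-point masked arrays, so a field that is 1e20 everywhere is consistent with every point having been masked and then filled on save. A small sketch for checking how much of an array is masked or equal to that fill value is given here; the helper name fraction_fill is hypothetical.

```python
import numpy as np


def fraction_fill(data, fill_value=1e20):
    """Return the fraction of points that are masked or equal to fill_value.

    Hypothetical helper; points that are already masked in the input are
    counted as fill as well.
    """
    masked = np.ma.masked_values(data, fill_value)
    return float(np.ma.getmaskarray(masked).mean())
```

A result of 1.0 on the data written after the region extraction step would suggest that the extraction selected no valid points at all.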
Any help appreciated, I'm hoping to use ESMValTool to perform analysis of some downscaled climate projections for the Caribbean that we have recently completed.
Thanks