Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load fx variables as cube cell measures / ancillary variables #999

Merged
merged 63 commits into from
Apr 30, 2021

Conversation

sloosvel
Copy link
Contributor

Description

This pull requests adds a new default preprocessor step called add_cell_measures. This step cmor-checks and loads areas and volumes as cell measures into a cube variable if the preprocessor steps area_statistics and volume_statistics are called in the recipe. Thus allowing the preprocessor chain to be applied in fx variables. If those steps are not included, nothing happens to the original cube.

It should also solve a couple of issues that there are with the current fx handling:

  • If the volume is time dependent, all files are loaded instead of only keeping the first file.
  • I found that the current implementation allows to define the fx_variables option as both a list and a list of dicts. Instead this pull requests proposes to call the fx_variables using YAML-file identations so that when the recipe is loaded, it is always loaded as a dict. Because allowing both lists and dicts ends up complicating the the code.

I open this as a draft because there may be some issues in terms of where to define this preprocessor step (it's added in _other.py, but that may not be the best option). Also because I had to modify code that I did not originally write, so even though tests are passing, maybe I changed something that should not have been changed.


Before you get started

Checklist

If you make backwards incompatible changes to the recipe format:

  • Update ESMValTool and link the pull request(s) in the description

To help with the number pull requests:

@sloosvel sloosvel added the preprocessor Related to the preprocessor label Feb 10, 2021
Copy link
Member

@bouweandela bouweandela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work @sloosvel, it would be great if you could make the pull request ready for review!

Comment on lines 83 to 84
mip = loaded_cube[0].attributes['table_id']
freq = loaded_cube[0].attributes['frequency']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to pass these in as function arguments rather than relying on the input NetCDF files to have these global attributes

@valeriupredoi
Copy link
Contributor

nice work @sloosvel - could you please address the Codacy complaints? I will have a good look after that, still stuck in Tool-land at the moment 👍

@sloosvel sloosvel marked this pull request as ready for review February 18, 2021 17:28
Copy link
Member

@bouweandela bouweandela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think this is a very nice new feature. Could you please have a look at the comments?

Could you also add documentation for the new preprocessor function add_cell_measure to doc/recipe/preprocessor.rst ?

Are you planning to add another very similar preprocessor function for adding other ancillary variables in a follow up pull request? E.g. land-sea masks for the masking preprocessor functions?

esmvalcore/preprocessor/_area.py Show resolved Hide resolved
return grid_areas


# get the area average
def area_statistics(cube, operator, fx_variables=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the fx_variables argument now be removed from this function, since they are now available in the cube?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that, but then how would someone specify in the recipe the fx file that is needed?

@@ -388,7 +391,11 @@ experiment is preferred for fx data retrieval:
landmask:
mask_landsea:
mask_out: sea
fx_variables: [{'short_name': 'sftlf', 'exp': 'piControl'}, {'short_name': 'sftof', 'exp': 'piControl'}]
fx_variables:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new syntax, but could you make it backward compatible so the old syntax keeps working, at least for a while?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe until version 2.4 or 2.5?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try, but it won't look too good I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yesh, I was asking a related question in my previous comment. I agree with @bouweandela on this one, when a new functional feature is implemented it's always nice to have a temporary overlap with its previous syntax 👍

@@ -41,7 +41,7 @@
mask_outside_range,
)
from ._multimodel import multi_model_statistics
from ._other import clip
from ._other import clip, add_cell_measure
Copy link
Member

@bouweandela bouweandela Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's added in _other.py, but that may not be the best option).

Maybe you could add a new module esmvalcore/preprocessor/_ancillary_variables.py and keep the code there? I feel that you are right and other is not very clear.

if cube.data.ndim == 4 and grid_volume.ndim == 3:
grid_volume = np.tile(grid_volume,
[cube_shape[0], 1, 1, 1])
if not fx_variables:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this function, I think the fx_variables argument can be removed now, right?

@valeriupredoi
Copy link
Contributor

@sloosvel I am very happy for this PR to be merged asap, I have gone through it in detail and I reckon it's a very nice one and you've answered all my queries (cheers for being patient with me!). The only thing preventing a merge is the broken test on Circle that is not broken in reality, I'd like @bouweandela to shed light on this via #1063 (when he's got time) and then we'll get this in right away 🍺

@valeriupredoi valeriupredoi added the enhancement New feature or request label Mar 30, 2021
@bouweandela
Copy link
Member

bouweandela commented Apr 19, 2021

The only thing preventing a merge is the broken test on Circle that is not broken in reality, I'd like @bouweandela to shed light on this via #1063 (when he's got time) and then we'll get this in right away 🍺

@valeriupredoi The test is fixed now, let's get this merged if you think it's ready! If needed, we can continue the discussion on the tests in the issue. I won't be back at work for another two weeks, let's not wait for that.

@sloosvel
Copy link
Contributor Author

Actually I have been wondering if it would be better to remove the ancillary vars/ measures after the preproc steps that need them finish.

@valeriupredoi
Copy link
Contributor

@bouweandela cheers! @sloosvel all fine and ready for merge by me, just let me know if you think this is ready-ready 👍 BTW I am totally fine with the Codacy issues too 🍺

@bouweandela
Copy link
Member

@sloosvel I think that would be better, maybe just before the save step? Because of the broadcasting, keeping them would result in rather large output files.

@valeriupredoi
Copy link
Contributor

good to merge @sloosvel ? I feel the need for speed(merge) 😁

@sloosvel
Copy link
Contributor Author

Well it seemed to work for what I have tested but if anyone else wants to double check they are welcome to!

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Apr 30, 2021

I am very happy to merge now and fix whatevs later, I tested this and it works - if there are any corner cases we can always open new PRs, this is great amount of work and added functionality that needs not wait no more 👍

@valeriupredoi valeriupredoi merged commit fee5d54 into master Apr 30, 2021
@valeriupredoi valeriupredoi deleted the dev_load_cell_measures_in_cube branch April 30, 2021 12:09
@valeriupredoi
Copy link
Contributor

thank you very very much again @sloosvel 🍺

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request preprocessor Related to the preprocessor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Preprocessor chain is not being applied to cell measures
3 participants