Data loaded from netcdf with units of "-" or "no_unit" does not save with the same meaning. #5368

Open
pp-mo opened this issue Jul 5, 2023 · 0 comments

🐛 Bug Report

When a file contains a variable with an attribute like <varname>:units = "-",
this loads to an Iris object with <obj>.units == cf_units.Unit('-'),
which prints as 'no_unit'.

This also happens with various other equivalent unit strings, as listed in
cf_units.__NO_UNIT= ['no_unit', '-', 'no unit', 'no-unit', 'nounit']

But when we save this, it is handled here, which treats it exactly the same as 'unknown'
(since Unit('-').is_udunits() is False).
The effect is that no "units" attribute is written on output, which does not match the input.

When the re-saved data is loaded back, it then has units 'unknown' instead of 'no_unit'.

This is basically wrong, since "no unit" means the data is actually known to be unitless (typically dimensionless),
which is not the same as not knowing what the units are.
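
To make the loss concrete, here is a minimal stand-alone sketch of the round trip described above. It does not call Iris or cf_units: the function names load_units and save_units_current are hypothetical, and the only thing taken from this report is the list of "no unit" strings and the rule that non-udunits units write no attribute.

```python
# Stand-alone model of the reported behaviour; NOT Iris or cf_units code.
# All names here are hypothetical and exist only for illustration.
NO_UNIT_STRINGS = {"no_unit", "-", "no unit", "no-unit", "nounit"}  # list quoted in this report
UNKNOWN = "unknown"
NO_UNIT = "no_unit"


def load_units(attrs):
    """Model of loading: map a variable's attributes to a units label."""
    raw = attrs.get("units")
    if raw is None:
        return UNKNOWN  # no "units" attribute means the units are not known
    if raw in NO_UNIT_STRINGS:
        return NO_UNIT  # explicitly marked as having no unit
    return raw


def save_units_current(units):
    """Model of the current save rule: non-udunits units write no attribute."""
    if units in (UNKNOWN, NO_UNIT):
        return {}
    return {"units": units}


original = load_units({"units": "-"})
reloaded = load_units(save_units_current(original))
print(original, "->", reloaded)  # no_unit -> unknown
```

In this model an ordinary unit string survives the round trip, but 'no_unit' comes back as 'unknown', which is the mismatch reported here.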

For example, this can be seen in the Iris test-data file
"test_data/NetCDF4/global/xyz_t/GEMS_CO2_Apr2006.nc",
where the 'lnsp' variable has an attribute units = "-". It is "Logarithm of surface pressure", which has no valid udunits representation, and so is marked as 'unitless'.

How To Reproduce

Steps to reproduce the behaviour:

>>> import iris
>>> import iris.cube
>>> 
>>> # Save a cube with 'no_unit' units, then load it back
>>> cube = iris.cube.Cube([1], units='-')
>>> iris.save(cube, 'tmp3.nc')
>>> loadback = iris.load_cube('tmp3.nc')
>>> 
>>> # Fix some other equality-destroying problems ...
>>> loadback.var_name = None
>>> del loadback.attributes['Conventions']
>>> 
>>> # (still) does not compare equal, because the units differ
>>> cube == loadback
False
>>> loadback.units = 'no_unit'
>>> 
>>> # Does compare equal once the units are restored
>>> cube == loadback
True

Expected behaviour

In the above, the re-loaded cube loadback should have the same units as the original cube.

Cubes and coords which have units of 'no_unit' should be saved with a units = "-" attribute,
rather than with no units attribute.
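
The expected rule can be sketched as follows. This is only an illustration under stated assumptions: units_attribute_for_save and units_from_attribute are hypothetical helpers, not Iris or cf_units functions, and units are modelled as plain strings rather than cf_units.Unit objects.

```python
# Hypothetical helpers modelling the proposed rule; NOT Iris or cf_units code.
NO_UNIT_STRINGS = {"no_unit", "-", "no unit", "no-unit", "nounit"}  # list quoted in this report


def units_attribute_for_save(units):
    """Return the 'units' attribute value to write, or None to write nothing."""
    if units == "unknown":
        return None  # nothing is known, so no attribute is written
    if units == "no_unit":
        return "-"  # known to be unitless, so record that explicitly
    return units


def units_from_attribute(value):
    """Model of loading: interpret a 'units' attribute value (or its absence)."""
    if value is None:
        return "unknown"
    if value in NO_UNIT_STRINGS:
        return "no_unit"
    return value


# Each of the three cases should come back unchanged after save + load.
round_trips = {
    units: units_from_attribute(units_attribute_for_save(units))
    for units in ("no_unit", "unknown", "m s-1")
}
print(round_trips)
```

With this rule 'no_unit' and 'unknown' stay distinct on disk, so a save followed by a load returns the original units in all three cases.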

BUT NOTE

We did earlier make a definite decision that we should treat all non-udunits units in the same way, i.e. not write a units attribute.
See #3711

This conflicts with the fact that several of our test files do include units of the type var:units = "-".
Notably, pretty much all of the 'LFRic' files here
But also notably, the file global/xyz_t/GEMS_CO2_Apr2006.nc, which is a GEMS file from ECMWF

This alternative view was also aired by rcomer in the original discussion ...

It seems to me that no_unit is a valid option to say “units aren’t meaningful here”. E.g. it’s used for all the string coords in iris.coord_categorisation. So it makes sense to me to continue saving it.

Environment

  • OS: Linux
  • Iris version: 3.6