Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File open fails when lon and lat are variables (not dimensions) #32

Closed
durack1 opened this issue Nov 22, 2016 · 32 comments
Closed

File open fails when lon and lat are variables (not dimensions) #32

durack1 opened this issue Nov 22, 2016 · 32 comments
Assignees
Milestone

Comments

@durack1
Copy link
Member

durack1 commented Nov 22, 2016

The attached file is unreadable in the current version of cdms2 (v2.8). I believe the issue is that the variables (not dimensions) lon and lat included in the file are assumed to be standard dimensions (which they aren't), and these assumptions lead to a read failure.

cdmsReadFail.zip

The file looks like:

ncdump -h 161122_RobertPincus_multiple_input4MIPs_radiation_RFMIP_UColorado-RFMIP-20161122_none.nc 
netcdf \161122_RobertPincus_multiple_input4MIPs_radiation_RFMIP_UColorado-RFMIP-20161122_none {
dimensions:
	expt = 18 ;
	level = 61 ;
	layer = 60 ;
	site = 100 ;
variables:
	float lon(site) ;
		lon:long_name = "ERA-Interim longitude" ;
		lon:units = "degree_north" ;
		lon:standard_name = "longitude" ;
	float lat(site) ;
		lat:long_name = "ERA-Interim latitude" ;
		lat:units = "degree_east" ;
		lat:standard_name = "latitude" ;
	float time(site) ;
		time:long_name = "ERA-Interim fractional day of the year 2014" ;
		time:units = "days since 2014-1-1 0:0:0" ;
		time:standard_name = "time" ;
		time:calendar = "gregorian" ;
	float sst(site) ;
		sst:title = "sea surface temperature" ;
		sst:units = "K" ;
		sst:long_name = "ERA-Interim sea surface temperature (= \"missing_value\" over land)" ;
		sst:standard_name = "sea_surface_temperature" ;
		sst:missing_value = -9.99f ;
		sst:FillValue = -9.99f ;
		sst:coordinates = "lon lat time" ;
...

And the read failure looks like:

In [1]: infile = '161122_RobertPincus_multiple_input4MIPs_radiation_RFMIP_UColorado-RFMIP-20161122_none.nc'
In [2]: import cdms2 as cdm
In [3]: f = cdm.open(infile)
---------------------------------------------------------------------------
CDMSError                                 Traceback (most recent call last)
<ipython-input-3-da3bd29babf2> in <module>()
----> 1 f = cdm.open(infile)
/uvcdat280/lib/python2.7/site-packages/cdms2/dataset.pyc in openDataset(uri, mode, template, dods, dpath, hostObj)
    357 
    358             # The file exists
--> 359             file1 = CdmsFile(path, "r")
    360             if libcf is not None:
    361                 if hasattr(file1, libcf.CF_FILETYPE):
/uvcdat280/lib/python2.7/site-packages/cdms2/dataset.pyc in __init__(self, path, mode, hostObj, mpiBarrier)
   1137                 # Get grid information for the variable. gridkey has the form
   1138                 # (latname,lonname,order,maskname, abstract_class).
-> 1139                 gridkey, lat, lon = var.generateGridkey(self._convention_, self.variables)
   1140 
   1141                 # If the variable is gridded, lookup the grid. If no such grid exists,
/uvcdat280/lib/python2.7/site-packages/cdms2/avariable.pyc in generateGridkey(self, convention, vardict)
    239         lon, nlon = convention.getVarLonId(self, vardict)
    240         if (lat is not None) and (lat is lon):
--> 241             raise CDMSError, "Axis %s is both a latitude and longitude axis! Check standard_name and/or axis attributes."%lat.id
    242 
    243         # Check for 2D grid
CDMSError: Axis lon is both a latitude and longitude axis! Check standard_name and/or axis attributes.
@doutriaux1
Copy link
Contributor

when did you create your conda env? I uploaded an updated cdms2 last night.

@durack1
Copy link
Member Author

durack1 commented Nov 22, 2016

@doutriaux1 here we go:

(uvcdat280) bash-3.2$ ls -ald /uvcdat280/lib/python2.7/site-packages/cdms2
drwxr-xr-x  100 duro  THE-LAB\Domain Users  3400 Nov 16 15:33 /uvcdat280/lib/python2.7/site-packages/cdms2
(uvcdat280) bash-3.2$ conda update cdms2
Fetching package metadata .......
# All requested packages already installed.
# packages in environment at /uvcdat280:
#
cdms2                     2.8                 np111py27_2    uvcdat

@doutriaux1
Copy link
Contributor

please do a force

@durack1
Copy link
Member Author

durack1 commented Nov 22, 2016

@doutriaux1 same issue, it's a problem with cdms:

(uvcdat280) bash-3.2$ conda update cdms2 --force
Fetching package metadata .......
Solving package specifications: ..........
Package plan for installation in environment /uvcdat280:
The following NEW packages will be INSTALLED:
    cdms2: 2.8-np111py27_2 uvcdat
Proceed ([y]/n)? y
Pruning extracted packages from the cache ...
[      COMPLETE      ]|########################################################################################################################################################| 100%
Extracting packages ...
[      COMPLETE      ]|########################################################################################################################################################| 100%
Linking packages ...
(uvcdat280) bash-3.2$ ipython
Python 2.7.12 | packaged by conda-forge | (default, Sep  8 2016, 14:41:48) 
Type "copyright", "credits" or "license" for more information.
IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
In [1]: import cdms2 as cdm
In [2]: infile = '161122_RobertPincus_multiple_input4MIPs_radiation_RFMIP_UColorado-RFMIP-20161122_none.nc'
In [3]: f = cdm.open(infile)
---------------------------------------------------------------------------
CDMSError                                 Traceback (most recent call last)
<ipython-input-3-da3bd29babf2> in <module>()
----> 1 f = cdm.open(infile)
/uvcdat280/lib/python2.7/site-packages/cdms2/dataset.pyc in openDataset(uri, mode, template, dods, dpath, hostObj)
    357 
    358             # The file exists
--> 359             file1 = CdmsFile(path, "r")
    360             if libcf is not None:
    361                 if hasattr(file1, libcf.CF_FILETYPE):
/uvcdat280/lib/python2.7/site-packages/cdms2/dataset.pyc in __init__(self, path, mode, hostObj, mpiBarrier)
   1137                 # Get grid information for the variable. gridkey has the form
   1138                 # (latname,lonname,order,maskname, abstract_class).
-> 1139                 gridkey, lat, lon = var.generateGridkey(self._convention_, self.variables)
   1140 
   1141                 # If the variable is gridded, lookup the grid. If no such grid exists,
/uvcdat280/lib/python2.7/site-packages/cdms2/avariable.pyc in generateGridkey(self, convention, vardict)
    239         lon, nlon = convention.getVarLonId(self, vardict)
    240         if (lat is not None) and (lat is lon):
--> 241             raise CDMSError, "Axis %s is both a latitude and longitude axis! Check standard_name and/or axis attributes."%lat.id
    242 
    243         # Check for 2D grid
CDMSError: Axis lon is both a latitude and longitude axis! Check standard_name and/or axis attributes.

@durack1
Copy link
Member Author

durack1 commented Nov 22, 2016

@doutriaux1 curiously, in a previous (and not identical) version of the file, if the lon:units were changed from "degree_north" to "degree north" (remove the underscore) it seemed to be readable by the checker/cdms

@dnadeau4 pinging you into this thread

@dnadeau4
Copy link
Contributor

degree_north and degree_east are accepted and CDMS2 should work with this file. cf-conventions

I will enhanced CDMS2 to make sure your lat/lon are detected correctly

@dnadeau4 dnadeau4 self-assigned this Nov 22, 2016
@dnadeau4 dnadeau4 added this to the 3.0 milestone Nov 22, 2016
@durack1
Copy link
Member Author

durack1 commented Nov 23, 2016

@dnadeau4 out of curiosity, I tried to update the file to include lon:axis = "X" and similar for lat and I still get the identical error:

[durack1@oceanonly RFMIP]$ ncdump -h test.nc | more                                                                                  
netcdf test {                                                                                                                        
dimensions:                                                                                                                          
        expt = 18 ;                                                                                                                  
        level = 61 ;                                                                                                                 
        layer = 60 ;                                                                                                                 
        site = 100 ;                                                                                                                 
variables:                                                                                                                           
        float lon(site) ;                                                                                                            
                lon:long_name = "ERA-Interim longitude" ;                                                                            
                lon:units = "degree_north" ;                                                                                         
                lon:standard_name = "longitude" ;                                                                                    
                lon:axis = "X" ;                                                                                                     
        float lat(site) ;                                                                                                            
                lat:long_name = "ERA-Interim latitude" ;                                                                             
                lat:units = "degree_east" ;                                                                                          
                lat:standard_name = "latitude" ;                                                                                     
                lat:axis = "Y" ;                                                                                                     
        float time(site) ;                                                                                                           
                time:long_name = "ERA-Interim fractional day of the year 2014" ;                                                     
                time:units = "days since 2014-1-1 0:0:0" ;                                                                           
                time:standard_name = "time" ;                                                                                        
                time:calendar = "gregorian" ;                                                                                        
        float sst(site) ;                                                                                                            
                sst:title = "sea surface temperature" ;                                                                              
                sst:units = "K" ;                                                                                                    
                sst:long_name = "ERA-Interim sea surface temperature (= \"missing_value\" over land)" ;                              
                sst:standard_name = "sea_surface_temperature" ;                                                                      
                sst:missing_value = -9.99f ;                                                                                         
                sst:FillValue = -9.99f ;                                                                                             
                sst:coordinates = "lon lat time" ;

And the error:

(uvcdatNightly) duri@ocean:[RFMIP]:[2858]> ipython
Python 2.7.12 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:42:40)
Type "copyright", "credits" or "license" for more information.
IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
In [1]: import cdms2 as cdm
In [2]: infile = 'test.nc'
In [3]: f = cdm.open(infile)
---------------------------------------------------------------------------
CDMSError                                 Traceback (most recent call last)
<ipython-input-3-da3bd29babf2> in <module>()
----> 1 f = cdm.open(infile)
/uvcdatNightly/lib/python2.7/site-packages/cdms2/dataset.pyc in openDataset(uri, mode, template, dods, dpath, hostObj)
    307
    308             # The file exists
--> 309             file1 = CdmsFile(path,"r")
    310             if libcf is not None:
    311                 if hasattr(file1, libcf.CF_FILETYPE):
/uvcdatNightly/lib/python2.7/site-packages/cdms2/dataset.pyc in __init__(self, path, mode, hostObj, mpiBarrier)
   1063                 # Get grid information for the variable. gridkey has the form
   1064                 # (latname,lonname,order,maskname, abstract_class).
-> 1065                 gridkey, lat, lon = var.generateGridkey(self._convention_, self.variables)
   1066
   1067                 # If the variable is gridded, lookup the grid. If no such grid exists,
/uvcdatNightly/lib/python2.7/site-packages/cdms2/avariable.pyc in generateGridkey(self, convention, vardict)
    236         lon, nlon = convention.getVarLonId(self, vardict)
    237         if (lat is not None) and (lat is lon):
--> 238             raise CDMSError, "Axis %s is both a latitude and longitude axis! Check standard_name and/or axis attributes."%lat.id
    239
    240         # Check for 2D grid
CDMSError: Axis lon is both a latitude and longitude axis! Check standard_name and/or axis attributes.

@durack1
Copy link
Member Author

durack1 commented Nov 23, 2016

@dnadeau4 if you amend the lon:units = "degree north" (remove the underscore) it loads fine:

(uvcdatNightly) duro@ocean:[RFMIP]:[2858]> ipython
Python 2.7.12 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:42:40)
Type "copyright", "credits" or "license" for more information.
IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
In [1]: import cdms2 as cdm
In [2]: infile = 'test_lonUnitTweak.nc'
In [3]: f = cdm.open(infile)
In [4]:

And for completeness the ncdump of the top of the file:

netcdf test_lonUnitTweak {
dimensions:
        expt = 18 ;
        level = 61 ;
        layer = 60 ;
        site = 100 ;
variables:
        float lon(site) ;
                lon:long_name = "ERA-Interim longitude" ;
                lon:standard_name = "longitude" ;
                lon:axis = "X" ;
                lon:units = "degree north" ;
        float lat(site) ;
                lat:long_name = "ERA-Interim latitude" ;
                lat:units = "degree_east" ;
                lat:standard_name = "latitude" ;
                lat:axis = "Y" ;

@durack1
Copy link
Member Author

durack1 commented Jan 10, 2017

@dnadeau4 just wondering where this issue is sitting? We're hoping to get @RobertPincus's data published in the input4MIPs project, however the publisher/cdms can't deal with his data currently..

@dnadeau4
Copy link
Contributor

Nothing yet, I am working on issue #7. @doutriaux1 gave me a priority list check label #priority and talk with @doutriaux1.

@durack1
Copy link
Member Author

durack1 commented Jan 15, 2017

@sashakames with #76 it seems that @dnadeau4 has solved this issue and so this version of bug-fixed cdms will need to be pulled across so the ESGF publisher can open and publish the file (attached above). Once this PR is merged, can we prioritise this bug fix patch into ESGF?

@dnadeau4
Copy link
Contributor

@durack1 this is now working. The issue was the wrong units for latitude/longitude. I used nco to change them and now cdms2 works.

ncatted -O -h -a units,lon,o,c,degree_east ~/Downloads/161122_RobertPincus_multiple_input4MIPs_radiation_RFMIP_UColorado-RFMIP-20161122_none.nc
ncatted -O -h -a units,lat,o,c,degree_north ~/Downloads/161122_RobertPincus_multiple_input4MIPs_radiation_RFMIP_UColorado-RFMIP-20161122_none.nc

@durack1
Copy link
Member Author

durack1 commented Jan 18, 2017

@dnadeau4 the issue wasn't incorrect units, these were changed to prevent the CF checker (which uses a much earlier version of "chat-lite") from crashing.. I confirmed the same behaviour in the 2.8 cdms release and hence this issue

@durack1
Copy link
Member Author

durack1 commented Jan 18, 2017

@dnadeau4 if you see the original ncdump output, these are the same as your ncatted edits above

@dnadeau4
Copy link
Contributor

@durack1 I was able to work with these files when I changed the units. Why is longitudes degrees_north in the original ncdump?

@dnadeau4
Copy link
Contributor

degree is an acceptable unit in CF-1 conventions. When you set degree north I assumed CDMS used the first word degree. But if you set longitude to degree_north the command isLatitude returns true. That is why you get the error that lat is equal to lon.

@dnadeau4
Copy link
Contributor

longitude needs to be set to degree_east

@sashakames
Copy link

sashakames commented Jan 18, 2017 via email

@durack1
Copy link
Member Author

durack1 commented Jan 18, 2017

dnadeau4 ok now I get you.. So the unit values were flipped lon:units = "degree_north" ; should have been lon:units = "degree_east" ; and likewise for lat.. I still think the software should fix (or at the very least point out) dumb user errors like this.. So checking standard_name or longname for longitude or latitude string matches rather than bombing like it did and not specifically pointing out that the units for the dimensions (longitude and latitude) are incompatible with the software (and conventions)

I wonder if there would be a way, in the case of a failure testing the units against the possible values and returning a "units invalid" error, rather than ungracefully exiting as it does. The fact that cdms can't read this file (due to the units error) is kinda worrying to me..

@durack1
Copy link
Member Author

durack1 commented Jan 18, 2017

@dnadeau4 you might want to create a user error tag, and assign it here.. I know @doutriaux1 likes that label alot

@doutriaux1
Copy link
Contributor

@dnadeau4 I think @durack1 is right here, can we catch the error and return a user understandable error here. Thanks.

@doutriaux1
Copy link
Contributor

doutriaux1 commented Jan 18, 2017

@durack1 see, as I 'm getting older I become weaker (and wiser?) and I'm willing to listen to users even when they are caught in a flagrant user error

@jypeter
Copy link
Member

jypeter commented Jan 18, 2017

That's good @doutriaux1, because our new users not only do not read the docs, but they tend to assume that things will work as they expect (regardless of what the functions are actually supposed to do). They must believe in magic and fairy tales

@durack1
Copy link
Member Author

durack1 commented Jan 18, 2017

@doutriaux1 that robust tag may indeed get awarded one day.. My philosophy here is to fix what you can irrespective of how trivial the error is.. The fact that the units were flipped avoided the eyes of a number of folks before @dnadeau4 correctly pointed the issue out.. I think the software should have pointed this out first and suggested (or automagically implemented) a fix.

@doutriaux1
Copy link
Contributor

@durack1 agreed the software should point this out to the user, I am personally STRONGLY opposed to to fixing things automatically for the user. I refuse to assume that I am smarter than the user, there might be a good reason for this that we don't know of. All we can do is let the user know that we can't go forward with the data the way it is.

@dnadeau4
Copy link
Contributor

dnadeau4 commented Jan 19, 2017

I guess I could try to use "standard_name" if "units" is not right for variables lat/lon and warn the user.

If there is an attribute axis='Y' then we are done! To pass the test, the variable must start wit 'lat' (mixed cases are fine).

    def isLatitude(self):                                                                                                                                    
        id = self.id.strip().lower()                                                                                                                         
        if (hasattr(self,'axis') and self.axis=='Y'): return 1                                                                                               
        units = getattr(self,"units","").strip().lower()                                                                                                     
        if units in ["degrees_north","degree_north","degree_n","degrees_n","degreen","degreesn"]:                                                            
          return 1                                                                                                                                           
        return (id[0:3] == 'lat') or (id in latitude_aliases)                                                                                                

@durack1
Copy link
Member Author

durack1 commented Jan 19, 2017

@dnadeau4 there is always the range check approach too, so for a valid latitude this can only occupy the range between -90.0 and 90.0, and for longitude this can only be -180.0 to 360.0 any other values are an invalid variable

The preferred behaviour to me would be for these CF variables and attributes to be set if they can be found (the current searches are fine), however if anything fails, rather than aborting the read, throw a descriptive error and return the file handle. This way a user is still able to operate on the file, but won't have much (or any?) of the cdms transientVariable magical stuff to work with. In the test case above this would allow me for example to work with the file and its contents, and what I would do is then read out all the information/variables from the file into memory, and then use cdms to rewrite the file correctly

@doutriaux1
Copy link
Contributor

range depends on the units though. But it's probably worth adding for known units.

@doutriaux1
Copy link
Contributor

we can't enforce a valid range for longitudes though

@durack1
Copy link
Member Author

durack1 commented Jan 20, 2017

@doutriaux1 why no valid range for longitudes? These values have to sit within the range -180 to 360 degrees. That deals with the -180 to 180 grids and the 0 to 360 grids too, what other ranges are required?

@dnadeau4
Copy link
Contributor

@doutriaux1 Can CDMS read other projections like Mercator? If so, I cannot check the range the @durack1 asked.

@doutriaux1
Copy link
Contributor

@dnadeau4 as i stated range depends on units, I think mercator and such will have units in m or something like that. So you could check range based on units degrees_north or radians or such

dnadeau4 added a commit that referenced this issue Feb 7, 2017
@dnadeau4 dnadeau4 added bug and removed invalid labels Feb 10, 2017
doutriaux1 pushed a commit that referenced this issue Feb 21, 2017
* Fix #32 for input4MIPs

* Fix #32: change Return 1 and return 0 -- to return True and return False

* add test for inpu4MIPs

* fix #78 Read NC_STRING variables [Can't write]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants