This repository has been archived by the owner on Aug 28, 2023. It is now read-only.

Occasional failures in netcdf c library on data import #58

Open
CloudNiner opened this issue Aug 18, 2016 · 0 comments

Comments

@CloudNiner
Contributor

Stack trace from an EC2 worker. The same scenario/model/year combination did not error on a development machine.

INFO:climate_data:Processing SQS message for model IPSL-CM5A-MR scenario RCP85 year 2027
Failed to process data for model IPSL-CM5A-MR scenario RCP85 year 2027
Traceback (most recent call last):
 File "/opt/django/climate_change_api/climate_data/management/commands/run_jobs.py", line 58, in process_message
 Nex2DB().nex2db(variables, datasource)
 File "/opt/django/climate_change_api/climate_data/nex2db.py", line 105, in nex2db
 for label, path in variable_paths.iteritems()}
 File "/opt/django/climate_change_api/climate_data/nex2db.py", line 105, in <dictcomp>
 for label, path in variable_paths.iteritems()}
 File "/opt/django/climate_change_api/climate_data/nex2db.py", line 67, in get_var_data
 var_data = numpy.asarray(ds.variables[var_name])
 File "/usr/local/lib/python2.7/site-packages/numpy/core/numeric.py", line 482, in asarray
 return array(a, dtype, copy=False, order=order)
 File "netCDF4/_netCDF4.pyx", line 3249, in netCDF4._netCDF4.Variable.__array__ (netCDF4/_netCDF4.c:29477)
 File "netCDF4/_netCDF4.pyx", line 3695, in netCDF4._netCDF4.Variable.__getitem__ (netCDF4/_netCDF4.c:37072)
 File "netCDF4/_netCDF4.pyx", line 4376, in netCDF4._netCDF4.Variable._get (netCDF4/_netCDF4.c:46292)
RuntimeError: NetCDF: HDF error
ERROR:climate_data:Failed to process data for model IPSL-CM5A-MR scenario RCP85 year 2027
Traceback (most recent call last):
 File "/opt/django/climate_change_api/climate_data/management/commands/run_jobs.py", line 58, in process_message
 Nex2DB().nex2db(variables, datasource)
 File "/opt/django/climate_change_api/climate_data/nex2db.py", line 105, in nex2db
 for label, path in variable_paths.iteritems()}
 File "/opt/django/climate_change_api/climate_data/nex2db.py", line 105, in <dictcomp>
 for label, path in variable_paths.iteritems()}
 File "/opt/django/climate_change_api/climate_data/nex2db.py", line 67, in get_var_data
 var_data = numpy.asarray(ds.variables[var_name])
 File "/usr/local/lib/python2.7/site-packages/numpy/core/numeric.py", line 482, in asarray
 return array(a, dtype, copy=False, order=order)

Possibly related: Unidata/netcdf4-python#279
I didn't find much else after some searching; the other issues I came across were either clearly unrelated or already resolved in the version of the Python library we're using.

Note that we currently read the entire variable into memory. We can probably avoid that by indexing the netCDF variable directly for only the grid cells we need, rather than converting it to a numpy array first and then indexing:
https://github.com/azavea/climate-change-api/blob/develop/django/climate_change_api/climate_data/nex2db.py#L67
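A rough sketch of what direct indexing could look like. This is not the current `nex2db.py` code: `read_cells`, `get_var_cells`, and the `cells` argument are hypothetical names, and it assumes the variable's dimensions are ordered (time, lat, lon). It only reduces how much data each read pulls in; whether that avoids the HDF error is untested.

```python
def read_cells(variable, cells):
    """Read only the requested grid cells from a netCDF variable.

    `variable` is anything that supports netCDF4-style slicing, such as
    ds.variables[var_name]. `cells` is an iterable of (y, x) index pairs.
    Assumes the dimension order is (time, lat, lon).
    """
    # Each lookup asks the library for one time series at a single cell,
    # instead of materializing the whole variable via numpy.asarray().
    return {(y, x): variable[:, y, x] for (y, x) in cells}


def get_var_cells(path, var_name, cells):
    """Hypothetical wrapper: open the file and read the selected cells."""
    # Imported here so read_cells() can be exercised without netCDF4 installed.
    from netCDF4 import Dataset

    with Dataset(path, "r") as ds:
        return read_cells(ds.variables[var_name], cells)
```

Usage would be something like `get_var_cells(path, var_name, [(y1, x1), (y2, x2)])`, where the index pairs come from whatever grid cells the import actually needs. One read per cell may be slower than a single bulk read if most of the grid is needed, so this is a trade-off worth measuring.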
