Software Requirements
SR-01 Global data access function
The API shall offer a global function, or a method of a global object, that returns a dataset representation when given a dataset name and a list of optional selectors. The selectors allow the caller to subset the overall dataset held in the data store.
Possible solutions:
The following example shows how a local data store could be accessed:
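# Open a data store rooted at a local directory.
data_store = LocalDataStore('/home/norman/esa-cci-data')
# Raw strings keep the regex escapes (\d) intact; the predicate receives the named groups.
ozone_dataset = data_store.load(r'ozone/data/total_columns/l3/merged/v0100/'
                                r'ESACCI-OZONE-L3S-TC-MERGED-DLR_1M-(?P<year>\d\d\d\d)0104-fv0100.nc',
                                lambda kv: int(kv['year']) == 2012)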
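How such a pattern-plus-predicate selection might work internally can be sketched as follows. This is only an illustration, not the actual LocalDataStore implementation: the function name select_paths and the sample paths are hypothetical, and the sketch assumes that the pattern must match the whole relative path and that the predicate receives the named groups as a dictionary.

```python
import re

def select_paths(paths, pattern, predicate=None):
    """Return the paths that fully match `pattern` and satisfy `predicate`.

    Hypothetical helper: `predicate` is called with the dictionary of
    named regex groups, mirroring the lambda in the example above.
    """
    regex = re.compile(pattern)
    selected = []
    for path in paths:
        match = regex.fullmatch(path)
        if match is None:
            continue  # path does not belong to this dataset
        if predicate is None or predicate(match.groupdict()):
            selected.append(path)
    return selected

# Made-up relative paths standing in for the contents of a data tree.
prefix = 'ozone/data/total_columns/l3/merged/v0100/'
paths = [
    prefix + 'ESACCI-OZONE-L3S-TC-MERGED-DLR_1M-20110104-fv0100.nc',
    prefix + 'ESACCI-OZONE-L3S-TC-MERGED-DLR_1M-20120104-fv0100.nc',
    'readme.txt',
]
pattern = (r'ozone/data/total_columns/l3/merged/v0100/'
           r'ESACCI-OZONE-L3S-TC-MERGED-DLR_1M-(?P<year>\d\d\d\d)0104-fv0100.nc')

all_files = select_paths(paths, pattern)
files_2012 = select_paths(paths, pattern, lambda kv: int(kv['year']) == 2012)
```

Here all_files holds both ozone files and files_2012 only the 2012 file; a real store would go on to open the selected files rather than return their paths.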
This approach has the advantage that the LocalDataStore does not need to know anything about the contents of the source data tree. Its disadvantage is that users must know how the file tree is organised and how dataset files are named.
A smarter LocalDataStore could scan the tree, then create and maintain an index of the datasets available. Each dataset comprises a set of netCDF files or shapefiles, where each file contributes to a unique time series. For each dataset, the index would also provide the common file content schema (information about variables and dimensions), the spatial and temporal coverage, and other information such as the coordinate reference system used.
data_store = LocalDataStore('/home/norman/esa-cci-data')
data_store.dataset_info()
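The scanning step behind such an index could look roughly like the sketch below. It is not the design of dataset_info(): the function name build_index is hypothetical, and the sketch assumes, purely for illustration, that each directory containing netCDF files or shapefiles forms one dataset. Schema, coverage, and coordinate reference system would require opening the files and are left out.

```python
import os
import tempfile
from collections import defaultdict

# File suffixes treated as dataset files (netCDF and shapefiles).
DATA_SUFFIXES = ('.nc', '.shp')

def build_index(root):
    """Map each dataset name to the sorted list of its data files.

    Assumption for this sketch: a dataset is identified by the directory
    (relative to `root`) that contains its files.
    """
    index = defaultdict(list)
    for dir_path, _dir_names, file_names in os.walk(root):
        for file_name in sorted(file_names):
            if file_name.endswith(DATA_SUFFIXES):
                dataset_name = os.path.relpath(dir_path, root).replace(os.sep, '/')
                index[dataset_name].append(file_name)
    return dict(index)

# Demonstration on a throwaway tree with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    dataset_dir = os.path.join(root, 'ozone', 'v0100')
    os.makedirs(dataset_dir)
    for name in ('a-2011.nc', 'a-2012.nc', 'notes.txt'):
        open(os.path.join(dataset_dir, name), 'w').close()
    index = build_index(root)
```

In this demonstration the index contains a single dataset, 'ozone/v0100', listing the two .nc files; notes.txt is ignored. Maintaining the index would additionally mean rescanning or watching the tree for changes.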