Currently only scalars and 3-vectors are handled by the translation from CDF to xarray.Dataset (here). 3-vectors are given the dimension label "dim", which I implemented just to handle the MAG B_NEC data.
The proper solution is for every dimension in the data to be given an appropriate label. I am not sure whether this information is in the original CDF files; otherwise it will just have to be hard-coded for every variable. It would make sense to do this on the server and build and send a netCDF file from there instead.
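Something like this sketch is what I have in mind for the hard-coded mapping (the `DIMENSION_LABELS` dict and its label names are made up for illustration, not the actual implementation):

```python
import numpy as np
import xarray as xr

# Hypothetical per-variable dimension labels - would be hard-coded here
# unless the labels turn out to be recoverable from the CDF metadata
DIMENSION_LABELS = {
    "B_NEC": ["NEC"],  # 3-vector in the NEC frame
}

def make_dataset(times, variables):
    """Build a Dataset, labelling each variable's trailing dimensions."""
    data_vars = {}
    for name, values in variables.items():
        # Fall back to generated labels for unmapped multi-dim variables
        extra_dims = DIMENSION_LABELS.get(
            name, [f"{name}_dim{i}" for i in range(values.ndim - 1)]
        )
        data_vars[name] = (["Timestamp"] + list(extra_dims), values)
    return xr.Dataset(data_vars, coords={"Timestamp": times})

times = np.array(
    ["2020-01-01T00:00:00", "2020-01-01T00:00:01"], dtype="datetime64[ns]"
)
ds = make_dataset(times, {
    "F": np.array([50000.0, 50001.0]),  # scalar -> 1D, just Timestamp
    "B_NEC": np.zeros((2, 3)),          # 3-vector -> labelled "NEC"
})
print(ds["B_NEC"].dims)  # ('Timestamp', 'NEC')
```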
The same applies to adding metadata (units etc. - e.g. cdf.varattsget("F") -> {'DESCRIPTION': 'Magnetic field intensity', 'UNITS': 'nT'}, and global attributes for ORIGINAL_PRODUCT_NAMES, MAGNETIC_MODELS, ...). This is particularly useful because xarray uses these attributes for plotting: http://xarray.pydata.org/en/stable/plotting.html#one-dimension
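A rough sketch of attaching those attributes - the variable-attribute dict is the varattsget("F") example above; everything else (values, global attribute contents) is made up:

```python
import numpy as np
import xarray as xr

# As returned by cdf.varattsget("F") in the example above
var_atts = {"DESCRIPTION": "Magnetic field intensity", "UNITS": "nT"}

ds = xr.Dataset({"F": (["Timestamp"], np.array([50000.0, 50001.0]))})

# Per-variable attributes go on the DataArray; xarray's plotting uses
# "units" (and "long_name") when labelling axes
ds["F"].attrs["units"] = var_atts["UNITS"]
ds["F"].attrs["description"] = var_atts["DESCRIPTION"]

# Global attributes go on the Dataset itself (contents made up here)
ds.attrs["MAGNETIC_MODELS"] = []

print(ds["F"].attrs["units"])  # nT
```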
The xarray.Dataset/netCDF output (an xarray.Dataset maps directly to a netCDF file) should probably follow the netCDF-CF conventions - this is in line with Aeolus (I think).
It would also be good to look at making the xarray.Dataset creation faster. The main slowdown is probably the pandas.to_datetime() call (the same applies to the pandas.DataFrame conversion). xarray.concat() is also very slow on very large datasets - I found that a file of a few GB took over 30 minutes to convert to an xarray.Dataset. This is further justification for building this on the server instead.
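For the timestamp conversion specifically, a fully vectorized path avoids per-element parsing. A toy comparison (using milliseconds since the Unix epoch - the real CDF epoch type and offset differ, so this is only to show the two vectorized forms):

```python
import numpy as np
import pandas as pd

# Stand-in timestamp array: integer milliseconds since the Unix epoch
ms = np.arange(0, 1_000_000, 10, dtype="int64")

# Both of these do one C-level pass over the array, rather than
# parsing element by element:
t1 = pd.to_datetime(ms, unit="ms")                          # pandas
t2 = ms.astype("datetime64[ms]").astype("datetime64[ns]")   # pure NumPy

assert (t1.values == t2).all()
print(t1[0])  # 1970-01-01 00:00:00
```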
In the (probably far) future, I think we could make use of sparse xarray so that the Lat/Lon/Rad dimensions (empty except for one point each) can be filled in, instead of just using a "flat" time series. That way we would build a "data cube", and 2D plotting and other things could be done directly. (I could be wrong here, or most likely there is some other way to achieve this.)
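One way to get something cube-like today, without sparse support, is unstacking the position coordinates into a dense (mostly-NaN) grid. A toy sketch with made-up coordinates and variable names:

```python
import numpy as np
import xarray as xr

# Toy "flat" time series where each sample carries a position
n = 6
ds = xr.Dataset(
    {"F": (["Timestamp"], np.arange(n, dtype=float))},
    coords={
        "Timestamp": np.arange(n),
        "Latitude": ("Timestamp", [0, 0, 0, 10, 10, 10]),
        "Longitude": ("Timestamp", [0, 120, 240, 0, 120, 240]),
    },
)

# Promote the position coords to a MultiIndex on the Timestamp dim,
# then unstack into a Latitude x Longitude grid; combinations with no
# sample become NaN (a dense version of the sparse-array idea)
cube = ds.set_index(Timestamp=["Latitude", "Longitude"]).unstack("Timestamp")
print(cube["F"].sel(Latitude=10, Longitude=120).item())  # 4.0
```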
Metadata is now included in the produced xarray.Dataset.
Global attributes (accessible as ds.attrs): "Sources", "MagneticModels", "RangeFilters"
Variable attributes (ds[x].attrs): "units", "description"
Multi-dimensional variables are now set up with appropriate xarray dimensions and coordinate labels.
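For reference, accessing these on a returned dataset looks like the following - the Dataset here is a hand-built stand-in with made-up values, not actual client output:

```python
import numpy as np
import xarray as xr

# Stand-in for a Dataset returned by the client, carrying the
# attribute names listed above (all values are made up)
ds = xr.Dataset(
    {"F": (["Timestamp"], np.array([50000.0]))},
    attrs={"Sources": ["SW_OPER_MAGA_LR_1B"],
           "MagneticModels": [],
           "RangeFilters": []},
)
ds["F"].attrs = {"units": "nT", "description": "Magnetic field intensity"}

print(ds.attrs["Sources"])     # ['SW_OPER_MAGA_LR_1B']
print(ds["F"].attrs["units"])  # nT
```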
See also: http://xarray.pydata.org/en/stable/faq.html#what-is-your-approach-to-metadata