Skip to content

Commit

Permalink
some docs updates (pydata#2746)
Browse files Browse the repository at this point in the history
* Friendlier io title.

* Fix lists.

* Fix *args, **kwargs

"inline emphasis..."

* misc

* Reference xarray_extras for csv writing. Closes pydata#2289

* Add metpy accessor. Closes pydata#461

* fix transpose docstring. Closes pydata#2576

* Revert "Fix lists."

This reverts commit 39983a5.

* Revert "Fix *args, **kwargs"

This reverts commit 1b9da35.

* Add MetPy to related projects.

* Add Weather and Climate specific page.

* Add hvplot.

* Note open_dataset, mfdataset open files as read-only (closes pydata#2345).

* Update metpy 1

Co-Authored-By: dcherian <[email protected]>

* Update doc/weather-climate.rst

Co-Authored-By: dcherian <[email protected]>
  • Loading branch information
dcherian authored and pletchm committed Mar 21, 2019
1 parent a63f057 commit 57f196c
Show file tree
Hide file tree
Showing 11 changed files with 202 additions and 153 deletions.
2 changes: 2 additions & 0 deletions doc/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ Documentation
* :doc:`reshaping`
* :doc:`combining`
* :doc:`time-series`
* :doc:`weather-climate`
* :doc:`pandas`
* :doc:`io`
* :doc:`dask`
Expand All @@ -70,6 +71,7 @@ Documentation
reshaping
combining
time-series
weather-climate
pandas
io
dask
Expand Down
13 changes: 8 additions & 5 deletions doc/io.rst
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
.. _io:

Serialization and IO
====================
Reading and writing files
=========================

xarray supports direct serialization and IO to several file formats, from
simple :ref:`io.pickle` files to the more flexible :ref:`io.netcdf`
format.
format (recommended).

.. ipython:: python
:suppress:
Expand Down Expand Up @@ -739,11 +739,14 @@ options are listed on the PseudoNetCDF page.
.. _PseudoNetCDF: http://github.com/barronh/PseudoNetCDF


Formats supported by Pandas
---------------------------
CSV and other formats supported by Pandas
-----------------------------------------

For more options (tabular formats and CSV files in particular), consider
exporting your objects to pandas and using its broad range of `IO tools`_.
For CSV files, one might also consider `xarray_extras`_.

.. _xarray_extras: https://xarray-extras.readthedocs.io/en/latest/api/csv.html

.. _IO tools: http://pandas.pydata.org/pandas-docs/stable/io.html

Expand Down
4 changes: 4 additions & 0 deletions doc/plotting.rst
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,10 @@ For more extensive plotting applications consider the following projects:
data structures for building even complex visualizations easily." Includes
native support for xarray objects.

- `hvplot <https://hvplot.pyviz.org/>`_: ``hvplot`` makes it very easy to produce
dynamic plots (backed by ``Holoviews`` or ``Geoviews``) by adding a ``hvplot``
accessor to DataArrays.

- `Cartopy <http://scitools.org.uk/cartopy/>`_: Provides cartographic
tools.

Expand Down
1 change: 1 addition & 0 deletions doc/related-projects.rst
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ Geosciences
- `aospy <https://aospy.readthedocs.io>`_: Automated analysis and management of gridded climate data.
- `infinite-diff <https://github.com/spencerahill/infinite-diff>`_: xarray-based finite-differencing, focused on gridded climate/meterology data
- `marc_analysis <https://github.com/darothen/marc_analysis>`_: Analysis package for CESM/MARC experiments and output.
- `MetPy <https://unidata.github.io/MetPy/dev/index.html>`_: A collection of tools in Python for reading, visualizing, and performing calculations with weather data.
- `MPAS-Analysis <http://mpas-analysis.readthedocs.io>`_: Analysis for simulations produced with Model for Prediction Across Scales (MPAS) components and the Accelerated Climate Model for Energy (ACME).
- `OGGM <http://oggm.org/>`_: Open Global Glacier Model
- `Oocgcm <https://oocgcm.readthedocs.io/>`_: Analysis of large gridded geophysical datasets
Expand Down
137 changes: 0 additions & 137 deletions doc/time-series.rst
Original file line number Diff line number Diff line change
Expand Up @@ -212,140 +212,3 @@ Data that has indices outside of the given ``tolerance`` are set to ``NaN``.
For more examples of using grouped operations on a time dimension, see
:ref:`toy weather data`.


.. _CFTimeIndex:

Non-standard calendars and dates outside the Timestamp-valid range
------------------------------------------------------------------

Through the standalone ``cftime`` library and a custom subclass of
:py:class:`pandas.Index`, xarray supports a subset of the indexing
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for
dates from non-standard calendars commonly used in climate science or dates
using a standard calendar, but outside the `Timestamp-valid range`_
(approximately between years 1678 and 2262).

.. note::

As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects
will be used to represent times (either in indexes, as a
:py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if
any of the following are true:

- The dates are from a non-standard calendar
- Any dates are outside the Timestamp-valid range.

Otherwise pandas-compatible dates from a standard calendar will be
represented with the ``np.datetime64[ns]`` data type, enabling the use of a
:py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[ns]``
and their full set of associated features.

For example, you can create a DataArray indexed by a time
coordinate with dates from a no-leap calendar and a
:py:class:`~xarray.CFTimeIndex` will automatically be used:

.. ipython:: python
from itertools import product
from cftime import DatetimeNoLeap
dates = [DatetimeNoLeap(year, month, 1) for year, month in
product(range(1, 3), range(1, 13))]
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
xarray also includes a :py:func:`~xarray.cftime_range` function, which enables
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For
instance, we can create the same dates and DataArray we created above using:

.. ipython:: python
dates = xr.cftime_range(start='0001', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:

- `Partial datetime string indexing`_ using strictly `ISO 8601-format`_ partial
datetime strings:

.. ipython:: python
da.sel(time='0001')
da.sel(time=slice('0001-05', '0002-02'))
- Access of basic datetime components via the ``dt`` accessor (in this case
just "year", "month", "day", "hour", "minute", "second", "microsecond",
"season", "dayofyear", and "dayofweek"):

.. ipython:: python
da.time.dt.year
da.time.dt.month
da.time.dt.season
da.time.dt.dayofyear
da.time.dt.dayofweek
- Group-by operations based on datetime accessor attributes (e.g. by month of
the year):

.. ipython:: python
da.groupby('time.month').sum()
- Interpolation using :py:class:`cftime.datetime` objects:

.. ipython:: python
da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])
- Interpolation using datetime strings:

.. ipython:: python
da.interp(time=['0001-01-15', '0001-02-15'])
- Differentiation:

.. ipython:: python
da.differentiate('time')
- Serialization:

.. ipython:: python
da.to_netcdf('example-no-leap.nc')
xr.open_dataset('example-no-leap.nc')
- And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:

.. ipython:: python
da.resample(time='81T', closed='right', label='right', base=3).mean()
.. note::


For some use-cases it may still be useful to convert from
a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`,
despite the difference in calendar types. The recommended way of doing this
is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex`
method:

.. ipython:: python
:okwarning:
modern_times = xr.cftime_range('2000', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(range(24), [('time', modern_times)])
da
datetimeindex = da.indexes['time'].to_datetimeindex()
da['time'] = datetimeindex
However in this case one should use caution to only perform operations which
do not depend on differences between dates (e.g. differentiation,
interpolation, or upsampling with resample), as these could introduce subtle
and silent errors due to the difference in calendar types between the dates
encoded in your data and the dates stored in memory.

.. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations
.. _ISO 8601-format: https://en.wikipedia.org/wiki/ISO_8601
.. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#partial-string-indexing
160 changes: 160 additions & 0 deletions doc/weather-climate.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,160 @@
.. _weather-climate:

Weather and climate data
========================

.. ipython:: python
:suppress:
import xarray as xr
``xarray`` can leverage metadata that follows the `Climate and Forecast (CF) conventions`_ if present. Examples include automatic labelling of plots with descriptive names and units if proper metadata is present (see :ref:`plotting`) and support for non-standard calendars used in climate science through the ``cftime`` module (see :ref:`CFTimeIndex`). There are also a number of geosciences-focused projects that build on xarray (see :ref:`related-projects`).

.. _Climate and Forecast (CF) conventions: http://cfconventions.org

.. _metpy_accessor:

CF-compliant coordinate variables
---------------------------------

`MetPy`_ adds a ``metpy`` accessor that allows accessing coordinates with appropriate CF metadata using generic names ``x``, ``y``, ``vertical`` and ``time``. There is also a `cartopy_crs` attribute that provides projection information, parsed from the appropriate CF metadata, as a `Cartopy`_ projection object. See `their documentation`_ for more information.

.. _`MetPy`: https://unidata.github.io/MetPy/dev/index.html
.. _`their documentation`: https://unidata.github.io/MetPy/dev/tutorials/xarray_tutorial.html#coordinates
.. _`Cartopy`: https://scitools.org.uk/cartopy/docs/latest/crs/projections.html

.. _CFTimeIndex:

Non-standard calendars and dates outside the Timestamp-valid range
------------------------------------------------------------------

Through the standalone ``cftime`` library and a custom subclass of
:py:class:`pandas.Index`, xarray supports a subset of the indexing
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for
dates from non-standard calendars commonly used in climate science or dates
using a standard calendar, but outside the `Timestamp-valid range`_
(approximately between years 1678 and 2262).

.. note::

As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects
will be used to represent times (either in indexes, as a
:py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if
any of the following are true:

- The dates are from a non-standard calendar
- Any dates are outside the Timestamp-valid range.

Otherwise pandas-compatible dates from a standard calendar will be
represented with the ``np.datetime64[ns]`` data type, enabling the use of a
:py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[ns]``
and their full set of associated features.

For example, you can create a DataArray indexed by a time
coordinate with dates from a no-leap calendar and a
:py:class:`~xarray.CFTimeIndex` will automatically be used:

.. ipython:: python
from itertools import product
from cftime import DatetimeNoLeap
dates = [DatetimeNoLeap(year, month, 1) for year, month in
product(range(1, 3), range(1, 13))]
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
xarray also includes a :py:func:`~xarray.cftime_range` function, which enables
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For
instance, we can create the same dates and DataArray we created above using:

.. ipython:: python
dates = xr.cftime_range(start='0001', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:

- `Partial datetime string indexing`_ using strictly `ISO 8601-format`_ partial
datetime strings:

.. ipython:: python
da.sel(time='0001')
da.sel(time=slice('0001-05', '0002-02'))
- Access of basic datetime components via the ``dt`` accessor (in this case
just "year", "month", "day", "hour", "minute", "second", "microsecond",
"season", "dayofyear", and "dayofweek"):

.. ipython:: python
da.time.dt.year
da.time.dt.month
da.time.dt.season
da.time.dt.dayofyear
da.time.dt.dayofweek
- Group-by operations based on datetime accessor attributes (e.g. by month of
the year):

.. ipython:: python
da.groupby('time.month').sum()
- Interpolation using :py:class:`cftime.datetime` objects:

.. ipython:: python
da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])
- Interpolation using datetime strings:

.. ipython:: python
da.interp(time=['0001-01-15', '0001-02-15'])
- Differentiation:

.. ipython:: python
da.differentiate('time')
- Serialization:

.. ipython:: python
da.to_netcdf('example-no-leap.nc')
xr.open_dataset('example-no-leap.nc')
- And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:

.. ipython:: python
da.resample(time='81T', closed='right', label='right', base=3).mean()
.. note::


For some use-cases it may still be useful to convert from
a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`,
despite the difference in calendar types. The recommended way of doing this
is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex`
method:

.. ipython:: python
:okwarning:
modern_times = xr.cftime_range('2000', periods=24, freq='MS', calendar='noleap')
da = xr.DataArray(range(24), [('time', modern_times)])
da
datetimeindex = da.indexes['time'].to_datetimeindex()
da['time'] = datetimeindex
However in this case one should use caution to only perform operations which
do not depend on differences between dates (e.g. differentiation,
interpolation, or upsampling with resample), as these could introduce subtle
and silent errors due to the difference in calendar types between the dates
encoded in your data and the dates stored in memory.

.. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations
.. _ISO 8601-format: https://en.wikipedia.org/wiki/ISO_8601
.. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#partial-string-indexing
2 changes: 1 addition & 1 deletion doc/whats-new.rst
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ Bug fixes
from higher frequencies to lower frequencies. Datapoints outside the bounds
of the original time coordinate are now filled with NaN (:issue:`2197`). By
`Spencer Clark <https://github.com/spencerkclark>`_.
- Line plots with the `x` argument set to a non-dimensional coord now plot the correct data for 1D DataArrays.
- Line plots with the ``x`` argument set to a non-dimensional coord now plot the correct data for 1D DataArrays.
(:issue:`27251). By `Tom Nicholas <http://github.com/TomNicholas>`_.
- Subtracting a scalar ``cftime.datetime`` object from a
:py:class:`CFTimeIndex` now results in a :py:class:`pandas.TimedeltaIndex`
Expand Down
Loading

0 comments on commit 57f196c

Please sign in to comment.