Stop loading tutorial data by default #2538

Merged
merged 4 commits into from
Nov 5, 2018
2 changes: 1 addition & 1 deletion doc/indexing.rst
@@ -411,7 +411,7 @@ can use indexing with ``.loc`` :

.. ipython:: python

-ds = xr.tutorial.load_dataset('air_temperature')
+ds = xr.tutorial.open_dataset('air_temperature')

#add an empty 2D dataarray
ds['empty']= xr.full_like(ds.air.mean('time'),fill_value=0)
2 changes: 1 addition & 1 deletion doc/interpolation.rst
@@ -262,7 +262,7 @@ Let's see how :py:meth:`~xarray.DataArray.interp` works on real data.
.. ipython:: python

# Raw data
-ds = xr.tutorial.load_dataset('air_temperature').isel(time=0)
+ds = xr.tutorial.open_dataset('air_temperature').isel(time=0)
fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
ds.air.plot(ax=axes[0])
axes[0].set_title('Raw data')
4 changes: 2 additions & 2 deletions doc/plotting.rst
@@ -60,7 +60,7 @@ For these examples we'll use the North American air temperature dataset.

.. ipython:: python

-airtemps = xr.tutorial.load_dataset('air_temperature')
+airtemps = xr.tutorial.open_dataset('air_temperature')
airtemps

# Convert to celsius
@@ -585,7 +585,7 @@ This script will plot the air temperature on a map.
.. ipython:: python

import cartopy.crs as ccrs
-air = xr.tutorial.load_dataset('air_temperature').air
+air = xr.tutorial.open_dataset('air_temperature').air
ax = plt.axes(projection=ccrs.Orthographic(-80, 35))
air.isel(time=0).plot.contourf(ax=ax, transform=ccrs.PlateCarree());
@savefig plotting_maps_cartopy.png width=100%
5 changes: 5 additions & 0 deletions doc/whats-new.rst
@@ -70,6 +70,11 @@ Breaking changes
should significantly improve performance when reading and writing
netCDF files with Dask, especially when working with many files or using
Dask Distributed. By `Stephan Hoyer <https://github.com/shoyer>`_
+- Tutorial data is now loaded lazily. Previously,
+  :py:meth:`xarray.tutorial.load_dataset` called ``Dataset.load()`` before
+  returning. The change makes it easier to use this data with dask.
+  By `Joe Hamman <https://github.com/jhamman>`_.

Documentation
~~~~~~~~~~~~~
7 changes: 6 additions & 1 deletion xarray/tests/test_tutorial.py
@@ -23,6 +23,11 @@ def setUp(self):
os.remove('{}.md5'.format(self.testfilepath))

     def test_download_from_github(self):
-        ds = tutorial.load_dataset(self.testfile)
+        ds = tutorial.open_dataset(self.testfile).load()
         tiny = DataArray(range(5), name='tiny').to_dataset()
         assert_identical(ds, tiny)
+
+    def test_download_from_github_load_without_cache(self):
+        ds_nocache = tutorial.open_dataset(self.testfile, cache=False).load()
+        ds_cache = tutorial.open_dataset(self.testfile).load()
+        assert_identical(ds_cache, ds_nocache)
27 changes: 25 additions & 2 deletions xarray/tutorial.py
@@ -9,6 +9,7 @@

import hashlib
import os as _os
+import warnings

from .backends.api import open_dataset as _open_dataset
from .core.pycompat import urlretrieve as _urlretrieve
@@ -24,7 +25,7 @@ def file_md5_checksum(fname):


# idea borrowed from Seaborn
-def load_dataset(name, cache=True, cache_dir=_default_cache_dir,
+def open_dataset(name, cache=True, cache_dir=_default_cache_dir,
                  github_url='https://github.com/pydata/xarray-data',
                  branch='master', **kws):
     """
@@ -48,6 +49,10 @@ def load_dataset(name, cache=True, cache_dir=_default_cache_dir,
     kws : dict, optional
         Passed to xarray.open_dataset

+    See Also
+    --------
+    xarray.open_dataset
+
     """
     longdir = _os.path.expanduser(cache_dir)
     fullname = name + '.nc'
@@ -77,9 +82,27 @@ def load_dataset(name, cache=True, cache_dir=_default_cache_dir,
"""
raise IOError(msg)

-    ds = _open_dataset(localfile, **kws).load()
+    ds = _open_dataset(localfile, **kws)

     if not cache:
+        ds = ds.load()
         _os.remove(localfile)

     return ds


+def load_dataset(*args, **kwargs):
+    """
+    `load_dataset` will be removed in version 0.12. The current behavior of
+    this function can be achieved by using `tutorial.open_dataset(...).load()`.
+
+    See Also
+    --------
+    open_dataset
+    """
+    warnings.warn(
+        "`load_dataset` will be removed in xarray version 0.12. The current "
+        "behavior of this function can be achieved by using "
+        "`tutorial.open_dataset(...).load()`.",
+        DeprecationWarning, stacklevel=2)
+    return open_dataset(*args, **kwargs).load()
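
The deprecation shim that closes `xarray/tutorial.py` can be illustrated in isolation. The sketch below is a stand-in, not xarray itself: `LazyDataset` is a hypothetical placeholder class, and the two functions only mimic the shape of `tutorial.open_dataset` and `tutorial.load_dataset` as changed in this PR (no download, caching, or checksum logic). It shows the lazy entry point returning unloaded data while the old entry point warns and then loads eagerly.

```python
import warnings


class LazyDataset:
    """Hypothetical stand-in for a lazily opened dataset (not an xarray class)."""

    def __init__(self, name):
        self.name = name
        self.loaded = False  # nothing has been read into memory yet

    def load(self):
        # Mimics the shape of Dataset.load(): mark the data as loaded, return self.
        self.loaded = True
        return self


def open_dataset(name):
    # New behavior: hand back the dataset without loading it.
    return LazyDataset(name)


def load_dataset(*args, **kwargs):
    # Old entry point kept as a thin wrapper: warn, then load eagerly.
    warnings.warn(
        "`load_dataset` is deprecated; use `open_dataset(...).load()` instead.",
        DeprecationWarning, stacklevel=2)
    return open_dataset(*args, **kwargs).load()


lazy = open_dataset("air_temperature")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record the DeprecationWarning instead of hiding it
    eager = load_dataset("air_temperature")
```

Callers that relied on the eager behavior can switch to `open_dataset(...).load()` and avoid the warning; callers that want to chunk the data with dask can use the lazy result directly.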