Switch to shared Lock (SerializableLock if possible) for reading and writing

Fixes pydata#1172

The serializable lock will be useful for dask.distributed or multi-processing
(xref pydata#798, pydata#1173, among others).
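
As a rough illustration of why a serializable lock matters for dask.distributed or multiprocessing, the sketch below checks whether a lock survives pickling. The `Lock` import fallback mirrors the one added in this commit; `can_pickle` is a hypothetical helper introduced only for this example and is not part of xarray or dask.

```python
import pickle
import threading

# Same fallback as in the commit: prefer dask's SerializableLock,
# otherwise fall back to a plain threading.Lock.
try:
    from dask.utils import SerializableLock as Lock
except ImportError:
    from threading import Lock


def can_pickle(obj):
    """Hypothetical helper: report whether pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


# A plain threading.Lock cannot be pickled, so it cannot be shipped to
# another process as part of a task graph.
plain_lock_picklable = can_pickle(threading.Lock())

# This is expected to be True only when dask is installed and
# SerializableLock was imported; with the fallback it is False.
shared_lock_picklable = can_pickle(Lock())

print(plain_lock_picklable, shared_lock_picklable)
```

The second value depends on whether dask is importable in the environment, which is exactly the situation the fallback import is meant to handle.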
shoyer committed Dec 22, 2016
1 parent aec3e8e commit 2258217
Showing 3 changed files with 19 additions and 11 deletions.
6 changes: 5 additions & 1 deletion doc/whats-new.rst
@@ -56,7 +56,7 @@ Breaking changes
   By `Guido Imperiale <https://github.com/crusaderky>`_ and
   `Stephan Hoyer <https://github.com/shoyer>`_.
 - Pickling a ``Dataset`` or ``DataArray`` linked to a file on disk no longer
-  caches its values into memory before pickling :issue:`1128`. Instead, pickle
+  caches its values into memory before pickling (:issue:`1128`). Instead, pickle
   stores file paths and restores objects by reopening file references. This
   enables preliminary, experimental use of xarray for opening files with
   `dask.distributed <https://distributed.readthedocs.io>`_.
@@ -206,6 +206,10 @@ Bug fixes
 - Fixed a bug with facetgrid (the ``norm`` keyword was ignored, :issue:`1159`).
   By `Fabien Maussion <https://github.com/fmaussion>`_.

+- Resolved a concurrency bug that could cause Python to crash when
+  simultaneously reading and writing netCDF4 files with dask (:issue:`1172`).
+  By `Stephan Hoyer <https://github.com/shoyer>`_.
+
 .. _whats-new.0.8.2:

 v0.8.2 (18 August 2016)
10 changes: 3 additions & 7 deletions xarray/backends/api.py
@@ -3,7 +3,6 @@
 from __future__ import print_function
 import gzip
 import os.path
-import threading
 from distutils.version import StrictVersion
 from glob import glob
 from io import BytesIO
@@ -12,7 +11,7 @@
 import numpy as np

 from .. import backends, conventions
-from .common import ArrayWriter
+from .common import ArrayWriter, GLOBAL_LOCK
 from ..core import indexing
 from ..core.combine import auto_combine
 from ..core.utils import close_on_error, is_remote_uri
@@ -55,9 +54,6 @@ def _normalize_path(path):
     return os.path.abspath(os.path.expanduser(path))


-_global_lock = threading.Lock()
-
-
 def _default_lock(filename, engine):
     if filename.endswith('.gz'):
         lock = False
@@ -71,9 +67,9 @@ def _default_lock(filename, engine):
             else:
                 # TODO: identify netcdf3 files and don't use the global lock
                 # for them
-                lock = _global_lock
+                lock = GLOBAL_LOCK
         elif engine in {'h5netcdf', 'pynio'}:
-            lock = _global_lock
+            lock = GLOBAL_LOCK
         else:
             lock = False
     return lock
14 changes: 11 additions & 3 deletions xarray/backends/common.py
@@ -2,7 +2,6 @@
 from __future__ import division
 from __future__ import print_function
 import numpy as np
-import itertools
 import logging
 import time
 import traceback
@@ -12,7 +11,12 @@

 from ..conventions import cf_encoder
 from ..core.utils import FrozenOrderedDict
-from ..core.pycompat import iteritems, dask_array_type, OrderedDict
+from ..core.pycompat import iteritems, dask_array_type

+try:
+    from dask.utils import SerializableLock as Lock
+except ImportError:
+    from threading import Lock
+
 # Create a logger object, but don't add any handlers. Leave that to user code.
 logger = logging.getLogger(__name__)
@@ -21,6 +25,10 @@
 NONE_VAR_NAME = '__values__'


+# dask.utils.SerializableLock if available, otherwise just a threading.Lock
+GLOBAL_LOCK = Lock()
+
+
 def _encode_variable_name(name):
     if name is None:
         name = NONE_VAR_NAME
@@ -150,7 +158,7 @@ def sync(self):
             import dask.array as da
             import dask
             if StrictVersion(dask.__version__) > StrictVersion('0.8.1'):
-                da.store(self.sources, self.targets, lock=threading.Lock())
+                da.store(self.sources, self.targets, lock=GLOBAL_LOCK)
             else:
                 da.store(self.sources, self.targets)
             self.sources = []
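
The change in `sync()` replaces a fresh `threading.Lock()` per call with the module-level `GLOBAL_LOCK` that reads already use. A minimal sketch of why that matters, using only the standard library: two distinct locks do not exclude each other, while one shared lock does. `acquirable_while_held` is a hypothetical helper for this illustration, and the `GLOBAL_LOCK` here is a plain `threading.Lock` standing in for the one defined in `xarray/backends/common.py`.

```python
import threading

# Stand-in for the shared lock; assumed to be a plain threading.Lock here.
GLOBAL_LOCK = threading.Lock()


def acquirable_while_held(held, other):
    """Hypothetical helper: while `held` is taken, can `other` be acquired?"""
    with held:
        got = other.acquire(blocking=False)
        if got:
            other.release()
        return got


# Old behaviour: the writer made its own lock, so holding the reader's
# lock did not stop the writer from proceeding.
separate_locks_exclude = not acquirable_while_held(GLOBAL_LOCK, threading.Lock())

# New behaviour: reader and writer share one lock, so the second
# acquisition is refused while the first is held.
shared_lock_excludes = not acquirable_while_held(GLOBAL_LOCK, GLOBAL_LOCK)

print(separate_locks_exclude, shared_lock_excludes)
```

Under these assumptions, only the shared lock provides mutual exclusion between reading and writing, which is the property the commit relies on to avoid the crash described in issue 1172.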
