Commit

Fix #60 Use local copy of NWB file to avoid use of special characters in folder names
oruebel committed Jan 4, 2023
1 parent e3d063a commit b92f549
Showing 5 changed files with 52 additions and 21 deletions.
45 changes: 26 additions & 19 deletions docs/gallery/plot_convert_nwb_hdf5.py
@@ -5,37 +5,44 @@
This tutorial illustrates how to convert data between HDF5 and Zarr using
a Neurodata Without Borders (NWB) file from the DANDI data archive as an example.
In this tutorial we will convert our example file from HDF5 to Zarr and then
back again to HDF5.
back again to HDF5. The NWB standard is defined using :hdmf-docs:`HDMF <>` and uses the
:py:class:`~hdmf.backends.hdf5.h5tools.HDF5IO` HDF5 backend from HDMF for storage.
"""


###############################################################################
# Setup
# -----
#
# We first **download a small NWB file** from the DANDI neurophysiology data archive as an example.
# The NWB standard is defined using HDMF and uses the :py:class:`~hdmf.backends.hdf5.h5tools.HDF5IO`
# HDF5 backend from HDMF for storage.
# Here we use a small NWB file from the DANDI neurophysiology data archive from
# `DANDIset 000009 <https://dandiarchive.org/dandiset/000009/0.220126.1903>`_ as an example.
# To download the file directly from DANDI we can use:
#
# .. code-block:: python
#    :linenos:
#
#    import os
#    from dandi.dandiapi import DandiAPIClient
#    dandiset_id = "000009"
#    filepath = "sub-anm00239123/sub-anm00239123_ses-20170627T093549_ecephys+ogen.nwb"  # ~0.5MB file
#    with DandiAPIClient() as client:
#        asset = client.get_dandiset(dandiset_id, 'draft').get_asset_by_path(filepath)
#        s3_path = asset.get_content_url(follow_redirects=1, strip_query=True)
#        filename = os.path.basename(asset.path)
#        asset.download(filename)
#
# Here we use a local copy of a small file from this DANDIset as an example:
#

# sphinx_gallery_thumbnail_path = 'figures/gallery_thumbnail_plot_convert_nwb.png'
import os
import shutil
from dandi.dandiapi import DandiAPIClient

dandiset_id = "000009"
filepath = "sub-anm00239123/sub-anm00239123_ses-20170627T093549_ecephys+ogen.nwb" # ~0.5MB file
with DandiAPIClient() as client:
    asset = client.get_dandiset(dandiset_id, 'draft').get_asset_by_path(filepath)
    s3_path = asset.get_content_url(follow_redirects=1, strip_query=True)
    filename = os.path.basename(asset.path)
    asset.download(filename)

###############################################################################
# Next we define the names of the files to generate as part of this tutorial and clean up any
# data from previous executions of this tutorial.

zarr_filename = "test_zarr_" + filename + ".zarr"
hdf_filename = "test_hdf5_" + filename
# Input file to convert
filename = "../resources/sub_anm00239123_ses_20170627T093549_ecephys_and_ogen.nwb"
# Zarr file to generate for converting from HDF5 to Zarr
zarr_filename = "test_zarr_" + os.path.basename(filename) + ".zarr"
# HDF5 file to generate for converting from Zarr to HDF5
hdf_filename = "test_hdf5_" + os.path.basename(filename)

# Delete the converted HDF5 and Zarr files from previous runs of this notebook
for fname in [zarr_filename, hdf_filename]:
23 changes: 23 additions & 0 deletions docs/resources/README.rst
@@ -0,0 +1,23 @@
Resources
=========

sub_anm00239123_ses_20170627T093549_ecephys_and_ogen.nwb
--------------------------------------------------------

This NWB file was downloaded from `DANDIset 000009 <https://dandiarchive.org/dandiset/000009/0.220126.1903>`_.
The file was modified to rename the ``ElectrodeGroup`` called ``ADunit: 32`` in
``general/extracellular_ephys/`` to ``ADunit_32``, removing the ``:`` character. This avoids issues on Windows
file systems, which do not support ``:`` as part of folder names. The asset can be downloaded from DANDI via:

.. code-block:: python
    :linenos:

    import os
    from dandi.dandiapi import DandiAPIClient
    dandiset_id = "000009"
    filepath = "sub-anm00239123/sub-anm00239123_ses-20170627T093549_ecephys+ogen.nwb"  # ~0.5MB file
    with DandiAPIClient() as client:
        asset = client.get_dandiset(dandiset_id, 'draft').get_asset_by_path(filepath)
        s3_path = asset.get_content_url(follow_redirects=1, strip_query=True)
        filename = os.path.basename(asset.path)
        asset.download(filename)
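
The README does not say how the group was renamed. As a minimal sketch of one possible approach, assuming the file is edited directly with h5py, a group can be renamed in place with ``Group.move``. The helper name ``rename_electrode_group`` and the stand-in file created in the usage lines are hypothetical, and a real NWB file may need further checks (e.g., that nothing else stores the old name).

```python
import os
import tempfile

import h5py


def rename_electrode_group(path, old_name="ADunit: 32", new_name="ADunit_32"):
    """Rename a group under general/extracellular_ephys in place (hypothetical helper)."""
    parent = "general/extracellular_ephys"
    with h5py.File(path, "r+") as f:
        # Group.move relinks the object under a new name without copying its data
        f[parent].move(old_name, new_name)


# Usage on a tiny stand-in file (not the real DANDI asset)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.nwb")
with h5py.File(demo_path, "w") as f:
    f.create_group("general/extracellular_ephys/ADunit: 32")

rename_electrode_group(demo_path)

with h5py.File(demo_path, "r") as f:
    group_names = list(f["general/extracellular_ephys"].keys())
print(group_names)
```

Renaming the link leaves the group's contents untouched, which is why this sketch only changes the name and nothing else in the file.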
Binary file not shown.
4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -77,8 +77,8 @@
intersphinx_mapping = {
'python': ('https://docs.python.org/3.10', None),
'numpy': ('https://numpy.org/doc/stable/', None),
'scipy': ('https://docs.scipy.org/doc/scipy/reference', None),
'matplotlib': ('https://matplotlib.org', None),
'scipy': ('https://docs.scipy.org/doc/scipy/', None),
'matplotlib': ('https://matplotlib.org/stable/', None),
'h5py': ('https://docs.h5py.org/en/latest/', None),
'pandas': ('https://pandas.pydata.org/pandas-docs/stable/', None),
'hdmf': ('https://hdmf.readthedocs.io/en/stable/', None),
1 change: 1 addition & 0 deletions docs/source/overview.rst
@@ -30,3 +30,4 @@ Known Limitations
- Currently the :py:class:`~hdmf_zarr.backend.ZarrIO` backend uses Zarr's :py:class:`~zarr.storage.DirectoryStore` only. Other `Zarr stores <https://zarr.readthedocs.io/en/stable/api/storage.html>`_ could be added but will require proper treatment of links and references for those backends, as links are not supported in Zarr (see `zarr-python issue #389 <https://github.com/zarr-developers/zarr-python/issues/389>`_).
- Exporting of HDF5 files with external links is not yet fully implemented/tested (see `hdmf-zarr issue #49 <https://github.com/hdmf-dev/hdmf-zarr/issues/49>`_).
- Object references are currently always resolved on read (as are links) rather than being loaded lazily (see `hdmf-zarr issue #50 <https://github.com/hdmf-dev/hdmf-zarr/issues/50>`_).
- Special characters (e.g., ``:``, ``<``, ``>``, ``"``, ``/``, ``\``, ``|``, ``?``, or ``*``) may not be supported by all file systems (e.g., on Windows). Because Zarr needs to create folders on the file system for Datasets and Groups, these characters should not be used in their names.
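
One way to catch such names before writing is to check them against the characters listed in the last limitation. The sketch below is illustrative only: ``UNSAFE_CHARS``, ``is_portable_name``, and ``sanitize_name`` are hypothetical helpers, not part of hdmf-zarr, and using ``_`` as the replacement is an assumption.

```python
import re

# Characters named in the limitation as problematic on some file systems (assumed list)
UNSAFE_CHARS = re.compile(r'[:<>"/\\|?*]')


def is_portable_name(name: str) -> bool:
    """Return True if the name contains none of the listed special characters."""
    return UNSAFE_CHARS.search(name) is None


def sanitize_name(name: str, replacement: str = "_") -> str:
    """Replace each run of special characters, plus any whitespace after it."""
    return re.sub(r'[:<>"/\\|?*]+\s*', replacement, name)


print(is_portable_name("ADunit: 32"))  # False
print(sanitize_name("ADunit: 32"))     # ADunit_32
```

Checking names up front keeps the failure close to its cause, instead of surfacing later as a file system error when Zarr tries to create the folder.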
