Add support for load_data from URI/URL #2875

Merged
merged 26 commits into from
Jun 3, 2024
Changes from 23 commits
885b937
add support for load_data from URI/URL
bmorris3 May 15, 2024
b781c19
improvements from Pey Lian's review
bmorris3 May 16, 2024
061b18d
add intersphinx for astroquery
bmorris3 May 16, 2024
7e952e7
adding uri/url load_data user docs, better error message on bad path
bmorris3 May 21, 2024
55f11fa
Update jdaviz/utils.py
bmorris3 May 21, 2024
e187a78
Update docs/imviz/import_data.rst
bmorris3 May 21, 2024
04666af
Update docs/imviz/import_data.rst
bmorris3 May 21, 2024
4829610
Update jdaviz/configs/cubeviz/plugins/tests/test_parsers.py
bmorris3 May 24, 2024
5c425ae
uri queries cache by default, fix up docstrings
bmorris3 May 24, 2024
83a30bd
fix launcher with URIs, update example notebooks
bmorris3 May 24, 2024
231a64c
more informative message for non-public URIs
bmorris3 May 24, 2024
88ad25d
add caveats about local filepath URIs and cloud fits to narrative docs
bmorris3 May 24, 2024
aceaf40
default cache is None
bmorris3 May 24, 2024
86972d3
fix for order of warnings
bmorris3 May 24, 2024
c8e9038
Update docs/conf.py
bmorris3 May 28, 2024
00c1706
Apply suggestions from code review
bmorris3 May 28, 2024
de4e32d
updates from Pey Lian's review comments
bmorris3 May 29, 2024
7470102
Apply suggestions from code review
bmorris3 May 29, 2024
9255009
test nonexistant MAST URI
bmorris3 May 29, 2024
8032a28
increase coverage, fix docstring formatting
bmorris3 May 29, 2024
fc9902d
review comments addressed from Ricky and Pey Lian
bmorris3 May 31, 2024
da1d5fd
fix multiple warning checks in imviz local_path test
bmorris3 May 31, 2024
5135e3f
addressing Pey Lian's review comments
bmorris3 Jun 3, 2024
bc4ac84
Update jdaviz/utils.py
bmorris3 Jun 3, 2024
77df5ff
fix astroquery configuration call
bmorris3 Jun 3, 2024
fc28c86
corrections to check for multiple warnings
bmorris3 Jun 3, 2024
2 changes: 2 additions & 0 deletions CHANGES.rst
Expand Up @@ -114,6 +114,8 @@ Other Changes and Additions
New Features
------------

- Load remote data from a URI or URL. [#2875]

Cubeviz
^^^^^^^

1 change: 1 addition & 0 deletions docs/conf.py
Expand Up @@ -263,6 +263,7 @@
# Extra intersphinx in addition to what is already in sphinx-astropy
intersphinx_mapping.update({ # noqa: F405
'glueviz': ('https://docs.glueviz.org/en/stable/', None),
'astroquery': ('https://astroquery.readthedocs.io/en/latest/', None),
'glue': ('https://glue-core.readthedocs.io/en/latest/', None),
'glue_jupyter': ('https://glue-jupyter.readthedocs.io/en/stable/', None),
'photutils': ('https://photutils.readthedocs.io/en/stable/', None),
8 changes: 8 additions & 0 deletions docs/cubeviz/import_data.rst
Expand Up @@ -213,3 +213,11 @@ For more details on the API, please see
:py:meth:`~jdaviz.core.helpers.ImageConfigHelper.load_regions_from_file`
and :py:meth:`~jdaviz.core.helpers.ImageConfigHelper.load_regions` methods
in Cubeviz.

Loading from a URL or URI
-------------------------

.. seealso::

:ref:`Load from URL or URI <load-data-uri>`
Imviz documentation describing load from URI/URL.
35 changes: 35 additions & 0 deletions docs/imviz/import_data.rst
Expand Up @@ -139,6 +139,41 @@ data entries into the viewer until after the parsing is complete::
imviz.show()


.. _load-data-uri:

Load data from a URI or URL
---------------------------

The examples above import data from a local file path.
:meth:`~jdaviz.core.helpers.ConfigHelper.load_data` also supports loading remote
data from a URL or URI.
If the input is a string with a MAST URI, the file will be retrieved via
astroquery's `~astroquery.mast.ObservationsClass.download_file`. If the
input string is a URL, it will be retrieved via astropy with
`~astropy.utils.data.download_file`. Both methods support a
``cache`` argument, which will store the file locally. Cached downloads via astropy
are placed in the :ref:`astropy cache <astropy:utils-data>`,
and URIs retrieved via astroquery can be saved to a path of your choice with
``local_path``. If the ``cache`` argument is not set, the file is cached by default
and a warning is issued.
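
To illustrate the caching default described above, here is a minimal sketch. `resolve_cache` is a hypothetical helper, not a jdaviz function, and only the opening words of the warning text ("You may be querying for a remote file") come from the test suite in this pull request; the rest of the message is an assumption.

```python
import warnings


def resolve_cache(cache=None):
    """Hypothetical helper: return the effective ``cache`` setting.

    Mirrors only the documented behaviour: an unset ``cache`` falls back
    to caching and emits a warning. This is not the jdaviz implementation.
    """
    if cache is None:
        warnings.warn(
            "You may be querying for a remote file; it will be cached by "
            "default. Pass cache=True or cache=False to silence this warning.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return cache


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_choice = resolve_cache()        # cache unset: warns, then caches
    explicit_choice = resolve_cache(False)  # explicit choice: no warning

print(default_choice, explicit_choice, len(caught))
```

Passing an explicit `cache` value is therefore the way to avoid the warning in scripts and tests.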

Local file URIs beginning with ``file://``
are not supported by this method – nor are they necessary, since string
paths without the scheme work fine! :ref:`Cloud FITS <astropy:fits_io_cloud>` are not yet supported.

.. code-block:: python

from jdaviz import Imviz

uri = "mast:JWST/product/jw01345-o001_t021_nircam_clear-f200w_i2d.fits"
cache = True

# store the retrieved file in the current working directory:
local_path = "jw01345-o001_t021_nircam_clear-f200w_i2d.fits"

imviz = Imviz()
imviz.load_data(uri, cache=cache, local_path=local_path)
imviz.show()
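
The routing described in this section (MAST URI to astroquery, URL to astropy, `file://` rejected, everything else treated as a local path) can be sketched with the standard library alone. This is an illustration under assumptions: `classify_input` is a hypothetical name, the accepted URL schemes are a guess, and the real logic lives in `jdaviz/utils.py`, which is not part of the diff shown here.

```python
from urllib.parse import urlparse


def classify_input(path):
    """Hypothetical classifier for a string passed to ``load_data``."""
    scheme = urlparse(path).scheme.lower()
    if scheme == "mast":
        # Would be fetched with astroquery's Observations.download_file.
        return "mast-uri"
    if scheme in ("http", "https", "ftp"):  # assumed set of URL schemes
        # Would be fetched with astropy.utils.data.download_file.
        return "url"
    if scheme == "file":
        # The docs state that file:// URIs are not supported.
        return "unsupported"
    # No recognised scheme: treat the string as a local file path.
    return "local-path"


for candidate in (
    "mast:JWST/product/jw01345-o001_t021_nircam_clear-f200w_i2d.fits",
    "https://example.org/image.fits",
    "file:///tmp/image.fits",
    "image.fits",
):
    print(candidate, "->", classify_input(candidate))
```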

Importing catalogs via the API
==============================

8 changes: 8 additions & 0 deletions docs/specviz/import_data.rst
Expand Up @@ -155,3 +155,11 @@ The :py:meth:`~jdaviz.configs.specviz.helper.Specviz.load_data` method also take
an optional keyword argument ``concat_by_file``. When set to ``True``, the spectra
loaded in the :class:`~specutils.SpectrumList` will be concatenated together into one
combined spectrum per loaded file, which may be useful for MIRI observations, for example.

Loading from a URL or URI
-------------------------

.. seealso::

:ref:`Load from URL or URI <load-data-uri>`
Imviz documentation describing load from URI/URL.
8 changes: 8 additions & 0 deletions docs/specviz2d/import_data.rst
Expand Up @@ -74,3 +74,11 @@ the spectrum to be horizontal:
.. code-block:: python

specviz2d.load_data(filename, ext=7, transpose=True)

Loading from a URL or URI
-------------------------

.. seealso::

:ref:`Load from URL or URI <load-data-uri>`
Imviz documentation describing load from URI/URL.
18 changes: 2 additions & 16 deletions jdaviz/app.py
Expand Up @@ -886,22 +886,8 @@ def load_data(self, file_obj, parser_reference=None, **kwargs):
"""
self.loading = True
try:
try:
# Properly form path and check if a valid file
file_obj = pathlib.Path(file_obj)
if not file_obj.exists():
msg_text = "Error: File {} does not exist".format(file_obj)
snackbar_message = SnackbarMessage(msg_text, sender=self,
color='error')
self.hub.broadcast(snackbar_message)
raise FileNotFoundError("Could not locate file: {}".format(file_obj))
else:
# Convert path to properly formatted string (Parsers do not accept path objs)
file_obj = str(file_obj)
except TypeError:
# If it's not a str/path type, it might be a compatible class.
# Pass to parsers to see if they can accept it
pass
if isinstance(file_obj, pathlib.Path):
file_obj = str(file_obj)

# attempt to get a data parser from the config settings
parser = None
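
One plausible motivation for dropping the unconditional `pathlib.Path(file_obj)` conversion (an inference, not something stated in the diff) is that a path object does not preserve URLs. The sketch below uses `PurePosixPath` so the result is the same on every platform; both input strings are placeholders.

```python
from pathlib import PurePosixPath

url = "https://example.org/data/image.fits"  # placeholder URL
uri = "mast:JWST/product/image.fits"         # placeholder MAST URI

# pathlib drops empty path components, so the "//" after the scheme collapses.
mangled_url = str(PurePosixPath(url))
# A MAST URI has no doubled slash and survives the round trip unchanged.
round_tripped_uri = str(PurePosixPath(uri))

print(mangled_url)        # https:/example.org/data/image.fits
print(round_tripped_uri)  # mast:JWST/product/image.fits
```

Under the old code, neither string would have passed the `exists()` check, so a remote input would have raised `FileNotFoundError` before reaching any parser.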
Expand Up @@ -6,7 +6,6 @@
from astropy.io import fits
from astropy.nddata import CCDData
from astropy.wcs import WCS
from astroquery.mast import Observations
import astropy.units as u
import numpy as np
from numpy.testing import assert_allclose
Expand Down Expand Up @@ -262,12 +261,11 @@ def test_write_momentmap(cubeviz_helper, spectrum1d_cube, tmp_path):
@pytest.mark.remote_data
def test_momentmap_nirspec_prism(cubeviz_helper, tmp_path):
uri = "mast:jwst/product/jw02732-o003_t002_nirspec_prism-clear_s3d.fits"
download_path = str(tmp_path / Path(uri).name)
Observations.download_file(uri, local_path=download_path)
local_path = str(tmp_path / Path(uri).name)

with warnings.catch_warnings():
warnings.simplefilter("ignore")
cubeviz_helper.load_data(download_path)
cubeviz_helper.load_data(uri, cache=True, local_path=local_path)
plugin = cubeviz_helper.plugins['Moment Maps']
plugin.calculate_moment()
assert isinstance(plugin._obj.moment.wcs, WCS)
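
The test above derives `local_path` from the last component of the MAST URI. A standalone sketch of that step follows, with `tempfile` standing in for pytest's `tmp_path` fixture (an assumption made so the snippet runs outside pytest).

```python
import tempfile
from pathlib import Path, PurePosixPath

uri = "mast:jwst/product/jw02732-o003_t002_nirspec_prism-clear_s3d.fits"

with tempfile.TemporaryDirectory() as tmp_dir:
    # The URI uses "/" separators, so its final component is the file name.
    filename = PurePosixPath(uri).name
    # Build the download destination inside the temporary directory.
    local_path = str(Path(tmp_dir) / filename)
    print(local_path)
```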
23 changes: 21 additions & 2 deletions jdaviz/configs/cubeviz/plugins/parsers.py
Expand Up @@ -12,7 +12,7 @@

from jdaviz.configs.imviz.plugins.parsers import prep_data_layer_as_dq
from jdaviz.core.registries import data_parser_registry
from jdaviz.utils import standardize_metadata, PRIHDR_KEY
from jdaviz.utils import standardize_metadata, PRIHDR_KEY, download_uri_to_path


__all__ = ['parse_data']
Expand All @@ -23,7 +23,8 @@


@data_parser_registry("cubeviz-data-parser")
def parse_data(app, file_obj, data_type=None, data_label=None, parent=None):
def parse_data(app, file_obj, data_type=None, data_label=None,
parent=None, cache=None, local_path=None, timeout=None):
"""
Attempts to parse a data file and auto-populate available viewers in
cubeviz.
Expand All @@ -38,6 +39,19 @@ def parse_data(app, file_obj, data_type=None, data_label=None, parent=None):
The data type used to explicitly differentiate parsed data.
data_label : str, optional
The label to be applied to the Glue data component.
parent : str, optional
Data label for "parent" data to associate with the loaded data as "child".
cache : None, bool, or str
Cache the downloaded file if the data are retrieved by a query
to a URL or URI.
local_path : str, optional
Cache remote files to this path. This is only used if data is
requested from `astroquery.mast`.
timeout : float, optional
If downloading from a remote URI, set the timeout limit for
remote requests in seconds (passed to
`~astropy.utils.data.download_file` or
`~astroquery.mast.Conf.timeout`).
"""

flux_viewer_reference_name = app._jdaviz_helper._default_flux_viewer_reference_name
Expand Down Expand Up @@ -66,6 +80,11 @@ def parse_data(app, file_obj, data_type=None, data_label=None, parent=None):
flux_viewer_reference_name=flux_viewer_reference_name)
return

# try parsing file_obj as a URI/URL:
file_obj = download_uri_to_path(
file_obj, cache=cache, local_path=local_path, timeout=timeout
)

file_name = os.path.basename(file_obj)

with fits.open(file_obj) as hdulist:
2 changes: 1 addition & 1 deletion jdaviz/configs/cubeviz/plugins/tests/test_parsers.py
Expand Up @@ -213,7 +213,7 @@ def test_numpy_cube(cubeviz_helper):


def test_invalid_data_types(cubeviz_helper):
with pytest.raises(FileNotFoundError, match='Could not locate file'):
with pytest.raises(ValueError, match=r"The input file 'does_not_exist\.fits'"):
cubeviz_helper.load_data('does_not_exist.fits')

with pytest.raises(NotImplementedError, match='Unsupported data format'):
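
The updated assertion passes a raw string with an escaped dot because `pytest.raises(..., match=...)` applies the pattern with `re.search`. The sketch below shows the difference the escape makes. Only the start of the error message is taken from the test; the tail ("could not be loaded.") is a hypothetical stand-in.

```python
import re

# Hypothetical full messages; only the leading part appears in the test above.
message = "The input file 'does_not_exist.fits' could not be loaded."
lookalike = "The input file 'does_not_existXfits' could not be loaded."

escaped = r"The input file 'does_not_exist\.fits'"
unescaped = "The input file 'does_not_exist.fits'"

# With the escaped dot, only the literal file name matches.
matches_real = re.search(escaped, message) is not None
rejects_lookalike = re.search(escaped, lookalike) is None
# Without the escape, "." matches any character, so the lookalike slips through.
unescaped_accepts_lookalike = re.search(unescaped, lookalike) is not None

print(matches_real, rejects_lookalike, unescaped_accepts_lookalike)
```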
Expand Up @@ -79,12 +79,13 @@ def test_roman_against_rdm():
@pytest.mark.remote_data
def test_data_quality_plugin(imviz_helper, tmp_path):
uri = "mast:JWST/product/jw01895001004_07101_00001_nrca3_cal.fits"
download_path = str(tmp_path / Path(uri).name)
Observations.download_file(uri, local_path=download_path)
local_path = str(tmp_path / Path(uri).name)

with warnings.catch_warnings():
warnings.simplefilter("ignore")
imviz_helper.load_data(download_path, ext=('SCI', 'DQ'))
imviz_helper.load_data(
uri, cache=True, local_path=local_path, ext=('SCI', 'DQ')
)

assert len(imviz_helper.app.data_collection) == 2

26 changes: 24 additions & 2 deletions jdaviz/configs/imviz/plugins/parsers.py
Expand Up @@ -14,7 +14,7 @@

from jdaviz.core.registries import data_parser_registry
from jdaviz.core.events import SnackbarMessage
from jdaviz.utils import standardize_metadata, PRIHDR_KEY, _wcs_only_label
from jdaviz.utils import standardize_metadata, PRIHDR_KEY, _wcs_only_label, download_uri_to_path

try:
from roman_datamodels import datamodels as rdd
Expand Down Expand Up @@ -43,7 +43,8 @@ def prep_data_layer_as_dq(data):


@data_parser_registry("imviz-data-parser")
def parse_data(app, file_obj, ext=None, data_label=None, parent=None):
def parse_data(app, file_obj, ext=None, data_label=None,
parent=None, cache=None, local_path=None, timeout=None):
"""Parse a data file into Imviz.

Parameters
Expand All @@ -60,11 +61,32 @@ def parse_data(app, file_obj, ext=None, data_label=None, parent=None):
data_label : str, optional
The label to be applied to the Glue data component.

parent : str, optional
Data label for "parent" data to associate with the loaded data as "child".

cache : None, bool, or str
Cache the downloaded file if the data are retrieved by a query
to a URL or URI.

local_path : str, optional
Cache remote files to this path. This is only used if data is
requested from `astroquery.mast`.

timeout : float, optional
If downloading from a remote URI, set the timeout limit for
remote requests in seconds (passed to
`~astropy.utils.data.download_file` or
`~astroquery.mast.Conf.timeout`).
"""
if isinstance(file_obj, str):
if data_label is None:
data_label = os.path.splitext(os.path.basename(file_obj))[0]

# try parsing file_obj as a URI/URL:
file_obj = download_uri_to_path(
file_obj, cache=cache, local_path=local_path, timeout=timeout
)

# If file_obj is a path to a cached file from
# astropy.utils.data.download_file, the path has no file extension.
# Here we check if the file is in the download cache, and if it is,
28 changes: 15 additions & 13 deletions jdaviz/configs/imviz/tests/test_parser.py
@@ -1,3 +1,4 @@
import os
import numpy as np
import pytest
from astropy import units as u
Expand Down Expand Up @@ -234,15 +235,15 @@ def test_parse_asdf_in_fits_4d(self, imviz_helper, tmp_path):

@pytest.mark.remote_data
def test_parse_jwst_nircam_level2(self, imviz_helper):
# TODO: Change back to smaller number (30?) when ITSD is convinced it is them and not us.
# Help desk ticket INC0183598, J. Quick.
filename = download_file(self.jwst_asdf_url_1, cache=True, timeout=100)

# Default behavior: Science image
imviz_helper.load_data(filename)
with pytest.warns(UserWarning, match='You may be querying for a remote file'):
# if you don't pass a `cache` value, a warning should be raised:
imviz_helper.load_data(self.jwst_asdf_url_1, timeout=100)

data = imviz_helper.app.data_collection[0]
comp = data.get_component('DATA')
assert data.label == 'contents[DATA]' # download_file returns cache loc
expected_label = os.path.splitext(os.path.basename(self.jwst_asdf_url_1))[0] + '[DATA]'
assert data.label == expected_label
assert data.shape == (2048, 2048)
assert isinstance(data.coords, GWCS)
assert comp.units == 'MJy/sr'
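
Because the URL is now passed straight to `load_data`, the data label comes from the URL's file name rather than from the extension-less cache path (previously `contents`). The expected label can be reproduced as follows; the URL is a placeholder, since the real `jwst_asdf_url_1` value is not shown in this diff.

```python
import os

# Placeholder URL; the actual test URL is defined elsewhere in the test class.
url = "https://example.org/data/jw01072001001_01101_00001_nrcb1_cal.fits"

# Take the last path component, strip the extension, then append the
# extension suffix that Imviz adds to the label.
stem = os.path.splitext(os.path.basename(url))[0]
expected_label = stem + "[DATA]"

print(expected_label)  # jw01072001001_01101_00001_nrcb1_cal[DATA]
```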
Expand Down Expand Up @@ -312,7 +313,7 @@ def test_parse_jwst_nircam_level2(self, imviz_helper):
# --- Back to parser testing below. ---

# Request specific extension (name + ver, but ver is not used), use given label
imviz_helper.load_data(filename, ext=('DQ', 42),
imviz_helper.load_data(self.jwst_asdf_url_1, cache=True, ext=('DQ', 42),
data_label='jw01072001001_01101_00001_nrcb1_cal',
show_in_viewer=False)
data = imviz_helper.app.data_collection[1]
Expand All @@ -322,6 +323,7 @@ def test_parse_jwst_nircam_level2(self, imviz_helper):
assert comp.units == ''

# Pass in HDUList directly + ext (name only), use given label
filename = download_file(self.jwst_asdf_url_1, cache=True)
with fits.open(filename) as pf:
imviz_helper.load_data(pf, ext='SCI',
data_label='jw01072001001_01101_00001_nrcb1_cal',
Expand Down Expand Up @@ -362,12 +364,11 @@ def test_parse_jwst_nircam_level2(self, imviz_helper):
@pytest.mark.remote_data
def test_parse_jwst_niriss_grism(self, imviz_helper):
"""No valid image GWCS for Imviz, will fall back to loading without WCS."""
filename = download_file(self.jwst_asdf_url_2, cache=True)

imviz_helper.load_data(filename, show_in_viewer=False)
imviz_helper.load_data(self.jwst_asdf_url_2, cache=True, show_in_viewer=False)
data = imviz_helper.app.data_collection[0]
comp = data.get_component('DATA')
assert data.label == 'contents[DATA]' # download_file returns cache loc
expected_label = os.path.splitext(os.path.basename(self.jwst_asdf_url_2))[0] + '[DATA]'
assert data.label == expected_label
assert data.shape == (2048, 2048)
assert data.coords is None
assert comp.units == 'DN/s'
Expand All @@ -379,10 +380,11 @@ def test_parse_hst_drz(self, imviz_helper):
filename = download_file(url, cache=True)

# Default behavior: Load first image
imviz_helper.load_data(filename)
imviz_helper.load_data(url, cache=True)
data = imviz_helper.app.data_collection[0]
comp = data.get_component('SCI,1')
assert data.label == 'contents[SCI,1]' # download_file returns cache loc
expected_label = os.path.splitext(os.path.basename(url))[0] + '[SCI,1]'
assert data.label == expected_label
assert data.shape == (4299, 4219)
assert_allclose(data.meta['PHOTFLAM'], 7.8711728E-20)
assert isinstance(data.coords, WCS)
20 changes: 18 additions & 2 deletions jdaviz/configs/mosviz/plugins/parsers.py
Expand Up @@ -16,7 +16,7 @@
from jdaviz.configs.imviz.plugins.parsers import get_image_data_iterator
from jdaviz.core.registries import data_parser_registry
from jdaviz.core.events import SnackbarMessage
from jdaviz.utils import standardize_metadata, PRIHDR_KEY
from jdaviz.utils import standardize_metadata, PRIHDR_KEY, download_uri_to_path

__all__ = ['mos_spec1d_parser', 'mos_spec2d_parser', 'mos_image_parser']

Expand Down Expand Up @@ -259,7 +259,8 @@ def mos_spec1d_parser(app, data_obj, data_labels=None,

@data_parser_registry("mosviz-spec2d-parser")
def mos_spec2d_parser(app, data_obj, data_labels=None, add_to_table=True,
show_in_viewer=False, ext=1, transpose=False):
show_in_viewer=False, ext=1, transpose=False,
cache=None, local_path=None, timeout=None):
"""
Attempts to parse a 2D spectrum object.

Expand All @@ -281,6 +282,17 @@ def mos_spec2d_parser(app, data_obj, data_labels=None, add_to_table=True,
The extension in the FITS file that contains the data to be loaded.
transpose : bool, optional
Flag to transpose the data array before loading.
cache : None, bool, or str
Cache the downloaded file if the data are retrieved by a query
to a URL or URI.
local_path : str, optional
Cache remote files to this path. This is only used if data is
requested from `astroquery.mast`.
timeout : float, optional
If downloading from a remote URI, set the timeout limit for
remote requests in seconds (passed to
`~astropy.utils.data.download_file` or
`~astroquery.mast.Conf.timeout`).

Returns
-------
Expand Down Expand Up @@ -347,6 +359,10 @@ def _parse_as_spectrum1d(hdulist, ext, transpose):
# If we got a filepath, first try and parse using the Spectrum1D and
# SpectrumList parsers, and then fall back to parsing it as a generic
# FITS file.

# try parsing file_obj as a URI/URL:
data = download_uri_to_path(data, cache=cache, local_path=local_path, timeout=timeout)

if _check_is_file(data):
try:
if ext != 1 or transpose: