Allow for no authentication and NASA Earthdata (URS) authentication for OPeNDAP #57

Merged 3 commits on Mar 11, 2020
3 changes: 2 additions & 1 deletion README.md
@@ -11,6 +11,7 @@ data drivers included in this package.
 In `intake-xarray`, there are plugins provided for reading data into [xarray](http://xarray.pydata.org/en/stable/)
 containers:
 - NetCDF
+- OPeNDAP
 - Rasterio
 - Zarr
 - images
@@ -28,5 +29,5 @@ conda install -c conda-forge intake-xarray
 To install optional dependencies:

 ```
-conda install -c conda-forge rasterio
+conda install -c conda-forge pydap rasterio
 ```
1 change: 1 addition & 0 deletions ci/environment-py36.yml
@@ -8,5 +8,6 @@ dependencies:
 - zarr
 - pytest
 - netcdf4
+- pydap
 - rasterio
 - scikit-image
1 change: 1 addition & 0 deletions ci/environment-py37-nodefaults.yml
@@ -9,5 +9,6 @@ dependencies:
 - zarr
 - pytest
 - netcdf4
+- pydap
 - rasterio
 - scikit-image
35 changes: 27 additions & 8 deletions intake_xarray/opendap.py
@@ -15,25 +15,44 @@ class OpenDapSource(DataSourceMixin):
         Chunks is used to load the new dataset into dask
         arrays. ``chunks={}`` loads the dataset with dask using a single
         chunk for all arrays.
+    auth: None, "esgf" or "urs"
+        Method of authenticating to the OPeNDAP server.
+        Choose from one of the following:
+        'esgf' - [Default] Earth System Grid Federation.
+        'urs' - NASA Earthdata Login, also known as URS.
+        None - No authentication.
+        Note that you will need to set your username and password respectively using the
+        environment variables DAP_USER and DAP_PASSWORD.
     """
     name = 'opendap'

-    def __init__(self, urlpath, chunks, xarray_kwargs=None, metadata=None,
+    def __init__(self, urlpath, chunks, auth="esgf", xarray_kwargs=None, metadata=None,
Member

Are you sure you want the default auth to be esgf?

Collaborator Author (@weiji14, Mar 10, 2020)

ESGF was the 'default' authentication method in the original code. Personally I would prefer using URS as the default, but it might be good to preserve backward compatibility here. Edit: Or do you mean we should set the default to None, i.e. no authentication?

Member

Defaults are hard. Perhaps @danielballan or @jsignell would care to weigh in.

Member

I'm good with @martindurant's comment in the main thread. 👍

Member

Yep. Same for me

                  **kwargs):
         self.urlpath = urlpath
         self.chunks = chunks
+        self.auth = auth
         self._kwargs = xarray_kwargs or kwargs
         self._ds = None
         super(OpenDapSource, self).__init__(metadata=metadata)

     def _get_session(self):
-        from pydap.cas.esgf import setup_session
-        username = os.getenv('DAP_USER', None)
-        password = os.getenv('DAP_PASSWORD', None)
-        return setup_session(
-            username,
-            password,
-            check_url=self.urlpath)
+        if self.auth is None:
+            session = None
+        else:
+            if self.auth == "esgf":
+                from pydap.cas.esgf import setup_session
+            elif self.auth == "urs":
+                from pydap.cas.urs import setup_session
+            else:
+                raise ValueError(
+                    "Authentication method should either be None, 'esgf' or 'urs', "
+                    f"got '{self.auth}' instead."
+                )
+            username = os.getenv('DAP_USER', None)
+            password = os.getenv('DAP_PASSWORD', None)
+            session = setup_session(username, password, check_url=self.urlpath)
+
+        return session
Member

session can be None? I would have thought you still need to make a non-authenticated session of some sort (not that I'm certain what this session thing is, perhaps pydap makes its own for the None case).

Collaborator Author

Yes, session can be None; that is probably just an anonymous connection. See https://github.com/pydap/pydap/blob/3f6aa190c59e3bbc6c834377a61579b20275ff69/src/pydap/client.py#L58
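
The dispatch discussed in this thread can be tried out without pydap installed. The snippet below is only a sketch under stated assumptions: `fake_esgf_session` and `fake_urs_session` are hypothetical stand-ins for `pydap.cas.esgf.setup_session` and `pydap.cas.urs.setup_session`, `get_session` is a hypothetical free function rather than the `_get_session` method in the diff, and a lookup table replaces the `if`/`elif` chain. It shows the three outcomes under review: `auth=None` returns `None`, a known method returns whatever its factory builds, and anything else raises `ValueError`.

```python
import os


def fake_esgf_session(username, password, check_url=None):
    # Hypothetical stand-in for pydap.cas.esgf.setup_session.
    return {"provider": "esgf", "username": username, "check_url": check_url}


def fake_urs_session(username, password, check_url=None):
    # Hypothetical stand-in for pydap.cas.urs.setup_session.
    return {"provider": "urs", "username": username, "check_url": check_url}


# Assumption: a lookup table instead of the if/elif chain used in the diff.
SESSION_FACTORIES = {"esgf": fake_esgf_session, "urs": fake_urs_session}


def get_session(auth, urlpath, environ=None):
    """Return None for auth=None, else a session from the matching factory."""
    environ = os.environ if environ is None else environ
    if auth is None:
        return None  # no authentication requested
    if auth not in SESSION_FACTORIES:
        raise ValueError(
            "Authentication method should either be None, 'esgf' or 'urs', "
            f"got '{auth}' instead."
        )
    return SESSION_FACTORIES[auth](
        environ.get("DAP_USER"), environ.get("DAP_PASSWORD"), check_url=urlpath
    )


creds = {"DAP_USER": "username", "DAP_PASSWORD": "password"}
print(get_session(None, "http://example.invalid/data.nc", creds))
print(get_session("urs", "http://example.invalid/data.nc", creds)["provider"])
```

Passing `environ` explicitly is a convenience for experimenting; the code in the diff reads `os.environ` directly through `os.getenv`.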


     def _open_dataset(self):
         import xarray as xr
7 changes: 7 additions & 0 deletions intake_xarray/tests/data/catalog.yaml
@@ -96,3 +96,10 @@ sources:
     driver: zarr
     args:
       urlpath: "{{CATALOG_DIR}}/blank.zarr"
+  opendap_source:
+    description: example OPeNDAP source
+    driver: opendap
+    args:
+      urlpath: http://test.opendap.org/opendap/hyrax/data/nc/data.nc
+      chunks: {}
+      auth: null
40 changes: 40 additions & 0 deletions intake_xarray/tests/test_intake_xarray.py
@@ -1,5 +1,7 @@
 # -*- coding: utf-8 -*-
 import os
+from unittest.mock import patch
+
 import numpy as np
 import pytest

@@ -303,3 +305,41 @@ def test_read_jpg_image():
     im = ImageSource(os.path.join(here, 'data', 'dog.jpg'))
     da = im.read()
     assert da.shape == (192, 192)
+
+
+def test_read_opendap_no_auth():
+    pytest.importorskip("pydap")
+    cat = intake.open_catalog(os.path.join(here, "data", "catalog.yaml"))
+    source = cat.opendap_source
+    info = source.discover()
+    assert info["metadata"]["dims"] == {"TIME": 12}
+    x = source.read()
+    assert x.TIME.shape == (12,)
+
+
+@pytest.mark.parametrize("auth", ["esgf", "urs"])
+def test_read_opendap_with_auth(auth):
+    pytest.importorskip("pydap")
+    from intake_xarray.opendap import OpenDapSource
+
+    os.environ["DAP_USER"] = "username"
+    os.environ["DAP_PASSWORD"] = "password"
+    urlpath = "http://test.opendap.org/opendap/hyrax/data/nc/123.nc"
+
+    with patch(
+        f"pydap.cas.{auth}.setup_session", return_value=None
+    ) as mock_setup_session:
+        source = OpenDapSource(urlpath=urlpath, chunks={}, auth=auth)
+        source.discover()
+        mock_setup_session.assert_called_once_with(
+            os.environ["DAP_USER"], os.environ["DAP_PASSWORD"], check_url=urlpath
+        )
+
+
+def test_read_opendap_invalid_auth():
+    pytest.importorskip("pydap")
+    from intake_xarray.opendap import OpenDapSource
+
+    source = OpenDapSource(urlpath="https://test.url", chunks={}, auth="abcd")
+    with pytest.raises(ValueError):
+        source.discover()
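
One possible refinement to `test_read_opendap_with_auth`, offered as a suggestion rather than as part of this PR: the test assigns `DAP_USER` and `DAP_PASSWORD` on `os.environ` directly, so the values stay set for later tests. The sketch below uses `unittest.mock.patch.dict` to scope them to a block instead. `open_with_session` and `run_check` are hypothetical helpers written for this illustration, and a `MagicMock` stands in for pydap's `setup_session`.

```python
import os
from unittest.mock import MagicMock, patch


def open_with_session(setup_session, urlpath):
    # Hypothetical helper mirroring how the diff reads DAP_USER and DAP_PASSWORD.
    username = os.getenv("DAP_USER", None)
    password = os.getenv("DAP_PASSWORD", None)
    return setup_session(username, password, check_url=urlpath)


def run_check():
    urlpath = "http://test.opendap.org/opendap/hyrax/data/nc/123.nc"
    fake_setup_session = MagicMock(return_value=None)  # stand-in for setup_session
    fake_env = {"DAP_USER": "username", "DAP_PASSWORD": "password"}
    # patch.dict restores os.environ to its previous state when the block exits.
    with patch.dict(os.environ, fake_env):
        open_with_session(fake_setup_session, urlpath)
    fake_setup_session.assert_called_once_with(
        "username", "password", check_url=urlpath
    )
    return fake_setup_session.call_count


run_check()
```

pytest's `monkeypatch.setenv` fixture would give the same isolation in a pytest-native style.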