Use xarray.open_dataset() for password-protected Opendap files #1068
Comments
If you write |
Thanks very much for your reply! I still get an error from xarray when I use the
I've attached the error message here -- |
If the dataset has a "time" dimension, try accessing the first few values. Can you view them in pydap? Xarray's open_dataset does a little more work than pydap's open_url, insofar as it actually downloads some array data. |
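The check suggested above can be sketched as a small helper. This is only a minimal sketch: it assumes the object returned by pydap's open_url behaves like a mapping from variable names to sliceable arrays, and the name `probe_variable` and its defaults are illustrative, not part of pydap or xarray.

```python
def probe_variable(dataset, name="time", count=3):
    """Force a small data request by reading the first `count` values.

    `dataset` is assumed to be a mapping from variable names to sliceable,
    array-like objects. Listing variables only needs metadata; slicing
    requests actual data, which is where access problems tend to surface.
    """
    if name not in dataset:
        raise KeyError(f"variable {name!r} not found in dataset")
    try:
        return dataset[name][:count]
    except Exception as err:
        # Metadata was readable, but fetching values failed.
        raise RuntimeError(
            f"metadata for {name!r} is readable but data access failed: {err}"
        ) from err
```

If this helper fails on a dataset opened directly with pydap, the problem lies upstream of xarray.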
Ah, I see. Thanks for the suggestion. Using Pydap I'm able to see all the variables and their metadata, so I thought it was working, but when I try to actually access the data values, I get the same error message as from Xarray. The issue must be something unrelated to Xarray -- I'll keep investigating. Thanks for your help! |
@jenfly did you find a solution how to make opendap authentication work with |
@j08lue no, not yet. I've been in touch with the folks at NASA who run the server, but their suggestions didn't work for me and I haven't had time to keep troubleshooting. I will need to sort out this issue in the next couple of months to get some data that I need, so if/when I ever resolve it, I'll post the solution here. |
I've finally found something useful online and am able to use Pydap to open these files -- hoping someone can help me find a way to integrate this into an xarray.open_dataset() function call and then I will be a very happy camper! Turns out much of the info posted by NASA online is out of date and based on a different implementation of Pydap than what is actually being used currently (argh). Here is something that actually works, from http://www.pydap.org/en/latest/client.html#urs-nasa-earthdata:
where I've assigned the username and password variables with the appropriate values in another function. I've tested this and it is working, but I would prefer to do things within Xarray since all my code is already using it. Just for fun, I tried |
Hi @jenfly, it's great to see that you have tracked down this root issue! I agree we should be able to support direct access to these sorts of opendap resources within xarray. It should not be too tricky to implement, and in fact, if you are interested, it could be a great opportunity for you to open a pull request and become directly involved in the project. We would be very happy to gain another contributor. You can see the line where We just need a mechanism to pass the username and password from
It would be good to get some other opinions on which approach would be preferable. |
Thanks, @rabernat! I'd be happy to try implementing this in the project. I'm a newbie when it comes to contributing to big projects like this (so far I've just used Github for my own little projects) so I might have some naive questions as I figure out how things work. The two options you mentioned for passing username and password info to Also, I realized that there is another hiccup along the way. When I try to specify |
Parsing username/password from the URL would be very easy to add. We need to figure out a solution for the proliferating arguments on Another option is to add
|
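As a rough illustration of the URL-parsing idea above, the sketch below separates embedded credentials from a URL using only the standard library. The function name `split_credentials` is hypothetical, and this is not how xarray or pydap handle URLs; it only shows that the parsing step itself is small.

```python
from urllib.parse import urlsplit, urlunsplit, unquote


def split_credentials(url):
    """Return (clean_url, username, password) for a URL that may embed
    credentials as scheme://user:password@host/path.

    username and password are None when the URL does not carry them.
    """
    parts = urlsplit(url)
    username = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    # Drop everything up to and including the last "@" in the network location.
    netloc = parts.netloc.rpartition("@")[2]
    clean = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return clean, username, password
```

The cleaned URL and the extracted credentials could then be handed to whatever authentication mechanism the backend ends up using.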
Pydap has a new v3.2 release, but it still needs some fixes to work with xarray -- or xarray needs to be updated to work with the new version of pydap. I think pydap/pydap#48 once merged would probably be enough to restore xarray compatibility. |
I like the idea of passing |
I also like the idea of passing I'm still having problems trying to get |
Indeed, it would be great if someone using pydap could take a look into this. You can find our logic for interoperating with pydap here: https://github.com/pydata/xarray/blob/master/xarray/backends/pydap_.py |
Awesome, thanks so much @laliberte! |
I spent a few minutes on this but am still getting |
Never mind, I figured it out (I was using an old version of pydap by mistake). See #1439 for the pydap fix. |
@shoyer @jenfly Has this been implemented? I can't see any open PRs relating to this, so I guess no one is working on it? I would be happy to try and implement it, if that's fine with you? It seems like you settled on the solution of passing a session object to a PydapDataStore and then passing that to open_dataset(), correct? Thanks in advance! |
@mrpgraae no, I don't think this has been implemented yet. Please take a look at #1508 for an example of the model to use:
You are also welcome to add any keyword parameters (e.g., So the user API becomes:
pydap_ds = pydap.client.open_url(url, session=session)
store = xarray.backends.PydapDataStore(pydap_ds)
ds = xarray.open_dataset(store)
or
store = xarray.backends.PydapDataStore.open(url, session=session)
ds = xarray.open_dataset(store) |
Thank you @shoyer, I'll start work on the implementation. |
#1570)
* Use the PydapDataStore.open() classmethod
* Added test for pydap password support
* Added pydap password change to whats-new.rst
* Changed test_password to test_session
* Documented type of ds in PydapDataStore
* Fixed documentation
* Removed unused import
* Added docs for using sessions with pydap
* Fixed typo
* Fixed formatting after merge
Dear all, |
@juliancanellas |
I am trying to load MERRA2 data via the NASA password-protected opendap server. Although it sounds like both pydap and xarray have been fixed to support this, I am still having basically the same problem @jenfly described over three years ago. At this point it feels like a pydap issue, but I'm asking on this thread anyway. Here's a fully reproducible example, password and all 😄
from pydap.client import open_url
from pydap.cas.urs import setup_session
username = 'rabernat'
password = '%8rTMU6VT37r&%3e'
url = 'https://goldsmr5.gesdisc.eosdis.nasa.gov:443/opendap/MERRA2_MONTHLY/M2IMNPANA.5.12.4/2019/MERRA2_400.instM_3d_ana_Np.201901.nc4'
session = setup_session(username, password, check_url=url)
dataset = open_url(url, session=session)
assert 'USVS' in dataset
_ = dataset['USVS'][:]
raises
Is this a problem with pydap? Or the NASA server? |
https://en.wikipedia.org/wiki/HTTP_302 Looks like you need a better URL, and that pydap can't deal with redirects? |
Yes, seems like a redirect issue. The URL is fine. |
No, actually the problem was with my authorization. I had to accept a EULA before my password was valid. Once I did that, everything worked. |
One can also add username and password to the However, there was one more issue. With Python 3.7.6, I was getting the following error:
That was solved by |
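The .netrc approach mentioned in this comment can also be used explicitly from Python. The sketch below reads credentials with the standard-library netrc module. It assumes the entry is keyed on urs.earthdata.nasa.gov, the login host that appears in the redirect messages in this thread; the function name `credentials_from_netrc` is hypothetical.

```python
import netrc

# Assumption: the .netrc entry is stored under the Earthdata Login host.
URS_HOST = "urs.earthdata.nasa.gov"


def credentials_from_netrc(host=URS_HOST, netrc_path=None):
    """Look up (username, password) for `host` in a netrc file.

    With netrc_path=None the standard ~/.netrc is read, which raises
    FileNotFoundError if that file does not exist. Returns None when the
    file has neither an entry for the host nor a default entry.
    """
    auth = netrc.netrc(netrc_path).authenticators(host)
    if auth is None:
        return None
    login, _account, password = auth
    return login, password
```

The returned pair could be passed to setup_session(username, password, check_url=url) as in the examples in this thread, which keeps the password out of the script itself.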
So, I tried Ryan's example and got the same error. Where do you accept the EULA? It doesn't pop up on screen.
On Sun, 22 Mar 2020 at 6:29, ahahmann (<[email protected]>) wrote:
One can also add username and password to the .netrc file and all works very smoothly, without a need for explicit username and password in the script.
However, there was one more issue. With Python 3.7.6, I was getting the following error:
  File "MERRA2.py", line 16, in <module>
    session = setup_session(username, password, check_url=url)
  File "/groups/FutureWind/xesmf_env/lib/python3.7/site-packages/pydap/cas/urs.py", line 19, in setup_session
    verify=verify)
  File "/groups/FutureWind/xesmf_env/lib/python3.7/site-packages/pydap/cas/get_cookies.py", line 75, in setup_session
    password_field=password_field)
  File "/groups/FutureWind/xesmf_env/lib/python3.7/site-packages/pydap/cas/get_cookies.py", line 123, in soup_login
    soup = BeautifulSoup(resp.content, 'lxml')
  File "/groups/FutureWind/xesmf_env/lib/python3.7/site-packages/bs4/__init__.py", line 228, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
That was solved by pip install lxml
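The traceback above shows the login step asking BeautifulSoup for the lxml parser. A pre-flight check along the following lines could surface the missing package before any request is made. This is a sketch only; `missing_modules` is a hypothetical helper, not part of pydap.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]
```

For example, calling missing_modules(["lxml", "bs4"]) before setup_session would list whichever of the two packages is not installed in the current environment.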
|
https://urs.earthdata.nasa.gov/app_eula/nasa_gesdisc_data_archive |
I'm trying this example:
import xarray as xr
from pydap.client import open_url
from pydap.cas.urs import setup_session
# username and password are assumed to be defined elsewhere
url = 'https://gpm1.gesdisc.eosdis.nasa.gov:443/opendap/hyrax/GPM_L3/GPM_3IMERGHH.06/2019/087/3B-HHR.MS.MRG.3IMERG.20190328-S000000-E002959.0000.V06B.HDF5'
try:
    session = setup_session(username, password, check_url=url)
    pydap_ds = open_url(url, session=session)
    store = xr.backends.PydapDataStore(pydap_ds)
    ds = xr.open_dataset(store)
except Exception as err:
    print(err)
which returns:
The error message only appears when I try to use xr.open_dataset |
Dear all, does anyone know whether it is possible in xarray.open_dataset (pydap or netcdf engines) to pass |
@wallissoncarvalho Were you ever able to make that example work? I have been getting this error using the same example as well and haven't been able to find a solution |
I'm also getting the same error when running I'm using pydap==3.2.2 and xarray==0.18.0, any help would be much appreciated!
import xarray as xr
from pydap.client import open_url
from pydap.cas.urs import setup_session
username = "my_username"
password= "my_password"
url = 'https://goldsmr4.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2T1NXSLV.5.12.4/2016/06/MERRA2_400.tavg1_2d_slv_Nx.20160601.nc4'
session = setup_session(username, password, check_url=url)
pydap_ds = open_url(url, session=session)
store = xr.backends.PydapDataStore(pydap_ds)
ds = xr.open_dataset(store)
HTTPError: 302 Found
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://urs.earthdata.nasa.gov/oauth/authorize/?scope=uid&app_type=401&client_id=e2WVk8Pw6weeLUKZYOxvTQ&response_type=code&redirect_uri=http%3A%2F%2Fgoldsmr4.gesdisc.eosdis.nasa.gov%2Fdata-redirect&state=aHR0cHM6Ly9nb2xkc21yNC5nZXNkaXNjLmVvc2Rpcy5uYXNhLmdvdi9vcGVuZGFwL01FUlJBMi9NMlQxTlhTTFYuNS4xMi40LzIwMTYvMDYvTUVSUkEyXzQwMC50YXZnMV8yZF9zbHZfTnguMjAxNjA2MDEubmM0LmRvZHM%2FdGltZSU1QjA6MTowJTVE">here</a>.</p>
</body></html> |
@AyrtonB I'm getting the same error now. Did you manage to solve it? |
Unfortunately not @zjans |
I'd like to tag @betolink in this issue. He knows quite a bit about both Xarray and Earthdata login. Maybe he can help us get to the bottom of these problems. Luis, any ideas? |
This looks familiar. I'm going to take a look at this when I get home and will report back. @rabernat |
Looks like the dataset got updated, and when that happens NASA requires users to accept the end user license agreement (again). That's why the request ends up in a redirect. This EULA is also required the first time a user requests the data. Here are the instructions for accepting the GESDISC EULA: https://disc.gsfc.nasa.gov/earthdata-login After the GESDISC data archive app shows up in our authorized apps list, the code above works like a charm. |
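Since the "302 Found" body reported in this thread links to urs.earthdata.nasa.gov, a script can at least recognize this situation and report it clearly. The following is a minimal sketch, assuming the error text contains an HTML body with an href like the one quoted above; the function name `is_earthdata_login_redirect` is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Login host taken from the redirect target quoted in this thread.
EDL_HOST = "urs.earthdata.nasa.gov"


def is_earthdata_login_redirect(error_text):
    """Return True if any link in the error text points at Earthdata Login.

    A match suggests the server asked for (re)authorization, for example
    an unaccepted EULA, instead of returning data.
    """
    for href in re.findall(r'href="([^"]+)"', error_text):
        if urlsplit(href).hostname == EDL_HOST:
            return True
    return False
```

A caller could wrap the data access in try/except, pass str(err) to this function, and print a reminder to check the authorized applications and EULAs for the account when it returns True.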
@betolink Thanks for looking into this. GESDISC was already in my lists of accepted EULAs and authorized apps. I also deleted them and re-authorized, but no change. I still get the "302 The document has moved" message when trying to access the HDF datasets under https://gpm1.gesdisc.eosdis.nasa.gov/opendap/hyrax/GPM_L3/... with xr.backends.PydapDataStore and ds.open_dataset() In the meantime, I changed my scripts to download the entire HDF files from https://gpm1.gesdisc.eosdis.nasa.gov/data/GPM_L3/... and open them locally with xarray (and do spatial subsetting etc.), which works fine but is not quite ideal. |
Yeah, definitely not ideal. I'm going to test it again this evening with a new Earthdata user. I'll send you a binder link to a notebook to test it with both accounts. |
At what point do we escalate this issue to NASA? Is there a channel via which they can receive and respond to user feedback? |
I just asked on Slack about how to check for these changes (if in the end this issue is indeed related to an updated EULA), and unfortunately there is no way around it other than doing what Jan did (and he still got the 302s). About feedback: yes, there are channels, but they are on a per-DAAC basis (cries). In this case that would be going to https://daac.gsfc.nasa.gov/ and clicking on the feedback button. I'll keep looking at this after the cloud hackathon today. |
Quick update, MERRA2 worked as expected after accepting the EULA again. GPM_L3 redirects to an empty |
Just wanted to say how much I appreciate @betolink acting as a communication channel between Xarray and NASA. Users often end up on our issue tracker because Xarray raises errors whenever it can't read data. But the source of these problems is not with Xarray, it's with the upstream data provider. This also happens all the time with xmitgcm, e.g. MITgcm/xmitgcm#266 It would be great if NASA had a better way to respond to these issues which didn't require that you "know a guy". |
One solution to this problem might be the creation of a custom Xarray backend for NASA EarthData. This backend could manage authentication with EDL and have its own documentation. If this package were maintained by NASA, it would close the feedback loop more effectively. |
I've been using xarray.open_dataset() to read Opendap netcdf files from NASA's MERRA-2 data archive. Recently they changed their site so that now you must enter a username and password to read any files. They describe here how to access data with Pydap: http://disc.sci.gsfc.nasa.gov/registration/registration-for-data-access#python.
I experimented with a similar approach (adding username and password to the url) with xarray.open_dataset() and specifying engine='pydap', but no luck. Is there a way to use xarray.open_dataset() to read password-protected Opendap files? Thanks!