ClientResponseError:400 message='Bad Request' when opening OpenDAP urls with xarray #525
Comments
Can you please go into debug and find out what the headers of the request were? Perhaps this is trying to grab a specific range, and DAP doesn't handle that.
The RequestInfo object associated with the exception is:

RequestInfo(url=URL('https://dapds00.nci.org.au/thredds/dodsC/rr6/oceanmaps_datasets/version_3.3/forecast/latest/ocean_fc_2021011512_000_eta.nc'), method='GET', headers=<CIMultiDictProxy('Host': 'dapds00.nci.org.au', 'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'Python/3.7 aiohttp/3.7.3')>, real_url=URL('https://dapds00.nci.org.au/thredds/dodsC/rr6/oceanmaps_datasets/version_3.3/forecast/latest/ocean_fc_2021011512_000_eta.nc'))

I read through the xarray PR pydata/xarray#4823, and it seems this may all be related to your reasons for limiting that PR to Zarr only.

I have followed the call stack through. The read is prompted by the 8-byte read that informs xarray's _get_engine_from_magic_number. However, at HTTPFile.read (line 436), HTTPFile.size is 65, which is less than blocksize, so self._fetch_all() is called. The size of 65 is set based on the number of characters in the URL (http.py:231); the actual file is ~10MB.

Setting the storage option 'block_size' to 40 to force a range request results in the same error and this RequestInfo:

RequestInfo(url=URL('https://dapds00.nci.org.au/thredds/dodsC/rr6/oceanmaps_datasets/version_3.3/forecast/latest/ocean_fc_2021011512_000_eta.nc'), method='GET', headers=<CIMultiDictProxy('Host': 'dapds00.nci.org.au', 'Range': 'bytes=0-47', 'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'Python/3.7 aiohttp/3.7.3')>, real_url=URL('https://dapds00.nci.org.au/thredds/dodsC/rr6/oceanmaps_datasets/version_3.3/forecast/latest/ocean_fc_2021011512_000_eta.nc'))
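For readers unfamiliar with the 8-byte read mentioned above, here is a minimal sketch of what magic-number sniffing looks like. It is not xarray's actual code: `guess_engine` is a hypothetical helper, and the mapping of signatures to engine names is an assumption for illustration. The two byte signatures themselves are the standard ones for HDF5 and netCDF classic files.

```python
from typing import Optional

# Standard file signatures (these are facts about the formats).
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 signature, used by netCDF-4 files
NETCDF3_MAGIC = b"CDF"             # netCDF classic: b"CDF" followed by a version byte


def guess_engine(first_bytes: bytes) -> Optional[str]:
    """Hypothetical helper: guess an engine name from the leading bytes of a file.

    The engine names returned here are an assumption for illustration,
    not a copy of xarray's internal logic.
    """
    if first_bytes.startswith(HDF5_MAGIC):
        return "h5netcdf"
    if first_bytes.startswith(NETCDF3_MAGIC):
        return "scipy"
    # Anything else (e.g. an HTML error page) is not recognised.
    return None
```

The point for this issue is that any such sniffing needs real bytes from the start of the file, so it triggers an HTTP read before any engine has been chosen.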
If the server didn't accept range requests, that would not be too much of a surprise, but I don't see a Range header in the original exception-causing request. _fetch_all should get the whole file, which is actually what we want. What could be going wrong?
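To make the two request shapes in this thread concrete, here is a simplified model of the decision described above. It is a sketch under assumptions, not fsspec's implementation: `plan_read` is a hypothetical function, and the read-ahead rule (extend the requested end by one block, capped at the reported size) is an assumption chosen because it reproduces the `bytes=0-47` header reported for an 8-byte read with `block_size=40` and a reported size of 65.

```python
from typing import Dict, Tuple


def plan_read(start: int, end: int, size: int, block_size: int) -> Tuple[str, Dict[str, str]]:
    """Hypothetical model of how a read of bytes [start, end) becomes a request.

    Returns the strategy name and any extra request headers.
    """
    if size < block_size:
        # Small file: fetch everything with a plain GET, no Range header.
        return "fetch_all", {}
    # Assumed read-ahead: extend by one block, but never past the reported size.
    last = min(size, end + block_size) - 1  # HTTP ranges are inclusive
    return "range", {"Range": f"bytes={start}-{last}"}


# Reported size 65 with a large block size (5 MiB used here as an example) -> plain GET.
print(plan_read(0, 8, 65, 5 * 2**20))
# Same read with block_size=40 -> ranged GET.
print(plan_read(0, 8, 65, 40))
```

Both shapes returned 400 in the reports above, which suggests the Range header is not the cause.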
So we could specify the engine and avoid this?
That's not what self.size(path) does - it tries HEAD or streaming GET. Actually, that's the intent, but I don't now see where this might happen. Actually, the URL is longer than that, and I think it's the length of
which is what I get in

I conclude that DAP shouldn't just be fetched like this. I don't know what it does internally, but presumably it passes additional parameters/queries which expose the real URL for reading.

EDIT: A little digging in pydap reveals:

url + ".dds" : dataset description (size, coords)

By the way, this is also a thredds server; maybe intake_thredds takes care of it?
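As a rough illustration of the pydap finding above, a DAP client does not GET the bare dataset URL but appends a suffix to it. The sketch below only builds the URLs and makes no network calls. `dap_urls` is a hypothetical helper and the example URL is made up. Only the `.dds` suffix is confirmed in this thread; `.das` and `.dods` are the other conventional DAP2 suffixes and are included here as an assumption about what a client would request.

```python
from typing import Dict


def dap_urls(base_url: str) -> Dict[str, str]:
    """Hypothetical helper: derive DAP2 request URLs from a dataset URL."""
    base = base_url.rstrip("/")
    return {
        "dds": base + ".dds",    # dataset description (size, coords), as noted above
        "das": base + ".das",    # attributes (assumed, conventional DAP2 suffix)
        "dods": base + ".dods",  # binary data response (assumed, conventional DAP2 suffix)
    }


# Made-up URL, for illustration only.
for kind, url in dap_urls("https://example.org/thredds/dodsC/data.nc").items():
    print(kind, url)
```

A plain HTTP file reader has no knowledge of these suffixes, which is consistent with the bare URL being rejected.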
I was hoping to have an intake-xarray Catalog entry that provided a parameter and aggregated across several files on a thredds server. The intake-thredds syntax is still a bit clunky for descending the catalog hierarchy. I only stumbled across this due to the issues introduced by including fsspec in the NetCDFSource (intake/intake-xarray#98).

I agree that this isn't the correct approach for accessing DAP, but the correct backend clients exposed through the xarray 'engine' argument don't accept file-like objects. This issue can probably be closed.
Yes, I think this issue should be closed. Perhaps we can iron out the details in intake-xarray, and/or intake-thredds can talk to xarray a bit more clearly to concat/merge datasets.
Not sure if this is an fsspec/async issue or xarray/intake-xarray. Since the change to async operation with aiohttp it is no longer possible to use an OpenDAP server endpoint with xarray when a file-like object is passed. Minimum reproducible example:
Results in:
Similar to intake/intake-xarray#98 this previously worked.
Testing that the URL is correct - this works:
However using the h5netcdf engine with a file-like object causes the same error:
This seems to be something unique to the opendap endpoint, as the standard http server works fine, e.g:
I have confirmed this with a few different opendap servers.