Read h5 file using AWS S3 s3fs/boto3 #144
Comments
The h5py library doesn't accept a Python file-like object. It expects a string pathname to a local file. The HDF library does not work well with cloud-based data. See http://matthewrocklin.com/blog/work/2018/02/06/hdf-in-the-cloud for further discussion. This is a good question, thank you for raising it, but solving it is out of scope for s3fs, so I'm going to close this issue. |
Support for file-like objects was added in h5py v2.9. See "Python file-like objects" in http://docs.h5py.org/en/stable/high/file.html. You can open HDF5 files with s3fs like so:
Performance will vary depending on how the file is structured and latency between where your code is running and the S3 bucket where the file is stored (running in the same AWS region is best), but if you have some existing Python h5py code, this is easy enough to try out. |
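For readers landing here, a minimal sketch of the file-like approach described in the comment above, assuming h5py 2.9 or later and s3fs are installed. The function name `list_h5_keys` and the `anon` default are illustrative assumptions, not code from this thread, and the path you pass is your own.

```python
def list_h5_keys(s3_path, anon=True):
    """Return the top-level names in an HDF5 file stored on S3.

    `s3_path` is a "bucket/key" string; `anon=True` assumes a public bucket.
    """
    # Imported inside the function because both are optional third-party
    # dependencies; h5py must be 2.9+ to accept a file-like object.
    import h5py
    import s3fs

    fs = s3fs.S3FileSystem(anon=anon)
    # Open the S3 object in binary mode and hand the file-like object to h5py.
    with fs.open(s3_path, "rb") as f:
        with h5py.File(f, "r") as h5:
            # Materialize the names before the file is closed.
            return list(h5.keys())
```

Calling `list_h5_keys("my-bucket/path/to/file.h5")` (a hypothetical path) would return the names of the top-level groups and datasets. Anything you need from the file has to be read inside the `with` blocks, since the handles are closed on exit.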
That's great, thanks for sharing @jreadey! |
Nice!
|
(should work with any file system backend) |
thanks for the code snippet. I tried it and it works fine until I want to close the file to leave the function. Is there any way to get around this?
This is the error that comes up as soon as it leaves the with statement:
Any help is appreciated |
but S3File does have an attribute
to be explicit. |
thanks. I solved it with the following solution:
I don't really know why I had to give different read/write options to hdf5 and s3, but this way it does work. |
HDF5 will assume binary in every case, but fsspec follows the Python convention that 'r' means text mode. |
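To illustrate the convention mentioned above, here is a small sketch using only Python's built-in `open` on a local temporary file (no S3 or h5py involved). The helper name `read_both_modes` and the choice of `latin-1` for the text-mode read are assumptions made so the example runs the same everywhere.

```python
import os
import tempfile

# The 8-byte signature that starts every HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def read_both_modes():
    """Write the signature to a temp file, then read it back as bytes and as text."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.bin")
        with open(path, "wb") as f:
            f.write(HDF5_SIGNATURE)

        # "rb": binary mode, returns the bytes exactly as stored.
        with open(path, "rb") as f:
            raw = f.read()

        # "r": text mode, decodes to str and translates "\r\n" to "\n".
        with open(path, "r", encoding="latin-1") as f:
            text = f.read()

    return raw, text


raw, text = read_both_modes()
print(type(raw).__name__, len(raw))    # bytes 8
print(type(text).__name__, len(text))  # str 7
```

The text-mode read silently drops a byte through newline translation, which is why the file-like object handed to h5py should be opened with `'rb'` even though `h5py.File` itself takes `'r'`.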
thanks for the explanation. |
for the interested reader directed here by search engines (like me) -- another effective workaround:
|
The above is essentially equivalent to:
which may or may not seem simpler. |
I am trying your solution but getting an empty dictionary, as if the file were only opening in view mode. Any suggestions? |
I am trying to read h5 file from AWS S3. I am getting the following errors using s3fs/boto3. Can you help? Thanks!
TypeError: expected str, bytes or os.PathLike object, not S3File
TypeError: expected str, bytes or os.PathLike object, not S3File
TypeError: expected str, bytes or os.PathLike object, not StreamingBody
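The wording of these errors matches what Python itself raises when an object that is not a path is used where a filesystem path is expected. A minimal sketch that reproduces the message with the standard library only, using `io.BytesIO` as a stand-in for `S3File` or `StreamingBody` (the helper name `path_error_for` is hypothetical):

```python
import io
import os


def path_error_for(obj):
    """Return the TypeError message os.fspath raises for obj, or None if obj is path-like."""
    try:
        os.fspath(obj)
    except TypeError as exc:
        return str(exc)
    return None


# A file-like object is not a path, so the conversion fails.
print(path_error_for(io.BytesIO(b"")))
# A plain string is accepted as a path.
print(path_error_for("data.h5"))
```

The first call prints `expected str, bytes or os.PathLike object, not BytesIO` and the second prints `None`. This suggests the h5py version in use was treating its first argument as a filename, which is consistent with file-like support only arriving in h5py 2.9.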