Demo:

```
>>> import fsspec
>>> of = fsspec.open('github://datalad:datalad@/README.md')
>>> fp = of.open()
>>> fp.size
23922
>>> fp.read(125)
b' _____ ....
```
This is pretty much all that is needed to implement basic support for all unauthenticated transports.
For AnyUrlOperations we would need to come up with a way to identify URLs that this handler should or could handle.
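One conceivable approach (just a sketch; the helper name is made up and this is not how `AnyUrlOperations` currently decides anything) would be to match a URL's protocol against the implementations fsspec knows about:

```python
from fsspec.registry import known_implementations
from fsspec.utils import get_protocol

def could_handle(url: str) -> bool:
    # Hypothetical check: fsspec lists an implementation for this URL's
    # protocol (its optional dependencies may still need to be installed).
    return get_protocol(url) in known_implementations

could_handle('github://datalad:datalad@/README.md')  # True
could_handle('foo://bar/baz')                        # False, made-up scheme
```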
This IO support could be the basis for RemoteArchive (#184 (comment)).
But even without it, this tech enables partial archive content access.
Here is a demo. `big.zip` is 5 GB in size, and contains a 4-byte text file and a large binary blob. Here I am opening the text file from inside the archive over an SSH connection. Size reporting is still possible, hence progress reporting would be possible here too!
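For illustration only, such an access could look roughly like the following, using fsspec's URL chaining; the host, user, and paths are made up, and the SSH part needs `paramiko` installed:

```python
import fsspec

# Layer the 'zip' filesystem on top of an SFTP transport. Only the byte
# ranges needed to locate and read the small member should be fetched,
# not the whole 5 GB archive.
url = 'zip://textfile.txt::sftp://user@example.com/data/big.zip'
with fsspec.open(url, 'rb') as fp:
    print(fp.size)    # size of the member inside the archive
    print(fp.read())  # the 4-byte content
```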
An S3 `endpoint_url` could also be used as a means to discover a related credential, likely in (optional) combination with a bucket name. An absent `endpoint_url` could be interpreted as AWS S3. A distinction by endpoint and bucket name would also replace the need for the type inflation presently done in datalad-core (e.g. `aws-s3`, `nda-s3`); we could simply use an `s3` credential for all of them.
Such an `s3` credential is essentially a user/password combo. It is probably not very useful to handle session tokens, due to their lifetime limitations.
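For illustration, with fsspec's `s3` implementation (the `s3fs` package) such a credential, plus an optional endpoint, could be passed along like this; the key values, endpoint URL, and bucket name are placeholders:

```python
import fsspec

fs = fsspec.filesystem(
    's3',
    key='ACCESS_KEY_ID',          # the "user" part of the credential
    secret='SECRET_ACCESS_KEY',   # the "password" part of the credential
    # omit client_kwargs entirely to default to AWS S3
    client_kwargs={'endpoint_url': 'https://s3.example.org'},
)
print(fs.ls('some-bucket'))
```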
fsspec is already used by https://github.com/datalad/datalad-fuse
The aim here would be simpler: wrap around `fsspec.open()` to provide the standard operations `download`, `upload`, `sniff`, and possibly `delete`. This should be relatively simple in principle. More challenging would be the provisioning of the necessary credentials.
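A rough sketch of what two of these operations could look like on top of `fsspec.open()`; the function names and signatures are made up and do not reflect the actual `AnyUrlOperations` API, and credential handling is left out:

```python
import shutil
import fsspec

def download(url: str, dest_path: str, **storage_options) -> None:
    # Stream the remote content to a local file; fsspec resolves the
    # protocol from the URL, credentials would come in via storage_options.
    with fsspec.open(url, 'rb', **storage_options) as src, \
            open(dest_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)

def sniff(url: str, **storage_options) -> dict:
    # Report basic properties without downloading the content.
    with fsspec.open(url, 'rb', **storage_options) as f:
        return {'content-length': f.size}
```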
The list of supported protocols is impressively long. Individual protocols do require additional software dependencies to be installed, but even the list of built-ins is long: https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations
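To get an impression from within Python, the registry of protocols fsspec knows about can be inspected directly (only a subset is usable without installing additional dependencies):

```python
from fsspec.registry import known_implementations

# protocol name -> implementation info, including protocols whose optional
# dependencies are not installed locally
print(len(known_implementations))
print(sorted(known_implementations))
```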