Concepts and terminology #98
Comments
I think it is defined in the DataLad context as "git-annex optional remote access". IIUC, it refers to git-annex special remotes.
With regard to using $ git annex initremote uncurl-ria type=external externaltype=uncurl encryption=none \
match='ria\+(?P<scheme>[^:]+)://(?P<site>[^/]*)/(?P<path>.*)$' \
'url=file:///data/ria-stores/store-1/{datalad_dsid[0]}{datalad_dsid[1]}{datalad_dsid[2]}/{datalad_dsid[3]}{datalad_dsid[4]}{datalad_dsid[5]}{datalad_dsid[6]}{datalad_dsid[7]}{datalad_dsid[8]}{datalad_dsid[9]}{datalad_dsid[10]}{datalad_dsid[11]}{datalad_dsid[12]}{datalad_dsid[13]}{datalad_dsid[14]}{datalad_dsid[15]}{datalad_dsid[16]}{datalad_dsid[17]}{datalad_dsid[18]}{datalad_dsid[19]}{datalad_dsid[20]}{datalad_dsid[21]}{datalad_dsid[22]}{datalad_dsid[23]}{datalad_dsid[24]}{datalad_dsid[25]}{datalad_dsid[26]}{datalad_dsid[27]}{datalad_dsid[28]}{datalad_dsid[29]}{datalad_dsid[30]}{datalad_dsid[31]}{datalad_dsid[32]}{datalad_dsid[33]}{datalad_dsid[34]}{datalad_dsid[35]}/annex/objects/{annex_dirhash}/{annex_key}/{annex_key}' (That is a rather long command line. This is mainly due to the construction of the 2-level dataset-id-based directory structure. A A Similarly, a Obviously, this does not include access to the archive files (which are stored in While the above indicates that ria-remotes might be implemented by inheriting from |
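The long run of `{datalad_dsid[N]}` placeholders above only splits the 36-character dataset ID into a 3-character directory and a 33-character subdirectory. A minimal sketch of that expansion, assuming the template is filled in with Python's `str.format` semantics (whether `uncurl` does exactly this internally is an assumption here); the dataset ID, dirhash, and key values are made up for illustration:

```python
# Rebuild the URL template from the command above programmatically.
# Placeholder names (datalad_dsid, annex_dirhash, annex_key) come from that
# command; every concrete value used with them below is hypothetical.
BASE = "file:///data/ria-stores/store-1"


def dsid_placeholders(start: int, stop: int) -> str:
    """Return '{datalad_dsid[start]}...{datalad_dsid[stop-1]}' as literal text."""
    return "".join(f"{{datalad_dsid[{i}]}}" for i in range(start, stop))


TEMPLATE = (
    f"{BASE}/{dsid_placeholders(0, 3)}/{dsid_placeholders(3, 36)}"
    "/annex/objects/{annex_dirhash}/{annex_key}/{annex_key}"
)


def expand(dsid: str, dirhash: str, key: str) -> str:
    """Fill in the long template, one character of the dataset ID at a time."""
    return TEMPLATE.format(datalad_dsid=dsid, annex_dirhash=dirhash, annex_key=key)


def expand_short(dsid: str, dirhash: str, key: str) -> str:
    """Same result written with slices, to show what the template encodes."""
    return f"{BASE}/{dsid[:3]}/{dsid[3:]}/annex/objects/{dirhash}/{key}/{key}"


if __name__ == "__main__":
    print(expand("8ac72b72-5c2d-11ea-a1f5-0025904abcb0", "ab/cd", "KEY-1"))
```

Both functions produce the same URL, so the command line is long only because the template syntax addresses one character at a time.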
This has never been defined properly. I think we should do it and put it in the docs. At minimum this needs to cover:
`RIA` stands for "Remote Indexed Archive"; that much we know. I cannot remember what `ORA` stands for. A `RIA store` typically refers to the particular (filesystem) structure described in the FAIRLY-big paper and the handbook chapter.
Broader perspective
RIA is a data structure where the location of all components is based on identifiers that are always available in any DataLad dataset: the DataLad dataset ID and the annex key.
Things that can be put into a RIA store are: bare Git repositories, annex keys/objects.
A RIA store takes the form of a directory tree on the file system, where some parts of the name are the respective identifiers.
The directory tree is organized (at the top level) as a collection of per-dataset subtrees. This simplifies server-side maintenance tasks (e.g., deleting a dataset is just deleting its directory; there is no need to sift through a joint key store to find all the pieces).
Conceptually, there is no reason for this file-system representation to be the only way to materialize a RIA store. Any object store could do per-dataset-per-key object addressing. It would not be as straightforward to put bare repositories in such an alternative "backend". But with the invention of `datalad-annex::`, this is no longer a fundamental issue. Bare repositories can also be "just" annex keys.
All we know about "ORA" is that this is the name of the special remote implementation that can talk to RIA stores.
The rest has been bolted on (storage of annex keys in archives to trim inode consumption; aliases).
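To make the maintenance argument concrete, here is a small sketch of a per-dataset subtree on a local file system. It assumes the two-level dataset-ID split and the `annex/objects/<dirhash>/<key>/<key>` layout shown in the `initremote` command in the comments; the dataset IDs, dirhash, and key name are invented, and the helper names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def dataset_dir(store: Path, dsid: str) -> Path:
    """Per-dataset subtree: first 3 characters of the ID, then the rest."""
    return store / dsid[:3] / dsid[3:]


def key_path(store: Path, dsid: str, dirhash: str, key: str) -> Path:
    """Location of one annex object inside the dataset's subtree."""
    return dataset_dir(store, dsid) / "annex" / "objects" / dirhash / key / key


def demo() -> list[str]:
    """Store one key for two datasets, delete one dataset, list what is left."""
    ds_a = "8ac72b72-5c2d-11ea-a1f5-0025904abcb0"  # made-up dataset IDs
    ds_b = "8ac99999-0000-11ea-a1f5-0025904abcb0"
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        for dsid in (ds_a, ds_b):
            obj = key_path(store, dsid, "ab/cd", "KEY-1")
            obj.parent.mkdir(parents=True)
            obj.write_bytes(b"payload")
        # Removing a dataset is a single directory removal; the other
        # dataset's keys are untouched even though both share the "8ac" prefix.
        shutil.rmtree(dataset_dir(store, ds_a))
        return sorted(
            p.relative_to(store).as_posix() for p in store.rglob("*") if p.is_file()
        )


if __name__ == "__main__":
    print(demo())
```

A joint key store would instead require looking up every key that belongs to the dataset before anything could be deleted.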
Relationship of `ora` special remote to the `uncurl` special remote
The `uncurl` special remote could be considered a strict superset of `ora`. It also allows for an identifier-based organization, and supports the identifiers used by `ora`. `uncurl` uses a different IO abstraction layer, which already comes with some implementations (ssh, http, file).
It would be useful to compare the implementations in detail. We may find advantages and disadvantages on either side and could use the outcome for improving `uncurl`.
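The `match` expression from the `initremote` command in the comments shows the other half of the identifier-based approach: named groups pull components out of a `ria+...` URL. A minimal sketch with Python's `re` module, assuming the expression is applied from the start of the URL; the example URL and the helper name are hypothetical:

```python
import re

# Expression copied from the `match=` parameter in the comment above.
MATCH = r"ria\+(?P<scheme>[^:]+)://(?P<site>[^/]*)/(?P<path>.*)$"


def url_props(url: str):
    """Return the named groups for a matching URL, or None if it does not match."""
    m = re.match(MATCH, url)
    return m.groupdict() if m else None


if __name__ == "__main__":
    # Hypothetical store URL, for illustration only.
    print(url_props("ria+ssh://store.example.org/data/ria-stores/store-1"))
```

As I understand it, the extracted names then become usable alongside `datalad_dsid` and `annex_key` when a URL template is filled in, but that is worth verifying against the `uncurl` documentation.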
What `uncurl` does not do "out of the box" is retrieval of archive members. ATM only the `archivist` special remote does that, but only for archives that are available locally. This missing functionality could be implemented via a dedicated `UrlOperations` implementation. We could implement `RiaSshUrlOperations`, which is just `SshUrlOperations`, except for a fallback implementation of the `download` operation that would look for the archive on access failure and see whether the archive can provide the desired key.
If we go down this path, we could also support HTTP-based partial archive access with a dedicated `RiaHttpUrloperations` (using the implementation draft that @christian-monch started).
Taken together, the `ora` special remote could be reimplemented as a plain `uncurl` remote (derived class) that employs a dedicated URL handler configuration for its internal `AnyUrlOperations`.
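A rough sketch of the proposed fallback, to make the idea easier to discuss. It does not use the real `UrlOperations` API: `FileOps` is a local-file stand-in for something like `SshUrlOperations`, `ArchiveFallbackOps` is a hypothetical name, a zip file stands in for the store's archive format, and the assumption that an archive member is named after the key is mine.

```python
import shutil
import zipfile
from pathlib import Path


class FileOps:
    """Stand-in for a UrlOperations-like handler; works on local paths only."""

    def download(self, src: Path, dst: Path) -> None:
        if not src.is_file():
            raise FileNotFoundError(src)
        shutil.copyfile(src, dst)


class ArchiveFallbackOps(FileOps):
    """Same as FileOps, but consults an archive when direct access fails."""

    def __init__(self, archive: Path):
        self.archive = archive

    def download(self, src: Path, dst: Path) -> None:
        try:
            super().download(src, dst)
        except FileNotFoundError:
            # Fallback: see whether the archive can provide the desired key.
            # Assumption: the member name equals the key (last path component).
            member = src.name
            if not self.archive.is_file():
                raise
            with zipfile.ZipFile(self.archive) as zf:
                if member not in zf.namelist():
                    raise
                dst.write_bytes(zf.read(member))
```

With this shape, callers keep using `download` unchanged, and the archive lookup only costs anything when the plain object is missing.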