Example of a data provisioning specification #174

mih · 2024-05-06T08:29:14Z

The twist here is the focus on (partial) data access to information in a dataset, rather than a (full) description of a dataset. It should enable precise instructions on what to obtain, allow for smart decisions on how to obtain it, and do all that with a lean data specification.

Decision making: I can declare a download_url for a dataset and use that, but if it is 1 GB and I only need a 1 MB file from it, a full download is not smart. That dataset may also be a DataLad dataset, in which case we may be able to clone it and get that individual file separately.
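The trade-off above can be sketched in a few lines of Python. This is only an illustration of the decision, not a proposed implementation: the `AccessOption` class, the `"download"`/`"clone"` labels, the `waste_factor` threshold, and the example URLs are all assumptions made up for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AccessOption:
    """One hypothetical way to reach a dataset (not part of any schema)."""
    kind: str                          # "download" or "clone" (labels invented here)
    url: str
    total_bytes: Optional[int] = None  # size of a full download, if known


def choose_access(options: List[AccessOption], wanted_bytes: int,
                  waste_factor: int = 10) -> Optional[AccessOption]:
    """Prefer a full download only if it is not much larger than what is needed.

    Otherwise fall back to a clone, which may allow fetching a single file.
    """
    downloads = [o for o in options if o.kind == "download"]
    clones = [o for o in options if o.kind == "clone"]
    for d in downloads:
        # A download of known size is acceptable when it stays within
        # waste_factor times the amount of data actually wanted.
        if d.total_bytes is not None and d.total_bytes <= wanted_bytes * waste_factor:
            return d
    if clones:
        return clones[0]
    # No clone available: a wasteful download is still better than nothing.
    return downloads[0] if downloads else None


options = [
    AccessOption("download", "https://example.org/dataset.tar", total_bytes=10**9),
    AccessOption("clone", "https://example.org/dataset.git"),
]
# Needing only ~1 MB out of ~1 GB selects the clone option.
chosen = choose_access(options, wanted_bytes=10**6)
```

Such a decision is only possible if the record makes the clone option recognizable as such, which is the point of the next paragraph.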

This means that we need to be able to declare a clone_url in a way that is recognizable. And we should not start declaring additional attributes like clone_url without thinking real hard. Because in no time we will have 1k additional attributes for each special case.

I am thinking to go via QualifiedAccess and have a DataService that is some kind of Git service...
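To make that idea concrete, here is a rough sketch of what such a record could look like, written as a plain Python dict. Every key and value in it (`qualified_access`, `service`, `type`, `endpoint_url`, `GitService`, `DownloadService`, the `ex:` identifier, and the URLs) is a placeholder assumed for illustration and is not taken from the schema. The point is only that the kind of service is declared once on the service, instead of through a dedicated attribute like clone_url.

```python
# Hypothetical record: all property names and type labels are assumptions.
record = {
    "id": "ex:some-dataset",
    "qualified_access": [
        {
            "service": {
                "type": "DownloadService",
                "endpoint_url": "https://example.org/data.tar",
            }
        },
        {
            "service": {
                "type": "GitService",
                "endpoint_url": "https://example.org/data.git",
            }
        },
    ],
}


def endpoints_for(rec: dict, service_type: str) -> list:
    """Return the endpoint URLs of all access entries with the given service type."""
    return [
        qa["service"]["endpoint_url"]
        for qa in rec.get("qualified_access", [])
        if qa.get("service", {}).get("type") == service_type
    ]


# A consumer can discover clone-capable endpoints without a clone_url attribute.
git_endpoints = endpoints_for(record, "GitService")
```

With a layout like this, a new kind of access would need a new service type rather than a new attribute on the dataset record.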

I think the best approach here is to try to write down a small, clean example of a record, and then get it to be compliant with the schema.


mih self-assigned this May 6, 2024

This was referenced May 6, 2024

Design datalad remake-provision datalad/datalad-remake#12

Open

First attempt at a data provisioning specification #175

Open
