First rough mapping of a DataLad dataset export onto data-distribution #128

Merged: mih merged 1 commit into main from ddist on Mar 27, 2024

@mih mih commented Mar 27, 2024

I think this worked pretty well for a first attempt!

A number of examples have been added that roughly reflect how metadata records are compartmentalized for storage in a DataLad dataset:

  • annex key metadata -> Distribution
  • git blob metadata -> Distribution
  • git tree metadata -> Distribution
  • git commit -> Resource
    • this is also the anchor for provenance records
    • here demo'd for a derivation from a parent commit via an unspecific "change"
  • datalad dataset (version-less) -> Resource
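
For orientation, the mapping above can be written down as a small lookup table. This is only a sketch: the keys are informal labels made up for illustration, not identifiers from the schema, and `target_class` is a hypothetical helper.

```python
# Hypothetical lookup table mirroring the list above.
# The keys are informal labels, NOT identifiers defined by the schema.
COMPONENT_TO_CLASS = {
    "annex-key": "Distribution",
    "git-blob": "Distribution",
    "git-tree": "Distribution",
    "git-commit": "Resource",       # also the anchor for provenance records
    "datalad-dataset": "Resource",  # version-less
}


def target_class(component_kind: str) -> str:
    """Return the class name a dataset component is mapped onto."""
    try:
        return COMPONENT_TO_CLASS[component_kind]
    except KeyError:
        raise ValueError(f"no mapping for component kind {component_kind!r}") from None


print(target_class("git-commit"))
```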

Structurally, EntityInfluence has been enhanced to accept multiple entities. This makes specifying a homogeneous set of roles for a number of entities more space-efficient. It also enables the annotation of merge commits (which have multiple parent commits).
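
To illustrate the idea, here is a minimal sketch of a multi-entity influence record. Only the class name `EntityInfluence` comes from this PR; the field names `entities` and `roles`, the helper `influence_from_parents`, the placeholder commit identifiers, and the role label are assumptions, and the real schema may look different.

```python
from dataclasses import dataclass, field


@dataclass
class EntityInfluence:
    """Sketch only: field names are assumptions, not the schema's slots."""

    entities: list[str]                             # one or more influencing entities
    roles: list[str] = field(default_factory=list)  # roles shared by all entities


def influence_from_parents(parents: list[str], role: str = "change") -> EntityInfluence:
    """Build one record covering every parent commit (hypothetical helper)."""
    if not parents:
        raise ValueError("at least one parent commit is required")
    return EntityInfluence(entities=list(parents), roles=[role])


# A regular commit has one parent; a merge commit has several.
# Both fit into a single record instead of one record per parent.
regular = influence_from_parents(["parent-a"])
merge = influence_from_parents(["parent-a", "parent-b"])
print(len(regular.entities), len(merge.entities))
```

With a single-entity record, the merge case would need the same role repeated once per parent, which is the redundancy the multi-entity form avoids.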

This should make the datalad-dataset-component schema obsolete.

@mih mih merged commit 64ab50d into main Mar 27, 2024
3 checks passed
@mih mih deleted the ddist branch March 27, 2024 16:31