ENH: Add "container metadata extractor" #198

Closed
asmacdo opened this issue Feb 28, 2023 · 6 comments · Fixed by #200

@asmacdo
Member

asmacdo commented Feb 28, 2023

The datalad-metalad extension should include an extractor that retrieves metadata from a container image.

In scope: Singularity
Out of scope (for now): OCI containers

Desired metadata ideas:

  • base image
  • base image build date
  • operating system
  • installed software & version information (reproman retrace?)
  • Vulnerability report (CVEs)
  • Image build date
  • Image location (e.g. hub.docker.com, quay.io)
@asmacdo
Member Author

asmacdo commented Feb 28, 2023

File level, not dataset level. We will extract for all Singularity images in the repo.

@yarikoptic
Member

Please develop that extractor within the datalad-container extension, not in metalad. I will transfer this issue.

@yarikoptic yarikoptic transferred this issue from datalad/datalad-metalad Feb 28, 2023
@yarikoptic yarikoptic changed the title ENH: Add "container extractor" ENH: Add "container metadata extractor" Feb 28, 2023
@yarikoptic
Member

Target use case: https://github.com/ReproNim/containers

For starters, the extractor should look only at the containers registered within the dataset, i.e. the ones listed by the containers-list command, which is also available via the Python API:

❯ pwd
/home/yoh/proj/repronim/containers
❯ python -c 'import json,datalad.api as dl; print(json.dumps(dl.containers_list(result_renderer="disabled")))' | jq '.[].path' | head
"/home/yoh/proj/repronim/containers/scripts/tests/arg-test.simg"
"/home/yoh/proj/repronim/containers/images/bids/bids-aa--0.2.0.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-afni-proc--0.0.2.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-antscorticalthickness--2.2.0-1.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-baracus--1.1.4.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-brainiak-srm--latest.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-broccoli--1.0.1.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-cpac--1.1.0_14.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-dparsf--4.3.12.sing"
"/home/yoh/proj/repronim/containers/images/bids/bids-example--0.0.7.sing"
...
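
The same listing can be produced without the shell pipeline. A minimal sketch, assuming each record returned by containers_list is a dict with a "path" key (as the jq filter above implies); the helper name container_paths is hypothetical and not part of datalad-container:

```python
from typing import Iterable, List, Mapping


def container_paths(results: Iterable[Mapping]) -> List[str]:
    """Collect the image path from each containers_list result record."""
    # Records without a "path" key are skipped rather than raising.
    return [str(r["path"]) for r in results if "path" in r]


if __name__ == "__main__":
    # Requires datalad and datalad-container; run from inside the dataset.
    import datalad.api as dl

    for path in container_paths(dl.containers_list(result_renderer="disabled")):
        print(path)
```

Keeping the filtering in a plain function means an extractor could reuse it on any iterable of result records, not only on a live dataset.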

@yarikoptic
Member

singularity inspect is a nice one:

$> singularity inspect images/neurodesk/neurodesk-vina--1.2.3.simg
{
    "org.opencontainers.image.revision": "e50c6323cd703e0c2df7aa56eb18e05757df2402",
    "maintainer": "Anaconda, Inc",
    "org.opencontainers.image.document": "https://github.com/Metaphorme/AutoDock-Vina-Docker",
    "GITHUB_SHA": "da8afcba3a56737deebcc847da204c0247a139f5",
    "org.opencontainers.image.created": "2022-12-23T15:47:51.857Z",
    "org.opencontainers.image.licenses": "MIT",
    "GITHUB_REPOSITORY": "NeuroDesk/neurocontainers",
    "org.opencontainers.image.description": "Build from release",
    "org.opencontainers.image.name": "vina-all",
    "org.opencontainers.image.version": "22.11.1",
    "org.opencontainers.image.source": "https://github.com/ContinuumIO/docker-images",
    "org.opencontainers.image.authors": "Metaphorme",
    "org.opencontainers.image.url": "https://github.com/ContinuumIO/docker-images",
    "org.opencontainers.image.title": "docker-images"
}

Maybe we should also provide a singularity-inspect metadata extractor that does just that: channel the output of that command for the file. @mslw mentioned an existing helper for extractors that rely on external command output, which might be the easiest route.
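
Channeling the command output could look roughly like the following. This is only a sketch: the function names are hypothetical, and it assumes the inspect output may or may not be JSON depending on the installed Singularity/Apptainer version, so it falls back to the raw text when parsing fails.

```python
import json
import subprocess
from typing import Callable, Sequence, Union


def parse_inspect_output(stdout: str) -> Union[dict, str]:
    """Return parsed JSON when the output is a JSON object, else the raw text."""
    try:
        parsed = json.loads(stdout)
    except json.JSONDecodeError:
        return stdout.strip()
    return parsed if isinstance(parsed, dict) else stdout.strip()


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def inspect_image(
    path: str,
    executable: str = "singularity",
    run: Callable[[Sequence[str]], str] = _run,
) -> Union[dict, str]:
    """Call `<executable> inspect <path>` and return its labels or raw output."""
    return parse_inspect_output(run([executable, "inspect", path]))
```

The run parameter exists only so the command can be swapped out in tests; calling inspect_image("images/neurodesk/neurodesk-vina--1.2.3.simg") would invoke the real binary.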

@mslw

mslw commented Feb 28, 2023

You can capture the output of external programs by using the metalad_external_file extractor (see this docs section). But given that we are already within a DataLad extension, I think it would be more logical to make the singularity-inspect extractor a proper MetaLad extractor derived from datalad_metalad.extractors.base.FileMetadataExtractor.

This docs page explains how the derived class needs to be constructed.
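
For orientation, a derived class might be shaped like the sketch below. Treat it as an assumption-laden outline rather than the actual implementation: the method names and ExtractorResult fields are written from memory of the MetaLad extractor interface and should be checked against the docs page mentioned above, and EXTRACTOR_ID, build_immediate_data, make_extractor_class, and SingularityInspect are hypothetical names.

```python
import subprocess
import uuid

# Hypothetical identifier; a real extractor needs its own fixed UUID.
EXTRACTOR_ID = uuid.UUID("00000000-0000-0000-0000-000000000198")


def build_immediate_data(path: str, inspect_output: str) -> dict:
    """Assemble the metadata record for one image file."""
    return {"path": path, "container_inspect": inspect_output.strip()}


def make_extractor_class():
    """Define the extractor lazily so this sketch loads without datalad-metalad."""
    from datalad_metalad.extractors.base import (
        DataOutputCategory,
        ExtractorResult,
        FileMetadataExtractor,
    )

    class SingularityInspect(FileMetadataExtractor):
        def get_id(self) -> uuid.UUID:
            return EXTRACTOR_ID

        def get_version(self) -> str:
            return "0.0.1"

        def get_data_output_category(self):
            # Metadata is returned directly rather than written to a file.
            return DataOutputCategory.IMMEDIATE

        def is_content_required(self) -> bool:
            # The image content must be present locally to be inspected.
            return True

        def extract(self, _=None):
            proc = subprocess.run(
                ["singularity", "inspect", str(self.file_info.path)],
                check=True, capture_output=True, text=True,
            )
            return ExtractorResult(
                extractor_version=self.get_version(),
                extraction_parameter=self.parameter or {},
                extraction_success=True,
                datalad_result_dict={"type": "file", "status": "ok"},
                immediate_data=build_immediate_data(
                    str(self.file_info.intra_dataset_path), proc.stdout),
            )

    return SingularityInspect
```

The class is built inside a function only so that the snippet can be imported on a machine without datalad-metalad; in the extension itself the imports and class would sit at module level.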

@yarikoptic
Member

I thought you had already mastered one, @asmacdo, as a PR somewhere. Please reference this issue so they get linked.

asmacdo added a commit to asmacdo/datalad-container that referenced this issue Mar 7, 2023
Fixes: datalad#198

plus docs
  (add build/ to gitignore)
plus changelog
  (add scriv to dev requirements)
asmacdo added a commit to asmacdo/datalad-container that referenced this issue Mar 9, 2023
Fixes: datalad#198

- Adds "singularity inspect path/to/file.sing" to metadata
- Adds "apptainer --version || singularity version" to metadata
asmacdo added a commit to asmacdo/datalad-container that referenced this issue Mar 15, 2023
Fixes: datalad#198

- Adds "singularity inspect path/to/file.sing" to metadata
- Adds "apptainer --version || singularity version" to metadata

find_executable will be removed in 3.12

Code that imports distutils will no longer work from Python 3.12.
Necessary for me to run locally.

Update file docstring to not lie

Handle both singularity and apptainer
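
The two points in this commit message (dropping distutils' find_executable and handling both runtimes) can be illustrated with shutil.which from the standard library. A minimal sketch, not the code from the commit; the function name container_version_command is hypothetical, and the preference order mirrors "apptainer --version || singularity version":

```python
import shutil
from typing import Callable, List, Optional


def container_version_command(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[List[str]]:
    """Return the version command for the first available runtime, or None."""
    # Prefer apptainer, then fall back to singularity.
    if which("apptainer"):
        return ["apptainer", "--version"]
    if which("singularity"):
        return ["singularity", "version"]
    return None
```

The which parameter defaults to shutil.which and is injectable only to make the lookup testable without either runtime installed.
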
asmacdo added a commit to asmacdo/datalad-container that referenced this issue Mar 24, 2023
Fixes: datalad#198

- Adds "singularity inspect path/to/file.sing" to metadata
- Adds "apptainer --version || singularity version" to metadata

find_executable will be removed in 3.12

Code that imports distutils will no longer work from Python 3.12.
Necessary for me to run locally.

Update file docstring to not lie

Handle both singularity and apptainer