Using an image index to reference nested related artifacts #1217

Open
arewm opened this issue Nov 12, 2024 · 4 comments

Comments

@arewm

arewm commented Nov 12, 2024

We have encountered some use cases where we would like to associate multiple related artifacts but which would still benefit from having their own namespace location.

  • Use case: A single Tekton task runs ko to produce multiple container images. While each of these image indexes can be pushed independently, we want to also associate them as being produced together by the single task. If we configure the ko build to push to quay.io/<username>/myrepo/ko-build/<image-name>, the task could return a single result quay.io/<username>/myrepo/ko-build which would map to all image indexes. (side note: chains should also support recursive index images).
  • Use case: We want to produce an immutable RPM repo which itself has immutable references to contained RPMs. To achieve this, we might push the repo to quay.io/<username>/myrpmrepo and the contained RPMs to quay.io/<username>/myrpmrepo/<rpm> image manifests.

Since an image index identifies each referenced image index or manifest only by its digest, it cannot reference artifacts in a nested namespace. Any client that wants to support this functionality would need to add some form of client-specific encoding (e.g. via annotations) on the image index.
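
To make the "client-specific encoding" concrete, here is a minimal sketch of an index whose descriptor carries the child repository in an annotation. The annotation key `com.example.child-repository`, the helper names, and the placeholder manifest bytes are all assumptions for illustration; no OCI specification defines such a key, and only tooling that knows about it could act on it.

```python
import hashlib
import json

# Hypothetical annotation key: not defined by any OCI specification.
CHILD_REPO_ANNOTATION = "com.example.child-repository"


def descriptor_with_child_repo(media_type, digest, size, child_repo):
    """Build an OCI descriptor carrying a client-specific repository hint."""
    return {
        "mediaType": media_type,
        "digest": digest,
        "size": size,
        "annotations": {CHILD_REPO_ANNOTATION: child_repo},
    }


def build_index(descriptors):
    """Wrap descriptors in a minimal OCI image index document."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.index.v1+json",
        "manifests": list(descriptors),
    }


# Placeholder bytes standing in for a real child manifest.
child_bytes = b"placeholder manifest"
index = build_index([
    descriptor_with_child_repo(
        "application/vnd.oci.image.manifest.v1+json",
        "sha256:" + hashlib.sha256(child_bytes).hexdigest(),
        len(child_bytes),
        "nested",  # child namespace relative to the index's repository
    )
])
print(json.dumps(index, indent=2))
```

Because the hint lives in an annotation, the index stays a valid OCI document, but a registry that validates child digests can still reject it.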

While some might be interested in having an image index refer to arbitrary pullspecs, I think that it is better to scope the references only to nested relationships as it is likely easier for registries to implement authorization models for these relationships. Being able to have a common authorization for all referenced artifacts would result in a better user experience.

@arewm arewm changed the title Using an image index to reference multiple related artifacts Using an image index to reference nested related artifacts Nov 12, 2024
@sudo-bmitch
Contributor

Hi @arewm. I believe it's possible to have a nested index today with the current specs. What is the specific change you are looking for from the spec? Is it for a standard around querying annotations?

@arewm
Author

arewm commented Nov 12, 2024

I might have made a poor choice of terms. I assume that it is possible to have an image index point to another image index within the same namespace. What isn't possible is to point to an image within a further scoped namespace.

What does work:

Produce an image in a repository

$ echo "unnested image" > unnested.txt
$ oras push quay.io/arewm/oci-spec-1217:unnested ./unnested.txt
✓ Exists    application/vnd.oci.empty.v1+json                                                             2/2  B 100.00%     0s
  └─ sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a
✓ Uploaded  unnested.txt                                                                                15/15  B 100.00%     2s
  └─ sha256:052ec3b6a7a72037458b61f231635552122aa059b58f096cc7e1b3fb0c69ea60
✓ Uploaded  application/vnd.oci.image.manifest.v1+json                                                591/591  B 100.00%     4s
  └─ sha256:fb84505a92c39564f1976b8753bec1c10443a5b89bb9e900973cbd377d59ec4c
Pushed [registry] quay.io/arewm/oci-spec-1217:unnested
ArtifactType: application/vnd.unknown.artifact.v1
Digest: sha256:fb84505a92c39564f1976b8753bec1c10443a5b89bb9e900973cbd377d59ec4c
$ oras manifest fetch --descriptor quay.io/arewm/oci-spec-1217:unnested
{"mediaType":"application/vnd.oci.image.manifest.v1+json","digest":"sha256:fb84505a92c39564f1976b8753bec1c10443a5b89bb9e900973cbd377d59ec4c","size":591}%

Create an image index referring to that

$ echo '{"schemaVersion":2,"mediaType": "application/vnd.oci.image.index.v1+json","manifests": [{"mediaType":"application/vnd.oci.image.manifest.v1+json","digest":"sha256:fb84505a92c39564f1976b8753bec1c10443a5b89bb9e900973cbd377d59ec4c","size":591}]}' > unnested-manifest.json
$ oras manifest push quay.io/arewm/oci-spec-1217:index unnested-manifest.json
Pushed [registry] quay.io/arewm/oci-spec-1217:index
Digest: sha256:2763178e5ddad8f5ff33ce5ae7d6dcc0ad6ccc46fedeff20f96c07a2c732d065
$ oras manifest fetch --pretty quay.io/arewm/oci-spec-1217:index
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:fb84505a92c39564f1976b8753bec1c10443a5b89bb9e900973cbd377d59ec4c",
      "size": 591
    }
  ]
}

Is wrapping this index in another image index what the spec is supposed to support?

$ echo '{"schemaVersion":2,"mediaType": "application/vnd.oci.image.index.v1+json","manifests": [{"mediaType":"application/vnd.oci.image.index.v1+json","digest":"sha256:2763178e5ddad8f5ff33ce5ae7d6dcc0ad6ccc46fedeff20f96c07a2c732d065","size":243}]}' > nested-index.json
$ oras manifest push quay.io/arewm/oci-spec-1217:nested-index nested-index.json
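
For reference, the `digest` and `size` in each descriptor above are derived from the exact bytes of the manifest being referenced. A rough sketch of that derivation follows; the `describe` helper is a name made up here, and because the digest depends on the precise bytes pushed, the values it prints are not expected to match the ones in this thread.

```python
import hashlib
import json

INDEX_TYPE = "application/vnd.oci.image.index.v1+json"
MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"


def describe(manifest_bytes, media_type):
    """Return a descriptor for the exact bytes of a manifest or blob."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(manifest_bytes).hexdigest(),
        "size": len(manifest_bytes),
    }


def index_bytes(descriptors):
    """Serialize a minimal image index that lists the given descriptors."""
    doc = {"schemaVersion": 2, "mediaType": INDEX_TYPE, "manifests": descriptors}
    return json.dumps(doc).encode("utf-8")


# Placeholder bytes standing in for an image manifest already in the repository.
image = describe(b'{"schemaVersion":2}', MANIFEST_TYPE)

inner = index_bytes([image])                        # index -> image manifest
outer = index_bytes([describe(inner, INDEX_TYPE)])  # index -> index

print(describe(outer, INDEX_TYPE))
```

The same calculation explains the `application/vnd.oci.empty.v1+json` lines in the oras output: the two bytes `{}` hash to the `sha256:44136fa3...` digest shown there.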

What doesn't work but is proposed:

Produce an image in a repository

$ oras push quay.io/arewm/oci-spec-1217/nested:manifest ./nested.txt
✓ Exists    nested.txt                                                                                  13/13  B 100.00%     0s
  └─ sha256:6106ccab4c28daa60df08ab663b42c27a3f884ea58da65ac8d6d848055f25588
✓ Exists    application/vnd.oci.empty.v1+json                                                             2/2  B 100.00%     0s
  └─ sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a
✓ Uploaded  application/vnd.oci.image.manifest.v1+json                                                589/589  B 100.00%     3s
  └─ sha256:99d5be857b2bd8906bb3fcd742119b28bbb637859d6ac2fb382c71058dceb01d
Pushed [registry] quay.io/arewm/oci-spec-1217/nested:manifest
ArtifactType: application/vnd.unknown.artifact.v1
Digest: sha256:99d5be857b2bd8906bb3fcd742119b28bbb637859d6ac2fb382c71058dceb01d
$ oras manifest fetch --descriptor quay.io/arewm/oci-spec-1217/nested:manifest
{"mediaType":"application/vnd.oci.image.manifest.v1+json","digest":"sha256:99d5be857b2bd8906bb3fcd742119b28bbb637859d6ac2fb382c71058dceb01d","size":589}%

Then trying to point to that from an image index in a "parent" namespace

$ echo '{"schemaVersion":2,"mediaType": "application/vnd.oci.image.index.v1+json","manifests": [{"mediaType":"application/vnd.oci.image.manifest.v1+json","digest":"sha256:99d5be857b2bd8906bb3fcd742119b28bbb637859d6ac2fb382c71058dceb01d","size":589}]}' > nested-manifest.json
$ oras manifest push quay.io/arewm/oci-spec-1217:index nested-manifest.json
Error response from registry: failed to tag index: manifest invalid: manifest invalid: map[message:Could not find child manifest with digest `sha256:99d5be857b2bd8906bb3fcd742119b28bbb637859d6ac2fb382c71058dceb01d`]

Proposed change to the spec to add support for nested artifact namespaces

In the failing example above, Quay returns an error because the registry couldn't find a manifest with the provided digest in the repository being pushed to. If we add an optional property to the entries in manifests that points to a child namespace, then a registry could look for the digest there instead.
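
As a purely illustrative sketch of that lookup: the property name `namespace` and the function `resolve_repository` are hypothetical, since the proposal does not name the property and no OCI spec defines one. The check that the value stays below the parent repository reflects the "nested relationships only" scoping suggested earlier.

```python
# Hypothetical descriptor property; the proposal does not name it.
CHILD_NAMESPACE_FIELD = "namespace"


def resolve_repository(parent_repo, descriptor):
    """Return the repository in which to look up the descriptor's digest."""
    child = descriptor.get(CHILD_NAMESPACE_FIELD)
    if child is None:
        # Today's behavior: content lives in the same repository as the index.
        return parent_repo
    # Only allow paths strictly below the parent repository.
    if any(part in ("", ".", "..") for part in child.split("/")):
        raise ValueError(f"child namespace must be a nested relative path: {child!r}")
    return f"{parent_repo}/{child}"


descriptor = {
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "digest": "sha256:99d5be857b2bd8906bb3fcd742119b28bbb637859d6ac2fb382c71058dceb01d",
    "size": 589,
    CHILD_NAMESPACE_FIELD: "nested",
}
print(resolve_repository("arewm/oci-spec-1217", descriptor))
```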

@arewm
Author

arewm commented Nov 12, 2024

This issue also comes from me, as a registry user, expecting that namespaces within a registry should roughly correspond to logically similar artifacts. When it comes to binary containers, for example (e.g. ko above), my expectation is that each application binary is in its own namespace instead of being in the same namespace with only a unique tag.

Artifacts stored in a registry have relationships to each other. For me, the relationships can be described with:

  • same namespaces: Artifacts that are similar in their construction and intent. They may vary from other artifacts in the same namespace based on version, platform, etc.
  • separate namespaces: By extension, if artifacts are in separate namespaces, then they are conceptually different. Examples include binary containers for different application code, different types of unrelated files.
  • image indexes: Description of links between artifacts. Examples include multiple artifacts built for different platforms, coupling configuration blobs with binary container artifacts, providing directory-like structure to blobs pushed to the registry.
  • referrers API: Optional data that can be created to further describe the original artifact. Attestations can also be used here if there is an identity available for signing. Examples include BOMs, CVE/test reports, alternative formats (e.g. tars of the image).

@sudo-bmitch
Contributor

There's no concept of a cross-repository descriptor in OCI, every descriptor references content in the same repository. Trying to change that is likely to face a lot of resistance due to all the issues it creates. A few issues I can think of:

  • Authentication is scoped per repository. Allowing cross-repository descriptors creates the potential for data leakage, and content may only be partially accessible to consumers.
  • Copying content to a new location would either lose the different repositories and flatten the content to a single repository, potentially break if the identity performing the copy doesn't have access to create additional repositories (which may not match the receiving organization's repository naming scheme), or leave references to external registries (breaking the ability to move content into isolated networks).
  • All existing tooling to consume content would need to be redesigned and would break when encountering descriptors that span different repositories.
  • Including the repository name in the descriptor would be immutable without changing the top level index digest, locking in the naming of the other repositories.
  • Authentication to pull the content is unclear. Would one request multiple scopes for all of the other repositories, or a new auth request per repository being accessed?

Right now, any cross-repository logic is registry specific and requires client configuration. E.g. it is possible to push an artifact with a subject digest that does not exist in the current repository. However, registries are permitted to implement their own garbage collection process that would include that content, and clients would need to know how to find the subject digest from another repository and be configured with the repository name to query the referrers API.

Similarly, it's possible to push an Index that has descriptors that do not exist in the local repository (this is specifically desired for sparse copies that only include specific platforms used in an environment). However, most registries reject that as part of their validation, and many clients would break if they attempted to consume content and follow the path to a missing digest. To support your own use case, you would need a registry that doesn't validate the Index, and would need client tooling that knows how to find the content in different repositories.
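
The validation described here can be pictured with a small sketch. This is not Quay's or any other registry's actual implementation; `missing_children` and the in-memory set of digests are assumptions standing in for a repository lookup. The digests are the ones pushed earlier in this thread.

```python
def missing_children(index, repo_digests):
    """Return the digests listed in the index that the repository does not hold."""
    return [
        desc["digest"]
        for desc in index.get("manifests", [])
        if desc["digest"] not in repo_digests
    ]


# Manifests present in quay.io/arewm/oci-spec-1217 in the examples above.
repo_digests = {
    "sha256:fb84505a92c39564f1976b8753bec1c10443a5b89bb9e900973cbd377d59ec4c",
    "sha256:2763178e5ddad8f5ff33ce5ae7d6dcc0ad6ccc46fedeff20f96c07a2c732d065",
}

# The index that was rejected: its child lives in .../oci-spec-1217/nested.
nested_manifest_index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:99d5be857b2bd8906bb3fcd742119b28bbb637859d6ac2fb382c71058dceb01d",
            "size": 589,
        }
    ],
}

missing = missing_children(nested_manifest_index, repo_digests)
if missing:
    print("would reject push, missing children:", missing)
```

A registry that accepts sparse indexes would simply skip this check, which is why support varies between registries.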

Overall, the easy answer is to put all of the content in the same repository, but be aware that some registries may reject a nested Index even when the referenced content exists (I recall someone validating that the media types listed in an OCI Index were limited to only an OCI Image). Also, clients need to be configured to know how to walk a nested Index to find their content.
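
Walking a nested Index within a single repository could look roughly like the sketch below. The `fetch` callable is an assumption standing in for a manifest GET against one repository, the in-memory `store` replaces a real registry, and `sha256:outer-placeholder` is a made-up key because the digest of the nested-index push is not shown in this thread. Only the OCI index media type is treated as nestable here.

```python
INDEX_TYPE = "application/vnd.oci.image.index.v1+json"
MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"


def walk_index(digest, fetch, seen=None):
    """Yield descriptors of every non-index manifest reachable from an index.

    fetch(digest) must return the parsed manifest JSON for that digest.
    """
    seen = set() if seen is None else seen
    if digest in seen:
        return  # the same index can be listed more than once; visit it once
    seen.add(digest)
    for desc in fetch(digest).get("manifests", []):
        if desc["mediaType"] == INDEX_TYPE:
            # Descend into the nested index.
            yield from walk_index(desc["digest"], fetch, seen)
        else:
            yield desc


INNER = "sha256:2763178e5ddad8f5ff33ce5ae7d6dcc0ad6ccc46fedeff20f96c07a2c732d065"
IMAGE = "sha256:fb84505a92c39564f1976b8753bec1c10443a5b89bb9e900973cbd377d59ec4c"

# In-memory stand-in for one repository: outer index -> inner index -> image.
store = {
    "sha256:outer-placeholder": {
        "manifests": [{"mediaType": INDEX_TYPE, "digest": INNER, "size": 243}]
    },
    INNER: {
        "manifests": [{"mediaType": MANIFEST_TYPE, "digest": IMAGE, "size": 591}]
    },
}

leaves = list(walk_index("sha256:outer-placeholder", store.__getitem__))
print([d["digest"] for d in leaves])
```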
