Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

oras cp fails if org.opencontainers.image.ref.name contains a full reference #1505

Closed
1 task done
mauriciovasquezbernal opened this issue Sep 19, 2024 · 26 comments · Fixed by #1507
Closed
1 task done
Labels
bug Something isn't working enhancement New feature or request
Milestone

Comments

@mauriciovasquezbernal
Copy link
Contributor

mauriciovasquezbernal commented Sep 19, 2024

What happened in your environment?

We use OCI images to package eBPF programs in Inspektor Gadget. We have an ig image export command that creates an oci layout file with the image. I wanted to use oras cp to copy the image to a remote registry, however it fails with:

$ oras cp --from-oci-layout ./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget:latest localhost:5000/foo:latest --debug
Error: invalid argument "./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget:latest": failed to find path "./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget": stat ./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget: no such file or directory

The issue seems to be related to the fact that the org.opencontainers.image.ref.name annotation is set to ghcr.io/inspektor-gadget/gadget/mygadget:latest in index.json:

mygadget.tar.folder/index.json:

{
    "schemaVersion": 2,
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:69fb10d45f448525a39bb26ae2af71e57fe2b7aa774547eb57da38ddf08b12e0",
            "size": 1278,
            "annotations": {
                "org.opencontainers.image.created": "2024-09-19T13:06:53-05:00",
                "org.opencontainers.image.description": "TODO: Fill the gadget description",
                "org.opencontainers.image.documentation": "TODO: Fill the gadget documentation URL",
                "org.opencontainers.image.source": "TODO: Fill the gadget source code URL",
                "org.opencontainers.image.title": "TODO: Fill the gadget name",
                "org.opencontainers.image.url": "TODO: Fill the gadget homepage URL"
            },
            "platform": {
                "architecture": "arm64",
                "os": "linux"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.index.v1+json",
            "digest": "sha256:f31a444623261efcc68601704dd3d980692a73a55f5031c8ddca58cc8a23876e",
            "size": 1816,
            "annotations": {
                "org.opencontainers.image.created": "2024-09-19T13:06:53-05:00",
                "org.opencontainers.image.description": "TODO: Fill the gadget description",
                "org.opencontainers.image.documentation": "TODO: Fill the gadget documentation URL",
                "org.opencontainers.image.ref.name": "ghcr.io/inspektor-gadget/gadget/mygadget:latest",
                "org.opencontainers.image.source": "TODO: Fill the gadget source code URL",
                "org.opencontainers.image.title": "TODO: Fill the gadget name",
                "org.opencontainers.image.url": "TODO: Fill the gadget homepage URL"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:f99e163810194bd7fc50e23c9f1d7d827e464aac1766892f3747b1764301c3ae",
            "size": 1278,
            "annotations": {
                "org.opencontainers.image.created": "2024-09-19T13:06:53-05:00",
                "org.opencontainers.image.description": "TODO: Fill the gadget description",
                "org.opencontainers.image.documentation": "TODO: Fill the gadget documentation URL",
                "org.opencontainers.image.source": "TODO: Fill the gadget source code URL",
                "org.opencontainers.image.title": "TODO: Fill the gadget name",
                "org.opencontainers.image.url": "TODO: Fill the gadget homepage URL"
            },
            "platform": {
                "architecture": "amd64",
                "os": "linux"
            }
        }
    ]
}

According to https://github.com/opencontainers/image-spec/blob/v1.1.0-rc5/annotations.md, ghcr.io/inspektor-gadget/gadget/mygadget:latest is a valid value for the annotation.

Possible Fix

The following diff fixes the issue:

diff --git a/cmd/oras/internal/option/target.go b/cmd/oras/internal/option/target.go
index 489eeb8..33b83af 100644
--- a/cmd/oras/internal/option/target.go
+++ b/cmd/oras/internal/option/target.go
@@ -38,7 +38,6 @@ import (
        "oras.land/oras-go/v2/registry/remote/auth"
        "oras.land/oras-go/v2/registry/remote/errcode"
        oerrors "oras.land/oras/cmd/oras/internal/errors"
-       "oras.land/oras/cmd/oras/internal/fileref"
 )

 const (
@@ -120,6 +119,20 @@ func (opts *Target) Parse(cmd *cobra.Command) error {
        }
 }

+// Parse parses file reference on unix.
+func parseModified(reference string, defaultMetadata string) (filePath, metadata string, err error) {
+       i := strings.Index(reference, ":")
+       if i < 0 {
+               filePath, metadata = reference, defaultMetadata
+       } else {
+               filePath, metadata = reference[:i], reference[i+1:]
+       }
+       if filePath == "" {
+               return "", "", fmt.Errorf("found empty file path in %q", reference)
+       }
+       return filePath, metadata, nil
+}
+
 // parseOCILayoutReference parses the raw in format of <path>[:<tag>|@<digest>]
 func (opts *Target) parseOCILayoutReference() error {
        raw := opts.RawReference
@@ -132,7 +145,7 @@ func (opts *Target) parseOCILayoutReference() error {
        } else {
                // find `tag`
                var err error
-               path, ref, err = fileref.Parse(raw, "")
+               path, ref, err = parseModified(raw, "")
                if err != nil {
                        return errors.Join(err, errdef.ErrInvalidReference)
                }

I'm willing to open a fix once we agree on a solution.

What did you expect to happen?

oras cp should have copied the image to the remote repository

How can we reproduce it?

Set the org.opencontainers.image.ref.name annotation to something like foo:bar in index.json and try to run oras cp --from-oci-layout ./myfolder:foo:bar localhost:5000/foo:latest

What is the version of your ORAS CLI?

Manually compiled from 961e9f8

What is your OS environment?

Ubuntu 22.04 (it really doesn't matter for this bug)

Are you willing to submit PRs to fix it?

  • Yes, I am willing to fix it.
@mauriciovasquezbernal mauriciovasquezbernal added bug Something isn't working triage New issues or PRs to be acknowledged by maintainers labels Sep 19, 2024
@sajayantony
Copy link
Contributor

That’s a fair requirement as per image spec. Coincidentally this point came up during the OCI calls. Thanks for being willing to contribute the fix @mauriciovasquezbernal

@mauriciovasquezbernal
Copy link
Contributor Author

Opened #1507 with (hopefully) a right fix for it.

@shizhMSFT
Copy link
Contributor

@sajayantony I have a concern about this. According to the annotations.md

ref       ::= component ("/" component)*
component ::= alphanum (separator alphanum)*
alphanum  ::= [A-Za-z0-9]+
separator ::= [-._:@+] | "--"

Reference names like a:b:c:d:e:f, a@b@c@d@e@f, and a@b:c@d:e@f are also valid although their meaning is confusing.

@mauriciovasquezbernal If you look at the image-layout.md, there is a note:

Implementor's Note: A common use case of descriptors with a "org.opencontainers.image.ref.name" annotation is representing a "tag" for a container image. For example, an image may have a tag for different versions or builds of the software. In the wild you often see "tags" like "v1.0.0-vendor.0", "2.0.0-debug", etc. Those tags will often be represented in an image-layout repository with matching "org.opencontainers.image.ref.name" annotations like "v1.0.0-vendor.0", "2.0.0-debug", etc.

Therefore, I'm not sure if storing the full reference in org.opencontainers.image.ref.name is a good practice or not although it is allowed.

@shizhMSFT
Copy link
Contributor

@mauriciovasquezbernal FYI, containerd uses io.containerd.image.name for storing the full reference. Maybe Inspektor Gadget can do something similar.

{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
      "digest": "sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4f9454",
      "size": 1638,
      "annotations": {
        "io.containerd.image.name": "docker.io/library/alpine:latest",
        "org.opencontainers.image.ref.name": "latest"
      }
    }
  ]
}

References:

@shizhMSFT
Copy link
Contributor

Nevertheless, oras should support this scenario for maximum compatibility as the spec allows it.

@shizhMSFT
Copy link
Contributor

Thank @mauriciovasquezbernal for the fix (#1507). However, fixing this issue is not trivial as we need to consider paths for various OSs and edge cases.

For instances,

  • C:\OCI\foo:bar should be parsed as C:\OCI\foo and bar not C and \OCI\foo:bar
  • /oci:foo:bar is ambiguous as it can be /oci:foo and bar or /oci and foo:bar

Again, taking ./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget:latest as an example, it can be interpreted as ./mygadget.tar.folder as the file path and ghcr.io/inspektor-gadget/gadget/mygadget:latest as the reference. Or ./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget as the file path and latest as the reference. Both are valid since : is a valid character in Unix/Linux file systems.

In other words, we need to find a special separator, which is not in the ref grammar such as # (i.e. proposal 1). For example,

oras cp --from-oci-layout ./mygadget.tar.folder#ghcr.io/inspektor-gadget/gadget/mygadget:latest localhost:5000/foo:latest

Or, we need a new flag such as --oci-layout-path <path> (i.e. proposal 2). For example,

oras cp --from-oci-layout-path ./mygadget.tar.folder ghcr.io/inspektor-gadget/gadget/mygadget:latest localhost:5000/foo:latest

Note: This flag should be mutual exclusive with --oci-layout

I think proposal 2 is clearer, less ambiguous, and backward compatible.
@mauriciovasquezbernal @sajayantony @qweeah @FeynmanZhou What do you think?

@FeynmanZhou
Copy link
Member

@shizhMSFT If the source image in OCI layout is ./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget:latest, why does the proposal 2 specifies two target image references? Do you want to use the following command as an example?

oras cp --from-oci-layout-path ./mygadget.tar.folder:ghcr.io/inspektor-gadget/gadget/mygadget:latest localhost:5000/foo:latest

@shizhMSFT
Copy link
Contributor

@FeynmanZhou Here's the difference.

Original:

oras cp --from-oci-layout <path>:<reference> <destination>

Proposal 2: There is no separator :.

oras cp --from-oci-layout-path <path> <reference> <destination>

@mauriciovasquezbernal
Copy link
Contributor Author

Is it really worth it to introduce a new flag? What about continue supporting the : for the file:tag case and introduce support for # for the file#reference case? I think it'll provide backwards compatibility (previous support will continue to work as usual) and it will provide a hatch for the case described in this issue.

In any case, I'll let you as a maintainers make the decision and I will implement it.

@shizhMSFT
Copy link
Contributor

The # separator is not a silver bullet since it can be a part of the file name and introduce ambiguity.

For example, ./mygadget.tar.folder#ghcr.io/inspektor-gadget/gadget/mygadget:latest can be parsed as

  • ./mygadget.tar.folder and ghcr.io/inspektor-gadget/gadget/mygadget:latest
  • ./mygadget.tar.folder#ghcr.io/inspektor-gadget/gadget/mygadget and latest

Both parsing strategies are valid.

@qweeah
Copy link
Contributor

qweeah commented Oct 8, 2024

For proposal 2, can we move the reference out like:

oras cp --from-oci-layout-ref <reference> <path> <destination>

@shizhMSFT
Copy link
Contributor

For proposal 2, can we move the reference out like:

oras cp --from-oci-layout-ref <reference> <path> <destination>

The original design of oras cp is oras cp <source_ref> <destination_ref>. Therefore, I think it will be more natural to keep the meaning of those positional arguments and leave additional stuffs to flags.

@FeynmanZhou
Copy link
Member

IMO, the proposal #2 is much flexible than the proposal #1. However, the only ambiguity of proposal #2 is that users might be confused about two parameters <reference> and <destination>.

oras cp --from-oci-layout-path <path> <reference> <destination>

For example, I thought ghcr.io/inspektor-gadget/gadget/mygadget:latest is the source image, but actually not.

oras cp --from-oci-layout-path ./mygadget.tar.folder ghcr.io/inspektor-gadget/gadget/mygadget:latest localhost:5000/foo:latest

How about we make the new flag --from-oci-layout-path as an experimental flag in v1.3.0? If there is a better solution in the future or OCI spec changes, we could easily change the UX.

@FeynmanZhou
Copy link
Member

cc more ORAS maintainers @TerryHowe @oras-project/oras-cli for inputs

@TerryHowe
Copy link
Member

TerryHowe commented Oct 9, 2024

It could be hard to parse the command line in proposal 2 oras cp --from-oci-layout-ref <reference> <path> <destination> it makes more sense to me if we were going to do that to have the user specify that format in quotes like oras cp --from-oci-layout-ref "<reference> <path>" <destination>

@sajayantony
Copy link
Contributor

@TerryHowe were you going for?

oras cp --from-oci-layout-ref  '<reference> <path>'

@qweeah
Copy link
Contributor

qweeah commented Oct 16, 2024

Sorry a little bit lost, will there still be 2 positional arguments for proposal --from-oci-layout-ref '<reference> <path>'? If yes, will the first argument be <path> and the command looks like oras cp --from-oci-layout-ref '<reference> <path>' <path> <destination> and <path> is duplicated.

@FeynmanZhou
Copy link
Member

FeynmanZhou commented Oct 16, 2024

Hi @TerryHowe , just to double confirm, does '<reference> <path>' mean one positional parameter with a space as a separator? For your proposal, is it much more straightforward to use oras cp --from-oci-layout-path "<path> <reference>" <destination>?

@qweeah
Copy link
Contributor

qweeah commented Oct 16, 2024

Hi @TerryHowe , just to double confirm, does '<reference> <path>' mean one positional parameter with a space as a separator? For your proposal, is it much more straightforward to use oras cp --from-oci-layout-path "<path> <reference>" <destination>?

@FeynmanZhou no this is not what I mean. I am confirming whether --from-oci-layout-ref is a switch flag that cannot take any value as input?

@qweeah
Copy link
Contributor

qweeah commented Oct 16, 2024

Confirming since @shizhMSFT and I are suggesting a flag that take string value as parameter, but what @TerryHowe suggested is a switch flag?

@FeynmanZhou FeynmanZhou removed the triage New issues or PRs to be acknowledged by maintainers label Oct 16, 2024
@FeynmanZhou FeynmanZhou added this to the v1.3.0 milestone Oct 16, 2024
@TerryHowe
Copy link
Member

What I should have said is the usage would not change, but the way it is used would. Current usage:

oras cp [flags] <from>{:<tag>|@<digest>} <to>[:<tag>[,<tag>][...]]

would parse this just fine:

oras cp --from-oci-layout-ref "<reference> <path>" <destination>

@shizhMSFT
Copy link
Contributor

shizhMSFT commented Oct 16, 2024

I'd like to elaborate more on the original design of oras cp and how it involves with the flag --oci-layout-path.

The original command is

oras cp <source> <destination>

which means "use oras to copy from source to destination by references".

With the bool flag --oci-layout, it involves as

oras cp <source> <destination> --from-oci-layout --to-oci-layout

which means "use oras to copy from source to destination by references where the source reference is in OCI image layout form and the destination reference is in OCI image layout form".
After rearranging the flags, it gets clearer as

oras cp --from-oci-layout <source> --to-oci-layout <destination>

Similarly, with the string value flag --oci-layout-path, it involves as

oras cp <source> <destination> --from-oci-layout-path <source_path> --to-oci-layout-path <destination_path>

which means "use oras to copy from source to destination by references where the source reference can be found in the given OCI image layout path and the destination is the reference in the given OCI image layout path".
After rearranging the flags, it gets clearer as

oras cp --from-oci-layout-path <source_path> <source> --to-oci-layout-path <destination_path> <destination>

If only source or destination is OCI image layout, it can be

oras cp --from-oci-layout-path <source_path> <source> <destination>

or

oras cp --to-oci-layout-path <destination_path> <destination>

@TerryHowe @qweeah @FeynmanZhou @mauriciovasquezbernal @sajayantony I recall that the bool flag --oci-layout is not the complete form in the initial design but --target type=<type>[[,<key>=<value>][...]] (see the code reference below).

// applyFlagsWithPrefix applies flags to fs with prefix and description.
// The complete form of the `target` flag is designed to be
//
// --target type=<type>[[,<key>=<value>][...]]
//
// For better UX, the boolean flag `--oci-layout` is introduced as an alias of
// `--target type=oci-layout`.
// Since there is only one target type besides the default `registry` type,
// the full form is not implemented until a new type comes in.
func (opts *Target) applyFlagsWithPrefix(fs *pflag.FlagSet, prefix, description string) {
flagPrefix, notePrefix := applyPrefix(prefix, description)
fs.BoolVarP(&opts.IsOCILayout, flagPrefix+"oci-layout", "", false, "set "+notePrefix+"target as an OCI image layout")
}

Therefore, the following commands should be equivalent by design.

oras cp --from-oci-layout <source> --to-oci-layout <destination>
oras cp --from-target type=oci-layout <source> --to-target type=oci-layout <destination>

It is also natural to have --target type=oci-layout[{,src=<path>|,dest=<path>}] as the complete form of --oci-layout-path <path>. That is, the following commands should be equivalent by design.

oras cp --from-oci-layout-path <source_path> <source> --to-oci-layout-path <destination_path> <destination>
oras cp --from-target type=oci-layout,src=<source_path> <source> --to-target type=oci-layout,dest=<destination_path> <destination>

@shizhMSFT
Copy link
Contributor

The proposal oras cp --from-oci-layout-ref "<reference> <path>" <destination> solves the problem technically. However, it can be improved. Given that space, as a separator, is not in the grammar of <reference>, the order of "<reference> <path>" can be reversed and that matches the original design of <path><separator><reference>. That is, it becomes similar to proposal 1 of #1505 (comment). The difference is that it uses a different flag to drop the support of other separators (i.e. @ and :) to avoid ambiguity.

Actually, I'm not a fan of using space as a separator inside an argument since many scripts forget to add quotes. For example, the following script seems correct

#!/bin/bash

function test {
    echo "$# parameters for test"
    ref=$1
    oras cp --from-oci-layout-ref $ref localhost:5000/test:latest
}

function oras {
    echo "$# parameters for oras"
    echo "ref is $3"
    echo "destination is $4"
}

test "ghcr.io/oras-project/oras:v1.2.0 oras.tar"

but it is not as it prints

1 parameters for test
5 parameters for oras
ref is ghcr.io/oras-project/oras:v1.2.0
destination is oras.tar

The trick is that quotes are required for $ref. After adding quotes,

-    oras cp --from-oci-layout-ref $ref localhost:5000/test:latest
+    oras cp --from-oci-layout-ref "$ref" localhost:5000/test:latest

it prints

1 parameters for test
4 parameters for oras
ref is ghcr.io/oras-project/oras:v1.2.0 oras.tar
destination is localhost:5000/test:latest

@mauriciovasquezbernal
Copy link
Contributor Author

mauriciovasquezbernal commented Oct 16, 2024

oras cp --from-oci-layout-ref " "

IMHO we shouldn't rely on users to correctly set the quotes. Also, paths can have empty spaces, making this command a bit more difficult to understand: oras cp --from-oci-layout-ref "myimagereference /a path /with spaces/yes/" localhost:5000/foo:latest

With the --{from,to}-oci-layout-path flag described above, it'll become: oras cp --from-oci-layout-path "/a path /with spaces/yes/" myimagereference localhost:5000/foo:latest

IMO --{from,to}-oci-layout-path is better, but it's up to you folks as maintainers to agree on a solution.

@TerryHowe
Copy link
Member

I think it would be more consisten to have a --to-oci-layout-path option like:

oras cp --from-oci-layout-path <source_path> <source> --to-oci-layout-pajh <destination_path> <destination>

from and to oci-layout-path should take an argument rather than be booleans.

@shizhMSFT
Copy link
Contributor

I think it would be more consisten to have a --to-oci-layout-path option like:

oras cp --from-oci-layout-path <source_path> <source> --to-oci-layout-pajh <destination_path> <destination>

from and to oci-layout-path should take an argument rather than be booleans.

@TerryHowe Thanks for pointing out. It was a typo in #1505 (comment) and I've fixed that.

@qweeah qweeah added enhancement New feature or request bug Something isn't working and removed bug Something isn't working labels Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants