Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: copy: add --multi-arch=sparse:... #1987

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nalind
Copy link
Member

@nalind nalind commented May 2, 2023

Add a "sparse:..." value for the copy --multi-arch option, where the "..." is a comma-separated list of instance digests, platform specifications, or the literal word "system" to signify the local platform. Along with containers/image#1938, I think this one of the features described at #1935 (comment), give or take the presence of "[" and "]" around the list. If it looks about right, I'll see what sorts of tests I can add to the PR.

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is fine (with the comparatively minor question of system semantics)… but then I got stuck on the platforms.Parse part below — and I’ll need to leave for the day anyway.

Quite possibly you have already thought about these things and there’s a good reason to interpret the inputs this way.

cmd/skopeo/copy.go Outdated Show resolved Hide resolved
cmd/skopeo/copy.go Outdated Show resolved Hide resolved
if platformSpec == "system" {
platformList = append(platformList, imgspecv1.Platform{})
} else {
p, err := platforms.Parse(platformSpec)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

platforms.Parse, if I’m reading it correctly, turns "linux" into linux/$GOARCH. Or, alternatively, "amd64" into $GOOS/amd64.

Hum, do we want that? Especially for Mac users?


Actually, should the UI be built around the OCI “platform” triplet at all, or should we have OS/arch filters separately? (And what does that mean for Zstd compression? Probably nothing…)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

platforms.Parse, if I’m reading it correctly, turns "linux" into linux/$GOARCH. Or, alternatively, "amd64" into $GOOS/amd64.

Hum, do we want that? Especially for Mac users?

Hmm, for people on GOOS="darwin" I could actually see defaulting to "linux" as being reasonable, but for people running on GOOS="windows" I think it would be surprising.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more option is to just provide the full-platform filter now (with a more strict parser requiring a full specification?), and leave a nicer arch-filter-only UI for a future improvement — as long as we can see a path to making that possible, and, uh, --multi-arch=sparse2:… is a last-resort option we always have.

“A nice arch-filter-only UI” is not what the primary motivating issue is asking for right now…

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2175 also suggests using platforms.Parse, with the same potential surprises.

Both PRs should probably end up with the same decision.

@mtrmac
Copy link
Contributor

mtrmac commented May 3, 2023

Actually, should the UI be built around the OCI “platform” triplet at all, or should we have OS/arch filters separately? (And what does that mean for Zstd compression? Probably nothing…)

Just to write down an unfinished version of what I’ve been thinking, to make sure this doesn’t block for an extra day.

  • If the only thing we must currently support is system, one option might be to do only that, and punt on the rest. Probably not an ideal effort/benefit outcome, but an option.
  • Apart from system, it seems to me that the primary feature is an arch (or arch+variant?) filter for images targeting a single OS. Having the user explicitly write down the OS works but is not the most convenient.
  • Looking at the added syntax to also support a digest list within --multi-arch=sparse, maybe we can imagine another level of indirection: --multi-arch=sparse:digest=…, --multi-arch=sparse:platform=$triples, --multi-arch=sparse:arch=$arch_with_optional_variant…. Or maybe --multi-arch=sparse:$arch_with_optional_variant, and --multi-arch=sparse-platform:$triplets, and --multi-arch=sparse:digest=.... With some view on whether we would support a combination of these, and how (to possibly carve out some syntax).
  • We absolutely don’t need all of that implemented now. Just system and some, one, way to choose architectures, is probably the 80% solution, without supporting architectures and digests at the same time. So a restrictive option with a restrictive parser is perfectly fine, as long as we don’t box ourselves completely in.

So maybe something in the direction of recognizing --multi-arch=sparse:system, and --multi-arch=sparse:arch=[amd64,s390,arm/v7] would work here? The [] would allow us to unambiguously say arch=[…],digest=[…], OTOH writing that manually would be a pain; quite possibly that might not be worth worrying about.

@nalind nalind force-pushed the copy-platforms branch 2 times, most recently from 46f84eb to d494e22 Compare May 8, 2023 21:31
@nalind
Copy link
Member Author

nalind commented May 8, 2023

Took a swing at that sparse syntax. it's not super-pretty, but I tried to reduce the opportunity for copy-paste errors.

Add a "sparse:..." value for the copy --multi-arch option, where the
"..." is a comma-separated list of "system", "arch=[...]", "digest=[]",
and "platform=[]" lists.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of picking a design for --multi-arch=sparse:, sure, this works as a direction/plan.


WRT the semantics of the UI, the thing I conceptually struggle with here is that “filter for these architectures” is not really “filter for these architectures and an implied operating system. But then “filter for architectures, ignore OS” would need to be modeled as an entire new API in copy.Options, and we almost certainly don’t need all that implementation complexity of doing both platform and arch in the first version if we don’t have known high-priority users.

Maybe we can, for now, just implement digest and platform (certainly allowing any use case), with a custom platform parser that requires os/arch (and optionally allows variant), to avoid the defaulting surprises of #1987 (comment) .

Or, alternatively, implement digest+arch and not worry about platform, for a more convenient UI probably also handling most practical uses (multi-OS images are hard for me to imagine currently…). That would require a new filter implementation in c/image/copy, the existing ChooseInstance can’t do it.


Implementing “system” cleanly might be simplest by adding an extra boolean to c/image, copy.Options.InstanceIncludeSystem. [Or, alternatively, we could over-engineer copy.Options.InstanceSelectors that allows multiple various criteria, but that seems way more of an effort than necessary.]

// we currently don't provide an option to choose the images to copy. That
// could be added in the future.
// It returns the copy.ImageListSelection, instance list, and platform list to use in image.Copy option
func parseMultiArch(globalOptions *globalOptions, multiArch string) (copy.ImageListSelection, []digest.Digest, []imgspecv1.Platform, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Eventually, it might be cleaner for this to update fields of copy.Options in-place instead of returning a list of values that need to be applied in the caller. For now, the three values have different types, which protects us against mistakes just fine.)

Comment on lines +167 to +172
}); err != nil {
return nil, nil, err
} else if isSystem {
continue
}
if isDigest, err := parseArg("digest=[", "]", func(digestSpecList string) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is probably very premature… Let’s settle on the UI and API first before beautifying the implementation.)

It seems to me that something vaguely like

keyword, remainder, err := findKeyword(remainder)
switch keyword {
case "system": …
case "digest": ourArgs, remainder, err = getArgs(remainder, "=[", "]"); …
case "arch": ourArgs, remainder, err = getArgs(remainder, "=[", "]"); …
}

should be possible to do; centralizing the remainder = newRemainder code in a single place does help us not to forget to update remainder, but it ends up with these copy&pasted sequences instead, which seems not too be worth it to me. Alternatively, we could end up with a whole type argDescriptor struct { start, end, handler … } which also seem like an overkill.

for remainder != "" {
parseArg := func(argStart string, argEnd string, fn func(argVal string) (bool, error)) (bool, error) {
if newRemainder, isArg := strings.CutPrefix(remainder, ","+argStart); isArg {
if !isArg {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unreachable right now. (Aesthetically, I’d weakly prefer to flip the control flow so that the !isArg code exits early at the top, instead of at the very bottom of the function.)

return false, nil
}
if isSystem, err := parseArg("system", "", func(string) (bool, error) {
systemPlatform := imgspecv1.Platform{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I continue to be worried about the design that copy.Options.InstancePlatforms can contain “partial” platform specifications with some fields filled in, with the rest defaulted to runtime.GO*.

E.g. we can’t represent “find the entry with arch="arm", variant="" if Variant: "" meant “the current runtime variant”. (Now, this special case already exists in the underlying ChooseInstance, but do we need to document that in copy.Options.InstancePlatforms?)

I’d prefer a copy.InstancePlatformIncludeSystem bool and let c/image handle that, including how “system” interacts with SystemContext = globalOptions.override*.

Comment on lines +187 to +190
wantedOS := runtime.GOOS
if globalOptions.overrideOS != "" {
wantedOS = globalOptions.overrideOS
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Mac users must use --override-os=linux --multi-arch=sparse:arch=[amd64,arm64], that’s not really any simpler than --multi-arch=sparse:platform=[linux/amd64,linux/arm64].

I think if for an arch, separate and distinct from platform, to be worth adding (extra complexity in implementation and for users to read about), it should be an arch-only filter, not restricting the OS at all.

I also don’t think we need to build that arch-only filter at all for the first version… see the top-level comment.

}
if isPlatform, err := parseArg("platform=[", "]", func(platformSpecList string) (bool, error) {
for _, platformSpec := range strings.Split(platformSpecList, ",") {
p, err := platforms.Parse(platformSpec)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are to ever have a separate arch= and platform=, at most one of them (possibly neither) should default to runtime.GOOS.

Requiring the user to always provide os/arch (with an optional variant) here, using a custom parser instead of platforms.Parse, would for now provide a clearly deterministic feature, while still giving us the flexibility to addd that defaulting to GOOS in the future.

remainder := "," + sparseArg
for remainder != "" {
parseArg := func(argStart string, argEnd string, fn func(argVal string) (bool, error)) (bool, error) {
if newRemainder, isArg := strings.CutPrefix(remainder, ","+argStart); isArg {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CutPrefix was added in Go 1.20; currently we declare Go 1.18.

(It is also very neat, I can’t wait for this to be available…)

We are not necessarily stuck on 1.18, and updating is an option. But looking at https://bodhi.fedoraproject.org/updates/?packages=golang , we might be limited to 1.19.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, good catch.

@mtrmac mtrmac added the kind/feature A request for, or a PR adding, new functionality label May 11, 2023
@github-actions
Copy link

A friendly reminder that this PR had no activity for 30 days.

@mgs255
Copy link

mgs255 commented Nov 25, 2023

@nalind @mtrmac Is there any chance of this being picked up again? This feature would certainly be very useful for our use case, where we are only interested in two platforms linux/amd64 and linux/arm64. Any interface which would allow copying only those two platforms, re-writing the manifest if required would be a fantastic addition. To me personally, the proposal of --multi-arch=sparse:platform=linux/amd64,linux/arm64 seems pretty clear and understandable.

Not sure that it has been mentioned already, but there would be an open question as to the behaviour if the source only supported a sub-set of the specified sparse filters. Here, it would probably make sense to warn on each of the unmatched filters, but error if there were none that matched.

@snail2sky
Copy link

This RFC is really great, but unfortunately the author has forgotten it, so I decided to fork it to solve my problem.
Thanks @nalind @mtrmac for the great analysis

@nalind
Copy link
Member Author

nalind commented Aug 28, 2024

I should at least clean up the containers/image part.

@snail2sky
Copy link

I should at least clean up the containers/image part.

I think your RFC is very good, but the maintainers of skopeo think that they haven't thought about how to make it more elegant, so this function has never been online.
In fact, there is no need to think so much, and this is not a function from zero to one. Docker manifest supports this function, but it is difficult to use.
Referring to the docker manifest, the usage of docker pull --platfrom is actually sufficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature A request for, or a PR adding, new functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants