Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be mindful of image architectures when scheduling tasks #2770

Open
wk8 opened this issue Oct 24, 2018 · 28 comments
Open

Be mindful of image architectures when scheduling tasks #2770

wk8 opened this issue Oct 24, 2018 · 28 comments
Assignees

Comments

@wk8
Copy link
Contributor

wk8 commented Oct 24, 2018

No description provided.

@wk8 wk8 self-assigned this Oct 24, 2018
@anshulpundir
Copy link
Contributor

Based on conversation with @wsong: our real desire here is to run a service only on windows nodes with named pipe support rather than image architecture; what we really need here is scheduling based on named pipe support.

@wk8
Copy link
Contributor Author

wk8 commented Oct 24, 2018

Thanks for the clarification @anshulpundir

@dhiltgen
Copy link

what we really need here is scheduling based on named pipe support.

If we implement it this way, the solution becomes far less general.

Can we implement this based on platform support in the multi-arch image manifest? If we do that then this scheduling smarts will be generally useful for anyone using multi-arch, and not unique to just this named pipe usecase.

@wk8
Copy link
Contributor Author

wk8 commented Nov 6, 2018

@dhiltgen @anshulpundir @wsong : do I understand correctly that the goal here is to ensure that we only schedules tasks on nodes with the right architecture & OS according to the image's manifest?

If so, it looks like this is already the case: when creating a service, by default (and except if the --no-resolve-image flag was set, or if platforms were explicitly set) the client already does retrieve the manifest from the registry and passes that to swarm (https://github.com/moby/moby/blob/12bba16306dd618883fe2e85ef1730efc572294f/client/service_create.go#L45 and https://github.com/moby/moby/blob/12bba16306dd618883fe2e85ef1730efc572294f/client/service_create.go#L66-L71).
(and we do the same on updates, https://github.com/moby/moby/blob/12bba16306dd618883fe2e85ef1730efc572294f/client/service_update.go#L44-L78)

This is then used by swarm when filtering which nodes are eligible to run the task (https://github.com/docker/swarmkit/blob/bc032e24784ea618044ee438fedec3458abb2ef9/manager/scheduler/filter.go#L253-L320).

So I'm not sure what the problem/goal here is. Could you please clarify?

Thanks!

@wsong
Copy link

wsong commented Nov 6, 2018

@wk8 The problem today is if Swarm can't get the manifest from the registry (e.g. if it's a private image or if we don't have Internet connectivity). In that case, Swarmkit won't be able to tell what platforms the image supports, and therefore it won't be able to filter the nodes.

One way of solving this would be to add support for loading image manifests from an offline bundle (i.e. a .tar file); that isn't possible today. See https://github.com/docker/docker-core-backlog/issues/24

@wk8
Copy link
Contributor Author

wk8 commented Nov 6, 2018

@wsong : thanks for the quick reply! :)

Just to make sure I understand correctly: currently, the client does query the registry for an image's manifest if there's a registry, but in the case of local multi-arch images loaded with docker load doesn't actually inspect those to extract the supported platforms (the same way docker run does to decide which bundled image to actually run); and that's what we'd want to change - correct?

@wsong
Copy link

wsong commented Nov 6, 2018

My understanding is that docker save doesn't actually save the manifest data for an image at all; it just saves the image contents. Thus, you can use docker load to load the image data for your platform, but you won't know what other platforms the image is available for.

@olljanat
Copy link
Contributor

olljanat commented Nov 6, 2018

@wsong this can be stupid question but why you guys don't just focus to get docker manifest command marked as non-experimental so users can store manifest to local registries?

@wsong
Copy link

wsong commented Nov 6, 2018

@olljanat That's basically what we're talking about here. What we need is a user-friendly way of shipping manifests to users without Internet connectivity.

@wk8
Copy link
Contributor Author

wk8 commented Nov 6, 2018

@wsong : tried this morning, it seems that you can docker save/load a manifest, seems to work just fine. If that's indeed the case, and we add support within swarm to be able to use a local multi-arch image, then we're good, right?

@wk8
Copy link
Contributor Author

wk8 commented Nov 6, 2018

(@wsong : the small caveat is that it seems that you do need to push the images in the manifest, and the manifest itself, to a registry before you can save it - once you've saved it you don't need access to the registry on whichever node you then load it on. Not sure why, but I see no reason why this couldn't be changed too)

@wk8
Copy link
Contributor Author

wk8 commented Nov 6, 2018

Ugh please ignore my previous comment #2770 (comment). Manifests are indeed not saved at all. So what we need is a way to save manifests, I guess, i.e. bundle a manifest and its images in a single archive. Would we want a docker manifest save command? Where load then only imports the images that can run on the host?

@wsong
Copy link

wsong commented Nov 6, 2018

Actually, what we want is to just save the manifest (i.e. the metadata about which platforms a multi-arch image supports). For our purposes, we want to run a service on Windows nodes that are Windows Server 1709 or newer, but we want to create the service on a Linux node. That means that the Swarmkit manager (which will run on Linux) needs to have the manifest image metadata for the Windows images, but it doesn't need to load the actual image layers.

@olljanat
Copy link
Contributor

olljanat commented Nov 6, 2018

@wsong have you seen this one? https://medium.com/@mauridb/docker-multi-architecture-images-365a44c26be6

As far I understand this you should be able to store manifest to private registry using commands:

export DOCKER_CLI_EXPERIMENTAL=enabled
docker manifest push registry-host:5000/org/image:tag

@wsong
Copy link

wsong commented Nov 6, 2018

That's fine, but we still need an offline solution for air-gapped clusters.

@olljanat
Copy link
Contributor

olljanat commented Nov 6, 2018

@wsong sorry but now I didn't understood. You need anyway some method to distribute images and manifest for swarm nodes so why you are not just running private registry inside of that air-gapped cluster?

@wsong
Copy link

wsong commented Nov 6, 2018

Not all of our airgapped users have a private registry.

@olljanat
Copy link
Contributor

olljanat commented Nov 7, 2018

Not all of our airgapped users have a private registry.

On that case how correct nodes gets image? From some external process? Why they are not then checking during import process that only nodes with correct OS version gets that image?

For our purposes, we want to run a service on Windows nodes that are Windows Server 1709 or newer, but we want to create the service on a Linux node.

Btw. One option which we are using for similar use cases are add engine/node labels to nodes and use constraints based on them.

PS. Let me know if I'm bothering your working too much. This just sounds interesting use case so I'm trying to understand it.

@wsong
Copy link

wsong commented Nov 7, 2018

We just use docker load to get images onto airgapped nodes. But, as discussed above, that only loads image data, not manifests.

@olljanat
Copy link
Contributor

olljanat commented Nov 8, 2018

(@wsong : the small caveat is that it seems that you do need to push the images in the manifest, and the manifest itself, to a registry before you can save it - once you've saved it you don't need access to the registry on whichever node you then load it on. Not sure why, but I see no reason why this couldn't be changed too)

It looks to be that docker manifest create command creates manifest locally and then docker manifest push is used to push it to Docker Hub/private registry.

So adding two new commands docker manifest save and docker manifest load would be probably best option.

@wsong would it work on your use case that you would use docker manifest load on swarm managers and docker load on worker nodes?

And just as reference. Current manifest commands are implemented on docker/cli#138 and when all known issues on them have been fixed we can mark it non-experimental by merging docker/cli#1355

@wsong
Copy link

wsong commented Nov 8, 2018

@olljanat Yeah, that's what I was thinking; some sort of docker manifest save/load command.

@wk8 wk8 closed this as completed Nov 8, 2018
@wk8 wk8 reopened this Nov 8, 2018
@olljanat
Copy link
Contributor

olljanat commented Nov 8, 2018

Some more info which can be useful:

But that is fully client side implementation so to be usable on here it would need to be modified as engine side solution.

Other thing which I noticed that swarmkit does not currently care about OS version even if that is specified on manifest, only CPU architecture and OS:
https://github.com/docker/swarmkit/blob/master/api/types.proto#L82-L88
On process isolation mode Windows only support to run images which have same OS version than on node so if that information exists on manifest is should be used.

I also tested to save Windows image to tar and import back with docker load and looks that manifest with needed information is included to it:

"Architecture": "amd64",
"Os": "windows",
"OsVersion": "10.0.17134.345",

One options is also add support for node.platform.os_version constraint so user can use it together with node.platform.os to make sure that services are scheduled only valid hosts. Nice thing on that option is that it would work on airgapped environment without need to import images/manifests to swarm managers.

@wk8
Copy link
Contributor Author

wk8 commented Nov 9, 2018

After more discussion, going to add a OS version constraint.

The manifest approach, while fitting the UCP use case, was also deemed clunky and half-baked as a generic feature, mainly because in the offline (ie, no registry) case it would have required all the nodes in the clusters to locally have the same version of all the manifests, as well as all the workers to locally have all the images relevant to their arch/OS listed in all the manifests, while providing no mechanism to actually handle this synchronisation nor the deployment of new manifest/image versions.

On the other hand, adding OS version constraints is easy for users to understand, and also perfectly addresses the UCP use case.

@david-yu
Copy link

@wk8 Was this ever merged in Moby? it looks like this is a dependency for Kube on Windows.

@thaJeztah
Copy link
Member

Looks to be still pending moby/moby#38349

@wk8
Copy link
Contributor Author

wk8 commented Apr 29, 2019

Will do next week, thanks :)

wk8 added a commit to wk8/moby that referenced this issue Jun 6, 2019
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770).

The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.

Added unit tests when possible, as well as Prometheus metrics.

Signed-off-by: Jean Rouge <[email protected]>
thaJeztah pushed a commit to thaJeztah/docker that referenced this issue Jun 7, 2019
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770).

The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.

Added unit tests when possible, as well as Prometheus metrics.

Signed-off-by: Jean Rouge <[email protected]>
(cherry picked from commit d363a18)
Signed-off-by: Sebastiaan van Stijn <[email protected]>
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this issue Jun 7, 2019
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770).

The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.

Added unit tests when possible, as well as Prometheus metrics.

Signed-off-by: Jean Rouge <[email protected]>
Upstream-commit: d363a1881ec256e1be5a48a90046ff16e84ddc02
Component: engine
@david-yu
Copy link

david-yu commented Aug 1, 2019

@wk8 Should we close this issue since it looks like the work got merged?

@wk8
Copy link
Contributor Author

wk8 commented Aug 5, 2019

@david-yu : the needed part in moby/moby got merged, but I still need to do the swarmkit partn sorry. Will do shortly.

burnMyDread pushed a commit to burnMyDread/moby that referenced this issue Oct 21, 2019
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770).

The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.

Added unit tests when possible, as well as Prometheus metrics.

Signed-off-by: Jean Rouge <[email protected]>
Signed-off-by: zach <[email protected]>
dims pushed a commit to dims/mobyutils that referenced this issue Jan 17, 2020
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770).

The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.

Added unit tests when possible, as well as Prometheus metrics.

Signed-off-by: Jean Rouge <[email protected]>
dims pushed a commit to dims/moby-sysinfo that referenced this issue Mar 10, 2020
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
moby/swarmkit#2770).

The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.

Added unit tests when possible, as well as Prometheus metrics.

Signed-off-by: Jean Rouge <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants