
Attempting to make kind a little less Docker Centric #151

Closed
wants to merge 1 commit

Conversation


@rhatdan commented Dec 5, 2018

I believe that kind should be able to run with Podman as
well as Docker.

Signed-off-by: Daniel J Walsh [email protected]

@k8s-ci-robot
Contributor

Welcome @rhatdan! It looks like this is your first PR to kubernetes-sigs/kind 🎉

@k8s-ci-robot added the cncf-cla: yes label (Indicates the PR's author has signed the CNCF CLA.) Dec 5, 2018
@k8s-ci-robot requested review from amwat and munnerz December 5, 2018 17:28
@k8s-ci-robot added the size/L label (Denotes a PR that changes 100-499 lines, ignoring generated files.) Dec 5, 2018
@rhatdan
Author

rhatdan commented Dec 5, 2018

I did not find a way to build this, so I relied on validate-all to look for bugs.

@BenTheElder
Member

Huge fan of this idea! I've been mulling changing it to something like:

kind - friendly local clusters

(and dropping the acronym) with exactly this purpose in mind (also why the docker related code has been moving into a separate package slowly)

I did not find a way to build this, so I relied on validate-all to look for bugs.

go build / go install should cover the main binary. For the images there are kind build base-image and kind build node-image; the former is mostly a docker build of images/base, but the latter is a bit trickier right now so that we can install debians without keeping a copy of them in the image.
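
For reference, the build steps above amount to roughly this (a sketch; run from a checkout of kubernetes-sigs/kind, and note the image builds currently assume docker on the host):

    # build/install the kind binary
    go build .
    go install .

    # build the images
    kind build base-image   # mostly a docker build of images/base
    kind build node-image   # trickier: installs debians without keeping them in the image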

I'm not very familiar with podman yet, but we should be able to come up with something, and there are pre-built images at least.


Will take a look through the code later today 😄

docs/devel/README.md
Member

@BenTheElder left a comment

first pass generally LGTM, thanks!

couple of nits, will try to take a look at what the bug is in CI

pkg/container/run.go
pkg/container/init.go
 if err := n.Command(
 	"find",
 	"/kind/images",
 	"-name", "*.tar",
-	"-exec", "docker", "load", "-i", "{}", ";",
+	"-exec", container.Engine, "load", "-i", "{}", ";",
Member

this is calling docker inside the "node" container, so this should still be "docker" for now since that's still what runs inside.
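
(Concretely, what runs inside the node is roughly the following shell command, so the literal "docker" is correct here regardless of the host engine:)

    find /kind/images -name '*.tar' -exec docker load -i '{}' ';'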

Author

OK, I will change it back, but I actually think podman inside of that container would make more sense, since you don't need to set up a separate daemon. Or is this container volume-mounting in the docker.sock?

Member

podman does look very cool for that, however:

  • We're using docker for now because of existing expertise and accessibility to both our maintainers and the bulk of our known current and potential users.
  • I also need to maintain another, more general docker-in-docker setup for kubernetes/test-infra, where it is used pervasively by many projects we merely support, and that is less likely to change, as we'd need to convince them all to switch. @munnerz also maintains a parallel test-infra setup to this one.

It's certainly worth looking at, but I don't think the second point would happen and there is very large overlap in maintaining and debugging these setups. There's also a lot more information out there from third parties related to dind. I think @runcom might be the first person to run cri-o in a container 🙃

We're not mounting sockets.

Other than loading the images, only Kubernetes is supposed to be talking to this; however, some advanced users are already talking to it, mostly to load their images until #28.

@@ -141,17 +144,17 @@ func (n *Node) LoadImages() {
 // TODO(bentheelder): this is a bit gross, move this logic out of bash
 if err := n.Command(
 	"/bin/bash", "-c",
-	`docker images --format='{{.Repository}}:{{.Tag}}' | grep -v amd64 | xargs -L 1 -I '{}' /bin/bash -c 'docker tag "{}" "$(echo "{}" | sed s/:/-amd64:/)"'`,
+	fmt.Sprintf(`%s images --format='{{.Repository}}:{{.Tag}}' | grep -v amd64 | xargs -L 1 -I '{}' /bin/bash -c '%s tag "{}" "$(echo "{}" | sed s/:/-amd64:/)"'`, container.Engine, container.Engine),
Member

ditto here, this command runs inside the "node" container so it shouldn't use the host container runtime, just "docker" for now.
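
(Concretely, inside the node that bash pipeline expands to something like the following; the image name in the comment is just an illustrative example:)

    # retag every image that lacks an "amd64" marker with an -amd64 suffix, e.g.
    #   k8s.gcr.io/kube-apiserver:v1.13.0 -> k8s.gcr.io/kube-apiserver-amd64:v1.13.0
    docker images --format='{{.Repository}}:{{.Tag}}' \
      | grep -v amd64 \
      | xargs -L 1 -I '{}' /bin/bash -c 'docker tag "{}" "$(echo "{}" | sed s/:/-amd64:/)"'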

Author

OK, switched back to Docker, although I left them as being passed in, so we could move quickly to using alternatives inside of the container.

-// WaitForDocker waits for Docker to be ready on the node
-// it returns true on success, and false on a timeout
+// WaitForDocker waits for Docker Daemon to be ready on the node
+// it returns true on success or if you are not using Docker, and false on a timeout
Member

This function also should not change, this is waiting for the docker service inside the node container to be ready.

This is the change that is causing the CI to fail.

Author

OK, I guess if you were using podman on the host and docker within the container this would be an issue. But this clearly shows, for this use case, why podman inside of the container would be better: no daemon to start and wait for.

Member

Sure, but podman does not implement a CRI for Kubernetes as far as I know. Docker in the container is purely for that use case.

We will be exploring cri-o as an option soon. @runcom started looking into this.

@BenTheElder
Member

BenTheElder commented Dec 6, 2018

Found three confusing bits -- there are some node.Command("docker", ...) calls that are running commands against the nested container runtime on the "nodes".

Those should stay the same until we're actually running different runtimes there, and even then they should use that container engine, not the host's.

Once those are reverted, I think this should pass tests 👍

There's also a small merge conflict now in pkg/cluster/context.go, as another PR merged to that file.


Edit: we will probably want to change these in the future though; supporting other CRIs within the "nodes" is on the roadmap. It needs a little bit more thought and research.

@runcom

runcom commented Dec 6, 2018

Would there also be a way to specify which runtime kube uses? I would love to run the cluster with the CRI-O container runtime, for instance.

@BenTheElder
Member

@runcom, yes that is on the TODO but would make a larger follow-up; we will need to figure out:

  • the best way to ship the base image once we have variants, or if we should just install multiple runtimes
  • what tweaks we need to run cri-o in a container,
    I haven't found anyone looking into this yet and haven't had a chance myself
  • CI?

I've punted this for now in favor of getting the rest stable. docker-in-a-container is a path I have some existing expertise with from https://github.com/kubernetes/test-infra, but if nobody else gets to it I definitely will eventually :-)

@runcom

runcom commented Dec 6, 2018

  • what tweaks we need to run cri-o in a container,
    I haven't found anyone looking into this yet and haven't had a chance myself

I'm gonna look into this. I'll hack around with kind's base image itself and see if tweaking kubeadm plus having CRI-O in it is a thing.

I think a reasonable first step is the above: have a means to support another container runtime; whether it is docker or podman doesn't really matter for now. Afterwards we can chat and agree about the way we're going to ship the base/node image (multiple runtimes, or different images, or whatever). I'll take a stab at running CRI-O to begin with.

For the CI, I'm not sure how things are set up right now in this repo, but once we agree on a delivery path for the images, it will hopefully be straightforward to figure out and integrate with the current CI, right?

@BenTheElder
Member

SGTM, filed #153 to track

@rhatdan
Author

rhatdan commented Dec 7, 2018

@runcom let's work together to get CRI-O running with Docker. But I think we will need a decent amount of work in the node directory to support two or more container engines.

@rhatdan
Author

rhatdan commented Dec 7, 2018

Right, although I think you are combining two different things.

We can do podman load to load the images into CRI-O Storage, then if we have to actually talk to CRI-O we would need to do crictl.
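
(For concreteness, since podman and CRI-O can share containers/storage, that flow might look roughly like this; the tar path is hypothetical and the socket path is the conventional CRI-O default:)

    # load an image archive into containers/storage
    podman load -i /kind/images/example.tar
    # talk to CRI-O over its CRI endpoint
    crictl --runtime-endpoint unix:///var/run/crio/crio.sock images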

@BenTheElder
Member

Right, although I think you are combining two different things.

Which two things?
Loading the images necessarily requires knowing about the runtime on the node.

We can do podman load to load the images into CRI-O Storage, then if we have to actually talk to CRI-O we would need to do crictl.

We don't have cri-o storage to talk to yet. And we'll need to detect and handle the node CRI in general distinctly from the host runtime, for which we need another set of unrelated changes.

For now it doesn't make sense to touch the code running inside the node because that can only possibly be talking to docker. We can follow up with that.

Shipping multiple runtimes will have some tradeoffs no matter how we do it, and should come with some further discussion.

Member

@BenTheElder left a comment

I think we made a few things harder to understand when referring to usage of the container engine CLI. Since podman's CLI mirrors docker's, it seems more helpful to mention something searchable and documented rather than an opaque "ContainerEngine run" that doesn't really map to anything.

There are also a few inaccuracies in the docs changes related to nodes.

README.md
-To use `kind`, you will need to [install docker].
-Once you have docker running you can create a cluster with `kind create cluster`
+To use `kind`, you will need to [install docker] or [install podman].
+Once you have installed podman or have docker service running you can create a cluster with `kind create cluster`
Member

nit: "the docker service"

 nested containers, systemd, and kubernetes components.

 To do this we need to set up an environment that will meet the CRI
 (currently just docker) and systemd's particular needs. Documentation for each
 step we take is inline to the image's [Dockerfile][dockerfile]),
 but essentially:

-- we preinstall tools / packages expected by systemd / Docker / Kubernetes other
+- we preinstall tools / packages expected by systemd / Docker | Podman / Kubernetes other
Member

this is not true at the moment. this image ships docker for now. it does not ship podman

Member

i think we should omit listing all possible runtimes in a | separated list.
possibly best to just outline using "container runtime" in the long run.

@@ -1,14 +1,14 @@
 # The Base Image

-The ["base" image][base image] is a small-ish Docker image for running
+The ["base" image][base image] is a small-ish container image for running
Member

This particular image is built from a Dockerfile; I'm not sure how helpful this change is.
I think podman / cri-o / containerd users know that a "docker image" is not limited to Docker. :^)

pseudo "paused" state, with [the entrypoint][entrypoint] waiting for `SIGUSR1`.
This allows us to manipulate and inspect the container with `docker exec ...`
This allows us to manipulate and inspect the container with `containerEngine exec ...`
Member

I think this sort of change is going to be more confusing than anything. podman is the only tool I know of besides docker that provides something like this, and does so by mimicking the docker CLI AFAICT... "docker exec" is searchable when looking for upstream doc / what it is; "containerEngine exec" is not.

how about "with eg docker exec" - it seems like podman users will still understand what this maps to.

Member

how about "with eg docker exec" -
or something like:
with tools like docker exec


 Once the nodes are sufficiently prepared, we signal the entrypoint to actually
 "boot" the node.

-We then wait for the Docker service to be ready on the node before running
+We then wait for the Docker service to be ready if we are using Docker on the node before running
Member

as it stands, we are only using docker for this case.

@@ -84,7 +84,7 @@ At this point users can test Kubernetes by using the exported kubeconfig.

 ### Cluster Deletion

-All "node" containers in the cluster are tagged with docker labels identifying
+All "node" containers in the cluster are tagged with container labels identifying
Member

again, this is searchable; I'm not sure I want to sacrifice this. docker labels have a lot of restrictions and particular behavior; what those are is probably interesting to many of our users / developers who may not have used these yet.

@@ -35,6 +35,6 @@ each "node" container with [kubeadm][kubeadm].
 [base image]: ./base-image.md
 [build package]: ./../../pkg/build
 [cluster package]: ./../../pkg/cluster
-[docker image archives]: https://docs.docker.com/engine/reference/commandline/save/
+[container image archives]: https://docs.docker.com/engine/reference/commandline/save/
Member

unless we have a generic reference for this instead, it seems like renaming this and then linking to docker is not particularly helpful.

@@ -6,7 +6,7 @@ Before you being you must have:
 * [GitHub account][github]
 * `git`
 * `go`
-* `Docker`
+* `Docker` | `Podman`
Member

s/|/or/

Member

IMO, better to have "a container runtime" here and have a separate section that lists the supported options.

 var containerIDRegex = regexp.MustCompile("^[a-f0-9]+$")

-// Run creates a container with "docker run", with some error handling
+// Run creates a container with "ContainerEngine run", with some error handling
Member

let's just say Engine since at least that maps to the var we're using. the reason for mentioning "docker run" is to correlate with the documented behavior there without documenting it ourselves.

Member

@neolit123 left a comment

thanks for the PR @rhatdan

+var Engine string
+
+func init() {
+	for _, engine := range []string{"podman", "docker"} {
Member

i think this needs to be exposed from a CLI flag or via an environment variable (preferring the former).
having this precedence can be problematic if someone has podman installed but wants to use docker for kind instead.

Author

It is now exposed as an environment variable.
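
A minimal sketch of that pattern (the variable name KIND_CONTAINER_ENGINE and the exact precedence are illustrative, not necessarily what this PR implements):

    package container

    import (
        "os"
        "os/exec"
    )

    // Engine is the container engine CLI used on the host.
    var Engine string

    func init() {
        // hypothetical override: honor an explicit choice first
        if e := os.Getenv("KIND_CONTAINER_ENGINE"); e != "" {
            Engine = e
            return
        }
        // otherwise fall back to whatever is on PATH, podman first
        for _, engine := range []string{"podman", "docker"} {
            if _, err := exec.LookPath(engine); err == nil {
                Engine = engine
                return
            }
        }
        Engine = "docker" // last-resort default
    }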

@rhatdan
Author

rhatdan commented Dec 19, 2018

Are there other things you guys want me to change?

@BenTheElder
Member

I've been thinking about how to do better with container runtimes, both on the nodes and on the host, going forward ... I think we should get away from the container.Engine global and the init() handling of this. Still mulling the best option.
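
One possible shape, purely as an illustration of moving away from the global (not necessarily the design kind settled on):

    package container

    // Engine abstracts a host container CLI so each runtime gets its own
    // implementation, chosen explicitly by the caller instead of in init().
    type Engine interface {
        // Run creates and starts a container from image with the given args.
        Run(image string, args ...string) error
        // Load imports an image tar archive into the engine's local storage.
        Load(tarPath string) error
    }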

@randomvariable
Member

@rhatdan, is there something else in libpod that's needed to get this working locally?

ERRO[10:45:27] unable to pull kindest/node:v1.13.2@sha256:e14edfa4950e009fe560499c9db6e89daae8bd18bcb372caca6d321a86c52cda: error getting default registries to try: unable to parse 'kindest/node:v1.13.2@sha256:e14edfa4950e009fe560499c9db6e89daae8bd18bcb372caca6d321a86c52cda': Docker references with both a tag and digest are currently not supported

@rhatdan
Author

rhatdan commented Jan 24, 2019

@randomvariable Is this related to this PR or something different? Could you open an issue on github.com/containers/libpod for this, so I can have one of my guys look into it?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rhatdan
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: munnerz

If they are not already assigned, you can assign the PR to them by writing /assign @munnerz in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

@rhatdan: The following test failed, say /retest to rerun them all:

Test name: pull-kind-verify
Commit: e956870
Rerun command: /test pull-kind-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot
Contributor

@rhatdan: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) Feb 9, 2019
@BenTheElder
Member

We're making a concerted effort to refactor and support this. I'd like to implement it differently (no global vars in the packages, no lookup in init(), an actual per-runtime implementation so we can handle differences ...), and I'd like to not drop all references to docker in the docs quite so aggressively, so as to not confuse our existing users.

See also: https://kind.sigs.k8s.io/docs/design/principles/#target-cri-functionality; making this possible has been a documented requirement for a little while now.

This is tracked in #154.

@rhatdan
Author

rhatdan commented Feb 25, 2019

Awesome, thanks. Please contact me if you need any changes to our container runtimes (engines).

stg-0 pushed a commit to stg-0/kind that referenced this pull request Jun 20, 2023
Labels
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.)
size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.)

7 participants