kind in dind under sysbox not working after upgrade to kind v0.11.1 #2490

Closed
felipecrs opened this issue Oct 9, 2021 · 12 comments
Labels
area/rootless Issues or PRs related to rootless containers kind/bug Categorizes issue or PR as related to a bug.

@felipecrs
Contributor

felipecrs commented Oct 9, 2021

What happened:
After upgrading to kind 0.11.1 from 0.10.0, kind create cluster does not work anymore.

What you expected to happen:
To work, as it worked before the upgrade.

How to reproduce it (as minimally and precisely as possible):
I'm using Sysbox as my container runtime.

  1. Install Sysbox
  2. Run a dind container with sysbox as runtime
  3. Install kind 0.11.1
  4. kind create cluster

Anything else we need to know?:

jenkins@dind:~$ docker run kindest/node:v1.20.7@sha256:cbeaf907fc78ac97ce7b625e4bf0de16e3ea725daf6b04f930bd14c67c671ff9
time="2021-10-09T05:37:41.783364868Z" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/192bbf079c6ad7f6e3be91eb4de17e0ac903b9a7fa5bd1587ac8051c4e303650 pid=9647
INFO: running in a user namespace (experimental)
ERROR: UserNS: cgroup v2 needs to be enabled

Environment:

  • kind version: 0.11.1
  • Kubernetes version: none
  • Docker version:
jenkins@dind:~$ docker version
Client: Docker Engine - Community
 Version:           20.10.9
 API version:       1.41
 Go version:        go1.16.8
 Git commit:        c2ea9bc
 Built:             Mon Oct  4 16:08:29 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.9
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.8
  Git commit:       79ea9d3
  Built:            Mon Oct  4 16:06:37 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.11
  GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
jenkins@dind:~$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.3)
  compose: Docker Compose (Docker Inc., v2.0.1)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 20.10.9
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 5b46e404f6b9f661a205e28d59c982d3634148f8
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.0-70-generic
 Operating System: Ubuntu 20.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.34GiB
 Name: dind
 ID: 7FEQ:W3EN:IIFL:RIRP:4UXU:OUMB:KVLK:MOD7:MXEE:4QYK:3OA6:HPM3
 Docker Root Dir: /home/jenkins/agent/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
  • OS: Ubuntu 20.04.3 LTS

References nestybox/sysbox#410

@felipecrs felipecrs added the kind/bug Categorizes issue or PR as related to a bug. label Oct 9, 2021
@BenTheElder BenTheElder changed the title Not working after upgrade to kind 0.11.1 kind in dind under sysbox not working after upgrade to kind v0.11.1 Oct 9, 2021
@BenTheElder
Member

We don't currently support sysbox ... #1772 has not seen any contributions and I don't think any of us maintainers have the bandwidth for this at the moment.

In order to support rootless properly (see guide: https://kind.sigs.k8s.io/docs/user/rootless/) we currently require cgroups v2.
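As a rough illustration of the cgroup v2 requirement: on a host using the unified (v2) hierarchy, the root of the cgroup mount contains a `cgroup.controllers` file, which a cgroup v1 mount does not have. The helper below is a hypothetical sketch, not kind's own code; the function name and the `root` parameter are assumptions made for this example.

```python
import os


def is_cgroup_v2(root="/sys/fs/cgroup"):
    """Return True if `root` looks like a cgroup v2 (unified) mount.

    Assumption: the presence of `cgroup.controllers` at the mount root
    is used as the only signal; hybrid setups are reported as "not v2".
    """
    return os.path.isfile(os.path.join(root, "cgroup.controllers"))


if __name__ == "__main__":
    print("cgroup v2 detected" if is_cgroup_v2() else "cgroup v2 not detected")
```

On the environment in this report (`Cgroup Version: 1` in `docker info`), such a check would be expected to report that cgroup v2 is not detected.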

@BenTheElder BenTheElder added the area/rootless Issues or PRs related to rootless containers label Oct 9, 2021
@felipecrs
Contributor Author

felipecrs commented Oct 9, 2021

I don't think sysbox is the point here, and I never meant to use kind as rootless. As I said, it worked in 0.10.0, so perhaps the rootless detection is broken. For example, if cgroup v2 is not available, kind could fall back to not using rootless at all.

Sysbox is just the simplest way to create an environment where the issue can be replicated.
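To make the suggested fallback concrete, here is a minimal sketch contrasting the behavior seen in the log output of the report with the proposed alternative. It is not kind's entrypoint code: `strict_mode`, `fallback_mode`, `CgroupV2Required`, and the `"rootful"`/`"rootless"` labels are all hypothetical names chosen for illustration.

```python
class CgroupV2Required(Exception):
    """Raised when the strict check refuses to start the node."""


def strict_mode(in_userns: bool, cgroup_v2: bool) -> str:
    # Mirrors the observed behavior: running in a user namespace
    # without cgroup v2 is treated as a fatal error.
    if not in_userns:
        return "rootful"
    if not cgroup_v2:
        raise CgroupV2Required("UserNS: cgroup v2 needs to be enabled")
    return "rootless"


def fallback_mode(in_userns: bool, cgroup_v2: bool) -> str:
    # Suggested alternative: without cgroup v2, skip the rootless
    # handling instead of exiting.
    if in_userns and cgroup_v2:
        return "rootless"
    return "rootful"
```

With these definitions, `fallback_mode(True, False)` returns `"rootful"`, whereas `strict_mode(True, False)` raises `CgroupV2Required`.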

@felipecrs
Contributor Author

felipecrs commented Oct 9, 2021

Furthermore, the referenced issue is about using Sysbox as the container runtime for creating kind clusters instead of the normal --privileged flag. It has nothing to do with what I'm reporting, which is that normal kind no longer works in an environment where it used to work.

@BenTheElder
Member

BenTheElder commented Oct 11, 2021

> Sysbox is just the simplest way to create an environment where the issue can be replicated

But we are not developing with sysbox, and don't have time to do so, which is my point. (and nobody else has contributed anything related so far).

Sysbox does alter the environment quite a bit. Currently we've not seen this issue outside sysbox. Enabling userns and NOT running rootless docker is esoteric to the point that we've not had any reports up until this point.

@rodnymolina

rodnymolina commented Oct 11, 2021

@BenTheElder, the problem is not Sysbox specific ...

The same issue is probably seen in all scenarios where a rootful runtime creates an unprivileged container on a cgroup-v1 system. I don't think those are esoteric scenarios -- see below how the exact same issue is reproduced in LXD.

I think the current entrypoint logic should not couple the unprivileged-user-ns execution with the rootless character of the runtime environment. In other words, you could have a rootful runtime on a cgroup-v1 kernel launching an unprivileged container, so I believe the simple fix submitted by @felipecrs should work fine in all scenarios (I haven't tested it, though).

Hope that makes sense.

rmolina@dev-vm1:~$ lxc list
+-----------+---------+------------------------------+----------------------------------------------+-----------+-----------+
|   NAME    |  STATE  |             IPV4             |                     IPV6                     |   TYPE    | SNAPSHOTS |
+-----------+---------+------------------------------+----------------------------------------------+-----------+-----------+
| first-one | RUNNING | 172.18.0.1 (br-53551ea5f51f) | fd42:ca5:e139:a249:216:3eff:fef2:3e77 (eth0) | CONTAINER | 0         |
|           |         | 172.17.0.1 (docker0)         | fc00:f853:ccd:e793::1 (br-53551ea5f51f)      |           |           |
|           |         | 10.120.7.78 (eth0)           |                                              |           |           |
+-----------+---------+------------------------------+----------------------------------------------+-----------+-----------+
rmolina@dev-vm1:~$

rmolina@dev-vm1:~$ sudo lxc exec first-one -- /bin/bash

root@first-one:~# cat /proc/self/uid_map
         0    1000000 1000000000
root@first-one:~#

root@first-one:~# ./kind create cluster --retain -v 1
Creating cluster "kind" ...
DEBUG: docker/images.go:67] Pulling image: kindest/node:v1.21.1@sha256:69860bda5563ac81e3c0057d654b5253219618a22ec3a346306239bba8cfa1a6 ...
 ✓ Ensuring node image (kindest/node:v1.21.1) 🖼
 ✓ Preparing nodes 📦
 ✗ Writing configuration 📜
ERROR: failed to create cluster: failed to generate kubeadm config content: failed to get kubernetes version from node: failed to get file: command "docker exec --privileged kind-control-plane cat /kind/version" failed with error: exit status 1
Command Output: Error response from daemon: Container 4ff56de42eb687cbd181d45355e7993e80eb1057563c86296d2727391a4734e4 is not running
Stack Trace:
sigs.k8s.io/kind/pkg/errors.WithStack
	sigs.k8s.io/kind/pkg/errors/errors.go:59
sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run
	sigs.k8s.io/kind/pkg/exec/local.go:124
sigs.k8s.io/kind/pkg/cluster/internal/providers/docker.(*nodeCmd).Run
	sigs.k8s.io/kind/pkg/cluster/internal/providers/docker/node.go:146
sigs.k8s.io/kind/pkg/exec.OutputLines
	sigs.k8s.io/kind/pkg/exec/helpers.go:81
sigs.k8s.io/kind/pkg/cluster/nodeutils.KubeVersion
	sigs.k8s.io/kind/pkg/cluster/nodeutils/util.go:35
sigs.k8s.io/kind/pkg/cluster/internal/create/actions/config.getKubeadmConfig
	sigs.k8s.io/kind/pkg/cluster/internal/create/actions/config/config.go:208
sigs.k8s.io/kind/pkg/cluster/internal/create/actions/config.(*Action).Execute.func1.1
	sigs.k8s.io/kind/pkg/cluster/internal/create/actions/config/config.go:90
sigs.k8s.io/kind/pkg/errors.UntilErrorConcurrent.func1
	sigs.k8s.io/kind/pkg/errors/concurrent.go:30
runtime.goexit
	runtime/asm_amd64.s:1371

root@first-one:~# docker ps -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS                      PORTS     NAMES
4ff56de42eb6   kindest/node:v1.21.1   "/usr/local/bin/entr…"   37 seconds ago   Exited (1) 34 seconds ago             kind-control-plane

root@first-one:~# docker logs kind-control-plane
INFO: running in a user namespace (experimental)
ERROR: UserNS: cgroup v2 needs to be enabled
INFO: running in a user namespace (experimental)
ERROR: UserNS: cgroup v2 needs to be enabled
root@first-one:~#
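The `/proc/self/uid_map` line in the session above (`0 1000000 1000000000`) is what shows the container running in a user namespace: each row maps a range of IDs inside the namespace to a range outside it, and the initial namespace shows the identity mapping `0 0 4294967295`. A minimal sketch of that interpretation follows; the function names are hypothetical, and the heuristic assumes anything other than the full identity mapping means a non-initial user namespace.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(f) for f in fields))
    return entries


def in_user_namespace(text):
    """Return True unless the map is the initial namespace's identity mapping."""
    return parse_id_map(text) != [(0, 0, 4294967295)]


if __name__ == "__main__":
    with open("/proc/self/uid_map") as f:
        print(in_user_namespace(f.read()))
```

Note that this only says whether the process is in a user namespace. It says nothing about whether the container runtime itself is rootless, which is the distinction argued for in this comment.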

@aojea
Contributor

aojea commented Oct 11, 2021

/cc @AkihiroSuda

@AkihiroSuda
Member

@rodnymolina

Left a comment in https://github.com/kubernetes-sigs/kind/pull/2492/files#r725882283

@AkihiroSuda, makes perfect sense to me.

@BenTheElder
Member

BenTheElder commented Oct 11, 2021

> The same issue is probably seen in all scenarios where a rootful runtime creates an unprivileged container on a cgroup-v1 system. I don't think those are esoteric scenarios -- see below how the exact same issue is reproduced in LXD.

docker / podman in LXD is also an unusual / unsupported environment that nobody [in our ecosystem] is testing. I would call LXD + kind quite esoteric.

Let me be a bit clearer: I have no intention of blocking any fixes and welcome any patches, but there should be no expectation that this environment will work in the future. We're not testing it and don't have the bandwidth to do so; we're barely keeping up with the existing support.

> I think the current entrypoint logic should not couple the unprivileged-user-ns execution with the rootless character of the runtime environment.

Sure, but this is just one specific problem. Any future problems in these environments will stem from the same root issue: we are not testing them and do not plan to, and until now we have received no contributions related to them.

@rodnymolina

@BenTheElder, I understand.

We will keep an eye on KinD-over-Sysbox scenarios and report/fix any issues that may arise.

Thanks.

@BenTheElder
Member

#2465 landed #2498 along with backlogged base image changes, this should be fixed in future node images (bump to that coming next).

thanks!

@BenTheElder BenTheElder added this to the v0.12.0 milestone Oct 14, 2021
@felipecrs
Contributor Author

Thank you all a lot!
