Disable rootless mode except RootlessCgMgr when executed as the root in userns (fix Docker-in-LXD regression) #1862
@giuseppe PTAL
libcontainer/configs/config.go
Outdated
// IntelRdt specifies settings for Intel RDT/CAT group that the container is placed into
// to limit the resources (e.g., L3 cache) the container has available
IntelRdt *IntelRdt `json:"intel_rdt,omitempty"`

// RootlessEUID specifies is set when the runc was launched with non-zero EUID.
please drop "specifies" or "is set"
updated
rootless_linux.go
Outdated
}
// euid = 0, in a userns.
u := os.Getenv("USER")
return u != "" && u != "root"
could this be something like?
u, ok := os.LookupEnv("USER")
return !ok || u != "root"
Otherwise it breaks current users like Podman (and probably Buildah as well) that don't pass $USER
down to the runtime
updated
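The difference between the two checks in this thread can be seen with a small sketch. It assumes, going by the PR description, that the return value decides whether `$XDG_RUNTIME_DIR` is honored when runc runs as root in a userns. The names `lookupFunc`, `originalCheck`, `suggestedCheck`, and `fakeEnv` are made up for illustration and are not from runc.

```go
package main

import (
	"fmt"
	"os"
)

// lookupFunc has the same shape as os.LookupEnv, so the checks can be
// exercised without touching the real environment. (Hypothetical helper type.)
type lookupFunc func(key string) (string, bool)

// originalCheck mirrors the first revision under review:
// an unset or empty USER yields false.
func originalCheck(lookup lookupFunc) bool {
	u, _ := lookup("USER") // like os.Getenv: "" when the variable is unset
	return u != "" && u != "root"
}

// suggestedCheck mirrors the reviewer's suggestion:
// an unset USER yields true, so callers that do not pass $USER keep working.
func suggestedCheck(lookup lookupFunc) bool {
	u, ok := lookup("USER")
	return !ok || u != "root"
}

// fakeEnv returns a lookup function that always reports the given value.
func fakeEnv(value string, set bool) lookupFunc {
	return func(string) (string, bool) { return value, set }
}

func main() {
	cases := []struct {
		name   string
		lookup lookupFunc
	}{
		{"USER unset", fakeEnv("", false)},
		{"USER empty", fakeEnv("", true)},
		{"USER=root", fakeEnv("root", true)},
		{"USER=alice", fakeEnv("alice", true)},
		{"real environment", os.LookupEnv},
	}
	for _, c := range cases {
		fmt.Printf("%-18s original=%-5v suggested=%v\n",
			c.name, originalCheck(c.lookup), suggestedCheck(c.lookup))
	}
}
```

The two checks only disagree when `USER` is unset or empty, which is the case the reviewer describes for Podman.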
this needs a rebase
rebased
@giuseppe LGTY?
@AkihiroSuda yes, I've tested it with podman and it works well for me
/cc @crosbymichael ptal
an error I've seen (not introduced by this patch but that could probably be fixed as part of it), is: create a rootless user namespace with multiple IDs mapped in, then create a runc container that has:
if I try to exec (from the first user namespace) into the container, I get:
…in userns

This PR decomposes `libcontainer/configs.Config.Rootless bool` into `RootlessEUID bool` and `RootlessCgroups bool`, so as to make "runc-in-userns" more compatible with "rootful" runc.

`RootlessEUID` denotes that runc is being executed as a non-root user (euid != 0) in the current user namespace. `RootlessEUID` is almost identical to the former `Rootless` except for the cgroups handling.

`RootlessCgroups` denotes that runc is unlikely to have full access to cgroups. `RootlessCgroups` is set to false if runc is executed as the root (euid == 0) in the initial namespace. Otherwise `RootlessCgroups` is set to true. (Hint: if `RootlessEUID` is true, `RootlessCgroups` becomes true as well)

When runc is executed as the root (euid == 0) in a user namespace (e.g. by Docker-in-LXD, Podman, Usernetes), `RootlessEUID` is set to false but `RootlessCgroups` is set to true. So, "runc-in-userns" behaves almost the same as "rootful" runc except that cgroups errors are ignored.

This PR does not have any impact on CLI flags and `state.json`.

Note about CLI:
* Now the `runc --rootless=(auto|true|false)` CLI flag is only used for setting `RootlessCgroups`.
* Now `runc spec --rootless` is only required when `RootlessEUID` is set to true. For runc-in-userns, `runc spec` without `--rootless` should work when a sufficient number of UIDs/GIDs are mapped.

Note about `$XDG_RUNTIME_DIR` (e.g. `/run/user/1000`):
* `$XDG_RUNTIME_DIR` is ignored if runc is being executed as the root (euid == 0) in the initial namespace, for backward compatibility. (`/run/runc` is used)
* If runc is executed as the root (euid == 0) in a user namespace, `$XDG_RUNTIME_DIR` is honored if `$USER != "" && $USER != "root"`. This allows unprivileged users to execute runc as the root in userns, without mounting a writable `/run/runc`.

Note about `state.json`:
* `rootless` is set to true when `RootlessEUID == true && RootlessCgroups == true`.

Signed-off-by: Akihiro Suda <[email protected]>
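The rules in this commit message can be summarized as a small decision table. What follows is only a sketch of that description, not the code in the PR: `rootlessFlags`, `deriveFlags`, `stateRootless`, and `honorXDGRuntimeDir` are hypothetical names, the `--rootless` flag is assumed to be left at `auto`, and honoring `$XDG_RUNTIME_DIR` for euid != 0 is an assumption that the message does not state explicitly.

```go
package main

import "fmt"

// rootlessFlags is a hypothetical stand-in for the two new Config fields.
type rootlessFlags struct {
	RootlessEUID    bool
	RootlessCgroups bool
}

// deriveFlags follows the commit message, assuming --rootless=auto:
//   - RootlessEUID is true when euid != 0.
//   - RootlessCgroups is false only for euid == 0 in the initial namespace.
func deriveFlags(euid int, inUserNS bool) rootlessFlags {
	return rootlessFlags{
		RootlessEUID:    euid != 0,
		RootlessCgroups: euid != 0 || inUserNS,
	}
}

// stateRootless models the `rootless` value written to state.json.
func stateRootless(f rootlessFlags) bool {
	return f.RootlessEUID && f.RootlessCgroups
}

// honorXDGRuntimeDir models the $XDG_RUNTIME_DIR note in the message.
func honorXDGRuntimeDir(euid int, inUserNS bool, user string) bool {
	if euid != 0 {
		return true // ASSUMPTION: not spelled out in the commit message
	}
	if !inUserNS {
		return false // root in the initial namespace: /run/runc is used
	}
	return user != "" && user != "root"
}

func main() {
	scenarios := []struct {
		name     string
		euid     int
		inUserNS bool
		user     string
	}{
		{"root, initial namespace", 0, false, "root"},
		{"root, user namespace", 0, true, "alice"},
		{"non-root user", 1000, false, "alice"},
	}
	for _, s := range scenarios {
		f := deriveFlags(s.euid, s.inUserNS)
		fmt.Printf("%-24s EUID=%-5v Cgroups=%-5v state.rootless=%-5v honorXDG=%v\n",
			s.name, f.RootlessEUID, f.RootlessCgroups,
			stateRootless(f), honorXDGRuntimeDir(s.euid, s.inUserNS, s.user))
	}
}
```

The middle scenario is the runc-in-userns case this PR targets: `RootlessEUID` is false, so runc behaves like rootful runc, while `RootlessCgroups` stays true so cgroups errors are ignored.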
@giuseppe should be fixed in the latest revision
yes thanks, I've verified that it works now
Assuming a host UID of 1000, the UID mapping inside the user namespace created by rootless podman for the toolbox container was:

    0 1000 1
    1 100000 65536

... which was the same as seen from the host:

    0 1000 1
    1 100000 65536

Therefore, when running with a UID of 1000 inside the container, it got mapped to UID 100999 on the host. That means, for example, files created by the user inside the container end up looking funny from the host.

This is addressed by creating another user namespace that's a child of the initial user namespace created by rootless podman. Assuming a host UID of 1000, the UID mapping inside this child namespace is:

    1000 0 1
    0 1 1000
    1001 1001 64536

... which when seen from the host is:

    1000 1000 1
    0 100000 1000
    1001 101000 64536

This means that UID 1000 inside the child namespace is mapped to the same UID 1000 on the host via the intermediate namespace created by rootless podman. UIDs 0 to 999 inside the child namespace are mapped to UIDs 100000 to 100999 on the host.

This change requires this runc pull request to work: opencontainers/runc#1862

As suggested by Giuseppe Scrivano.
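The arithmetic in these mappings can be checked with a short sketch. Each row is read as (ID inside the namespace, ID in the parent namespace, range length), which is the `uid_map` layout. The type `idMap` and the functions `translate` and `childToHost` are illustrative names, not part of runc, podman, or toolbox.

```go
package main

import "fmt"

// idMap is one uid_map row: `length` IDs starting at `inside` map to
// consecutive IDs starting at `outside` in the parent namespace.
type idMap struct {
	inside, outside, length int
}

// translate maps an ID from inside a namespace to its parent namespace.
// The second result is false when the ID is not covered by any row.
func translate(maps []idMap, id int) (int, bool) {
	for _, m := range maps {
		if id >= m.inside && id < m.inside+m.length {
			return m.outside + (id - m.inside), true
		}
	}
	return 0, false
}

// Mapping of the namespace created by rootless podman (parent: host).
var podmanNS = []idMap{
	{0, 1000, 1},
	{1, 100000, 65536},
}

// Mapping of the child namespace (parent: the podman namespace).
var childNS = []idMap{
	{1000, 0, 1},
	{0, 1, 1000},
	{1001, 1001, 64536},
}

// childToHost composes both mappings: child -> podman namespace -> host.
func childToHost(id int) (int, bool) {
	mid, ok := translate(childNS, id)
	if !ok {
		return 0, false
	}
	return translate(podmanNS, mid)
}

func main() {
	// Without the child namespace, container UID 1000 lands on host UID 100999.
	host, _ := translate(podmanNS, 1000)
	fmt.Println("podman ns UID 1000 -> host UID", host)

	// With the child namespace, UID 1000 maps back to host UID 1000.
	for _, id := range []int{0, 999, 1000, 1001} {
		h, _ := childToHost(id)
		fmt.Printf("child ns UID %d -> host UID %d\n", id, h)
	}
}
```

Composing the two tables reproduces the host-side view quoted above: child UID 1000 reaches host UID 1000, child UIDs 0 to 999 reach 100000 to 100999, and child UID 1001 reaches 101000.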
ping @cyphar
This PR seems also needed for Docker-on-Chromebook https://www.reddit.com/r/Crostini/comments/9jabhq/docker_now_working/
Without this PR one can't use rootless
ping @opencontainers/runc-maintainers
This is being discussed at the dev ML: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/jOaNYre5M40
I'm reviewing and testing this today
I'll review this today.
Fix:
* Don't always enable rootless mode in userns #1837
* Rootless build: cannot specify gid= mount options for unmapped gid in rootless containers containers/podman#1147