Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rootfs: do not permit /proc mounts to non-directories #2207

Merged
merged 1 commit into from
Jan 22, 2020
Merged

rootfs: do not permit /proc mounts to non-directories #2207

merged 1 commit into from
Jan 22, 2020

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented Jan 14, 2020

mount(2) will blindly follow symlinks, which is a problem because it
allows a malicious container to trick runc into mounting /proc to an
entirely different location (and thus within the attacker's control for
a rename-exchange attack).

This is just a hotfix (to "stop the bleeding"), and the more complete
fix would be finish libpathrs and port runc to it (to avoid these types
of attacks entirely, and defend against a variety of other /proc-related
attacks). It can be bypased by someone having "/" be a volume controlled
by another container.

Fixes: CVE-2019-19921
Fixes #2197
Signed-off-by: Aleksa Sarai [email protected]

@cyphar
Copy link
Member Author

cyphar commented Jan 14, 2020

/cc @leoluk @opencontainers/runc-maintainers

Copy link

@leoluk leoluk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

mount(2) will blindly follow symlinks, which is a problem because it
allows a malicious container to trick runc into mounting /proc to an
entirely different location (and thus within the attacker's control for
a rename-exchange attack).

This is just a hotfix (to "stop the bleeding"), and the more complete
fix would be finish libpathrs and port runc to it (to avoid these types
of attacks entirely, and defend against a variety of other /proc-related
attacks). It can be bypased by someone having "/" be a volume controlled
by another container.

Fixes: CVE-2019-19921
Signed-off-by: Aleksa Sarai <[email protected]>
@cyphar
Copy link
Member Author

cyphar commented Jan 17, 2020

/cc @opencontainers/runc-maintainers

@giuseppe
Copy link
Member

giuseppe commented Jan 17, 2020

@cyphar I've added to crun an extra check to allow procfs and sysfs to be mounted only under the rootfs, so no /foo/proc to prevent the mount to happen on a volume where another container has CAP_SYS_ADMIN access (hopefully unlikely) and could do a mount(MS_MOVE) between the mount setup and the MS_RDONLY. Do you think it makes sense to add here as well?

@cyphar
Copy link
Member Author

cyphar commented Jan 17, 2020

Even given how crazy VOLUME is, I don't think you can trick Docker (or other such tools) into mounting procfs on /foo/proc. Also these are temporary restrictions, there's really no need to over-engineer them -- I want to drop this the second runc is ported to libpathrs.

Not to mention that if someone decides to create a container with CAP_SYS_ADMIN then they are running an insecure setup and it isn't our job to help protect them at that point -- there is literally nothing we can do to protect someone in that case (because the container can do many more malicious things than just mount(MS_MOVE)). If you have full-on CAP_SYS_ADMIN you can just re-mount anything as read-write. I'm not sure that the protection you mentioned actually addresses the "you have a container with CAP_SYS_ADMIN" threat model.

@thaJeztah
Copy link
Member

/cc @justincormack

@giuseppe
Copy link
Member

If you have full-on CAP_SYS_ADMIN you can just re-mount anything as read-write. I'm not sure that the protection you mentioned actually addresses the "you have a container with CAP_SYS_ADMIN" threat model.

true, I was thinking of a way to make the writeable /proc persistent RW when mounted on a shared volume, but I don't see a way without requiring a weird configuration (if possible at all with seccomp) to allow mount(MS_MOVE) but not allowing a remount/umount.

But even without CAP_SYS_ADMIN: if you can mount /proc on a shared bind mount, there will still be the possibility for a process in another container to open a file before the /proc mount is remounted read only. Allowing /proc just in the rootfs limits the attack as the rootfs itself must be accessible from the other malicious container.

@cyphar
Copy link
Member Author

cyphar commented Jan 17, 2020

I see your point, but IMHO the threat model for OCI runtimes is not an attacker being able to control the configuration (because in that case, it's game over no matter what you do). The fact that higher-level runtimes allow for this (such as Docker's VOLUME) is something we have to deal with anyway, but I'd prefer to avoid giving anyone the impression that we can solve the general (or even semi-general) case here.

And while it isn't possible (AFAICS) to trick Docker into creating a /foo/proc mount for procfs (in such a way that it isn't also possible to mount anything), it is very possible to get Docker to make / a volume -- which as I mentioned above (and in the other issue) allows you to bypass all of these checks entirely (including the one you described).

@cyphar
Copy link
Member Author

cyphar commented Jan 18, 2020

Then again, all of our masked-path logic is very much tied to the actual path (not the filesystem) so there is an argument for making procfs only mountable on /proc. My main disagreement is that I don't think that change helps protect against the threat model it's trying to protect against -- because if you can provide a way to mount procfs to a custom path there are plenty of other ways of causing trouble.

@cyphar
Copy link
Member Author

cyphar commented Jan 21, 2020

/ping @opencontainers/runc-maintainers

@hqhq
Copy link
Contributor

hqhq commented Jan 22, 2020

LGTM

Approved with PullApprove

1 similar comment
@mrunalp
Copy link
Contributor

mrunalp commented Jan 22, 2020

LGTM

Approved with PullApprove

@mrunalp mrunalp merged commit 2fc03cc into opencontainers:master Jan 22, 2020
@cyphar cyphar deleted the fix-double-volume-attack branch January 22, 2020 16:26
@thaJeztah
Copy link
Member

Will this be tagged as v1.0.0-rc10 ?

@thaJeztah
Copy link
Member

oh, nevermind; I see the e-mail thread 👍

AkihiroSuda added a commit to AkihiroSuda/docker that referenced this pull request Jan 24, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Signed-off-by: Akihiro Suda <[email protected]>
AkihiroSuda added a commit to AkihiroSuda/docker that referenced this pull request Jan 24, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Also updates go-selinux: opencontainers/selinux@3a1f366...5215b18
(See containerd/cri#1383 (comment))

Signed-off-by: Akihiro Suda <[email protected]>
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this pull request Jan 25, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Signed-off-by: Akihiro Suda <[email protected]>
Upstream-commit: cd43c1d1ac81a37dc8f9aad16d33949df80ac5b9
Component: engine
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this pull request Jan 25, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Also updates go-selinux: opencontainers/selinux@3a1f366...5215b18
(See containerd/cri#1383 (comment))

Signed-off-by: Akihiro Suda <[email protected]>
Upstream-commit: 6d6808090736ac76e908e78aa6894f5586c7d243
Component: engine
thaJeztah pushed a commit to thaJeztah/docker that referenced this pull request Feb 4, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Signed-off-by: Akihiro Suda <[email protected]>
(cherry picked from commit cd43c1d)
Signed-off-by: Sebastiaan van Stijn <[email protected]>
thaJeztah pushed a commit to thaJeztah/docker that referenced this pull request Feb 4, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Also updates go-selinux: opencontainers/selinux@3a1f366...5215b18
(See containerd/cri#1383 (comment))

Signed-off-by: Akihiro Suda <[email protected]>
(cherry picked from commit 6d68080)
Signed-off-by: Sebastiaan van Stijn <[email protected]>
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this pull request Feb 5, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Signed-off-by: Akihiro Suda <[email protected]>
(cherry picked from commit cd43c1d1ac81a37dc8f9aad16d33949df80ac5b9)
Signed-off-by: Sebastiaan van Stijn <[email protected]>
Upstream-commit: 3bd1759f804a53d15685e22eab7d609bb1fa556b
Component: engine
docker-jenkins pushed a commit to docker-archive/docker-ce that referenced this pull request Feb 5, 2020
Notable changes:
* Fix CVE-2019-19921 (Volume mount race condition with shared mounts): opencontainers/runc#2207
* Fix exec FIFO race: opencontainers/runc#2185
* Basic support for cgroup v2.  Almost feature-complete, but still missing support for systemd mode in rootless.
  See also opencontainers/runc#2209 for the known issues.

Full changes: opencontainers/runc@v1.0.0-rc9...v1.0.0-rc10

Also updates go-selinux: opencontainers/selinux@3a1f366...5215b18
(See containerd/cri#1383 (comment))

Signed-off-by: Akihiro Suda <[email protected]>
(cherry picked from commit 6d6808090736ac76e908e78aa6894f5586c7d243)
Signed-off-by: Sebastiaan van Stijn <[email protected]>
Upstream-commit: d3dab1f618d6e8c81d0704ac4e93bb2843c2dadf
Component: engine
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CVE-2019-19921]: Volume mount race condition with shared mounts
6 participants