OpenVPN not working inside container #18708

Open
wlhlm opened this issue Sep 17, 2016 · 20 comments
Labels
0.kind: bug Something is broken 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 9.needs: port to stable A PR needs a backport to the stable release.

Comments

@wlhlm
Contributor

wlhlm commented Sep 17, 2016

Issue description

OpenVPN inside a container fails to start with:

ERROR: Cannot open TUN/TAP dev /dev/net/tun: Operation not permitted (errno=1)

Steps to reproduce

  1. Start container with OpenVPN enabled
  2. Notice that OpenVPN isn't actually running
  3. Check journal

Technical details

It looks like the NET_ADMIN capability, which OpenVPN needs in order to manipulate the TUN device, is missing.

  • System: 17.03pre91207.2b0eace (Gorilla)
  • Nix version: nix-env (Nix) 1.11.4
  • Nixpkgs version: 17.03pre91207.2b0eace
@Mic92
Member

Mic92 commented Sep 18, 2016

NixOS containers could expose the --capability option of systemd-nspawn, so that NET_ADMIN could be added to the list.
You could send a pull request that adds this option to <nixpkgs/nixos/modules/virtualisation/containers.nix>.

@rasendubi rasendubi added 0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS labels Sep 18, 2016
@wlhlm
Contributor Author

wlhlm commented Sep 18, 2016

@Mic92 Thanks for the pointer. I'm investigating...

@wlhlm
Contributor Author

wlhlm commented Sep 18, 2016

Ok, I stumbled upon this in the manual:

[Private] networking is implemented using a pair of virtual Ethernet devices. The network interface in the container is called eth0, while the matching interface in the host is called ve-container-name (e.g., ve-foo). The container has its own network namespace and the CAP_NET_ADMIN capability, so it can perform arbitrary network configuration such as setting up firewall rules, without affecting or having access to the host’s network.

This sounded somewhat confusing to me at first, but became clearer after @Mic92's comment about systemd-nspawn. After going through the systemd-nspawn manual I found:

--private-network
Disconnect networking of the container from the host. [...] If this option is specified, the CAP_NET_ADMIN capability will be added to the set of capabilities the container retains. [...]

nixos-container doesn't set --private-network directly, though; it passes --network-veth:

-n, --network-veth
Create a virtual Ethernet link ("veth") between host and container. [...] The --network-veth option implies --private-network.

So, containers with private network have CAP_NET_ADMIN, containers without don't, meaning OpenVPN should work at least in the former.
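
One way to check this reasoning from inside a container might be to decode the capability bounding set that the kernel reports in /proc/self/status. The following is only a sketch: the helper name has_cap_bit is made up for illustration, and it assumes a Linux /proc and that CAP_NET_ADMIN is capability number 12, as defined in linux/capability.h.

```shell
#!/bin/sh
# Hypothetical helper (not part of NixOS or systemd): succeeds if bit $2 is
# set in the hexadecimal capability mask $1.
has_cap_bit() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# CAP_NET_ADMIN is capability number 12 in <linux/capability.h>.
CAP_NET_ADMIN=12

# CapBnd is the capability bounding set, printed by the kernel as hex.
capbnd=$(awk '/^CapBnd:/ {print $2}' /proc/self/status 2>/dev/null || true)

if [ -n "$capbnd" ] && has_cap_bit "$capbnd" "$CAP_NET_ADMIN"; then
    echo "CAP_NET_ADMIN is in the bounding set"
else
    echo "CAP_NET_ADMIN is not in the bounding set (or /proc is unavailable)"
fi
```

Running this in a container with and without privateNetwork = true would be one way to compare the two cases described above.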

After that I tried to run OpenVPN inside a container with private networking by setting containers.<name>.privateNetwork = true and was surprised again: OpenVPN still fails to start with the same error message as above ("Operation not permitted"). Running OpenVPN manually inside the container works, however:

# nixos-container run vpn -- openvpn --config /nix/store/jx60gppis2xs7zgvs29afz5x7nbpj7s0-openvpn-config-main
...
Sun Sep 18 15:25:50 2016 us=747892 Initialization Sequence Completed

Not really sure what's going on here...

@wlhlm
Contributor Author

wlhlm commented Sep 18, 2016

Actually, it seems that, when started manually, OpenVPN also works inside a container without private network:

# nixos-container run vpn -- ip link show
1: lo: ...
2: enp1s0: ...
# nixos-container run vpn -- openvpn --config /nix/store/jx60gppis2xs7zgvs29afz5x7nbpj7s0-openvpn-config-main
...
Sun Sep 18 15:54:16 2016 us=312424 Initialization Sequence Completed
  • OpenVPN failing to access a TUN device doesn't seem to have anything to do with CAP_NET_ADMIN
    • Maybe OpenVPN simply isn't able to access /dev/net/tun (as the original error already states d'oh).

@Mic92
Member

Mic92 commented Sep 18, 2016

Actually you need two things: access to the device node /dev/net/tun in order to create a tun device, and CAP_NET_ADMIN to add IP addresses/routes to this interface. Note that the container has its own devtmpfs mounted on /dev, which is different from the one on the host.
By default /dev/net/tun probably does not exist in containers, but your container should have permission to create it:

$ mkdir -p /dev/net
$ mknod /dev/net/tun c 10 200
$ chmod 0666 /dev/net/tun
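
Since the node may already be present in a container, a guarded variant of the commands above could avoid an "exists" error from mknod. This is a minimal sketch, not something from the NixOS tooling: the helper name ensure_char_dev is hypothetical, and actually creating the node requires root and permission to use mknod inside the container.

```shell
#!/bin/sh
# Hypothetical helper: create character device $1 with major $2 and minor $3,
# unless a character device already exists at that path.
ensure_char_dev() {
    if [ -c "$1" ]; then
        echo "$1 already exists, leaving it alone"
        return 0
    fi
    mkdir -p "$(dirname "$1")" &&
        mknod "$1" c "$2" "$3" &&
        chmod 0666 "$1"
}

# Usage (as root inside the container); 10/200 is the TUN/TAP device number
# used in the commands above:
#   ensure_char_dev /dev/net/tun 10 200
```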

@wlhlm
Contributor Author

wlhlm commented Sep 18, 2016

/dev/net/tun already exists in all of my containers (even in the ones not running OpenVPN). Using mknod was the first solution I tried, but it failed with "/dev/net/tun exists".

@Mic92
Member

Mic92 commented Sep 18, 2016

Ok. Opening a device node should not require any special capabilities, just the correct permissions.

$ cat /dev/net/tun 
cat: /dev/net/tun: Permission denied
$ chmod 777 /dev/net/tun
$ cat /dev/net/tun
cat: /dev/net/tun: File descriptor in bad state

Is openvpn running as a non-root user in the openvpn service?

@wlhlm
Contributor Author

wlhlm commented Sep 18, 2016

@Mic92 /dev/net/tun is created by systemd-nspawn itself. Found in a mailing list posting (systemd/systemd@85614d6).


Is openvpn running as a non-root user in the openvpn service?

@Mic92 OpenVPN is run as root AFAICT. Neither the generated service nor the config file includes any user directives, though I'm not sure how to verify that for real. I was thinking about just adding ExecStartPre=/run/current-system/sw/bin/systemd-run whoami, but I don't know how to modify the service, since /nix/store is read-only.
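
One possible way to check which user a service's main process runs as, without editing the unit, is to read the UIDs from /proc. The sketch below is an illustration only: the helper name proc_euid is hypothetical, and the usage lines assume a systemd recent enough to support systemctl show --value and the unit name openvpn-main.service from the generated service.

```shell
#!/bin/sh
# Hypothetical helper: print the effective UID of process $1, taken from the
# "Uid:" line of /proc/<pid>/status (fields: real, effective, saved, fs).
proc_euid() {
    awk '/^Uid:/ {print $3}' "/proc/$1/status"
}

# Usage inside the container:
#   pid=$(systemctl show -p MainPID --value openvpn-main.service)
#   proc_euid "$pid"    # 0 means root
```

Note that MainPID is reported as 0 while the service is not running, so this only helps during the window in which the process is alive.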

Here is the generated service:

$ cat /nix/store/dy2vwbw6ba335c92cxk0z9gvks0vbzwj-unit-openvpn-main.service/openvpn-main.service
[Unit]
After=network-interfaces.target
Description=OpenVPN instance ‘main’

[Service]
Environment="LOCALE_ARCHIVE=/nix/store/5r4ld7ljv7j7psg19m2300pf63zki7a2-glibc-locales-2.24/lib/locale/locale-archive"
Environment="PATH=/nix/store/9z2j15dpph32n892vhv60vl4nj4vxcmc-iptables-1.6.0/bin:/nix/store/kwniwil6lhjwksgkxsbab7x8sff8fl8h-iproute2-4.5.0/bin:/nix/store/97f928hz549v1kz36i2680rx9xxsz8nq-net-tools-1.60_p20120127084908/bin:/nix/store/zhhb02yrp03xzfrpk06ixb0gmwrjmjff-coreutils-8.25/bin:/nix/store/8rd2k2l4q2zyqin7whdafsxq15q9x94j-findutils-4.6.0/bin:/nix/store/8d1n7sv4fbjf0rbn0xaczja2py4iqm25-gnugrep-2.25/bin:/nix/store/n62975gpkckivvxvlhg9hjfknspxnkkm-gnused-4.2.2/bin:/nix/store/cs53yxld1yqxd5k8fwm0nwsiz159ccwq-systemd-231/bin:/nix/store/9z2j15dpph32n892vhv60vl4nj4vxcmc-iptables-1.6.0/sbin:/nix/store/kwniwil6lhjwksgkxsbab7x8sff8fl8h-iproute2-4.5.0/sbin:/nix/store/97f928hz549v1kz36i2680rx9xxsz8nq-net-tools-1.60_p20120127084908/sbin:/nix/store/zhhb02yrp03xzfrpk06ixb0gmwrjmjff-coreutils-8.25/sbin:/nix/store/8rd2k2l4q2zyqin7whdafsxq15q9x94j-findutils-4.6.0/sbin:/nix/store/8d1n7sv4fbjf0rbn0xaczja2py4iqm25-gnugrep-2.25/sbin:/nix/store/n62975gpkckivvxvlhg9hjfknspxnkkm-gnused-4.2.2/sbin:/nix/store/cs53yxld1yqxd5k8fwm0nwsiz159ccwq-systemd-231/sbin"
Environment="TZDIR=/nix/store/fa2fa02lgq591r69pxwhdbfw3kyr9sl0-tzdata-2016f/share/zoneinfo"



ExecStart=@/nix/store/s5wnwc5vbngivxc10kqzrhfdhw7drjym-openvpn-2.3.11/sbin/openvpn openvpn --config /nix/store/zw6bjndsw8yc1i4l058cww6a09fwkdli-openvpn-config-main
Restart=always
Type=notify


If I just copy the ExecStart line and run it using nixos-container run or nixos-container root-login, it works just fine (see above). Only the service seems to fail.

@Mic92
Member

Mic92 commented Sep 18, 2016

I just checked the permissions of the tun device from within the service in such a container. They are as expected (checked with ls). The service also has all the capabilities the container has (checked with getpcaps $$).
Other security frameworks such as AppArmor/SELinux are not active. The syscall trace of openvpn provided by sysdig looks sane to me, but I don't know why openvpn is not allowed to open /dev/net/tun:

open fd=-1(EPERM) name=/dev/net/tun flags=3(O_RDWR) mode=0

UPDATE: Maybe it has something to do with systemd's DeviceAllow feature?

https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html

This list is already restricted at container startup, but maybe it is further restricted for a service?

@wlhlm
Contributor Author

wlhlm commented Sep 20, 2016

I found out why OpenVPN works with nixos-container run: it uses nsenter to get an environment similar to that of the container, but doesn't switch cgroups, so it's not really comparable with how the systemd service is run. When I open a shell with machinectl shell vpn I get a "real" session inside the container and when I manually run OpenVPN it also fails to open the tun device, just like the service.
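
The cgroup difference described above can be made visible by comparing /proc/<pid>/cgroup for two processes. This is a minimal sketch under the assumption of a Linux /proc; the helper name same_cgroup is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if processes $1 and $2 are in the same
# cgroup(s), judged by comparing their /proc/<pid>/cgroup files.
same_cgroup() {
    [ "$(cat "/proc/$1/cgroup")" = "$(cat "/proc/$2/cgroup")" ]
}

# Example: a shell is always in the same cgroup as itself.
if same_cgroup "$$" "$$"; then
    echo "same cgroup"
fi
```

For a quick manual comparison, printing /proc/self/cgroup once via nixos-container run vpn -- cat /proc/self/cgroup and once from a machinectl shell vpn session should show different cgroup paths if the explanation above holds.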

@edolstra Why does nixos-container use nsenter instead of just machinectl shell? It seems like nsenter doesn't give you the complete environment of the container leading to surprising results.

@Mic92
Member

Mic92 commented Sep 20, 2016

OK, I didn't know that it uses nsenter. This is probably a leftover from when machinectl had no shell subcommand. It is probably worth a pull request, because nsenter also lacks other things, like proper pseudo-TTY support.

@wlhlm
Contributor Author

wlhlm commented Sep 20, 2016

I finally found the problem. @Mic92 was right with the Device* options.

By default, nixos-container creates a service file for every container (container@<name>.service) and sets DevicePolicy=closed, meaning the container can't access any devices that aren't explicitly allowed with DeviceAllow (except for /dev/null, /dev/zero, /dev/full, /dev/random, and /dev/urandom).

Editing the containers module to set DeviceAllow=/dev/net/tun rw (or DevicePolicy=auto) fixes the issue and OpenVPN can create a tun interface without problems.
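
For experimenting without editing the module (the unit itself lives in the read-only /nix/store), a runtime drop-in might achieve the same effect. This is a sketch under assumptions rather than the fix that was submitted: the helper name tun_dropin and the file name allow-tun.conf are arbitrary, the container is assumed to be called vpn, and it relies on systemd reading drop-ins from /run/systemd/system/<unit>.d/.

```shell
#!/bin/sh
# Hypothetical helper: print a systemd drop-in that allows read/write access
# to /dev/net/tun for the unit it is attached to.
tun_dropin() {
    printf '%s\n' '[Service]' 'DeviceAllow=/dev/net/tun rw'
}

# Possible usage on the host, as root, for a container named "vpn":
#   dir=/run/systemd/system/container@vpn.service.d
#   mkdir -p "$dir" && tun_dropin > "$dir/allow-tun.conf"
#   systemctl daemon-reload && systemctl restart container@vpn.service
```

Files under /run do not survive a reboot, so this is only suitable for testing the hypothesis, not as a permanent configuration.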

I'll submit PRs for nixos-container and containers module shortly. I'll also look into adding capabilities options to the containers module, as this currently only works with privateNetwork = true.

@rasendubi
Member

Reopening. Needs backport to 16.09

@rasendubi rasendubi reopened this Oct 4, 2016
@rasendubi rasendubi added the 9.needs: port to stable A PR needs a backport to the stable release. label Oct 4, 2016
@cillianderoiste
Member

It would also be great to allow /dev/net/tun to be used inside imperative containers/globally.

@nh2
Contributor

nh2 commented Dec 8, 2017

It would also be great to allow /dev/net/tun to be used inside imperative containers

Yes, I need that, so that it works with nixops.

@deliciouslytyped
Contributor

Is this still relevant? Backporting to 16.09 is probably no longer interesting, but is there anything else to do here?

@wlhlm
Contributor Author

wlhlm commented Dec 23, 2019

/dev/net/tun access was backported in #19523, but I'm unclear whether it works in imperative containers as @cillianderoiste and @nh2 mention above.

@stale

stale bot commented Jun 20, 2020

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 20, 2020
@Lassulus
Member

Lassulus commented Sep 27, 2020

This is still relevant, as /dev/net/tun is inaccessible inside imperative containers. It would be nice if there were an --enable-tun flag for nixos-container create, or something like that.

EDIT:
After diving into the code, the problem is:
the permissions get added by the unit file
the unit file is added inside a NixOS module
the imperative containers are managed by a Perl script, which currently does not interact with the unit in any way

so a solution would be to allow /dev/net/tun globally in all declarative containers in https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/virtualisation/nixos-containers.nix

Or maybe to add some flags to globally enable/disable the flags for all imperative containers.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Sep 27, 2020
@stale

stale bot commented Mar 31, 2021

I marked this as stale due to inactivity.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 31, 2021
7 participants