
compose flake: rootlesskit port forwarder not setup correctly #10052

Closed
Luap99 opened this issue Apr 16, 2021 · 16 comments · Fixed by #11091
Assignees
Labels
flakes (Flakes from Continuous Integration)
In Progress (This issue is actively being worked by the assignee, please do not work on this at this time.)
kind/bug (Categorizes issue or PR as related to a bug.)
locked - please file new issue/PR (Assist humans wanting to comment on an old issue or PR with locked comments.)
network (Networking related issue or feature)

Comments

@Luap99
Member

Luap99 commented Apr 16, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

The rootless compose test is flaking regularly in CI. The problem is that port forwarding with rootlesskit is not working; rootlesskit does not seem to bind the port on the host.
An example test failure with some debug information can be seen here: https://storage.googleapis.com/cirrus-ci-6707778565701632-fcae48/artifacts/containers/podman/5302424773591040/html/compose-podman-fedora-34beta-rootless-host.log.html
Relevant log lines:

time="2021-04-16T04:34:23-05:00" level=debug msg="rootlessport: time=\"2021-04-16T04:34:23-05:00\" level=info msg=\"starting parent driver\"\n"
time="2021-04-16T04:34:23-05:00" level=debug msg="rootlessport: time=\"2021-04-16T04:34:23-05:00\" level=info msg=\"opaque=map[builtin.readypipepath:/run/user/30212/libpod/tmp/rootlessport761613945/.bp-ready.pipe builtin.socketpath:/run/user/30212/libpod/tmp/rootlessport761613945/.bp.sock]\"\n"
time="2021-04-16T04:34:23-05:00" level=debug msg="rootlessport: time=\"2021-04-16T04:34:23-05:00\" level=info msg=\"starting child driver in child netns (\\\"/proc/self/exe\\\" [containers-rootlessport-child])\"\n"
time="2021-04-16T04:34:23-05:00" level=debug msg="rootlessport: time=\"2021-04-16T04:34:23-05:00\" level=info msg=\"waiting for initComplete\"\n"
time="2021-04-16T04:34:23-05:00" level=debug msg="rootlessport: time=\"2021-04-16T04:34:23-05:00\" level=info msg=\"initComplete is closed; parent and child established the communication channel\"\n"
time="2021-04-16T04:34:23-05:00" level=debug msg="rootlessport: time=\"2021-04-16T04:34:23-05:00\" level=info msg=\"exposing ports [{5000 5000 tcp }]\"\n"
time="2021-04-16T04:34:23-05:00" level=debug msg="rootlessport is ready"

A successful run looks like this:

time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=\"starting parent driver\"\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=\"opaque=map[builtin.readypipepath:/run/user/1000/libpod/tmp/rootlessport734916386/.bp-ready.pipe builtin.socketpath:/run/user/1000/libpod/tmp/rootlessport734916386/.bp.sock]\"\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=\"starting child driver in child netns (\\\"/proc/self/exe\\\" [containers-rootlessport-child])\"\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=\"waiting for initComplete\"\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=\"initComplete is closed; parent and child established the communication channel\"\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=\"exposing ports [{5000 5000 tcp }]\"\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=ready\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport: time=\"2021-04-16T13:33:14+02:00\" level=info msg=\"waiting for exitfd to be closed\"\n"
time="2021-04-16T13:33:14+02:00" level=debug msg="rootlessport is ready"

Relevant code part:

logrus.Infof("exposing ports %v", cfg.Mappings)
if err := exposePorts(driver, cfg.Mappings, cfg.ChildIP); err != nil {
	return err
}
// write and close ReadyFD (convention is same as slirp4netns --ready-fd)
logrus.Info("ready")
if _, err := readyW.Write([]byte("1")); err != nil {
	return err
}

We never reach logrus.Info("ready"), so the error must be happening inside exposePorts(). However, the main podman process never gets the error propagated back and thinks rootlesskit started successfully.
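
For illustration, here is a minimal sketch (not podman's actual code, helper name hypothetical) of how the parent side could turn a missing ready byte into a hard error instead of assuming success, following the slirp4netns --ready-fd convention referenced in the snippet above:

package rootlessport

import (
	"fmt"
	"os"
	"time"
)

// waitForReady (hypothetical) treats a missing ready byte as a hard error.
// readyR is the parent's read end of the ready pipe; the child writes "1"
// only after exposePorts() succeeded.
func waitForReady(readyR *os.File) error {
	// os.Pipe descriptors support deadlines, so we can fail instead of
	// blocking forever when the child errors out inside exposePorts().
	if err := readyR.SetReadDeadline(time.Now().Add(10 * time.Second)); err != nil {
		return err
	}
	buf := make([]byte, 1)
	n, err := readyR.Read(buf)
	if err != nil {
		return fmt.Errorf("rootlessport child never signaled ready: %w", err)
	}
	if n != 1 || buf[0] != '1' {
		return fmt.Errorf("unexpected ready data %q from rootlessport child", buf[:n])
	}
	return nil
}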

@openshift-ci-robot added the kind/bug label Apr 16, 2021
@Luap99
Member Author

Luap99 commented Apr 16, 2021

@AkihiroSuda PTAL
Somehow the rootless port setup fails but the main podman process thinks it started successfully. Any ideas why it could fail without an error being displayed?

@AkihiroSuda
Collaborator

AkihiroSuda commented Apr 16, 2021

Is this reproducible manually out of CI? Is this specific to compose?

@Luap99
Member Author

Luap99 commented Apr 16, 2021

I am unable to reproduce locally, and I also did not see this issue in other CI tests, so it could be specific to compose.

@edsantiago
Member

Hmmmm, it's not only rootless:

compose: simple_port_map - curl (port 5000) failed with status 7

compose: port_map_diff_port - curl (port 5001) failed with status 7

compose: mount_and_label : port 5000

@AkihiroSuda added the flakes and kind/bug labels and removed the kind/bug and rootless labels Apr 21, 2021
@edsantiago
Member

This is now our #1 flake. Unfortunately it triggers in many different subtests, so it's hard to just add a skip to ignore the problem.

compose: mount_and_label - curl (port 5000) failed with status 7

compose: env_and_volume - down

compose: port_map_diff_port - curl (port 5001) failed with status 7

compose: env_and_volume - curl (port 5000) failed with status 7

compose: env_and_volume : port 5001

@Luap99
Member Author

Luap99 commented May 12, 2021

@edsantiago Can you check for network flakes in the system tests, especially the remote tests? I think this is a more general problem, not only compose.

@edsantiago
Member

edsantiago commented May 12, 2021

@Luap99, interesting that you mention it. Here are two flakes from the past week:

sys: podman networking: port on localhost

The symptom looks similar to my eye. Aside from those two, I don't see any flakes that match. (I do see a new series of iptables flakes, but I'm going to ignore those right now unless they start happening more often).

[EDIT: I just noticed, these are podman, not podman-remote like you asked for]

@edsantiago
Member

Another flake in "port on localhost" test just now: PR #10222, sys podman ubuntu-2104 rootless host (again podman, not remote)

@Luap99
Member Author

Luap99 commented May 12, 2021

@edsantiago These are new ones from my PR yesterday; they should be fixed by #10318.

I mean failures like this one: https://storage.googleapis.com/cirrus-ci-6707778565701632-fcae48/artifacts/containers/podman/4569209960136704/html/sys-remote-fedora-34-root-host.log.html. The No route to host error message also happens in the rootful compose test. Can you grep for this string in all system test logs? How far back can you go? I want to know if this ever happened before the merge of #9423.

@edsantiago
Member

Unfortunately, no, I have no (sane) way to grep the logs. I just have a very simple script that queries GitHub and Cirrus but does not download and preserve the logs for me. I guess I should consider adding that...

@edsantiago
Member

Here's another one outside of compose tests. podman + root, not podman-remote rootless:

sys: podman network reload

@edsantiago
Member

compose: simple_port_map - curl (port 5000) failed with status 7


These too are network-related: one is a missing result from a web fetch, the other an `ncat: no route to host` error:

sys: podman networking: port on localhost

sys: podman pod create - hashtag AllTheOptions

@edsantiago
Member

Another Ncat: No route to host:

sys: podman networking: port with --userns=keep-id

And another compose flake:

compose: mount_and_label - curl (port 5000) failed with status 7

@Luap99 added the network label Jun 21, 2021
@vrothberg
Member

I also see the compose tests flaking frequently.

@edsantiago
Member

Is this where we're tracking Ncat: No route to host flakes? If so, here's some more:

sys: podman networking: port with --userns=keep-id

@Luap99
Member Author

Luap99 commented Jul 21, 2021

OK, I found the root cause for the rootless issue. Reproducer:

podman network create testnet
podman run -p 8080:80 --network testnet --name test --replace -d --rm nginx
# this works
curl 127.0.0.1:8080
podman network disconnect testnet test
# this fails as expected with Connection reset by peer (no network is attached)
curl 127.0.0.1:8080
# now connect again to the network
podman network connect testnet test
# curl should work now but it doesn't
# it hangs for some seconds before returning Connection reset by peer
curl 127.0.0.1:8080

docker-compose always runs network disconnect && network connect on the container. I think the race happens between the rootlesskit port setup and docker-compose calling network connect/disconnect at the same time. If rootlesskit is initialized before network connect has finished, port forwarding will be broken.

Rootlesskit port forwarding breaks because podman sets the child IP for the rootlesskit port forwarder to the eth0 IP address. After disconnect && connect, CNI allocates a new IP. Since the new eth0 IP no longer matches the rootlesskit child IP, the port forwarding is broken. Ideally we could use 127.0.0.1 as the source address, but this was changed to fix CVE-2021-20199.

@AkihiroSuda Do you know a good way to fix this? I know rootlesskit offers a way to add/remove ports dynamically; is there a way for podman network connect to remove the broken port and add a new port with the correct child IP?
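
For reference, rootlesskit's dynamic port API (port.Manager with ListPorts/RemovePort/AddPort) would allow a sketch like the following; the helper name and error handling are hypothetical, not an actual podman implementation:

package rootlessport

import (
	"context"

	"github.com/rootless-containers/rootlesskit/pkg/port"
)

// replaceChildIP (hypothetical) re-creates every exposed port with the
// container's new eth0 address after a network disconnect/connect.
func replaceChildIP(ctx context.Context, pm port.Manager, newChildIP string) error {
	ports, err := pm.ListPorts(ctx)
	if err != nil {
		return err
	}
	for _, p := range ports {
		// Remove the mapping that still points at the stale child IP ...
		if err := pm.RemovePort(ctx, p.ID); err != nil {
			return err
		}
		// ... and re-add it with the address CNI assigned after reconnect.
		spec := p.Spec
		spec.ChildIP = newChildIP
		if _, err := pm.AddPort(ctx, spec); err != nil {
			return err
		}
	}
	return nil
}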

@Luap99 self-assigned this Aug 2, 2021
@Luap99 added the In Progress label Aug 2, 2021
Luap99 added a commit to Luap99/libpod that referenced this issue Aug 3, 2021
The rootlessport forwarder requires a child IP to be set. This must be a
valid IP in the container network namespace. The problem is that after a
network disconnect and connect the eth0 IP changes, therefore the
packets are dropped since the source IP no longer exists in the netns.
One solution is to set the child IP to 127.0.0.1, however this is a
security problem. [1]

To fix this we have to recreate the ports after network connect and
disconnect. To make this work, the rootlessport process exposes a socket
which podman network connect/disconnect connects to, sending the new child
IP to rootlessport. The rootlessport process will remove all ports and
recreate them with the new correct child IP.

Also bump rootlesskit to v0.14.3 to fix a race with RemovePort().

Fixes containers#10052

[1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199

Signed-off-by: Paul Holzinger <[email protected]>
Luap99 added a commit to Luap99/libpod that referenced this issue Aug 6, 2021
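
A minimal sketch of the socket mechanism the commit message describes, reusing the replaceChildIP sketch from above; the socket path handling and the wire format (a plain IP string) are assumptions for illustration, not necessarily what the actual fix implements:

package rootlessport

import (
	"context"
	"net"
	"strings"

	"github.com/rootless-containers/rootlesskit/pkg/port"
	"github.com/sirupsen/logrus"
)

// serveChildIPUpdates (hypothetical): podman network connect/disconnect
// connects to socketPath and writes the container's new child IP; the
// rootlessport process then rebuilds all of its port mappings.
func serveChildIPUpdates(ctx context.Context, socketPath string, pm port.Manager) error {
	l, err := net.Listen("unix", socketPath)
	if err != nil {
		return err
	}
	defer l.Close()
	for {
		conn, err := l.Accept()
		if err != nil {
			return err
		}
		buf := make([]byte, 64)
		n, err := conn.Read(buf)
		conn.Close()
		if err != nil || n == 0 {
			continue
		}
		newIP := strings.TrimSpace(string(buf[:n]))
		if err := replaceChildIP(ctx, pm, newIP); err != nil {
			logrus.Errorf("recreating ports for child IP %s: %v", newIP, err)
		}
	}
}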
@github-actions bot added the locked - please file new issue/PR label Sep 21, 2023
@github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023