network interface/firewall rules leaked after aardvark-dns start error #1121

Closed
stuartm opened this issue Nov 7, 2024 · 4 comments · Fixed by #1123


stuartm commented Nov 7, 2024

Issue Description

Podman version 5.2.3

The issue I'm seeing is identical to containers/podman#14365, which was closed and locked due to inactivity but, it seems, was never resolved and was affecting at least a few people.

I recently upgraded my server from Fedora 39 to Fedora 40, after which a pihole container that had been working perfectly stopped functioning. Or rather, as it turns out, port forwarding for that container stopped working, and in a rather interesting way.

I was forwarding port 53 (tcp/udp) on the host to port 53 on the container, and port 8888 on the host to port 80 on the container for pihole's admin interface. After the upgrade, port forwarding broke for both ports.

I've played around with different caps, disabled SELinux enforcement on the host, and disabled the firewall (although it was correctly configured). I've checked and rechecked the container configuration, and even managed to prove that it is working as expected except for the port forwarding issue (see below).

To cut a very long story short, here's what I discovered after hours of trying to get things working again. I was able to reach both ports through the container IP, which shows that the container itself was functioning correctly. When I changed the host ports used for forwarding (8888 to 8765 and 53 to 54), port forwarding worked! Therefore the issue is specific to certain ports: in my experience 53, as in the original ticket, but also others, including 8888.
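The comparison described above (container IP reachable, host-forwarded port not) can be repeated with a small probe. This is only a sketch: `tcp_reachable` is a hypothetical helper, the addresses are whatever you pass on the command line, and it covers TCP only, so it says nothing about DNS over UDP on port 53.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` is established within `timeout`.
/// A timeout matters here because dropped traffic never gets an answer.
fn tcp_reachable(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

fn main() {
    // Each argument is an "ip:port" pair, for example the container IP with
    // the container port, then the host IP with the forwarded port.
    for arg in std::env::args().skip(1) {
        match arg.parse::<SocketAddr>() {
            Ok(addr) => {
                let ok = tcp_reachable(addr, Duration::from_secs(2));
                println!("{addr}: {}", if ok { "reachable" } else { "unreachable" });
            }
            Err(e) => eprintln!("{arg}: not a socket address ({e})"),
        }
    }
}
```

If the container address answers while the host address times out, the container itself is probably fine and the forwarding path is the place to look.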

A half dozen other containers, all with port forwards, are unaffected by this issue.

I can't see an obvious connection between ports 53 and 8888; however, maybe those two ports share something that triggers a thought for someone.

Steps to reproduce the issue

  1. Run a container forwarding port 53 (or 8888) from host

Describe the results you received

Port forwarding for some ports is resulting in traffic just disappearing into the void.

Describe the results you expected

Traffic forwarded from the host on mapped ports to reach the container

podman info output

Podman version 5.2.3
Fedora 40
x86_64

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

No

Additional environment details

Additional environment details

Additional information

Additional information like issue happens only occasionally or issue happens with a particular architecture or on a particular setting

@sbrivio-rh

I was forwarding port 53 (tcp/udp) on the host to port 53 on the container

Let's focus on this for a moment. Can you check which process is bound to port 53? Try fuser -n tcp 53 and fuser -n udp 53. I wonder if another process you don't expect is stealing your packets.


stuartm commented Nov 8, 2024

Last night, after opening this ticket, I had a thought. It was just too great a coincidence that, of all the ports, the only ones not working were exactly the two forwarded for this container.

Immediately after the upgrade, when I first started the container, it failed to start due to a conflict on port 53 with aardvark. I changed the port forwarding to listen only on the external IP and restarted it, which fixed the binding issue, but then I found that port forwarding was not working. I formed a theory that forwarding rules had been created on that first start but were not removed when the container subsequently failed to bind to port 53 on all addresses. Rather than messing about trying to find where these rules were (in hindsight, probably just iptables?), I restarted the host, and that fixed the issue.

So there is a bug here, but it's not what I thought. It appears that under a certain failure scenario podman creates port forwarding rules but then does not clean them up correctly, resulting in all traffic sent to those ports presumably being sent to the wrong container IP.

This might also explain why so many people were having issues specifically with pihole: the standard instructions for pihole enable port forwarding on port 53 for all addresses by default. With the introduction of aardvark on the container interface, this would result in the first start of the pihole container always failing. Like me, many would have restricted pihole to listen on an external interface and then recreated the container, only to find that things were still not working. Most of those people would also have restarted the host at some point and found that the issue disappeared, which explains why the original ticket was abandoned.
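The bind conflict behind that first failed start can be reproduced without root. In this sketch, `second_bind_error` is a hypothetical helper: 127.0.0.1 stands in for the bridge address where aardvark-dns listens, and a kernel-assigned port stands in for port 53. It shows only the socket-level conflict, not how podman sets up its forwards.

```rust
use std::io::ErrorKind;
use std::net::{Ipv4Addr, SocketAddr, UdpSocket};

/// Binds a UDP socket on `first` (kernel-chosen port), then tries to bind the
/// same port on `second`. Returns the error kind of the second bind, or None
/// if the second bind succeeded.
fn second_bind_error(first: Ipv4Addr, second: Ipv4Addr) -> std::io::Result<Option<ErrorKind>> {
    // Port 0 asks the kernel for any free port, so no privileges are needed.
    let held = UdpSocket::bind(SocketAddr::from((first, 0)))?;
    let port = held.local_addr()?.port();
    // `held` is still open here, like a DNS server already holding the port.
    let result = UdpSocket::bind(SocketAddr::from((second, port)));
    Ok(result.err().map(|e| e.kind()))
}

fn main() -> std::io::Result<()> {
    // 127.0.0.1 plays the bridge address; 0.0.0.0 is the "all addresses"
    // forward from the standard pihole instructions.
    let outcome = second_bind_error(Ipv4Addr::LOCALHOST, Ipv4Addr::UNSPECIFIED)?;
    println!("wildcard bind while a specific address is held: {outcome:?}");
    Ok(())
}
```

On Linux the wildcard bind is refused with an address-in-use error. Passing two different specific addresses instead should let the second bind succeed, which matches the workaround of listening only on the external IP.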

I don't know if you want to leave this ticket open for the orphaned port forwarding rule bug, or not, I leave that decision to you.

@sbrivio-rh

I don't know if you want to leave this ticket open for the orphaned port forwarding rule bug, or not, I leave that decision to you.

I have no idea how that part works, but if there are stale nftables port forwarding rules (I'm not sure what component would add them?) then there's an actual issue somewhere...

Member

Luap99 commented Nov 8, 2024

Immediately after the upgrade, when I first started the container, it failed to start due to a conflict on port 53 with aardvark. I changed the port forwarding to listen only on the external IP and restarted it, which fixed the binding issue, but then I found that port forwarding was not working. I formed a theory that forwarding rules had been created on that first start but were not removed when the container subsequently failed to bind to port 53 on all addresses. Rather than messing about trying to find where these rules were (in hindsight, probably just iptables?), I restarted the host, and that fixed the issue.

Yeah, looking at the code, it seems that if we fail to start aardvark-dns we forget to tear down the driver again here:

if let Err(er) = aardvark_interface.commit_netavark_entries(aardvark_entries) {
    // No driver teardown happens before this early return.
    return Err(std::io::Error::new(
        std::io::ErrorKind::Other,
        format!("Error while applying dns entries: {er}"),
    ));
}

At least that is what I assume from your description, but we definitely do not clean up on failure there, so leaked iptables rules and a leaked interface are to be expected in that case.
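A minimal sketch of the missing cleanup, under stated assumptions: `FakeDriver`, `start_dns` and `network_setup` are hypothetical stand-ins, not netavark's real types or functions. It only illustrates the pattern of undoing the driver setup before returning the DNS error.

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical stand-in for a network driver; not netavark's real trait.
#[derive(Default)]
struct FakeDriver {
    /// True while the interface and firewall rules are considered present.
    configured: bool,
}

impl FakeDriver {
    fn setup(&mut self) -> std::io::Result<()> {
        self.configured = true;
        Ok(())
    }

    fn teardown(&mut self) -> std::io::Result<()> {
        self.configured = false;
        Ok(())
    }
}

/// Hypothetical stand-in for starting aardvark-dns.
fn start_dns(fail: bool) -> std::io::Result<()> {
    if fail {
        Err(Error::new(ErrorKind::AddrInUse, "port 53 already bound"))
    } else {
        Ok(())
    }
}

fn network_setup(driver: &mut FakeDriver, dns_fails: bool) -> std::io::Result<()> {
    driver.setup()?;
    if let Err(err) = start_dns(dns_fails) {
        // The caller treats the whole setup as failed and will not call
        // teardown, so undo the driver setup here before returning.
        if let Err(td_err) = driver.teardown() {
            eprintln!("teardown after dns error also failed: {td_err}");
        }
        return Err(Error::new(
            ErrorKind::Other,
            format!("Error while applying dns entries: {err}"),
        ));
    }
    Ok(())
}

fn main() {
    let mut driver = FakeDriver::default();
    let result = network_setup(&mut driver, true);
    println!("result: {result:?}, still configured: {}", driver.configured);
}
```

Without the `teardown` call in the error branch, `configured` would stay true after a failed start, which is the leak described in this issue.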

I'll move the issue to netavark.

Luap99 transferred this issue from containers/podman Nov 8, 2024
Luap99 removed the network label Nov 8, 2024
Luap99 changed the title from "Port forwarding for selected ports failing" to "network interface/firewall rules leaked after aardvark-dns start error" Nov 8, 2024
Luap99 added a commit to Luap99/netavark that referenced this issue Nov 11, 2024
When aardvark-dns error out at the end we already configured interfaces
+ firewall rules for the driver but because we return a error podman
considers this a failure and teardown is never called.

Netavark should always cleanup on its own. So on errors make sure to
tear down the drivers again.

Fixes containers#1121

Signed-off-by: Paul Holzinger <[email protected]>
Luap99 self-assigned this Nov 11, 2024
Luap99 added a commit to Luap99/netavark that referenced this issue Nov 26, 2024
Luap99 added a commit to Luap99/netavark that referenced this issue Dec 4, 2024
(cherry picked from commit 73e9911)