[BUG] IPv6 is not working over multi-access clab links #1669

Open
ipspace opened this issue Dec 19, 2024 · 10 comments
Labels
bug Something isn't working

Comments

Owner

ipspace commented Dec 19, 2024

IPv6 is not working between three nodes connected to the same link. The same setup works with libvirt, meaning we probably have a problem with the setup of the netlab-created Linux bridge.

To Reproduce

Run the following topology with the libvirt provider. X1 can ping X2. Restart the same topology with the clab provider. X1 can reach X2 over IPv4 but not over IPv6.

Lab topology

---
addressing:
  loopback:
    ipv6: 2001:db8:1::/48
  lan:
    ipv6: 2001:db8:2::/48

defaults.device: linux

nodes:
  x1:
  x2:
  x3:

links:
- x1-x2-x3

Version

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy
@ipspace ipspace added the bug Something isn't working label Dec 19, 2024
Owner Author

ipspace commented Dec 19, 2024

So far I've identified these differences in the net.ipv6.conf.brname sysctl settings. Applying the libvirt values (below) to the netlab-created bridge did not help.

accept_ra = 0
autoconf = 0
disable_ipv6 = 1
mtu = 1500

There must also be something in the /sys/devices/virtual/net/brname settings.
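One way to compare the per-interface IPv6 settings side by side is to read them straight from /proc/sys. This is only a minimal sketch: `read_ipv6_conf` and `diff_conf` are hypothetical helpers (not part of netlab), and the bridge names in the usage note are placeholders.

```python
from pathlib import Path


def read_ipv6_conf(ifname: str, root: str = "/proc/sys/net/ipv6/conf") -> dict:
    """Return {setting: value} for every readable file under <root>/<ifname>."""
    values = {}
    for entry in sorted(Path(root, ifname).iterdir()):
        if not entry.is_file():
            continue
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some entries cannot be read (unset or root-only); skip them
            continue
    return values


def diff_conf(a: dict, b: dict) -> dict:
    """Return {setting: (value_a, value_b)} for settings whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

For example, `diff_conf(read_ipv6_conf("virbr1"), read_ipv6_conf("X_1"))` would list only the settings that differ between a libvirt bridge and a netlab-created one (both names assumed here).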

Owner Author

ipspace commented Dec 19, 2024

Snooping on x1_eth1 and x2_eth1 (the two clab interfaces connected to the netlab-created Linux bridge) confirms that the Linux bridge drops IPv6 multicast packets: the neighbor solicitation is seen on the x1 interface but never arrives on the x2 interface.

Before someone starts quoting decade-old hits from Google: yes, I did echo 0 >/sys/class/net/X_1/bridge/multicast_snooping and no, it did not help.

Any working solution would be highly appreciated, I'm out of ideas.

Owner Author

ipspace commented Dec 19, 2024

The mystery deepens: IPv6 MLD snooping works correctly. The bridge mdb table is populated:

dev X_1 port x1_eth1 grp ff02::1:ff00:1 temp
dev X_1 port x3_eth1 grp ff02::1:ff00:3 temp
dev X_1 port x2_eth1 grp ff02::1:ff00:2 temp
dev X_1 port X_1 grp ff02::fb temp
dev X_1 port X_1 grp ff02::1:3 temp
dev X_1 port x3_eth1 grp ff02::1:ff1c:7d48 temp
dev X_1 port x2_eth1 grp ff02::1:fffa:7833 temp
dev X_1 port x1_eth1 grp ff02::1:ff1c:997 temp
dev X_1 port X_1 grp ff02::6a temp
dev X_1 port X_1 grp ff02::1:ff9a:1c59 temp

However, the traffic does not get across.
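The groups of the form ff02::1:ffXX:XXXX in the table above are solicited-node multicast addresses, built from the low 24 bits of each interface address. A small sketch for cross-checking the table against the addresses configured on the nodes: `parse_mdb` and `solicited_node` are hypothetical helpers, the parser assumes the exact line format printed above, and the address in the usage note is illustrative rather than taken from the lab.

```python
import ipaddress
from collections import defaultdict


def parse_mdb(text: str) -> dict:
    """Map each bridge port to the list of multicast groups it has joined."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: dev <bridge> port <port> grp <group> [flags...]
        if (len(fields) >= 6 and fields[0] == "dev"
                and fields[2] == "port" and fields[4] == "grp"):
            groups[fields[3]].append(fields[5])
    return dict(groups)


def solicited_node(addr: str) -> str:
    """Return the solicited-node multicast address for an IPv6 unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return str(ipaddress.IPv6Address(base | low24))
```

With these, `solicited_node("2001:db8:2::1") in parse_mdb(output)["x1_eth1"]` checks whether the port facing x1 joined the group that neighbor solicitations for that address are sent to.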

Collaborator

jbemmel commented Dec 19, 2024

Which device are you testing with, frr?

Any difference with a vrnetlab based device?

Owner Author

ipspace commented Dec 19, 2024

> Which device are you testing with, frr?

Doesn't matter -- Linux, FRR, EOS, same behavior.

> Any difference with a vrnetlab based device?

No.

Owner Author

ipspace commented Dec 19, 2024

Works on a fresh Ubuntu VM. Must be some super-weird leftover in my server setup :(

Collaborator

jbemmel commented Jan 12, 2025

I am seeing the same issue. Using dropwatch, I've narrowed it down to netfilter drops on the veth pair from the router to the Linux bridge (reported on the ifindex in the host namespace):

drop at: nft_do_chain+0x4db/0x650 [nf_tables] (0xffffffffc196987b)
origin: software
input port ifindex: 45
timestamp: Sun Jan 12 06:15:56 2025 810126096 nsec
protocol: 0x86dd
length: 118
original length: 118
drop reason: NETFILTER_DROP

The router advertisements are 118 bytes long and carry EtherType 0x86dd (IPv6); ifindex 45 is the router's veth port in the host namespace (with MTU 9500).

uname -r: 6.8.0-51-generic

jeroen@j:~/Projects/netlab/tests$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 24.04.1 LTS
Release:	24.04
Codename:	noble

A reboot does not fix it

@jbemmel jbemmel reopened this Jan 12, 2025
Collaborator

jbemmel commented Jan 12, 2025

FOUND IT:

sudo nft flush chain ip6 filter FORWARD
sudo nft 'add chain ip6 filter FORWARD { policy accept; }'

fixes it

===
The root cause is that, at least on my Ubuntu setup, the default netfilter forwarding policy for IPv6 is "drop".
Libvirt adds a forwarding rule that bypasses the default, but Containerlab does not.

Specifically, https://github.com/srl-labs/containerlab/blob/38ea59a576ec8a0d68722245b9c51cbcd8975d4c/runtime/docker/firewall/nftables/client.go#L110
only handles IPv4.

Talk about networking SNAFUs...
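To check whether a host is affected before starting a lab, the policy of the ip6 FORWARD chain can be read from `nft list chain ip6 filter FORWARD`. A rough sketch: `chain_policy` and `ip6_forward_policy` are hypothetical helpers, and the parser assumes the usual `policy <verdict>;` wording in the chain header, so treat it as illustrative rather than a robust nftables parser.

```python
import re
import subprocess


def chain_policy(nft_output: str):
    """Return 'accept' or 'drop' from a base chain listing, or None if absent."""
    match = re.search(r"policy\s+(accept|drop)\s*;", nft_output)
    return match.group(1) if match else None


def ip6_forward_policy():
    """Run nft (needs sudo) and return the ip6 filter FORWARD policy, or None."""
    result = subprocess.run(
        ["sudo", "nft", "list", "chain", "ip6", "filter", "FORWARD"],
        capture_output=True, text=True)
    if result.returncode != 0:
        # Table or chain does not exist, or nft could not be executed
        return None
    return chain_policy(result.stdout)
```

A return value of "drop" would match the behavior described above; None means the chain does not exist or could not be listed, in which case this particular policy is not what is dropping the traffic.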

Collaborator

jbemmel commented Jan 12, 2025

I've submitted a ticket to Containerlab, as I think this should be solved there: srl-labs/containerlab#2389

In the meantime, potential fix in providers/clab.py:

def pre_start_lab(self, topology: Box) -> None:
    log.print_verbose('pre-start hook for Containerlab - create any bridges and load kernel modules')

    # Make sure ipv6 forwarding is allowed by netfilter
    status = external_commands.run_command(
        ['sudo','nft','add chain ip6 filter FORWARD { policy accept; }'],
        check_result=True)
    if status is False:
      return

    for brname in list_bridges(topology):
      if use_ovs_bridge(topology):
        create_ovs_bridge(brname)
      else:
        create_linux_bridge(brname)
    load_kmods(topology)

We may need to check if nft is available, or issue a warning (but continue) in case the command fails
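A sketch of that "warn but continue" variant, under a few assumptions: it uses `subprocess.run` and `shutil.which` from the standard library instead of netlab's `external_commands.run_command`, `allow_ipv6_forwarding` is a hypothetical name, and the `run`/`which` parameters exist only so the function can be exercised without touching the host firewall.

```python
import logging
import shutil
import subprocess

log = logging.getLogger(__name__)


def allow_ipv6_forwarding(run=subprocess.run, which=shutil.which) -> bool:
    """Try to set the ip6 FORWARD policy to accept; warn and return False on failure."""
    if which("nft") is None:
        log.warning("nft not found; leaving the IPv6 forwarding policy unchanged")
        return False

    # Same nft command as in the proposed patch above
    cmd = ["sudo", "nft", "add chain ip6 filter FORWARD { policy accept; }"]
    result = run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        log.warning("nft failed (%s); IPv6 may not work over Linux bridges",
                    result.stderr.strip())
        return False
    return True
```

The pre-start hook could call this and ignore the return value, so that bridge creation and kernel module loading still happen on hosts without nft.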

Collaborator

jbemmel commented Jan 12, 2025

Submitted PR to Containerlab to fix it at the root: srl-labs/containerlab#2390

Will see how long it will take to get released
