IPv6 link-local address enhancements #625
doc/ipv6/ipv6_link_local.md
7. Support to trap control packets destined to IPv6 link-local address to CPU.
8. Support filtering of packets with IPv6 link-local source or destination addresses. These packets must not be routed to other interfaces. This implies utilities like trace route are not applicable for IPv6 link-local addresses. Also, ping to link-local address is only applicable for directly connected networks.
9. Support BGP peering using unnumbered interface configuration. In this configuration, the IPv6 link-local address of the interface is used and the remote peer IPv6 link-local address is dynamically discovered to establish adjacency.
10. IPv6 mode is disabled by default on an interface.
As discussed, this can be changed to enabled as default and use some profile to disable
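Requirement 8 in the excerpt quoted above (packets with an IPv6 link-local source or destination must not be routed to other interfaces) can be illustrated with a small sketch. This is not the HLD's data-path implementation; the function names `is_ipv6_link_local` and `may_route` are hypothetical, and the only fact relied on is that IPv6 link-local addresses fall in `fe80::/10`.

```python
import ipaddress

# IPv6 link-local unicast prefix (RFC 4291).
LINK_LOCAL_V6 = ipaddress.ip_network("fe80::/10")


def is_ipv6_link_local(addr: str) -> bool:
    """Return True only for IPv6 addresses inside fe80::/10."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 6 and ip in LINK_LOCAL_V6


def may_route(src: str, dst: str) -> bool:
    """Hypothetical forwarding check: a packet whose source or destination
    is IPv6 link-local must stay on its link and is never routed."""
    return not (is_ipv6_link_local(src) or is_ipv6_link_local(dst))
```

For example, `may_route("fe80::1", "2001:db8::1")` is `False`, which is why traceroute across hops does not apply to link-local addresses.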
doc/ipv6/ipv6_link_local.md
`sysctl -w net.ipv6.conf.default.disable_ipv6=1`

Since the Linux kernel auto-generates the IPv6 link-local address per interface, netlink events for IPv6 address addition and deletion are handled by the IntfMgr. All netlink messages other than RTM_NEWADDR, RTM_DELADDR are ignored. It also ignores all addresses other than IPv6 link-local addresses.
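The filtering rule in the paragraph above can be sketched as follows. This is a minimal illustration, not the IntfMgr code: the `NetlinkEvent` type and `should_handle` function are hypothetical, and only the numeric values of `RTM_NEWADDR` and `RTM_DELADDR` are taken from the Linux rtnetlink header.

```python
import ipaddress
from dataclasses import dataclass

# Message type values from linux/rtnetlink.h.
RTM_NEWADDR = 20
RTM_DELADDR = 21


@dataclass(frozen=True)
class NetlinkEvent:
    """Hypothetical, already-parsed view of a netlink message."""
    msg_type: int
    ifname: str
    address: str  # textual address; may be empty for non-address messages


def should_handle(event: NetlinkEvent) -> bool:
    """Accept only address add/delete events for IPv6 link-local addresses."""
    if event.msg_type not in (RTM_NEWADDR, RTM_DELADDR):
        return False
    try:
        ip = ipaddress.ip_address(event.address)
    except ValueError:
        return False  # missing or malformed address
    return ip.version == 6 and ip.is_link_local
```

Everything else, including global IPv6 addresses and IPv4 link-local addresses such as 169.254.0.1, is dropped by this check.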
Currently, neither intfmgr nor any other manager listens for events from the kernel; these are handled by the corresponding *syncd. There was originally an intfsyncd, but it was later removed. We may want to revisit this section on how to handle netlink events for link-local addresses.
We initially considered using intfsyncd, but that can result in multiple producers for the INTF:PREFIX table (intfmgrd for configured addresses and intfsyncd for learned addresses).
To avoid race conditions and inconsistent behavior (for example, a VRF interface unbind in which intfmgrd deletes the INTF and INTF:PREFIX tables while IPv6 LL address updates arrive from intfsyncd at the same time), we enhanced intfmgrd to push the netlink address updates into the APP_DB.
I don't think we should re-introduce intfsyncd as before. However this approach of intfmgrd listening on netlink doesn't lgtm. Do you have any other proposal?
@prsunny As long as we have an interface table (e.g. INTF_TABLE:) in App-DB, creating a router interface in intfsorch for that interface in HW should be fine, right? Why do we need to worry about handling multiple IPv6 LLAs with a reference count?
# 4 Flow Diagrams
Please add this section with the flows, especially on LL IP2ME and /10 route installation and deletion
Done
"INTERFACE": {
    "Ethernet24": {
        "ipv6_use_link_local_only": "disable"
Do we throw an error if the user tries to configure a global IPv6 address when ipv6_use_link_local_only = "enable" for an L3 interface?
The ipv6_use_link_local_only config mode is in effect only in the absence of configured IPv6 global addresses on the interface. In the presence of global addresses, link-local is enabled implicitly anyway.
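The rule stated in this reply can be sketched as a small predicate. The function name `link_local_active` and its parameters are hypothetical and only restate the reply; this is not how cfgmgr is implemented, and it does not settle the naming question raised in this thread.

```python
from typing import Sequence


def link_local_active(use_link_local_only: str,
                      global_ipv6_addrs: Sequence[str]) -> bool:
    """Return True when IPv6 link-local is expected to be active on an interface.

    use_link_local_only: the "enable"/"disable" value of the
        ipv6_use_link_local_only field.
    global_ipv6_addrs: configured global IPv6 addresses on the interface.
    """
    if global_ipv6_addrs:
        # Any configured global address enables link-local implicitly,
        # so the field has no effect in this case.
        return True
    # With no global addresses, the field alone decides.
    return use_link_local_only == "enable"
```

Under this reading, configuring a global address while the field is "enable" is not an error; the field simply stops mattering until the last global address is removed.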
Even if we introduce an 'ipv6_enable' field, it is implicitly enabled upon the first global IPv6 address configuration. That is aligned with exactly what we do to the kernel in the cfgmgr.
By default IPv6 is disabled on the interface. After a few IPv6 addresses have been configured, if we need to go back to the default state (i.e. IPv6 disabled on the interface), do we have any field for that? ipv6_use_link_local_only = disable should not disable the IPv6 functionality on the interface. Or do we need to assume in the cfgmgr that IPv6 functionality is disabled when no IPv6 address is present on the interface? IMO, the field name should be based on exactly what we do in the backend, to avoid any confusion.
F - PBR, f - OpenFabric,
> - selected route, * - FIB route # - not installed in hardware

B 192.168.0.0/24 [20/0] via fe80::5054:ff:fe03:6175, Ethernet0, 00:00:06
Since FRR/zebra is already pushing the 169.254.0.1 IPv4 link-local NH for IPv4 routes to the kernel, let's use the same for the HW programming flow (fpmsyncd) as well. We're already listening for neighbor events from the kernel, so the existing code should work for this case too. Please share the issues you have encountered.
The unnumbered routes need to be sent to fpm with a special attribute (e.g. onlink) to confine next-hop resolution to the link on which the route is being programmed. Zebra doesn't pass any such attribute, and route orch cannot handle it today. Also, the same neighbor entry 169.254.0.1 will appear on multiple interfaces, which is not expected and requires special handling in SONiC. Note that not supporting RFC 5549 next-hops natively is a limitation of the Linux kernel, and what zebra does is a hack.
Taking all of the above into consideration, zebra is patched to send the IPv6 next-hop along with the route. This approach is much cleaner and avoids any special neigh/next-hop handling in neigh/route orch.
Hasan - Agree that this solution will work; I'm trying to understand the advantage of deviating from the current kernel and FPM flow behaviors. I'm not aware of any FRR user other than SONiC deviating like this by introducing a change in the FRR module. I believe the change in FRR will stay as a PATCH and we won't be able to merge it with the FRR open-source code, i.e. someone in SONiC has to maintain the SONiC-specific FRR changes. (Has any discussion with the FRR community already happened on this change?)
> The unnumbered routes needs to be sent to fpm with special attribute (e.g. onlink) to restrict next-hop resolution to be confined to the link on which route is getting programmed.

How is this handled in the kernel? I believe we don't expect NH resolution in the kernel either. If FRR restricts it by adding the static IPv4 link-local neighbor (169.254.0.1), can't we follow the same in the FPM flow as well?

> Also, same neighbor entry 169.254.0.1 will appear on multiple interfaces, and this is also something not expected and requires special handling in sonic.

This is no different from any IPv6 LLA neighbor, i.e. the same LLA can be present on multiple interfaces. If we already use "SAI_NEIGHBOR_ENTRY_ATTR_NO_HOST_ROUTE" for the IPv6 LLA neighbor, why can't we use the same for the IPv4 LLA neighbor?
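The point that the same link-local neighbor address can legitimately appear on several interfaces can be illustrated with a table keyed by interface and address together. This is a hypothetical sketch for the discussion only; `NeighborTable` is not a SONiC class, and the MAC values are made up for the example.

```python
from typing import Dict, Optional, Tuple


class NeighborTable:
    """Hypothetical neighbor store keyed by (interface, IP address).

    Keying on the pair lets one link-local address, such as 169.254.0.1
    or an fe80:: address, exist once per interface without collisions.
    """

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], str] = {}

    def add(self, ifname: str, ip: str, mac: str) -> None:
        self._entries[(ifname, ip)] = mac

    def remove(self, ifname: str, ip: str) -> None:
        self._entries.pop((ifname, ip), None)

    def lookup(self, ifname: str, ip: str) -> Optional[str]:
        return self._entries.get((ifname, ip))


table = NeighborTable()
# Illustrative values only: same neighbor IP learned on two interfaces.
table.add("Ethernet0", "169.254.0.1", "52:54:00:03:61:75")
table.add("Ethernet4", "169.254.0.1", "52:54:00:aa:bb:cc")
```

A lookup by IP alone would be ambiguous here, which is the special handling both commenters are weighing for IPv4 and IPv6 link-local neighbors.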
Can we add these details to the HLD section? @hasan-brcm, can we address this concern from Venkat as part of this feature?
Hi, I am from Keysight. I was going through the test case coverage section of the design document,
Let me know what your opinion is about including these functional tests in the design document.
Updated the test cases section with 1 and 2. 3 is more specific to BGP functionality.
WG discussion points:
@prsunny Updated the HLD as per the new design.
As per HLD - sonic-net/SONiC#625

FRR Patches:

0009-Link-local-scope-was-not-set-while-binding-socket-for-bgp-ipv6-link-local-neighbors.patch
Files modified : bgpd_network.c and bgpd/bgp_zebra.c
Fix for : Link local scope was not set while binding socket with local address causing socket errors for bgp ipv6 link local neighbors.

0010-VRF-interface-lookup-was-still-done-in-the-default-vrf.patch
Files modified : staticd/static_zebra.c
Fix for : VRF interface lookup was still done in the default-vrf which was causing the interface lookup to fail. Due to this static-route pointing to link-local was not getting installed.

0011-Changes-to-send-ipv6-link-local-address-as-nexthop-to-fpmsyncd.patch
Files modified : zebra/zebra_fpm_netlink.c
Fix for : Made changes to send ipv6 address as nexthop to fpmsyncd.

Depends on: sonic-net/sonic-utilities#1159 sonic-net/sonic-swss#1463
Signed-off-by: Akhilesh Samineni [email protected]
This document covers the design details for the following enhancements: