[202205][dualtor][active-standby] The routes to tun0 are removed after shutdown all ports in a vlan #11924

Closed
lolyu opened this issue Sep 1, 2022 · 11 comments · Fixed by sonic-net/sonic-linux-kernel#293 or sonic-net/sonic-swss#2469
Labels: Bug 🐛, Issue for 202205, Triaged

Comments

@lolyu
Contributor

lolyu commented Sep 1, 2022

Description

On a dualtor setup, when all mux ports on one ToR are active and are members of the same VLAN, shutting those mux ports down one by one triggers a problem: after the last up port is shut down, all routes to tun0 for the previously shut-down ports are removed, and the routes to the mux servers are directed to the default route.

Steps to reproduce the issue:

  1. Shut down the active ports one by one, except port Ethernet124:
# show vlan config
Name        VID  Member       Mode
--------  -----  -----------  --------
Vlan1000   1000  Ethernet0    untagged
Vlan1000   1000  Ethernet4    untagged
Vlan1000   1000  Ethernet8    untagged
Vlan1000   1000  Ethernet12   untagged
Vlan1000   1000  Ethernet16   untagged
Vlan1000   1000  Ethernet20   untagged
Vlan1000   1000  Ethernet40   untagged
Vlan1000   1000  Ethernet44   untagged
Vlan1000   1000  Ethernet48   untagged
Vlan1000   1000  Ethernet52   untagged
Vlan1000   1000  Ethernet56   untagged
Vlan1000   1000  Ethernet60   untagged
Vlan1000   1000  Ethernet64   untagged
Vlan1000   1000  Ethernet68   untagged
Vlan1000   1000  Ethernet72   untagged
Vlan1000   1000  Ethernet76   untagged
Vlan1000   1000  Ethernet80   untagged
Vlan1000   1000  Ethernet84   untagged
Vlan1000   1000  Ethernet104  untagged
Vlan1000   1000  Ethernet108  untagged
Vlan1000   1000  Ethernet112  untagged
Vlan1000   1000  Ethernet116  untagged
Vlan1000   1000  Ethernet120  untagged
Vlan1000   1000  Ethernet124  untagged
# show mux s
PORT         STATUS    SERVER_STATUS    HEALTH    HWSTATUS    LAST_SWITCHOVER_TIME
-----------  --------  ---------------  --------  ----------  ---------------------------
Ethernet0    active    active           healthy   consistent  2022-Sep-01 03:02:07.024764
Ethernet4    active    active           healthy   consistent  2022-Sep-01 03:02:06.474897
Ethernet8    active    active           healthy   consistent  2022-Sep-01 03:02:06.523995
Ethernet12   active    active           healthy   consistent  2022-Sep-01 03:02:06.880076
Ethernet16   active    active           healthy   consistent  2022-Sep-01 03:02:07.397831
Ethernet20   active    active           healthy   consistent  2022-Sep-01 03:02:06.676979
Ethernet40   active    active           healthy   consistent  2022-Sep-01 03:01:28.099248
Ethernet44   active    active           healthy   consistent  2022-Sep-01 03:02:07.191180
Ethernet48   active    active           healthy   consistent  2022-Sep-01 03:02:06.642033
Ethernet52   active    active           healthy   consistent  2022-Sep-01 03:02:06.376834
Ethernet56   active    active           healthy   consistent  2022-Sep-01 03:02:07.505453
Ethernet60   active    active           healthy   consistent  2022-Sep-01 03:02:06.425438
Ethernet64   active    active           healthy   consistent  2022-Sep-01 03:02:07.095702
Ethernet68   active    active           healthy   consistent  2022-Sep-01 03:02:06.724604
Ethernet72   active    active           healthy   consistent  2022-Sep-01 03:02:06.249699
Ethernet76   active    active           healthy   consistent  2022-Sep-01 03:02:07.240711
Ethernet80   active    active           healthy   consistent  2022-Sep-01 03:02:06.784232
Ethernet84   active    active           healthy   consistent  2022-Sep-01 03:02:06.328413
Ethernet104  active    active           healthy   consistent  2022-Sep-01 03:02:06.927865
Ethernet108  active    active           healthy   consistent  2022-Sep-01 03:01:26.981678
Ethernet112  active    active           healthy   consistent  2022-Sep-01 03:02:06.831614
Ethernet116  active    active           healthy   consistent  2022-Sep-01 03:02:06.975433
Ethernet120  active    active           healthy   consistent  2022-Sep-01 03:02:07.143938
Ethernet124  active    active           healthy   consistent  2022-Sep-01 03:02:07.312000
# config interface shutdown Ethernet0
# config interface shutdown Ethernet4
# config interface shutdown Ethernet8
# config interface shutdown Ethernet12
# config interface shutdown Ethernet16
# config interface shutdown Ethernet20
# config interface shutdown Ethernet40
# config interface shutdown Ethernet44
# config interface shutdown Ethernet48
# config interface shutdown Ethernet52
# config interface shutdown Ethernet56
# config interface shutdown Ethernet60
# config interface shutdown Ethernet64
# config interface shutdown Ethernet68
# config interface shutdown Ethernet72
# config interface shutdown Ethernet76
# config interface shutdown Ethernet80
# config interface shutdown Ethernet84
# config interface shutdown Ethernet104
# config interface shutdown Ethernet108
# config interface shutdown Ethernet112
# config interface shutdown Ethernet116
# config interface shutdown Ethernet120
  2. Shut down port Ethernet124.

Describe the results you received:

  1. All mux server neighbors are flushed:
# ip neighbor | grep 192.168
  2. Routes to the mux servers are directed to the default route:
# show ip route 192.168.0.25
Routing entry for 0.0.0.0/0
  Known via "bgp", distance 20, metric 0, best
  Last update 00:22:36 ago
  * 10.0.0.57, via PortChannel101, weight 1
  * 10.0.0.59, via PortChannel102, weight 1
  * 10.0.0.61, via PortChannel103, weight 1
  * 10.0.0.63, via PortChannel104, weight 1

Describe the results you expected:

The routes to the mux servers should have tun0 as the nexthop.
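
For anyone scripting this check, here is a minimal sketch that parses the `show ip route` output pasted above and reports whether the lookup fell through to the default route. The function names are hypothetical, and the parsing only assumes the `Routing entry for <prefix>` header and the `* <ip>, via <interface>` nexthop lines seen in this issue; the exact layout of a healthy tun0 route is not shown here, so it is not modeled.

```python
import re
from typing import List, Optional

# Output copied from "show ip route 192.168.0.25" in this issue.
SAMPLE = """\
Routing entry for 0.0.0.0/0
  Known via "bgp", distance 20, metric 0, best
  Last update 00:22:36 ago
  * 10.0.0.57, via PortChannel101, weight 1
  * 10.0.0.59, via PortChannel102, weight 1
  * 10.0.0.61, via PortChannel103, weight 1
  * 10.0.0.63, via PortChannel104, weight 1
"""


def matched_prefix(route_output: str) -> Optional[str]:
    """Return the prefix from the 'Routing entry for <prefix>' header, if present."""
    m = re.search(r"^Routing entry for (\S+)", route_output, re.MULTILINE)
    return m.group(1) if m else None


def nexthop_interfaces(route_output: str) -> List[str]:
    """Return the interface named after 'via' on each '*' nexthop line."""
    return re.findall(r"^\s*\*\s.*?\bvia (\S+?)(?:,|$)", route_output, re.MULTILINE)


def fell_back_to_default(route_output: str) -> bool:
    """True when the lookup for a server IP matched only the IPv4 default route."""
    return matched_prefix(route_output) == "0.0.0.0/0"


if __name__ == "__main__":
    print(matched_prefix(SAMPLE))
    print(nexthop_interfaces(SAMPLE))
    print(fell_back_to_default(SAMPLE))
```

A test could call `fell_back_to_default` on the route output for each mux server IP after the last port is shut down and fail if any of them returns True.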

Output of show version:

# show version

SONiC Software Version: SONiC.20220531.05
Distribution: Debian 11.4
Kernel: 5.10.0-12-2-amd64
Build commit: edaa08b626
Build date: Tue Aug 23 18:50:46 UTC 2022
Built by: cloudtest@e5e9faeac000005

Syslog after shutting down the last up port, Ethernet124:

Sep  1 03:08:15.447036 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 alias to Ethernet32/1
Sep  1 03:08:15.447036 str2-7050cx3-acs-10 NOTICE swss#buffermgrd: :- doSpeedUpdateTask: Reusing existing profile 'pg_lossless_50000_300m_profile'
Sep  1 03:08:15.447036 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 description to Servers23:eth0
Sep  1 03:08:15.447557 str2-7050cx3-acs-10 NOTICE swss#buffermgrd: :- doSpeedUpdateTask: PG to Buffer Profile Mapping Ethernet124|3-4 already present
Sep  1 03:08:15.448594 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 index to 32
Sep  1 03:08:15.450670 str2-7050cx3-acs-10 WARNING mux#linkmgrd: MuxManager.cpp:262 addOrUpdateMuxPortLinkState: Ethernet124: link state: up
Sep  1 03:08:15.450670 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 lanes to 125,126
Sep  1 03:08:15.450727 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 mux_cable to true
Sep  1 03:08:15.450727 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 pfc_asym to off
Sep  1 03:08:15.451150 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 speed to 50000
Sep  1 03:08:15.451510 str2-7050cx3-acs-10 NOTICE pmon#xcvrd[47]: message repeated 6 times: [ CMIS: Ethernet120: skipping CMIS state machine for flat memory xcvr]
Sep  1 03:08:15.451790 str2-7050cx3-acs-10 NOTICE pmon#xcvrd[47]: CMIS: Ethernet124: skipping CMIS state machine for flat memory xcvr
Sep  1 03:08:15.458645 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 tpid to 0x8100
Sep  1 03:08:15.458645 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- setPortPfcAsym: Already set asymmetric PFC mode: off
Sep  1 03:08:15.458813 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- doPortTask: Set port Ethernet124 asymmetric PFC to off
Sep  1 03:08:15.462984 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 MTU to 9100
Sep  1 03:08:15.486488 str2-7050cx3-acs-10 INFO kernel: [11760.118032] Bridge: port 25(Ethernet124) entered disabled state
Sep  1 03:08:15.490281 str2-7050cx3-acs-10 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet124 admin:0 oper:0 addr:94:8e:d3:04:eb:28 ifindex:210 master:148
Sep  1 03:08:15.490281 str2-7050cx3-acs-10 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet124(ok:down) to state db
Sep  1 03:08:15.490281 str2-7050cx3-acs-10 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet124 admin:0 oper:0 addr:94:8e:d3:04:eb:28 ifindex:210 master:148
Sep  1 03:08:15.495737 str2-7050cx3-acs-10 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet124(ok:down) to state db
Sep  1 03:08:15.501338 str2-7050cx3-acs-10 NOTICE swss#portmgrd: :- doTask: Configure Ethernet124 admin status to down
Sep  1 03:08:15.528103 str2-7050cx3-acs-10 INFO syncd#syncd: [none] SAI_API_PORT:_brcm_sai_link_event_cb:1128 Port 127 link down event cause: ADMIN_DOWN
Sep  1 03:08:15.535324 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- doPortTask: Set port Ethernet124 admin status to down
Sep  1 03:08:15.539673 str2-7050cx3-acs-10 INFO lldp#lldpmgrd[28]: message repeated 7 times: [ port Ethernet120 is not up, continue]
Sep  1 03:08:15.540204 str2-7050cx3-acs-10 INFO lldp#lldpmgrd[28]: port Ethernet124 is not up, continue
Sep  1 03:08:15.543372 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor b2:4a:cc:01:11:16 on Vlan1000
Sep  1 03:08:15.546178 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.11/32
Sep  1 03:08:15.547698 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 1a:e3:c2:94:ab:18 on Vlan1000
Sep  1 03:08:15.551238 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.12/32
Sep  1 03:08:15.560354 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor ae:a7:96:49:dc:1a on Vlan1000
Sep  1 03:08:15.560882 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.13/32
Sep  1 03:08:15.561248 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 5e:3a:62:d8:94:1c on Vlan1000
Sep  1 03:08:15.561570 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.14/32
Sep  1 03:08:15.561879 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 62:de:12:e5:84:22 on Vlan1000
Sep  1 03:08:15.562498 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.17/32
Sep  1 03:08:15.562498 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor de:53:ce:77:ee:24 on Vlan1000
Sep  1 03:08:15.562498 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.18/32
Sep  1 03:08:15.562498 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor b6:04:15:03:b7:26 on Vlan1000
Sep  1 03:08:15.574591 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.19/32
Sep  1 03:08:15.574591 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor c2:35:04:22:6e:00 on Vlan1000
Sep  1 03:08:15.574591 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.2/32
Sep  1 03:08:15.574591 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor f2:48:0c:52:18:2c on Vlan1000
Sep  1 03:08:15.574652 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.20/32
Sep  1 03:08:15.575826 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor f2:c1:b4:8e:72:2e on Vlan1000
Sep  1 03:08:15.580361 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.21/32
Sep  1 03:08:15.581818 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 0e:71:0b:70:83:30 on Vlan1000
Sep  1 03:08:15.585775 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.22/32
Sep  1 03:08:15.587166 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 5a:c2:92:e4:3a:34 on Vlan1000
Sep  1 03:08:15.598858 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.24/32
Sep  1 03:08:15.632598 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed next hop 192.168.0.25 on Vlan1000
Sep  1 03:08:15.634965 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 7e:82:b6:c1:81:36 on Vlan1000
Sep  1 03:08:15.634965 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 6e:72:8e:fb:50:02 on Vlan1000
Sep  1 03:08:15.637251 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.3/32
Sep  1 03:08:15.638062 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor f2:44:50:b2:99:06 on Vlan1000
Sep  1 03:08:15.639849 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.5/32
Sep  1 03:08:15.640567 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 32:34:ee:10:d2:0a on Vlan1000
Sep  1 03:08:15.642100 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.7/32
Sep  1 03:08:15.642544 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 7a:6c:a4:e7:22:12 on Vlan1000
Sep  1 03:08:15.644843 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.9/32
Sep  1 03:08:15.645835 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- doTask: Get port state change notification id:1000000000038 status:2
Sep  1 03:08:15.646069 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet124 oper state set from up to down
Sep  1 03:08:15.647257 str2-7050cx3-acs-10 WARNING mux#linkmgrd: message repeated 9 times: [ MuxManager.cpp:262 addOrUpdateMuxPortLinkState: Ethernet124: link state: up]
Sep  1 03:08:15.647355 str2-7050cx3-acs-10 WARNING mux#linkmgrd: MuxManager.cpp:262 addOrUpdateMuxPortLinkState: Ethernet124: link state: down
Sep  1 03:08:15.647419 str2-7050cx3-acs-10 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveStandby.cpp:531 handleStateChange: Ethernet124: Received link state event, new state: Down
Sep  1 03:08:15.647465 str2-7050cx3-acs-10 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveStandby.cpp:302 switchMuxState: Ethernet124: Switching MUX state to 'Standby'
Sep  1 03:08:15.647512 str2-7050cx3-acs-10 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveStandby.cpp:553 handleStateChange: Ethernet124: (P: Active, M: Active, L: Up) -> (P: Active, M: Wait, L: Down)
Sep  1 03:08:15.647558 str2-7050cx3-acs-10 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveStandby.cpp:231 setLabel: Ethernet124: Linkmgrd state is: Wait Unhealthy
Sep  1 03:08:15.650604 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet124
Sep  1 03:08:15.650694 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- flushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 2
Sep  1 03:08:15.650742 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- recordFlushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 2
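
To make the burst of removals in this syslog easier to quantify, the following is a rough sketch that collects the orchagent `remove_route` and `removeNeighbor` messages. The function name and the returned dictionary layout are hypothetical; the two patterns are taken only from the log lines in this issue and may not cover other orchagent message variants.

```python
import re
from typing import Dict, Iterable, List

# Patterns based solely on the orchagent messages quoted in this issue.
ROUTE_RE = re.compile(r"remove_route: Removed tunnel route to (\S+)")
NEIGH_RE = re.compile(r"removeNeighbor: Removed neighbor (\S+) on (\S+)")

# Three lines copied from the syslog above.
SAMPLE_LINES = [
    "Sep  1 03:08:15.543372 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor b2:4a:cc:01:11:16 on Vlan1000",
    "Sep  1 03:08:15.546178 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- remove_route: Removed tunnel route to 192.168.0.11/32",
    "Sep  1 03:08:15.632598 str2-7050cx3-acs-10 NOTICE swss#orchagent: :- removeNeighbor: Removed next hop 192.168.0.25 on Vlan1000",
]


def summarize_removals(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect removed tunnel-route prefixes and removed neighbors (as 'mac@vlan')."""
    summary: Dict[str, List[str]] = {"tunnel_routes": [], "neighbors": []}
    for line in lines:
        route = ROUTE_RE.search(line)
        if route:
            summary["tunnel_routes"].append(route.group(1))
            continue
        neigh = NEIGH_RE.search(line)
        if neigh:
            summary["neighbors"].append(f"{neigh.group(1)}@{neigh.group(2)}")
    return summary


if __name__ == "__main__":
    print(summarize_removals(SAMPLE_LINES))
```

Note that the "Removed next hop" message is deliberately not counted as a neighbor removal in this sketch.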
@lolyu
Contributor Author

lolyu commented Sep 1, 2022

Hi @prsunny, could you please take a look at this issue?

@lolyu
Contributor Author

lolyu commented Sep 1, 2022

If we shut down all mux ports while they are all standby, the routes to tun0 for the mux servers are also removed, and the routes are directed to the default route instead.

@lolyu
Contributor Author

lolyu commented Sep 1, 2022

Recent link down test failures are due to this:

test_standby_tor_downlink_down_downstream_standby[active-standby]
test_active_tor_downlink_down_downstream_active[active-standby]
test_active_link_down_downstream_active[active-standby]
test_standby_link_down_downstream_standby[active-standby]

@prsunny
Contributor

prsunny commented Sep 1, 2022

@lolyu, is admin-downing the VLAN ports a valid use case?

@lolyu
Contributor Author

lolyu commented Sep 1, 2022

> @lolyu, is admin-downing the VLAN ports a valid use case?

Yes. These failures appear to be a regression, since the tests passed previously. Could you please help identify the change that caused this?

@lolyu
Contributor Author

lolyu commented Sep 8, 2022

The link-down test cases pass on the 202012 image but fail on the 202205 image.

@prsunny
Contributor

prsunny commented Sep 8, 2022

@Ndancejic, can you please take a look at this?

zhangyanzhao added the Triaged label Sep 14, 2022
@Ndancejic
Contributor

Before shutdown, ip link show reports Vlan1000 and Bridge as up, and the ARP entry for the neighbor is present:

admin@vlab-06:~$ ip link show Vlan1000
111: Vlan1000@Bridge: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 9100 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:aa:bb:cc:dd:ee brd ff:ff:ff:ff:ff:ff

admin@vlab-06:~$ ip link show Bridge
107: Bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9100 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:aa:bb:cc:dd:ee brd ff:ff:ff:ff:ff:ff

admin@vlab-06:~$ show arp 192.168.0.12
Address       MacAddress         Iface      Vlan
------------  -----------------  -------  ------
192.168.0.12  00:01:22:04:05:07  -          1000

After shutdown, on 202205, ip link show reports Vlan1000 and Bridge as down, and the ARP entry for the neighbor is removed:

admin@vlab-06:~$ ip link show Vlan1000
111: Vlan1000@Bridge: <NO-CARRIER,BROADCAST,MULTICAST,ALLMULTI,UP> mtu 9100 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether 00:aa:bb:cc:dd:ee brd ff:ff:ff:ff:ff:ff

admin@vlab-06:~$ ip link show Bridge
107: Bridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9100 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:aa:bb:cc:dd:ee brd ff:ff:ff:ff:ff:ff

admin@vlab-06:~$ show arp 192.168.0.12
Address    MacAddress    Iface    Vlan
---------  ------------  -------  ------
Total number of entries 0 

On 202012, however, the ARP entry remains even though both devices are down:

admin@svcstr-7050-acs-3:~$ ip link show Vlan1000  
105: Vlan1000@Bridge: <NO-CARRIER,BROADCAST,MULTICAST,ALLMULTI,UP> mtu 9100 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000  
    link/ether 00:aa:bb:cc:dd:ee brd ff:ff:ff:ff:ff:ff  
    
admin@svcstr-7050-acs-3:~$ ip link show Bridge  
102: Bridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9100 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000  
    link/ether 00:aa:bb:cc:dd:ee brd ff:ff:ff:ff:ff:ff  
    
admin@svcstr-7050-acs-3:~$ show arp 192.168.0.12  
Address       MacAddress         Iface      Vlan  
------------  -----------------  -------  ------  
192.168.0.12  00:01:22:04:05:07  -          1000  
Total number of entries 1

This is a behavior change between 202012 and 202205. I will research further why this is happening.
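
As a small aid for comparing the two images, here is a sketch that extracts the flags and operational state from the first line of `ip link show` output. The helper names are hypothetical, and the parsing relies only on the `<FLAG,FLAG,...>` and `state <STATE>` fields visible in the transcripts above.

```python
import re
from typing import Optional, Set

# First lines copied from the "ip link show Vlan1000" transcripts above.
VLAN_UP = "111: Vlan1000@Bridge: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 9100 qdisc noqueue state UP mode DEFAULT group default qlen 1000"
VLAN_DOWN = "111: Vlan1000@Bridge: <NO-CARRIER,BROADCAST,MULTICAST,ALLMULTI,UP> mtu 9100 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000"


def link_flags(ip_link_output: str) -> Set[str]:
    """Return the set of flags listed between '<' and '>'."""
    m = re.search(r"<([^>]*)>", ip_link_output)
    return set(m.group(1).split(",")) if m else set()


def oper_state(ip_link_output: str) -> Optional[str]:
    """Return the token following 'state', e.g. UP, DOWN or LOWERLAYERDOWN."""
    m = re.search(r"\bstate (\S+)", ip_link_output)
    return m.group(1) if m else None


def lost_carrier(ip_link_output: str) -> bool:
    """True when the device reports the NO-CARRIER flag."""
    return "NO-CARRIER" in link_flags(ip_link_output)


if __name__ == "__main__":
    for name, out in (("before", VLAN_UP), ("after", VLAN_DOWN)):
        print(name, oper_state(out), lost_carrier(out))
```

Both images report NO-CARRIER after the shutdown, so this check alone does not distinguish them; the difference is whether the ARP entries survive that state.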

@lolyu
Contributor Author

lolyu commented Sep 19, 2022

@Ndancejic, thanks for the investigation.
In addition, if you check ip neighbor on the 202012 image, there are FAILED neighbor entries for the mux servers, whereas on the 202205 image those entries are flushed.

@lolyu
Contributor Author

lolyu commented Sep 22, 2022

The root cause is that, on bullseye, Linux now flushes all ARP entries learned on a device when that device goes NOCARRIER. If we shut down all ports in Vlan1000, the bridge device Bridge becomes NOCARRIER, and so does the device Vlan1000, so all ARP entries learned on Vlan1000 are flushed.
The change that caused this is https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=859bd2ef1fc1110a8031b967ee656c53a6260a76.
It has been included in the kernel since v4.20:

$ git describe --contains 859bd2ef1fc1110a8031b967ee656c53a6260a76
v4.20-rc1~14^2~92

There is also a newer patch that makes this behavior configurable via a new sysctl parameter, arp_evict_nocarrier (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fcdb44d08a95003c3d040aecdee286156ec6f34e).
It is included in kernels >= v5.16:

$ git describe --contains fcdb44d08a95003c3d040aecdee286156ec6f34e
v5.16-rc1~159^2~3^2~2

However, the 202205 image uses kernel v5.10, which does not have this parameter.

@prsunny, can we include this patch in our kernel patch list: https://github.com/sonic-net/sonic-linux-kernel/tree/443253f637ec3dccac246199977a6d65346d7878/patch? We could then explicitly disable this option in vlanmgrd.
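
To illustrate what disabling the option would look like on a kernel that carries the patch, here is a minimal sketch. It assumes the parameter is exposed per interface under /proc/sys/net/ipv4/conf/<ifname>/arp_evict_nocarrier, as it is in upstream kernels since v5.16, where writing 0 keeps ARP entries on NOCARRIER. The function names are hypothetical, and this is not the actual vlanmgrd change.

```python
from pathlib import Path


def arp_evict_nocarrier_path(ifname: str, proc_sys: str = "/proc/sys") -> Path:
    """Build the per-interface sysctl path for arp_evict_nocarrier."""
    return Path(proc_sys) / "net" / "ipv4" / "conf" / ifname / "arp_evict_nocarrier"


def disable_arp_evict_nocarrier(ifname: str, proc_sys: str = "/proc/sys") -> bool:
    """Write 0 to the sysctl so ARP entries are kept on NOCARRIER.

    Returns False when the sysctl file is absent, for example on a kernel
    without the patch or for an interface that does not exist.
    Writing to the real /proc/sys requires root.
    """
    path = arp_evict_nocarrier_path(ifname, proc_sys)
    if not path.exists():
        return False
    path.write_text("0\n")
    return True


if __name__ == "__main__":
    # Only print the path here; the write itself needs root on a real system.
    print(arp_evict_nocarrier_path("Vlan1000"))
```

The `proc_sys` argument exists only so the helper can be exercised against a temporary directory instead of the live sysctl tree.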

@lolyu
Contributor Author

lolyu commented Sep 29, 2022

Reopening, as this needs a swss change to work.
