[pfcwd] test_pfcwd_actions failed due to IO still got passed on stormed ports with action as drop #8181

Closed
lolyu opened this issue Jul 14, 2021 · 2 comments
@lolyu
Contributor

lolyu commented Jul 14, 2021

Description

Steps to reproduce the issue:

  • Run test_pfcwd_actions on a storage-backend testbed.
  • To reproduce this manually:
  1. start pfcwd with the default config
root@str2-7050qx-32s-acs-03:~# pfcwd show config
Changed polling interval to 400ms
      PORT    ACTION    DETECTION TIME    RESTORATION TIME
----------  --------  ----------------  ------------------
 Ethernet4      drop               400                 400
 Ethernet8      drop               400                 400
Ethernet12      drop               400                 400
Ethernet16      drop               400                 400
Ethernet20      drop               400                 400
Ethernet24      drop               400                 400
Ethernet28      drop               400                 400
Ethernet32      drop               400                 400
Ethernet36      drop               400                 400
Ethernet40      drop               400                 400
Ethernet44      drop               400                 400
Ethernet48      drop               400                 400
  2. simulate a PFC storm on queue Ethernet44:4 by setting DEBUG_STORM
root@str2-7050qx-32s-acs-03:~# pfcwd show stats
       QUEUE       STATUS    STORM DETECTED/RESTORED    TX OK/DROP    RX OK/DROP    TX LAST OK/DROP    RX LAST OK/DROP
------------  -----------  -------------------------  ------------  ------------  -----------------  -----------------
Ethernet44:4  operational                        1/1           0/0           0/0                0/0                0/0
root@str2-7050qx-32s-acs-03:~# redis-cli -n 2 hget COUNTERS_QUEUE_NAME_MAP Ethernet44:4
"oid:0x15000000000372"
root@str2-7050qx-32s-acs-03:~# redis-cli -n 2 hset COUNTERS:oid:0x15000000000372 DEBUG_STORM enabled
(integer) 0
root@str2-7050qx-32s-acs-03:~# pfcwd show stats
       QUEUE    STATUS    STORM DETECTED/RESTORED    TX OK/DROP    RX OK/DROP    TX LAST OK/DROP    RX LAST OK/DROP
------------  --------  -------------------------  ------------  ------------  -----------------  -----------------
Ethernet44:4   stormed                        2/1           0/0           0/0                0/0                0/0
  3. run the pfc_wd ptftest on the ptf to check the egress drop
root@7739e842d266:~# ptf --test-dir ptftests pfc_wd.PfcWdTest --platform-dir ptftests --platform remote -t 'port_type="interface";port_src=6;ip_dst=u"10.0.0.53";router_mac=u"c0:d6:82:ef:0f:0f";wd_action="drop";pkt_count=100;port_dst="[14]";queue_index=4' --relax --debug info --log-file /tmp/pfc_wd.PfcWdTest.2021-07-14-08:00:49.log
WARNING: No route found for IPv6 destination :: (no default route?)
/usr/local/lib/python2.7/dist-packages/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in the next release.
  from cryptography.hazmat.backends import default_backend
pfc_wd.PfcWdTest ... verifying packet on port device 0 port 14
FAIL

======================================================================
FAIL: pfc_wd.PfcWdTest
----------------------------------------------------------------------
Traceback (most recent call last):
  File "ptftests/pfc_wd.py", line 121, in runTest
    return verify_no_packet_any(self, masked_exp_pkt, dst_port_list)
  File "/usr/lib/python2.7/dist-packages/ptf/testutils.py", line 2476, in verify_no_packet_any
    verify_no_packet(test, pkt, (device, port))
  File "/usr/lib/python2.7/dist-packages/ptf/testutils.py", line 2422, in verify_no_packet
    "port %r.\n%s" % (device, port, result.format()))
AssertionError: Received packet that we expected not to receive on device 0, port 14.
========== RECEIVED ==========
0000   52 54 00 A7 E6 A7 C0 D6  82 EF 0F 0F 08 00 45 11   RT............E.
0010   00 56 00 01 00 00 3F 06  6F 5A 01 01 01 01 0A 00   .V....?.oZ......
0020   00 35 44 4E C9 58 00 00  00 00 00 00 00 00 50 02   .5DN.X........P.
0030   20 00 79 C4 00 00 00 01  02 03 04 05 06 07 08 09    .y.............
0040   0A 0B 0C 0D 0E 0F 10 11  12 13 14 15 16 17 18 19   ................
0050   1A 1B 1C 1D 1E 1F 20 21  22 23 24 25 26 27 28 29   ...... !"#$%&'()
0060   2A 2B 2C 2D                                        *+,-
==============================


----------------------------------------------------------------------
Ran 1 test in 0.014s

FAILED (failures=1)

******************************************
ATTENTION: SOME TESTS DID NOT PASS!!!

The following tests failed:
PfcWdTest

******************************************
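
To see which fields of the leaked packet matter for queue selection, the first 34 bytes of the RECEIVED dump above can be decoded by hand. This is only a sketch using the Python standard library; `summarize` is a hypothetical helper, not part of ptf or the test. Note that the capture showing no 802.1Q tag does not by itself say what was on the wire, since the tag may be stripped before the PTF sees the frame.

```python
import struct

# First 34 bytes of the RECEIVED dump above (Ethernet header + IPv4 header).
RECEIVED_HEX = (
    "52 54 00 A7 E6 A7 C0 D6 82 EF 0F 0F 08 00 45 11 "
    "00 56 00 01 00 00 3F 06 6F 5A 01 01 01 01 0A 00 "
    "00 35"
)


def summarize(frame: bytes) -> dict:
    """Extract EtherType, 802.1Q priority (if tagged), DSCP/ECN and dst IP."""
    ethertype = struct.unpack("!H", frame[12:14])[0]
    offset = 14          # start of the IPv4 header for an untagged frame
    pcp = None           # stays None when the capture carries no VLAN tag
    if ethertype == 0x8100:
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        pcp = tci >> 13  # top 3 bits of the TCI are the priority
        offset = 18
    tos = frame[offset + 1]
    dst_ip = ".".join(str(b) for b in frame[offset + 16:offset + 20])
    return {
        "ethertype": ethertype,
        "pcp": pcp,
        "dscp": tos >> 2,
        "ecn": tos & 0x3,
        "dst_ip": dst_ip,
    }


info = summarize(bytes.fromhex(RECEIVED_HEX))
print(info)
```

The decoded DSCP matches the `queue_index=4` passed to the test, and the destination matches `ip_dst`, so the packet is the one the test sent.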

Describe the results you received:

  • Packets to the stormed queue Ethernet44:4 were forwarded.

Describe the results you expected:

  • Packets to the stormed queue Ethernet44:4 should be dropped.

Output of show version:

root@str2-7050qx-32s-acs-03:~# show version

SONiC Software Version: SONiC.20201231.06
Distribution: Debian 10.10
Kernel: 4.19.0-12-2-amd64
Build commit: 959975cede
Build date: Sat Jun 26 13:10:24 UTC 2021
Built by: sonicbld@new-worker-20

Platform: x86_64-arista_7050_qx32s
HwSKU: Arista-7050-QX-32S
ASIC: broadcom
ASIC Count: 1
Serial Number: JPE20224213
Uptime: 10:00:07 up 5 days,  3:21,  1 user,  load average: 0.34, 0.48, 0.53

Docker images:
REPOSITORY                 TAG                 IMAGE ID            SIZE
docker-syncd-brcm          20201231.06         4713c42c96f6        693MB
docker-syncd-brcm          latest              4713c42c96f6        693MB
docker-teamd               20201231.06         bc8a8dc07ddd        411MB
docker-teamd               latest              bc8a8dc07ddd        411MB
docker-router-advertiser   20201231.06         274c1d136dbe        401MB
docker-router-advertiser   latest              274c1d136dbe        401MB
docker-platform-monitor    20201231.06         928381fe4902        609MB
docker-platform-monitor    latest              928381fe4902        609MB
docker-dhcp-relay          20201231.06         dddf8254c342        408MB
docker-dhcp-relay          latest              dddf8254c342        408MB
docker-database            20201231.06         725cedfa090c        401MB
docker-database            latest              725cedfa090c        401MB
docker-lldp                20201231.06         6f4199583c98        441MB
docker-lldp                latest              6f4199583c98        441MB
docker-orchagent           20201231.06         a6e90bc1453b        430MB
docker-orchagent           latest              a6e90bc1453b        430MB
docker-sonic-telemetry     20201231.06         438178945cd0        490MB
docker-sonic-telemetry     latest              438178945cd0        490MB
docker-fpm-frr             20201231.06         00d17d83cc79        430MB
docker-fpm-frr             latest              00d17d83cc79        430MB
docker-snmp                20201231.06         694a477acc2d        442MB
docker-snmp                latest              694a477acc2d        442MB
@lolyu lolyu added the Bug 🐛 label Jul 14, 2021
@neethajohn
Contributor

I tried to repro this but could not. I sent packets using scapy. Can you check if the test is constructing the packet properly?

-- before starting the test --

admin@str2-7050qx-32s-acs-03:~$ pfcwd show stats
       QUEUE    STATUS    STORM DETECTED/RESTORED    TX OK/DROP    RX OK/DROP    TX LAST OK/DROP    RX LAST OK/DROP
------------  --------  -------------------------  ------------  ------------  -----------------  -----------------
Ethernet44:4   stormed                        1/0         0/100         0/400              0/100              0/400

-- sending 100 pkts with src port as Ethernet44 --

>>> pkt1=Ether(dst="c0:d6:82:ef:0f:0f",src="c6:fb:fb:76:c0:0e")/Dot1Q(vlan=10, prio=4)/IP(src="10.0.0.53",dst="10.0.0.35")/UDP(sport=6000, dport=7000)/Raw(RandString(size=100))   
>>> sendp(pkt1, iface="eth14", count=100)
admin@str2-7050qx-32s-acs-03:~$ show int counters
Last cached time was 2021-07-14 20:42:38.782054
      IFACE    STATE    RX_OK      RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK     TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  ----------  ---------  --------  --------  --------  -------  ---------  ---------  --------  --------  --------
  Ethernet0        X        0    0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0         0         0
  Ethernet1        X        0    0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0         0         0
  Ethernet2        X        0    0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0         0         0
  Ethernet3        X        0    0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0         0         0
  Ethernet4        U        6   10.39 B/s      0.00%         0         0         0        7  17.49 B/s      0.00%         0         0         0
  Ethernet8        U        7   12.92 B/s      0.00%         0         0         0        7  17.49 B/s      0.00%         0         0         0
 Ethernet12        U        7   12.92 B/s      0.00%         0         0         0        5  15.14 B/s      0.00%         0         0         0
 Ethernet16        U        7   12.92 B/s      0.00%         0         0         0        7  17.53 B/s      0.00%         0         0         0
 Ethernet20        U        7   12.92 B/s      0.00%         0         0         0        7  17.57 B/s      0.00%         0         0         0
 Ethernet24        U        6   10.39 B/s      0.00%         0         0         0        7  17.57 B/s      0.00%         0         0         0
 Ethernet28        U        7   11.50 B/s      0.00%         0         0         0        7  17.57 B/s      0.00%         0         0         0
 Ethernet32        U        8   13.97 B/s      0.00%         0         1         0        7  17.57 B/s      0.00%         0         0         0
 Ethernet36        U        7   11.67 B/s      0.00%         0         1         0        7  17.57 B/s      0.00%         0         0         0
 Ethernet40        U        6   10.39 B/s      0.00%         0         0         0        7  17.57 B/s      0.00%         0         0         0
 Ethernet44        U      107  226.33 B/s      0.00%         0       100         0        7  17.57 B/s      0.00%         0         0         0
admin@str2-7050qx-32s-acs-03:~$ pfcwd show stats
       QUEUE    STATUS    STORM DETECTED/RESTORED    TX OK/DROP    RX OK/DROP    TX LAST OK/DROP    RX LAST OK/DROP
------------  --------  -------------------------  ------------  ------------  -----------------  -----------------
Ethernet44:4   stormed                        1/0         0/100         0/500              0/100              0/500

-- 100 pkts sent with dest port as Ethernet44 --

>>> pkt_transmit=Ether(dst="c0:d6:82:ef:0f:0f",src="6e:12:94:38:0a:04")/Dot1Q(vlan=10, prio=4)/IP(src="10.0.0.33",dst="10.0.0.53")/UDP(sport=6000, dport=7000)/Raw(RandString(size=100)) 
>>> sendp(pkt_transmit, iface="eth4", count=100) 
admin@str2-7050qx-32s-acs-03:~$ show int counters    
Last cached time was 2021-07-14 20:46:25.662306
      IFACE    STATE    RX_OK       RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK    TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  -----------  ---------  --------  --------  --------  -------  --------  ---------  --------  --------  --------
  Ethernet0        X        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
  Ethernet1        X        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
  Ethernet2        X        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
  Ethernet3        X        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
  Ethernet4        U      101  1189.87 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
  Ethernet8        U        1    13.95 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet12        U        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet16        U        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet20        U        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet24        U        1    13.95 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet28        U        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet32        U        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet36        U        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet40        U        1    13.95 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0         0         0
 Ethernet44        U        0     0.00 B/s      0.00%         0         0         0        0  0.00 B/s      0.00%         0       100         0
admin@str2-7050qx-32s-acs-03:~$ pfcwd show stats
       QUEUE    STATUS    STORM DETECTED/RESTORED    TX OK/DROP    RX OK/DROP    TX LAST OK/DROP    RX LAST OK/DROP
------------  --------  -------------------------  ------------  ------------  -----------------  -----------------
Ethernet44:4   stormed                        1/0         0/200         0/500              0/200              0/500
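
As a rough cross-check of the counters, the two most recent `pfcwd show stats` rows (before and after the 100 packets sent towards Ethernet44) can be compared programmatically. This is a minimal sketch; `parse_stats_row` is a hypothetical helper written for this example, not part of sonic-utilities, and the rows are copied from the output above with the column spacing collapsed.

```python
def parse_stats_row(row: str) -> dict:
    """Split one `pfcwd show stats` data row into named (first, second) pairs."""
    fields = row.split()
    names = ["storm", "tx", "rx", "tx_last", "rx_last"]
    parsed = {"queue": fields[0], "status": fields[1]}
    for name, value in zip(names, fields[2:]):
        first, second = value.split("/")
        parsed[name] = (int(first), int(second))
    return parsed


# Row after the RX test, i.e. before sending towards Ethernet44.
row_before = parse_stats_row("Ethernet44:4 stormed 1/0 0/100 0/500 0/100 0/500")
# Row after sending 100 packets with Ethernet44 as the destination port.
row_after = parse_stats_row("Ethernet44:4 stormed 1/0 0/200 0/500 0/200 0/500")

# Index 1 of each pair is the DROP counter.
tx_drop_delta = row_after["tx"][1] - row_before["tx"][1]
rx_drop_delta = row_after["rx"][1] - row_before["rx"][1]
print(tx_drop_delta, rx_drop_delta)
```

A TX drop delta equal to the number of packets sent, with the RX drop counter unchanged, is what the drop action should produce for egress traffic.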
admin@str2-7050qx-32s-acs-03:~$ show ip int
Interface      Master    IPv4 address/mask    Admin/Oper    BGP Neighbor    Neighbor IP
-------------  --------  -------------------  ------------  --------------  -------------
Ethernet4.10             10.0.0.32/31         up/up         ARISTA01BT0     10.0.0.33
Ethernet8.10             10.0.0.34/31         up/up         ARISTA02BT0     10.0.0.35
Ethernet12.10            10.0.0.36/31         up/up         ARISTA03BT0     10.0.0.37
Ethernet16.10            10.0.0.38/31         up/up         ARISTA04BT0     10.0.0.39
Ethernet20.10            10.0.0.40/31         up/up         ARISTA05BT0     10.0.0.41
Ethernet24.10            10.0.0.42/31         up/up         ARISTA06BT0     10.0.0.43
Ethernet28.10            10.0.0.44/31         up/up         ARISTA07BT0     10.0.0.45
Ethernet32.10            10.0.0.46/31         up/up         ARISTA08BT0     10.0.0.47
Ethernet36.10            10.0.0.48/31         up/up         ARISTA09BT0     10.0.0.49
Ethernet40.10            10.0.0.50/31         up/up         ARISTA10BT0     10.0.0.51
Ethernet44.10            10.0.0.52/31         up/up         ARISTA11BT0     10.0.0.53
Ethernet48.10            10.0.0.54/31         up/up         ARISTA12BT0     10.0.0.55
Loopback0                10.1.0.32/32         up/up         N/A             N/A
docker0                  240.127.1.1/24       up/down       N/A             N/A
eth0                     10.3.146.167/23      up/up         N/A             N/A
lo                       127.0.0.1/16         up/up         N/A             N/A
admin@str2-7050qx-32s-acs-03:~$ show ver

SONiC Software Version: SONiC.20201231.06
Distribution: Debian 10.10
Kernel: 4.19.0-12-2-amd64
Build commit: 959975cede
Build date: Sat Jun 26 13:10:24 UTC 2021
Built by: sonicbld@new-worker-20

Platform: x86_64-arista_7050_qx32s
HwSKU: Arista-7050-QX-32S
ASIC: broadcom
ASIC Count: 1
Serial Number: JPE20224213
Uptime: 20:55:28 up 5 days, 14:16,  1 user,  load average: 0.77, 0.41, 0.39

Docker images:
REPOSITORY                 TAG                 IMAGE ID            SIZE
docker-syncd-brcm          20201231.06         4713c42c96f6        693MB
docker-syncd-brcm          latest              4713c42c96f6        693MB
docker-teamd               20201231.06         bc8a8dc07ddd        411MB
docker-teamd               latest              bc8a8dc07ddd        411MB
docker-router-advertiser   20201231.06         274c1d136dbe        401MB
docker-router-advertiser   latest              274c1d136dbe        401MB
docker-platform-monitor    20201231.06         928381fe4902        609MB
docker-platform-monitor    latest              928381fe4902        609MB
docker-dhcp-relay          20201231.06         dddf8254c342        408MB
docker-dhcp-relay          latest              dddf8254c342        408MB
docker-database            20201231.06         725cedfa090c        401MB
docker-database            latest              725cedfa090c        401MB
docker-lldp                20201231.06         6f4199583c98        441MB
docker-lldp                latest              6f4199583c98        441MB
docker-orchagent           20201231.06         a6e90bc1453b        430MB
docker-orchagent           latest              a6e90bc1453b        430MB
docker-sonic-telemetry     20201231.06         438178945cd0        490MB
docker-sonic-telemetry     latest              438178945cd0        490MB
docker-fpm-frr             20201231.06         00d17d83cc79        430MB
docker-fpm-frr             latest              00d17d83cc79        430MB
docker-snmp                20201231.06         694a477acc2d        442MB
docker-snmp                latest              694a477acc2d        442MB

@lolyu
Contributor Author

lolyu commented Jul 15, 2021

The root cause is that the pfc_wd.py ptftest sends the packet through the sub-interface (eth14.10). When the sub-interface adds the VLAN tag, the priority defaults to zero, so the packet lands on a queue that is not stormed and is forwarded instead of dropped.
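
The effect of a default priority can be illustrated by building the 802.1Q tag by hand. This is a sketch with the standard library only; `dot1q_tag` and `pcp_of` are hypothetical helpers, and VLAN 10 is assumed from the `Ethernet44.10` sub-interface naming and the `Dot1Q(vlan=10, prio=4)` used in the scapy commands earlier in the thread.

```python
import struct


def dot1q_tag(vlan_id: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Build a 4-byte 802.1Q tag: TPID 0x8100 followed by the 16-bit TCI."""
    if not (0 <= vlan_id < 4096 and 0 <= pcp < 8 and dei in (0, 1)):
        raise ValueError("vlan_id, pcp or dei out of range")
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", 0x8100, tci)


def pcp_of(tag: bytes) -> int:
    """Return the 3-bit priority (PCP) carried in an 802.1Q tag."""
    _tpid, tci = struct.unpack("!HH", tag)
    return tci >> 13


# Tag as added with the default priority (assumed to be 0, per the comment above).
default_tag = dot1q_tag(10)
# Tag carrying the priority of the stormed queue, as in Dot1Q(vlan=10, prio=4).
explicit_tag = dot1q_tag(10, pcp=4)

print(default_tag.hex(), pcp_of(default_tag))    # 8100000a 0
print(explicit_tag.hex(), pcp_of(explicit_tag))  # 8100800a 4
```

Only the second tag carries priority 4, which is why the manually crafted scapy packets hit the stormed queue while the test packet did not.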

@lolyu lolyu closed this as completed Jul 15, 2021
lolyu pushed a commit to lolyu/sonic-buildimage that referenced this issue Jul 26, 2021
Update FRR to 7.5.1. The following is a list of new commits.
```
df7ab485b FRRouting Release 7.5.1
f4ed841b8 Merge pull request sonic-net#8187 from opensourcerouting/rpmfixes-75
86d5a20e3 Merge pull request sonic-net#8193 from mjstapp/fix_signals_7_5
b339cc149 lib: avoid signal-handling race with event loop poll call
0f7b432c3 lib: add debug output for signal mask
c0290c86d lib: add sigevent_check api
7a5348665 doc: Fix CentOS 7 Documentation
2a8e69f48 Merge pull request sonic-net#8064 from donaldsharp/foo
cf4d1a744 redhat: Fix changelog incorrect date format
b78dcb209 Merge pull request sonic-net#8181 from idryzhov/7.5-zebra-blackhole
2032e7e72 zebra: don't use kernel nexthops for blackhole routes
e52003567 bgpd: When deleting a neighbor from a peer-group the PGNAME is optional
aa86a6a6f Merge pull request sonic-net#8161 from mjstapp/fix_sa_7_5_backports
13a8efb4b Merge pull request sonic-net#8156 from idryzhov/7.5-backports-2021-02-26
58911c6ed lib: Free memory leak in error path in clippy
556dfd211 lib: use right type for wconv() return val
bd9caa8f1 lib: fix some misc SA warnings
683b3fe3f lib: register dependency between control plane protocol and vrf nb nodes
b45248fb6 lib: add definitions for vrf xpaths
7b9f10d04 lib: add ability to register dependencies between northbound nodes
9c240815c bgpd: Bgp peer group issue
d1b43634b bgpd: upon bgp deletion, do not systematically ask to remove main bgp
f5d1dc55e bgpd: Fix crash when we don't have a nexthop
c2e463478 frr-reload: rpki context exiting uses exit and not end
f11db1698 bgpd: Blackhole nexthops are not reachable
c628e94ff staticd: fix vrf enabling
49b079ef1 staticd: fix nexthop creation and installation
0077038e9 staticd: fix nexthop validation
be3dfbbc7 zebra: use AF_INET for protocol family
```
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this issue Aug 7, 2021
Update FRR to 7.5.1.