Bug loopback test #3

Closed
wants to merge 79 commits into from

@EdenGri EdenGri commented Dec 7, 2022

What I did

Why I did it

How I verified it

Details if related

bingwang-ms and others added 30 commits June 2, 2022 17:20
…-net#2318)

* Two fixes: sleep after stop and check values in routes

Co-authored-by: Vaibhav Hemant Dixit <[email protected]>
What I did
This is a cherry-pick PR for sonic-net#2320.

Why I did it
We don't necessarily think it is the right solution, but until we can make Zebra behavior consistent, this change will unblock SONiC tests.

It will be a temporary solution until we converge with FRR/Zebra changes.

Signed-off-by: Ying Xie <[email protected]>
What I did
Cherry-pick sonic-net#2281 to 202205 branch
Why I did it
Combine PG3 and PG4 to PG3-4
How I verified it

Details if related
1/ Enable gearbox port counter collection in GB_COUNTERS_DB
2/ Enable gearbox macsec counter collection in GB_COUNTERS_DB
What I did
Fix issue sonic-net/sonic-buildimage#10850 partially by adding a sanity check in port_rates.lua. If the must-have counters of a port cannot be read, skip its rate computation.

Why I did it
It prevents port_rates.lua from exiting abnormally.
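
A minimal Python sketch of this kind of sanity check (the real plugin is Lua; the choice of required counters, the input shape, and the `compute_rates` name are assumptions made for illustration):

```python
# Counters assumed to be mandatory for rate computation (illustrative choice).
REQUIRED = ("SAI_PORT_STAT_IF_IN_OCTETS", "SAI_PORT_STAT_IF_OUT_OCTETS")


def compute_rates(samples, interval_s):
    """Return {port: (rx_bytes_per_s, tx_bytes_per_s)}.

    `samples` maps a port name to a (previous, current) pair of counter dicts.
    A port with any required counter missing is skipped instead of raising.
    """
    rates = {}
    for port, (prev, curr) in samples.items():
        if any(c not in prev or c not in curr for c in REQUIRED):
            continue  # sanity check: leave this port out, keep processing others
        rates[port] = tuple(
            (int(curr[c]) - int(prev[c])) / interval_s for c in REQUIRED
        )
    return rates
```

A port with incomplete counters simply does not appear in the result, so one bad port no longer aborts the whole run.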
…nic-net#2314)

* Apply DSCP_TO_TC_MAP|AZURE to switch level

Signed-off-by: bingwang <[email protected]>
* After the latest changes in PR sonic-net#2190, an issue was introduced: when a tunnel was deleted, the TunnelTermEntries were deleted from the ASIC but not from the OA cache. As a result, when the same tunnel is created again, the TunnelTermEntries are not created because OA still has them in its local cache.
Signed-off-by: Myron Sosyak <[email protected]>
… values are equal (sonic-net#2327)

* [crmorch] Prevent exceededLogCounter from resetting when low and high values are equal
*[intfmgr]: Set proxy_arp kernel param
Signed-off-by: Lawrence Lee <[email protected]>
…et#2155)

* Fix DTel acl rule creation

The significant rewrite of aclorch when adding ACL_TABLE_TYPE
configuration caused a bug that prevents configuration of any
DTel rules. This is due to use of an incorrect set of enum
mappings while determining which type of AclRule to create.
…nd during mock test (sonic-net#2234)

* Support mock_test infra for dynamic buffer manager and fix issues found during mock test 
Signed-off-by: Stephen Sun <[email protected]>
…s reload flows (sonic-net#2262)

What I did
Enhance the mock test of the dynamic buffer manager in port remove and config qos clear flow and fix bugs during mock test implementation
Implement mock method ProduceStateTable::del
Signed-off-by: Stephen Sun [email protected]

How I verified it
Run regression test, mock test, vs test, and manual test.

Details if related
1. Support mock test for dynamic buffer manager
    - config qos clear and reclaiming buffer
    - Remove port
2. Handle port remove/create flow
    - Cache cable length for a port
    - Try reclaiming unused buffer when maximum buffer parameters are received for a port whose state is ADMIN_DOWN and m_bufferCompletelyInitialized is true
3. Handle config qos clear
    - If all buffer pools are removed when m_bufferPoolReady is true, remove all zero pools and profiles.
    - Reload zero profiles and pools if they have not been loaded when reclaiming buffer
…r queue/pg counters (sonic-net#2143)

- What I did
Currently in SONiC all ports queue and pg counters are created by default with the max possible amount of counters.
This feature changes this behavior to poll only the configured counters provided by the config DB BUFFER_PG and BUFFER_QUEUE tables.
If no tables are present in the DB, no counters will be created for ports.
Filter the unwanted queues/pgs returned by SAI API calls and skip the creation of these queue/pg counters.
Also allow creating/removing counters on runtime if buffer PG/Queue is configured or removed.

- Why I did it
Improve performance by filtering unconfigured queue/pg counters on init.

- How I verified it
After enabling the counters, check that the configured counters are created in Counters DB according to the configuration.
Add/Remove buffer PG/Queue configurations and observe the corresponding counters created/removed accordingly.
New UT added to verify this flow.
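
The filtering idea can be sketched in Python as follows. This is an illustration under assumptions, not the orchagent code: the helper names are hypothetical, and the table keys are assumed to look like `Ethernet0|3-4` (port, then an index or index range).

```python
def expand_range(index_key):
    """'3-4' -> [3, 4]; '0' -> [0]."""
    first, _, last = index_key.partition("-")
    return list(range(int(first), int(last or first) + 1))


def configured_indexes(table, port):
    """Collect the PG/queue indexes configured for `port` in a config table."""
    indexes = set()
    for key in table:
        key_port, _, index_key = key.partition("|")
        if key_port == port:
            indexes.update(expand_range(index_key))
    return indexes


def filter_counters(sai_indexes, table, port):
    """Keep only the indexes returned by SAI that are actually configured."""
    wanted = configured_indexes(table, port)
    return [i for i in sai_indexes if i in wanted]
```

With an empty table the result is an empty list, which matches the statement that no counters are created when no tables are present.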

Signed-off-by: Shlomi Bitton <[email protected]>
…h swssconfig #11046" (sonic-net#2332)

* Fix updated to not flush static mac
What I did
This PR is to fix ACL table creation failure for certain types.
We saw PFCWD table failed to be created at EGRESS stage. The error logs are

Jun 21 07:00:03.409283 str2-7050cx3-acs-08 ERR syncd#syncd: [none] SAI_API_ACL:_brcm_sai_create_acl_table:11205 field group config create failed with error Feature unavailable (0xfffffff0).
Jun 21 07:00:03.409738 str2-7050cx3-acs-08 ERR syncd#syncd: [none] SAI_API_ACL:brcm_sai_create_acl_table:298 create table entry failed with error -2.
Jun 21 07:00:03.409738 str2-7050cx3-acs-08 ERR syncd#syncd: :- sendApiResponse: api SAI_COMMON_API_CREATE failed in syncd mode: SAI_STATUS_NOT_SUPPORTED
Jun 21 07:00:03.409780 str2-7050cx3-acs-08 ERR syncd#syncd: :- processQuadEvent: attr: SAI_ACL_TABLE_ATTR_ACL_BIND_POINT_TYPE_LIST: 1:SAI_ACL_BIND_POINT_TYPE_PORT
Jun 21 07:00:03.409820 str2-7050cx3-acs-08 ERR syncd#syncd: :- processQuadEvent: attr: SAI_ACL_TABLE_ATTR_FIELD_IN_PORTS: true
Jun 21 07:00:03.409820 str2-7050cx3-acs-08 ERR syncd#syncd: :- processQuadEvent: attr: SAI_ACL_TABLE_ATTR_FIELD_TC: true
Jun 21 07:00:03.410144 str2-7050cx3-acs-08 ERR syncd#syncd: :- processQuadEvent: attr: SAI_ACL_TABLE_ATTR_ACL_ACTION_TYPE_LIST: 2:SAI_ACL_ACTION_TYPE_PACKET_ACTION,SAI_ACL_ACTION_TYPE_COUNTER
Jun 21 07:00:03.410144 str2-7050cx3-acs-08 ERR syncd#syncd: :- processQuadEvent: attr: SAI_ACL_TABLE_ATTR_ACL_STAGE: SAI_ACL_STAGE_EGRESS
Jun 21 07:00:03.410144 str2-7050cx3-acs-08 ERR swss#orchagent: :- create: create status: SAI_STATUS_NOT_SUPPORTED
Jun 21 07:00:03.410144 str2-7050cx3-acs-08 ERR swss#orchagent: :- addAclTable: Failed to create ACL table pfcwd_egress
The root cause for the issue is SAI_ACL_TABLE_ATTR_FIELD_IN_PORTS is not supported at EGRESS stage.

This PR addressed the issue by adding match field according to the stage.
For ACL type TABLE_TYPE_PFCWD and TABLE_TYPE_DROP at INGRESS stage, the match field SAI_ACL_TABLE_ATTR_FIELD_IN_PORTS is added, while for EGRESS the field is not added.
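
A short Python sketch of the stage-dependent selection described above. The function name and the base field set are hypothetical; only the IN_PORTS rule comes from the description.

```python
# Fields assumed to be common to both stages (illustrative only).
BASE_FIELDS = {"SAI_ACL_TABLE_ATTR_FIELD_TC"}

# Table types that get IN_PORTS at ingress, per the description above.
STAGE_SENSITIVE_TYPES = {"TABLE_TYPE_PFCWD", "TABLE_TYPE_DROP"}


def table_match_fields(table_type, stage):
    """Return the match fields to request when creating the ACL table."""
    fields = set(BASE_FIELDS)
    if table_type in STAGE_SENSITIVE_TYPES and stage == "INGRESS":
        # IN_PORTS is not supported at EGRESS, so it is added for INGRESS only.
        fields.add("SAI_ACL_TABLE_ATTR_FIELD_IN_PORTS")
    return fields
```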

Why I did it
To fix ACL table creation issue.

How I verified it

Verified by vstest
test_acl.py::TestAcl::test_AclTableMandatoryMatchFields[ingress] PASSED                                                                                                                         [ 87%]
test_acl.py::TestAcl::test_AclTableMandatoryMatchFields[egress] PASSED                                                                                                                          [ 90%]
Verified by building a new image and run on a TD3 device.

Signed-off-by: bingwang <[email protected]>
* Default macsec poll interval 10s, except of xpn1s
* Correct COUNTERS_MACSEC_NAME_MAP entry in GB_COUNTERS_DB for gearbox macsec
* Support macsec flex counter config
* Correct port flex counter config for gearbox
* Add IN_UCAST_PKTS/IN_NON_UCAST_PKTS/OUT_UCAST_PKTS/OUT_NON_UCAST_PKTS in gearbox port counter list
* Add/remove macsec name map w/o gearbox correctly
* Add macsec counter unit test
- What I did
using a copy of FDBEntry fields (stored in FDBUpdate) instead of a reference since the reference gets invalidated in the storeFdbEntryState()
simplified clearFdbEntry() interface

- Why I did it
To fix the memory usage issue
The issue is that the SWSS_LOG_INFO() uses the mac&, port_alias&, and bv_id& which are invalidated in the storeFdbEntryState().

- How I verified it
Run the tests that were used to find the issues and checked the ASAN report

Signed-off-by: Yakiv Huryk <[email protected]>
- What I did
Optimize swssconfig:
1. Use unix socket
2. Cache the producer table to avoid re-creating it for the same table name

- Why I did it
We found that generating large scale static routes via swssconfig is very slow.

- How I verified it
After the optimization, generating 100K routes via swssconfig takes 2 seconds; before the optimization it took > 60 seconds.
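
The producer-table cache can be sketched in Python like this. `Producer` is a stand-in class written for the example, not the swss-common API; it only counts how many times a producer is constructed.

```python
class Producer:
    """Stand-in for a per-table producer; construction is the costly step."""

    created = 0

    def __init__(self, table):
        Producer.created += 1
        self.table = table
        self.pending = []

    def set(self, key, values):
        self.pending.append((key, values))


_producers = {}


def get_producer(table):
    """Return the cached producer for `table`, creating it on first use."""
    if table not in _producers:
        _producers[table] = Producer(table)
    return _producers[table]


def load(entries):
    """Apply (table, key, values) entries, reusing one producer per table."""
    for table, key, values in entries:
        get_producer(table).set(key, values)
```

Loading many entries for the same table then constructs a single producer instead of one per entry.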
…storm is detected (sonic-net#2304)

What I did
Avoid dropping traffic that is ingressing the port/pg that is in storm. The code changes in this PR avoid creating the ingress zero pool and profile and do not attach any zero profile to the ingress pg when pfcwd is triggered.

Revert changes related to sonic-net#1480, where the retry mechanism was added to BufferOrch, which caches the task retries while the PG is locked by PfcWdZeroBufferHandler.

Revert changes related to sonic-net#2164 in PfcWdZeroBufferHandler & ZeroBufferProfile & BufferOrch.

Updated UT's accordingly

How I verified it
UT's.
Ran the sonic-mgmt tests with these changes (sonic-net/sonic-mgmt#5665) and verified that they passed.

Signed-off-by: Vivek Reddy Karri <[email protected]>
What I did
This PR is to cherry-pick sonic-net#2356 to 202205 branch. The cherry-pick is clean, no conflict is found.
This PR is to fix the issue of adding mux_acl_rule into IngressTableDrop.
The error log is

 Jun 25 08:02:37.159020 svcstr-7050-acs-4 ERR swss#orchagent: :- validateAclRuleMatch: Match SAI_ACL_ENTRY_ATTR_FIELD_IN_PORTS in rule mux_acl_rule is not supported by table IngressTableDrop
PR sonic-net#2341 added support for different matching field in different stage (INGRESS/EGRESS). For example, SAI_ACL_ENTRY_ATTR_FIELD_IN_PORTS is only supported at INGRESS stage.

However, PR sonic-net#2341 only handled one path for creating ACL table, that is by CONFIG_DB entry.
There is a case where addAclTable is called directly from another orch, such as MuxOrch. In that case, the stage-dependent matching field is not added. As a result, we see the above error logs.
To address the issue, I moved the call of addStageMandatoryMatchFields from doAclTableTask to addAclTable to ensure addStageMandatoryMatchFields is always called.
Please note that addMandatoryActions is called from both doAclTableTask and addAclTable to ensure that validation of the ACL table passes.

Why I did it
To fix ACL rule issue for mux.

How I verified it

Verified by running test_pfcwd
Verified by checking syslog

Signed-off-by: bingwang <[email protected]>
- What I did
Added a new flag to DVS tests

- Why I did it
Currently, when running the tests with ASAN-enabled image, leak reports are not generated. The reason is that dvs.destroy() (via 'ctn.remove(force=True)') uses SIGKILL to stop the container. To address this, a new flag is added.
When the new flag is set, the swss processes are gracefully stopped (via SIGTERM).
As a result, ASAN reports can be generated from a DVS test run.

- How I verified it
Run the tests with --graceful-stop, observe that swss processes are stopped via SIGTERM

Signed-off-by: Yakiv Huryk <[email protected]>
Currently, ASAN sometimes reports the BufferOrch::m_buffer_type_maps and QosOrch::m_qos_maps as leaked. However, their lifetime is the lifetime of a process so they are not really 'leaked'.
This also adds a simple way to add more suppressions later if required.

Example of ASAN report:

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f96aa952d30 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xead30)
    #1 0x55ca1da9f789 in __static_initialization_and_destruction_0 /__w/2/s/orchagent/bufferorch.cpp:39
    #2 0x55ca1daa02af in _GLOBAL__sub_I_bufferorch.cpp /__w/2/s/orchagent/bufferorch.cpp:1321
    #3 0x55ca1e2a9cd4  (/usr/bin/orchagent+0xe89cd4)

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f96aa952d30 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xead30)
    #1 0x55ca1da6d2da in __static_initialization_and_destruction_0 /__w/2/s/orchagent/qosorch.cpp:80
    #2 0x55ca1da6ecf2 in _GLOBAL__sub_I_qosorch.cpp /__w/2/s/orchagent/qosorch.cpp:2000
    #3 0x55ca1e2a9cd4  (/usr/bin/orchagent+0xe89cd4)

- What I did
Added an lsan suppression config with static variable leak suppression

- Why I did it
To suppress ASAN false positives

- How I verified it
Run a test that produces the static variable leaks report and checked that report has these leaks suppressed.
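
For reference, LeakSanitizer reads suppressions as `leak:<pattern>` lines from a file passed through `LSAN_OPTIONS=suppressions=<path>`. The Python sketch below generates such a file; the helper names and the pattern taken from the report above are illustrative assumptions, not necessarily the contents of the file added by this change.

```python
import os
import tempfile


def write_suppressions(patterns, path):
    """Write one 'leak:<pattern>' line per pattern."""
    with open(path, "w") as f:
        for pattern in patterns:
            f.write(f"leak:{pattern}\n")


def lsan_env(path, base_env=None):
    """Return an environment dict that points LeakSanitizer at the file."""
    env = dict(base_env if base_env is not None else os.environ)
    env["LSAN_OPTIONS"] = f"suppressions={path}"
    return env


def make_static_leak_suppressions(directory=None):
    """Create a suppression file for leaks reported from static initializers."""
    fd, path = tempfile.mkstemp(suffix=".supp", dir=directory)
    os.close(fd)
    # Assumed pattern: the frame name seen in the example report above.
    write_suppressions(["__static_initialization_and_destruction_0"], path)
    return path
```

Adding another suppression later is then a matter of appending one more pattern.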

Signed-off-by: Yakiv Huryk <[email protected]>
* Add IP interface loopback action support
Co-authored-by: liora <[email protected]>
* [vnetorch] fix use-after-free in removeBfdSession()
* using a copy of monitor ip instead of a reference since the reference gets invalidated after the endpoint is erased

Signed-off-by: Yakiv Huryk <[email protected]>
Signed-off-by: Ze Gan [email protected], Judy Joseph [email protected]

What I did
If a member of a portchannel has a MACsec profile attached in config, enable MACsec on the port before it is added as a member of the portchannel.

Why I did it
Due to a hardware limitation, MACsec cannot be enabled on a member of a portchannel.
So we enable MACsec on the interface first and then add it to the portchannel.

Note: This is a workaround which will be removed when the hardware supports it in future releases.

The approach taken in this PR is

In teamdMgr, when an interface is added as part of the LAG, we wait for the macsecPort creation in SAI and the ingress SA creation to complete (if MACsec is enabled on the interface).
This takes care of config reload and reboot scenarios, where we cannot guarantee the order of attaching MACsec to the interface and adding the interface to the portchannel.

To manually remove a port from a portchannel, or to remove the MACsec config from the interface, please follow these steps:

1. First remove the portchannel member from the portchannel.
2. Remove the MACsec profile attached to the interface.

How I verified it

Verified with config reload, reboot with the macsec profile attached to portchannel member interfaces.
Verified case when SAK rekey is enabled on macsec on portchannel members
Verified case when member interface link flaps
andywongarista and others added 29 commits August 19, 2022 15:27
What I did
Revert change from sonic-net#2367 which increases count associated with SAI_SWITCH_ATTR_SWITCH_HARDWARE_INFO by 1, as well as the memset.

Why I did it
Original intention of this change was to accommodate sairedis behaviour when copying null-terminated string; original behaviour is that the null-terminator would not be copied and so receiver of the hwinfo (PAI) would see non-null terminated string.

Reverting this change so that old behaviour is maintained and PAI driver is responsible for not relying on string to be null terminated.
* Handle dual ToR neighbor miss scenario (sonic-net#2137)

- When orchagent receives a neighbor update with a zero MAC:
    - If the neighbor IP is configured for a specific mux cable port in the MUX_CABLE table in CONFIG_DB, handle the neighbor normally (if active for the port, no action is needed. if standby, a tunnel route is created for the neighbor IP)
    - If the neighbor IP is not configured for a specific port, create a tunnel route for the IP to the peer switch.
        - When these neighbor IPs are eventually resolved, remove the tunnel route and handle the neighbor normally.
- When creating/initializing a mux cable object, set the internal state to standby to match the constructor behavior.
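
The decision logic in the list above can be sketched in Python. This is a simplified model under assumptions: the function name, the returned action strings, and the data shapes (`mux_cable_ips` as an IP-to-port dict, `tunnel_routes` as a set) are all hypothetical.

```python
ZERO_MAC = "00:00:00:00:00:00"


def on_neighbor_update(ip, mac, mux_cable_ips, port_state, tunnel_routes):
    """Decide what to do with a neighbor update on a dual ToR.

    mux_cable_ips: {neighbor_ip: port} as configured in MUX_CABLE
    port_state:    {port: "active" | "standby"}
    tunnel_routes: set of IPs currently routed through the tunnel to the peer
    """
    port = mux_cable_ips.get(ip)
    if mac == ZERO_MAC:
        if port is None or port_state.get(port) == "standby":
            # Unknown port, or known port in standby: reach the IP via the peer.
            tunnel_routes.add(ip)
            return "tunnel_to_peer"
        return "no_action"  # configured port that is active
    if port is None:
        # A previously unresolved, unconfigured IP is now resolved.
        tunnel_routes.discard(ip)
    return "handle_normally"
```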

- Various formatting fixes inside test_mux.py
- Remove references to deprecated `@pytest.yield_fixture`
- Add dual ToR neighbor miss test cases:
    - Test cases and expected results are described in `mux_neigh_miss_tests.py`. These descriptions are used by the generic test runner `test_neighbor_miss` function to execute the test actions and verify expected results
    - Various setup fixtures and test info fixtures were added
    - Existing test cases were changed to use these setup fixtures for consistency

Signed-off-by: Lawrence Lee <[email protected]>
Co-authored-by: Sumukha Tumkur Vani <[email protected]>
…ter (sonic-net#2194)

- What I did
This PR replace PR sonic-net#2022

Added increasing/decreasing to the port ref counter each time a port buffer configuration is added or removed
Implemented according to the - sonic-net/SONiC#900

- Why I did it
To avoid cases where a port is removed before the buffer configurations on that port are removed.

- How I verified it
VS Test was added in order to test it.
We remove a port that has buffer configuration, and the port is not removed. Only after all buffer configurations on this port are removed is the port removed.
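
A minimal Python sketch of the reference-counting idea; the class and method names are hypothetical and the retry is left to the caller.

```python
class PortRegistry:
    """Tracks how many buffer configurations reference each port."""

    def __init__(self):
        self._refs = {}

    def add_port(self, port):
        self._refs[port] = 0

    def add_buffer_config(self, port):
        self._refs[port] += 1

    def remove_buffer_config(self, port):
        self._refs[port] -= 1

    def try_remove_port(self, port):
        """Remove the port only if nothing references it; return success."""
        if self._refs[port] > 0:
            return False  # still referenced: the caller retries later
        del self._refs[port]
        return True

    def has_port(self, port):
        return port in self._refs
```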
…ABLE (sonic-net#2408)

* Fix for issue #11218
Avoid processing portchannel subinterfaces in teamd
…net#2427)

* [neighsyncd] Enabling ipv4 link local entries for non-dualtor
Allow ipv4 link local entries to be programmed to the hardware unless on
a dual-tor setup.
Why I did:

PR sonic-net#2400 made a change to pass <string> as an argument to the API setPortFecMode but did not update the corresponding gbsyncd API call.
*[BFD]Clean up state_db BFD entries on swss restart (sonic-net#2434)
…ts buffer queue/pg counters (sonic-net#2432)

* Filter unconfigured ports buffers queue/pg counters configurations on init

    commit 6f1199a
    Author: Shlomi Bitton <[email protected]>
    Date:   Sun Jan 2 16:55:58 2022 +0000

    Filter unconfigured ports buffers queue/pg counters configurations on init.
    If no buffer configurations available, no counters will be created.
    Allow creating/removing counters on runtime if buffer PG/Queue is created or removed.
    New UT added to verify new flow.

    Signed-off-by: Shlomi Bitton <[email protected]>
…onic-net#2437)

* Change the log messages from ERROR to INFO.
* Update the test_chassis_system_neigh test to check the mac address change of a neighbor.
…ic-net#2431) (sonic-net#2451)

Signed-off-by: Vivek Reddy Karri <[email protected]>
Bulk write to APP_DB i.e. alias, lanes, speed must be read through one notification by orchagent during create_port
Handled a race condition in portmgrd which tries to immediately apply a mtu/admin_status SET notif after a DEL causing it to crash
…nic-net#2450)

Signed-off-by: Vivek Reddy Karri <[email protected]>
Check STATE_DB before sending ARP/ND pkts for neighbors associated with PortChannel. As a part of intf check, wait for the LAG_MEMBER_TABLE to be populated for Portchannels ifaces
…he nexthop is updated for ERSPAN mirror destination (sonic-net#2392) (sonic-net#2455)

* [Everflow/ERSPAN] Set correct destination port and mac address when the nexthop is updated for ERSPAN mirror destination
* [Everflow] Fixed show mirror-session, Acl rule remove failure, orchagent crash

Signed-off-by: Sakthivadivu Saravanaraj <[email protected]>
…c-net#2414) (sonic-net#2449)

Cherry-pick sonic-net#2414

- Why I did it
Enhance the error handling logic.
In most cases, a user will not encounter such scenarios in a production environment because it's the front-ends' (eg. CLI) responsibility to identify the wrong configuration and prevent them from being inserted to CONFIG_DB.
However, in some cases, like a wrong config_db.json composed and copied to the switch, front-ends can not prevent that.

- How I verified it
Manual and mock tests.

- Details if related
For the improvement in buffer manager:

previously, the logic was:
declare a reference portQueue to m_portQueueLookup[port][queues] and then assign fvValue(i) to portQueue.running_profile_name
But the [] operator on a C++ map has a side effect: it inserts a new element into the map if there wasn't one. If the validation check in checkBufferProfileDirection failed and there was no element in the map, portQueue.running_profile_name stays empty. This is not what we want.
If there was an item configured in the map, we should not remove it on failure because we want to keep the user from being affected by the misconfiguration and alert the user to correct the error. There is a log in checkBufferProfileDirection.
Now it is improved in this way:
Avoid using reference and initialize m_portQueueLookup[port][queues] only if there is a valid egress profile configured
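
The same pitfall exists in Python with `collections.defaultdict`, which also inserts on lookup, so it serves as an analogue of the C++ `operator[]` behavior. The function names and the `is_valid` callback below are hypothetical.

```python
from collections import defaultdict


def new_lookup():
    """port -> queues -> entry; a lookup creates missing levels, like operator[]."""
    return defaultdict(lambda: defaultdict(dict))


def apply_profile_buggy(lookup, port, queues, profile, is_valid):
    entry = lookup[port][queues]  # side effect: inserts an empty entry
    if not is_valid(profile):
        return False  # the empty entry stays behind
    entry["running_profile_name"] = profile
    return True


def apply_profile_fixed(lookup, port, queues, profile, is_valid):
    if not is_valid(profile):
        return False  # nothing inserted, any existing entry is left untouched
    lookup[port][queues]["running_profile_name"] = profile
    return True
```

In the fixed variant a failed validation neither creates an empty entry nor disturbs one that was configured earlier.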

Signed-off-by: Stephen Sun [email protected]
…ounters improvement (sonic-net#2462)

* Revert "[202205][counters] Improve performance by polling only configured ports buffer queue/pg counters (sonic-net#2432)"

This reverts commit 2c5116e.

Because the community test test_iface_namingmode.py is not passing with this feature.
…sonic-net#2422)

What I did
Do not enforce drop probability for a color whose WRED is disabled.

Signed-off-by: Stephen Sun [email protected]

Why I did it
Currently, there is a logic to enforce the drop probability if it is not explicitly designated for a color. However, the drop probability is not a mandatory attribute. It can incur vendor SAI complaints to set it when the color is disabled.
The logic was introduced from the very beginning (by PR sonic-net#571) because no drop probability was defined in the QoS template at the time, which is no longer true.
So we will enforce drop probability only if it is not configured and the color is enabled.
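
The resulting rule can be sketched in Python. The field names (`wred_<color>_enable`, `<color>_drop_probability`) and the default of 100 are assumptions made for the example, not values confirmed by this description.

```python
COLORS = ("green", "yellow", "red")
DEFAULT_DROP_PROBABILITY = 100  # assumed default, for illustration only


def effective_drop_probabilities(profile):
    """Return {color: drop_probability} for the attributes that should be set."""
    result = {}
    for color in COLORS:
        configured = profile.get(f"{color}_drop_probability")
        enabled = profile.get(f"wred_{color}_enable") == "true"
        if configured is not None:
            result[color] = int(configured)  # explicit configuration is honored
        elif enabled:
            result[color] = DEFAULT_DROP_PROBABILITY  # enforce only if enabled
        # Disabled and unconfigured: leave the attribute unset.
    return result
```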

How I verified it
Unit test and manual test
…ov (sonic-net#2460)

* [ci] Only when test stage succeeded or succeededwithissues, PR run Gcov
**What I did**
Set extra MTU for MACsec enabled port.

**Why I did it**
MACsec expands each packet with the MACsec SecTAG; without the extra MTU, a packet whose length equals the MTU would be dropped by the SAI port.
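
As a worked example of the arithmetic, here is a Python sketch. The overhead constants are assumptions based on the maximum SecTAG (16 bytes with SCI) and ICV (16 bytes) sizes in IEEE 802.1AE; the value actually used by this change is not stated here.

```python
# Assumed worst-case MACsec overhead per frame (not taken from this change).
MACSEC_SECTAG_MAX_BYTES = 16  # SecTAG including the optional SCI
MACSEC_ICV_MAX_BYTES = 16     # integrity check value


def port_mtu(configured_mtu, macsec_enabled):
    """Return the MTU to program on the port, with headroom for MACsec."""
    extra = MACSEC_SECTAG_MAX_BYTES + MACSEC_ICV_MAX_BYTES if macsec_enabled else 0
    return configured_mtu + extra
```

Under these assumptions a MACsec-enabled port configured with MTU 9100 would be programmed with 9132, so a full-size packet still fits after it has been expanded.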

**How I verified it**
Assuming 9100 is the MTU, check with the command `ping -s 9100 10.0.0.57`

**Details if related**
Cherry-pick from sonic-net#2398
Signed-off-by: Ze Gan <[email protected]>
Co-authored-by: Junhua Zhai <[email protected]>
…not ready (sonic-net#2461)

*Flexcounter - Fix issue: ip prefix of a route flow pattern shall be initialized even if VRF/VNET is not ready
**Why I did it**
lgtm build process broken. This PR will fix it.
```
[2022-09-30 18:21:33] [build-stderr] In file included from defaultvalueprovider.cpp:10:
[2022-09-30 18:21:33] [build-stderr] defaultvalueprovider.h:8:10: fatal error: libyang/libyang.h: No such file or directory
[2022-09-30 18:21:33] [build-stderr]     8 | #include <libyang/libyang.h>
[2022-09-30 18:21:33] [build-stderr]       |          ^~~~~~~~~~~~~~~~~~~
[2022-09-30 18:21:33] [build-stderr] compilation terminated.
```

**How I verified it**
The lgtm.yml change does not take effect in this PR's checker. I tested it manually: https://lgtm.com/logs/19f4015aec3863d7d4e7d5667cbbc251efd1d0f4/lang:cpp
* [orchdaemon]: Fixed sairedis record file rotation

What I did
Fix sonic-net/sonic-buildimage#8162

Moved sairedis record file rotation logic out of flush() to fix issue.

Why I did it
Sairedis record file was not releasing the file handle on rotation. This is because the file handle release was inside the flush() which was only being called if a select timeout was triggered. Moved the logic to its own function which is called in the start() loop.

How I verified it
Ran a script to fill log and verified that rotation was happening correctly.

Signed-off-by: Bryan Crossland [email protected]
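
The sairedis record rotation fix can be illustrated with a toy Python event loop. `Recorder`, `run_old`, and `run_new` are hypothetical names; the point is only where the rotation check sits relative to the timeout branch.

```python
class Recorder:
    """Toy stand-in for the recording file handle."""

    def __init__(self):
        self.rotate_requested = False
        self.reopen_count = 0

    def rotate_if_requested(self):
        if self.rotate_requested:
            self.reopen_count += 1  # release the old handle, open the new file
            self.rotate_requested = False


def run_old(events, recorder):
    """Old flow: rotation is only reached from the timeout branch."""
    for event in events:
        if event == "timeout":
            recorder.rotate_if_requested()


def run_new(events, recorder):
    """New flow: rotation is checked on every loop iteration."""
    for event in events:
        recorder.rotate_if_requested()
```

On a busy system that never hits a select timeout, the old flow never reopens the file, while the new flow does so on the next iteration.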
* [intfmgr]: Enable `accept_untracked_na` kernel param
* When enabling gratuitous ARP, also enable acceptance of untracked neighbor advertisements
…CL group/table counters (sonic-net#2482)

* [crm] Fix issue with continues EXCEEDED and CLEAR logs for ACL group/table counters
Signed-off-by: Volodymyr Samotiy <[email protected]>
…ts buffer queue/pg counters (sonic-net#2474)

* Revert "[202205][counters] Revert PR sonic-net#2432 for the buffer queue/pg counters improvement (sonic-net#2462)"
This reverts commit 8eea92e.
* [build] add missing package libyang-dev in lgtm.yml (sonic-net#2475)
The counters are port stats and queue stats. Currently, counters can be
collected only on fabric ASICs. J2 fabric counter collection doesn't work yet.
J2 fabric port counters fail to be collected because the logical port id
for fabric links is set up to 512 while SAI supports at most 256 ports.
J2 fabric queue counters are not supported by SAI at this moment (BCM
confirmed).

Signed-off-by: Maxime Lorrillere <[email protected]>
…#2469)

* [vlanmgr] Disable `arp_evict_nocarrier` for vlan host intf
…onic-net#2483)

- What I did

I fixed an issue where, on port deletion (port breakout scenario), the port OID is not removed from the saiOidToAlias map, so getPort returns true when querying a non-existing port OID but the Port structure is not filled with correct values. I also lowered the log level for operational status updates received for non-existing ports.
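
A small Python model of the stale-map problem. `PortTable` and its methods are hypothetical; `oid_to_alias` stands in for the saiOidToAlias map, and the `clean_oid_map` flag exists only to contrast the behavior before and after the fix.

```python
class PortTable:
    def __init__(self):
        self.ports = {}         # alias -> port data
        self.oid_to_alias = {}  # stand-in for saiOidToAlias

    def add(self, alias, oid):
        self.ports[alias] = {"alias": alias, "oid": oid}
        self.oid_to_alias[oid] = alias

    def remove(self, alias, clean_oid_map=True):
        port = self.ports.pop(alias)
        if clean_oid_map:
            # The fix: drop the OID mapping together with the port.
            self.oid_to_alias.pop(port["oid"], None)

    def get_port(self, oid):
        """Return (found, port), with the lookup keyed on the OID map."""
        if oid not in self.oid_to_alias:
            return False, None
        return True, self.ports.get(self.oid_to_alias[oid])
```

Without the cleanup, a lookup for the deleted OID reports success but yields no usable port, which is the situation behind the log errors below.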

- Why I did it

To fix errors in the log during breakout:

Oct  4 13:15:45.654396 r-bulldog-04 NOTICE swss#orchagent: :- updatePortOperStatus: Port  oper state set from unknown to down
Oct  4 13:15:45.654773 r-bulldog-04 ERR swss#orchagent: :- set: switch id oid:0x0 doesn't exist
Oct  4 13:15:45.654773 r-bulldog-04 WARNING swss#orchagent: :- setHostIntfsOperStatus: Failed to set operation status DOWN to host interface
Oct  4 13:15:45.654773 r-bulldog-04 ERR swss#orchagent: :- updatePortOperStatus: Failed to set host interface  operational status down
Oct  4 13:15:45.654773 r-bulldog-04 WARNING swss#orchagent: :- flushFDBEntries: Couldn't flush FDB. Bridge port OID: 0x0 bvid:0,

- How I verified it
Run UT, run manual port breakout tests.

Signed-off-by: Stepan Blyschak <[email protected]>
…buffer pools and profiles (sonic-net#2498)

Signed-off-by: Stephen Sun <[email protected]>

What I did
Originally, it was assumed that the names of all buffer pools follow the community buffer pool naming convention ({ingress|egress}_{lossless|lossy}_pool). The heuristic algorithm to identify buffer pools and profiles was designed based on that assumption.
However, some users define buffer pool names in other ways, which breaks the logic in the Lua plugin and causes degradation:

- the sizes of those pools cannot be generated
- the additional reserved memory for lossy PGs cannot be calculated

In this PR, the logic is improved to tolerate such names.
Signed-off-by: Stephen Sun [email protected]

How I verified it
Manually and regression test.
It has been covered by regression test and vs test. No new test case is required.

Details if related
1. Separate the buffer pools into two tables according to the direction.
2. Iterate over all the profiles, generating and recording the type (lossless/lossy) for each ingress profile, which is identified by checking the pool it references.
3. Identify buffer profiles for lossy PGs by checking the type (lossless/lossy) generated in step 2 instead of the hardcoded name ingress_lossy_profile.
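
The steps above can be sketched in Python (the actual plugin is Lua). The function names are hypothetical, and the rule used to tell a lossless pool from a lossy one, the presence of an `xoff` field on the pool, is an assumption made for this sketch; the change itself may use a different criterion.

```python
def split_pools(pools):
    """Step 1: separate pools by direction, regardless of their names."""
    ingress = {n: p for n, p in pools.items() if p["type"] == "ingress"}
    egress = {n: p for n, p in pools.items() if p["type"] == "egress"}
    return ingress, egress


def classify_ingress_profiles(pools, profiles):
    """Step 2: record lossless/lossy per ingress profile via its pool."""
    ingress, _ = split_pools(pools)
    kinds = {}
    for name, profile in profiles.items():
        pool = ingress.get(profile["pool"])
        if pool is not None:
            # Assumption: an ingress pool carrying "xoff" is a lossless pool.
            kinds[name] = "lossless" if "xoff" in pool else "lossy"
    return kinds


def lossy_pg_profiles(pools, profiles):
    """Step 3: pick lossy PG profiles by recorded type, not by a fixed name."""
    kinds = classify_ingress_profiles(pools, profiles)
    return sorted(name for name, kind in kinds.items() if kind == "lossy")
```

Because nothing depends on the pool or profile names, custom names such as `my_in_pool` are handled the same way as the community ones.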
@EdenGri EdenGri closed this Dec 7, 2022