sonic-swss: Fix orchagent crash in generateQueueMapPerPort. #2552

Merged
merged 1 commit into from
Dec 6, 2022

skbarista
Contributor

  • generateQueueMap uses m_portList[port].m_queue_ids.size to allocate m_queueStates in FlexCounterQueueStates. However, m_portList[port].m_queue_ids.size is zero for system ports, so m_queueStates is empty and isQueueCounterEnabled crashes when it indexes into it for a system port. Since disabling voq counters is not supported yet, do not check isQueueCounterEnabled for voqs.
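To illustrate the failure mode and the guard, here is a minimal sketch. It is not the sonic-swss code: `QueueStates` and `selectQueues` are hypothetical, simplified stand-ins, and the `continue` after the guard is an assumption about what follows the condition. Only the condition `!voq && !queuesState.isQueueCounterEnabled(...)` is taken from the PR (it appears in the build log further down this thread).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for FlexCounterQueueStates.
class QueueStates {
public:
    explicit QueueStates(std::size_t queueCount) : m_queueStates(queueCount, false) {}

    void enable(std::size_t index) { m_queueStates.at(index) = true; }

    // Unchecked indexing, as in frame #2 of the backtrace: calling this on
    // an empty vector is undefined behavior.
    bool isQueueCounterEnabled(std::size_t index) const { return m_queueStates[index]; }

private:
    std::vector<bool> m_queueStates;
};

// Hypothetical helper: returns the queue indexes that would get a counter
// map entry. queueCount is the number of queues (or voqs) to walk.
std::vector<std::size_t> selectQueues(std::size_t queueCount,
                                      const QueueStates& states,
                                      bool voq) {
    std::vector<std::size_t> selected;
    for (std::size_t i = 0; i < queueCount; ++i) {
        // Guard from the PR: when voq is true, && short-circuits and the
        // (possibly empty) state vector is never indexed.
        if (!voq && !states.isQueueCounterEnabled(i)) {
            continue;  // assumed: skip queues whose counter is disabled
        }
        selected.push_back(i);
    }
    return selected;
}
```

With an empty `QueueStates` (the system port case) and `voq == true`, every voq index is selected without touching the state vector. With `voq == false`, only queues whose state is enabled are selected.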

What I did

Fix orchagent crash in voq systems with the following backtrace.

(gdb) bt
#0 0x000055cbe38f7d2d in std::_Bit_reference::operator bool (this=) at /usr/include/c++/10/bits/stl_bvector.h:87
#1 std::_Bit_const_iterator::operator* (this=) at /usr/include/c++/10/bits/stl_bvector.h:348
#2 std::vector<bool, std::allocator<bool> >::operator[] (__n=0, this=0x55cbe53688f0) at /usr/include/c++/10/bits/stl_bvector.h:918
#3 FlexCounterQueueStates::isQueueCounterEnabled (this=this@entry=0x55cbe53688f0, index=0) at flexcounterorch.cpp:422
#4 0x000055cbe37692ad in PortsOrch::generateQueueMapPerPort (this=0x55cbe5081360, port=..., queuesState=..., voq=true) at portsorch.cpp:6093
#5 0x000055cbe3769c9a in PortsOrch::generateQueueMap (this=this@entry=0x55cbe5081360, queuesStateVector=std::map with 39 elements = {...}) at portsorch.cpp:6048
#6 0x000055cbe38fb2c8 in FlexCounterOrch::doTask (this=0x55cbe50e9a30, consumer=...) at flexcounterorch.cpp:163
#7 0x000055cbe36dd66e in Consumer::drain (this=0x55cbe51bbe40) at orch.cpp:241
#8 Consumer::drain (this=0x55cbe51bbe40) at orch.cpp:238
#9 Consumer::execute (this=0x55cbe51bbe40) at orch.cpp:235
#10 0x000055cbe36ccc99 in OrchDaemon::start (this=this@entry=0x55cbe505f6b0) at orchdaemon.cpp:757
#11 0x000055cbe3654990 in main (argc=, argv=) at main.cpp:735

Why I did it

Fix orchagent crash in voq system

How I verified it

Verified that voq system boots up fine without any crash.

@skbarista skbarista requested a review from prsunny as a code owner November 29, 2022 19:13
@prsunny prsunny requested a review from arlakshm November 30, 2022 17:49
@rlhui rlhui requested a review from vmittal-msft November 30, 2022 17:59
@rlhui rlhui added the chassis label Nov 30, 2022
Contributor

@vmittal-msft vmittal-msft left a comment

Please verify queue counters on a voq system as requested. The changes look fine though.

skbarista commented Dec 1, 2022

counter.txt

@vmittal-msft I have attached a counter.txt file with output from queuestat -V. This system just has the inband control traffic and front panel control traffic running.

@prsunny prsunny merged commit d0419dc into sonic-net:master Dec 6, 2022
yxieca pushed a commit that referenced this pull request Dec 7, 2022
yxieca added a commit that referenced this pull request Dec 7, 2022
yxieca commented Dec 7, 2022

@skbarista this PR causes build failures after being included in the sonic-swss submodule. Please address the build failure. It is possible that the voq dependency was not cherry-picked.

yxieca commented Dec 7, 2022

Build failure log: https://dev.azure.com/mssonic/build/_build/results?buildId=185460&view=logs&j=d37bc48d-29a0-534f-a1dc-3d699deb17a6&t=933dfdc6-d074-504d-ee66-71743fd9ae4e

portsorch.cpp: In member function 'void PortsOrch::generateQueueMapPerPort(const swss::Port&, FlexCounterQueueStates&)':
portsorch.cpp:5996:18: error: 'voq' was not declared in this scope
             if (!voq && !queuesState.isQueueCounterEnabled(queueRealIndex))
                  ^~~

yxieca pushed a commit that referenced this pull request Jan 4, 2023
yxieca added a commit that referenced this pull request Jan 5, 2023