Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_snmp_queue fails due to unavailable queue counters #8934

Closed
kartik-arista opened this issue Feb 28, 2023 · 7 comments
Closed

test_snmp_queue fails due to unavailable queue counters #8934

kartik-arista opened this issue Feb 28, 2023 · 7 comments

Comments

@kartik-arista
Copy link
Contributor

Description

snmp/test_snmp_queue.py fails deterministically due to missing 'queues' information in SNMP facts.

This was observed on a VOQ chassis platform but likely affects any non Cisco device.

Steps to reproduce the issue:

Describe the results you received:

run snmp/test_snmp_queue.py on any VOQ chassis device.

Describe the results you expected:

The tests should pass

Additional information you deem important:

Failed: port Ethernet39/1 does not have queue counters
duthosts = [<MultiAsicSonicHost str2-7804-lc5-1>, <MultiAsicSonicHost str2-7804-lc3-1>, <MultiAsicSonicHost str2-7804-lc7-1>, <MultiAsicSonicHost str2-7804-sup-1>]
enum_rand_one_per_hwsku_hostname = 'str2-7804-lc7-1'
localhost = <tests.common.devices.local.Localhost object at 0x7f87614397d0>
creds_all_duts = {'str2-7804-lc3-1': {'ad_domain': 'GME', 'ad_integration_enabled': True, 'ansible_become_pass': "{{ secret_group_vars[...vars['str']['ansible_become_pass'] }}", 'ansible_ssh_pas\
s': "{{ secret_group_vars['str']['ansible_ssh_pass'] }}", ...}}
collect_techsupport_all_duts = None

    def test_snmp_queues(duthosts, enum_rand_one_per_hwsku_hostname, localhost, creds_all_duts, collect_techsupport_all_duts):
        duthost = duthosts[enum_rand_one_per_hwsku_hostname]
        hostip = duthost.host.options['inventory_manager'].get_host(duthost.hostname).vars['ansible_host']

        snmp_facts = get_snmp_facts(localhost, host=hostip, version="v2c", community=creds_all_duts[duthost.hostname]["snmp_rocommunity"], wait=True)['ansible_facts']

        for k, v in snmp_facts['snmp_interfaces'].items():
            if "Ethernet" in v['description']:
                if not v.has_key('queues'):
>                   pytest.fail("port %s does not have queue counters" % v['name'])
E                   Failed: port Ethernet39/1 does not have queue counters

Investigating the code, it seems like this field comes from a private Cisco MIB (ansible/library/snmp_facts.py)

     # From Cisco private MIB (PFC and queue counters)
        self.cpfcIfRequests         = dp + "1.3.6.1.4.1.9.9.813.1.1.1.1" # + .ifindex
        self.cpfcIfIndications      = dp + "1.3.6.1.4.1.9.9.813.1.1.1.2" # + .ifindex
        self.requestsPerPriority    = dp + "1.3.6.1.4.1.9.9.813.1.2.1.2" # + .ifindex.prio
        self.indicationsPerPriority = dp + "1.3.6.1.4.1.9.9.813.1.2.1.3" # + .ifindex.prio
        self.csqIfQosGroupStats     = dp + "1.3.6.1.4.1.9.9.580.1.5.5.1.4" # + .ifindex.IfDirection.QueueID

which is then used to populate SNMP facts

   for varBinds in varTable:
        for oid, val in varBinds:
            current_oid = oid.prettyPrint()
            current_val = val.prettyPrint()
            if v.csqIfQosGroupStats in current_oid:
                ifIndex = int(current_oid.split('.')[-4])
                ifDirection = int(current_oid.split('.')[-3])
                queueId = int(current_oid.split('.')[-2])
                counterId = int(current_oid.split('.')[-1])
                results['snmp_interfaces'][ifIndex]['queues'][ifDirection][queueId][counterId] = current_val

which is not going to be present

@kartik-arista
Copy link
Contributor Author

This is currently noticed in 202205. I have not confirmed the behavior in master.

Please note #6744

which in turn is tied to 
sonic-net/sonic-swss#2432

These changes don't seem to be cherry picked into 202205, so the most pertinent short term question is whether all these changes need to be present in 202205 or not. If the answer is no, then there we need to decide whether it is appropriate to skip these tests unless the platform specifically supports the relevant MIB.

@kartik-arista
Copy link
Contributor Author

@vmittal-msft @arlakshm for viz.

@yxieca
Copy link
Collaborator

yxieca commented Mar 8, 2023

@kartik-arista, sonic-net/sonic-swss#2432 is contributed directly into 202205 branch. Are you saying this PR is the culprit or the fix?

Transferring this issue to sonic-buildimage repo as this issue seems to be an image issue.

@yxieca yxieca transferred this issue from sonic-net/sonic-mgmt Mar 8, 2023
@kartik-arista
Copy link
Contributor Author

@yxieca Yes - originallly it was not clear if this issue was test related or product function related, and if the change required was to skip the relevant tests (or cherrypick additional functionality into 202205).

Also adding @arlakshm for visibility since he was following up with @prsunny and others who may be aware of the background.

@rlhui
Copy link

rlhui commented Jul 12, 2023

Discussed in chassis subgroup that test needs to be enhanced for multi-asic

@rlhui rlhui transferred this issue from sonic-net/sonic-buildimage Jul 12, 2023
@rlhui rlhui added the Test gap label Jul 12, 2023
@kenneth-arista
Copy link
Contributor

kenneth-arista commented Jul 25, 2023

Discussed in chassis subgroup that test needs to be enhanced for multi-asic

Here's a code reference to show that the test does not account for the namespace on multi-ASIC platforms:

q_keys = redis_get_keys(duthost, "CONFIG_DB", "QUEUE|*")

@SuvarnaMeenakshi
Copy link
Contributor

#9115 should fix this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

7 participants