[vlan] multiple test cases fail #8111

Closed
theasianpianist opened this issue Jul 7, 2021 · 9 comments
@theasianpianist
Contributor

theasianpianist commented Jul 7, 2021

ptfadapter = <tests.common.plugins.ptfadapter.ptfadapter.PtfTestAdapter testMethod=runTest>
duthosts = <tests.common.devices.duthosts.DutHosts object at 0x7f6988380610>
rand_one_dut_hostname = 'str2-7050cx3-acs-01'
ptfhost = <tests.common.devices.ptf.PTFHost object at 0x7f698653df50>
vlan_ports_list = [{'dev': 'PortChannel0001', 'permit_vlanid': {100: {'peer_ip': '192.168.100.26', 'remote_ip': '100.1.1.26'}, 200: {'pe...mote_ip': '100.1.1.8'}, 200: {'peer_ip': '192.168.200.8', 'remote_ip': '200.1.1.8'}}, 'port_index': [24], 'pvid': 200}]
vlan_intfs_list = [{'ip': '192.168.100.1/24', 'vlan_id': 100}, {'ip': '192.168.200.1/24', 'vlan_id': 200}]
cfg_facts = {'ACL_TABLE': {'DATAACL': {'policy_desc': u'DATAACL', 'ports': [u'PortChannel0001', u'PortChannel0002', u'PortChannel0..._300m_profile]'}}, 'Ethernet108': {'3-4': {'profile': u'[BUFFER_PROFILE|pg_lossless_100000_300m_profile]'}}, ...}, ...}

    @pytest.fixture(scope="module", autouse=True)
    def setup_vlan(ptfadapter, duthosts, rand_one_dut_hostname, ptfhost, vlan_ports_list, vlan_intfs_list, cfg_facts):
        duthost = duthosts[rand_one_dut_hostname]
        # --------------------- Setup -----------------------
        try:
            # Generate vlan info
            portchannel_interfaces = cfg_facts.get('PORTCHANNEL_INTERFACE', {})
    
            logger.info("Shutdown lags, flush IP addresses")
            for portchannel, ips in portchannel_interfaces.items():
                duthost.command('config interface shutdown {}'.format(portchannel))
                for ip in ips:
                    duthost.command('config interface ip remove {} {}'.format(portchannel, ip))
    
            # Wait some time for route, neighbor, next hop groups to be removed,
            # otherwise PortChannel RIFs are still referenced and won't be removed
            time.sleep(90)
    
            logger.info("Add vlans, assign IPs")
            for vlan in vlan_intfs_list:
                duthost.command('config vlan add {}'.format(vlan['vlan_id']))
                duthost.command("config interface ip add Vlan{} {}".format(vlan['vlan_id'], vlan['ip'].upper()))
    
            # Delete untagged vlans from interfaces to avoid error message
            # when adding untagged vlan to interface that already have one
            if '201911' not in duthost.os_version:
                logger.info("Delete untagged vlans from interfaces")
                for vlan_port in vlan_ports_list:
                    vlan_members = cfg_facts.get('VLAN_MEMBER', {})
                    vlan_name, vid = vlan_members.keys()[0], vlan_members.keys()[0].replace("Vlan", '')
                    try:
                        if vlan_members[vlan_name][vlan_port['dev']]['tagging_mode'] == 'untagged':
                            duthost.command("config vlan member del {} {}".format(vid, vlan_port['dev']))
                    except KeyError:
                        continue
    
            logger.info("Add members to Vlans")
            for vlan_port in vlan_ports_list:
                for permit_vlanid in vlan_port['permit_vlanid'].keys():
                    duthost.command('config vlan member add {tagged} {id} {port}'.format(
                        tagged=('--untagged' if vlan_port['pvid'] == permit_vlanid else ''),
                        id=permit_vlanid,
>                       port=vlan_port['dev']
                    ))

cfg_facts  = {'ACL_TABLE': {'DATAACL': {'policy_desc': u'DATAACL', 'ports': [u'PortChannel0001', u'PortChannel0002', u'PortChannel0..._300m_profile]'}}, 'Ethernet108': {'3-4': {'profile': u'[BUFFER_PROFILE|pg_lossless_100000_300m_profile]'}}, ...}, ...}
duthost    = <MultiAsicSonicHost> str2-7050cx3-acs-01
duthosts   = <tests.common.devices.duthosts.DutHosts object at 0x7f6988380610>
ip         = '10.0.0.62/31'
ips        = {'10.0.0.62/31': {}, 'FC00::7D/126': {}}
permit_vlanid = 200
portchannel = 'PortChannel0004'
portchannel_interfaces = {'PortChannel0001': {'10.0.0.56/31': {}, 'FC00::71/126': {}}, 'PortChannel0002': {'10.0.0.58/31': {}, 'FC00::75/126': ...ortChannel0003': {'10.0.0.60/31': {}, 'FC00::79/126': {}}, 'PortChannel0004': {'10.0.0.62/31': {}, 'FC00::7D/126': {}}}
ptfadapter = <tests.common.plugins.ptfadapter.ptfadapter.PtfTestAdapter testMethod=runTest>
ptfhost    = <tests.common.devices.ptf.PTFHost object at 0x7f698653df50>
rand_one_dut_hostname = 'str2-7050cx3-acs-01'
vid        = '1000'
vlan       = {'ip': '192.168.200.1/24', 'vlan_id': 200}
vlan_intfs_list = [{'ip': '192.168.100.1/24', 'vlan_id': 100}, {'ip': '192.168.200.1/24', 'vlan_id': 200}]
vlan_members = {'Vlan1000': {'Ethernet12': {'tagging_mode': u'untagged'}, 'Ethernet16': {'tagging_mode': u'untagged'}, 'Ethernet20': {'tagging_mode': u'untagged'}, 'Ethernet24': {'tagging_mode': u'untagged'}, ...}}
vlan_name  = 'Vlan1000'
vlan_port  = {'dev': 'PortChannel0001', 'permit_vlanid': {100: {'peer_ip': '192.168.100.26', 'remote_ip': '100.1.1.26'}, 200: {'peer_ip': '192.168.200.26', 'remote_ip': '200.1.1.26'}}, 'port_index': [28], 'pvid': 100}
vlan_ports_list = [{'dev': 'PortChannel0001', 'permit_vlanid': {100: {'peer_ip': '192.168.100.26', 'remote_ip': '100.1.1.26'}, 200: {'pe...mote_ip': '100.1.1.8'}, 200: {'peer_ip': '192.168.200.8', 'remote_ip': '200.1.1.8'}}, 'port_index': [24], 'pvid': 200}]

vlan/test_vlan.py:143: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
common/devices/multi_asic.py:89: in _run_on_asics
    return getattr(self.sonichost, self.multi_asic_attr)(*module_args, **complex_args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <SonicHost> str2-7050cx3-acs-01
module_args = ('config vlan member add  200 PortChannel0001',)
complex_args = {}, previous_frame = <frame object at 0x7f69840e0a50>
filename = '/var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices/multi_asic.py'
line_number = 89, function_name = '_run_on_asics'
lines = ['            return getattr(self.sonichost, self.multi_asic_attr)(*module_args, **complex_args)\n']
index = 0, verbose = True, module_ignore_errors = False, module_async = False

    def _run(self, *module_args, **complex_args):
    
        previous_frame = inspect.currentframe().f_back
        filename, line_number, function_name, lines, index = inspect.getframeinfo(previous_frame)
    
        verbose = complex_args.pop('verbose', True)
    
        if verbose:
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{}, args={}, kwargs={}"\
                .format(filename, function_name, line_number, self.hostname,
                        self.module_name, json.dumps(module_args), json.dumps(complex_args)))
        else:
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{} executing..."\
                .format(filename, function_name, line_number, self.hostname, self.module_name))
    
        module_ignore_errors = complex_args.pop('module_ignore_errors', False)
        module_async = complex_args.pop('module_async', False)
    
        if module_async:
            def run_module(module_args, complex_args):
                return self.module(*module_args, **complex_args)[self.hostname]
            pool = ThreadPool()
            result = pool.apply_async(run_module, (module_args, complex_args))
            return pool, result
    
        res = self.module(*module_args, **complex_args)[self.hostname]
    
        if verbose:
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{} Result => {}"\
                .format(filename, function_name, line_number, self.hostname, self.module_name, json.dumps(res)))
        else:
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{} done, is_failed={}, rc={}"\
                .format(filename, function_name, line_number, self.hostname, self.module_name, \
                        res.is_failed, res.get('rc', None)))
    
        if (res.is_failed or 'exception' in res) and not module_ignore_errors:
>           raise RunAnsibleModuleFail("run module {} failed".format(self.module_name), res)
E           RunAnsibleModuleFail: run module command failed, Ansible Results =>
E           {
E               "changed": true, 
E               "cmd": [
E                   "config", 
E                   "vlan", 
E                   "member", 
E                   "add", 
E                   "200", 
E                   "PortChannel0001"
E               ], 
E               "delta": "0:00:00.542122", 
E               "end": "2021-07-13 01:00:09.279728", 
E               "failed": true, 
E               "invocation": {
E                   "module_args": {
E                       "_raw_params": "config vlan member add  200 PortChannel0001", 
E                       "_uses_shell": false, 
E                       "argv": null, 
E                       "chdir": null, 
E                       "creates": null, 
E                       "executable": null, 
E                       "removes": null, 
E                       "stdin": null, 
E                       "stdin_add_newline": true, 
E                       "strip_empty_ends": true, 
E                       "warn": true
E                   }
E               }, 
E               "msg": "non-zero return code", 
E               "rc": 2, 
E               "start": "2021-07-13 01:00:08.737606", 
E               "stderr": "Usage: config vlan member add [OPTIONS] <vid> port\nTry \"config vlan member add -h\" for help.\n\nError: PortChannel0001 is a router interface!", 
E               "stderr_lines": [
E                   "Usage: config vlan member add [OPTIONS] <vid> port", 
E                   "Try \"config vlan member add -h\" for help.", 
E                   "", 
E                   "Error: PortChannel0001 is a router interface!"
E               ], 
E               "stdout": "", 
E               "stdout_lines": []
E           }

complex_args = {}
filename   = '/var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices/multi_asic.py'
function_name = '_run_on_asics'
index      = 0
line_number = 89
lines      = ['            return getattr(self.sonichost, self.multi_asic_attr)(*module_args, **complex_args)\n']
module_args = ('config vlan member add  200 PortChannel0001',)
module_async = False
module_ignore_errors = False
previous_frame = <frame object at 0x7f69840e0a50>
res        = {'stderr_lines': [u'Usage: config vlan member add [OPTIONS] <vid> port', u'Try...: [], u'start': u'2021-07-13 01:00:08.737606', u'msg': u'non-zero return code'}
self       = <SonicHost> str2-7050cx3-acs-01
verbose    = True

common/devices/base.py:89: RunAnsibleModuleFail
@theasianpianist theasianpianist added Dual ToR Platform ♊ Issues found on dual ToR platforms Issue for 202012 labels Jul 7, 2021
@theasianpianist theasianpianist self-assigned this Jul 7, 2021
@akokhan
Contributor

akokhan commented Jul 8, 2021

We see the same issue on SONiC master - VLAN TCs are failing.

This is caused by the IP add/remove validation added by sonic-net/sonic-utilities#1414. validate_ip_mask() (https://github.com/Azure/sonic-utilities/blob/888701b67fd4f1cc5b9da534a360048f93f263f4/config/main.py#L807) not only validates the user-provided IP but can also modify it.

E.g., "FC00::71/126" will be converted to "fc00::71/126", "10.10.10.002/24" to "10.10.10.2/24".
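
The lowercasing can be reproduced with Python's standard `ipaddress` module. This is only a minimal sketch of the effect, not the actual `validate_ip_mask()` code; `normalize` is a hypothetical helper name, and the sketch assumes Python 3.

```python
import ipaddress


def normalize(ip_with_mask):
    """Return the canonical text form produced by ipaddress (hypothetical helper)."""
    # ip_interface keeps the host bits of the address, unlike ip_network.
    return str(ipaddress.ip_interface(ip_with_mask))


original = "FC00::71/126"
normalized = normalize(original)

# The parsed value is the same address, but the text no longer matches the input.
print(original, "->", normalized)
```

The address is semantically unchanged, but anything that compares the two strings byte for byte will treat them as different.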

As a result, sudo config interface ip remove PortChannel0001 FC00::71/126 fails: the Redis key was created from config_db.json as PORTCHANNEL_INTERFACE|PortChannel0001|FC00::71/126, but the command tries to access PORTCHANNEL_INTERFACE|PortChannel0001|fc00::71/126 (the config command converts FC to fc).
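
The key mismatch can be illustrated with a plain dictionary standing in for the database. This is a sketch under assumptions: `config_db` and `make_key` are hypothetical stand-ins, not the real CONFIG_DB client API.

```python
import ipaddress

# Stand-in for CONFIG_DB: the key is stored exactly as written in config_db.json.
config_db = {"PORTCHANNEL_INTERFACE|PortChannel0001|FC00::71/126": {}}


def make_key(table, name, ip):
    """Build a '|'-separated key like the ones quoted above (hypothetical helper)."""
    return "|".join((table, name, ip))


user_ip = "FC00::71/126"
# What the remove path ends up using after the address has been normalized.
normalized_ip = str(ipaddress.ip_interface(user_ip))

lookup_key = make_key("PORTCHANNEL_INTERFACE", "PortChannel0001", normalized_ip)
found = lookup_key in config_db
print(lookup_key, "found" if found else "not found")
```

Even though the user typed the address exactly as it is stored, the lookup misses because the key is built from the rewritten text.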

I'm not sure what the right way to fix this is.
I've created the following PR, which still may not work for all cases: sonic-net/sonic-utilities#1709

We should probably update validate_ip_mask() so that it only validates the IP and does not modify it on create/remove.

Another option is to revert the original PR for now.

@d-dashkov, please comment.

@akokhan
Contributor

akokhan commented Jul 8, 2021

@lguohan, @prsunny, @jleveque, how would you suggest we proceed?

@d-dashkov
Contributor

@akokhan The IP address transformation was added to resolve issue #6776. validate_ip_mask() is supposed to check the mask and the IP, but it turned out that ipaddress.ip_address() also corrects inputs such as "10.10.10.002/24" to "10.10.10.2/24". If this causes problems, I propose changing validate_ip_mask() slightly so that it returns 0 instead of changing the IP.

@akokhan
Contributor

akokhan commented Jul 9, 2021

@d-dashkov, thanks for providing details. I believe that a function intended to validate input parameters should not modify them. In this case the IP addresses are used as part of Redis keys, so the impact can be unexpected, as with the VLAN TC failures. So to me it totally makes sense to modify validate_ip_mask() to return 0 instead of changing the IP.
Could you fix it the proper way? Thanks.
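
A validate-only check along these lines might look like the sketch below. It is not the sonic-utilities implementation: `is_valid_ip_mask` is a hypothetical name, it returns a boolean rather than 0, and requiring an explicit mask is an assumption.

```python
import ipaddress


def is_valid_ip_mask(ip_with_mask):
    """Report whether the text is an address with a valid mask; never alter it.

    Hypothetical validate-only helper: the caller keeps using its original
    string, so keys built from it stay exactly as the user typed them.
    """
    # Assumption: an explicit "/mask" part is required.
    if "/" not in ip_with_mask:
        return False
    try:
        ipaddress.ip_interface(ip_with_mask)
    except ValueError:
        # Raised for a malformed address as well as an out-of-range mask.
        return False
    return True


user_ip = "FC00::71/126"
if is_valid_ip_mask(user_ip):
    # The original spelling is passed on unchanged.
    print("valid:", user_ip)
```

Because the function only answers yes or no, the spelling of the address used in the Redis key is left to the caller.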

@akokhan
Contributor

akokhan commented Jul 13, 2021

The PR with the fix is under review: sonic-net/sonic-utilities#1709

@theasianpianist
Contributor Author

@akokhan could you share the failure messages you were seeing for the VLAN tests prior to the fix? Want to check if it has the same symptoms as the failures I am seeing.

@akokhan
Contributor

akokhan commented Jul 14, 2021

> @akokhan could you share the failure messages you were seeing for the VLAN tests prior to the fix? Want to check if it has the same symptoms as the failures I am seeing.

@theasianpianist, will do.
Meanwhile, could you please update the issue with a description? Thanks.

@theasianpianist
Contributor Author

@akokhan I believe I am seeing the same issue that is described here and in the linked PR, as I see IPv6 addresses not being deleted after running the deletion CLI command. Thanks for your assistance on this issue.

@akokhan
Contributor

akokhan commented Jul 15, 2021

@theasianpianist , thank you for providing details. I see the issue with exactly the same logs.

The problem is that the IPv6 address was added through config_db.json as FC00::71/126, and when we remove it through the config command, the address is converted to fc00::71/126. So the RIF does not get removed. You may want to try the sonic-net/sonic-utilities#1709 fix.
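
One way to tolerate differently spelled addresses on removal is to compare the parsed values and then reuse the stored spelling. The sketch below only illustrates that idea and is not necessarily what sonic-utilities#1709 does; `find_stored_ip` is a hypothetical helper.

```python
import ipaddress


def find_stored_ip(stored_ips, user_ip):
    """Return the stored spelling that denotes the same address/mask, or None."""
    wanted = ipaddress.ip_interface(user_ip)
    for stored in stored_ips:
        try:
            if ipaddress.ip_interface(stored) == wanted:
                return stored
        except ValueError:
            # Skip stored entries that are not parsable addresses.
            continue
    return None


# Addresses as they appear under PortChannel0001 in the traceback above.
stored_ips = ["10.0.0.56/31", "FC00::71/126"]
match = find_stored_ip(stored_ips, "fc00::71/126")
print(match)
```

The returned string can then be used to build the key for deletion, so the entry loaded from config_db.json is found regardless of letter case.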

@theasianpianist theasianpianist removed the Dual ToR Platform ♊ Issues found on dual ToR platforms label Jul 21, 2021
@yxieca yxieca closed this as completed Jul 23, 2021