Fixed test issues for dualtor topologies. #11921

Merged
21 commits merged on Apr 17, 2024

@vivekverma-arista vivekverma-arista commented Mar 7, 2024

Description of PR

Summary: Fixed test issue in dualtor topology.
Fixes # (issue): https://github.com/aristanetworks/sonic-qual.msft/issues/69

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205
  • 202305
  • 202311

Approach

What is the motivation for this PR?

  1. In some test cases, packets go to the unselected ToR when links are active-active, or are forwarded by the unselected ToR when links are active-standby.
  2. Some tests verify queue counters, but this check fails for UC1 in the presence of active-active links because UC1 always has gRPC traffic flowing through it.

How did you do it?

  1. The first issue is addressed for active-standby links by using the following fixtures wherever they were missing (commit 1):
    a) toggle_all_simulator_ports_to_rand_selected_tor
    b) toggle_all_simulator_ports_to_enum_rand_one_per_hwsku_frontend_host_m

    For active-active links, new fixtures are introduced (commits 3 and 4):
    a) setup_standby_ports_on_rand_unselected_tor
    b) setup_standby_ports_on_non_enum_rand_one_per_hwsku_frontend_host_m
    c) setup_standby_ports_on_rand_unselected_tor_unconditionally
    d) setup_standby_ports_on_non_enum_rand_one_per_hwsku_frontend_host_m_unconditionally

  2. The second issue is addressed by skipping the check for UC1 when active-active links are present (commit 2).
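
The UC1 exemption in item 2 can be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper name `queues_to_verify`, the queue-name strings, and the boolean flag are hypothetical stand-ins for whatever the tests actually use.

```python
def queues_to_verify(queue_names, has_active_active_ports):
    """Return the queues whose counters should be checked.

    Hypothetical helper: UC1 is left out when active-active links exist,
    because gRPC traffic always flows through it in that setup.
    """
    if not has_active_active_ports:
        return list(queue_names)
    return [name for name in queue_names if name != "UC1"]


if __name__ == "__main__":
    queues = ["UC0", "UC1", "UC2"]
    print(queues_to_verify(queues, has_active_active_ports=True))
    print(queues_to_verify(queues, has_active_active_ports=False))
```

Filtering the queue list up front keeps the counter assertions themselves unchanged for every other queue.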

How did you verify/test it?

Verified on Arista 7260 device using dualtor/dualtor-aa topology.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Used `toggle_all_simulator_ports_to_enum_rand_one_per_hwsku_frontend_host_m` to make all links on the randomly selected ToR active.
@mssonicbld

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/bgp/test_bgp_queue.py:4:1: F401 'tests.common.dualtor.dual_tor_common.active_active_ports' imported but unused
tests/bgp/test_bgp_queue.py:32:84: F811 redefinition of unused 'active_active_ports' from line 4
tests/common/dualtor/dual_tor_utils.py:1733:1: E302 expected 2 blank lines, found 1
tests/common/dualtor/dual_tor_utils.py:1741:1: E302 expected 2 blank lines, found 1
tests/common/dualtor/dual_tor_utils.py:1749:1: E302 expected 2 blank lines, found 1
tests/common/dualtor/dual_tor_utils.py:1750:72: F811 redefinition of unused 'active_active_ports' from line 38
tests/common/dualtor/dual_tor_utils.py:1750:91: E231 missing whitespace after ','
tests/common/dualtor/dual_tor_utils.py:1750:121: E501 line too long (133 > 120 characters)
tests/common/dualtor/dual_tor_utils.py:1761:1: E302 expected 2 blank lines, found 1
tests/common/dualtor/dual_tor_utils.py:1768:1: E302 expected 2 blank lines, found 1
tests/common/dualtor/dual_tor_utils.py:1769:88: F811 redefinition of unused 'active_active_ports' from line 38
...
[truncated extra lines, please run pre-commit locally to view full check results]

To run the pre-commit checks locally, follow the steps below:

  1. Ensure that the default python is python3. In the sonic-mgmt docker container, the default python is python2. You can run
    the check by activating the python3 virtual environment in the sonic-mgmt docker container, or outside of the sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
    sudo pip install pre-commit
  3. Go to the repository root folder.
  4. Install the pre-commit hooks:
    pre-commit install
  5. Use pre-commit to check staged files:
    pre-commit
  6. Alternatively, check committed files using:
    pre-commit run --from-ref <commit_id> --to-ref <commit_id>

The test case dualtor/test_tor_ecn.py::test_ecn_during_encap_on_standby is supposed to run on the standby ToR. However, with active-active links the test does nothing to make the randomly selected ToR standby, because the fixture `toggle_all_simulator_ports_to_rand_unselected_tor` only works for active-standby links.

The proposed solution is to move the fixture `setup_active_active_ports` from dualtor/test_tor_ecn.py, rename it to `setup_standby_ports_on_rand_selected_tor` and let both the tests use it.
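
What such a fixture has to arrange can be modelled in a few lines. This is a sketch under assumptions, not the fixture itself: the input shape (a dict from port name to cable type), the ToR labels, and the helper name `standby_plan` are all hypothetical, and the real fixture would apply the resulting state on the devices.

```python
def standby_plan(mux_cable_types, selected_tor, tors=("upper_tor", "lower_tor")):
    """Compute the desired mux state per ToR for active-active ports.

    mux_cable_types is a hypothetical dict mapping a port name to either
    "active-active" or "active-standby". Only active-active ports are
    included, since the existing toggle fixtures already cover
    active-standby ports. The selected ToR is marked standby for each of
    those ports so that the test really runs against a standby ToR.
    """
    aa_ports = sorted(
        port for port, cable in mux_cable_types.items() if cable == "active-active"
    )
    plan = {}
    for tor in tors:
        state = "standby" if tor == selected_tor else "active"
        plan[tor] = {port: state for port in aa_ports}
    return plan


if __name__ == "__main__":
    cables = {"Ethernet4": "active-active", "Ethernet8": "active-standby"}
    print(standby_plan(cables, selected_tor="upper_tor"))
```
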
UC1 always has gRPC traffic flowing through it if active-active links exist (dualtor-aa/dualtor-mixed). Therefore, the queue counter check for UC1 should be skipped in the following tests when active-active links exist:

1. bgp/test_bgp_queue.py
2. dualtor/test_tor_ecn.py
The traffic from active-active mux ports is ECMPed twice: first on the NIC to choose the ToR, then on the ToR to choose the uplinks. The NIC ECMP is not within the test scope, and we also cannot guarantee that the traffic is evenly distributed among all the uplinks.
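
The two hashing stages can be illustrated with a toy simulation. It is only a sketch: real NICs and ToRs use their own hash functions and fields, and the flow strings, the ToR labels, and the uplink names below are made up for the example.

```python
import hashlib
from collections import Counter


def pick(key, salt, choices):
    """Deterministically hash a flow key onto one of the choices."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return choices[int.from_bytes(digest[:4], "big") % len(choices)]


def two_stage_ecmp(flows, tors, uplinks):
    """Count flows per (ToR, uplink) after NIC-level then ToR-level ECMP."""
    counts = Counter()
    for flow in flows:
        tor = pick(flow, "nic", tors)      # stage 1: the NIC chooses the ToR
        uplink = pick(flow, tor, uplinks)  # stage 2: the ToR chooses the uplink
        counts[(tor, uplink)] += 1
    return counts


if __name__ == "__main__":
    flows = [f"10.0.0.{i}->192.168.0.1" for i in range(64)]
    counts = two_stage_ecmp(
        flows, ["upper_tor", "lower_tor"], ["PortChannel1", "PortChannel2"]
    )
    for key, value in sorted(counts.items()):
        print(key, value)
```

Passing a single-element `tors` list models the situation where only one ToR receives the traffic, so all flows are counted on that ToR's uplinks.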

The proposed fix is to configure active-active ports to work in active-standby mode for the following tests:

    1. dhcp_relay/test_dhcp_relay.py
    2. dhcp_relay/test_dhcpv6_relay.py
    3. drop_packets/test_configurable_drop_counters.py
    4. pfcwd/test_pfcwd_function.py
    5. route/test_route_flap.py
    6. snmp/test_snmp_fdb.py
    7. everflow/test_everflow_testbed.py
    8. everflow/test_everflow_ipv6.py
@StormLiangMS StormLiangMS requested a review from lolyu March 8, 2024 05:58
@mssonicbld

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/qos/test_tunnel_qos_remap.py:245:9: E128 continuation line under-indented for visual indent
tests/qos/test_tunnel_qos_remap.py:354:9: E128 continuation line under-indented for visual indent
tests/qos/test_tunnel_qos_remap.py:432:9: E128 continuation line under-indented for visual indent

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

@lolyu lolyu left a comment

LGTM

@StormLiangMS

Hi @vivekverma-arista, could you help validate these changes on T0/T1 to avoid regressions? Please also verify with the 202305 image.

@mssonicbld

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing tests/dhcp_relay/test_dhcpv6_relay.py

check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Passed
flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Apr 17, 2024
@mssonicbld

Cherry-pick PR to 202305: #12488

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Apr 17, 2024
@mssonicbld

Cherry-pick PR to 202311: #12489

mssonicbld pushed a commit that referenced this pull request Apr 17, 2024
mssonicbld pushed a commit that referenced this pull request Apr 17, 2024
@vivekverma-arista vivekverma-arista deleted the dualtor-rand-dut-fix branch April 18, 2024 05:59
StormLiangMS pushed a commit that referenced this pull request Apr 26, 2024
Test cases in drop_packets/test_configurable_drop_counters.py fail in dualtor-aa because some of the traffic goes to the unselected ToR.

How did you do it?
The approach is similar to the one taken in #11921.

How did you verify/test it?
Verified on Arista 7260 and Arista 7050 with both dualtor and dualtor-aa topology.
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request May 16, 2024
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request May 16, 2024
mssonicbld pushed a commit that referenced this pull request May 16, 2024
mssonicbld pushed a commit that referenced this pull request May 16, 2024