-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sonic Static Port Channel Support #1039
base: master
Are you sure you want to change the base?
Conversation
| 0.2 | 06/20/2022 | Kannan Selvaraj | Updated load-balance round-robin | | ||
| | | | show command display as STATIC | | ||
| | | | CONFIG_DB flag is changed from | | ||
| | | | static to on | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What consideration do you change "static" to "on"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
Took the Cisco CLI concept here.
on - Always port will be bundled into the group.
active - Wait for LACP protocol transaction to bundle the port into the group.
https://community.cisco.com/t5/switching/lacp-or-not-channel-group-1-mode-passive-active-or-on/td-p/772834
https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/interfaces/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_Interfaces_Configuration_Guide_7x/configuring_port_channels.pdf (page 18)
Thanks,
Kannan.S
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CLI command "config portchannel add PortChannel --static=true"
It is recommended to keep the original "static" naming in Rev 0.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure will change accordingly and update the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with chenkelly.
By default, there are not portchannels present in the system. Any newly created portchannel is Dynamic by default unless we explicitly distinguish using an option while portchannel creation.
The 'on' keyword is mainly used in Cisco CLI while adding a member port to portchannel. We may use the 'on' keyword for KLISH CLI; but in CLICK , we can continue with the '--static=true' option.
And also, please consider using the field-value pair "static":"true" in the CONFIG_DB entry "PORTCHANNEL|PortChannel1" instead of "on":"true".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure Madhukar will modify accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please consider using the --static=true for CLI static LAG.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure Chenkelly accepted
|
||
## 1.2 Design Overview | ||
### 1.2.1 Basic Approach | ||
A Static Port Channel has its member ports always available for traffic as long the links are up. All ports are mandated to have the same speed to become members of a Port Channel. Teamd uses round-robin for a Static Port Channel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid using round-robin as discussed. If there is a bug in teamd we should raise an issue with teamd. Both LACP and static should use same hashing mechanism. If required we can expose hashing mechanism through CLI to the user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Kannan,
The commit jpirko/libteam@deadb5b caused the issue of 100% cpu.
The revert commit jpirko/libteam@61efd6d has the details about the issue that you were mentioning in today's meeting.
You may check which libteam version is used in your testing and fix the above issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sudharsan,
Regarding exposing the hashing mechanism through CLI: the traffic hashing/distribution among the portchannel members can happen at 2 instances - (1) Kernel (2) Merchant Silicon.
Libteam gives provision to configure hashing at a kernel level.
But hashing/distribution at a silicon level is typically vendor specific. Achieving the same traffic distribution at kernel and silicon may not be possible as all the silicon may not have the exact match in the hashing algorithms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes will try loadbalance with fix, updated accordingly in the document
### 3.2.1 CONFIG DB | ||
To support the static mode for a Port Channel, the PORTCHANNEL table is modified to add a new key-value pair where the value is a "true" or a "false". | ||
``` | ||
"PORTCHANNEL|PortChannel0001": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change the yang model as required. If you need to impose restrictions between switching lacp vs static it has to be done through yang models too apart from CLI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Dgsudharsan,
Its already present in openconfig-if-aggregate.yang
typedef aggregation-type {
type enumeration {
enum LACP {
description "LAG managed by LACP";
}
enum STATIC {
description "Statically configured bundle / LAG";
}
}
description
"Type to define the lag-type, i.e., how the LAG is
defined and managed";
}
Thanks,
Kannan.S
S - selected, D - deselected, * - not synced | ||
No. Team Dev Protocol Ports | ||
----- --------------- ------------- -------------- | ||
0001 PortChannel0001 STATIC(A)(Up) Ethernet112(S) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically the 'Protocol' column field specifies protocol used for the portchannel.
For Dynamic portchannel, 'LACP' keyword is good.
For Static portchannel, 'NONE' keyword sounds more logical and would be inline with NOS vendor Cisco.
Note: I see that Arista uses 'Static' keyword though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@kselvaraj2 can you please add the code PRs into this HLD PR by referring to #806 ? Please help to address the comments from the reviewers. Thanks. |
@chenkelly @dgsudharsan @madhukar-kamarapu can you please help to approve this PR if you are ok? I will merge it after your approval. Thanks. |
@kselvaraj2 can you please sign the EasyCLA which is required to merge your PR? Also can you please address the comments? Not sure which one has been addressed and which one is still open. Thanks. |
@kselvaraj2 @zhangyanzhao what is the status of this pr? when do you expect to merge it? |
@kselvaraj2 @zhangyanzhao please provide status |
Why ? Static lag support changes in SWSS module sonic-net/SONiC#1039 Patch explanation 1. static lag supported with option roundrobin. 2. tlm_teamd -> teamdctl changes to handle json dump for static lag. 3. test cases -> updated UT:- Test cases 1 Create static port channel with static flag pass pass 2 verify static has option flag true or false pass pass 3 Add static member see the portchannel is up pass pass 4 verify teamd is created with round-robin option by default pass pass 5 Remove last portchannel member check port channel down pass pass 6 Remove portchannel member check port channel still up pass pass 7 verify teamdctl config dump pass pass 8 verify teamdctl state dump pass pass 9 shutdown the portchannel check the kernel state pass pass 10 no shutdown the portchannel check the kernel state pass pass 11 "Check the show output matches the review comment root@sonic:~# show inter port Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected, * - not synced No. Team Dev Protocol Ports ----- ------------- ----------- ----------------------------------------- 01 PortChannel01 LACP(A)(Up) Ethernet16(S) 02 PortChannel02 NONE(A)(Up) Ethernet48(S) Ethernet64(S) Ethernet32(S) root@sonic:~# 12 teamnl is set to roundrobin pass pass 13 save and reload and verify portchannel is up pass pass 14 "docker restart teamd teamd stopped swss stopped syncd stopped swss started syncd started teamd started" pass pass 15 warm-reboot fails even without any port channel config fail 16 verify teamd settles doesnt hog cpu with 100% cpu usage pass 17 "trying with static port channel config on non supported branches port channel will be configured as LACP." pass Not Supported Options 1. Min links and 2. fall back are not supported
Why ? Static lag support changes in SWSS module sonic-net/SONiC#1039 Patch explanation 1. static lag supported with option roundrobin. 2. tlm_teamd -> teamdctl changes to handle json dump for static lag. 3. test cases -> updated UT:- Test cases 1 Create static port channel with static flag pass pass 2 verify static has option flag true or false pass pass 3 Add static member see the portchannel is up pass pass 4 verify teamd is created with round-robin option by default pass pass 5 Remove last portchannel member check port channel down pass pass 6 Remove portchannel member check port channel still up pass pass 7 verify teamdctl config dump pass pass 8 verify teamdctl state dump pass pass 9 shutdown the portchannel check the kernel state pass pass 10 no shutdown the portchannel check the kernel state pass pass 11 "Check the show output matches the review comment root@sonic:~# show inter port Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected, * - not synced No. Team Dev Protocol Ports ----- ------------- ----------- ----------------------------------------- 01 PortChannel01 LACP(A)(Up) Ethernet16(S) 02 PortChannel02 NONE(A)(Up) Ethernet48(S) Ethernet64(S) Ethernet32(S) root@sonic:~# 12 teamnl is set to roundrobin pass pass 13 save and reload and verify portchannel is up pass pass 14 "docker restart teamd teamd stopped swss stopped syncd stopped swss started syncd started teamd started" pass pass 15 warm-reboot fails even without any port channel config fail 16 verify teamd settles doesnt hog cpu with 100% cpu usage pass 17 "trying with static port channel config on non supported branches port channel will be configured as LACP." pass Not Supported Options 1. Min links and 2. fall back are not supported
Why ? Static lag support changes in SWSS module sonic-net/SONiC#1039 Patch explanation 1. static lag supported with option roundrobin. 2. config and show command support 3. test cases -> updated CLI config config portchannel add PortChannel04 --static true root@sonic:~# show inter port Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected, * - not synced No. Team Dev Protocol Ports ----- ------------- ----------- ----------------------------------------- 01 PortChannel01 LACP(A)(Up) Ethernet16(S) 02 PortChannel02 NONE(A)(Up) Ethernet48(S) Ethernet64(S) Ethernet32(S) root@sonic:~# UT:- Test cases 1 Create static port channel with static flag pass pass 2 verify static has option flag true or false pass pass 3 Add static member see the portchannel is up pass pass 4 verify teamd is created with round-robin option by default pass pass 5 Remove last portchannel member check port channel down pass pass 6 Remove portchannel member check port channel still up pass pass 7 verify teamdctl config dump pass pass 8 verify teamdctl state dump pass pass 9 shutdown the portchannel check the kernel state pass pass 10 no shutdown the portchannel check the kernel state pass pass 11 "Check the show output matches the review comment root@sonic:~# show inter port Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected, * - not synced No. Team Dev Protocol Ports ----- ------------- ----------- ----------------------------------------- 01 PortChannel01 LACP(A)(Up) Ethernet16(S) 02 PortChannel02 NONE(A)(Up) Ethernet48(S) Ethernet64(S) Ethernet32(S) root@sonic:~# 12 teamnl is set to roundrobin pass pass 13 save and reload and verify portchannel is up pass pass 14 "docker restart teamd teamd stopped swss stopped syncd stopped swss started syncd started teamd started" pass pass syncd started teamd started" pass pass 15 warm-reboot fails even without any port channel config fail 16 verify teamd settles doesnt hog cpu with 100% cpu usage pass 17 "trying with static port channel config on non supported branches port channel will be configured as LACP." pass Not Supported Options 1. Min links and 2. fall back are not supported
This document provides design to Support Port Channel in Static mode in SONIC OS. Static Port Channel member ports are always available for traffic transmission and reception if the ports links are up. Port types and features are supported in the same way as SONiC LAGs today. MLAG should seamlessly work in Static Mode.