Fix PFC watchdog not getting lossless TC #876
Merged
Signed-off-by: Wenda Ni <[email protected]>
lguohan reviewed May 8, 2019
renukamanavalan pushed a commit that referenced this pull request Oct 14, 2019
* Allow PFC watchdog to retry start on port
  Signed-off-by: Wenda Ni <[email protected]>
* Specify the qos mapping order in doTask() to avoid retry
  Signed-off-by: Wenda Ni <[email protected]>
* Remove debugging symbols
  Signed-off-by: Wenda Ni <[email protected]>
* Reduce log level to NOTICE for empty lossless TC on a port
  Signed-off-by: Wenda Ni <[email protected]>
yxieca pushed a commit that referenced this pull request Nov 1, 2019
* Allow PFC watchdog to retry start on port
  Signed-off-by: Wenda Ni <[email protected]>
* Specify the qos mapping order in doTask() to avoid retry
  Signed-off-by: Wenda Ni <[email protected]>
* Remove debugging symbols
  Signed-off-by: Wenda Ni <[email protected]>
* Reduce log level to NOTICE for empty lossless TC on a port
  Signed-off-by: Wenda Ni <[email protected]>
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
Signed-off-by: Nazarii Hnydyn <[email protected]>
oleksandrivantsiv pushed a commit to oleksandrivantsiv/sonic-swss that referenced this pull request Mar 1, 2023
Signed-off-by: Wenda Ni <[email protected]>
What I did
Fix the PFC watchdog failing to obtain the lossless TCs of a port. Without this fix, the PFC watchdog does not start properly when it is enabled by default. Until now, a patch that hard-codes the TCs was used as a workaround; with this fix, that patch can be removed.
More importantly, this paves the way for turning PFC on or off per priority at run time, which may require the PFC watchdog to be turned on or off accordingly.
This PR is an improvement over #795.
Two techniques are used:
1) Allow PFC watchdog startWdOnPort to retry when the lossless TC set of the port is empty.
2) Apply the port qos map last in QosOrch::doTask(), because it must wait for all qos mapping profiles to be ready before it is rolled out to a port.
In addition, log an error and return if no lossless TC is found when registering pfcwd stats on a port.
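The retry behaviour of technique 1) can be sketched as follows. This is a minimal Python model, not the orchagent code (which is C++): the function names `start_wd_on_port` and `do_task`, the `pending` dictionary, and the `get_lossless_tcs` callback are all hypothetical and only mirror the idea that a failed start leaves the task queued for a later pass.

```python
# Illustrative model only; names and data structures here are assumptions,
# not the actual sonic-swss implementation.
from typing import Callable, Dict, Set


def start_wd_on_port(port: str, lossless_tcs: Set[int]) -> bool:
    """Return False when the port has no lossless TC yet, so the caller can retry."""
    if not lossless_tcs:
        print(f"No lossless TC found on port {port}")
        return False
    print(f"Started PFC Watchdog on port {port}")
    return True


def do_task(pending: Dict[str, dict],
            get_lossless_tcs: Callable[[str], Set[int]]) -> None:
    """Process pending SET tasks; tasks that fail stay queued for the next call."""
    for port in list(pending):
        if start_wd_on_port(port, get_lossless_tcs(port)):
            del pending[port]  # task done, drop it from the queue
        # otherwise the task stays in `pending` and is retried later


# Usage: the first pass fails because the qos map is not applied yet,
# the second pass succeeds once the lossless TCs are known.
port_tcs: Dict[str, Set[int]] = {}
pending: Dict[str, dict] = {"Ethernet0": {}}
do_task(pending, lambda p: port_tcs.get(p, set()))
port_tcs["Ethernet0"] = {3, 4}
do_task(pending, lambda p: port_tcs.get(p, set()))
```

The point of the design is that an empty lossless TC set is treated as "not ready yet" rather than as a permanent failure.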
Why I did it
How I verified it
Tested on a brcm DUT with both cold and warm reboot.
Also tested with the pfcwd warm-reboot test sonic-net/sonic-mgmt#834.
Details if related
With technique 1), the retry logic is visible in both cold and warm reboot: the qos map cannot yet be applied to the ports because a qos map profile is still missing, and the PFC watchdog relies on the pfc_mask info that is written when the qos map is installed on a port.
May 3 21:32:49.098087 str--acs-1 ERR swss#orchagent: :- registerInWdDb: No lossless TC found on port Ethernet0
May 3 21:32:49.098087 str--acs-1 ERR swss#orchagent: :- createEntry: Failed to start PFC Watchdog on port Ethernet0
May 3 21:32:49.098087 str--acs-1 ERR swss#orchagent: :- doTask: Failed to process PFC watchdog SET task, retry it
May 3 21:32:49.098087 str--acs-1 ERR swss#orchagent: :- registerInWdDb: No lossless TC found on port Ethernet100
May 3 21:32:49.098087 str--acs-1 ERR swss#orchagent: :- createEntry: Failed to start PFC Watchdog on port Ethernet100
May 3 21:32:49.098087 str--acs-1 ERR swss#orchagent: :- doTask: Failed to process PFC watchdog SET task, retry it
May 3 21:32:49.098163 str--acs-1 ERR swss#orchagent: :- registerInWdDb: No lossless TC found on port Ethernet104
May 3 21:32:49.098163 str--acs-1 ERR swss#orchagent: :- createEntry: Failed to start PFC Watchdog on port Ethernet104
May 3 21:32:49.098163 str--acs-1 ERR swss#orchagent: :- doTask: Failed to process PFC watchdog SET task, retry it
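To illustrate why a missing pfc_mask leads to "No lossless TC found", here is a small Python sketch of deriving the lossless TCs from the mask. It assumes that pfc_mask is a bitmask with one bit per priority and that there are 8 priorities; the helper name `lossless_tcs_from_pfc_mask` is hypothetical and not taken from the orchagent code.

```python
# Illustrative only: assumes pfc_mask carries one bit per priority (bit i set
# means PFC is enabled on priority i) and that 8 priorities exist.
from typing import Set


def lossless_tcs_from_pfc_mask(pfc_mask: int, max_tc: int = 8) -> Set[int]:
    """Return the set of TCs whose bit is set in the mask."""
    return {tc for tc in range(max_tc) if pfc_mask & (1 << tc)}


# Before the qos map is installed the mask is still 0, so the set is empty
# and the watchdog start has to be retried.
print(lossless_tcs_from_pfc_mask(0))           # set()
# With PFC enabled on priorities 3 and 4 the mask is 0b00011000.
print(lossless_tcs_from_pfc_mask(0b00011000))  # {3, 4}
```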
With technique 2), we specify the config execution order in QosOrch so that the port qos mapping is processed last. With this ordering, no startWdOnPort retry is seen in the PFC watchdog in either cold or warm reboot. In the warm-reboot case, the PFC watchdog port start finishes in iteration 0 of doTask(), whereas without 2) it can only finish in iteration 1.
May 4 00:21:10.449033 str--acs-1 NOTICE swss#orchagent: :- warmRestoreAndSyncUp: OrchDaemon::warmRestoreAndSyncUp: doTask iteration #0
May 4 00:21:11.820573 str--acs-1 NOTICE swss#orchagent: :- processWorkItem: Created [DSCP_TO_TC_MAP:AZURE]
May 4 00:21:11.821866 str--acs-1 NOTICE swss#orchagent: :- processWorkItem: Created [MAP_PFC_PRIORITY_TO_QUEUE:AZURE]
May 4 00:21:11.825710 str--acs-1 NOTICE swss#orchagent: :- handleSchedulerTable: Created [SCHEDULER:scheduler.0]
May 4 00:21:11.825913 str--acs-1 NOTICE swss#orchagent: :- handleSchedulerTable: Created [SCHEDULER:scheduler.1]
May 4 00:21:11.827262 str--acs-1 NOTICE swss#orchagent: :- processWorkItem: Created [TC_TO_PRIORITY_GROUP_MAP:AZURE]
May 4 00:21:11.828551 str--acs-1 NOTICE swss#orchagent: :- processWorkItem: Created [TC_TO_QUEUE_MAP:AZURE]
May 4 00:21:11.828966 str--acs-1 NOTICE swss#orchagent: :- addQosItem: Called create_wred() to create wred profile: oid:0x130000000009cd
May 4 00:21:11.828966 str--acs-1 NOTICE swss#orchagent: :- processWorkItem: Created [WRED_PROFILE:AZURE_LOSSLESS]
May 4 00:21:11.853372 str--acs-1 NOTICE swss#orchagent: :- handlePortQosMapTable: Applied QoS maps to ports
May 4 00:21:14.483174 str--acs-1 NOTICE swss#orchagent: :- createEntry: Started PFC Watchdog on port Ethernet8
May 4 00:21:14.483471 str--acs-1 ERR swss#orchagent: :- doTask: Succeeded PFC watchdog SET task
May 4 00:21:14.487775 str--acs-1 NOTICE swss#orchagent: :- createBindAclTableGroup: Called create_acl_table_group() to create egress ACL table group: oid:0xb000000000a2b
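The ordering idea of technique 2) can be sketched in a few lines. This is a hedged Python model rather than the QosOrch code: the helper `ordered_tables` is hypothetical, and the table name `PORT_QOS_MAP` is an assumption for the port qos map table (the other names are taken from the log above).

```python
# Illustrative only: process every qos profile table before the port qos map.
from typing import List

PORT_QOS_MAP = "PORT_QOS_MAP"  # assumed name of the port qos map table


def ordered_tables(pending_tables: List[str]) -> List[str]:
    """Stable sort that keeps the profile tables in order and moves the port qos map last."""
    # False sorts before True, so only PORT_QOS_MAP is pushed to the end.
    return sorted(pending_tables, key=lambda table: table == PORT_QOS_MAP)


# Usage: whatever order the tables arrive in, the port qos map is applied last.
tables = [PORT_QOS_MAP, "DSCP_TO_TC_MAP", "TC_TO_QUEUE_MAP", "WRED_PROFILE"]
for table in ordered_tables(tables):
    print("processing", table)
```

Because the sort is stable, the relative order of the profile tables is preserved; only the port qos map is deferred, which matches the log where "Applied QoS maps to ports" appears after all the "Created [...]" lines.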