Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[swss] L2 Forwarding Enhancements #1716

Merged
merged 15 commits into from
Nov 20, 2021
Merged

[swss] L2 Forwarding Enhancements #1716

merged 15 commits into from
Nov 20, 2021

Conversation

dr412113
Copy link
Contributor

What I did

  • Invoke flush Fdb entries per port/vlan
  • Instead of maintaining the vlan-members inside Port structure, added a new portVlanmap for vlan members.
    /* Performance issue are seen during mac event processing with large number of vlans on a
  • port. Large data should not be added to port structure. */
  • created a portOidToName map to improve performace
  • delay lag delete until bridge port is deleted
  • delay bridge port removal until all asociated fdb entries are
    removed. delete bridge port from portsorch instead of fdborch

Why I did it

To improve FDB performance.

How I verified it

Verified all mac operations.

Details if related

This reverts commit 047841e.
- Invoke flush Fdb entries per port/vlan
- Instead of maintaining the vlan-members inside Port structure, added a new portVlanmap for vlan members.
 /* Performance issue are seen during mac event processing with large number of vlans on a
 * port. Large data should not be added to port structure. */
- created a portOidToName map to improve performace
- delay lag delete until bridge port is deleted
- delay bridge port removal until all asociated fdb entries are
removed. delete bridge port from portsorch instead of fdborch
@lguohan lguohan requested a review from prsunny April 24, 2021 19:52
@lguohan
Copy link
Contributor

lguohan commented May 12, 2021

@prsunny , can you take a look at this?

orchagent/port.h Outdated
* port. Large data should not be added to this
* structure.
*/
//vlan_members_t m_vlan_members;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the commented section. You can mention it in the PR description.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
auto itr = portOidToName.find(id);
if (itr == portOidToName.end())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

auto itr = portOidToName.find(id);
if (itr == portOidToName.end())
return false;
else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{ -> next line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
else
/* Cannot locate the VLAN */
if ((ret == true) && m_portVlanMember[port.m_alias].empty())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the logic changed here?. Just keep the previous one and use m_portVlanMember instead of port.m_vlan_members

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also checking the return value of removeBridgePort() and retry if it returns false

@@ -3671,6 +3646,36 @@ bool PortsOrch::addBridgePort(Port &port)

if (port.m_bridge_port_id != SAI_NULL_OBJECT_ID)
{
/* If the port is being added to the first VLAN,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't quite understand this section. Is this for the previous bridge port? In my understanding, this API is for the first bridge port addition, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous bridge port delete is still not done, as fdb flush is in progress. But the port is added to a vlan again and now we want to reuse the previous bridgeport, so just make the admin state up.
When removing bridgeport, if fdb entries exist, we just disable admin state of the bridgeport and keep retrying until no more references exist before deleting the bridgeport.


sai_status_t status = sai_bridge_api->set_bridge_port_attribute(port.m_bridge_port_id, &attr);
if (status != SAI_STATUS_SUCCESS)
if (port.m_bridge_port_admin_state)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this flow, in which case the m_bridge_port_admin_state would be false?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bridgeport deletion is not done if fdb entries exist, only admin state is set to down. bridgeport deletion will be retried and deleted when fdb entries are deleted.

@@ -3876,21 +3888,31 @@ bool PortsOrch::addVlan(string vlan_alias)
}
}

SWSS_LOG_NOTICE("Create an empty VLAN %s vid:%hu", vlan_alias.c_str(), vlan_id);
SWSS_LOG_NOTICE("Create an empty VLAN %s vid:%hu vlan_oid 0x%lx", vlan_alias.c_str(), vlan_id, (long unsigned int)vlan_oid);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use PRIx64. You can refer other logs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@prsunny
Copy link
Collaborator

prsunny commented Nov 3, 2021

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Copy link
Collaborator

prsunny commented Nov 4, 2021

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@adyeung
Copy link

adyeung commented Nov 5, 2021

@prsunny please help complete the review and merge

Copy link
Collaborator

@prsunny prsunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments

{
getPort(itr->second, port);
return true;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls fix alignment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

auto itr = portOidToName.find(bridge_port_id);
if (itr == portOidToName.end())
return false;
else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{ -> new line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return true;
}
auto itr = portOidToName.find(bridge_port_id);
if (itr == portOidToName.end())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@prsunny
Copy link
Collaborator

prsunny commented Nov 9, 2021

@dgsudharsan , can you please review?

@@ -4236,6 +4217,36 @@ bool PortsOrch::addBridgePort(Port &port)

if (port.m_bridge_port_id != SAI_NULL_OBJECT_ID)
{
/* If the port is being added to the first VLAN,
* set bridge port admin status to UP.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't the iterator in the remove_vlan member come back and delete this since it loops until last FDB is removed and then does remove_bridge_port?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @dgsudharsan that there could be race condition since the removeBridgePort is not complete and the iterator is retried for removal after this addition. I think this logic is slightly complicated. Since the main motivation is to separate the vlan member map from the Port class, could you please make that change and get it merged for this release. We can have the bridge port handling as a different PR.

@@ -4894,6 +4931,16 @@ bool PortsOrch::addLag(string lag_alias, uint32_t spa_id, int32_t switch_id)
{
SWSS_LOG_ENTER();

auto lagport = m_portList.find(lag_alias);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some comments. I believe this is the scenario where removeBridgePort is not yet successful due to fdb flush right?

if (!setHostIntfsStripTag(port, SAI_HOSTIF_VLAN_TAG_KEEP))
{
SWSS_LOG_ERROR("Failed to set %s for hostif of port %s",
hostif_vlan_tag[SAI_HOSTIF_VLAN_TAG_KEEP], port.m_alias.c_str());
return false;
}
m_portList[port.m_alias] = port;
portOidToName[port.m_bridge_port_id] = port.m_alias;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we saving both port object id to alias an bridge pot id to alias in data structure? It also has vlan and lag. Shouldn't we rename this to something like saiOidToAlias?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. will change the name to saiOidToAlias.

prsunny
prsunny previously approved these changes Nov 16, 2021
@prsunny
Copy link
Collaborator

prsunny commented Nov 16, 2021

@dr412113 , @anilkpan , @anilkpandey , please check the compilation issue.

@anilkpan
Copy link
Contributor

@prsunny the compilation error does not seem to be related to any changes in this PR. There is no change in aclorch.
Not sure why only the armhf build is failing.

             from aclorch.h:9,
             from aclorch.cpp:5:

/usr/include/c++/8/bits/stl_map.h: In member function 'virtual void AclRuleDTelFlowWatchListEntry::update(SubjectType, void*)':
/usr/include/c++/8/bits/stl_map.h:518:8: note: parameter passing for argument of type 'std::_Rb_tree<_sai_acl_entry_attr_t, std::pair<const _sai_acl_entry_attr_t, _sai_attribute_value_t>, std::_Select1st<std::pair<const _sai_acl_entry_attr_t, _sai_attribute_value_t> >, std::less<_sai_acl_entry_attr_t>, std::allocator<std::pair<const _sai_acl_entry_attr_t, _sai_attribute_value_t> > >::const_iterator' {aka 'std::_Rb_tree_const_iterator<std::pair<const _sai_acl_entry_attr_t, _sai_attribute_value_t> >'} changed in GCC 7.1
__i = _M_t._M_emplace_hint_unique(__i, std::piecewise_construct,
/usr/include/c++/8/bits/stl_map.h: In member function 'bool AclOrch::updateAclTablePorts(AclTable&, AclTable&)':
/usr/include/c++/8/bits/stl_map.h:499:8: note: parameter passing for argument of type 'std::_Rb_tree<long long unsigned int, std::pair<const long long unsigned int, long long unsigned int>, std::_Select1st<std::pair<const long long unsigned int, long long unsigned int> >, std::less, std::allocator<std::pair<const long long unsigned int, long long unsigned int> > >::const_iterator' {aka 'std::_Rb_tree_const_iterator<std::pair<const long long unsigned int, long long unsigned int> >'} changed in GCC 7.1
__i = _M_t._M_emplace_hint_unique(__i, std::piecewise_construct,
make[3]: Leaving directory '/__w/4/s/orchagent'
make[2]: *** [Makefile:416: all-recursive] Error 1

@prsunny
Copy link
Collaborator

prsunny commented Nov 16, 2021

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

dgsudharsan
dgsudharsan previously approved these changes Nov 16, 2021
@prsunny
Copy link
Collaborator

prsunny commented Nov 16, 2021

not sure why its consistently failing for this PR.. could you please rebase once and lets retry?

@anilkpan anilkpan dismissed stale reviews from dgsudharsan and prsunny via 902a621 November 16, 2021 22:45
@anilkpan
Copy link
Contributor

The GCC 7.1 warnings are not causing the failure, failure is due to a type mismatch in logging. I just fixed and updated.

@anilkpan
Copy link
Contributor

/azpw run

1 similar comment
@anilkpan
Copy link
Contributor

/azpw run

@anilkpan
Copy link
Contributor

vs test passed earlier. The only change in my last commit was to fix a type mismatch in SWSS_LOG.
I tried "/azpw run", but it does not restart the test.

@anilkpan
Copy link
Contributor

/AzurePipelines run

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 1716 in repo Azure/sonic-swss

@anilkpan
Copy link
Contributor

This seems to be the failure:

[2021-11-19 00:47:19] [build-stderr] saiaclschema_ut.cpp:1:10: fatal error: gmock/gmock.h: No such file or directory
[2021-11-19 00:47:19] [build-stderr] 1 | #include <gmock/gmock.h>
[2021-11-19 00:47:19] [build-stderr] | ^~~~~~~~~~~~~~~
[2021-11-19 00:47:19] [build-stderr] compilation terminated.

@prsunny
Copy link
Collaborator

prsunny commented Nov 19, 2021

@anilkpan , retrying now!

@prsunny
Copy link
Collaborator

prsunny commented Nov 19, 2021

@anilkpan , could you please rebase and push?

@anilkpan
Copy link
Contributor

@prsunny Its passing now after rebase

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants