-
Notifications
You must be signed in to change notification settings - Fork 537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
portsorch: initial support for link-training and remote advertisement #2149
Conversation
orchagent/portsorch.cpp
Outdated
@@ -296,9 +317,11 @@ static bool isValidPortTypeForLagMember(const Port& port) | |||
PortsOrch::PortsOrch(DBConnector *db, DBConnector *stateDb, vector<table_name_with_pri_t> &tableNames, DBConnector *chassisAppDb) : | |||
Orch(db, tableNames), | |||
m_portStateTable(stateDb, STATE_PORT_TABLE_NAME), | |||
m_stateLinkTrainingTable(stateDb, STATE_LINK_TRAINING_TABLE_NAME), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a separate link training table? Why can't we add two fields to existing port table in State DB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's now merged into 'link_training_status' of PORT_TABLE in STATE_DB
orchagent/portsorch.cpp
Outdated
|
||
auto executor = new ExecutableTimer(m_timer, this, "PORT_STATUS_POLL"); | ||
Orch::addExecutor(executor); | ||
m_timer->start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This timer shouldn't be enabled if the platform doesn't support link training or there is not link training config present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be migrated to use the implementation in #2215, which is for AutoNeg, and it will only activate the poll when the port is admin up, link_training feature is available and is enabled.
orchagent/portsorch.cpp
Outdated
sai_attribute_t attr; | ||
|
||
attr.id = SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE; | ||
status = sai_port_api->set_port_attribute(port.m_port_id, &attr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this logic. This is not the way to query and understand capabilities. We have separate API sai_query_attribute_capability. Please use that API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in #2215, sai_query_attribute_capability does not support per-port basis capability query, while the AN/LT may only be available on certain ports on the switch ASIC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the API used should have been get_port_attribute instead of set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will fix it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, sorry about that, I thought I did this, but it's on the AutoNeg PR.
orchagent/portsorch.cpp
Outdated
{ | ||
string supp; | ||
|
||
if (m_stateLinkTrainingTable.hget(port.m_alias, "supported", supp)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use internal Port data structure to cache the capability rather than writing to DB and reading from it within the same process
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now cached as member field of the internal Port data structure
@@ -112,6 +113,26 @@ static map<string, int> autoneg_mode_map = | |||
{ "off", 0 } | |||
}; | |||
|
|||
static map<string, int> link_training_mode_map = | |||
{ | |||
{ "on", 1 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we use a bool instead of int? This will make code much simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to distinguish 'NOT CONFIURED' and 'OFF', we need -1 for this, and this is the same approach for the AutoNeg in the current implementation of SONiC
orchagent/portsorch.cpp
Outdated
} | ||
continue; | ||
} | ||
m_stateLinkTrainingTable.hset(p.m_alias, "status", lt > 0 ? "on" : "off"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we use lt_str as such in status?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will do.
tests/test_port_lt.py
Outdated
fvs = swsscommon.FieldValuePairs([("link_training","on")]) | ||
tbl.set("Ethernet4", fvs) | ||
|
||
# validate if autoneg false is pushed to asic db when set first time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comments, should be link-training instead of autoneg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will fix it
tests/test_port_lt.py
Outdated
expected_fields = {"SAI_PORT_ATTR_LINK_TRAINING_ENABLE":"false"} | ||
adb.wait_for_field_match("ASIC_STATE:SAI_OBJECT_TYPE_PORT", port_oid, expected_fields) | ||
|
||
# validate if autoneg true is pushed to asic db when set first time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comments, should be link-training instead of autoneg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will fix it
orchagent/portsorch.cpp
Outdated
sai_attribute_t attr; | ||
|
||
attr.id = SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE; | ||
status = sai_port_api->set_port_attribute(port.m_port_id, &attr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the API used should have been get_port_attribute instead of set.
orchagent/portsorch.cpp
Outdated
sai_status_t status; | ||
sai_attribute_t attr; | ||
|
||
#ifdef SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this ifdef needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the build failure in the PR and also the backward compatibility of the legacy SAI implementation, where SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE is not implemented, while the other control attributes are supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot have #ifdef in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
@@ -61,6 +61,7 @@ extern string gMyAsicName; | |||
#define DEFAULT_VLAN_ID 1 | |||
#define MAX_VALID_VLAN_ID 4094 | |||
|
|||
#define PORT_STAT_POLLING_SEC 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe 1 second polling is aggressive for this scenario. We should change it to 5 or 10 seconds. @prsunny What is your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also rename this to PORT_STATE_POLLING_SEC or similar? STAT means port statistics (like rx packets/bytes ) which is a different thing altogether
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it's done
orchagent/portsorch.cpp
Outdated
@@ -2351,6 +2426,9 @@ bool PortsOrch::initPort(const string &alias, const string &role, const int inde | |||
/* Create associated Gearbox lane mapping */ | |||
initGearboxPort(p); | |||
|
|||
/* Initialize port capabilities */ | |||
initPortCapLinkTraining(p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function loops through all the ports and get capability. This would increase the port initialization time which is a bottleneck in fastboot scenarios. If the user has no LT configs this operation is not needed.
Instead init the capability only when the user configures LT and capability not in cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted, and it's done
orchagent/portsorch.cpp
Outdated
@@ -298,7 +319,8 @@ PortsOrch::PortsOrch(DBConnector *db, DBConnector *stateDb, vector<table_name_wi | |||
m_portStateTable(stateDb, STATE_PORT_TABLE_NAME), | |||
port_stat_manager(PORT_STAT_COUNTER_FLEX_COUNTER_GROUP, StatsMode::READ, PORT_STAT_FLEX_COUNTER_POLLING_INTERVAL_MS, false), | |||
port_buffer_drop_stat_manager(PORT_BUFFER_DROP_STAT_FLEX_COUNTER_GROUP, StatsMode::READ, PORT_BUFFER_DROP_STAT_POLLING_INTERVAL_MS, false), | |||
queue_stat_manager(QUEUE_STAT_COUNTER_FLEX_COUNTER_GROUP, StatsMode::READ, QUEUE_STAT_FLEX_COUNTER_POLLING_INTERVAL_MS, false) | |||
queue_stat_manager(QUEUE_STAT_COUNTER_FLEX_COUNTER_GROUP, StatsMode::READ, QUEUE_STAT_FLEX_COUNTER_POLLING_INTERVAL_MS, false), | |||
m_timer(new SelectableTimer(timespec { .tv_sec = PORT_STATE_POLLING_SEC, .tv_nsec = 0 })) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename m_timer to something more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
sai_attribute_t attr; | ||
|
||
attr.id = SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE; | ||
status = sai_port_api->set_port_attribute(port.m_port_id, &attr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not fixed
orchagent/portsorch.cpp
Outdated
} | ||
else | ||
{ | ||
port.m_cap_lt = 1; /* Default to 1 for vstest */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about this. It means that m_cap_lt will be 1 even if SAI return error. I would suggest to implement unit test if VS test does not support link training.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an issue in VS, do you know where is the source of this SAI model? SAI/bm/p4-sai?
Could we this VS enhancement deal in a separate PR? We'll need to do this for both AutoNeg and LinkTraining
orchagent/portsorch.cpp
Outdated
sai_status_t status = sai_port_api->set_port_attribute(port.m_port_id, &attr); | ||
if (status != SAI_STATUS_SUCCESS) | ||
{ | ||
SWSS_LOG_ERROR("Failed to set LT %s to port pid:%" PRIx64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to print the port alias here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
return handleSaiSetStatus(SAI_API_PORT, status); | ||
} | ||
|
||
SWSS_LOG_INFO("Set LT %s to port pid:%" PRIx64, op.c_str(), port.m_port_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to print the port alias here, and maybe it should be NOTICE instead of INFO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
|
||
sai_attribute_t attr; | ||
attr.id = SAI_PORT_ATTR_LINK_TRAINING_FAILURE_STATUS; | ||
attr.value.u32 = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is redundant, will have it removed
orchagent/portsorch.cpp
Outdated
{ | ||
if ((it->second == 0) || !getPort(it->first, port)) | ||
{ | ||
m_port_state_poll.erase(it); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will invalid the iterator and cause unpredictable behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
typedef enum { | ||
PORT_STATE_POLL_NONE = 0, | ||
PORT_STATE_POLL_AN = 0x00000001, /* Auto Negotiation */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's part of the review for AutoNeg, we're having 2 PRs in review at the same time, and they're using the same port state poller.
} | ||
else | ||
{ | ||
m_port_state_poll[port.m_alias] &= ~type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check if it should be remove from m_port_state_poll and stop timer if need
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stop() will only take place inside the poller thread to reduce the race conditions if any.
orchagent/portsorch.cpp
Outdated
if (active) | ||
{ | ||
m_port_state_poll[port.m_alias] |= type; | ||
m_timer->start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if the timer is already started?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now addressed in the PR below, issue start() to a running poller is no longer having the expiration timer reset.
sonic-net/sonic-swss-common#614
/azpw run Azure sonic-swss |
/azpw run Azure.sonic-swss |
1 similar comment
/azpw run Azure.sonic-swss |
/AzurePipelines run Azure.sonic-swss |
Azure Pipelines successfully started running 1 pipeline(s). |
|
||
if (!lt_str.empty() && (p.m_type == Port::PHY)) | ||
{ | ||
if (link_training_mode_map.find(lt_str) == link_training_mode_map.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to AN, I don't see a lt capability query here. Please add
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentionally removed at the earlier commit
- LT is similar to AN, it's a port capability available on a per-port basis, not system-wide on a single switch ASIC
- For a port capability available on a per-port basis, SAI_PORT_ATTR_SUPPORTED_XXXX should be introduced into saiport.h
- As per the latest comments from Rita, the SAI spec. was finalized around March time frame, it's too late for 202205 right now
- Since this is a port capability available on a per-port basis, using SAI_PORT_ATTR_LINK_TRAINING_ENABLE in sai_query_attribute_capability is not going to work, and during the SAI subgroup meeting a few months back, there was a conclusion that per-port capability query should be tracked in a separated PR, as all the existing attributes should all be migrated, not just the new attribute for LT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For your reference, the function prototype of sai_query_attribute_capability is as follows, it does not work on a per-port basis
https://github.com/opencomputeproject/SAI/blob/master/inc/saiobject.h#L221
sai_status_t sai_query_attribute_capability(
_In_ sai_object_id_t switch_id,
_In_ sai_object_type_t object_type,
_In_ sai_attr_id_t attr_id,
_Out_ sai_attr_capability_t *attr_capability);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a comment to make sure once SAI is there we revisit and add the check? It is important that we ensure it so that devices and specific ports where it is not supported will have check in place to avoid SAI failure thereby prevent orchagent crash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
} | ||
if (port.m_admin_state_up && port.m_link_training > 0) | ||
{ | ||
updatePortStateLinkTraining(port); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this move inside the below if condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have the if-statement for the port admin state and LT enablement, may I have more details of the additional checker that's missing? Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to autoneg should updatePortStateLinkTraining be done only when status == SAI_PORT_OPER_STATUS_UP? Else it will be called redundantly during down as updateStatePoll will take care of invoking this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of AN
- Link Up: Refresh AN states
- Link Down: Clear the remote advertisement
In the case of LT
- Link Up: Refresh LT states
- Link Down: Refresh LT states
Hence the updatePortStateLinkTraining() is invoked before checking the operation link status
orchagent/portsorch.cpp
Outdated
@@ -6883,3 +7132,115 @@ std::unordered_set<std::string> PortsOrch::generateCounterStats(const string& ty | |||
} | |||
return counter_stats; | |||
} | |||
|
|||
void PortsOrch::updatePortStateAutoNeg(const Port &port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function name is misleading. Should it be updatePortRemoveAdvSpeeds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a placeholder for the operational AN state, not specific to remote speed only, there are more upcoming in the PRs in the furture (e.g. tx/rx pause, FEC ....)
fvs = swsscommon.FieldValuePairs([("autoneg","on"),("admin_status","up")]) | ||
ctbl.set("Ethernet0", fvs) | ||
|
||
time.sleep(10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need 10 second sleep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, you're right now, I'll remove this, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allow me to rephrase, 10 sec is necessary.
Please note the poll interval is 5 sec, and in a worst-case scenario, the remote advertisement will not be available until 2nd poll.
@@ -2128,6 +2211,32 @@ task_process_status PortsOrch::setPortAutoNeg(sai_object_id_t id, int an) | |||
return task_success; | |||
} | |||
|
|||
task_process_status PortsOrch::setPortLinkTraining(const Port &port, bool state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is it ensured, LT is not enabled if its not supported on this port?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LT capability checker is now restored, such a request will be blocked in doPortTask()
lt = link_training_mode_map[lt_str]; | ||
if (lt != p.m_link_training) | ||
{ | ||
auto status = setPortLinkTraining(p, lt > 0 ? true : false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we ensuring if port is admin down then we are not enabling LT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LT could be enabled/disable regardless of the port admin state, it's a set of control registers on the SerDes
orchagent/portsorch.cpp
Outdated
} | ||
} | ||
|
||
void PortsOrch::updatePortStateLinkTraining(const Port &port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
brief comment what this function does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
@@ -6883,3 +7132,115 @@ std::unordered_set<std::string> PortsOrch::generateCounterStats(const string& ty | |||
} | |||
return counter_stats; | |||
} | |||
|
|||
void PortsOrch::updatePortStateAutoNeg(const Port &port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
brief comment what this func does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
} | ||
if (port.m_admin_state_up && port.m_link_training > 0) | ||
{ | ||
updatePortStateLinkTraining(port); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to autoneg should updatePortStateLinkTraining be done only when status == SAI_PORT_OPER_STATUS_UP? Else it will be called redundantly during down as updateStatePoll will take care of invoking this.
sai_status_t status; | ||
sai_attribute_t attr; | ||
|
||
attr.id = SAI_PORT_ATTR_SUPPORTED_AUTO_NEG_MODE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one major concern here. Since SAI enforcing this involves SAI implementing this attribute, do we know whether all vendors support it? If not it might break autoneg in their platforms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will cause breakage, hence the AN capability will be updated to use 1(supported) by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Please use this capability to a warning log in case if SAI returns not supported for the attribute. SAI vendors can then mitigate to implement this attribute without the functional impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
// it's not available in SAI v1.10 paried with SONiC.202205. | ||
// And given that LT is part of AN, it should be okay to use | ||
// SAI_PORT_ATTR_SUPPORTED_AUTO_NEG_MODE as a fallback plan. | ||
attr.id = SAI_PORT_ATTR_SUPPORTED_AUTO_NEG_MODE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer not to use a wrong attribute. Instead add a comment as todo and leave without using capability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
orchagent/portsorch.cpp
Outdated
SWSS_LOG_NOTICE("Unable to get %s AN capability", port.m_alias.c_str()); | ||
// To avoid breakage on the existing platforms, AN should be 1 by default | ||
port.m_cap_an = 1; | ||
SWSS_LOG_WARN("Unable to get %s AN capability, assumming it's supported", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer you just add "Unable to get AN support capability" in the log. Please remove "assuming it's supported"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
m_portStateTable.hset(port.m_alias, "rmt_adv_speeds", adv_speeds); | ||
} | ||
else | ||
if (port.m_admin_state_up) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously the get remote adv speeds is done even when admin state is down and this new logic will restrict it to only when admin up.
This means polling becomes irrelevant. Can you explain the change for the reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the port is admin down, the SerDes is stopped, and it's impossible to get a correct remote advertisement, hence this redundant operation should be skipped if the port is disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then why do we need the poller? Can't we just update the remote advertised speeds on link up? The entire polling logic was introduced to get remote advertisement when link was down. Should that be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion, but that's a different story
-
MAC disabled + link down
This is a case we have port.m_admin_state_up == false, while the link is surely down, and the remote advertisement is not available since the MAC/SerDes is disabled -
MAC enabled + link down
This is a case we have port.m_admin_state_up == true, while the link is operationally down, and the remote advertisement could be available as the MAC/SerDes is enabled, frame exchanges are happening on the PHY layer (i.e. It's the internal PHY, in this case. Precisely speaking, it's the TSC drivers on the Broadcom chips) -
MAC enabled + link up
This is a general case of port.m_admin_state_up == true and link is operationally up.
Conclusion:
While the remote advertisement could be available in case 2 and 3, it's not available in case 1, hence the poller should only fetching the status when MAC/SerDes is enabled.
"/azp run" |
@ds952811 can you rebase with latest swss master to see if the build test passes? |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
commit 18632a3 Author: Dante Su <[email protected]> Date: Mon May 23 12:22:49 2022 +0000 optimize port state refresh logic Signed-off-by: Dante Su <[email protected]> commit 081d491 Author: ds952811 <[email protected]> Date: Mon May 23 02:33:56 2022 +0000 address review comments Signed-off-by: ds952811 <[email protected]> commit 84bdde4 Author: Dante Su <[email protected]> Date: Fri May 20 02:15:59 2022 +0000 update the default LT capability upon get failures Signed-off-by: Dante Su <[email protected]> commit 0f73666 Author: Dante Su <[email protected]> Date: Thu May 19 11:28:38 2022 +0000 Rename updatePortStatesXX as refreshPortStatesXX Signed-off-by: Dante Su <[email protected]> commit ddd57fe Author: Dante Su <[email protected]> Date: Thu May 19 04:03:13 2022 +0000 Have AN cap defaults to 1, and use AN attr for LT cap query Signed-off-by: Dante Su <[email protected]> commit 876e605 Author: Dante Su <[email protected]> Date: Fri May 13 11:15:12 2022 +0000 drop LT capability query Signed-off-by: Dante Su <[email protected]> commit 55ced7d Author: Dante Su <[email protected]> Date: Fri Apr 29 13:53:17 2022 +0000 incorporate autoneg support from PR#2215 Signed-off-by: Dante Su <[email protected]> commit a04594e Author: Dante Su <[email protected]> Date: Thu Apr 28 16:33:14 2022 +0000 address review comments Signed-off-by: Dante Su <[email protected]> commit e9eeb9a Author: Dante Su <[email protected]> Date: Thu Apr 28 15:00:04 2022 +0000 address review comments Signed-off-by: Dante Su <[email protected]> commit 4ff604d Author: Dante Su <[email protected]> Date: Fri Apr 22 03:51:56 2022 +0000 Stop the port state poll by default Signed-off-by: Dante Su <[email protected]> commit bdfb8d8 Author: Dante Su <[email protected]> Date: Fri Apr 22 03:48:07 2022 +0000 address review comments Signed-off-by: Dante Su <[email protected]> commit 1c6bda8 Author: Dante Su <[email protected]> Date: Mon Apr 18 08:46:21 2022 +0000 Restore pre-emphasis when LT is transitioned from ON to OFF Signed-off-by: Dante Su <[email protected]> commit 09a9b33 Author: Dante Su <[email protected]> Date: Mon Apr 18 02:33:11 2022 +0000 fix build failure due to SAI_PORT_ATTR_SUPPORTED_LINK_TRAINING_MODE Signed-off-by: Dante Su <[email protected]> commit b0bee3e Author: Dante Su <[email protected]> Date: Thu Apr 14 07:54:14 2022 +0000 address review comments Signed-off-by: Dante Su <[email protected]> commit c4345ef Author: Dante Su <[email protected]> Date: Fri Mar 25 02:26:05 2022 +0000 portsorch: initial support for link-training - What I did Add Link-Training support to portsorch, while Gearbox is not in the scope - Why I did it In the case of DAC, static pre-calibrated pre-emphasis is rarely available on SONIC, as most of the ODM are expecting this to be done dynamically at runtime via link-training, hence we'll need this feature to improve the link quality - How I verified it Manual test Ran the Unit-tests to the corresponding changes Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su [email protected]
What I did
Why I did it
How I verified it
Details if related
HLD: sonic-net/SONiC#924, sonic-net/SONiC#925