-
Notifications
You must be signed in to change notification settings - Fork 159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sfp-refactoring] xcvrd: add initial support for CMIS application initialization #217
Conversation
This pull request introduces 3 alerts when merging f2b0adc into 6e4cf6f - view on LGTM.com new alerts:
|
This pull request introduces 1 alert when merging 999e156 into ddb22b9 - view on LGTM.com new alerts:
|
This pull request introduces 1 alert when merging a61387c into 1fd3d16 - view on LGTM.com new alerts:
|
This pull request introduces 1 alert when merging abe64de into 7b95211 - view on LGTM.com new alerts:
|
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
vendor_pn_str = transceiver_dict['model'] | ||
vendor_key = (vendor_name_str.strip() + '#' + vendor_pn_str.strip()).upper() | ||
|
||
for k, v in g_dict[CMIS_MEDIA_SETTINGS_KEY].items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see that this could is optic specific(vendor_name_str + part number)?.
Can it be made it as generic?.
This code skips application code for flat memory/media types other than QSFP_DD, so can we apply the settings for other media types directly?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does all the CMIS vendors also need to specify "CMIS_MEDIA_SETTINGS" in media_settings.json in their platforms?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If the attached xcvr is not specified in CMIS_MEDIA_SETTINGS, or the CMIS_MEDIA_SETTINGS is missing, the CMIS application selection defaults to be enabled
- The CMIS registers of application selection are at page 0x10 and 0x11, hence it's not available on modules with Flat Memory.
- The idea for module-specific controls is for certain modules that have paged memory support while application selection is not necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may include a condition to return failure for flat memory media. you can make use of the function defined in SFP refactored code. https://github.com/Azure/sonic-platform-common/blob/c8eceec53625b6c021b7c6619b2555d0c7a1b38d/sonic_platform_base/sonic_xcvr/api/public/cmis.py#L291
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"> 3. The idea for module-specific controls is for certain modules that have paged memory support while application selection is not necessary
"
How do you determine whether the specific media needs application selection?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Thanks for feedback, although we already have a checker for flat media in the xcvrd, I agree it's better to have this in the cmis.py as well.
- Actually, the CMIS_MEDIA_SETTINGS is a backup plan for the faulty transceiver if any. And as of now, we don't really need this, please feel free to let me know if we should have it removed, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for a record, the CMIS_MEDIA_SETTINGS is already removed.
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
|
||
has_cmis = False | ||
for sfp in platform_chassis.get_all_sfps(): | ||
if sfp.get_port_type() in [sfp.SFP_PORT_TYPE_QSFPDD]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more than just QSFPDD supports CMIS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, only QSFPDD (type=18h) is verified, we could circle back on OSFP(type=19h) once the sample is available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for a record, the get_port_type() checker is already removed in the latest commit
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
has_cmis = True | ||
break | ||
|
||
if not has_cmis: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modules may be plugged in or removed dynamically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check here is to validate the cage type rather than the media type of the attached transceiver, hence it does not matter what's the module attached to right now.
The main purpose is to reduce the CPU/memory usage when the service is not appliable to the target unit.
e.g. For a box with only SFP and QSFP28 cages, the CMIS Manager will not be activated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for a record, the get_port_type() checker is already removed in the latest commit
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
if conf.get('application_selection', True): | ||
self.log_notice("{}: application update for {}G, {}-lanes".format( | ||
lport, int(speed/1000), len(lanes))) | ||
ret = sfp.set_cmis_application(speed, len(lanes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we ensure that this is done after the ASIC SerDes taps are configured?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're referring to the pre-emphasis configuration on the MAC, it's handled by the orchagent when the serdes tap parameters are uploaded to the APPL_DB, hence it's outside the scope of this PR, the serdes tap parameters are independent with this PR, and the framework of media_settings.json needs to be revised to support speed/mode switches between PAM4 and NRZ if necessary.
If you're referring to the pre-emphasis configuration on the PHY, it's not supported as of now, we could circle back on this when the module-specific parameters are available and needs to be applied to the CMIS PHY registers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: In the case of Broadocm ASIC, we have different value range for PAM4 and NRZ, hence having NRZ/PAM4 pre-emphasise applied to SerDes in PAM4/NRZ mode may or may not lead to checker failure in SAI, and in turn, the orchagent may or may not crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that (per my understanding) the ASIC SerDes pre/main/post setitngs should be configured before the CMIS ApSel sequence is started. There doesn't seem to be anything that enforces this sequencing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need a synchronization mechanism between syncd/SAI and the xcvrd to gracefully switch in between the applications. Prince @ MSFT and I had this discussion yesterday, and he will be working on this, and we could deal with this in another PR
This pull request introduces 4 alerts when merging a1d8e10 into e53ff26 - view on LGTM.com new alerts:
|
@@ -83,21 +88,28 @@ def logical_port_name_to_physical_port_list(self, port_name): | |||
else: | |||
return None | |||
|
|||
def subscribe_port_config_change(db_list=['CONFIG_DB']): | |||
port_tbl_map = { | |||
'APPL_DB': swsscommon.APP_PORT_TABLE_NAME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll listen to APPL_DB for the PORT CONFIG changes, and STATE_DB for XCVR INSERTION/REMOVAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should listen to PORT_TABLE in STATE_TB instead of PORT_TABLE in APP_DB because, when you get SET event for a port in APP_DB, the orchagent has still not called the sai api to create the port yet. When SET event is received for a port in STATE_DB then its confirmed the PORT create has completed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trick is, not all the fields in APPL_DB will be updated to STATE_DB by the orchagent as of now, it's part of synchronization issue between syncd and orchagent, it's outside the scope of this PR
This pull request introduces 4 alerts when merging 2bf4580 into e53ff26 - view on LGTM.com new alerts:
|
@@ -8,7 +8,10 @@ | |||
class PortChangeEvent: | |||
PORT_ADD = 0 | |||
PORT_REMOVE = 1 | |||
def __init__(self, port_name, port_index, asic_id, event_type): | |||
PORT_SET = 2 | |||
PORT_DEL = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it hard to differentiate the four actions here from the name, would you mind rephrasing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I could update the namings, but do you have any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I could surely update these namings, but I don't know which one is better, could you please help me on this?
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
|
||
self.dbg_print("Stopped") | ||
|
||
def task_run(self, y_cable_presence): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the purpose of having "y_cable_presence" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, you're right, I'll remove it
return sel, asic_context | ||
|
||
|
||
def handle_port_config_change(sel, asic_context, stop_event, port_mapping, logger, port_change_event_handler): | ||
def handle_port_config_change(sel, asic_context, stop_event, port_mapping, logger, port_change_event_handler, promiscuous=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a separate handler for handling PORT_TABLE update events in APP_DB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the latest commit, the requested changes are already done
@@ -83,21 +88,28 @@ def logical_port_name_to_physical_port_list(self, port_name): | |||
else: | |||
return None | |||
|
|||
def subscribe_port_config_change(db_list=['CONFIG_DB']): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add another function to subscribe
subscribe_port_table_update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's subscribe_port_update_events() in the latest commit, do we need to have it rephrased to subscribe_port_table_update()?
port_mapping.get_logical_to_physical(key)[0], | ||
asic_context[port_tbl], | ||
PortChangeEvent.PORT_REMOVE) | ||
if promiscuous: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the promiscuous check for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a dead code that's already removed, please check the latest commit
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
has_cmis = False | ||
for sfp in platform_chassis.get_all_sfps(): | ||
try: | ||
ptype = sfp.get_port_type() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this check for checking the cage type. Don't want platforms to implement this new API for enabling CMIS datapath initialization. Its OK for one thread running and doing nothing (CMIS state will be in failed state for those platform having only SFP).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already removed in the earlier commit, please check it out
@@ -83,21 +88,28 @@ def logical_port_name_to_physical_port_list(self, port_name): | |||
else: | |||
return None | |||
|
|||
def subscribe_port_config_change(db_list=['CONFIG_DB']): | |||
port_tbl_map = { | |||
'APPL_DB': swsscommon.APP_PORT_TABLE_NAME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should listen to PORT_TABLE in STATE_TB instead of PORT_TABLE in APP_DB because, when you get SET event for a port in APP_DB, the orchagent has still not called the sai api to create the port yet. When SET event is received for a port in STATE_DB then its confirmed the PORT create has completed.
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
|
||
self.port_dict[lport]['cmis_state'] = self.CMIS_STATE_DP_DEINIT | ||
elif state == self.CMIS_STATE_DP_DEINIT: | ||
if api.get_module_state() != 'ModuleReady': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need timeout, if the module does NOT settle for 'ModuleReady' see page 1 byte 167
done = False | ||
continue | ||
if not done: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please bail out if not ConfigSuccess, otherwise we will keep continuing even though the configuration has failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
done = False | ||
continue | ||
if not done: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please bail out if datapath init failed. There is a timeout in page 1, byte 144
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This information is not available in the mem map as of now, hence we'll use a default expiration time for now, let's do the fine-grained expiration in phase 2 (DPB support)
|
||
# D.2.2 Software Deinitialization | ||
api.set_datapath_deinit(host_lanes) | ||
api.set_lpmode(True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make sure module indeed transitioned to low power.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, but it looks like one of my CMIS4 optics will never be transitioned to ModuleLowPwr, hence I add 'ModuleReady' in the following checker
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
if not done: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please bail out if the datapath has not transitioned to DataPathActivated . see page 1 , byte 168 for timeout for completing datapath tx turn ON.
commit a1d8e10 Author: Dante Su <[email protected]> Date: Thu Nov 18 12:45:01 2021 +0000 okay Signed-off-by: Dante Su <[email protected]> commit 11b6850 Author: Dante Su <[email protected]> Date: Fri Nov 12 07:39:34 2021 +0000 Add custom media setting for CMIS Signed-off-by: Dante Su <[email protected]> commit 6dd3608 Author: Dante Su <[email protected]> Date: Tue Nov 9 11:34:10 2021 +0000 add missing unittest for CMIS Signed-off-by: Dante Su <[email protected]> commit ceea223 Author: Dante Su <[email protected]> Date: Tue Nov 9 03:34:59 2021 +0000 fix port remove event Signed-off-by: Dante Su <[email protected]> commit 9be5135 Author: Dante Su <[email protected]> Date: Tue Nov 9 01:46:27 2021 +0000 port_mapping: drop the hacks for CMIS Signed-off-by: Dante Su <[email protected]> commit 501134b Author: Dante Su <[email protected]> Date: Fri Nov 5 09:08:59 2021 +0000 [sfp-refactoring] xcvrd: add initial support for CMIS application initialization Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
… type checker Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
Signed-off-by: Dante Su <[email protected]>
@prgeor are we planning to to take this SFP refractoring into 202111, Is this needed for 400G in Chassis ? |
This change is not needed if the default 400G application is needed. |
…e 1.2 (sonic-net#217) This release goes in sync with the following firmware version of Broadcom Y cable, which is consistent with release 6 { "version_nic_active": "D203.9.D103.1", "version_nic_inactive": "D203.9.D103.1", "version_nic_next": "D203.9.D103.1", "version_peer_active": "D302.8", "version_peer_inactive": "D302.8", "version_peer_next": "D302.8", "version_self_active": "D302.8", "version_self_inactive": "D302.8", "version_self_next": "D302.8" } Signed-off-by: vaibhav-dahiya [email protected] Description Basically a vendor specific implementation of abstract YCableBase class . detailed design discussion can be found https://github.com/Azure/SONiC/pull/757/files Motivation and Context to support the Y-Cable API required to support Broadcom's Y-Cable. Signed-off-by: vaibhav-dahiya <[email protected]>
Signed-off-by: Dante Su [email protected]
Why I did it
Add initial support for CMIS application initialization
How I did it
Add CmisManagerTask to handle the per-port CMIS application initialization in parallel
How to verify it
Which release branch to backport (provide reason below if selected)
Description for the changelog
A picture of a cute animal (not mandatory but encouraged)