sfputilbase.py: fix the Python crash upon IO failures #5

Merged 1 commit on Nov 18, 2020

ds952811

This fixes the following failure:

Nov 18 00:57:55.488171 sonic INFO pmon#supervisord: xcvrd Traceback (most recent call last):
Nov 18 00:57:55.488171 sonic INFO pmon#supervisord: xcvrd   File "/usr/bin/xcvrd", line 683, in <module>
Nov 18 00:57:55.488171 sonic INFO pmon#supervisord: xcvrd     sys.exit(main())
Nov 18 00:57:55.488171 sonic INFO pmon#supervisord: xcvrd   File "/usr/bin/xcvrd", line 606, in main
Nov 18 00:57:55.488188 sonic INFO pmon#supervisord: xcvrd     rc = post_port_sfp_info_to_db(logical_port, int_tbl)
Nov 18 00:57:55.488188 sonic INFO pmon#supervisord: xcvrd   File "/usr/bin/xcvrd", line 265, in post_port_sfp_info_to_db
Nov 18 00:57:55.488206 sonic INFO pmon#supervisord: xcvrd     port_info_dict = platform_sfputil.get_transceiver_info_dict(physical_port)
Nov 18 00:57:55.488206 sonic INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/sonic_platform_base/sonic_sfp/sfputilbase.py", line 726, in get_transceiver_info_dict
Nov 18 00:57:55.501232 sonic INFO pmon#supervisord: xcvrd     sfp_vendor_pn_raw = self._read_eeprom_specific_bytes(sysfsfile_eeprom, (offset + OSFP_VENDOR_PN_OFFSET), XCVR_VENDOR_PN_WIDTH)
Nov 18 00:57:55.501232 sonic INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/sonic_platform_base/sonic_sfp/sfputilbase.py", line 314, in _read_eeprom_specific_bytes
Nov 18 00:57:55.501390 sonic INFO pmon#supervisord: xcvrd     print("Error: reading sysfs file %s" % sysfs_sfp_i2c_client_eeprom_path)
Nov 18 00:57:55.501390 sonic INFO pmon#supervisord: xcvrd NameError: global name 'sysfs_sfp_i2c_client_eeprom_path' is not defined

Signed-off-by: Dante Su <[email protected]>
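
The diff itself is not shown in this conversation, so the following is only a minimal sketch of the failure mode visible in the traceback: an `except IOError` handler that formats its error message with a name that does not exist in that scope, turning a recoverable IO failure into a `NameError`. `FailingEeprom`, `read_bytes_buggy`, and `read_bytes_fixed` are hypothetical names for illustration, and logging the file object's `name` attribute is an assumption about one reasonable fix, not necessarily what the merged commit does.

```python
class FailingEeprom(object):
    """Hypothetical stand-in for a sysfs EEPROM file whose read fails."""
    name = "fake_eeprom"

    def seek(self, offset):
        pass

    def read(self, num_bytes):
        raise IOError("simulated I2C read failure")


def read_bytes_buggy(eeprom_file, offset, num_bytes):
    try:
        eeprom_file.seek(offset)
        return eeprom_file.read(num_bytes)
    except IOError:
        # Bug: this name is not defined in this scope, so the handler
        # itself raises NameError and the caller never gets to retry.
        print("Error: reading sysfs file %s" % sysfs_sfp_i2c_client_eeprom_path)
        return None


def read_bytes_fixed(eeprom_file, offset, num_bytes):
    try:
        eeprom_file.seek(offset)
        return eeprom_file.read(num_bytes)
    except IOError:
        # Assumed fix: only reference names that exist in this scope.
        path = getattr(eeprom_file, "name", "<unknown>")
        print("Error: reading sysfs file %s" % path)
        return None
```

With the fixed variant, the caller sees `None` for a failed read and can decide whether to retry, instead of the daemon exiting on an unrelated exception.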
Owner

@zhenggen-xu left a comment

The fix looks good. However, in what cases do we hit the IOError?

@ds952811
Author

This defect was identified through repeated QSFPDD module reinsertion. Since I2C on Silverstone is pretty slow, the failure is quite easy to reproduce.

@zhenggen-xu
Owner

zhenggen-xu commented Nov 18, 2020

If I am not mistaken, xcvrd retries up to 5 times at 1-second intervals when it hits this IOError. So with the fix in this PR plus the retry logic, things work, right?

@ds952811
Author

Yes, with the fix in this PR, xcvrd no longer crashes on the undefined symbol and successfully resumes after the IO failures.
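
The retry behavior discussed above (up to 5 attempts, 1 second apart, as recalled by the reviewer) can be sketched roughly as follows. This is not xcvrd's actual code: `retry_until_not_none` and its parameters are hypothetical, and it assumes a failed read is reported as `None` rather than as an exception.

```python
import time


def retry_until_not_none(read_fn, max_attempts=5, interval_s=1.0, sleep=time.sleep):
    """Call read_fn until it returns a non-None value or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = read_fn()
        if result is not None:
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts:
            sleep(interval_s)
    return None
```

The `sleep` parameter is injectable only so the loop can be exercised without real delays. The retry only helps if the read path returns cleanly on an IO failure, which is why the `NameError` in the error handler defeated it.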

@zhenggen-xu zhenggen-xu merged commit c40f3b9 into zhenggen-xu:201811-TH3 Nov 18, 2020