[Mellanox] Adjust log level to avoid too many thermal logs #15

Closed
wants to merge 3 commits into from


Junchao-Mellanox

- Why I did it

  1. Some logs are printed every minute, which is unnecessary
  2. When a PSU is not present, we should not set its fan speed, because doing so triggers error logs

- How I did it

  1. Print the thermal log only when the status changes
  2. Check that the sysfs file exists before reading it
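
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `StatusChangeLogger` and `read_sysfs_int` are hypothetical names, and the sysfs paths a real platform would read are not shown here.

```python
import logging
import os

logger = logging.getLogger("thermal")

# Marker for "no status recorded yet" so that None can be a valid status.
_MISSING = object()


class StatusChangeLogger:
    """Hypothetical helper: log only when a tracked status changes."""

    def __init__(self):
        self._last = {}

    def update(self, key, status, message):
        """Log `message` and return True only if `status` differs from the last one seen for `key`."""
        if self._last.get(key, _MISSING) == status:
            return False  # unchanged, so stay quiet on this polling cycle
        self._last[key] = status
        logger.warning(message)
        return True


def read_sysfs_int(path, default=None):
    """Hypothetical helper: read an integer from a sysfs-style file.

    Returns `default` when the file is missing (for example, an absent PSU)
    or when its content cannot be read or parsed, instead of raising.
    """
    if not os.path.exists(path):
        return default
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default
```

With this shape, a loop that polls every minute can call `update()` on each pass and will only emit a log line on the first observation and on later transitions.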

- How to verify it

Manually verify

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@Junchao-Mellanox

Community PR sonic-net#4631

Junchao-Mellanox pushed a commit that referenced this pull request Jun 22, 2020
…sonic-net#4702)

Update the AS7312-54X, AS7312-54XS, and AS7315-27XB config.bcm files so that the following error message no longer appears:

configuration: format error in /usr/share/sonic/hwsku/th-as7312-48x25G+6x100G.config.bcm on line 110 (ignored)#15
Junchao-Mellanox pushed a commit that referenced this pull request Dec 14, 2020
This update brings in the following commits.

86c1108 Enable arm architecture to build in addition to amd64 (#37)
4acb2c3 fix bugs and enhance Transformer (#35)
49e5a22 ygot related enhancements and fixes (#34)
51224de Fix ietf yang search path for cvl schema builds (#32)
3c6cdb3 CVL Changes #8: 'must' and 'when' expression evaluation (#31)
dabf231 CVL Changes #7: 'leafref' evaluation (#28)
6f9535f CVL Changes #6: Customized Xpath Engine integration (#27)
5e2466b DB-Layer fixes/enhancements (#26)
9a27302 CVL Changes #4: Implementation of new CVL APIs (#22)
dbf1093 Translib support for authorization, yang versioning and Delete flag (#21)
80f369e CVL Changes #5: YParser enhancement (#23)
904ce18 CVL Changes #3: Multi-db instance support (#20)
9d24a34 CVL Changes #2:  YValidator infra changes for evaluating xpath expression (#19)
f3fc40f CVL Changes #1: Initial CVL code reorganization and common infra changes (#18)
4922601 Bulk and RPC API support in translib (#16)
1d730df RFC7895 yang module library implementation (#15)
Junchao-Mellanox pushed a commit that referenced this pull request Feb 1, 2021
…tool: not found (sonic-net#6615)

Starting with BRCM SAI 4.3.1.5, we see the following "ethtool: not found" error in syslog during boot-up:
```
Jan 27 07:36:14.712472 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1:
Jan 27 07:36:14.712844 str-s6100-acs-1 INFO syncd#/supervisord: syncd ethtool: not found
Jan 27 07:36:14.713228 str-s6100-acs-1 INFO syncd#/supervisord: syncd #15
Jan 27 07:36:14.713840 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet39 speed 40000 rc:32512
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initializePort: Initializing port alias:Ethernet36 pid:1000000000040
Jan 27 07:36:14.726793 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet36 admin:0 oper:0 addr:4c:76:25:f5:48:80 ifindex:75 master:0
Jan 27 07:36:14.727967 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet36(ok) to state db
Jan 27 07:36:14.729331 str-s6100-acs-1 NOTICE swss#orchagent: :- addHostIntfs: Create host interface for port Ethernet36
Jan 27 07:36:14.752398 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1: ethtool: not found#015
Jan 27 07:36:14.752689 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet36 speed 40000 rc:32512
Jan 27 07:36:14.756050 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet36
Jan 27 07:36:14.757585 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet36
```
It seems that starting with BRCM SAI 4.2.1.5, syncd uses ethtool to set the host interface speed, and since ethtool was not part of the syncd Docker, we observe this "ethtool not found" issue.
Junchao-Mellanox pushed a commit that referenced this pull request Mar 1, 2021
…tool: not found (sonic-net#6615)

Starting with BRCM SAI 4.3.1.5, we see the following "ethtool: not found" error in syslog during boot-up:
```
Jan 27 07:36:14.712472 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1:
Jan 27 07:36:14.712844 str-s6100-acs-1 INFO syncd#/supervisord: syncd ethtool: not found
Jan 27 07:36:14.713228 str-s6100-acs-1 INFO syncd#/supervisord: syncd #15
Jan 27 07:36:14.713840 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet39 speed 40000 rc:32512
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet39
Jan 27 07:36:14.717204 str-s6100-acs-1 NOTICE swss#orchagent: :- initializePort: Initializing port alias:Ethernet36 pid:1000000000040
Jan 27 07:36:14.726793 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet36 admin:0 oper:0 addr:4c:76:25:f5:48:80 ifindex:75 master:0
Jan 27 07:36:14.727967 str-s6100-acs-1 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet36(ok) to state db
Jan 27 07:36:14.729331 str-s6100-acs-1 NOTICE swss#orchagent: :- addHostIntfs: Create host interface for port Ethernet36
Jan 27 07:36:14.752398 str-s6100-acs-1 INFO syncd#/supervisord: syncd sh: 1: ethtool: not found#015
Jan 27 07:36:14.752689 str-s6100-acs-1 INFO syncd#syncd: [0] SAI_API_HOSTIF:_brcm_sai_hostif_speed_set:11894 cmd ethtool -s Ethernet36 speed 40000 rc:32512
Jan 27 07:36:14.756050 str-s6100-acs-1 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet36
Jan 27 07:36:14.757585 str-s6100-acs-1 NOTICE swss#orchagent: :- initPort: Initialized port Ethernet36
```
It seems that starting with BRCM SAI 4.2.1.5, syncd uses ethtool to set the host interface speed, and since ethtool was not part of the syncd Docker, we observe this "ethtool not found" issue.
Junchao-Mellanox pushed a commit that referenced this pull request Jan 17, 2022
* [BFN] Updated platform APIs impl

Signed-off-by: Andriy Kokhan <[email protected]>

* Extended BFN platform SFP APIs implementation

* Update sfp.py

* [BFN] Extended SFP platform plugin implementation

Signed-off-by: Andriy Kokhan <[email protected]>

* [BFN] Extended Fans platform plugin implementation

* [BFN] divided classes Fan and  FanDrawer into 2 files

* Signed-off-by: Vadym Yashchenko <[email protected]>

What I did
	Add get_model() function
	Add get_low_critical_threshold() function
	Change __get(...) function.
How I did it
	The difference from the previous implementation of __get(...) is that it returns the real value, or -9999.9 if the value is not provided by the thrift API

* Add get_presence() function and revised __get() function

Signed-off-by: Vadym Yashchenko <[email protected]>

* [BFN] Updated PSU platform APIs impl

Signed-off-by: Dmytro Lytvynenko <[email protected]>

* Added BFN PSU cache (#9)

Signed-off-by: Andriy Kokhan <[email protected]>

* [BFN]  Fans and Fantray platform APIs update (#7)

* [BFN] Updated SFP platform APIs (#10)

Signed-off-by: Volodymyr Boyko <[email protected]>

* [BFN] Updated platform API for thermal (#8)

* Signed-off-by: Vadym Yashchenko <[email protected]>

* Revert "[BFN]  Fans and Fantray platform APIs update (#7)" (#11)

This reverts commit c62a733.

* Add support health monitor system (#15)

Signed-off-by: Petro Bratash <[email protected]>

* Update chassis.py

* [BFN] Updated FANs and FAN Tray platform API (#14)

* Fix fix_alignment (#17)

Signed-off-by: Petro Bratash <[email protected]>

* [BFN] Improvement show environment (#16)

* Added PSU temperature skip into platform.json (#18)

Signed-off-by: Andriy Kokhan <[email protected]>

* Do not skip psud on Newport

Signed-off-by: Andriy Kokhan <[email protected]>

* [BFN] fix fan status from Not OK to Ok (#19)

* [BFN] Updated SFP platform plugin (#13)

Signed-off-by: Volodymyr Boyko <[email protected]>

* [DPB] Fix typo for Ethernet0 2x200G[100G,40G] breakout mode (#21)

Signed-off-by: Mykola Gerasymenko <[email protected]>

* [barefoot] Tmp fix vendor_rev (#22)

Signed-off-by: Volodymyr Boyko <[email protected]>

* Fixed python issues in sonic_platform/fan_drawer.py

Signed-off-by: Andriy Kokhan <[email protected]>

* Updated fan_drawer.py

* Fixing trailing white spaces in fan_drawer.py

* [BFN] Fix thrift for SFPs API

Signed-off-by: Volodymyr Boyko <[email protected]>

* In platform.json, replaced 'false' with '0' to workaround ast.literal_eval() issue

Signed-off-by: Andriy Kokhan <[email protected]>

* [Newport] Thermal manager  (#23)

* Signed-off-by: Vadym Yashchenko <[email protected]>

* Revert "In platform.json, replaced 'false' with '0' to workaround ast.literal_eval() issue"

This reverts commit 1e73127.

* Removed 'controllable' options from platform.json to fix factory default config generation

Signed-off-by: Andriy Kokhan <[email protected]>

* Update thermal_manager.py

* Migrated SFP plugin to sonic_xcvr API (#30)

Signed-off-by: Andriy Kokhan <[email protected]>

Co-authored-by: KostiantynYarovyiBf <[email protected]>
Co-authored-by: Vadym Yashchenko <[email protected]>
Co-authored-by: Dmytro Lytvynenko <[email protected]>
Co-authored-by: Volodymyr Boiko <[email protected]>
Co-authored-by: Petro Bratash <[email protected]>
Co-authored-by: Mykola Gerasymenko <[email protected]>
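
The `__get(...)` behaviour described in the commit message above (return the real value, or -9999.9 when the thrift API does not provide one) could look roughly like this. It is only a sketch under assumptions: `get_value` and the dict-shaped `readings` argument are hypothetical stand-ins, since the actual thrift response type is not shown in this conversation.

```python
# Sentinel taken from the commit message; it signals "value not provided".
NOT_AVAILABLE = -9999.9


def get_value(readings, name):
    """Return the reading called `name` as a float, or the sentinel if it is absent.

    `readings` is assumed to be a dict-like object built from a thrift response.
    """
    value = readings.get(name)
    if value is None:
        return NOT_AVAILABLE
    return float(value)
```

Callers can then compare against `NOT_AVAILABLE` to tell a missing reading apart from a genuine measurement.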
Junchao-Mellanox pushed a commit that referenced this pull request Jan 21, 2022
[sonic-linkmgrd][master] submodule update

Commits added:
0c23756 Jing Zhang      2022-01-19      Linkmgrd subscribing State DB route event  (#13)
12b9951 Longxiang Lyu   2021-12-13      Add TLV support to ICMP payload (#11)
3eedda3 Longxiang Lyu   2022-01-06      Add missing intermediate states (#16)
8da4982 Ying Xie        2022-01-04      [linkmgrd] update README, set coding style guidance (#15)
a897cf8 Longxiang Lyu   2021-12-13      Improve PR template (#16)
6fec701 Jing Zhang      2021-12-06      Add pull request template for linkmgrd repo (#9)


signed-off-by: Jing Zhang [email protected]
Junchao-Mellanox pushed a commit that referenced this pull request Mar 14, 2022
[sonic-linkmgrd][master] submodule update

Commits added:
0c23756 Jing Zhang      2022-01-19      Linkmgrd subscribing State DB route event  (#13)
12b9951 Longxiang Lyu   2021-12-13      Add TLV support to ICMP payload (#11)
3eedda3 Longxiang Lyu   2022-01-06      Add missing intermediate states (#16)
8da4982 Ying Xie        2022-01-04      [linkmgrd] update README, set coding style guidance (#15)
a897cf8 Longxiang Lyu   2021-12-13      Improve PR template (#16)
6fec701 Jing Zhang      2021-12-06      Add pull request template for linkmgrd repo (#9)


signed-off-by: Jing Zhang [email protected]
Junchao-Mellanox pushed a commit that referenced this pull request Nov 21, 2023
SAI 9.x requires SYNCD_SHM_SIZE to be specified; otherwise it defaults to 64 MB, which is insufficient for syncd.

Examples of failures seen when insufficient shmem was set:

ha_init:  The file: warmboot_data_0 is of size=762[MB] and is beyond the directory: /dev/shm available storage of size=64[MB]#15
syncd.sh[26074]: Cannot get SYNCD_SHM_SIZE for chip: [869] in /usr/share/sonic/device/x86_64-broadcom_common/syncd_shm.ini. Skip set SYNCD_SHM_SIZE.

Syncd hangs here:

syncd#syncd: [none] SAI_API_SWITCH:_brcm_sai_shr_ha_section_resize:536 start=0x7f6e641b4000, end=0x7f6e645b4000, len=302276608, free=0x7f6e641b4000
Broadcom recommended using 1 GB for DNX devices.

Since we currently don't use SAI 9.x on master and 202305, this change won't fix anything until we upgrade the SAI on those branches.
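
The first failure above amounts to a file being larger than the storage available in /dev/shm. A quick way to check for that condition ahead of time might look like the sketch below. `fits_in_dir` is a hypothetical helper, not part of syncd or SAI, and it compares against free space, which is an assumption about what the log line calls "available storage".

```python
import shutil

MB = 1024 * 1024


def fits_in_dir(required_bytes, directory="/dev/shm"):
    """Return True if the filesystem holding `directory` has at least `required_bytes` free."""
    return shutil.disk_usage(directory).free >= required_bytes


# Example: the warmboot file in the log above was reported as 762 MB.
# fits_in_dir(762 * MB) would be False on a 64 MB /dev/shm.
```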
Junchao-Mellanox pushed a commit that referenced this pull request Jul 29, 2024
…-net#19637)

Broadcom requires that programmability_ucode_relative_path is set in SAI11.
This soc property replaces the legacy custom_feature_ucode_path

Without this we get the following error:

syncd#supervisord: syncd 0:dbx_file_get_db_location: DB Resource not defined#015
syncd#supervisord: syncd #15#015
syncd#supervisord: syncd 0:dnx_init_pemla_get_ucode_filepath:  Error 'Invalid parameter' indicated ; #15#015
Junchao-Mellanox pushed a commit that referenced this pull request Aug 21, 2024
…-net#19637)

Broadcom requires that programmability_ucode_relative_path is set in SAI11.
This soc property replaces the legacy custom_feature_ucode_path

Without this we get the following error:

syncd#supervisord: syncd 0:dbx_file_get_db_location: DB Resource not defined#015
syncd#supervisord: syncd #15#015
syncd#supervisord: syncd 0:dnx_init_pemla_get_ucode_filepath:  Error 'Invalid parameter' indicated ; #15#015
Junchao-Mellanox pushed a commit that referenced this pull request Dec 11, 2024
…7250E platform (sonic-net#20367)

Update sonic-platform submodule for Nokia-IXR7250E:
Fixes Nokia-ION/ndk#57

cdfbbe2 [H4-32D]Update platform modules after OC tests (Update README.md #17)
f28eff0 [H4-64D]Fix SFP+ port, eeprom, reboot-cause, thermal algorithm, add PSU input voltage check (Fix rules in Makefiles #15)
178e15a Minor watchdog change for better retention of last kick stamp
c479392 Remove rogue platform_reboot file
331abe0 Enhance watchdog script to detect fsde device hung signature
4c6b7c1 Fixed update temperature issue
5002fb7 Remove average and maximum
c620130 No PSU Master status led in IMM. No need to set it

Signed-off-by: mlok <[email protected]>
Junchao-Mellanox pushed a commit that referenced this pull request Dec 11, 2024
…ly (sonic-net#20847)

#### Why I did it
src/sonic-bmp
```
* bfbd47b - (HEAD -> master, origin/master, origin/HEAD) Merge pull request #15 from FengPan-Frank/makefile (13 hours ago) [Feng-msft]
* ad31f5b - Create makefile for build image flow (17 hours ago) [Feng Pan]
```
#### How I did it
#### How to verify it
#### Description for the changelog