Malcolm Sensor Temperature dashboard issue #265

Closed
patrickpritchett opened this issue Sep 15, 2023 · 6 comments

@patrickpritchett

The Malcolm Sensor Temperature dashboard does not show temperatures from the Hedgehog server motherboard (hedgehog1 below), but it does show them from the laptop (hedgehog2), even though both hosts appear in the drop-down list on that dashboard.
hedgehog1 motherboard:

Base Board Information
	Manufacturer: ASUSTeK COMPUTER INC.
	Product Name: R1505I-IM-B
	Version: Default string
	Serial Number: 220298003200207
	Asset Tag: Default string
	Features:
		Board is a hosting board
		Board is replaceable
	Location In Chassis: Default string
	Chassis Handle: 0x0003
	Type: Motherboard
	Contained Object Handles: 0

Sensor type options on dashboard:

acpitz
x86_pkg_temp

sensor@hedgehog1:~$ sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +44.5°C  

nvme-pci-0300
Adapter: PCI adapter
Composite:    +34.9°C  (low  =  -0.1°C, high = +84.8°C)
                       (crit = +94.8°C)
Sensor 1:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +40.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 8:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)

amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:           N/A  
vddnb:            N/A  
edge:         +44.0°C

sensor@hedgehog2:~$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +46.0°C  (high = +87.0°C, crit = +105.0°C)
Core 0:        +45.0°C  (high = +87.0°C, crit = +105.0°C)
Core 1:        +39.0°C  (high = +87.0°C, crit = +105.0°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +43.0°C  (crit = +200.0°C)

thinkpad-isa-0000
Adapter: ISA adapter
fan1:           0 RPM
CPU:          +43.0°C  
GPU:           +0.0°C  
temp3:         +0.0°C  
temp4:         +0.0°C  
temp5:         +0.0°C  
temp6:         +0.0°C  
temp7:         +0.0°C  
temp8:         +0.0°C  

BAT0-acpi-0
Adapter: ACPI interface
in0:          12.33 V  

mmguero added the bug (Something isn't working) and sensor (For issues dealing with the Hedgehog OS capture sensor) labels Sep 15, 2023
mmguero added this to Malcolm Sep 15, 2023
mmguero moved this to Todo (investigate) in Malcolm Sep 15, 2023
mmguero added this to the v23.10.0 milestone Sep 15, 2023
@mmguero
Collaborator

mmguero commented Sep 21, 2023

Running fluent-bit (which is what's doing the temperature measurement on Hedgehog) manually:

/opt/fluent-bit/bin/fluent-bit -R /etc/fluent-bit/parsers.conf -i thermal -p Interval_Sec=10 -o stdout
Fluent Bit v2.1.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/09/21 13:36:12] [ info] [fluent bit] version=2.1.9, commit=, pid=1055697
[2023/09/21 13:36:12] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/09/21 13:36:12] [ info] [cmetrics] version=0.6.3
[2023/09/21 13:36:12] [ info] [ctraces ] version=0.3.1
[2023/09/21 13:36:12] [ info] [input:thermal:thermal.0] initializing
[2023/09/21 13:36:12] [ info] [input:thermal:thermal.0] storage_strategy='memory' (memory only)
[2023/09/21 13:36:12] [ warn] [input:thermal:thermal.0] thermal device file not found
[2023/09/21 13:36:12] [ info] [sp] stream processor started
[2023/09/21 13:36:12] [ info] [output:stdout:stdout.0] worker #0 started
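
The warning above suggests that this fluent-bit build found no thermal device files to read on hedgehog1, even though sensors reports temperatures there. A minimal sketch for checking which temperature sources the kernel exposes, assuming the standard Linux sysfs layouts (/sys/class/thermal/thermal_zone*/temp and /sys/class/hwmon/hwmon*/temp*_input, both in millidegrees Celsius). The function names and the record shape are hypothetical; this is not fluent-bit's own code:

```python
from pathlib import Path


def _read_text(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def _read_celsius(path):
    """Convert a sysfs millidegree value to degrees Celsius, or None on failure."""
    text = _read_text(path)
    try:
        return int(text) / 1000.0
    except (TypeError, ValueError):
        return None


def list_temperature_sources(sysfs_root="/sys"):
    """List readable temperatures under thermal zones and under hwmon (hypothetical helper)."""
    root = Path(sysfs_root)
    found = {"thermal": [], "hwmon": []}

    # Thermal zones: one "type" and one "temp" file per zone.
    for zone in sorted(root.glob("class/thermal/thermal_zone*")):
        temp = _read_celsius(zone / "temp")
        if temp is not None:
            zone_type = _read_text(zone / "type") or "unknown"
            found["thermal"].append({"name": zone.name, "type": zone_type, "temp": temp})

    # hwmon chips: one "name" file per chip and any number of tempN_input files.
    for chip in sorted(root.glob("class/hwmon/hwmon*")):
        chip_type = _read_text(chip / "name") or "unknown"
        for temp_file in sorted(chip.glob("temp*_input")):
            temp = _read_celsius(temp_file)
            if temp is not None:
                found["hwmon"].append(
                    {"name": f"{chip.name}_{temp_file.name}", "type": chip_type, "temp": temp}
                )
    return found


if __name__ == "__main__":
    for source, records in list_temperature_sources().items():
        print(source, records)
```

If the "thermal" list comes back empty while "hwmon" does not, the host only exposes temperatures through hwmon, which would be consistent with the warning in the log above.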

@mmguero
Collaborator

mmguero commented Sep 21, 2023

See fluent/fluent-bit#7955

mmguero added the external (Depends on a bug or feature external to this project) label Sep 21, 2023
mmguero self-assigned this Sep 21, 2023
@mmguero
Collaborator

mmguero commented Sep 21, 2023

Alternatives if I can't get fluent-bit working:

  • We could just run sensors -j periodically and parse the output ourselves
    • I don't love this because it's not as cross-platform, and we'd need to format the output exactly like what we get from fluent-bit so that others using fluent-bit directly (not on Hedgehog) still work too
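
A rough sketch of what that alternative could look like, assuming the JSON layout that sensors -j prints (chip name, then feature label, then tempN_input readings). The function name, the record names, and the SAMPLE data are illustrative only: sensors -j does not expose the hwmon index, so the "name" field here would not match what fluent-bit emits, which is exactly the formatting concern raised above:

```python
import json


def thermal_records(sensors_json):
    """Flatten parsed `sensors -j` output into name/type/temp records (hypothetical helper)."""
    records = []
    for chip, features in sensors_json.items():
        # Assumption: the driver name is the part before the first "-",
        # e.g. "k10temp-pci-00c3" -> "k10temp".
        chip_type = chip.split("-", 1)[0]
        for label, readings in features.items():
            if not isinstance(readings, dict):
                continue  # skips entries such as "Adapter": "PCI adapter"
            for key, value in readings.items():
                # Keep only current temperature readings, not limits such as temp1_crit.
                if key.startswith("temp") and key.endswith("_input"):
                    records.append(
                        {"name": f"{chip}_{key}", "type": chip_type, "temp": float(value)}
                    )
    return records


# Illustrative sample shaped like the hedgehog1 output in this issue (not captured data).
SAMPLE = json.loads("""
{
  "k10temp-pci-00c3": {"Adapter": "PCI adapter", "Tctl": {"temp1_input": 44.5}},
  "nvme-pci-0300": {
    "Adapter": "PCI adapter",
    "Composite": {"temp1_input": 34.9, "temp1_crit": 94.8},
    "Sensor 1": {"temp2_input": 34.9}
  },
  "amdgpu-pci-0400": {"Adapter": "PCI adapter", "edge": {"temp1_input": 44.0}}
}
""")

for record in thermal_records(SAMPLE):
    print(record)
```

On a real host the input would come from something like json.loads(subprocess.run(["sensors", "-j"], capture_output=True, text=True, check=True).stdout), run on a timer.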

@mmguero
Collaborator

mmguero commented Oct 9, 2023

With a patch provided by a fluent-bit developer (see fluent/fluent-bit#8016), we can get temperature data. Once this patch is merged into the main fluent-bit branch and released, I will close this bug.

mmguero moved this from Todo (investigate) to In Progress in Malcolm Oct 23, 2023
mmguero modified the milestones: v23.10.0, v23.11.0 Oct 23, 2023
mmguero moved this from In Progress to In Progress (external) in Malcolm Oct 25, 2023
mmguero moved this from In Progress (external) to Testing in Malcolm Nov 9, 2023
@mmguero
Collaborator

mmguero commented Nov 9, 2023

This should be fixed with fluent-bit v2.2.0; testing now.

mmguero closed this as completed Nov 16, 2023
github-project-automation bot moved this from Testing to Done in Malcolm Nov 16, 2023
This was referenced Dec 4, 2023
mmguero moved this from Done to Released in Malcolm Dec 5, 2023
@mmguero
Collaborator

mmguero commented Dec 5, 2023

Confirmed working in v23.12.0:

sensor@pelican:~$ /opt/fluent-bit/bin/fluent-bit -R /etc/fluent-bit/parsers.conf -i thermal -p Interval_Sec=10 -o stdout -m '*'
Fluent Bit v2.2.0
* Copyright (C) 2015-2023 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/12/06 02:20:27] [ info] [fluent bit] version=2.2.0, commit=, pid=3300
[2023/12/06 02:20:27] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/12/06 02:20:27] [ info] [cmetrics] version=0.6.4
[2023/12/06 02:20:27] [ info] [ctraces ] version=0.3.1
[2023/12/06 02:20:27] [ info] [input:thermal:thermal.0] initializing
[2023/12/06 02:20:27] [ info] [input:thermal:thermal.0] storage_strategy='memory' (memory only)
[2023/12/06 02:20:27] [ info] [sp] stream processor started
[2023/12/06 02:20:27] [ info] [output:stdout:stdout.0] worker #0 started
[0] thermal.0: [[1701829237.550501248, {}], {"name"=>"hwmon2_temp1_input", "type"=>"k10temp", "temp"=>29.500000}]
[1] thermal.0: [[1701829237.550511337, {}], {"name"=>"hwmon0_temp1_input", "type"=>"nvme", "temp"=>36.850000}]
[2] thermal.0: [[1701829237.550515685, {}], {"name"=>"hwmon0_temp2_input", "type"=>"nvme", "temp"=>36.850000}]
[3] thermal.0: [[1701829237.550519342, {}], {"name"=>"hwmon0_temp3_input", "type"=>"nvme", "temp"=>44.850000}]
[4] thermal.0: [[1701829237.550522688, {}], {"name"=>"hwmon0_temp9_input", "type"=>"nvme", "temp"=>36.850000}]
[5] thermal.0: [[1701829237.550525974, {}], {"name"=>"hwmon1_temp1_input", "type"=>"amdgpu", "temp"=>29.000000}]
^C[2023/12/06 02:20:41] [engine] caught signal (SIGINT)
[2023/12/06 02:20:41] [ warn] [engine] service will shutdown in max 5 seconds
[2023/12/06 02:20:41] [ info] [input] pausing thermal.0
[2023/12/06 02:20:42] [ info] [engine] service has stopped (0 pending tasks)
[2023/12/06 02:20:42] [ info] [input] pausing thermal.0
[2023/12/06 02:20:42] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/12/06 02:20:42] [ info] [output:stdout:stdout.0] thread worker #0 stopped
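
For anyone wanting to check the "type" values in records like the ones above against the sensor type options on the dashboard, here is a small sketch that parses the stdout lines into dictionaries. The regular expression and the function name are assumptions based only on the line shape shown in this log, not on any documented fluent-bit format guarantee:

```python
import json
import re

# Matches lines such as:
# [0] thermal.0: [[1701829237.550501248, {}], {"name"=>"...", "type"=>"...", "temp"=>29.500000}]
RECORD_RE = re.compile(r'^\[(\d+)\] (\S+): \[\[([0-9.]+), \{.*?\}\], (\{.*\})\]$')


def parse_stdout_record(line):
    """Parse one stdout record line; return None for log or banner lines (hypothetical helper)."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None
    index, tag, timestamp, body = match.groups()
    # The stdout output prints key/value pairs as "key"=>value; rewrite to JSON syntax.
    # Assumption: no value contains the literal sequence "=> itself.
    record = json.loads(body.replace('"=>', '":'))
    return {"index": int(index), "tag": tag, "timestamp": float(timestamp), "record": record}


line = '[0] thermal.0: [[1701829237.550501248, {}], {"name"=>"hwmon2_temp1_input", "type"=>"k10temp", "temp"=>29.500000}]'
print(parse_stdout_record(line))
```

Collecting the distinct "type" values from a run like this (k10temp, nvme, amdgpu here) shows what the patched thermal input reports on this hardware.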
