-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only report core throttles per core, not per cpu #836
Only report core throttles per core, not per cpu #836
Conversation
collector/cpu_linux.go
Outdated
// cpu loop | ||
for _, cpu := range cpus { | ||
_, cpuName := filepath.Split(cpu) | ||
cpuNum := strings.TrimPrefix(cpuName, "cpu") | ||
|
||
core_id := -1 | ||
if value, err := readUintFromFile(filepath.Join(cpu, "topology/core_id")); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we only need core_id
for thermal throttles, let's move this read to after we check for thermal throttles. Then we can avoid the -1
handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it to an outer if as there is no need to read thermal throttles if we don't have the core_id.
CI is running....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I think this code is problematic on multi-socket systems: The cpu loop iterates over all cpus and writes the
Example (source):
Another example: Here's the metric output of one of my systems. It shows 12 metrics although this system has 24 physical cores (also notice that interesting and confusing
This is the cpu:
I.e. we only collect the metrics from CPUs 12-23 in this example and AFAICS the
Q: Should the |
Related: #867 - That "fixes" the tests, hiding this problem. So yeah I think we need a 2nd label or concate core+package somehow. |
(The 2nd label would be sourced from |
* Only report core throttles per core, not per cpu * Add topology/core_id to the cpu sysfs fixtures * Add new cpu fixtures to ttar file * Merge core_id reading and thermal throttle accounting * Declare core_id
Fixes #659