-
-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mitigate cpu_usage()
from returning NaN on Macs
#1236
Mitigate cpu_usage()
from returning NaN on Macs
#1236
Conversation
829f614
to
3c6dcc7
Compare
I believe the clippy failure to be unrelated to this PR, please let me know if I should take action to resolve it. |
Hi @Hoverbear! o/ First of all: thanks a lot for the PR, the detailed explanations of the issue and for adding test. It's really awesome. :) Just one nit and I merge. As for the clippy lint, it's not from your code, just last nightly clippy version. Can you add a |
So good to bump into you again, my friend. :) Really appreciate this crate and your work on it! |
cpu_usage()
from returning NaN on Macscpu_usage()
from returning NaN on Macs
I renamed this issue, I believe If the measurements of the |
Could you not simply add a check and if it's 0, just return 0? |
Forgot to precise: in addition to your add (which I think is a very good idea). |
Hmmmm... I believe that reporting zero would be incorrect, no? It's possible that say, Let's say: let old_in_use = 200;
let new_in_use = 200;
let old_idle = 4000;
let new_idle = 4000;
// Results in
let in_use = (new_in_use - old_in_use); // 0
let total = (new_in_use - old_in_use) + (new_idle - old_idle); // 0 Then we'd go divide by Therefore, to be safe here, we can check to see if the measurement is the same, and return the just the straight measurement instead of the difference if so. let new_in_use = get_in_use(cpu_info, offset);
let old_in_use = get_in_use(old_cpu_info, offset);
let new_idle = get_idle(cpu_info, offset);
let old_idle = get_idle(old_cpu_info, offset);
// Check the old measurement and new one differ before trying to calculate the difference
// Avoid `total` becoming a zero since `in_use / 0 == NaN`
if new_in_use == old_in_use && new_idle == old_idle {
in_use = new_in_use.saturating_sub(old_in_use);
total = new_in_use.saturating_add(new_idle as _);
} else {
in_use = new_in_use.saturating_sub(old_in_use);
total = in_use.saturating_add(new_idle.saturating_sub(old_idle) as _);
} Then, even still, it's possible to get a let old_in_use = 400; // It's bigger!
let new_in_use = 200; // It's smaller!
let old_idle = 4000;
let new_idle = 4000; // Unchanged because... CPU Frequency scaled(?)
// Results in
let in_use = (new_in_use - old_in_use); // 0
let total = (new_in_use - old_in_use) + (new_idle - old_idle); // 0 Heck! In this case we now need to do even more work, and how do we provide something accurate? 😵 Instead, I suggest we set let in_use;
let idle;
// In case we are initializing cpus, there is no "old value" yet.
if old_cpu_info == cpu_info {
in_use = get_in_use(cpu_info, offset);
idle = get_idle(cpu_info, offset);
} else {
let new_in_use = get_in_use(cpu_info, offset);
let old_in_use = get_in_use(old_cpu_info, offset);
let new_idle = get_idle(cpu_info, offset);
let old_idle = get_idle(old_cpu_info, offset);
in_use = new_in_use.saturating_sub(old_in_use);
idle = new_idle.saturating_sub(old_idle) as _;
}
let total = in_use.saturating_add(idle as _);
let usage = (in_use as f32 / total as f32) * 100.;
if usage.is_nan() {
// If divided by zero, avoid returning a NaN
0.
} else {
usage
} Please let me know if my brain became momentarily vacant and there would be a more appropriate fix? Here's a log from a run using that code: ~/git/GuillameGomez/sysinfo> cargo run --example simple 2024-03-28 06:53:03 AM
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
Running `target/debug/examples/simple`
Getting system information...
Done.
To get the commands' list, enter 'help'.
> refresh_cpu
Refreshing CPUs...
Done.
> cpus
number of physical cores: 12
total CPU usage: 3.2543619%
Cpu { name: "1", CPU usage: 12.703101, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "2", CPU usage: 10.895523, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "3", CPU usage: 5.8479533, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "4", CPU usage: 3.9358602, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "5", CPU usage: 2.1802325, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "6", CPU usage: 1.1678832, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "7", CPU usage: 0.5805515, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "8", CPU usage: 0.14492755, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "9", CPU usage: 0.5805515, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "10", CPU usage: 0.8708273, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "11", CPU usage: 0.0, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "12", CPU usage: 0.14492755, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
> refresh_cpu
Refreshing CPUs...
Done.
> cpus
number of physical cores: 12
total CPU usage: 3.3380797%
Cpu { name: "1", CPU usage: 13.250883, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "2", CPU usage: 10.229277, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "3", CPU usage: 8.391608, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "4", CPU usage: 2.7972028, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "5", CPU usage: 1.9130435, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "6", CPU usage: 0.8695652, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "7", CPU usage: 1.0416667, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "8", CPU usage: 0.0, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "9", CPU usage: 0.34843206, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "10", CPU usage: 1.2152778, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "11", CPU usage: 0.0, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" }
Cpu { name: "12", CPU usage: 0.0, frequency: 4056, vendor ID: "Apple", brand: "Apple M3 Pro" } |
I agree with you. New code looks better too. Thanks a lot for this! |
…o-0.30.8, r=Nilstrieb Update sysinfo to 0.30.8 Fixes a Mac specific issue when using `metrics = true` in `config.toml`. ```config.toml # Collect information and statistics about the current build and writes it to # disk. Enabling this or not has no impact on the resulting build output. The # schema of the file generated by the build metrics feature is unstable, and # this is not intended to be used during local development. metrics = true ``` During repeated builds, as the generated `metrics.json` grew, eventually `refresh_cpu()` would be called in quick enough succession (specifically: under 200ms) that a divide by zero would occur, leading to a `NaN` which would not be serialized, then when the `metrics.json` was re-read it would fail to parse. That error looks like this (collected from Ferrocene's CI): ``` Compiling rustdoc-tool v0.0.0 (/Users/distiller/project/src/tools/rustdoc) Finished release [optimized] target(s) in 38.37s thread 'main' panicked at src/utils/metrics.rs:180:21: serde_json::from_slice::<JsonRoot>(&contents) failed with invalid type: null, expected f64 at line 1 column 9598 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace Build completed unsuccessfully in 0:00:40 Exited with code exit status 1 ``` Related: GuillaumeGomez/sysinfo#1236
Rollup merge of rust-lang#123378 - ferrocene:hoverbear/bump-sysinfo-to-0.30.8, r=Nilstrieb Update sysinfo to 0.30.8 Fixes a Mac specific issue when using `metrics = true` in `config.toml`. ```config.toml # Collect information and statistics about the current build and writes it to # disk. Enabling this or not has no impact on the resulting build output. The # schema of the file generated by the build metrics feature is unstable, and # this is not intended to be used during local development. metrics = true ``` During repeated builds, as the generated `metrics.json` grew, eventually `refresh_cpu()` would be called in quick enough succession (specifically: under 200ms) that a divide by zero would occur, leading to a `NaN` which would not be serialized, then when the `metrics.json` was re-read it would fail to parse. That error looks like this (collected from Ferrocene's CI): ``` Compiling rustdoc-tool v0.0.0 (/Users/distiller/project/src/tools/rustdoc) Finished release [optimized] target(s) in 38.37s thread 'main' panicked at src/utils/metrics.rs:180:21: serde_json::from_slice::<JsonRoot>(&contents) failed with invalid type: null, expected f64 at line 1 column 9598 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace Build completed unsuccessfully in 0:00:40 Exited with code exit status 1 ``` Related: GuillaumeGomez/sysinfo#1236
Describe the bug
Using the current latest checkout of this repo (or any 0.30.x version) on a Mac can result in
cpu.cpu_usage()
reporting aNaN
. This is primarily caused by this code:sysinfo/src/unix/apple/cpu.rs
Lines 239 to 244 in b7d7f31
If
refresh_cpu()
is called in rapid succession the collected data can be the same, resulting intotal
being 0, sincein_use
is divided bytotal
, this creates a divide by zero situation and can return a NaN.It can be worked around by waiting
sysinfo::MINIMUM_CPU_UPDATE_INTERVAL
between calls torefresh_cpu()
, as the data will (hopefully) no longer be the same.Some platforms, such as Linux, do this interval check internally:
sysinfo/src/unix/linux/cpu.rs
Lines 83 to 85 in b7d7f31
This PR takes the same strategy as Linux and enforces an interval internally.
I can confirm that as a result of this change,
cargo test
passess successfully on my aarch64 Mac.To Reproduce
In order to reproduce this outside of the Rust repo, I did a bit of light refactoring of
compute_usage_of_cpu
:Then I updated the the
test_cpu
test to rapidly callrefresh_cpu()
twice, and assert that none of the usage wasNaN
:With that done, I was able to reproduce the issue via
cargo test test_cpu -- --nocapture
:I confirmed that adding the delay resolves the issue.