Skip to content
This repository has been archived by the owner on Jul 27, 2023. It is now read-only.

Prevent uint underflows when calculating rates and percentages #201

Merged
merged 1 commit into from
Nov 2, 2018

Conversation

kevinconaway
Copy link
Contributor

This PR addresses the scenario where some metrics were being reported as impossibly large values.

The cause here was unsigned integer underflow as when we calculate the "rate" for a given metric, we calculate (current - previous) / elapsed time in seconds. In our case, current and previous are usually unsigned integers so if previous > current, the subtraction operation will overflow.

This usually only occurs in corner cases where the container metrics were not able to be collected for some reason, possibly because the container is shutting down.

@kevinconaway kevinconaway requested a review from a team October 30, 2018 14:01
@bits-bot
Copy link

bits-bot commented Oct 30, 2018

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@conorbranagan conorbranagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@@ -159,9 +159,14 @@ func calculateCtrPct(cur, prev, sys2, sys1 uint64, numCPU int, before time.Time)
return 0
}

// Prevent uint underflows
if prev > cur || sys1 > sys2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be returning a negative percent here instead of zero?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke with Kevin offline, and I misunderstood what this was supposed to catch.

Could this be more clearly written as:

if cur == 0 || sys2 == 0 {
   return 0
}

Since zero is a value we're treating specially?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this case yes. Given that the agent is more difficult to upgrade in the field, I'd prefer to be a bit defensive here though and guard against underflows in the case that the current values are non-zero and smaller than the previous ones which could occur in some future unforeseen circumstance.

Copy link
Member

@sunhay sunhay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the increased test coverage 📈

@kevinconaway kevinconaway merged commit 35ac442 into master Nov 2, 2018
@kevinconaway kevinconaway deleted the kevinconaway/fix-container-metrics-underflow branch November 2, 2018 15:14
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants