-
Notifications
You must be signed in to change notification settings - Fork 27
Conversation
9fc958d
to
9509c0a
Compare
monitoring/storage.go
Outdated
} | ||
|
||
// DiskSpaceCheckerID is the checker that checks disk space utilization | ||
const DiskSpaceCheckerID = "disk-space" | ||
|
||
// DefaultCriticalWatermark is the default critical disk usage percentage threshold. | ||
const DefaultCriticalWatermark = 90 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this constants alongside the other one (the low watermark), I think it's somewhere in lib/constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A watermark constant was not previously defined in satellite. The constant is defined in planet and passed to the storage checker.
monitoring/storage_linux.go
Outdated
TotalBytes: totalBytes, | ||
AvailableBytes: availableBytes, | ||
WatermarkCritical: c.HighWatermark, | ||
WatermarkWarning: c.HighWatermark - 10, // Set warning watermark 10% below the critical watermark |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just have 2 separate constants TBH for soft/hard limit. Is there any particular reason you wanted to "tie" them one to another this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want the watermark to be configurable/overrideable in emergency situations. If the values are separately configurable, we might run into incorrect behavior if the critical watermark is set to a lower value than the warning watermark. Having them tied together will make sure we don't run into that situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This scenario can be just checked during checker initialization, right? In Check(), as we usually do. And just return an error if that's the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that works. I guess it'd be nicer to have separate control over the two threshold values.
a99f81d
to
8186878
Compare
This should be also taken into account: gravitational/gravity#1748 |
Description
This PR updates the disk check and addresses items 1-3 of gravitational/gravity#1662. The disk/storage check will now report a warning or a critical probe at different thresholds. The default is set to 80% disk usage for a warning probe and 90% disk usage for a critical probe.
Linked tickets and PRs
Testing done
Testing done on version 5.5.47.
fallocate
is a quick way to alloate disk space for a file.Verify warning when disk usage > 80%
Verify critical when disk usage > 90%