This repository has been archived by the owner on Feb 9, 2024. It is now read-only.

[6.1.x] Update disk/storage check #1690

Merged
merged 2 commits into from
Jun 15, 2020


bernardjkim
Contributor

@bernardjkim bernardjkim commented Jun 11, 2020

Description

This PR updates the storage check to report separate warning and critical probes at configured thresholds.

It also updates the gravity upgrade command, which now checks the cluster status before starting. The upgrade is blocked if the cluster is degraded, and it requires the --force flag if any node has active warnings.
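
To illustrate the two-threshold idea, here is a minimal sketch rather than the code from this PR. The names `Severity` and `classifyDiskUsage` are hypothetical, and the assumption that a threshold trips only when usage strictly exceeds it is inferred from the "exceeds 80%" / "exceeds 90%" wording in the status output.

```go
package main

import "fmt"

// Severity is a hypothetical probe severity used only in this sketch.
type Severity int

const (
	OK Severity = iota
	Warning
	Critical
)

// String returns a human-readable name for the severity.
func (s Severity) String() string {
	switch s {
	case Warning:
		return "warning"
	case Critical:
		return "critical"
	default:
		return "ok"
	}
}

// classifyDiskUsage compares disk utilization against a low (warning) and a
// high (critical) threshold, both given in percent.
// Assumption: a threshold trips only when usage strictly exceeds it.
func classifyDiskUsage(availableBytes, totalBytes uint64, lowPct, highPct uint) Severity {
	if totalBytes == 0 || availableBytes > totalBytes {
		return OK // nothing sensible to measure
	}
	usedPct := float64(totalBytes-availableBytes) / float64(totalBytes) * 100
	switch {
	case usedPct > float64(highPct):
		return Critical
	case usedPct > float64(lowPct):
		return Warning
	default:
		return OK
	}
}

func main() {
	// Figures taken from the test transcripts in this PR (decimal GB).
	fmt.Println(classifyDiskUsage(3_500_000_000, 21_000_000_000, 80, 90))
	fmt.Println(classifyDiskUsage(1_300_000_000, 21_000_000_000, 80, 90))
}
```

Keeping the thresholds as parameters means the same function covers both the 80%/90% values seen in testing and manually configured watermarks.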

Type of change

  • Internal change (not necessarily a bug fix or a new feature)
  • This change has a user-facing impact

Linked tickets and other PRs

TODOs

  • Self-review the change
  • Perform manual testing
  • Address review feedback

Testing done

Verify warning if disk usage exceeds 80%

[vagrant@node-1 gravity]$ sudo gravity status
Cluster name:           dev.test
Cluster status:         active
Application:            telekube, version 6.1.29-dev.4
Gravity version:        6.1.29-dev.4 (client) / 6.1.29-dev.4 (server)
[...]
Cluster nodes:
    Masters:
        * node-1 / 172.28.128.101 / node
            Status:             healthy
            [!]                 disk utilization on /var/lib/gravity exceeds 80% (3.5GB is available out of 21GB), cluster will degrade if usage exceeds 90%, see https://gravitational.com/gravity/docs/cluster/#garbage-collection
            Remote access:      online

Verify critical if disk usage exceeds 90%

[vagrant@node-1 gravity]$ sudo gravity status
Cluster name:           dev.test
Cluster status:         degraded
Application:            telekube, version 6.1.29-dev.4
Gravity version:        6.1.29-dev.4 (client) / 6.1.29-dev.4 (server)
[...]
Cluster nodes:
    Masters:
        * node-1 / 172.28.128.101 / node
            Status:             degraded
            [×]                 disk utilization on /var/lib/gravity exceeds 90% (1.3GB is available out of 21GB), see https://gravitational.com/gravity/docs/cluster/#garbage-collection
            Remote access:      online

Error message when there are active warnings

[vagrant@node-1 gravity]$ sudo gravity upgrade
Some cluster nodes have active warnings:
    Masters:
        * node-2 / 172.28.128.102 / node
            Status:             healthy
            Remote access:      online
        * node-3 / 172.28.128.103 / node
            Status:             healthy
            Remote access:      online
        * node-1 / 172.28.128.101 / node
            Status:             healthy
            [!]                 disk utilization on /var/lib/gravity exceeds 80% (3.5GB is available out of 21GB), cluster will degrade if usage exceeds 90%, see https://gravitational.com/gravity/docs/cluster/#garbage-collection
            Remote access:      online
You can provide the --force flag to suppress this message and launch the upgrade anyways.
[ERROR]: failed to start upgrade operation

Verify upgrade operation can be forced if cluster has active warnings

[vagrant@node-1 gravity]$ sudo gravity upgrade --force
[ERROR]: update version (6.1.29-dev.4) must be greater than the currently installed version (6.1.29-dev.4)

Error message when the cluster is degraded

[vagrant@node-1 gravity]$ sudo gravity upgrade
The upgrade is prohibited because some cluster nodes are currently degraded.
    Masters:
        * node-2 / 172.28.128.102 / node
            Status:             healthy
            Remote access:      online
        * node-3 / 172.28.128.103 / node
            Status:             healthy
            Remote access:      online
        * node-1 / 172.28.128.101 / node
            Status:             degraded
            [×]                 disk utilization on /var/lib/gravity exceeds 90% (1.3GB is available out of 21GB), see https://gravitational.com/gravity/docs/cluster/#garbage-collection
            Remote access:      online
Please make sure the cluster is healthy before re-attempting the upgrade.
[ERROR]: failed to start upgrade operation
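
The pre-upgrade check exercised in the transcripts above could be sketched roughly as follows. This is not the code from this PR: `NodeStatus`, `checkUpgradeAllowed`, and the error values are hypothetical names. It also assumes that --force overrides warnings only and never a degraded cluster, which the description implies but does not state outright.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeStatus is a hypothetical, simplified view of a node's health.
type NodeStatus struct {
	Name     string
	Degraded bool
	Warnings []string
}

// Hypothetical errors for the two blocking conditions.
var (
	ErrDegraded = errors.New("upgrade prohibited: some cluster nodes are degraded")
	ErrWarnings = errors.New("some cluster nodes have active warnings; use --force to proceed")
)

// checkUpgradeAllowed decides whether an upgrade may start.
// Assumption: a degraded node always blocks the upgrade, while active
// warnings block it only when force is false.
func checkUpgradeAllowed(nodes []NodeStatus, force bool) error {
	hasWarnings := false
	for _, n := range nodes {
		if n.Degraded {
			return ErrDegraded
		}
		if len(n.Warnings) > 0 {
			hasWarnings = true
		}
	}
	if hasWarnings && !force {
		return ErrWarnings
	}
	return nil
}

func main() {
	nodes := []NodeStatus{
		{Name: "node-2"},
		{Name: "node-3"},
		{Name: "node-1", Warnings: []string{"disk utilization on /var/lib/gravity exceeds 80%"}},
	}
	fmt.Println(checkUpgradeAllowed(nodes, false))
	fmt.Println(checkUpgradeAllowed(nodes, true))
}
```

Checking for degraded nodes first keeps the stricter condition from being masked by the flag.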

Verify the low/high watermarks can be configured manually via PLANET_LOW_WATERMARK and PLANET_HIGH_WATERMARK

[vagrant@node-1 gravity]$ sudo gravity status
Cluster name:           dev.test
Cluster status:         active
Application:            telekube, version 6.1.29-dev.4
Gravity version:        6.1.29-dev.4 (client) / 6.1.29-dev.4 (server)
[...]
Cluster nodes:
    Masters:
        * node-1 / 172.28.128.101 / node
            Status:             healthy
            [!]                 disk utilization on /var/lib/gravity exceeds 40% (11GB is available out of 21GB), cluster will degrade if usage exceeds 60%, see https://gravitational.com/gravity/docs/cluster/#garbage-collection
            Remote access:      online
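
As a rough sketch of how such overrides might be read, consider the following. The variable names PLANET_LOW_WATERMARK and PLANET_HIGH_WATERMARK come from this PR; everything else is an assumption made for illustration: that the values are whole-number percentages, that the defaults are 80 and 90 (the values seen in the earlier transcripts), and that invalid input falls back to the defaults. `loadWatermarks` and `watermarkFrom` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Assumed defaults, matching the 80%/90% values seen in the test output.
const (
	defaultLowWatermark  uint = 80
	defaultHighWatermark uint = 90
)

// watermarkFrom reads a percentage from the given lookup function and falls
// back to def when the variable is unset or not an integer in [0, 100].
func watermarkFrom(lookup func(string) (string, bool), name string, def uint) uint {
	raw, ok := lookup(name)
	if !ok {
		return def
	}
	v, err := strconv.ParseUint(raw, 10, 8)
	if err != nil || v > 100 {
		return def
	}
	return uint(v)
}

// loadWatermarks returns the low and high watermarks in percent.
// Assumption: if low ends up above high, both revert to the defaults.
func loadWatermarks(lookup func(string) (string, bool)) (low, high uint) {
	low = watermarkFrom(lookup, "PLANET_LOW_WATERMARK", defaultLowWatermark)
	high = watermarkFrom(lookup, "PLANET_HIGH_WATERMARK", defaultHighWatermark)
	if low > high {
		return defaultLowWatermark, defaultHighWatermark
	}
	return low, high
}

func main() {
	low, high := loadWatermarks(os.LookupEnv)
	fmt.Printf("low watermark: %d%%, high watermark: %d%%\n", low, high)
}
```

Passing the lookup function in, instead of calling `os.LookupEnv` directly, keeps the parsing logic testable without touching the process environment.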

@bernardjkim bernardjkim added the port/6.1 Requires port to version/6.1.x label Jun 11, 2020
@bernardjkim bernardjkim requested review from a team, r0mant and a-palchikov June 11, 2020 18:23
@bernardjkim bernardjkim merged commit fdd96e8 into version/6.1.x Jun 15, 2020
@bernardjkim bernardjkim deleted the bernard/6.1.x/disk-check branch June 15, 2020 16:00