diff --git a/keps/prod-readiness/sig-node/2400.yaml b/keps/prod-readiness/sig-node/2400.yaml
index 1eb33a410c5..4741875582b 100644
--- a/keps/prod-readiness/sig-node/2400.yaml
+++ b/keps/prod-readiness/sig-node/2400.yaml
@@ -1,3 +1,5 @@
 kep-number: 2400
 alpha:
   approver: "@deads2k"
+beta:
+  approver: "@deads2k"
diff --git a/keps/sig-node/2400-node-swap/README.md b/keps/sig-node/2400-node-swap/README.md
index 7fa2b6aeea0..c3dc4faac1e 100644
--- a/keps/sig-node/2400-node-swap/README.md
+++ b/keps/sig-node/2400-node-swap/README.md
@@ -401,8 +401,14 @@ For alpha:
   and further development efforts.
 - Focus should be on supported user stories as listed above.
 
-Once this data is available, additional test plans should be added for the next
-phase of graduation.
+For beta:
+
+- Add e2e tests that exercise all available swap configurations via the CRI.
+- Add e2e tests that verify pod-level control of swap utilization.
+- Add e2e tests that verify swap performance with pods using a tmpfs.
+- Verify new system-reserved settings for swap memory.
+- Verify MemoryPressure behaviour with swap enabled and document any changes
+  for configuring eviction.
 
 ### Graduation Criteria
 
@@ -416,8 +422,6 @@ phase of graduation.
 
 #### Beta
 
-_(Tentative.)_
-
 - Add support for controlling swap consumption at the pod level [via cgroups].
 - Handle usage of swap during container restart boundaries for writes to tmpfs
   (which may require pod cgroup change beyond what container runtime will do at
@@ -426,6 +430,7 @@ _(Tentative.)_
   detects on the host.
 - Consider introducing new configuration modes for swap, such as a node-wide
   swap limit for workloads.
+- Add swap memory to the Kubelet stats API.
 - Determine a set of metrics for node QoS in order to evaluate the performance
   of nodes with and without swap enabled.
 - Better understand relationship of swap with memory QoS in cgroup v2
@@ -437,6 +442,8 @@ _(Tentative.)_
 
 #### GA
 
+_(Tentative.)_
+
 - Test a wide variety of scenarios that may be affected by swap support.
 - Remove feature flag.
 
@@ -587,6 +594,19 @@ Try to be as paranoid as possible - e.g., what if some components will restart
 mid-rollout?
 -->
 
+If a new node with swap memory fails to come online, it will not impact any
+running components.
+
+It is possible that if a cluster administrator adds swap memory to an already
+running node, and then performs an in-place upgrade, the new kubelet could fail
+to start unless the configuration was modified to tolerate swap. However, we
+would expect that if a cluster admin is adding swap to the node, they will also
+update the kubelet's configuration to not fail with swap present.
+
+Generally, it is considered best practice to add a swap memory partition at
+node image/boot time and not provision it dynamically after a kubelet is
+already running and reporting Ready on a node.
+
 ###### What specific metrics should inform a rollback?
+Workload churn or performance degradations on nodes. The metrics will be
+application/use-case specific, but we can provide some suggestions, based on
+the stability metrics identified earlier.
+
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
+N/A because swap support lacks a runtime upgrade/downgrade path; kubelet must
+be restarted with or without swap support.
+
 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
+No.
+
 ### Monitoring Requirements
+KubeletConfiguration has set `failSwapOn: false`.
+
+The Prometheus `node_exporter` will also export stats on swap memory
+utilization.
+
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
+TBD. We will determine a set of metrics as a requirement for beta graduation.
+We will need more production data; there is not a single metric or set of
+metrics that can be used to generally quantify node performance.
+
+This section must be updated before the feature can be marked as graduated,
+and will be worked on during 1.23 development.
+
+We will also add swap memory utilization to the Kubelet stats API, to provide a means of monitoring this beyond cadvisor Prometheus stats.
+
 - [ ] Metrics
   - Metric name:
   - [Optional] Aggregation method:
@@ -647,6 +690,8 @@ high level (needs more precise definitions) those may be things like:
   - 99,9% of /health requests per day finish with 200 code
 -->
 
+N/A
+
 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?
+N/A
+
 ### Dependencies
+
+Individual nodes with swap memory enabled may experience performance
+degradations under load. This could potentially cause a cascading failure on
+nodes without swap: if nodes with swap fail Ready checks, workloads may be
+rescheduled en masse.
+
+Thus, cluster administrators should be careful while enabling swap. To minimize
+disruption, you may want to taint nodes with swap available to protect against
+this problem. Taints will ensure that workloads which tolerate swap will not
+spill onto nodes without swap under load.
+
 ###### What steps should be taken if SLOs are not being met to determine the problem?
+It is suggested that if nodes with swap memory enabled cause performance or
+stability degradations, those nodes are cordoned, drained, and replaced with
+nodes that do not use swap memory.
+
 ## Implementation History
 
 - **2015-04-24:** Discussed in [#7294](https://github.com/kubernetes/kubernetes/issues/7294).
diff --git a/keps/sig-node/2400-node-swap/kep.yaml b/keps/sig-node/2400-node-swap/kep.yaml
index 1ecf7efa74f..322f413ecbd 100644
--- a/keps/sig-node/2400-node-swap/kep.yaml
+++ b/keps/sig-node/2400-node-swap/kep.yaml
@@ -20,12 +20,12 @@ prr-approvers:
   - "@deads2k"
 
 # The target maturity stage in the current dev cycle for this KEP.
-stage: alpha
+stage: beta
 
 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.22"
+latest-milestone: "v1.23"
 
 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone: