Update swap KEP for 1.23 beta
Fill out remaining beta PRR questions, add test plans
ehashman committed Sep 3, 2021
1 parent 59a1941 commit 20a8885
Showing 3 changed files with 61 additions and 4 deletions.
2 changes: 2 additions & 0 deletions keps/prod-readiness/sig-node/2400.yaml
@@ -1,3 +1,5 @@
kep-number: 2400
alpha:
approver: "@deads2k"
beta:
approver: "@deads2k"
59 changes: 57 additions & 2 deletions keps/sig-node/2400-node-swap/README.md
@@ -401,8 +401,12 @@ For alpha:
and further development efforts.
- Focus should be on supported user stories as listed above.

Once this data is available, additional test plans should be added for the next
phase of graduation.
For beta:

- Add e2e tests that exercise all available swap configurations via the CRI.
- Add e2e tests that verify pod-level control of swap utilization.
- Add e2e tests that verify swap performance with pods using a tmpfs.
- Verify new system-reserved settings for swap memory.

### Graduation Criteria

@@ -587,13 +591,29 @@ Try to be as paranoid as possible - e.g., what if some components will restart
mid-rollout?
-->

If a new node with swap memory fails to come online, it will not impact any
running components.

It is possible that if a cluster administrator adds swap memory to an already
running node and then performs an in-place upgrade, the new kubelet could fail
to start unless its configuration was modified to tolerate swap. However, we
expect that a cluster admin who adds swap to a node will also update the
kubelet's configuration so that it does not fail with swap present, as in the
sketch below.
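
A minimal sketch of such a configuration, assuming the beta kubelet fields for
NodeSwap (verify the exact field names and values against the kubelet version
in use):

```yaml
# Sketch of a KubeletConfiguration fragment for a swap-enabled node.
# failSwapOn must be false or the kubelet refuses to start when swap is present.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
featureGates:
  NodeSwap: true
memorySwap:
  swapBehavior: LimitedSwap  # or UnlimitedSwap; see the kubelet config API
```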

Generally, it is considered best practice to add a swap memory partition at
node image/boot time rather than provisioning it dynamically after the kubelet
is already running and reporting Ready on a node.
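
For example, on node images built with cloud-init, swap could be provisioned
as part of the boot configuration rather than at runtime; this is only a
sketch, and the file path and size are illustrative:

```yaml
#cloud-config
# Sketch: create and enable a swap file at first boot, before the kubelet starts.
swap:
  filename: /var/swapfile  # illustrative path
  size: 2G
  maxsize: 2G
```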

###### What specific metrics should inform a rollback?

<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->

Workload churn or performance degradation on nodes. The relevant metrics will
be application- and use-case-specific, but we can provide some suggestions.
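
One illustrative signal, assuming the Prometheus `node_exporter` runs on the
node: alert when a node's swap utilization stays high. The metric names below
are standard `node_exporter` names; the threshold and duration are only
examples.

```yaml
# Sketch of a Prometheus alerting rule for sustained swap pressure on a node.
groups:
- name: node-swap
  rules:
  - alert: NodeSwapPressure
    # Fires when more than 80% of a node's swap has been in use for 15 minutes.
    expr: |
      (node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes)
        / node_memory_SwapTotal_bytes > 0.8
    for: 15m
    labels:
      severity: warning
```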

###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

<!--
@@ -602,12 +622,17 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->

N/A, because swap support has no runtime upgrade/downgrade path; the kubelet
must be restarted to enable or disable swap support.

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

<!--
Even if applying deprecation policies, they may still surprise some users.
-->

No.

### Monitoring Requirements

<!--
@@ -622,12 +647,21 @@ checking if there are objects with field X set) may be a last resort. Avoid
logs or events for this purpose.
-->

The kubelet's KubeletConfiguration has `failSwapOn: false` set.

The Prometheus `node_exporter` will also export statistics on swap memory
utilization.

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

<!--
Pick one more of these and delete the rest.
-->

TBD. We will determine a set of metrics as part of beta graduation. We will
need more data; there is not a single metric or set of metrics that can be used
to generally quantify node performance.

- [ ] Metrics
- Metric name:
- [Optional] Aggregation method:
@@ -647,13 +681,17 @@ high level (needs more precise definitions) those may be things like:
- 99.9% of /health requests per day finish with 200 code
-->

N/A

###### Are there any missing metrics that would be useful to have to improve observability of this feature?

<!--
Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
implementation difficulties, etc.).
-->

N/A

### Dependencies

<!--
@@ -784,6 +822,8 @@ details). For now, we leave it here.

###### How does this feature react if the API server and/or etcd is unavailable?

No change. The feature is specific to individual nodes.

###### What are other known failure modes?

<!--
@@ -799,8 +839,23 @@ For each of them, fill in the following information by copying the below template
- Testing: Are there any tests for failure mode? If not, describe why.
-->


Individual nodes with swap memory enabled may experience performance
degradation under load. This could potentially cause a cascading failure on
nodes without swap: if swap-enabled nodes fail Ready checks, their workloads
may be rescheduled en masse onto the remaining nodes.

Thus, cluster administrators should be careful when enabling swap. To minimize
disruption, they may want to taint swap-enabled nodes to protect against this
problem. Taints ensure that only workloads which explicitly tolerate swap are
scheduled onto swap-enabled nodes, limiting how far swap-related disruption
can spill onto nodes without swap under load.
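
A minimal sketch of that approach; the taint key, names, and image below are
hypothetical placeholders chosen by the cluster administrator:

```yaml
# Sketch: taint a swap-enabled node so only swap-tolerant workloads land on it.
apiVersion: v1
kind: Node
metadata:
  name: swap-node-1  # illustrative node name
spec:
  taints:
  - key: example.com/swap-enabled  # hypothetical taint key
    effect: NoSchedule
---
# Pod that explicitly opts in to running on swap-enabled nodes.
apiVersion: v1
kind: Pod
metadata:
  name: swap-tolerant-workload
spec:
  tolerations:
  - key: example.com/swap-enabled
    operator: Exists
    effect: NoSchedule
  containers:
  - name: app
    image: example.com/app:latest  # illustrative image
```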

###### What steps should be taken if SLOs are not being met to determine the problem?

If nodes with swap memory enabled cause performance or stability degradation,
we suggest cordoning and draining those nodes and replacing them with nodes
that do not use swap memory.

## Implementation History

- **2015-04-24:** Discussed in [#7294](https://github.com/kubernetes/kubernetes/issues/7294).
4 changes: 2 additions & 2 deletions keps/sig-node/2400-node-swap/kep.yaml
@@ -20,12 +20,12 @@ prr-approvers:
- "@deads2k"

# The target maturity stage in the current dev cycle for this KEP.
stage: alpha
stage: beta

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.22"
latest-milestone: "v1.23"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
