Skip to content

Commit

Permalink
Add Pod Topology Spread KEP to new template
Browse files Browse the repository at this point in the history
  • Loading branch information
Huang-Wei committed May 18, 2020
1 parent 867dce3 commit f6a5c5b
Show file tree
Hide file tree
Showing 2 changed files with 171 additions and 8 deletions.
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: Even Pods Spreading
title: Pod Topology Spread
authors:
- "@Huang-Wei"
owning-sig: sig-scheduling
Expand All @@ -13,7 +13,7 @@ approvers:
- "@ahg-g"
- "@alculquicondor"
creation-date: 2019-02-21
last-updated: 2020-01-21
last-updated: 2020-05-18
status: implementable
---

Expand Down Expand Up @@ -44,19 +44,28 @@ status: implementable
- [Graduation Criteria](#graduation-criteria)
- [Alternatives](#alternatives)
- [Impact to Other Features](#impact-to-other-features)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
- [Feature enablement and rollback](#feature-enablement-and-rollback)
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
- [Monitoring requirements](#monitoring-requirements)
- [Dependencies](#dependencies)
- [Scalability](#scalability)
- [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
<!-- /toc -->

## Release Signoff Checklist

- [x] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
- [x] KEP approvers have set the KEP status to `implementable`
- [x] Design details are appropriately documented
- [x] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [x] Graduation criteria is in place
- [x] (R kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
- [x] (R) KEP approvers have set the KEP status to `implementable`
- [x] (R) Design details are appropriately documented
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [x] (R) Graduation criteria is in place
- [ ] (R) Production readiness review completed
- [ ] Production readiness review approved
- [x] "Implementation History" section is up-to-date for milestone
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
- [x] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

## Terms

Expand Down Expand Up @@ -520,6 +529,129 @@ simply achieve the semantics of "PodAntiAffinity in zones" via a combination of
"Even pods spreading in zones" plus "PodAntiAffinity in nodes" which could be an
extra benefit of this KEP.
## Production Readiness Review Questionnaire
### Feature enablement and rollback
* **How can this feature be enabled / disabled in a live cluster?**
- [x] Feature gate
- Feature gate name: EvenPodsSpread
- Components depending on the feature gate: kube-scheduler, kube-apiserver
* **Does enabling the feature change any default behavior?**
No.
* **Can the feature be disabled once it has been enabled (i.e. can we rollback
the enablement)?**
The feature can be disabled in Alpha and Beta versions. In terms of Stable
versions, users can choose to opt-out by not setting the
`pod.spec.topologySpreadConstraints` field.
* **What happens if we reenable the feature if it was previously rolled back?**
N/A.
* **Are there any tests for feature enablement/disablement?**
E2e, integration and unit tests are in place to test the feature enablement.
### Rollout, Upgrade and Rollback Planning
* **How can a rollout fail? Can it impact already running workloads?**
Since this feature requires users to opt-in by setting new field in pod spec,
it should not impact already running workloads.
* **What specific metrics should inform a rollback?**
Metric "schedule_attempts_total" remaining at zero when new pods are added.
* **Were upgrade and rollback tested? Was upgrade->downgrade->upgrade path tested?**
N/A.
* **Is the rollout accompanied by any deprecations and/or removals of features,
APIs, fields of API types, flags, etc.?**
No.
### Monitoring requirements
* **How can an operator determine if the feature is in use by workloads?**
Operator can query `pod.spec.topologySpreadConstraints` field and identify if
this is being set to non-default values.
* **What are the SLIs (Service Level Indicators) an operator can use to
determine the health of the service?**
N/A.
* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
N/A.
* **Are there any missing metrics that would be useful to have to improve
observability if this feature?**
N/A.
### Dependencies
* **Does this feature depend on any specific services running in the cluster?**
No.
### Scalability
* **Will enabling / using this feature result in any new API calls?**
No
* **Will enabling / using this feature result in introducing new API types?**
No.
* **Will enabling / using this feature result in any new calls to cloud
provider?**
No.
* **Will enabling / using this feature result in increasing size or count
of the existing API objects?**
Since this feature adds a new field to pod's spec, it will increase API size
of Pod object depending on how many `topologySpreadConstraints` are specified.

* **Will enabling / using this feature result in increasing time taken by any
operations covered by [existing SLIs/SLOs][]?**

No.

* **Will enabling / using this feature result in non-negligible increase of
resource usage (CPU, RAM, disk, IO, ...) in any components?**

No.

### Troubleshooting

* **How does this feature react if the API server and/or etcd is unavailable?**

Running workloads won't be impacted. Submissions of new workloads using this
feature will be rejected by API server.
* **What are other known failure modes?**
N/A.
* **What steps should be taken if SLOs are not being met to determine the problem?**
N/A.
## Implementation History
- 2019-02-21: Initial KEP sent out for review.
Expand All @@ -529,3 +661,4 @@ extra benefit of this KEP.
- NOTE: The term "Even Pods Spreading" is replaced with "Pod Topology Spread",
to be consistent with the [official doc](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/), but
the featuregate name "EvenPodsSpread" remains unchanged.
- 2020-05-18: KEP updated to adopt new KEP template (production readiness review).
30 changes: 30 additions & 0 deletions keps/sig-scheduling/895-pod-topology-spread/kep.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
title: Pod Topology Spread
kep-number: 895
authors:
- "@Huang-Wei"
owning-sig: sig-scheduling
status: implementable
creation-date: 2019-02-21
reviewers:
- "@bsalamat"
- "@lavalamp"
- "@krmayankk"
- "@ahg-g"
- "@alculquicondor"
approvers:
- "@ahg-g"
- "@alculquicondor"
see-also:
- "/keps/sig-scheduling/1258-default-even-pods-spreading"
stage: stable
latest-milestone: "v1.19"
milestone:
alpha: "v1.16"
beta: "v1.18"
stable: "v1.19"
feature-gate:
name: EvenPodsSpread
components:
- kube-scheduler
- kube-apiserver
disable-supported: true

0 comments on commit f6a5c5b

Please sign in to comment.