[WIP] Allow mutating priority classes #2851

Closed
131 changes: 131 additions & 0 deletions keps/sig-scheduling/268-priority-preemption/README.md
@@ -10,6 +10,13 @@
- [Proposal](#proposal)
- [Risks and Mitigations](#risks-and-mitigations)
- [Graduation Criteria](#graduation-criteria)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
- [Feature enablement and rollback](#feature-enablement-and-rollback)
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
- [Monitoring requirements](#monitoring-requirements)
- [Dependencies](#dependencies)
- [Scalability](#scalability)
- [Troubleshooting](#troubleshooting)
- [Testing Plan](#testing-plan)
- [Unit Tests](#unit-tests)
- [Integration tests](#integration-tests)
@@ -60,6 +67,130 @@ caused by this change.
* Adequate documentation exists for the features.
* Test coverage of the features is acceptable.

## Production Readiness Review Questionnaire
Contributor Author

I added the proposal to the PRR section, as we decided to go with PRR updates. Let me know if you folks want a new KEP.

Member

I think what @soltysh was asking for is a new KEP that includes the PRR questionnaire.

Contributor

@ravisantoshgudimetla pinged me about it. I only cared about it being written down. Whether it is a new KEP or an update to the current one, I'll leave that decision to both of you 😉 Just make sure to update kep.yaml: I see it's outdated and gives wrong information about the state of this feature.

We'd like priority classes to be mutable starting with release 1.23, because the same effect can
technically be achieved today by re-creating them. The reasons for making priority classes immutable are given in this
[proposal](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/pod-priority-api.md#drawbacks-of-changing-priority-classes),
but we noticed that people delete and re-create priority classes anyway, despite those reasons.

### Feature enablement and rollback

* **How can this feature be enabled / disabled in a live cluster?**

We'll introduce a new feature gate called `AllowPriorityClassUpdates`. The feature is enabled by enabling this gate.

* **Does enabling the feature change any default behavior?**

Yes. It changes the behavior of priority classes: their values and names can be changed once the feature is enabled.

* **Can the feature be disabled once it has been enabled (i.e. can we rollback
the enablement)?**

Yes, by disabling the feature gate.

* **What happens if we reenable the feature if it was previously rolled back?**

Priority classes become mutable again. (While the feature is rolled back, they are immutable.)

* **Are there any tests for feature enablement/disablement?**

Yes, we will add unit tests at the API validation package level.
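
As an illustration of the enable/disable behavior described in this section, here is a minimal sketch in Go. It is not the actual Kubernetes validation code: the `PriorityClass` struct, the `validatePriorityClassUpdate` helper, and the `allowUpdates` parameter (standing in for the state of the proposed `AllowPriorityClassUpdates` gate) are all hypothetical, and only the `Value` field is checked.

```go
package main

import "fmt"

// PriorityClass is a simplified, hypothetical stand-in for the real API type.
// Only the fields needed for this sketch are modeled.
type PriorityClass struct {
	Name  string
	Value int32
}

// validatePriorityClassUpdate is a hypothetical helper, not the real
// Kubernetes validation function. allowUpdates stands in for the state of
// the proposed AllowPriorityClassUpdates feature gate.
// It returns one message per validation failure.
func validatePriorityClassUpdate(newPC, oldPC PriorityClass, allowUpdates bool) []string {
	var errs []string
	// With the gate off, a changed value is rejected (the immutable behavior).
	if !allowUpdates && newPC.Value != oldPC.Value {
		errs = append(errs, "value: may not be changed in an update")
	}
	return errs
}

func main() {
	oldPC := PriorityClass{Name: "high-priority", Value: 1000}
	newPC := PriorityClass{Name: "high-priority", Value: 2000}

	// Table-driven walk through disable -> enable -> disable (rollback).
	cases := []struct {
		desc         string
		allowUpdates bool
	}{
		{"gate disabled", false},
		{"gate enabled", true},
		{"gate disabled again (rollback)", false},
	}
	for _, tc := range cases {
		errs := validatePriorityClassUpdate(newPC, oldPC, tc.allowUpdates)
		fmt.Printf("%s: %d validation error(s)\n", tc.desc, len(errs))
	}
}
```

A unit test at the validation package level could follow the same table-driven shape, asserting that an update to the value is rejected while the gate is off and accepted while it is on.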

### Rollout, Upgrade and Rollback Planning

* **How can a rollout fail? Can it impact already running workloads?**

If the rollout fails, it won't impact the already running workloads.

* **What specific metrics should inform a rollback?**

N/A.

* **Were upgrade and rollback tested? Was upgrade->downgrade->upgrade path tested?**

Not yet. They will be tested once this feature graduates.

* **Is the rollout accompanied by any deprecations and/or removals of features,
APIs, fields of API types, flags, etc.?**

N/A.

### Monitoring requirements

* **How can an operator determine if the feature is in use by workloads?**

N/A.

* **How can someone using this feature know that it is working for their instance?**

By updating a priority class. If the update succeeds, the feature is working; if the update is
rejected, the feature is not working.

* **What are the SLIs (Service Level Indicators) an operator can use to
determine the health of the service?**

N/A.

* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**

N/A.

* **Are there any missing metrics that would be useful to have to improve
observability of this feature?**

N/A.

### Dependencies

* **Does this feature depend on any specific services running in the cluster?**

No.


### Scalability

* **Will enabling / using this feature result in any new API calls?**

No.

* **Will enabling / using this feature result in introducing new API types?**

No REST API changes.

* **Will enabling / using this feature result in any new calls to cloud
provider?**

No.

* **Will enabling / using this feature result in increasing size or count
of the existing API objects?**

No.

* **Will enabling / using this feature result in increasing time taken by any
operations covered by [existing SLIs/SLOs][]?**

No.

* **Will enabling / using this feature result in non-negligible increase of
resource usage (CPU, RAM, disk, IO, ...) in any components?**

No.

### Troubleshooting

* **How does this feature react if the API server and/or etcd is unavailable?**

N/A.

* **What are other known failure modes?**

Errors will be logged to stderr.

* **What steps should be taken if SLOs are not being met to determine the problem?**

N/A.

## Testing Plan
Pod priority and preemption have unit, integration, and e2e tests. These tests
are run regularly as a part of Kubernetes presubmit and CI/CD pipeline.
4 changes: 3 additions & 1 deletion keps/sig-scheduling/268-priority-preemption/kep.yaml
@@ -7,11 +7,13 @@ participating-sigs:
- sig-scheduling
reviewers:
- "@k82cn"
- "@ahg-g"
- sig-api
approvers:
- "@liggitt"
editor: Babak Salamat
creation-date: 2019-01-31
last-updated: 2019-01-31
last-updated: 2021-08-06
status: implementable
see-also:
replaces: