diff --git a/keps/prod-readiness/sig-cli/3515.yaml b/keps/prod-readiness/sig-cli/3515.yaml new file mode 100644 index 00000000000..b2bb14812b0 --- /dev/null +++ b/keps/prod-readiness/sig-cli/3515.yaml @@ -0,0 +1,3 @@ +kep-number: 3515 +alpha: + approver: "@johnbelamaric" diff --git a/keps/sig-cli/3515-kubectl-explain-openapiv3/README.md b/keps/sig-cli/3515-kubectl-explain-openapiv3/README.md new file mode 100644 index 00000000000..63532363c3f --- /dev/null +++ b/keps/sig-cli/3515-kubectl-explain-openapiv3/README.md @@ -0,0 +1,992 @@ + +# KEP-3515: OpenAPI v3 for kubectl explain + + + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [OpenAPI v3 is a richer API description than OpenAPI v2](#openapi-v3-is-a-richer-api-description-than-openapi-v2) + - [CRD schemas expressed as OpenAPI v2 are lossy](#crd-schemas-expressed-as-openapi-v2-are-lossy) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [Basic Usage](#basic-usage) + - [Built-in Template Options](#built-in-template-options) + - [Plaintext](#plaintext) + - [OpenAPIV3 (raw json)](#openapiv3-raw-json) + - [HTML](#html) + - [Markdown](#markdown) + - [Risks and Mitigations](#risks-and-mitigations) + - [OpenAPI V3 Not Available](#openapi-v3-not-available) + - [Risk](#risk) + - [Mitigation](#mitigation) + - [OpenAPI serialization time](#openapi-serialization-time) + - [Risk](#risk-1) + - [Mitigation](#mitigation-1) +- [Design Details](#design-details) + - [Current High-level Approach](#current-high-level-approach) + - [Proposed High-level Approach](#proposed-high-level-approach) + - [Template rendering](#template-rendering) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Alpha 1](#alpha-1) + - [Alpha 2](#alpha-2) + - 
[Beta](#beta) + - [GA](#ga) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) + - [Implement proto.Models for OpenAPI V3 data](#implement-protomodels-for-openapi-v3-data) + - [Custom User Templates](#custom-user-templates) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. + +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved 
+- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + +This KEP proposes an enhancement to kubectl explain: + +1. Switch data source from OpenAPI v2 to OpenAPI v3 +2. Replace the hand-written `kubectl explain` printer with a `text/template` implementation. + +## Motivation + + + +### OpenAPI v3 is a richer API description than OpenAPI v2 + +OpenAPI v3 support in Kubernetes has been beta since version 1.24. +OpenAPI v3 is a richer representation of the Kubernetes API for our users, who have been asking for visibility +into things like: + +1. nullable +2. default +3. validation fields like oneOf, anyOf, etc. + +Being able to show any one of these additional data points is by itself a strong reason +to switch to using OpenAPI v3. + + +### CRD schemas expressed as OpenAPI v2 are lossy + +Today CRDs specify their schemas in OpenAPI v3 format. To serve the `/openapi/v2` +document used today by kubectl, there is an expensive conversion from the v3 down +to v2 format. + +This process is [very lossy](https://github.com/kubernetes/kubernetes/blob/6e0de20fbb4c127d2e45c7a22347c08545fc7a86/staging/src/k8s.io/apiextensions-apiserver/pkg/controller/openapi/v2/conversion.go#L56-L66), so `kubectl explain` gives a poor experience when used against CRDs +that make use of v3 features: information is inaccurate, or fields are removed altogether. 
+ +This transformation causes bugs. For example, when attempting to `explain` a field +that is `nullable`, kubectl shows nothing, because the lossy conversion +wipes nullable fields. + +### Goals + +1. Provide the new richer type information specified by OpenAPI v3 within kubectl explain +2. Have a more maintainable `text/template` based approach to printing +3. Fall back to the old `explain` implementation if the cluster does not expose OpenAPI v3 data. +4. Provide multiple new output formats for kubectl explain: + * human-readable plaintext + * markdown + * maybe others +5. (Optional?) Allow users to specify their own templates for use with kubectl + explain (there may be interesting use cases for this) +6. Improve discoverability of API Resources and endpoints, and provide a platform +for richer information to be included in the future. + +### Non-Goals + +1. "Fix" openapi v3 to openapi v2 conversion + This is a non-goal for two reasons: + * These formats are not compatible, and there WILL be data loss and inaccuracy + * This negates the benefits of using OpenAPI v3 for the richer type information +2. Provide general-purpose OpenAPI visualization. + + +## Proposal + + +### Basic Usage + +The following user experience should be possible with `kubectl explain`: + +```shell +kubectl explain pods.spec +``` + +Output should be familiar to users of today's `kubectl explain`, except that new +information from the OpenAPI v3 spec is now populated. + +Note: During development, the feature will be gated by an experimental flag. The commands +shown here elide the experimental flag for clarity. + +### Built-in Template Options +#### Plaintext + + +```shell +kubectl explain pods +``` +or + +```shell +kubectl explain pods --output plaintext +``` + +The plaintext output format is the default and should be crafted to be as close +as possible to the existing `explain` output in use before this KEP. 
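
A built-in plaintext template could be sketched along the following lines. This is only an illustration using Go's `text/template`: the `Kind` and `Field` types, the `Widget` resource, and the exact layout are hypothetical stand-ins, not the kube-openapi types or the final kubectl output.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Field is a hypothetical, simplified stand-in for one schema property.
type Field struct {
	Name        string
	Type        string
	Description string
	Default     string // empty when the schema declares no default
	Nullable    bool
}

// Kind is a hypothetical, simplified stand-in for a resource schema.
type Kind struct {
	Kind    string
	Version string
	Fields  []Field
}

// plaintextTmpl is an illustrative layout only. It surfaces the
// nullable and default information that OpenAPI v2 cannot carry.
const plaintextTmpl = `KIND:     {{.Kind}}
VERSION:  {{.Version}}

FIELDS:
{{range .Fields}}   {{.Name}} <{{.Type}}>{{if .Nullable}} nullable{{end}}{{if .Default}} default={{.Default}}{{end}}
     {{.Description}}
{{end}}`

// render executes the template against k and returns the rendered text.
func render(k Kind) (string, error) {
	tmpl, err := template.New("plaintext").Parse(plaintextTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, k); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(Kind{
		Kind:    "Widget",
		Version: "v1",
		Fields: []Field{
			{Name: "replicas", Type: "integer", Description: "Number of desired replicas.", Default: "1"},
			{Name: "label", Type: "string", Description: "Optional display label.", Nullable: true},
		},
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```

Keeping the layout in a template string rather than in printer code means that showing a new schema attribute is a one-line template change, which is the maintainability argument made in the Goals.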
+ +#### OpenAPIV3 (raw json) + +```shell +kubectl explain pods --output openapiv3 +``` + +To get raw OpenAPI v3 data for a certain resource today involves: +1.) setting up kubectl proxy +2.) fetching the correct path at `/openapi/v3//` +3.) filtering out unwanted results + +This command is useful not only for its convenience but also because other visualizations +may be built upon the raw output if we opt not to support a first-class custom +template solution in the future. + +#### HTML + +> PROVISIONAL SECTION + +```shell +kubectl explain pods --output html +``` + +Similarly to [godoc](https://pkg.go.dev), we suggest providing a searchable, +navigable, generated webpage for the Kubernetes types of whatever cluster kubectl +is talking to. + +Only the fields selected in the command line (and their subfields' types, etc) +will be included in the resultant page. + +Possible idea: If the user types `kubectl explain --output html` with no specific target, +then all types in the cluster are included. + +#### Markdown +> PROVISIONAL SECTION + +```shell +kubectl explain pods --output md +``` + +When using the `md` template, a markdown document is printed to stdout, so it +might be saved and used for a documentation website, for example. + +Similarly to `html` output, only the fields selected in the command line +(and their subfields' types, etc) will be included in the resultant page. + +Possible idea: If the user types `kubectl explain --output md` with no specific target, +then all types in the cluster are included. + +### Risks and Mitigations + +#### OpenAPI V3 Not Available + +##### Risk + +OpenAPI v3 data is not available in the current cluster. + +##### Mitigation + +###### If the user does not provide an --output argument + +In alpha in particular, if `--output` is not specified, the old `explain` behavior +using openapi v2 data will be used. + +After beta, `--output plaintext` will be assumed and behave as below. 
+ +###### If the user does provide an --output argument + +If a user specifies an `--output` argument and the server returns a 404 when kubectl attempts to +fetch the correct openapi version for the template, a new error message should +be returned to the effect of: `server missing openapi data for version: %v.%v.%v`. + +Internal templates should strive to support the latest OpenAPI version enabled +by default by versions of kubernetes within their skew. With that policy, templates +will always render with the latest spec-version of the data, if it is available. + +Other network errors should be handled using normal kubectl error handling. + + +#### OpenAPI serialization time +##### Risk + +Today there is no interactive-speed way to deserialize protobuf or JSON openapi +v3 data into the kube-openapi format. + +##### Mitigation + +There has been recent progress in this area: kube-openapi v3 data can now be +unmarshaled quickly enough to do so in the CLI. This KEP's beta release +should be blocked on the merging of this optimization. + +## Design Details + +### Current High-level Approach + +1. User types `kubectl explain pods` +2. kubectl resolves 'pods' to GVR core v1 pods using cluster discovery information +3. kubectl resolves GVR to its GVK using restmapper +4. kubectl fetches `/openapi/v2` as protobuf +5. kubectl parses the protobuf into `gnostic_v2.Document` +6. kubectl converts `gnostic_v2.Document` into `proto.Models` +7. kubectl searches the document's `Definitions` for a schema with the +extension `x-kubernetes-group-version-kind` matching the GVK of interest +8. If a field path was used, kubectl traverses the definition's fields to the subschema +specified by the user's path. +9. kubectl renders the definition using its hardcoded printer +10. If `--recursive` was used, repeat step 9 for the transitive closure of + object-typed fields of the top-level object. Concatenate the results together. + +### Proposed High-level Approach + +1. 
User types `kubectl explain pods` +2. kubectl resolves 'pods' to GVR core v1 pods using cluster discovery information +3. kubectl fetches `/openapi/v3//` +4. kubectl parses the result as kube-openapi spec3 +5. kubectl locates the schema of the return type for the Path `/apis///` in kube-openapi +6. If a field path was used, kubectl traverses the definition's fields to the subschema +specified by the user's path. +7. kubectl renders the type using its built-in template for human-readable plaintext +8. If `--recursive` was used, repeat step 7 for the transitive closure of object-typed fields of the top-level object. Concatenate the results together. + +### Template rendering + +Go's text/template will be used due to its familiarity, its stability, and its being part of the standard library. + +### Test Plan + + + +[x] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +- `k8s.io/kubectl/pkg/explain`: `09/29/2022`-`75.6` + +##### Integration tests + + + +- : + +Tests should include + +- Expected Output tests +- Tests that show the correct OpenAPI v3 endpoints are hit +- Tests that show default/nullability information is being included in plaintext output +- Tests that update the backing openapi in between calls to explain + +##### e2e tests + + + +Existing e2e tests should be adapted for the new system. 
+An e2e test should show that every definition in the OpenAPI document can be retrieved via explain + + +- : + +### Graduation Criteria + +Defined using feature gate + +#### Alpha 1 + +- Feature implemented behind a command line flag `--output` and environment variable +- Existing explain tests are working or adapted for new implementation +- Plaintext output roughly matches explain output +- OpenAPIV3 (raw json) output implemented + +#### Alpha 2 + +If we decide to move ahead with the `md` and `html` outputs, an Alpha 2 may +be required. + +- `md` output implemented (or dropped from design due to continued debate) + - Table of contents of all GVKs grouped by Group then Version. + - Section for each individual GVK + - All types hyperlink to their specific section +- basic `html` output (or dropped from design due to continued debate) + - Table of contents of all GVKs grouped by Group then Version. + - Page for each individual GVK. + - All types hyperlink to their specific page + - Searchable by name, description, field name. + +#### Beta + +- kube-openapi v3 JSON deserialization is optimized to take less than 150ms on + most machines +- OpenAPI V3 is enabled by default on at least one version within kubectl's support window. +As of Kubernetes 1.24 OpenAPIV3 entered beta and became enabled by default, therefore meeting this requirement. +- `--output plaintext` is on-by-default and environment variable is removed/on by default +- `--output plaintext-openapiv2` added as a name for the old `explain` implementation, so the feature may be positively disabled. + +#### GA + +- OpenAPIV3 is GA and has been since at least the minimum supported apiserver version +by kubectl. +- All kube-apiserver releases within version skew of kubectl should have OpenAPIV3 on by default. 
This is true as of kubectl for Kubernetes 1.25. +- Old `kubectl explain` implementation is removed, as is support for OpenAPIV2-backed `kubectl explain` +- `--output plaintext-openapiv2` has been deprecated for at least one release + + + +### Upgrade / Downgrade Strategy + + + +N/A + +### Version Skew Strategy + + + +This feature only requires that the target cluster has enabled the OpenAPIV3 feature. + +OpenAPIV3 is Beta as of Kubernetes 1.24. This feature should not be on-by-default +until it is GA. + +Users of the `--output` flag who attempt to use it against a cluster for which +OpenAPI v3 is not enabled will be shown an error informing them of the missing openapi +version upon 404. + +Built-in templates supported by kubectl should aim to support at least one OpenAPI +version which is GA for an apiserver version within the support window. +`kubectl` will support trying to fetch each of these versions, so one is guaranteed +to be able to render. + +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? + + + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [x] Other + - Describe the mechanism: Environment variable `ENABLE_EXPLAIN_OPENAPIV3` which toggles validity of `--output` flag + (to be renamed to --output when feature is no longer experimental) + - Will enabling / disabling the feature require downtime of the control + plane? No + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled). No + +###### Does enabling the feature change any default behavior? + + + +Enabling the feature changes the data source of `kubectl explain` to use openapiv3. +Ideally, the output should be familiar to users, who may be delighted to see new +information populated. 
+ +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + +Until the feature is stable it will only be enabled when the environment variable is used. +It has no persistent effect on data that is viewed. + +###### What happens if we reenable the feature if it was previously rolled back? + +There is no persistence to using the feature. It is only used for viewing data. + +###### Are there any tests for feature enablement/disablement? + + + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + +```shell +kubectl explain pods --output openapiv3 +``` + +The user should see the OpenAPI v3 JSON Schema for the `pods` type printed to the console. + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +To reap the benefits of this feature, OpenAPI v3 is required; however, OpenAPI v2 +data can be used as a fallback. 
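
The fallback rule can be sketched as a small decision function. This is an illustration of the alpha-stage behavior described under Risks and Mitigations, not kubectl's actual code: `chooseSource`, the `source` type, and the boolean parameters are hypothetical names, and `3.0.0` is only an illustrative version to fill the error format quoted in this KEP.

```go
package main

import "fmt"

// source names the OpenAPI document that explain would read.
// Hypothetical type, introduced for this sketch only.
type source string

const (
	sourceV2 source = "/openapi/v2"
	sourceV3 source = "/openapi/v3"
)

// chooseSource sketches the alpha-stage rule:
//   - no --output flag: keep the legacy OpenAPI v2 behavior
//   - --output given and v3 served: use OpenAPI v3
//   - --output given but v3 missing (404): report an error
func chooseSource(outputSet, v3Available bool) (source, error) {
	if !outputSet {
		return sourceV2, nil
	}
	if !v3Available {
		// Message format taken from this KEP; the version numbers are illustrative.
		return "", fmt.Errorf("server missing openapi data for version: %v.%v.%v", 3, 0, 0)
	}
	return sourceV3, nil
}

func main() {
	cases := []struct{ outputSet, v3Available bool }{
		{false, false},
		{true, true},
		{true, false},
	}
	for _, c := range cases {
		src, err := chooseSource(c.outputSet, c.v3Available)
		fmt.Printf("output=%v v3=%v -> source=%q err=%v\n", c.outputSet, c.v3Available, src, err)
	}
}
```

Note that in this sketch the v2 path is taken only when `--output` is absent; an explicit `--output` never silently degrades to v2, so users who ask for a v3-backed format learn that the server lacks the data.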
+ +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + +Yes, the feature replaces a single GET of `/openapi/v2`, which returns a large (megabytes) +openapi document for all types, with a more targeted call to `/openapi/v3//` + +The `/openapi/v3//` endpoint implements ETag caching so that if the document has +not changed the server incurs a cheap, almost negligible cost to serving the request. + +The document returned by calls to `/openapi/v3/...` is expected to be far smaller +than the megabytes-scale openapi v2 document, since it only includes information +for a single group-version. + + + +###### Will enabling / using this feature result in introducing new API types? + +No. + +###### Will enabling / using this feature result in any new calls to the cloud provider? + +No. + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + +No. + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + +No. + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + +No. We expect roughly the same amount of resource usage for kubectl. + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +Using kubectl's normal error handling. There is no lasting effect to data or the +user. + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? + +N/A + +## Implementation History + + + +## Drawbacks + + + +## Alternatives + +### Implement proto.Models for OpenAPI V3 data + +The current hard-coded printer is capable of printing any objects in `proto.Models` form. 
+ +We already have a way to express OpenAPI v3 data as `proto.Models`, so this can be +seen as a path of least resistance for plugging OpenAPI v3 into `kubectl explain`. + +This approach is undesirable for a few different reasons: + +1.) We would like to update the explain printer to include new OpenAPI v3 information, +but the current design makes that time-consuming and unmaintainable. + +2.) API-Machinery wants to deprecate `proto.Models`. We see `proto.Models` +conversion as unnecessary and costly bureaucracy that contributes to high +OpenAPI overhead. We are seeking to deprecate the type in favor of the +kube-openapi types for future usage. + +### Custom User Templates +Users might also like to be able to specify a path to a custom template file +with which to render the resource information: + +human-readable plaintext form: +```shell +kubectl explain pods --template /path/to/template.tmpl +``` + +Since the API surface for this sort of feature remains very unclear and will likely +be very unstable, this sort of feature should be delayed until the internal +templates have proven the API surface to be used. To do otherwise would risk +breaking users' templates. diff --git a/keps/sig-cli/3515-kubectl-explain-openapiv3/kep.yaml b/keps/sig-cli/3515-kubectl-explain-openapiv3/kep.yaml new file mode 100755 index 00000000000..c6f3b5eba08 --- /dev/null +++ b/keps/sig-cli/3515-kubectl-explain-openapiv3/kep.yaml @@ -0,0 +1,26 @@ +id: "3515" +name: kubectl-explain-openapiv3 +title: Kubectl Explain OpenAPIv3 +kep-number: 3515 +authors: ['@alexzielenski'] +owning-sig: sig-cli +participating-sigs: [sig-api-machinery, sig-cli] +reviewers: + - "@KnVerey" + - "@seans3" + - "@apelisse" +approvers: + - "@KnVerey" + - "@seans3" +creation-date: "2022-09-14" +last-updated: v1.26 +status: implementable +stage: alpha +latest-milestone: "1.26" +milestone: + alpha: "1.26" + beta: "1.28" + stable: "1.29" +feature-gates: [] +disable-supported: true +metrics: []