-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP4009: Add CDI Devices to device plugin API #4011
KEP4009: Add CDI Devices to device plugin API #4011
Conversation
elezar
commented
May 16, 2023
•
edited
Loading
edited
- One-line PR description: Add CDI Devices to device plugin API
- Issue link: Add CDI devices to device plugin API #4009
- Other comments:
- Related to KEP3063: Update CRI structures #3731
- WIP implementaiton: Add CDI devices to device plugin API kubernetes#118254
Welcome @elezar! |
Hi @elezar. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc @klueska |
/ok-to-test |
3678f59
to
f0a20e6
Compare
/cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, I took a pass and left some comments (mostly NITs). Addition of CDIDevices
to ContainerAllocateResponse
seems reasonable to me. I see some sections are yet to be populated so will do another pass when we have content there.
Please note that you'll need to add a file 4009.yaml under keps/prod-readiness/sig-node and specify one of the PRR reviewers so the KEP is tracked for PRR review.
keps/sig-node/4009-add-cdi-devices-to-device-plugin-api/kep.yaml
Outdated
Show resolved
Hide resolved
replaces: [] | ||
|
||
# The target maturity stage in the current dev cycle for this KEP. | ||
stage: alpha|beta|stable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be alpha
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update. One question I had was what stage this should be if the device plugin API is currently beta, or is it ok to have an alpha feature in a beta API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this feature is making changes to device plugin API which is currently Beta, it would have its own lifecycle and graduate from alpha (where feature gate is disabled by default to preserve default behavior) to Beta (feature gate enabled by default when this functionality is well tested) to GA (no feature gate).
We had a similar case where we were contributing a new endpoint to a stable PodResource API and wanted to introduce a new endpoint. I think we would have to do something similar here. Here is an example: #2404. If you would like, you can double check this with PRR reviewers to be sure.
# The following PRR answers are required at alpha release | ||
# List the feature gate name and the components for which it must be enabled | ||
feature-gates: | ||
- name: MyFeature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to specify the feature gate here. Adding a comment so we don't forget.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the changes made for the addition of the CDIDevices fields to the CRI spec itself, this doesn't seem to be gated by a feature flag -- although the usage is gated by the kubefeatures.DynamicResourceAllocation
flag. Does this then required a new feature flag given that this is an extension of the existing device plugin API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the changes made for the addition of the CDIDevices fields to the CRI spec itself, this doesn't seem to be gated by a feature flag -- although the usage is gated by the kubefeatures.DynamicResourceAllocation flag.
The intent is to guard against the usage of new changes made to the API unless explicitly specified. Since DynamicResourceAllocation is still an alpha feature it makes sense to not have a new feature gate for additional API changes as it pertains to the feature evolving while it is in alpha stage.
Does this then required a new feature flag given that this is an extension of the existing device plugin API?
Yes, I think this would require a new feature gate as we want new changes to be guarded by that gate and not be enabled by default just because the device plugin framework and the API is mature. This is to ensure that the newly introduced changes don't cause unintended regressions and it goes through its own graduation process maturing over multiple releases.
- kube-apiserver | ||
- kube-controller-manager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Component should be kubelet
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update. Is the devicemanager a component in this context, or are these only higher-level k8s components?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also not, in the current implemention it is not possible to disable this -- except in the sense that the device plugn API may not be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the devicemanager a component in this context, or are these only higher-level k8s components?
We have to mention only higher-level k8s components here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also not, in the current implemention it is not possible to disable this -- except in the sense that the device plugn API may not be used.
Okay I understand now why you didn't introduce a feature gate as this change would be ingrained in the device plugin v1beta1
API. The fact that device plugin API is Beta makes it tricky. I will think a bit more about this because my understanding would have been that if the associated feature gate is disabled (which it will be at alpha stage) the device plugins should not have the ability to pass CDIDevices
as part of ContainerAllocateResponse
. Perhaps, we would have to make an exception here, not have a separate feature gate and get away with an API review. I would suggest finding a precedent for such a case and proceeding accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found it: #1884 !!
Here we had added cpu_ids
field in ContainerResources
and TopologyInfo
in ContainerDevices
and never had a separate feature gate because the API was stable. Perhaps we can take the same approach. This is worth explicitly clarifying with the SIG and the production readiness reviewers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@swatisehgal that should have been feature gated!
metrics: | ||
- my_feature_metric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you plan to have a metric corresponding to the feature please add its name here. For alpha, I believe we can leave this empty but it is mandatory for beta graduation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Removing for now. Do you have links to existing metrics for the device plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some existing device plugin metrics : https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/metrics/metrics.go#L73-L74.
keps/sig-node/4009-add-cdi-devices-to-device-plugin-api/README.md
Outdated
Show resolved
Hide resolved
|
||
The recent addition of [CDI device IDs to the CRI structures](#3731) | ||
allow these IDs to be forwarded to the CRI runtimes in a secure manner. Although | ||
these changes were motivated by KEP-3063, adding support for these fields to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a link to the KEP here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Will update.
keps/sig-node/4009-add-cdi-devices-to-device-plugin-api/README.md
Outdated
Show resolved
Hide resolved
keps/sig-node/4009-add-cdi-devices-to-device-plugin-api/README.md
Outdated
Show resolved
Hide resolved
keps/sig-node/4009-add-cdi-devices-to-device-plugin-api/README.md
Outdated
Show resolved
Hide resolved
keps/sig-node/4009-add-cdi-devices-to-device-plugin-api/kep.yaml
Outdated
Show resolved
Hide resolved
f0a20e6
to
1b9abb0
Compare
1b9abb0
to
a6724a2
Compare
|
||
The recent addition of CDI device IDs to the CRI structures in [KEP-3731](https://github.com/kubernetes/enhancements/pull/3731) | ||
|
||
allow these IDs to be forwarded to the CRI runtimes in a secure manner. Although |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really care about the way Kubelet is using to pass CDI devices to a runtime? Would it change anything if Kubelet uses annotations?
keps/sig-node/4009-add-cdi-devices-to-device-plugin-api/README.md
Outdated
Show resolved
Hide resolved
This lets device plugins take advantage of CDI simplifying their operations. /lgtm |
d94c454
to
4800b20
Compare
/approve |
adding a `cpu_ids` field to the existing PodResources API. | ||
|
||
Furthermore, the only code that would be featured-gated in the Device Manager would be the code to propagate this field from the `AllocateResponse` to the | ||
corresponding field in the CRI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the change is not feature gated, but some code in the Device Manager is feature gated? Which feature gate will be used for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not feature gated and no code is feature gated. I was calling out that if a feature gate were to be added, what would be guarded by it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference the Device Manager changes are:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just drop this sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you are missing the reference in here though:
This change is not behind a feature gate. It is adding an optional
field to a stable API. This mirrors what was done in []() when
adding a `cpu_ids` field to the existing PodResources API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not feature gated and no code is feature gated. I was calling out that if a feature gate were to be added, what would be guarded by it.
I see. It's about how it would be done if a feature would be future gated, but it's not going to be future gated. Thank you for the explanations!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The recent addition of CDI device IDs to the CRI structures in [#3731](https://github.com/kubernetes/enhancements/pull/3731) allow these IDs to be forwarded to the CRI runtimes in a secure manner. Although | ||
these changes were motivated by [KEP-3063](https://github.com/kubernetes/enhancements/issues/3063), adding support for these fields to the | ||
existing device plugin API allows this mechanism to also be used for devices | ||
supported by these plugins. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still suggest not to emphasise usage of CRI field. This is a protocol details between the Kubelet and a CRI runtime. This KEP is about Device Plugin API change.
In fact, you may end up passing CDI devices from the Kubelet to a CRI runtime using annotations to support old runtime versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I don't disagree that the kubelet COULD use annotations to forward device IDs to runtimes, the intent of this KEP is to use the new field in the CRI to allow Device Plugins to request devices. This keeps these devices IDs opague and does not require any interaction by the Kubelet. At present the Kubelet (at least in my cursory inspection of the code) doesn't perform any other such processing for other fields present.
Forwarding the field directly also means that the Kubelet does not need to be aware of which runtime version -- or even specific configuration -- is being used. I mention configurations here, because for containerd, for example, CDI needs to be enabled and the CDI annotations need to be allowed before these function. The proposal as it stands shifts the responsibility on device plugin vendors to make a decision as to which mechansim they will use to support device injection and when to transition from one mechanism to another as well as educating users on the required configuration. This would extend the existing node-level configurations or requirements that already need to be communicated to users of a plugin.
If we do at some point decide that the kubelet should process this field and and perform operations on it -- such as converting these to annotations -- that should be a new KEP where that can be discussed with specific use cases in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposal as it stands shifts the responsibility on device plugin vendors
Would it be better if Kubelet would take this responsibility? At least it would be done in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give a use case that is not covered by the current model to extending the Kubelet with additional functionality?
4800b20
to
d601074
Compare
This change proposes adding a field to the existing device plugin API to allow device plugin vendors to specify CDI device IDs as part of the allocate response for a container. Signed-off-by: Evan Lezar <[email protected]>
d601074
to
eb23e14
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial PRR review as shadow.
This change is not behind a feature gate. It is adding an optional | ||
field to a stable API. This mirrors what was done in [KEP-1884](https://github.com/kubernetes/enhancements/pull/1884) when | ||
adding a `cpu_ids` field to the existing PodResources API. | ||
|
||
<!-- Furthermore, the only code that would be featured-gated in the Device Manager would be the code to propagate this field from the `AllocateResponse` to the | ||
corresponding field in the CRI. --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not sound right to me. This new field should start as alpha with its own feature gate. It does not matter that the field is optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed @enj. KEP-1884 was prior to requiring PRR.
We should feature gate this and use standard feature rollout processes. The API you are adding to may not be alpha, but the field still needs a feature gate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the field itself can't be feature gated (it's a field in a prototbuf), and the code to add the feature gate will actually require more code than the code required to pass the value of the field on to the CRI (where the exact same field with the exact same format already exists).
It seems overly complicated in my opinion to make such a small change go through the standard alpha/beta/stable graduation criteria.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code to add the feature gate will actually require more code than the code required to pass the value of the field on to the CRI
Feature gating always adds branching and complexity. This is normal and expected.
It seems overly complicated in my opinion to make such a small change go through the standard alpha/beta/stable graduation criteria.
See #2784 as an example of going through the entire KEP process for adding a simple (new/optional) field to a GA API. Admins must explicitly opt-in to alpha features, and must be able to disable/prevent the use of beta features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@enj the feature you referenced seemed to go straight to Beta. Would that also be an option here? Meaning that we feature-gate the feature buy have it enabled by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elezar I would recommend reading through that KEP - something like half of its content is focused on defending why it is safe for it to go straight to beta and the security benefits of doing so (and how nothing bad can happen on skewed clients/servers, downgrades, etc). Generally this is something to discuss well in advance with SIG leads and PRR folks to make sure it still fits within the bounds, but I will defer that decision to @johnbelamaric, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the KEP to add this as an alpha feature.
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? | ||
|
||
No. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also does not make sense for an alpha feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entire change is adding a field in the device-plugin protobuf and then passing that untouched through the kubelet to the CRI. It is hard for me to wrap my head around what it would mean to "disable" this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be enough to explain that here, so as to make it clearer why it doesn't make sense to disable it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hard for me to wrap my head around what it would mean to "disable" this.
Disable means "it is impossible to use this feature through Kubernetes without setting the feature gate." Thus when the feature gate is disabled, the field would be explicitly unset (or cause an error on use).
###### What happens if we reenable the feature if it was previously rolled back? | ||
|
||
Not applicable. The feature cannot be toggled. | ||
|
||
###### Are there any tests for feature enablement/disablement? | ||
|
||
<!-- | ||
The e2e framework does not currently support enabling or disabling feature | ||
gates. However, unit tests in each component dealing with managing data, created | ||
with and without the feature, are necessary. At the very least, think about | ||
conversion tests if API types are being modified. | ||
|
||
Additionally, for features that are introducing a new API field, unit tests that | ||
are exercising the `switch` of feature gate itself (what happens if I disable a | ||
feature gate after having objects written with the new field) are also critical. | ||
You can take a look at one potential example of such test in: | ||
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282 | ||
--> | ||
|
||
End to end tests will be added that exercise this feature, but since the feature | ||
cannot be toggled, this will not be added. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Difficult to review this from a PRR standpoint since this is not a valid approach for an alpha feature.
Signed-off-by: Kevin Klues <[email protected]>
@enj @johnbelamaric |
Thanks! /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: elezar, johnbelamaric, mrunalp The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bart0sh thanks for pointing this out via #3080 (comment). Although i understand this KEP to enable CDI Devices, i need to have more time to see if our requirements can be supported. will get back to you on another thread 👍