Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP4009: Add CDI Devices to device plugin API #4011

Conversation

elezar
Copy link
Contributor

@elezar elezar commented May 16, 2023

  • One-line PR description: Add CDI Devices to device plugin API

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 16, 2023
@k8s-ci-robot
Copy link
Contributor

Welcome @elezar!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels May 16, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @elezar. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label May 16, 2023
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label May 16, 2023
@elezar
Copy link
Contributor Author

elezar commented May 16, 2023

/cc @klueska

@k8s-ci-robot k8s-ci-robot requested a review from klueska May 16, 2023 13:15
@elezar elezar changed the title 4009: Add CDI Devices to device plugin API KEP4009: Add CDI Devices to device plugin API May 16, 2023
@kannon92
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 16, 2023
@elezar elezar force-pushed the KEP-4009/Add-CDI-devices-to-device-plugin-API branch from 3678f59 to f0a20e6 Compare May 25, 2023 08:02
@swatisehgal
Copy link
Contributor

/cc

@k8s-ci-robot k8s-ci-robot requested a review from swatisehgal May 25, 2023 08:45
Copy link
Contributor

@swatisehgal swatisehgal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I took a pass and left some comments (mostly NITs). Addition of CDIDevices to ContainerAllocateResponse seems reasonable to me. I see some sections are yet to be populated so will do another pass when we have content there.

Please note that you'll need to add a file 4009.yaml under keps/prod-readiness/sig-node and specify one of the PRR reviewers so the KEP is tracked for PRR review.

replaces: []

# The target maturity stage in the current dev cycle for this KEP.
stage: alpha|beta|stable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be alpha.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update. One question I had was what stage this should be if the device plugin API is currently beta, or is it ok to have an alpha feature in a beta API?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this feature is making changes to device plugin API which is currently Beta, it would have its own lifecycle and graduate from alpha (where feature gate is disabled by default to preserve default behavior) to Beta (feature gate enabled by default when this functionality is well tested) to GA (no feature gate).
We had a similar case where we were contributing a new endpoint to a stable PodResource API and wanted to introduce a new endpoint. I think we would have to do something similar here. Here is an example: #2404. If you would like, you can double check this with PRR reviewers to be sure.

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
- name: MyFeature
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to specify the feature gate here. Adding a comment so we don't forget.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the changes made for the addition of the CDIDevices fields to the CRI spec itself, this doesn't seem to be gated by a feature flag -- although the usage is gated by the kubefeatures.DynamicResourceAllocation flag. Does this then required a new feature flag given that this is an extension of the existing device plugin API?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the changes made for the addition of the CDIDevices fields to the CRI spec itself, this doesn't seem to be gated by a feature flag -- although the usage is gated by the kubefeatures.DynamicResourceAllocation flag.

The intent is to guard against the usage of new changes made to the API unless explicitly specified. Since DynamicResourceAllocation is still an alpha feature it makes sense to not have a new feature gate for additional API changes as it pertains to the feature evolving while it is in alpha stage.

Does this then required a new feature flag given that this is an extension of the existing device plugin API?

Yes, I think this would require a new feature gate as we want new changes to be guarded by that gate and not be enabled by default just because the device plugin framework and the API is mature. This is to ensure that the newly introduced changes don't cause unintended regressions and it goes through its own graduation process maturing over multiple releases.

Comment on lines 38 to 39
- kube-apiserver
- kube-controller-manager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Component should be kubelet here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update. Is the devicemanager a component in this context, or are these only higher-level k8s components?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also not, in the current implemention it is not possible to disable this -- except in the sense that the device plugn API may not be used.

Copy link
Contributor

@swatisehgal swatisehgal May 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the devicemanager a component in this context, or are these only higher-level k8s components?

We have to mention only higher-level k8s components here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also not, in the current implemention it is not possible to disable this -- except in the sense that the device plugn API may not be used.

Okay I understand now why you didn't introduce a feature gate as this change would be ingrained in the device plugin v1beta1 API. The fact that device plugin API is Beta makes it tricky. I will think a bit more about this because my understanding would have been that if the associated feature gate is disabled (which it will be at alpha stage) the device plugins should not have the ability to pass CDIDevices as part of ContainerAllocateResponse. Perhaps, we would have to make an exception here, not have a separate feature gate and get away with an API review. I would suggest finding a precedent for such a case and proceeding accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found it: #1884 !!
Here we had added cpu_ids field in ContainerResources and TopologyInfo in ContainerDevices and never had a separate feature gate because the API was stable. Perhaps we can take the same approach. This is worth explicitly clarifying with the SIG and the production readiness reviewers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@swatisehgal that should have been feature gated!

Comment on lines 43 to 44
metrics:
- my_feature_metric
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you plan to have a metric corresponding to the feature please add its name here. For alpha, I believe we can leave this empty but it is mandatory for beta graduation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Removing for now. Do you have links to existing metrics for the device plugin?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


The recent addition of [CDI device IDs to the CRI structures](#3731)
allow these IDs to be forwarded to the CRI runtimes in a secure manner. Although
these changes were motivated by KEP-3063, adding support for these fields to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a link to the KEP here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will update.

@elezar elezar force-pushed the KEP-4009/Add-CDI-devices-to-device-plugin-API branch from f0a20e6 to 1b9abb0 Compare May 26, 2023 14:00
@elezar elezar marked this pull request as ready for review May 26, 2023 14:02
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 26, 2023
@elezar elezar requested a review from swatisehgal May 26, 2023 14:02
@elezar elezar force-pushed the KEP-4009/Add-CDI-devices-to-device-plugin-API branch from 1b9abb0 to a6724a2 Compare May 26, 2023 14:14

The recent addition of CDI device IDs to the CRI structures in [KEP-3731](https://github.com/kubernetes/enhancements/pull/3731)

allow these IDs to be forwarded to the CRI runtimes in a secure manner. Although
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really care about the way Kubelet is using to pass CDI devices to a runtime? Would it change anything if Kubelet uses annotations?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 6, 2023
@mrunalp
Copy link
Contributor

mrunalp commented Jun 6, 2023

This lets device plugins take advantage of CDI simplifying their operations.

/lgtm
/approved

@elezar elezar force-pushed the KEP-4009/Add-CDI-devices-to-device-plugin-API branch from d94c454 to 4800b20 Compare June 6, 2023 16:00
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 6, 2023
@mrunalp
Copy link
Contributor

mrunalp commented Jun 6, 2023

/approve
@dchen1107

@elezar elezar requested a review from mrunalp June 6, 2023 17:52
adding a `cpu_ids` field to the existing PodResources API.

Furthermore, the only code that would be featured-gated in the Device Manager would be the code to propagate this field from the `AllocateResponse` to the
corresponding field in the CRI.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the change is not feature gated, but some code in the Device Manager is feature gated? Which feature gate will be used for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not feature gated and no code is feature gated. I was calling out that if a feature gate were to be added, what would be guarded by it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just drop this sentence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you are missing the reference in here though:

This change is not behind a feature gate. It is adding an optional
field to a stable API. This mirrors what was done in []() when
adding a `cpu_ids` field to the existing PodResources API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not feature gated and no code is feature gated. I was calling out that if a feature gate were to be added, what would be guarded by it.

I see. It's about how it would be done if a feature would be future gated, but it's not going to be future gated. Thank you for the explanations!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @klueska. Removed and also added the link to #1884 that was missing.

The recent addition of CDI device IDs to the CRI structures in [#3731](https://github.com/kubernetes/enhancements/pull/3731) allow these IDs to be forwarded to the CRI runtimes in a secure manner. Although
these changes were motivated by [KEP-3063](https://github.com/kubernetes/enhancements/issues/3063), adding support for these fields to the
existing device plugin API allows this mechanism to also be used for devices
supported by these plugins.
Copy link
Contributor

@bart0sh bart0sh Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still suggest not to emphasise usage of CRI field. This is a protocol details between the Kubelet and a CRI runtime. This KEP is about Device Plugin API change.

In fact, you may end up passing CDI devices from the Kubelet to a CRI runtime using annotations to support old runtime versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I don't disagree that the kubelet COULD use annotations to forward device IDs to runtimes, the intent of this KEP is to use the new field in the CRI to allow Device Plugins to request devices. This keeps these devices IDs opague and does not require any interaction by the Kubelet. At present the Kubelet (at least in my cursory inspection of the code) doesn't perform any other such processing for other fields present.

Forwarding the field directly also means that the Kubelet does not need to be aware of which runtime version -- or even specific configuration -- is being used. I mention configurations here, because for containerd, for example, CDI needs to be enabled and the CDI annotations need to be allowed before these function. The proposal as it stands shifts the responsibility on device plugin vendors to make a decision as to which mechansim they will use to support device injection and when to transition from one mechanism to another as well as educating users on the required configuration. This would extend the existing node-level configurations or requirements that already need to be communicated to users of a plugin.

If we do at some point decide that the kubelet should process this field and and perform operations on it -- such as converting these to annotations -- that should be a new KEP where that can be discussed with specific use cases in mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal as it stands shifts the responsibility on device plugin vendors

Would it be better if Kubelet would take this responsibility? At least it would be done in one place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give a use case that is not covered by the current model to extending the Kubelet with additional functionality?

@elezar elezar force-pushed the KEP-4009/Add-CDI-devices-to-device-plugin-API branch from 4800b20 to d601074 Compare June 7, 2023 12:37
This change proposes adding a field to the existing device plugin API
to allow device plugin vendors to specify CDI device IDs as part of
the allocate response for a container.

Signed-off-by: Evan Lezar <[email protected]>
@elezar elezar force-pushed the KEP-4009/Add-CDI-devices-to-device-plugin-API branch from d601074 to eb23e14 Compare June 12, 2023 09:55
Copy link
Member

@enj enj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial PRR review as shadow.

Comment on lines 483 to 488
This change is not behind a feature gate. It is adding an optional
field to a stable API. This mirrors what was done in [KEP-1884](https://github.com/kubernetes/enhancements/pull/1884) when
adding a `cpu_ids` field to the existing PodResources API.

<!-- Furthermore, the only code that would be featured-gated in the Device Manager would be the code to propagate this field from the `AllocateResponse` to the
corresponding field in the CRI. -->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not sound right to me. This new field should start as alpha with its own feature gate. It does not matter that the field is optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed @enj. KEP-1884 was prior to requiring PRR.

We should feature gate this and use standard feature rollout processes. The API you are adding to may not be alpha, but the field still needs a feature gate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the field itself can't be feature gated (it's a field in a prototbuf), and the code to add the feature gate will actually require more code than the code required to pass the value of the field on to the CRI (where the exact same field with the exact same format already exists).

It seems overly complicated in my opinion to make such a small change go through the standard alpha/beta/stable graduation criteria.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code to add the feature gate will actually require more code than the code required to pass the value of the field on to the CRI

Feature gating always adds branching and complexity. This is normal and expected.

It seems overly complicated in my opinion to make such a small change go through the standard alpha/beta/stable graduation criteria.

See #2784 as an example of going through the entire KEP process for adding a simple (new/optional) field to a GA API. Admins must explicitly opt-in to alpha features, and must be able to disable/prevent the use of beta features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@enj the feature you referenced seemed to go straight to Beta. Would that also be an option here? Meaning that we feature-gate the feature buy have it enabled by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elezar I would recommend reading through that KEP - something like half of its content is focused on defending why it is safe for it to go straight to beta and the security benefits of doing so (and how nothing bad can happen on skewed clients/servers, downgrades, etc). Generally this is something to discuss well in advance with SIG leads and PRR folks to make sure it still fits within the bounds, but I will defer that decision to @johnbelamaric, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the KEP to add this as an alpha feature.

Comment on lines 498 to 500
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

No.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also does not make sense for an alpha feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire change is adding a field in the device-plugin protobuf and then passing that untouched through the kubelet to the CRI. It is hard for me to wrap my head around what it would mean to "disable" this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be enough to explain that here, so as to make it clearer why it doesn't make sense to disable it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is hard for me to wrap my head around what it would mean to "disable" this.

Disable means "it is impossible to use this feature through Kubernetes without setting the feature gate." Thus when the feature gate is disabled, the field would be explicitly unset (or cause an error on use).

Comment on lines 512 to 532
###### What happens if we reenable the feature if it was previously rolled back?

Not applicable. The feature cannot be toggled.

###### Are there any tests for feature enablement/disablement?

<!--
The e2e framework does not currently support enabling or disabling feature
gates. However, unit tests in each component dealing with managing data, created
with and without the feature, are necessary. At the very least, think about
conversion tests if API types are being modified.

Additionally, for features that are introducing a new API field, unit tests that
are exercising the `switch` of feature gate itself (what happens if I disable a
feature gate after having objects written with the new field) are also critical.
You can take a look at one potential example of such test in:
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
-->

End to end tests will be added that exercise this feature, but since the feature
cannot be toggled, this will not be added.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Difficult to review this from a PRR standpoint since this is not a valid approach for an alpha feature.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 15, 2023
@klueska
Copy link
Contributor

klueska commented Jun 15, 2023

@enj @johnbelamaric
I have updated the KEP to include the addition of a feature gate as well as more adequately answer the PRR questions.

@johnbelamaric
Copy link
Member

Thanks!

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elezar, johnbelamaric, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 15, 2023
@klueska
Copy link
Contributor

klueska commented Jun 16, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 16, 2023
@k8s-ci-robot k8s-ci-robot merged commit 24130cd into kubernetes:master Jun 16, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Jun 16, 2023
Copy link

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bart0sh thanks for pointing this out via #3080 (comment). Although i understand this KEP to enable CDI Devices, i need to have more time to see if our requirements can be supported. will get back to you on another thread 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Development

Successfully merging this pull request may close these issues.

10 participants