-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support VFIO backed VFs #63
Comments
@zshi-redhat @ahalim-intel what's the official strategy for userspace devices for this CNI plugin? Would a PR that would make binding for e.g. VFIO devices be accepted into the CNI plugin, or should we go and contribute to another repo? (Perhaps start a new one.) I'd like to make sure that we are on the same page before I start working on the implementation. The implementation would resemble the shell CNI plugin mentioned in #37 (comment) but as part of the SR-IOV CNI plugin. |
/assign booxter |
There is a repo for userspace CNI plugins, it supports userspace interfaces with OVS or VPP as backend. But it doesn't support DPDK over VF.
Could you please elaborate a little bit on how SR-IOV CNI knows which driver the VF shall be bind to?
|
Would it make sense that we do the binding of VF in SR-IOV CNI by providing a net-attach-def Custom Resource with dpdk-driver as one of its parameters? Then we could have SR-IOV CNI check whether additional driver(dpdk-driver) is configured and do the binding accordingly. This doesn't solve the problem of publishing IPAM information to container Pod, but it avoid device plugin to track all the vfio device and |
@zshi-redhat the device cgroup permissions for the container cannot be taken care of with a CNI plugin. Thats why it makes more sense to leave driver binding/unbinding out of a CNI plugin. We had this in sriov-cni as a legacy feature(before any notion of device plugin). A device plugin is right place for device file permission related issue. The SRIOV DP takes care of it already, thus we no longer should do anything with driver changes in the CNI plugin. |
@ahalim-intel @zshi-redhat I think we can determine the current driver for a device using sysfs. If it's VFIO, do nothing to move (non-existing) netlink interface into namespace. (But still we can create a veth to carry IPAM information. That would be a separate follow-up feature.) |
Note that in my use case, it's not DPDK at all, it's a regular VF that happens to be registered with VFIO userspace driver. So using dpdk_driver for this need seems incorrect. I also think that it should be easy to automatically detect whether a device is bound to vfio driver and act accordingly (by skipping the move-netlink-to-namespace step of the plugin). |
If a pci device allocated to a container by SR-IOV device plugin is of vfio type then it doesn't have a netlink representation, so moving the interface into container namespace is useless and actually breaks the CNI plugin. We could perhaps go with a separate noop CNI plugin to do the same but there are plans to, in the future, allocate IPAM info for the device and expose it by one means or another (it could be a fake veth pair to carry the information: see issue k8snetworkplumbingwg#37; or some other mechanism). Fixes k8snetworkplumbingwg#63
If a pci device allocated to a container by SR-IOV device plugin is of vfio type then it doesn't have a netlink representation, so moving the interface into container namespace is useless and actually breaks the CNI plugin. We could perhaps go with a separate noop CNI plugin to do the same but there are plans to, in the future, allocate IPAM info for the device and expose it by one means or another (it could be a fake veth pair to carry the information: see issue k8snetworkplumbingwg#37; or some other mechanism). Fixes k8snetworkplumbingwg#63
If a pci device allocated to a container by SR-IOV device plugin is of vfio type then it doesn't have a netlink representation, so moving the interface into container namespace is useless and actually breaks the CNI plugin. We could perhaps go with a separate noop CNI plugin to do the same but there are plans to, in the future, allocate IPAM info for the device and expose it by one means or another (it could be a fake veth pair to carry the information: see issue k8snetworkplumbingwg#37; or some other mechanism). Fixes k8snetworkplumbingwg#63
If a pci device allocated to a container by SR-IOV device plugin is of vfio type then it doesn't have a netlink representation, so moving the interface into container namespace is useless and actually breaks the CNI plugin. We could perhaps go with a separate noop CNI plugin to do the same but there are plans to, in the future, allocate IPAM info for the device and expose it by one means or another (it could be a fake veth pair to carry the information: see issue k8snetworkplumbingwg#37; or some other mechanism). Fixes k8snetworkplumbingwg#63
When SR-IOV device plugin allocates a VFIO typed VF, the device doesn't have a netlink representation. So when SR-IOV CNI then tries to move the device into the pod, it fails because netlink interface is not present.
While one may argue that in case of VFIO devices we may not need a CNI at all, Multus requires that a network has
type
specified. If it's notsriov
then it should be something else. We could write a new CNI plugin that would do nothing in terms of device binding into the namespace; but it's not convenient if we already have SR-IOV CNI with meaningful type name. We could also benefit from SR-IOV CNI handling the binding if we e.g. want IPAM info to be allocated for the device (perhaps because later we will use this information to pass it into software running inside the pod; in case ofkubevirt
it would be libvirt-backed VM that receives IPAM configuration via cloud-init; #37 is the place where one of potential approaches to pass IPAM into pod is discussed.)It's my belief that SR-IOV CNI should gracefully handle VFs that are registered with VFIO driver like it already does for non-VFIO devices. This issue is just that - making the plugin not fail on VFIO registered VF and return desired IPAM information in result returned to the caller.
The text was updated successfully, but these errors were encountered: