Misc fixes #1561

Merged 5 commits on Oct 24, 2023
2 changes: 1 addition & 1 deletion README.md
@@ -231,7 +231,7 @@ The summary of resources available via plugins in this repository is given in th
* `gpu.intel.com` : `i915`
* [intelgpu-job.yaml](demo/intelgpu-job.yaml)
* `iaa.intel.com` : `wq-user-[shared or dedicated]`
* [iaa-qpl-demo-pod.yaml](demo/iaa-qpl-demo-pod.yaml)
* [iaa-accel-config-demo-pod.yaml](demo/iaa-accel-config-demo-pod.yaml)
* `qat.intel.com` : `generic` or `cy`/`dc`/`asym-dc`/`sym-dc`
* [crypto-perf-dpdk-pod-requesting-qat.yaml](deployments/qat_dpdk_app/base/crypto-perf-dpdk-pod-requesting-qat.yaml)
* `sgx.intel.com` : `epc`
20 changes: 10 additions & 10 deletions cmd/iaa_plugin/README.md
@@ -76,45 +76,45 @@ node1

## Testing and Demos

We can test the plugin is working by deploying the provided example iaa-qpl-demo test image.
We can test the plugin is working by deploying the provided example accel-config-demo test image.

1. Build a Docker image with the accel-config tests:

```bash
$ make iaa-qpl-demo
$ make accel-config-demo
...
Successfully tagged intel/iaa-qpl-demo:devel
Successfully tagged intel/accel-config-demo:devel
```

1. Create a pod running unit tests off the local Docker image:

```bash
$ kubectl apply -f ./demo/iaa-qpl-demo-pod.yaml
pod/iaa-qpl-demo created
$ kubectl apply -f ./demo/iaa-accel-config-demo-pod.yaml
pod/iaa-accel-config-demo created
```

1. Wait until pod is completed:

```bash
$ kubectl get pods |grep iaa-qpl-demo
iaa-qpl-demo 0/1 Completed 0 31m
$ kubectl get pods |grep iaa-accel-config-demo
iaa-accel-config-demo 0/1 Completed 0 31m

If the pod did not successfully launch, possibly because it could not obtain the IAA
resource, it will be stuck in the `Pending` status:

```bash
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
iaa-qpl-demo 0/1 Pending 0 7s
iaa-accel-config-demo 0/1 Pending 0 7s
```

This can be verified by checking the Events of the pod:

```bash

$ kubectl describe pod iaa-qpl-demo | grep -A3 Events:
$ kubectl describe pod iaa-accel-config-demo | grep -A3 Events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m26s default-scheduler 0/1 nodes are available: 1 Insufficient iaa.intel.com/wq-user-dedicated, 1 Insufficient iaa.intel.com/wq-user-shared.
Warning FailedScheduling 2m26s default-scheduler 0/1 nodes are available: 1 Insufficient iaa.intel.com/wq-user-dedicated.
```
16 changes: 0 additions & 16 deletions demo/iaa-qpl-demo-pod.yaml

This file was deleted.

Original file line number Diff line number Diff line change
@@ -16,8 +16,7 @@ spec:
gpu.intel.com/tiles: {op: Exists}
name: intel.gpu.fractionalresources
# generic rule for older and upcoming devices
- labels:
labelsTemplate: |
- labelsTemplate: |
{{ range .pci.device }}gpu.intel.com/device-id.{{ .class }}-{{ .device }}.present=true
{{ end }}
matchFeatures:
@@ -33,8 +32,7 @@ spec:
value:
- "8086"
name: intel.gpu.generic.deviceid
- labels:
labelsTemplate: gpu.intel.com/device-id.0300-{{ (index .pci.device 0).device }}.count={{ len .pci.device }}
- labelsTemplate: gpu.intel.com/device-id.0300-{{ (index .pci.device 0).device }}.count={{ len .pci.device }}
matchFeatures:
- feature: pci.device
matchExpressions:
@@ -47,8 +45,7 @@ spec:
value:
- "8086"
name: intel.gpu.generic.count.300
- labels:
labelsTemplate: gpu.intel.com/device-id.0380-{{ (index .pci.device 0).device }}.count={{ len .pci.device }}
- labelsTemplate: gpu.intel.com/device-id.0380-{{ (index .pci.device 0).device }}.count={{ len .pci.device }}
matchFeatures:
- feature: pci.device
matchExpressions:
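The `labelsTemplate` values in these rules use Go `text/template` syntax. The following is a minimal sketch of how the count template above might expand. It assumes the matched PCI devices are exposed to the template as a list of attribute maps under `.pci.device`; that data layout, the helper `renderLabel`, and the device ID `56c0` are illustrative assumptions, not taken from this change.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// countTemplate is copied from the rule named intel.gpu.generic.count.300.
const countTemplate = `gpu.intel.com/device-id.0300-{{ (index .pci.device 0).device }}.count={{ len .pci.device }}`

// renderLabel (hypothetical helper) expands tmpl against data and returns
// the resulting label text.
func renderLabel(tmpl string, data interface{}) (string, error) {
	t, err := template.New("label").Parse(tmpl)
	if err != nil {
		return "", err
	}

	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}

	return sb.String(), nil
}

func main() {
	// Assumed input shape: two matched devices with the same class and device ID.
	data := map[string]interface{}{
		"pci": map[string]interface{}{
			"device": []map[string]string{
				{"class": "0300", "device": "56c0"},
				{"class": "0300", "device": "56c0"},
			},
		},
	}

	out, err := renderLabel(countTemplate, data)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}

	fmt.Println(out)
}
```

With this input the template yields a single `name=value` line whose value is the number of matched devices.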
7 changes: 5 additions & 2 deletions pkg/controllers/dlb/controller.go
@@ -118,8 +118,11 @@ func (c *controller) UpdateDaemonSet(rawObj client.Object, ds *apps.DaemonSet) (
updated = true
}
} else {
setInitContainer(&ds.Spec.Template.Spec, dp.Spec)
updated = true
containers := ds.Spec.Template.Spec.InitContainers
if len(containers) != 1 || containers[0].Image != dp.Spec.InitImage {
setInitContainer(&ds.Spec.Template.Spec, dp.Spec)
updated = true
}
}

if len(dp.Spec.NodeSelector) > 0 {
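The guard added in this hunk makes the update conditional: the init container is rewritten only when the DaemonSet does not already hold exactly one init container with the requested image. A minimal sketch of that check, using a hypothetical `container` type in place of `v1.Container` and made-up image names:

```go
package main

import "fmt"

// container is a hypothetical stand-in for v1.Container; only the fields
// the check needs are modelled.
type container struct {
	Name  string
	Image string
}

// needsInitContainerUpdate mirrors the condition in the hunk: an update is
// needed unless there is exactly one init container with the wanted image.
func needsInitContainerUpdate(current []container, wantImage string) bool {
	return len(current) != 1 || current[0].Image != wantImage
}

func main() {
	want := "example/init:1.0" // illustrative image name

	upToDate := []container{{Name: "init", Image: "example/init:1.0"}}
	stale := []container{{Name: "init", Image: "example/init:0.9"}}

	fmt.Println(needsInitContainerUpdate(upToDate, want)) // false
	fmt.Println(needsInitContainerUpdate(stale, want))    // true
	fmt.Println(needsInitContainerUpdate(nil, want))      // true
}
```

Without the guard, the branch reports `updated = true` on every pass, even when nothing differs.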
26 changes: 15 additions & 11 deletions pkg/controllers/dsa/controller.go
@@ -35,8 +35,9 @@ import (
)

const (
ownerKey = ".metadata.controller.dsa"
inicontainerName = "intel-idxd-config-initcontainer"
ownerKey = ".metadata.controller.dsa"
initcontainerName = "intel-idxd-config-initcontainer"
configVolumeName = "intel-dsa-config-volume"
)

var defaultNodeSelector = deployments.DSAPluginDaemonSet().Spec.Template.Spec.NodeSelector
@@ -87,7 +88,7 @@ func removeInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.DsaDevicePlugin)
newInitContainers := []v1.Container{}

for _, container := range ds.Spec.Template.Spec.InitContainers {
if container.Name == inicontainerName {
if container.Name == initcontainerName {
continue
}

@@ -98,7 +99,7 @@ func removeInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.DsaDevicePlugin)
newVolumes := []v1.Volume{}

for _, volume := range ds.Spec.Template.Spec.Volumes {
if volume.Name == "intel-dsa-config-volume" || volume.Name == "sys-bus-dsa" || volume.Name == "sys-devices" || volume.Name == "scratch" {
if volume.Name == configVolumeName || volume.Name == "sys-bus-dsa" || volume.Name == "sys-devices" || volume.Name == "scratch" {
continue
}

@@ -114,7 +115,7 @@ func addInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.DsaDevicePlugin) {
ds.Spec.Template.Spec.InitContainers = append(ds.Spec.Template.Spec.InitContainers, v1.Container{
Image: dp.Spec.InitImage,
ImagePullPolicy: "IfNotPresent",
Name: inicontainerName,
Name: initcontainerName,
Env: []v1.EnvVar{
{
Name: "NODE_NAME",
@@ -176,17 +177,17 @@ func addInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.DsaDevicePlugin) {

if dp.Spec.ProvisioningConfig != "" {
ds.Spec.Template.Spec.Volumes = append(ds.Spec.Template.Spec.Volumes, v1.Volume{
Name: "intel-dsa-config-volume",
Name: configVolumeName,
VolumeSource: v1.VolumeSource{
ConfigMap: &v1.ConfigMapVolumeSource{
LocalObjectReference: v1.LocalObjectReference{Name: dp.Spec.ProvisioningConfig}},
},
})

for i, initcontainer := range ds.Spec.Template.Spec.InitContainers {
if initcontainer.Name == inicontainerName {
if initcontainer.Name == initcontainerName {
ds.Spec.Template.Spec.InitContainers[i].VolumeMounts = append(ds.Spec.Template.Spec.InitContainers[i].VolumeMounts, v1.VolumeMount{
Name: "intel-dsa-config-volume",
Name: configVolumeName,
MountPath: "/idxd-init/conf",
})
}
@@ -218,16 +219,19 @@ func provisioningUpdate(ds *apps.DaemonSet, dp *devicepluginv1.DsaDevicePlugin)
found := false

for _, container := range ds.Spec.Template.Spec.InitContainers {
if container.Name == inicontainerName && container.Image != dp.Spec.InitImage {
if container.Name == initcontainerName {
if container.Image != dp.Spec.InitImage {
update = true
}

found = true
update = true

break
}
}

for _, volume := range ds.Spec.Template.Spec.Volumes {
if volume.Name == "intel-dsa-config-volume" && volume.ConfigMap.Name != dp.Spec.ProvisioningConfig {
if volume.Name == configVolumeName && volume.ConfigMap.Name != dp.Spec.ProvisioningConfig {
update = true
break
}
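The `provisioningUpdate` hunk separates two questions that the old condition merged: whether the init container exists, and whether its image is stale. The sketch below compares both versions of the loop in isolation. The `container` type and the function names `scanOld` and `scanFixed` are hypothetical, and how `found` is used after the loop is not visible in this hunk.

```go
package main

import "fmt"

// container is a hypothetical stand-in for v1.Container.
type container struct {
	Name  string
	Image string
}

// scanOld follows the previous condition: found is set only when the named
// container also has a different image.
func scanOld(cs []container, name, wantImage string) (found, update bool) {
	for _, c := range cs {
		if c.Name == name && c.Image != wantImage {
			found = true
			update = true

			break
		}
	}

	return found, update
}

// scanFixed follows the new condition: found is set whenever the named
// container exists, and update only when its image differs.
func scanFixed(cs []container, name, wantImage string) (found, update bool) {
	for _, c := range cs {
		if c.Name == name {
			if c.Image != wantImage {
				update = true
			}

			found = true

			break
		}
	}

	return found, update
}

func main() {
	// Illustrative data: the init container already runs the wanted image.
	cs := []container{{Name: "init", Image: "example/init:1.0"}}

	fmt.Println(scanOld(cs, "init", "example/init:1.0"))   // false false
	fmt.Println(scanFixed(cs, "init", "example/init:1.0")) // true false
}
```

With the old condition, an init container that already matched was reported as not found.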
46 changes: 34 additions & 12 deletions pkg/controllers/gpu/controller.go
@@ -242,32 +242,54 @@ func removeVolumeMount(volumeMounts []v1.VolumeMount, name string) []v1.VolumeMo
return newVolumeMounts
}

func (c *controller) UpdateDaemonSet(rawObj client.Object, ds *apps.DaemonSet) (updated bool) {
dp := rawObj.(*devicepluginv1.GpuDevicePlugin)

if ds.Spec.Template.Spec.Containers[0].Image != dp.Spec.Image {
ds.Spec.Template.Spec.Containers[0].Image = dp.Spec.Image
updated = true
}
func processInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.GpuDevicePlugin) bool {
initContainers := ds.Spec.Template.Spec.InitContainers

if dp.Spec.InitImage == "" {
if ds.Spec.Template.Spec.InitContainers != nil {
if initContainers != nil {
ds.Spec.Template.Spec.InitContainers = nil
ds.Spec.Template.Spec.Volumes = removeVolume(ds.Spec.Template.Spec.Volumes, "nfd-features")
updated = true

return true
}
} else {
} else if len(initContainers) != 1 || initContainers[0].Image != dp.Spec.InitImage {
setInitContainer(&ds.Spec.Template.Spec, dp.Spec.InitImage)
updated = true

return true
}

return false
}

func processNodeSelector(ds *apps.DaemonSet, dp *devicepluginv1.GpuDevicePlugin) bool {
if len(dp.Spec.NodeSelector) > 0 {
if !reflect.DeepEqual(ds.Spec.Template.Spec.NodeSelector, dp.Spec.NodeSelector) {
ds.Spec.Template.Spec.NodeSelector = dp.Spec.NodeSelector
updated = true

return true
}
} else if !reflect.DeepEqual(ds.Spec.Template.Spec.NodeSelector, defaultNodeSelector) {
ds.Spec.Template.Spec.NodeSelector = defaultNodeSelector

return true
}

return false
}

func (c *controller) UpdateDaemonSet(rawObj client.Object, ds *apps.DaemonSet) (updated bool) {
dp := rawObj.(*devicepluginv1.GpuDevicePlugin)

if ds.Spec.Template.Spec.Containers[0].Image != dp.Spec.Image {
ds.Spec.Template.Spec.Containers[0].Image = dp.Spec.Image
updated = true
}

if processInitContainer(ds, dp) {
updated = true
}

if processNodeSelector(ds, dp) {
updated = true
}

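The extracted `processNodeSelector` helper picks the selector requested in the plugin spec, falls back to the default when none is given, and reports a change only when the DaemonSet differs. A minimal sketch of the same decision on plain maps; the function name `reconcileNodeSelector` and the selector values are illustrative assumptions:

```go
package main

import (
	"fmt"
	"reflect"
)

// reconcileNodeSelector (hypothetical name) returns the selector the
// DaemonSet should carry and whether that differs from the current one.
func reconcileNodeSelector(current, requested, fallback map[string]string) (map[string]string, bool) {
	want := fallback
	if len(requested) > 0 {
		want = requested
	}

	if reflect.DeepEqual(current, want) {
		return current, false
	}

	return want, true
}

func main() {
	// Illustrative default selector; the real default comes from the
	// plugin's DaemonSet manifest.
	fallback := map[string]string{"kubernetes.io/arch": "amd64"}
	current := map[string]string{"kubernetes.io/arch": "amd64"}

	_, changed := reconcileNodeSelector(current, nil, fallback)
	fmt.Println(changed) // false

	requested := map[string]string{"example.com/gpu": "true"}
	sel, changed := reconcileNodeSelector(current, requested, fallback)
	fmt.Println(sel, changed)
}
```

Note that `reflect.DeepEqual` treats a nil map and an empty map as different, so a caller comparing against an empty default would see a change on every pass.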
26 changes: 15 additions & 11 deletions pkg/controllers/iaa/controller.go
@@ -35,8 +35,9 @@ import (
)

const (
ownerKey = ".metadata.controller.iaa"
inicontainerName = "intel-iaa-initcontainer"
ownerKey = ".metadata.controller.iaa"
initcontainerName = "intel-iaa-initcontainer"
configVolumeName = "intel-iaa-config-volume"
)

// +kubebuilder:rbac:groups=deviceplugin.intel.com,resources=iaadeviceplugins,verbs=get;list;watch;create;update;patch;delete
@@ -85,7 +86,7 @@ func removeInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.IaaDevicePlugin)
newInitContainers := []v1.Container{}

for _, container := range ds.Spec.Template.Spec.InitContainers {
if container.Name == inicontainerName {
if container.Name == initcontainerName {
continue
}

@@ -97,7 +98,7 @@ func removeInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.IaaDevicePlugin)
newVolumes := []v1.Volume{}

for _, volume := range ds.Spec.Template.Spec.Volumes {
if volume.Name == "intel-iaa-config-volume" || volume.Name == "sys-bus-dsa" || volume.Name == "sys-devices" || volume.Name == "scratch" {
if volume.Name == configVolumeName || volume.Name == "sys-bus-dsa" || volume.Name == "sys-devices" || volume.Name == "scratch" {
continue
}

@@ -113,7 +114,7 @@ func addInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.IaaDevicePlugin) {
ds.Spec.Template.Spec.InitContainers = append(ds.Spec.Template.Spec.InitContainers, v1.Container{
Image: dp.Spec.InitImage,
ImagePullPolicy: "IfNotPresent",
Name: inicontainerName,
Name: initcontainerName,
Env: []v1.EnvVar{
{
Name: "NODE_NAME",
@@ -175,17 +176,17 @@ func addInitContainer(ds *apps.DaemonSet, dp *devicepluginv1.IaaDevicePlugin) {

if dp.Spec.ProvisioningConfig != "" {
ds.Spec.Template.Spec.Volumes = append(ds.Spec.Template.Spec.Volumes, v1.Volume{
Name: "intel-iaa-config-volume",
Name: configVolumeName,
VolumeSource: v1.VolumeSource{
ConfigMap: &v1.ConfigMapVolumeSource{
LocalObjectReference: v1.LocalObjectReference{Name: dp.Spec.ProvisioningConfig}},
},
})

for i, initcontainer := range ds.Spec.Template.Spec.InitContainers {
if initcontainer.Name == inicontainerName {
if initcontainer.Name == initcontainerName {
ds.Spec.Template.Spec.InitContainers[i].VolumeMounts = append(ds.Spec.Template.Spec.InitContainers[i].VolumeMounts, v1.VolumeMount{
Name: "intel-iaa-config-volume",
Name: configVolumeName,
MountPath: "/idxd-init/conf",
})
}
@@ -219,16 +220,19 @@ func provisioningUpdate(ds *apps.DaemonSet, dp *devicepluginv1.IaaDevicePlugin)
found := false

for _, container := range ds.Spec.Template.Spec.InitContainers {
if container.Name == "intel-iaa-initcontainer" && container.Image != dp.Spec.InitImage {
if container.Name == initcontainerName {
if container.Image != dp.Spec.InitImage {
update = true
}

found = true
update = true

break
}
}

for _, volume := range ds.Spec.Template.Spec.Volumes {
if volume.Name == "intel-iaa-config-volume" && volume.ConfigMap.Name != dp.Spec.ProvisioningConfig {
if volume.Name == configVolumeName && volume.ConfigMap.Name != dp.Spec.ProvisioningConfig {
update = true

break
15 changes: 12 additions & 3 deletions pkg/controllers/qat/controller.go
@@ -111,7 +111,12 @@ func (c *controller) NewDaemonSet(rawObj client.Object) *apps.DaemonSet {
func (c *controller) UpdateDaemonSet(rawObj client.Object, ds *apps.DaemonSet) (updated bool) {
dp := rawObj.(*devicepluginv1.QatDevicePlugin)

if !reflect.DeepEqual(ds.ObjectMeta.Annotations, dp.ObjectMeta.Annotations) {
// Remove always incrementing annotation so it doesn't cause the next DeepEqual
// to return false every time.
dsAnnotations := ds.ObjectMeta.DeepCopy().Annotations
delete(dsAnnotations, "deprecated.daemonset.template.generation")

if !reflect.DeepEqual(dsAnnotations, dp.ObjectMeta.Annotations) {
pluginAnnotations := dp.ObjectMeta.DeepCopy().Annotations
ds.ObjectMeta.Annotations = pluginAnnotations
ds.Spec.Template.Annotations = pluginAnnotations
@@ -131,8 +136,12 @@ func (c *controller) UpdateDaemonSet(rawObj client.Object, ds *apps.DaemonSet) (
updated = true
}
} else {
setInitContainer(&ds.Spec.Template.Spec, dp.Spec)
updated = true
containers := ds.Spec.Template.Spec.InitContainers
if len(containers) != 1 || containers[0].Image != dp.Spec.InitImage {
setInitContainer(&ds.Spec.Template.Spec, dp.Spec)

updated = true
}
}

if len(dp.Spec.NodeSelector) > 0 {
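The annotation change in this file compares a filtered copy of the DaemonSet annotations, so that the `deprecated.daemonset.template.generation` key, which the code comment describes as always incrementing, no longer makes the comparison fail. A minimal sketch on plain maps; the helper names `copyWithout` and `annotationsDiffer` are hypothetical and the annotation values are made up:

```go
package main

import (
	"fmt"
	"reflect"
)

const generationAnnotation = "deprecated.daemonset.template.generation"

// copyWithout (hypothetical helper) returns a copy of src without key.
// A nil input stays nil so the comparison below behaves the same for
// objects that carry no annotations at all.
func copyWithout(src map[string]string, key string) map[string]string {
	if src == nil {
		return nil
	}

	dst := make(map[string]string, len(src))

	for k, v := range src {
		if k != key {
			dst[k] = v
		}
	}

	return dst
}

// annotationsDiffer reports whether the DaemonSet annotations differ from
// the plugin annotations once the generation annotation is ignored.
func annotationsDiffer(dsAnnotations, pluginAnnotations map[string]string) bool {
	return !reflect.DeepEqual(copyWithout(dsAnnotations, generationAnnotation), pluginAnnotations)
}

func main() {
	ds := map[string]string{"example.com/owner": "team-a", generationAnnotation: "3"}
	plugin := map[string]string{"example.com/owner": "team-a"}

	fmt.Println(!reflect.DeepEqual(ds, plugin)) // true: unfiltered maps differ
	fmt.Println(annotationsDiffer(ds, plugin))  // false: filtered maps match
	fmt.Println(len(ds))                        // 2: the original map is untouched
}
```

Working on a copy matters here: deleting the key from the live object's map would modify the DaemonSet itself rather than only the value used for comparison.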
2 changes: 1 addition & 1 deletion pkg/controllers/reconciler.go
@@ -39,7 +39,7 @@ import (

var (
bKeeper = &bookKeeper{}
ImageMinVersion = versionutil.MustParseSemantic("0.27.0")
ImageMinVersion = versionutil.MustParseSemantic("0.28.0")
)

func init() {
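This hunk raises `ImageMinVersion` from `0.27.0` to `0.28.0`. How the value is consumed is not shown here; presumably it is compared against image versions. The sketch below illustrates such a comparison with the standard library only. It is not the `versionutil` API, and `parseSemantic` and `atLeast` are hypothetical helpers that accept only plain `major.minor.patch` strings.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSemantic splits "major.minor.patch" into three integers.
func parseSemantic(v string) ([3]int, error) {
	var out [3]int

	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected major.minor.patch, got %q", v)
	}

	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid component %q in %q", p, v)
		}

		out[i] = n
	}

	return out, nil
}

// atLeast reports whether version v is greater than or equal to minimum.
func atLeast(v, minimum string) (bool, error) {
	a, err := parseSemantic(v)
	if err != nil {
		return false, err
	}

	b, err := parseSemantic(minimum)
	if err != nil {
		return false, err
	}

	// Compare components from most to least significant.
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i], nil
		}
	}

	return true, nil
}

func main() {
	for _, v := range []string{"0.27.1", "0.28.0", "0.29.2"} {
		ok, err := atLeast(v, "0.28.0")
		fmt.Println(v, ok, err)
	}
}
```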