Merge pull request #1670 from tkatila/xe-driver-support
GPU: Add support for the new xe KMD
mythi authored Mar 13, 2024
2 parents 7d00cf0 + 4946b26 commit ca301c0
Showing 18 changed files with 856 additions and 246 deletions.
1 change: 1 addition & 0 deletions .github/workflows/lib-e2e.yaml
@@ -25,6 +25,7 @@ jobs:
- name: e2e-gpu
runner: gpu
images: intel-gpu-plugin intel-gpu-initcontainer
targetJob: e2e-gpu SKIP=Resource:xe
- name: e2e-iaa-spr
targetjob: e2e-iaa
runner: simics-spr
2 changes: 1 addition & 1 deletion README.md
@@ -229,7 +229,7 @@ The summary of resources available via plugins in this repository is given in th
* [dsa-accel-config-demo-pod.yaml](demo/dsa-accel-config-demo-pod.yaml)
* `fpga.intel.com` : custom, see [mappings](cmd/fpga_admissionwebhook/README.md#mappings)
* [intelfpga-job.yaml](demo/intelfpga-job.yaml)
* `gpu.intel.com` : `i915`
* `gpu.intel.com` : `i915`, `i915_monitoring`, `xe` or `xe_monitoring`
* [intelgpu-job.yaml](demo/intelgpu-job.yaml)
* `iaa.intel.com` : `wq-user-[shared or dedicated]`
* [iaa-accel-config-demo-pod.yaml](demo/iaa-accel-config-demo-pod.yaml)
40 changes: 39 additions & 1 deletion cmd/gpu_plugin/README.md
@@ -16,6 +16,7 @@ Table of Contents
* [Running GPU plugin as non-root](#running-gpu-plugin-as-non-root)
* [Labels created by GPU plugin](#labels-created-by-gpu-plugin)
* [SR-IOV use with the plugin](#sr-iov-use-with-the-plugin)
* [KMD and UMD](#kmd-and-umd)
* [Issues with media workloads on multi-GPU setups](#issues-with-media-workloads-on-multi-gpu-setups)
* [Workaround for QSV and VA-API](#workaround-for-qsv-and-va-api)

@@ -36,11 +37,23 @@ For example containers with Intel media driver (and components using that), can
video transcoding operations, and containers with the Intel OpenCL / oneAPI Level Zero
backend libraries can offload compute operations to GPU.

Intel GPU plugin may register four node resources to the Kubernetes cluster:
| Resource | Description |
|:---- |:-------- |
| gpu.intel.com/i915 | GPU instance running legacy `i915` KMD |
| gpu.intel.com/i915_monitoring | Monitoring resource for the legacy `i915` KMD devices |
| gpu.intel.com/xe | GPU instance running new `xe` KMD |
| gpu.intel.com/xe_monitoring | Monitoring resource for the new `xe` KMD devices |
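
As an illustrative sketch (not part of this commit), a workload requests the new `xe` resource like any other extended resource; the pod name and image below are hypothetical placeholders, see [intelgpu-job.yaml](../../demo/intelgpu-job.yaml) for the maintained example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: xe-demo                       # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: workload
    image: docker.io/library/busybox:1.36   # placeholder image
    command: ["sh", "-c", "ls -l /dev/dri"]
    resources:
      limits:
        gpu.intel.com/xe: 1           # one GPU instance running the xe KMD
```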

While the GPU plugin's basic operations support nodes having both (`i915` and `xe`) KMDs on the same node, its resource management (=GAS) does not: for GAS, the node needs to have only one of the KMDs present.

For workloads on different KMDs, see [KMD and UMD](#kmd-and-umd).

## Modes and Configuration Options

| Flag | Argument | Default | Meaning |
|:---- |:-------- |:------- |:------- |
| -enable-monitoring | - | disabled | Enable 'i915_monitoring' resource that provides access to all Intel GPU devices on the node |
| -enable-monitoring | - | disabled | Enable '*_monitoring' resource that provides access to all Intel GPU devices on the node, [see use](./monitoring.md) |
| -resource-manager | - | disabled | Enable fractional resource management, [see use](./fractional.md) |
| -shared-dev-num | int | 1 | Number of containers that can share the same GPU device |
| -allocation-policy | string | none | 3 possible values: balanced, packed, none. For shared-dev-num > 1: _balanced_ mode spreads workloads among GPU devices, _packed_ mode fills one GPU fully before moving to next, and _none_ selects first available device from kubelet. Default is _none_. Allocation policy does not have an effect when resource manager is enabled. |
@@ -205,6 +218,31 @@ GPU plugin does __not__ set up SR-IOV. It has to be configured by the cluster admin.
GPU plugin does however support provisioning Virtual Functions (VFs) to containers for a SR-IOV enabled GPU. When the plugin detects a GPU with SR-IOV VFs configured, it will only provision the VFs and leave the PF device on the host.
### KMD and UMD
There are 3 different Kernel Mode Drivers (KMD) available: `i915 upstream`, `i915 backport` and `xe`:
* `i915 upstream` is a vanilla driver that comes from the upstream kernel and is included in the common Linux distributions, like Ubuntu.
* `i915 backport` is an [out-of-tree driver](https://github.com/intel-gpu/intel-gpu-i915-backports/) for older enterprise / LTS kernel versions, providing support for new HW before the upstream kernel does. The API it provides to user-space can differ from the eventual upstream version.
* `xe` is a new KMD intended to support future GPUs. While it has [experimental support for current GPUs](https://docs.kernel.org/gpu/rfc/xe.html) (starting from Tiger Lake), those will not be supported officially.
For optimal performance, the KMD should be paired with the same UMD variant. When creating a workload container, the UMD packages should be selected appropriately for the target hardware.
| KMD | UMD packages | Support notes |
|:---- |:-------- |:------- |
| `i915 upstream` | Distro Repository | For Integrated GPUs. Newer Linux kernels will introduce support for Arc, Flex or Max series. |
| `i915 backport` | [Intel Repository](https://dgpu-docs.intel.com/driver/installation.html#install-steps) | Best for Arc, Flex and Max series. Untested for Integrated GPUs. |
| `xe` | Source code only | Experimental support for Arc, Flex and Max series. |
> *NOTE*: Xe UMD is in active development and should be considered as experimental.
Creating a workload that would support all the different KMDs is not currently possible. Below is a table that clarifies how each domain supports different KMDs.
| Domain | i915 upstream | i915 backport | xe | Notes |
|:---- |:-------- |:------- |:------- |:------- |
| Compute | Default | [NEO_ENABLE_i915_PRELIM_DETECTION](https://github.com/intel/compute-runtime/blob/3341de7a0d5fddd2ea5f505b5d2ef5c13faa0681/CMakeLists.txt#L496-L502) | [NEO_ENABLE_XE_DRM_DETECTION](https://github.com/intel/compute-runtime/blob/3341de7a0d5fddd2ea5f505b5d2ef5c13faa0681/CMakeLists.txt#L504-L510) | All three KMDs can be supported at the same time. |
| Media | Default | [ENABLE_PRODUCTION_KMD](https://github.com/intel/media-driver/blob/a66b076e83876fbfa9c9ab633ad9c5517f8d74fd/CMakeLists.txt#L58) | [ENABLE_XE_KMD](https://github.com/intel/media-driver/blob/a66b076e83876fbfa9c9ab633ad9c5517f8d74fd/media_driver/cmake/linux/media_feature_flags_linux.cmake#L187-L190) | Xe with upstream or backport i915, not all three. |
| Graphics | Default | Unknown | [intel-xe-kmd](https://gitlab.freedesktop.org/mesa/mesa/-/blob/e9169881dbd1f72eab65a68c2b8e7643f74489b7/meson_options.txt#L708) | i915 and xe KMDs can be supported at the same time. |
### Issues with media workloads on multi-GPU setups
OneVPL media API, 3D and compute APIs provide device discovery
85 changes: 85 additions & 0 deletions cmd/gpu_plugin/device_props.go
@@ -0,0 +1,85 @@
// Copyright 2024 Intel Corporation. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package main

import (
	"slices"

	"github.com/intel/intel-device-plugins-for-kubernetes/cmd/internal/labeler"
	"github.com/intel/intel-device-plugins-for-kubernetes/cmd/internal/pluginutils"
	"k8s.io/klog/v2"
)

type DeviceProperties struct {
	currentDriver string
	drmDrivers    map[string]bool
	tileCounts    []uint64
	isPfWithVfs   bool
}

type invalidTileCountErr struct {
	error
}

func newDeviceProperties() *DeviceProperties {
	return &DeviceProperties{
		drmDrivers: make(map[string]bool),
	}
}

func (d *DeviceProperties) fetch(cardPath string) {
	d.isPfWithVfs = pluginutils.IsSriovPFwithVFs(cardPath)

	d.tileCounts = append(d.tileCounts, labeler.GetTileCount(cardPath))

	driverName, err := pluginutils.ReadDeviceDriver(cardPath)
	if err != nil {
		klog.Warningf("card (%s) doesn't have driver, using default: %s", cardPath, deviceTypeDefault)

		driverName = deviceTypeDefault
	}

	d.currentDriver = driverName
	d.drmDrivers[d.currentDriver] = true
}

func (d *DeviceProperties) drmDriverCount() int {
	return len(d.drmDrivers)
}

func (d *DeviceProperties) driver() string {
	return d.currentDriver
}

func (d *DeviceProperties) monitorResource() string {
	return d.currentDriver + monitorSuffix
}

func (d *DeviceProperties) maxTileCount() (uint64, error) {
	if len(d.tileCounts) == 0 {
		return 0, invalidTileCountErr{}
	}

	minCount := slices.Min(d.tileCounts)
	maxCount := slices.Max(d.tileCounts)

	if minCount != maxCount {
		klog.Warningf("Node's GPUs are heterogeneous (min: %d, max: %d tiles)", minCount, maxCount)

		return 0, invalidTileCountErr{}
	}

	return maxCount, nil
}
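The `maxTileCount` helper added above only reports a tile count when every GPU on the node agrees, using Go 1.21's `slices.Min`/`slices.Max`. A standalone sketch of that homogeneity check (the helper name `homogeneousMax` and the sample values are hypothetical):

```go
package main

import (
	"fmt"
	"slices"
)

// homogeneousMax returns the shared tile count when all GPUs on the node
// report the same value; a heterogeneous node yields no single valid value.
func homogeneousMax(tileCounts []uint64) (uint64, bool) {
	if len(tileCounts) == 0 {
		return 0, false
	}

	if slices.Min(tileCounts) != slices.Max(tileCounts) {
		// Mixed tile counts: no single node-level label applies.
		return 0, false
	}

	return tileCounts[0], true
}

func main() {
	fmt.Println(homogeneousMax([]uint64{2, 2, 2})) // homogeneous node
	fmt.Println(homogeneousMax([]uint64{1, 2}))    // heterogeneous node
}
```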