gpu: add xe notes
Signed-off-by: Tuomas Katila <[email protected]>
tkatila committed Feb 21, 2024
1 parent b8c171a commit b3a924f
Showing 1 changed file with 36 additions and 0 deletions.
36 changes: 36 additions & 0 deletions cmd/gpu_plugin/README.md
@@ -16,6 +16,7 @@ Table of Contents
* [Running GPU plugin as non-root](#running-gpu-plugin-as-non-root)
* [Labels created by GPU plugin](#labels-created-by-gpu-plugin)
* [SR-IOV use with the plugin](#sr-iov-use-with-the-plugin)
* [KMD and UMD](#kmd-and-umd)
* [Issues with media workloads on multi-GPU setups](#issues-with-media-workloads-on-multi-gpu-setups)
* [Workaround for QSV and VA-API](#workaround-for-qsv-and-va-api)

@@ -36,6 +37,18 @@ For example containers with Intel media driver (and components using that), can
video transcoding operations, and containers with the Intel OpenCL / oneAPI Level Zero
backend libraries can offload compute operations to GPU.

The Intel GPU plugin may register up to four node resources with the Kubernetes cluster:
| Resource | Description |
|:---- |:-------- |
| gpu.intel.com/i915 | GPU instance running legacy `i915` KMD |
| gpu.intel.com/i915_monitoring | Monitoring resource for the legacy `i915` KMD devices |
| gpu.intel.com/xe | GPU instance running new `xe` KMD |
| gpu.intel.com/xe_monitoring | Monitoring resource for the new `xe` KMD devices |

While the GPU plugin's basic operations support nodes that have both KMDs (`i915` and `xe`), its resource management (GAS) does not. To use resource management, a node must have only one of the KMDs present.

For workloads on different KMDs, see [KMD and UMD](#kmd-and-umd).
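
As a minimal sketch of how the resource names above might be used, the Python helpers below map a KMD to its resource names, check the one-KMD-per-node condition for resource management, and build the `resources.limits` mapping for a container spec. The resource names come from the table; the helper names (`node_resources`, `resource_management_possible`, `container_limits`) are hypothetical and are not part of the plugin.

```python
# Resource names come from the table above; the helper names are hypothetical.
RESOURCES = {
    "i915": ("gpu.intel.com/i915", "gpu.intel.com/i915_monitoring"),
    "xe": ("gpu.intel.com/xe", "gpu.intel.com/xe_monitoring"),
}


def node_resources(kmds):
    """List every resource name the plugin may register for the given KMDs."""
    names = []
    for kmd in sorted(set(kmds)):
        names.extend(RESOURCES[kmd])
    return names


def resource_management_possible(kmds):
    """Resource management (GAS) requires exactly one KMD on the node."""
    return len(set(kmds)) == 1


def container_limits(kmd, count=1):
    """Build the `resources.limits` mapping for a container that needs GPUs."""
    return {RESOURCES[kmd][0]: count}
```

For example, `container_limits("xe")` yields the mapping to place under `resources.limits` for a workload that targets a node running the `xe` KMD.
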

## Modes and Configuration Options

| Flag | Argument | Default | Meaning |
@@ -205,6 +218,29 @@ GPU plugin does __not__ setup SR-IOV. It has to be configured by the cluster adm
GPU plugin does, however, support provisioning Virtual Functions (VFs) to containers for an SR-IOV enabled GPU. When the plugin detects a GPU with SR-IOV VFs configured, it provisions only the VFs and leaves the PF device on the host.
### KMD and UMD
There are three different KMDs (Kernel Mode Drivers) available: `i915 upstream`, `i915 backport` and `xe`:
* `i915 upstream` is a vanilla driver that comes from the upstream kernel and is included in the common Linux distributions, like Ubuntu.
* `i915 backport` is a [backported/out-of-tree driver](https://github.com/intel-gpu/intel-gpu-i915-backports/) with the newest feature set, which will eventually make its way to the upstream kernel. Due to how upstreaming works with reviews and approvals, the feature set might change in the upstreaming process.
* `xe` is a new KMD that is intended to support future GPUs. It can also work with current GPUs (Tiger Lake and newer).
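
To see which KMD a node's GPUs are bound to, one option is to read the driver symlinks in sysfs. The sketch below is only an illustration under stated assumptions: it assumes the standard Linux layout `/sys/class/drm/cardN/device/driver`, the function name `gpu_kmds` is hypothetical, and it is not how the plugin itself performs detection. It also cannot tell `i915 upstream` from `i915 backport`, since both bind as `i915`.

```python
import os
import re


def gpu_kmds(drm_root="/sys/class/drm"):
    """Map each DRM card (e.g. "card0") to the kernel driver bound to it."""
    drivers = {}
    if not os.path.isdir(drm_root):
        return drivers
    for entry in sorted(os.listdir(drm_root)):
        # Skip connectors (card0-DP-1) and render nodes (renderD128).
        if not re.fullmatch(r"card\d+", entry):
            continue
        link = os.path.join(drm_root, entry, "device", "driver")
        if os.path.islink(link):
            # The symlink target ends with the driver name, e.g. ".../drivers/xe".
            drivers[entry] = os.path.basename(os.readlink(link))
    return drivers
```

On a node, `set(gpu_kmds().values())` then gives the set of kernel drivers in use. The result can include non-Intel drivers if other GPUs are present.
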
For optimal performance, the KMD should be paired with the same UMD (User Mode Driver) variant. When creating a workload container, depending on the target hardware, the UMD packages should be selected appropriately.
| KMD | UMD packages | Support notes |
|:---- |:-------- |:------- |
| `i915 upstream` | Distro Repository | For Integrated GPUs. Newer Linux kernels will introduce support for Arc, Flex or Max series. |
| `i915 backport` | [Intel Repository](https://dgpu-docs.intel.com/driver/installation.html#install-steps) | Best for Arc, Flex and Max series. Untested for Integrated GPUs. |
| `xe` | Not available yet | Experimental support for Arc, Flex and Max series. |
Creating a workload that would support all the different KMDs is not currently possible. Below is a table that clarifies how each domain supports different KMDs.
| Domain | i915 upstream | i915 backport | xe | Notes |
|:---- |:-------- |:------- |:------- |:------- |
| Compute | Default | [NEO_ENABLE_i915_PRELIM_DETECTION](https://github.com/intel/compute-runtime/blob/master/CMakeLists.txt#L498) | [NEO_ENABLE_XE_DRM_DETECTION](https://github.com/intel/compute-runtime/blob/master/CMakeLists.txt#L506) | All three KMDs can be supported at the same time. |
| Media | Default | [ENABLE_PRODUCTION_KMD](https://github.com/intel/media-driver/blob/master/CMakeLists.txt#L58) | [ENABLE_NEW_KMD](https://github.com/intel/media-driver/blob/master/media_driver/cmake/linux/media_feature_flags_linux.cmake#L185) | Only one KMD can be enabled at a time. |
| Graphics | Default | Default | [intel-xe-kmd](https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/meson_options.txt?ref_type=heads#L708) | All KMDs can be supported at the same time. |
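
The table above can be encoded as data to check which build options a UMD build would need for a chosen set of KMDs. This is a minimal sketch: the option names are copied from the table, while the function name `options_for` and the data layout are hypothetical. How each option is passed to CMake or Meson, and with what value, is not covered here.

```python
KMDS = ("i915 upstream", "i915 backport", "xe")

# Build option that enables each KMD per domain, as listed in the table above.
# None means the KMD is supported by default, without an extra option.
BUILD_OPTIONS = {
    "Compute": {
        "i915 upstream": None,
        "i915 backport": "NEO_ENABLE_i915_PRELIM_DETECTION",
        "xe": "NEO_ENABLE_XE_DRM_DETECTION",
    },
    "Media": {
        "i915 upstream": None,
        "i915 backport": "ENABLE_PRODUCTION_KMD",
        "xe": "ENABLE_NEW_KMD",
    },
    "Graphics": {
        "i915 upstream": None,
        "i915 backport": None,
        "xe": "intel-xe-kmd",
    },
}

# Domains where a single build can enable only one KMD at a time.
SINGLE_KMD_DOMAINS = {"Media"}


def options_for(domain, kmds):
    """Return the sorted build options needed to support `kmds` in `domain`."""
    wanted = set(kmds)
    unknown = wanted - set(KMDS)
    if unknown:
        raise ValueError(f"unknown KMDs: {sorted(unknown)}")
    if domain in SINGLE_KMD_DOMAINS and len(wanted) > 1:
        raise ValueError(f"{domain} supports only one KMD per build")
    options = [BUILD_OPTIONS[domain][kmd] for kmd in wanted]
    return sorted(opt for opt in options if opt is not None)
```

Requesting all three KMDs for `"Media"` raises `ValueError`, which reflects why a single workload image cannot currently cover every KMD.
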
### Issues with media workloads on multi-GPU setups
OneVPL media API, 3D and compute APIs provide device discovery