kubernetes-sigs · k8s-ci-robot · Jun 5, 2024 · May 13, 2024
diff --git a/dra-evolution/README.md b/dra-evolution/README.md
@@ -1,9 +1,13 @@
-# k8srm-prototype
+# dra-evolution
 
-For more background, please see this document, though it is not yet up to date
-with the latest in this repo:
-- [Revisiting Kubernetes Resource
-  Model](https://docs.google.com/document/d/1Xy8HpGATxgA2S5tuFWNtaarw5KT8D2mj1F4AP1wg6dM/edit?usp=sharing).
+The [k8srm-prototype](../k8srm-prototype/README.md) is an attempt to derive a
+new API for device management from scratch. The API in this directory is taking
+the opposite approach: it incorporates ideas from the prototype into the 1.30
+DRA API. For some problems it picks a different approach than the prototype.
+To compare YAML files between the two directories, something like this can be used:
+```
+diff -C2 ../k8srm-prototype/testdata/classes.yaml <(sed -e 's;resource.k8s.io/v1alpha2;devmgmtproto.k8s.io/v1alpha1;' -e 's/ResourceClass/DeviceClass/' testdata/classes.yaml)
+```
 
 ## Overall Model
 
@@ -34,124 +38,28 @@ projects.
 ## Open Questions
 
 The next few sections of this document describe a proposed model. Note that this
-is really a brainstorming exercise and under active development. See the [open
-questions](open-questions.md) document for some of the still under discussion
-items.
-
-We are also looking at how we might extend the existing 1.30 DRA model with some
-of these ideas, rather than changing it out for these specific types.
+is really a brainstorming exercise and under active development.
 
 ## Pod Spec
 
 This prototype changes the `PodSpec` a little from how it is in DRA in 1.30.
 
-In 1.30, the `PodSpec` has a list of named sources. The sources are structs that
+As in 1.30, the `PodSpec` has a list of named sources. The sources are structs that
 could contain either a claim name or a template name. The names are used to
-associate individual claims with containers. The example below allocates a
-single "foozer" device to the container in the pod.
-
-```yaml
-apiVersion: resource.k8s.io/v1alpha1
-kind: ResourceClaimTemplate
-metadata:
-  name: foozer
-  namespace: default
-spec:
-  spec:
-    resourceClassName: example.com-foozer
----
-apiVersion: v1
-kind: Pod
-metadata:
-  name: foozer
-  namespace: default
-spec:
-  containers:
-  - image: registry.k8s.io/pause:3.6
-    name: my-container
-    resources:
-      requests:
-        cpu: 10m
-        memory: 10Mi
-      claims:
-      - name: gpu
-  resourceClaims:
-  - name: gpu
-    source:
-      resourceClaimTemplate: foozer
-```
+associate individual claims with containers.
 
-In the prototype model, we are adding `matchAttributes` constraints to control
-consistency within a selection of devices. In particular, we want to be able to
-specify a `matchAttributes` constraint across two separate named sources, so
-that we can ensure for example, a GPU chosen for one container is the same model
-as one chosen for another container. This would imply we need `matchAttributes`
-that apply across the list present in `PodSpec`. However, we don't want to put
-things like `matchAttributes` into `PodSpec`, since it is already `v1`.
-
-So, we tweak the `PodSpec` a bit from 1.30, such that, instead of a list of
-named sources, with each source being a oneOf, we instead have a single
-`DeviceClaims` oneOf in the `PodSpec`. This oneOf could be:
-- A list of named sources, where sources are limited to a simple "class" name
-  (ie, not a list of oneOfs, just a list of simple structs).
-- A template struct, which consists of ObjectMeta + a claim name.
-- A claim name.
-
-Additionally we move the container association from
-`spec.containers[*].resources.claims` to `spec.containers[*].devices`.
-
-The first form of the of the `DeviceClaims` oneOf allows for our simplest of use
-cases to be very simple to express, without creating a secondary object to which
-we must then refer. So, the equivalent of the 1.30 YAML above would be:
-
-```yaml
-apiVersion: v1
-kind: Pod
-metadata:
-  name: foozer
-  namespace: default
-spec:
-  containers:
-  - image: registry.k8s.io/pause:3.6
-    name: my-container
-    resources:
-      requests:
-        cpu: 10m
-        memory: 10Mi
-    devices:
-    - name: gpu
-  deviceClaims:
-    devices:
-    - name: gpu
-      class: example.com-foozer
-```
+Each claim may contain multiple requests for different devices. Containers can
+also be associated with individual requests inside a claim.
+
+Allocating multiple devices per claim allows specifying constraints for a set
+of devices, like "some attribute has to be the same". Long-term, it would be
+good to allow such constraints across claims as well when a pod references
+more than one, but that would require extending the `PodSpec` with complex
+fields whose shape is not yet settled. Therefore these constraints are
+currently limited to the devices within a single claim. This limitation may
+be removed once constraints are stable enough to be included in the `PodSpec`.
 
-Each entry in `spec.deviceClaims.devices` is just a name/class pair, but in fact
-serves as a template to generate claims that exist with the lifecycle of the
-pod. We may want to add `ObjectMeta` here as well, since it is behaving as a
-template, to allow setting labels, etc.
-
-The second form of `DeviceClaims` is a single struct with an ObjectMeta, and a
-claim name. The key with this form is that it is not *list* of named objects.
-Instead, it is a reference to a single claim object, and the named entries are
-*inside* the referenced object. This is to avoid a two-key mount in the
-`spec.containers[*].devices` entry. If that's not important, then we can tweak
-this a bit. In any case, this form allows claims which follow the lifecycle of
-the pod, similar to the first form. Since a top-level API claim spec can can
-contain multiple claim instances, this should be equally as expressive as if we
-included `matchAttributes` in the `PodSpec`, without having to do so.
-
-The third form of `DeviceClaims` is just a string; it is a claim name and allows
-the user to share a pre-provisioned claim between pods.
-
-Given that the first and second forms both have a template-like structure, we
-may want to combine them and use two-key indexing in the mounts. If we do so, we
-still want the direct specification of the class, so that the most common case
-does not need separate object just to reference a class.
-
-These `PodSpec` Go types can be seen in [podspec.go](testdata/podspec.go). This
-is not the complete `PodSpec` but just the relevant parts of the 1.30 and
-proposed versions.
+These `PodSpec` Go types can be seen in [pod_types.go](pkg/api/pod_types.go).
 
 ## Types
 
@@ -162,21 +70,25 @@ claim types.
 
 Claim and allocation types are found in [claim_types.go](pkg/api/claim_types.go);
 individual types and fields are described in detail there in the comments.
+Capacity types are in [capacity_types.go](pkg/api/capacity_types.go). A quota
+mechanism is defined in [quota_types.go](pkg/api/quota_types.go).
 
 Vendors and administrators create `DeviceClass` resources to pre-configure
-various options for claims. DeviceClass resources come in two varieties:
-- Ordinary or "leaf" classes that represent devices managed by a specific
-  driver, along with some optional selection constraints and configuration.
-- "Meta" or "Group" or "Aggregate" or "Composition" classes that use a label
-  selector to identify a *set* of leaf classes. This allows a claim to be
-  satistfied by one of many classes.
+various options for requests in claims. Such a class contains:
+- configuration for a device, potentially including options that
+  only an administrator may set
+- device requirements which select device instances that match the intended
+  semantics of the class ("give me a GPU")
+
+Classes are not necessarily associated with a single vendor. Whether they are
+depends on how the requirements in them are defined.
 
 Example classes are in [classes.yaml](testdata/classes.yaml).
 
 Example pod definitions can be found in the `pod-*.yaml` and `two-pods-*.yaml`
 files in [testdata](testdata).
 
-Drivers publish capacity via `DevicePool` resources. Examples may be found in
+Drivers publish capacity via `ResourcePool` objects. Examples may be found in
 the `pools-*.yaml` files in [testdata](testdata).
 
 ## Building
@@ -188,14 +100,14 @@ capacity data.
 Just run `make`, it will build everything.
 
 ```console
-k8srm-prototype$ make
+dra-evolution$ make
 gofmt -s -w .
 go test ./...
-?   	github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/cmd/mock-apiserver	[no test files]
-?   	github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/cmd/schedule	[no test files]
-?   	github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/api	[no test files]
-?   	github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/gen	[no test files]
-ok  	github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/schedule	(cached)
+?   	github.com/kubernetes-sigs/wg-device-management/dra-evolution/cmd/mock-apiserver	[no test files]
+?   	github.com/kubernetes-sigs/wg-device-management/dra-evolution/cmd/schedule	[no test files]
+?   	github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/api	[no test files]
+?   	github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/gen	[no test files]
+ok  	github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/schedule	(cached)
 cd cmd/schedule && go build
 cd cmd/mock-apiserver && go build
 ```
@@ -207,7 +119,7 @@ and used to try out scheduling (WIP). It will spit out some errors but you can
 ignore them.
 
 ```console
-k8srm-prototype$ ./cmd/mock-apiserver/mock-apiserver
+dra-evolution$ ./cmd/mock-apiserver/mock-apiserver
 W0422 13:20:21.238440 2062725 memorystorage.go:93] type info not known for apiextensions.k8s.io/v1, Kind=CustomResourceDefinition
 W0422 13:20:21.238598 2062725 memorystorage.go:93] type info not known for apiregistration.k8s.io/v1, Kind=APIService
 W0422 13:20:21.238639 2062725 memorystorage.go:267] type info not known for foozer.example.com/v1alpha1, Kind=FoozerConfig
@@ -222,18 +134,18 @@ W0422 13:20:21.238723 2062725 memorystorage.go:267] type info not known for devm
 The included `kubeconfig` will access that server. For example:
 
 ```console
-k8srm-prototype$ kubectl --kubeconfig kubeconfig apply -f testdata/drivers.yaml
+dra-evolution$ kubectl --kubeconfig kubeconfig apply -f testdata/drivers.yaml
 devicedriver.devmgmtproto.k8s.io/example.com-foozer created
 devicedriver.devmgmtproto.k8s.io/example.com-barzer created
 devicedriver.devmgmtproto.k8s.io/sriov-nic created
 devicedriver.devmgmtproto.k8s.io/vlan created
-k8srm-prototype$ kubectl --kubeconfig kubeconfig get devicedrivers
+dra-evolution$ kubectl --kubeconfig kubeconfig get devicedrivers
 NAME                 AGE
 example.com-foozer   2y112d
 example.com-barzer   2y112d
 sriov-nic            2y112d
 vlan                 2y112d
-k8srm-prototype$
+dra-evolution$
 ```
 
 ## `schedule` CLI
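Note on the README change above: the new "Pod Spec" text says that one claim may hold several requests and that a constraint such as "some attribute has to be the same" applies to the set of devices allocated for that claim. The following is a minimal Go sketch of that idea only. The type names (`Device`, `Request`, `Claim`), the `MatchAttributes` field and the `satisfies` helper are hypothetical illustrations and are not taken from `pkg/api`; the real types in this PR may look quite different.

```go
package main

import "fmt"

// Device is a hypothetical stand-in for a device published by a driver.
type Device struct {
	Name       string
	Attributes map[string]string
}

// Request is a hypothetical named request inside a claim.
type Request struct {
	Name  string
	Class string
}

// Claim is a hypothetical claim: several requests plus a list of attribute
// names whose values must be identical across all devices chosen for it.
type Claim struct {
	Requests        []Request
	MatchAttributes []string
}

// satisfies reports whether the chosen devices (keyed by request name) cover
// every request in the claim and agree on each attribute in MatchAttributes.
func satisfies(c Claim, chosen map[string]Device) bool {
	// Every request needs a device.
	for _, r := range c.Requests {
		if _, ok := chosen[r.Name]; !ok {
			return false
		}
	}
	// Every constrained attribute must exist and be equal on all devices.
	for _, attr := range c.MatchAttributes {
		var want string
		first := true
		for _, r := range c.Requests {
			got, ok := chosen[r.Name].Attributes[attr]
			if !ok {
				return false
			}
			if first {
				want, first = got, false
				continue
			}
			if got != want {
				return false
			}
		}
	}
	return true
}

func main() {
	claim := Claim{
		Requests: []Request{
			{Name: "gpu-a", Class: "example.com-foozer"},
			{Name: "gpu-b", Class: "example.com-foozer"},
		},
		MatchAttributes: []string{"model"},
	}

	same := map[string]Device{
		"gpu-a": {Name: "dev-0", Attributes: map[string]string{"model": "x1"}},
		"gpu-b": {Name: "dev-1", Attributes: map[string]string{"model": "x1"}},
	}
	mixed := map[string]Device{
		"gpu-a": {Name: "dev-0", Attributes: map[string]string{"model": "x1"}},
		"gpu-b": {Name: "dev-2", Attributes: map[string]string{"model": "x2"}},
	}

	fmt.Println(satisfies(claim, same))  // true
	fmt.Println(satisfies(claim, mixed)) // false
}
```

Because the constraint lives on the claim and not on the pod, a check like this never needs to look across claims, which matches the limitation the README describes.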

diff --git a/dra-evolution/cmd/gen/main.go b/dra-evolution/cmd/gen/main.go
@@ -5,7 +5,7 @@ import (
 	"fmt"
 	"os"
 
-	"github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/gen"
+	"github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/gen"
 
 	"sigs.k8s.io/yaml"
 )

diff --git a/dra-evolution/cmd/mock-apiserver/main.go b/dra-evolution/cmd/mock-apiserver/main.go
@@ -1,11 +1,12 @@
 package main
 
 import (
+	"log"
+	"sync"
+
 	"k8s.io/apimachinery/pkg/api/meta"
 	"k8s.io/apimachinery/pkg/runtime/schema"
-	"log"
 	"sigs.k8s.io/kubebuilder-declarative-pattern/mockkubeapiserver"
-	"sync"
 )
 
 func main() {
@@ -19,14 +20,13 @@ func main() {
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Namespace"}, "namespaces", meta.RESTScopeRoot)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Secret"}, "secrets", meta.RESTScopeNamespace)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "ConfigMap"}, "configmaps", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Pod"}, "pods", meta.RESTScopeNamespace)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha2", Kind: "Pod"}, "pods", meta.RESTScopeNamespace)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Node"}, "nodes", meta.RESTScopeNamespace)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "foozer.example.com", Version: "v1alpha1", Kind: "FoozerConfig"}, "foozerconfigs", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DeviceDriver"}, "devicedrivers", meta.RESTScopeRoot)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DeviceClass"}, "deviceclasses", meta.RESTScopeRoot)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DeviceClaim"}, "deviceclaims", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DevicePrivilegedClaim"}, "deviceprivilegedclaims", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DevicePool"}, "devicepools", meta.RESTScopeRoot)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha2", Kind: "DeviceClass"}, "deviceclasses", meta.RESTScopeRoot)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha2", Kind: "ResourceClaim"}, "resourceclaims", meta.RESTScopeNamespace)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha1", Kind: "ResourcePool"}, "resourcepools", meta.RESTScopeRoot)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha1", Kind: "ResourcePolicy"}, "resourcepolicies", meta.RESTScopeRoot)
 
 	wg.Add(1)
 	addr, err := k8s.StartServing()

diff --git a/dra-evolution/go.mod b/dra-evolution/go.mod
@@ -1,45 +1,64 @@
-module github.com/kubernetes-sigs/wg-device-management/k8srm-prototype
+module github.com/kubernetes-sigs/wg-device-management/dra-evolution
 
 go 1.22.1
 
 replace github.com/kubernetes-sigs/wg-device-management/nv-partitionable-resources => ../nv-partitionable-resources
 
 require (
 	github.com/NVIDIA/go-nvml v0.12.0-5
-	github.com/google/cel-go v0.20.1
+	github.com/blang/semver/v4 v4.0.0
+	github.com/google/cel-go v0.17.8
 	github.com/kubernetes-sigs/wg-device-management/nv-partitionable-resources v0.0.0-00010101000000-000000000000
 	github.com/stretchr/testify v1.9.0
 	k8s.io/api v0.30.0
 	k8s.io/apimachinery v0.30.0
+	k8s.io/apiserver v0.30.0
+	k8s.io/klog/v2 v2.120.1
+	k8s.io/utils v0.0.0-20240423183400-0849a56e8f22
 	sigs.k8s.io/kubebuilder-declarative-pattern/mockkubeapiserver v0.0.0-20240404191132-83bd9c05741b
 	sigs.k8s.io/yaml v1.4.0
 )
 
 require (
 	github.com/Masterminds/semver v1.5.0 // indirect
 	github.com/NVIDIA/go-nvlib v0.3.0 // indirect
-	github.com/antlr4-go/antlr/v4 v4.13.0 // indirect
+	github.com/antlr/antlr4/runtime/Go/antlr/v4 v4.0.0-20230305170008-8188dc5388df // indirect
 	github.com/davecgh/go-spew v1.1.1 // indirect
+	github.com/emicklei/go-restful/v3 v3.11.0 // indirect
 	github.com/go-logr/logr v1.4.1 // indirect
+	github.com/go-openapi/jsonpointer v0.19.6 // indirect
+	github.com/go-openapi/jsonreference v0.20.2 // indirect
+	github.com/go-openapi/swag v0.22.3 // indirect
 	github.com/gogo/protobuf v1.3.2 // indirect
+	github.com/golang/protobuf v1.5.4 // indirect
+	github.com/google/gnostic-models v0.6.8 // indirect
 	github.com/google/gofuzz v1.2.0 // indirect
 	github.com/google/uuid v1.6.0 // indirect
+	github.com/josharian/intern v1.0.0 // indirect
 	github.com/json-iterator/go v1.1.12 // indirect
+	github.com/mailru/easyjson v0.7.7 // indirect
 	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
 	github.com/modern-go/reflect2 v1.0.2 // indirect
+	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
 	github.com/pmezard/go-difflib v1.0.0 // indirect
 	github.com/stoewer/go-strcase v1.2.0 // indirect
 	golang.org/x/exp v0.0.0-20231110203233-9a3e6036ecaa // indirect
 	golang.org/x/net v0.23.0 // indirect
+	golang.org/x/oauth2 v0.10.0 // indirect
+	golang.org/x/sync v0.6.0 // indirect
+	golang.org/x/sys v0.18.0 // indirect
+	golang.org/x/term v0.18.0 // indirect
 	golang.org/x/text v0.14.0 // indirect
-	google.golang.org/genproto/googleapis/api v0.0.0-20230803162519-f966b187b2e5 // indirect
-	google.golang.org/genproto/googleapis/rpc v0.0.0-20230803162519-f966b187b2e5 // indirect
+	golang.org/x/time v0.3.0 // indirect
+	google.golang.org/appengine v1.6.7 // indirect
+	google.golang.org/genproto/googleapis/api v0.0.0-20230726155614-23370e0ffb3e // indirect
+	google.golang.org/genproto/googleapis/rpc v0.0.0-20230822172742-b8732ec3820d // indirect
 	google.golang.org/protobuf v1.33.0 // indirect
 	gopkg.in/inf.v0 v0.9.1 // indirect
 	gopkg.in/yaml.v2 v2.4.0 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
-	k8s.io/klog/v2 v2.120.1 // indirect
-	k8s.io/utils v0.0.0-20240423183400-0849a56e8f22 // indirect
+	k8s.io/client-go v0.30.0 // indirect
+	k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340 // indirect
 	sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect
 	sigs.k8s.io/structured-merge-diff/v4 v4.4.1 // indirect
 )