DRA API evolution #14

Merged · 71 commits · Jun 5, 2024

Commits:
- a67f632 add "dra-evolution" proposal (pohly, May 13, 2024)
- 1504622 dra_evolution: add quota (pohly, May 13, 2024)
- 7c2d569 dra-evolution: claim status notes (pohly, May 13, 2024)
- e3b570c dra-evolution: move pod to api (pohly, May 14, 2024)
- 0a77223 dra-evolution: add ResourceClaimTemplate (pohly, May 14, 2024)
- bec6652 dra-evolution: add automatic testing of YAML files (pohly, May 14, 2024)
- ab87abd dra-evolution/testdata: split up YAML (pohly, May 14, 2024)
- c2a062c Migrate dra-evolution/testdata/classes.yaml to relevant types (klueska, May 14, 2024)
- 84cc5f1 dra-evolution api: fix container resource claim list (pohly, May 14, 2024)
- 56c74fd dra-evolution: validate CEL expressions (pohly, May 14, 2024)
- dea8ded dra-evolution: fix some CEL expressions (pohly, May 14, 2024)
- 83b43b9 dra-evolution: also validate ResourceClaim[Template] (pohly, May 14, 2024)
- 73af0cf dra-prototype README.md: compare CEL syntax (pohly, May 14, 2024)
- ee69061 dra-evolution: consolidate filter and request types (pohly, May 15, 2024)
- 8024e0c dra-evolution: add device.driverName (pohly, May 15, 2024)
- 6f8518e dra-evolution: update classes.yaml (pohly, May 15, 2024)
- d435a4a dra-evolution README.md: explain how to do a YAML diff (pohly, May 15, 2024)
- 8cbe299 dra-evolution: update pod_types.go with proper json/protobuf tags (klueska, May 15, 2024)
- c34e043 dra-evolution: migrate pod-one-container-one-gpu.yaml to relevant types (klueska, May 15, 2024)
- 7a622ff dra-evolution: device.attributes and device.<type>Attributes (pohly, May 15, 2024)
- 59de778 dra-evolution: clarify expection for 'one device' class (pohly, May 15, 2024)
- f4d03d5 dra-evolution: update claim_types.go with proper json tags (klueska, May 16, 2024)
- 22c7a6e dra-evolution: migrate pod-one-container-two-gpus-*.yaml to relevant … (klueska, May 16, 2024)
- 65d5c49 dra-evolution: add MatchAttributes into ResourceRequestDetail (klueska, May 16, 2024)
- 4e9e3f3 dra-evolution: update examples with new "localized" MatchAttributes (klueska, May 16, 2024)
- f4b4052 dra-evolution: migrate two-pods-one-gpu-*.yaml to relevant types (klueska, May 16, 2024)
- bb2a6d8 dra-evolution: harmonize fields (pohly, May 16, 2024)
- 846e397 dra-evolution: remove ResourceClaimDevice and expand ResourceClaimEntry (klueska, May 16, 2024)
- 7b3aef2 dra-evolution: migrate pod-two-containers-*.yaml to relevant types (klueska, May 16, 2024)
- bfa9359 dra-evolution: migrate pod-one-container-one-gpu-one-vf.yaml to relev… (klueska, May 16, 2024)
- 88f447b dra-evolution: consistently use pcie-root.dra.k8s.io in the examples (klueska, May 16, 2024)
- d3fb9c4 dra-evolution: skip YAML test cases that haven't been converted (pohly, May 16, 2024)
- 4cb8709 dra-evolution: unknown keys are not runtime errors (pohly, May 16, 2024)
- 45d14ba dra-evolution: require fully-qualified attribute names (pohly, May 16, 2024)
- ac9871d dra-evolution: use fully-qualified attribute name (pohly, May 16, 2024)
- aa0d952 dra-evolution: introduce more general request requirements (pohly, May 17, 2024)
- cd5ba42 dra-evolution: add support for expressing shared resources between a … (klueska, May 17, 2024)
- bd8117a dra-evolution: migrate pools-two-nodes-one-dgxa100.yaml to relevant t… (klueska, May 17, 2024)
- 3e5c69f dra-evolution: refine the notion of a ResourceRequirement based on sh… (klueska, May 17, 2024)
- 3b98607 dra-evolution: migrate pod-one-container-shared-split-allocation-gpus… (klueska, May 17, 2024)
- 324e2e4 dra-evolution: add missing IntRange (pohly, May 18, 2024)
- e71290b dra-evolution: introduce "claim requirements" (pohly, May 18, 2024)
- 3ec5c7e dra-evolution: hide empty claim and request options (pohly, May 19, 2024)
- 7703dfa dra-evolution: revise class inheritance (pohly, May 19, 2024)
- b4c48f6 dra-evolution: typo fix (pohly, May 19, 2024)
- 74df07f dra-evolution: harmonize list of vendor configs with other fields (pohly, May 19, 2024)
- adb0b5d dra-evolution: add proper type for intrange (klueska, May 20, 2024)
- c920b2b dra-evolution: revise naming of allocation result fields and structs (pohly, May 21, 2024)
- 210b376 dra-evolution: remove "forClass" claim source (pohly, May 21, 2024)
- 6c887ba dra-evolution: rename class references (pohly, May 21, 2024)
- aecf75c fixup! dra-evolution: remove forClass claim source (pohly, May 21, 2024)
- 819ac75 dra-evolution: use "constraints" and "requirements" (pohly, May 21, 2024)
- 883167e dra-evolution: update stale content in README.md (pohly, May 22, 2024)
- ed651b1 dra-evolution: fix mock api server (pohly, May 22, 2024)
- 1dbb6d2 dra-evolution: remove multi-inheritance of classes (pohly, May 22, 2024)
- 6eab574 dra-evolution: remove "source" nesting via inlining (pohly, May 22, 2024)
- 9120701 dra-evolution: add support for network-attached devices (pohly, May 22, 2024)
- b18ad3b Revert "dra-evolution: add proper type for intrange" (pohly, May 24, 2024)
- 3b324ee Revert "dra-evolution: migrate pod-one-container-shared-split-allocat… (pohly, May 24, 2024)
- 75d1d2e Revert "dra-evolution: refine the notion of a ResourceRequirement bas… (pohly, May 24, 2024)
- ed56e13 Revert "dra-evolution: migrate pools-two-nodes-one-dgxa100.yaml to re… (pohly, May 24, 2024)
- 9e74635 Revert "dra-evolution: add support for expressing shared resources be… (pohly, May 24, 2024)
- ebfa95f dra-evolution: ResourceClass -> DeviceClass (pohly, May 24, 2024)
- 7aeb3c4 dra-evolution: revise ResourcePool (pohly, May 28, 2024)
- 82be70e DRA: simplified proposal (pohly, May 31, 2024)
- 308b670 dra-prototype: review feedback (pohly, Jun 3, 2024)
- 13eace9 dra-prototype: use driver names which follow the recommended naming p… (pohly, Jun 3, 2024)
- 3ed68fa dra-evolution: simplify referencing the node in allocation result (pohly, Jun 4, 2024)
- dfd9ec8 dra-evolution: review feedback (pohly, Jun 4, 2024)
- 49e124c dra-evolution: update README.md (pohly, Jun 4, 2024)
- 72c1f7c dra-evolution: typo fix (pohly, Jun 5, 2024)
dra-evolution/README.md (176 changes: 44 additions & 132 deletions)

````diff
@@ -1,9 +1,13 @@
-# k8srm-prototype
+# dra-evolution

-For more background, please see this document, though it is not yet up to date
-with the latest in this repo:
-- [Revisiting Kubernetes Resource
-  Model](https://docs.google.com/document/d/1Xy8HpGATxgA2S5tuFWNtaarw5KT8D2mj1F4AP1wg6dM/edit?usp=sharing).
+The [k8srm-prototype](../k8srm-prototype/README.md) is an attempt to derive a
+new API for device management from scratch. The API in this directory is taking
+the opposite approach: it incorporates ideas from the prototype into the 1.30
+DRA API. For some problems it picks a different approach.
+
+To compare YAML files, something like this can be used:
+```
+diff -C2 ../k8srm-prototype/testdata/classes.yaml <(sed -e 's;resource.k8s.io/v1alpha2;devmgmtproto.k8s.io/v1alpha1;' -e 's/ResourceClass/DeviceClass/' testdata/classes.yaml)
+```

 ## Overall Model
````
````diff
@@ -34,124 +38,28 @@ projects.

 ## Open Questions

 The next few sections of this document describe a proposed model. Note that this
-is really a brainstorming exercise and under active development. See the [open
-questions](open-questions.md) document for some of the still under discussion
-items.
-
-We are also looking at how we might extend the existing 1.30 DRA model with some
-of these ideas, rather than changing it out for these specific types.
+is really a brainstorming exercise and under active development.

 ## Pod Spec

-This prototype changes the `PodSpec` a little from how it is in DRA in 1.30.
-
-In 1.30, the `PodSpec` has a list of named sources. The sources are structs that
+As in 1.30, the `PodSpec` has a list of named sources. The sources are structs that
 could contain either a claim name or a template name. The names are used to
-associate individual claims with containers. The example below allocates a
-single "foozer" device to the container in the pod.
-
-```yaml
-apiVersion: resource.k8s.io/v1alpha1
-kind: ResourceClaimTemplate
-metadata:
-  name: foozer
-  namespace: default
-spec:
-  spec:
-    resourceClassName: example.com-foozer
----
-apiVersion: v1
-kind: Pod
-metadata:
-  name: foozer
-  namespace: default
-spec:
-  containers:
-  - image: registry.k8s.io/pause:3.6
-    name: my-container
-    resources:
-      requests:
-        cpu: 10m
-        memory: 10Mi
-      claims:
-      - name: gpu
-  resourceClaims:
-  - name: gpu
-    source:
-      resourceClaimTemplate: foozer
-```
+associate individual claims with containers.

-In the prototype model, we are adding `matchAttributes` constraints to control
-consistency within a selection of devices. In particular, we want to be able to
-specify a `matchAttributes` constraint across two separate named sources, so
-that we can ensure, for example, that a GPU chosen for one container is the same
-model as one chosen for another container. This would imply we need
-`matchAttributes` that apply across the list present in `PodSpec`. However, we
-don't want to put things like `matchAttributes` into `PodSpec`, since it is
-already `v1`.
-
-So, we tweak the `PodSpec` a bit from 1.30, such that, instead of a list of
-named sources, with each source being a oneOf, we instead have a single
-`DeviceClaims` oneOf in the `PodSpec`. This oneOf could be:
-- A list of named sources, where sources are limited to a simple "class" name
-  (i.e., not a list of oneOfs, just a list of simple structs).
-- A template struct, which consists of ObjectMeta + a claim name.
-- A claim name.
-
-Additionally we move the container association from
-`spec.containers[*].resources.claims` to `spec.containers[*].devices`.
-
-The first form of the `DeviceClaims` oneOf allows for our simplest use
-cases to be very simple to express, without creating a secondary object to which
-we must then refer. So, the equivalent of the 1.30 YAML above would be:
-
-```yaml
-apiVersion: v1
-kind: Pod
-metadata:
-  name: foozer
-  namespace: default
-spec:
-  containers:
-  - image: registry.k8s.io/pause:3.6
-    name: my-container
-    resources:
-      requests:
-        cpu: 10m
-        memory: 10Mi
-    devices:
-    - name: gpu
-  deviceClaims:
-    devices:
-    - name: gpu
-      class: example.com-foozer
-```
+Each claim may contain multiple requests for different devices. Containers can
+also be associated with individual requests inside a claim.
+
+Allocating multiple devices per claim allows specifying constraints for a set
+of devices, like "some attribute has to be the same". Long-term, it would be
+good to allow such constraints also across claims when a pod references more
+than one, but that would imply extending the `PodSpec` with complex fields
+where we are not sure yet what they need to look like. Therefore these
+constraints are currently limited to claims. This limitation may be
+removed once constraints are stable enough to be included in the `PodSpec`.

-Each entry in `spec.deviceClaims.devices` is just a name/class pair, but in fact
-serves as a template to generate claims that exist with the lifecycle of the
-pod. We may want to add `ObjectMeta` here as well, since it is behaving as a
-template, to allow setting labels, etc.
-
-The second form of `DeviceClaims` is a single struct with an ObjectMeta and a
-claim name. The key with this form is that it is not a *list* of named objects.
-Instead, it is a reference to a single claim object, and the named entries are
-*inside* the referenced object. This is to avoid a two-key mount in the
-`spec.containers[*].devices` entry. If that's not important, then we can tweak
-this a bit. In any case, this form allows claims which follow the lifecycle of
-the pod, similar to the first form. Since a top-level API claim spec can
-contain multiple claim instances, this should be equally as expressive as if we
-included `matchAttributes` in the `PodSpec`, without having to do so.
-
-The third form of `DeviceClaims` is just a string; it is a claim name and allows
-the user to share a pre-provisioned claim between pods.
-
-Given that the first and second forms both have a template-like structure, we
-may want to combine them and use two-key indexing in the mounts. If we do so, we
-still want the direct specification of the class, so that the most common case
-does not need a separate object just to reference a class.
-
-These `PodSpec` Go types can be seen in [podspec.go](testdata/podspec.go). This
-is not the complete `PodSpec` but just the relevant parts of the 1.30 and
-proposed versions.
+These `PodSpec` Go types can be seen in [pod_types.go](pkg/api/pod_types.go).
````
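To make the request/constraint model added above concrete, here is a sketch of a claim with two requests and a cross-request constraint, plus a pod that ties each container to one request. This is illustrative only: the field names (`requests`, `constraints`, `matchAttributes`, the per-container `request` reference) are assumptions meant to mirror the Go types in this proposal, and `gpu.example.com/model` is an invented attribute.

```yaml
# Illustrative sketch only. Field names are assumptions mirroring the
# proposal's claim types; gpu.example.com/model is an invented attribute.
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaim
metadata:
  name: two-gpus
  namespace: default
spec:
  requests:                      # one claim, multiple requests
  - name: gpu-a
    deviceClassName: example.com-foozer
  - name: gpu-b
    deviceClassName: example.com-foozer
  constraints:                   # scoped to this claim, not the whole PodSpec
  - matchAttributes:
    - gpu.example.com/model      # both devices must report the same model
---
apiVersion: v1
kind: Pod
metadata:
  name: foozer
  namespace: default
spec:
  containers:
  - name: ctr-a
    image: registry.k8s.io/pause:3.6
    resources:
      claims:
      - name: gpus
        request: gpu-a           # container tied to one request in the claim
  - name: ctr-b
    image: registry.k8s.io/pause:3.6
    resources:
      claims:
      - name: gpus
        request: gpu-b
  resourceClaims:
  - name: gpus
    resourceClaimName: two-gpus
```

Note that the constraint lives inside the claim, matching the limitation described above: it applies across the requests of this one claim, not across claims in the `PodSpec`.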

## Types

````diff
@@ -162,21 +70,25 @@ claim types.

 Claim and allocation types are found in [claim_types.go](pkg/api/claim_types.go);
 individual types and fields are described in detail there in the comments.
+Capacity types are in [capacity_types.go](pkg/api/capacity_types.go). A quota
+mechanism is defined in [quota_types.go](pkg/api/quota_types.go).

 Vendors and administrators create `DeviceClass` resources to pre-configure
-various options for claims. DeviceClass resources come in two varieties:
-- Ordinary or "leaf" classes that represent devices managed by a specific
-  driver, along with some optional selection constraints and configuration.
-- "Meta" or "Group" or "Aggregate" or "Composition" classes that use a label
-  selector to identify a *set* of leaf classes. This allows a claim to be
-  satisfied by one of many classes.
+various options for requests in claims. Such a class contains:
+- configuration for a device, potentially including options that
+  only an administrator may set
+- device requirements which select device instances that match the intended
+  semantic of the class ("give me a GPU")

 Classes are not necessarily associated with a single vendor. Whether they are
 depends on how the requirements in them are defined.
````

**Review thread on the class paragraph above:**
**Reviewer:**

Talking to people, I think we should pin down what a class is actually for.

Is it for vendors to expose all of their devices, e.g. class "nvidia-gpu" sets the driver "nvidia.com/gpu" and that's it? That seems pointless.

Is it for vendors to expose each individual device, e.g. classes "nvidia-a100", "nvidia-h100", etc. each set a specific constraint? That seems heavy-handed.

Is it for Kubernetes to try to equivocate, e.g. class "k8s-gpu" allows NVIDIA, Intel, or AMD devices? That seems unlikely to work.

Is it for cluster admins to define a qualitative name and indirection (e.g. class "good-gpu" is a100 on one cluster and h100 on another)? Is that realistic?

**Contributor:**

In "classic" DRA, classes were a lot more meaningful. Since every claim had to point to a class, and every class had to point to a driver, classes provided an admin with a way to restrict what resources a claim could grab hold of in a given environment. Vendors could design their class parameters to define what restrictions were possible (e.g. disallow sharing), and their vendor-specific controller could enforce these restrictions at allocation time.

In the new world of "semantic" models, the purpose of a class is a little less clear, since the allocation policy is not vendor-specific anymore. In the latest iteration its purpose has become even smaller, given that claims no longer have to point to a class, and classes no longer have to point to a driver.

In the latest iteration, classes have basically become a way of creating a set of predefined constraints / requirements as a convenience, so that users don't have to write it all out themselves when defining their claims.

In principle, I'm not opposed to providing such conveniences -- but the abstraction it provides no longer feels like a ResourceClass in the sense that it did back when DRA was first formulated. Where things get most weird to me, though, is when we start talking about having a proliferation of ResourceClasses defined this way -- each of which encapsulates one small constraint or a single resource requirement -- and then providing a means for a claim to "inherit" from multiple such classes to build out its full set of requirements.

If this is the true purpose of the ResourceClass now, then it almost feels like it would be better to do away with the concept of "resource classes" altogether and instead define a different top-level object which simply contains a list of named CEL expressions. One could then reference this object and one of its named expressions from appropriate places inside the claim.

There would be far fewer objects, the intent would be clearer, and the messy notion of "inheritance" would not be built into the model anywhere.

**Reviewer:**

> has become even less given that claims no longer have to point to a class

I'm not sure how I feel about this, still.

> providing a means for a claim to "inherit" from multiple such classes

Yeah, this doesn't excite me.

> If this is the true purpose of the ResourceClass now, then it almost feels like it would be better to do away with the concept of "resource classes" altogether

I'm soundly of a "less is more" mindset right now. Classes originated with StorageClass, whose goal was expressly to give cluster admins a way to interpose between requests for storage and the implementations of the volumes, so that workloads could plausibly port between clusters. This was predicated on the idea that the API (mounted filesystems) was roughly identical between providers (true for block and FS volumes, a little less true for NAS volumes). I don't see that same property for GPUs any time soon, though it may be true for other kinds of devices (of which we have a shortage of real examples).

The only real properties of ResourceClass left are:

- equivalence (class of classes, which has never come up for volumes, but a cluster of heterogeneous nodes SEEMS plausible)
- giving admins a way to express the bounding set of devices (IFF all claims must use classes)

Are these valuable enough?

**Contributor Author (@pohly, May 22, 2024):**

There is a third reason for classes:

- set configuration options that non-privileged users are not allowed to set

I don't know whether that is a strong enough reason to have them. But if we add a resourceClassName field later, it would be ignored by an older scheduler, so adding it later is going to be harder.

> equivalence (class of classes, which has never come up for volumes, but a cluster of heterogeneous nodes SEEMS plausible)

I'm not sure I understand what is meant by this. Is this a weaker form of a vendor-independent "give me a GPU" class?

> giving admins a way to express the bounding set of devices (IFF all claims must use classes)

My thinking was that this could be enforced with mutating webhooks, i.e. no need to have it in core Kubernetes. I might be wrong and it really has to be a mandatory field, because a webhook would have to interpret the request to determine which device is meant, which might not be possible.

I'm not a fan of making classes mandatory. If we did this and then had "restricted" classes and "less restricted" classes in the cluster, we'd need access control for the classes. IMHO a better solution is to implement a quota mechanism that can be used to prevent the usage of certain devices per namespace, instead of doing it indirectly through classes.

**Reviewer:**

> set configuration options that non-privileged users are not allowed to set

How do we govern that? Is it a different CRD type for class config vs claim config? I buy the use case, but I wonder how important it is REALLY? I know you have thought more about non-GPU devices than the rest of us...

> I'm not sure I understand what is meant by this. Is this a weaker form of a vendor-independent "give me a GPU" class?

Exactly. I, the cluster/platform admin, declare that NVIDIA and AMD are "equivalent enough". Or more realistically, that two network or storage or something-else device providers are equivalent enough.

> IMHO a better solution is to implement a quota mechanism that can be used to prevent the usage of certain devices per namespace

I don't think I see how that can work in a reasonable way, but I have not gotten to your quota parts yet.

**Contributor Author:**

Here's another, simpler alternative that is more in line with John's proposal:

- We use DeviceClass instead of ResourceClass.
- DeviceClass only has fields related to requests:
  - per-device configuration
  - device requirements
- DeviceClasses can be referenced in a claim request, but not in the claim itself.
- Using a class remains optional.

Much easier to understand and explain. It's also more limited. There's no way to pre-define constraints between multiple devices. Users have to put those into their claims. I'm fine with this.

**Contributor Author:**

Because classes are optional, we need a different mechanism to limit which devices are accessible within which namespace. See https://github.com/kubernetes-sigs/wg-device-management/pull/14/files#r1613095446 for a proposal.

**Contributor Author:**

Even if classes were mandatory, we would still need an access control mechanism. I prefer to decouple "device access control" from "privileged device configuration access control" (aka access to classes). They seem orthogonal to me.

**Reviewer:**

Recapping from a doc thread:

If classes are how we do quota, then they can't be optional.

If classes are not how we do quota, then a) we need a way to do quota; and b) I'm not sure classes are really valuable.

**Contributor Author:**

> a) we need a way to do quota

Agreed. We also need to do it at allocation time, to accurately reflect what the user actually is getting.

If we were to base quota on classes, the classes would have to be accurate: if a class is "a 10GiB GPU or larger", then users can game the system by using that class and hoping that they get something bigger. Therefore I believe that quota should be based on device attributes, not classes.

> b) I'm not sure classes are really valuable

The current proposal (DeviceClass) seems useful to me, but I am also okay with dropping them as long as we design the API so that we can add them back safely through a separate KEP.

We could "reserve" fields for future use:

```go
type ResourceClaimSpec struct {
	// At the moment, ClaimClassRef is always nil. But clients *must* check
	// that it really is nil and error out with "unsupported class reference"
	// if it is non-nil, because a future extension might populate this field.
	*ClaimClassRef `json:",inline"`
}

// A one-of, always empty at the moment and thus not usable yet.
type ClaimClassRef struct {
	// FUTURE EXTENSION(s):
	// ClaimClassName *string `json:"claimClassName"`
}
```

Same for a request, instead of defining a deviceClassName. Note that although the struct is now called ClaimClassRef, the actual field name is only going to be defined in the future, and even the struct can be renamed. None of this will appear in the current user API; it's just Go code.

We should only do this judiciously. It puts a burden on clients which might never pay off. But it is a way for us to punt on implementing something now that we want (or might want) to add in the future.

The error message in current clients needs to know what the field is for. We could even avoid that with:

```go
type ResourceClaimSpec struct {
	// Clients must check for nil. If non-nil, they must error out with
	// "unsupported field: <FooField.FieldName>".
	*FooField `json:",inline"`
}

// FieldName must be set plus exactly one other field.
type FooField struct {
	FieldName string

	// FUTURE EXTENSION(s):
	// ClaimClassName *string `json:"claimClassName"`
}
```

But that would make the future API weird: even if we set the FieldName in the API server as a default, it would be user-visible.
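To illustrate the attribute-based quota idea raised in this thread, a namespace-scoped policy might cap devices by what is actually allocated, selected via device attributes rather than classes. Everything below is hypothetical: the actual proposal lives in quota_types.go, and the kind, group, field names, and the pseudo-CEL expression here are invented for illustration.

```yaml
# Hypothetical illustration only; see quota_types.go for the real proposal.
# Kind, field names, and the selector expression are invented.
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceQuotaPolicy
metadata:
  name: limit-large-gpus
  namespace: team-a
spec:
  limits:
    # Pseudo-CEL over device attributes, checked at allocation time so quota
    # reflects what the user actually gets, not what a class promised.
  - deviceSelector: device.attributes["gpu.example.com/memory"] >= quantity("40Gi")
    maximumDeviceCount: 2    # at most two such devices allocated per namespace
```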

````diff
 Example classes are in [classes.yaml](testdata/classes.yaml).

 Example pod definitions can be found in the `pod-*.yaml` and `two-pods-*.yaml`
 files in [testdata](testdata).

-Drivers publish capacity via `DevicePool` resources. Examples may be found in
+Drivers publish capacity via `ResourcePool` objects. Examples may be found in
 the `pools-*.yaml` files in [testdata](testdata).
````
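For reference, a class along the lines described above might look roughly like the sketch below. The shape is an assumption: the authoritative fields are defined by the Go types and testdata/classes.yaml, and the `requirements`/`config` names plus the CEL expression are illustrative only (`device.driverName` is an attribute this PR introduces; `FoozerConfig` is the example vendor config registered in the mock apiserver).

```yaml
# Sketch of a DeviceClass per the description above; field names are
# assumptions, not the authoritative API (see classes.yaml for real examples).
apiVersion: resource.k8s.io/v1alpha2
kind: DeviceClass
metadata:
  name: example.com-foozer
spec:
  requirements:
    # "give me a foozer device": selects matching device instances
  - deviceSelector: device.driverName == "foozer.example.com"
  config:
    # admin-controlled configuration that claim authors cannot override
  - opaque:
      driverName: foozer.example.com
      parameters:
        apiVersion: foozer.example.com/v1alpha1
        kind: FoozerConfig
        mode: secure
```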

## Building
````diff
@@ -188,14 +100,14 @@ capacity data.
 Just run `make`, it will build everything.

 ```console
-k8srm-prototype$ make
+dra-evolution$ make
 gofmt -s -w .
 go test ./...
-?       github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/cmd/mock-apiserver [no test files]
-?       github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/cmd/schedule [no test files]
-?       github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/api [no test files]
-?       github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/gen [no test files]
-ok      github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/schedule (cached)
+?       github.com/kubernetes-sigs/wg-device-management/dra-evolution/cmd/mock-apiserver [no test files]
+?       github.com/kubernetes-sigs/wg-device-management/dra-evolution/cmd/schedule [no test files]
+?       github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/api [no test files]
+?       github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/gen [no test files]
+ok      github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/schedule (cached)
 cd cmd/schedule && go build
 cd cmd/mock-apiserver && go build
 ```
````
````diff
@@ -207,7 +119,7 @@ and used to try out scheduling (WIP). It will spit out some errors but you can
 ignore them.

 ```console
-k8srm-prototype$ ./cmd/mock-apiserver/mock-apiserver
+dra-evolution$ ./cmd/mock-apiserver/mock-apiserver
 W0422 13:20:21.238440 2062725 memorystorage.go:93] type info not known for apiextensions.k8s.io/v1, Kind=CustomResourceDefinition
 W0422 13:20:21.238598 2062725 memorystorage.go:93] type info not known for apiregistration.k8s.io/v1, Kind=APIService
 W0422 13:20:21.238639 2062725 memorystorage.go:267] type info not known for foozer.example.com/v1alpha1, Kind=FoozerConfig
@@ -222,18 +134,18 @@ W0422 13:20:21.238723 2062725 memorystorage.go:267] type info not known for devm
 The included `kubeconfig` will access that server. For example:

 ```console
-k8srm-prototype$ kubectl --kubeconfig kubeconfig apply -f testdata/drivers.yaml
+dra-evolution$ kubectl --kubeconfig kubeconfig apply -f testdata/drivers.yaml
 devicedriver.devmgmtproto.k8s.io/example.com-foozer created
 devicedriver.devmgmtproto.k8s.io/example.com-barzer created
 devicedriver.devmgmtproto.k8s.io/sriov-nic created
 devicedriver.devmgmtproto.k8s.io/vlan created
-k8srm-prototype$ kubectl --kubeconfig kubeconfig get devicedrivers
+dra-evolution$ kubectl --kubeconfig kubeconfig get devicedrivers
 NAME                 AGE
 example.com-foozer   2y112d
 example.com-barzer   2y112d
 sriov-nic            2y112d
 vlan                 2y112d
-k8srm-prototype$
+dra-evolution$
 ```

 ## `schedule` CLI
````
dra-evolution/cmd/gen/main.go (2 changes: 1 addition & 1 deletion)

````diff
@@ -5,7 +5,7 @@ import (
 	"fmt"
 	"os"

-	"github.com/kubernetes-sigs/wg-device-management/k8srm-prototype/pkg/gen"
+	"github.com/kubernetes-sigs/wg-device-management/dra-evolution/pkg/gen"

 	"sigs.k8s.io/yaml"
 )
````
dra-evolution/cmd/mock-apiserver/main.go (16 changes: 8 additions & 8 deletions)

````diff
@@ -1,11 +1,12 @@
 package main

 import (
+	"log"
+	"sync"
+
 	"k8s.io/apimachinery/pkg/api/meta"
 	"k8s.io/apimachinery/pkg/runtime/schema"
-	"log"
 	"sigs.k8s.io/kubebuilder-declarative-pattern/mockkubeapiserver"
-	"sync"
 )

 func main() {
@@ -19,14 +20,13 @@ func main() {
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Namespace"}, "namespaces", meta.RESTScopeRoot)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Secret"}, "secrets", meta.RESTScopeNamespace)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "ConfigMap"}, "configmaps", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Pod"}, "pods", meta.RESTScopeNamespace)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha2", Kind: "Pod"}, "pods", meta.RESTScopeNamespace)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Node"}, "nodes", meta.RESTScopeNamespace)
 	k8s.RegisterType(schema.GroupVersionKind{Group: "foozer.example.com", Version: "v1alpha1", Kind: "FoozerConfig"}, "foozerconfigs", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DeviceDriver"}, "devicedrivers", meta.RESTScopeRoot)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DeviceClass"}, "deviceclasses", meta.RESTScopeRoot)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DeviceClaim"}, "deviceclaims", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DevicePrivilegedClaim"}, "deviceprivilegedclaims", meta.RESTScopeNamespace)
-	k8s.RegisterType(schema.GroupVersionKind{Group: "devmgmtproto.k8s.io", Version: "v1alpha1", Kind: "DevicePool"}, "devicepools", meta.RESTScopeRoot)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha2", Kind: "DeviceClass"}, "deviceclasses", meta.RESTScopeRoot)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha2", Kind: "ResourceClaim"}, "resourceclaims", meta.RESTScopeNamespace)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha1", Kind: "ResourcePool"}, "resourcepools", meta.RESTScopeRoot)
+	k8s.RegisterType(schema.GroupVersionKind{Group: "resource.k8s.io", Version: "v1alpha1", Kind: "ResourcePolicy"}, "resourcepolicies", meta.RESTScopeRoot)

 	wg.Add(1)
 	addr, err := k8s.StartServing()
````
dra-evolution/go.mod (33 changes: 26 additions & 7 deletions)

````diff
-module github.com/kubernetes-sigs/wg-device-management/k8srm-prototype
+module github.com/kubernetes-sigs/wg-device-management/dra-evolution

 go 1.22.1

+replace github.com/kubernetes-sigs/wg-device-management/nv-partitionable-resources => ../nv-partitionable-resources
+
 require (
+	github.com/NVIDIA/go-nvml v0.12.0-5
-	github.com/google/cel-go v0.20.1
+	github.com/blang/semver/v4 v4.0.0
+	github.com/google/cel-go v0.17.8
+	github.com/kubernetes-sigs/wg-device-management/nv-partitionable-resources v0.0.0-00010101000000-000000000000
 	github.com/stretchr/testify v1.9.0
+	k8s.io/api v0.30.0
 	k8s.io/apimachinery v0.30.0
+	k8s.io/apiserver v0.30.0
+	k8s.io/klog/v2 v2.120.1
+	k8s.io/utils v0.0.0-20240423183400-0849a56e8f22
 	sigs.k8s.io/kubebuilder-declarative-pattern/mockkubeapiserver v0.0.0-20240404191132-83bd9c05741b
 	sigs.k8s.io/yaml v1.4.0
 )

 require (
+	github.com/Masterminds/semver v1.5.0 // indirect
+	github.com/NVIDIA/go-nvlib v0.3.0 // indirect
-	github.com/antlr4-go/antlr/v4 v4.13.0 // indirect
+	github.com/antlr/antlr4/runtime/Go/antlr/v4 v4.0.0-20230305170008-8188dc5388df // indirect
 	github.com/davecgh/go-spew v1.1.1 // indirect
+	github.com/emicklei/go-restful/v3 v3.11.0 // indirect
 	github.com/go-logr/logr v1.4.1 // indirect
+	github.com/go-openapi/jsonpointer v0.19.6 // indirect
+	github.com/go-openapi/jsonreference v0.20.2 // indirect
+	github.com/go-openapi/swag v0.22.3 // indirect
 	github.com/gogo/protobuf v1.3.2 // indirect
 	github.com/golang/protobuf v1.5.4 // indirect
+	github.com/google/gnostic-models v0.6.8 // indirect
 	github.com/google/gofuzz v1.2.0 // indirect
 	github.com/google/uuid v1.6.0 // indirect
+	github.com/josharian/intern v1.0.0 // indirect
 	github.com/json-iterator/go v1.1.12 // indirect
+	github.com/mailru/easyjson v0.7.7 // indirect
 	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
 	github.com/modern-go/reflect2 v1.0.2 // indirect
+	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
 	github.com/pmezard/go-difflib v1.0.0 // indirect
 	github.com/stoewer/go-strcase v1.2.0 // indirect
 	golang.org/x/exp v0.0.0-20231110203233-9a3e6036ecaa // indirect
 	golang.org/x/net v0.23.0 // indirect
 	golang.org/x/oauth2 v0.10.0 // indirect
 	golang.org/x/sync v0.6.0 // indirect
 	golang.org/x/sys v0.18.0 // indirect
 	golang.org/x/term v0.18.0 // indirect
 	golang.org/x/text v0.14.0 // indirect
-	google.golang.org/genproto/googleapis/api v0.0.0-20230803162519-f966b187b2e5 // indirect
-	google.golang.org/genproto/googleapis/rpc v0.0.0-20230803162519-f966b187b2e5 // indirect
+	golang.org/x/time v0.3.0 // indirect
+	google.golang.org/appengine v1.6.7 // indirect
+	google.golang.org/genproto/googleapis/api v0.0.0-20230726155614-23370e0ffb3e // indirect
+	google.golang.org/genproto/googleapis/rpc v0.0.0-20230822172742-b8732ec3820d // indirect
 	google.golang.org/protobuf v1.33.0 // indirect
 	gopkg.in/inf.v0 v0.9.1 // indirect
 	gopkg.in/yaml.v2 v2.4.0 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
-	k8s.io/klog/v2 v2.120.1 // indirect
-	k8s.io/utils v0.0.0-20240423183400-0849a56e8f22 // indirect
+	k8s.io/client-go v0.30.0 // indirect
+	k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340 // indirect
 	sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect
 	sigs.k8s.io/structured-merge-diff/v4 v4.4.1 // indirect
 )
````