Updating requiredResources in Application Management API #280
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -961,6 +961,231 @@ | |
type: integer | ||
description: Number of GPUs | ||
|
||
Flavor: | ||
type: string | ||
description: | | ||
Preset configuration for compute, memory, GPU, | ||
and storage capacity (e.g. A1.2C4M.GPU8G, A1.2C4M.GPU16G, A1.4C8M, ...). | ||
example: A1.2C2M.GPU8G | ||
|
||
NodePools: | ||
Should be
Good catch! Changed. |
||
description: | | ||
In general, an issue I see with this approach is that it offers the Application Developer too many ways to ask for the compute it needs. Developers can provide so many possible combinations in the API that the platform has to work out where it can serve such a diverse set of resources or clusters. Also, as a developer I may need to run multiple applications on the same cluster, so how can I express that here?
The approach is to adopt a one-application-to-one-infrastructure-resource model (VM, Kubernetes cluster, container, Docker Compose). This means we avoid managing infrastructure independently of the application. For running multiple applications on the same Kubernetes cluster, Helm packages provide a way to bundle them together. A Helm package can contain multiple application charts, such as a database and a web application chart, effectively treating them as a single application for deployment. This approach aligns well with node pools. Developers can use node pools to create clusters with a mix of nodes, such as one with a GPU and others without, optimizing resource allocation. The application-to-node-pool mapping is done through labels, which developers can reference in Helm chart values for node affinity.
This looks to be a very resource-heavy approach, tying one app to one type of infra such as a k8s cluster, unless we provide some way to deploy multiple applications on one cluster. Also, what will happen if cluster creation fails? That means application onboarding failed, as both are now one atomic package. Another issue could be that once an app is accepted along with its given infra, I cannot change the infra, e.g. reduce or increase the resources if needed. So I still think, especially with cluster-type infra, that it will be hard to implement, since it could mean creating a cluster dynamically, which can be a very time-consuming process. If we delink infra creation, there could be options such as the platform creating the cluster offline and providing an API to retrieve the cluster ID, or even providing an infra creation API to manage infra for applications and using that information with the App LCM API to link them together. There could be ways, but otherwise the approach seems to tightly couple the infra and applications and may reduce reusability. Maybe more inputs will help here.
Hi @gunjald, sounds good. I think it would be interesting to discuss the creation of an API to manage the infrastructure lifecycle (Create, Update, Delete). Enabling the Kubernetes cluster reference within the Application Management API would be easy. For now, I think it's safe to keep things this way, allowing developers to use a Kubernetes cluster and define the minimum configuration details required by their application. We can then open a discussion about how to design a more comprehensive API for infrastructure management resources.
I think that this should be discussed further as it changes how some of us see the problem we are trying to solve. While it makes sense for VMs and containers, I'm not sure about k8s clusters. It's my understanding that operators want to use the same infra for multiple app providers/app types. In this case, packaging multiple apps in the same Helm chart, as suggested above, cannot be done. |
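The label-based mapping described in this thread could be sketched roughly as follows. This is only an illustration: the label key `nodepool` and the helper name are hypothetical, and it uses a plain `nodeSelector` (the simplest form of node placement) rather than a full node-affinity rule. Whether a given chart exposes `nodeSelector` in its values is chart-specific.

```python
def node_selector_values(pool_name: str, label_key: str = "nodepool") -> dict:
    """Build a Helm values fragment that pins pods to one node pool.

    Assumption: the platform labels every node in a pool with
    `<label_key>=<pool_name>`; the diff does not define the label key.
    """
    if not pool_name:
        raise ValueError("pool_name must not be empty")
    # nodeSelector is a standard Kubernetes pod-spec field; many charts
    # expose it under the same name in their values.
    return {"nodeSelector": {label_key: pool_name}}


# Values a developer might pass for a GPU workload on pool "nodepool1".
gpu_values = node_selector_values("nodepool1")
```

A developer would merge such a fragment into the chart values so that GPU workloads land on the GPU pool and the rest elsewhere.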
||
Set of worker nodes in a Kubernetes cluster. | ||
type: object | ||
required: | ||
- flavor | ||
- numNodes | ||
properties: | ||
name: | ||
type: string | ||
example: nodepool1 | ||
description: | | ||
Nodepool Name (Autogenerated if not provided in the request) | ||
flavor: | ||
$ref: '#/components/schemas/Flavor' | ||
numNodes: | ||
Shouldn't it be something like numFlavors for better correlation? |
||
type: integer | ||
example: 1 | ||
description: Number of workers that compose the node pool. | ||
|
||
K8sAddons: | ||
description: | | ||
Addons for the Kubernetes cluster. | ||
Additional addons should be defined in the application Helm chart | ||
(Service Mesh, Serverless, AI). | ||
type: object | ||
properties: | ||
monitoring: | ||
type: boolean | ||
example: true | ||
default: false | ||
description: Enable monitoring for Kubernetes cluster. | ||
ingress: | ||
type: boolean | ||
example: true | ||
default: false | ||
description: Enable ingress for Kubernetes cluster. | ||
|
||
VmAddons: | ||
description: | | ||
Addons for the Virtual Machine. | ||
type: object | ||
properties: | ||
dockerCompose: | ||
type: boolean | ||
example: true | ||
default: false | ||
description: | | ||
Enable docker-compose in the virtual machine to deploy applications. | ||
As mentioned, I think it would be better to have docker-compose as a package type, rather than a VM addon. A VM addon to a VM-based deployment means the user will have full access to the VM, and may make changes to the VM that conflict with the state expected by the system that is trying to manage the docker deployment (i.e. in the worst case the user manually uninstalls docker, and then the system will fail trying to install/uninstall/upgrade docker-compose files). What I would recommend is adding DOCKER_COMPOSE_ZIP as a type to AppManifest.PackageType. The user then uploads a zip file of all their docker compose files, much like a Helm chart. The Operator Platform would deploy a specific VM image and manage it, and the user would not have full access directly to the VM (much like users would not have full access directly to a Kubernetes cluster).
Sounds good, I'll remove the VM addon and create a docker-compose type. |
||
|
||
K8sNetworking: | ||
description: | | ||
Kubernetes networking definition | ||
type: object | ||
properties: | ||
primaryNetwork: | ||
description: Definition of Kubernetes primary Network | ||
type: object | ||
properties: | ||
provider: | ||
description: CNI provider name | ||
type: string | ||
example: cilium | ||
version: | ||
description: CNI provider version | ||
type: string | ||
example: "1.13" | ||
additionalNetworks: | ||
description: Additional Networks for the Kubernetes cluster. | ||
type: array | ||
items: | ||
type: object | ||
description: Additional network interface definition | ||
properties: | ||
name: | ||
description: Additional Network Name | ||
type: string | ||
example: net1 | ||
interfaceType: | ||
description: | | ||
Type of additional Interface: | ||
netdevice: (SR-IOV) A regular kernel network device in the | ||
Network Namespace (netns) of the container | ||
vfio-pci: (SR-IOV) A PCI network interface directly mounted | ||
in the container | ||
interface: Additional interface to be used by cni plugins | ||
such as macvlan, ipvlan | ||
Note: The use of SR-IOV interfaces automatically | ||
configure the required kernel parameters for the nodes. | ||
type: string | ||
example: vfio-pci | ||
enum: | ||
- netdevice | ||
- vfio-pci | ||
- interface | ||
|
||
AdditionalStorage: | ||
description: Additional storage for the application. | ||
type: array | ||
items: | ||
type: object | ||
required: | ||
- storageSize | ||
- mountPoint | ||
properties: | ||
name: | ||
type: string | ||
description: Name of additional storage resource. | ||
example: logs | ||
storageSize: | ||
type: string | ||
description: Additional persistent volume for the application. | ||
example: 80GB | ||
pattern: ^\d+(GB|MB)$ | ||
mountPoint: | ||
type: string | ||
description: Location of additional storage resource. | ||
example: /logs | ||
|
||
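As a rough illustration of what the `storageSize` pattern accepts, the check below reuses the regular expression from the schema, with capture groups added so the amount and unit can be read back. The conversion to megabytes assumes decimal units (1 GB = 1000 MB), which the schema does not specify; the function name is hypothetical.

```python
import re

# Pattern from the AdditionalStorage schema, with capture groups added.
STORAGE_PATTERN = re.compile(r"^(\d+)(GB|MB)$")


def storage_to_mb(value: str) -> int:
    """Validate a storageSize string and return its size in megabytes."""
    match = STORAGE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid storageSize: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    # Assumption: decimal units; the schema does not say GB vs GiB.
    return amount * 1000 if unit == "GB" else amount
```

Note that the pattern is case-sensitive and rejects fractional sizes, so values such as `80gb` or `1.5GB` would fail validation.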
Vcpu: | ||
type: string | ||
pattern: ^\d+((\.\d{1,3})|(m))?$ | ||
description: | | ||
Number of vcpus in whole (e.g. 1), decimal (e.g. 0.500) up to | ||
millivcpu, or millivcpu (e.g. 500m) format. | ||
example: "500m" | ||
|
||
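The three accepted `Vcpu` forms can be normalised to a single unit. The sketch below reuses the pattern from the schema and converts any valid value to an integer number of millivcpus; the function name and the choice of millivcpus as the common unit are assumptions made for illustration.

```python
import re

# Pattern copied from the Vcpu schema in the diff.
VCPU_PATTERN = re.compile(r"^\d+((\.\d{1,3})|(m))?$")


def vcpu_to_millis(value: str) -> int:
    """Convert a Vcpu string (whole, decimal, or millivcpu) to millivcpus."""
    if VCPU_PATTERN.fullmatch(value) is None:
        raise ValueError(f"invalid Vcpu value: {value!r}")
    if value.endswith("m"):
        # Already in millivcpu form, e.g. "500m".
        return int(value[:-1])
    whole, _, frac = value.partition(".")
    # Pad the fraction to three digits so "0.5" and "0.500" both give 500.
    return int(whole) * 1000 + int(frac.ljust(3, "0"))
```

Under this reading, `"500m"` and `"0.500"` describe the same amount of CPU, while a mixed form such as `"1.5m"` is rejected by the pattern.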
Kubernetes: | ||
description: Definition of Kubernetes Cluster Infrastructure. | ||
type: object | ||
required: | ||
- nodePools | ||
- infraKind | ||
properties: | ||
infraKind: | ||
description: Type of infrastructure for the application. | ||
type: string | ||
example: kubernetes | ||
enum: | ||
- kubernetes | ||
The infraKind is part of the top-level attribute KubernetesResources and looks redundant with the value "kubernetes", as KubernetesResources itself indicates that it is a Kubernetes resource.
This is how discriminators work in OpenAPI: https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/ |
||
version: | ||
type: string | ||
description: Minimum Kubernetes Version. | ||
example: "1.29" | ||
controlNodes: | ||
Seems like there should be a
The definition of control nodes is out of scope for the Application Developer. |
||
type: integer | ||
description: Number of nodes for Kubernetes control plane. | ||
enum: | ||
- 1 | ||
- 3 | ||
Not sure why an enum is defined for the controlNodes integer; shouldn't it allow any integer greater than 0?
Sorry, I realize now this is for master nodes, so ignore my previous comment. Docs recommend up to 5 nodes for large clusters. I don't think we'll be dealing with large clusters here, but perhaps for completeness add an enum value for 5.
Okay, the idea here was to offer a control plane with or without HA, but in the end it is controlled by the operator. I think a better approach is a boolean controlPlaneHa: true/false.
I'm not sure if this should be part of the API. Application providers should have an SLA in place with the operators, and they shouldn't care about how the control plane of the operator infra is implemented. |
||
nodePools: | ||
type: array | ||
description: | | ||
Description of worker node set in a Kubernetes cluster. | ||
items: | ||
$ref: '#/components/schemas/NodePools' | ||
additionalStorage: | ||
type: string | ||
description: | | ||
Amount of persistent storage allocated to the Kubernetes PVC. | ||
example: 80GB | ||
pattern: ^\d+(GB|MB)$ | ||
networking: | ||
$ref: '#/components/schemas/K8sNetworking' | ||
addons: | ||
$ref: '#/components/schemas/K8sAddons' | ||
|
||
|
||
VirtualMachine: | ||
description: Virtual Machine Infrastructure Definition | ||
type: object | ||
required: | ||
- flavor | ||
- infraKind | ||
properties: | ||
infraKind: | ||
description: Type of infrastructure for the application. | ||
type: string | ||
example: VirtualMachine | ||
enum: | ||
- VirtualMachine | ||
flavor: | ||
$ref: '#/components/schemas/Flavor' | ||
additionalStorages: | ||
$ref: '#/components/schemas/AdditionalStorage' | ||
addons: | ||
$ref: '#/components/schemas/VmAddons' | ||
|
||
|
||
Container: | ||
description: Container Infrastructure Definition | ||
type: object | ||
required: | ||
- numCPU | ||
- memory | ||
- storage | ||
- infraKind | ||
properties: | ||
infraKind: | ||
description: Type of infrastructure for the application. | ||
type: string | ||
example: containers | ||
enum: | ||
- containers | ||
numCPU: | ||
$ref: '#/components/schemas/Vcpu' | ||
memory: | ||
type: integer | ||
example: 10 | ||
description: Memory in giga bytes | ||
storage: | ||
$ref: '#/components/schemas/AdditionalStorage' | ||
gpu: | ||
type: array | ||
description: Number of GPUs | ||
items: | ||
$ref: '#/components/schemas/GpuInfo' | ||
|
||
Ipv4Addr: | ||
type: string | ||
format: ipv4 | ||
|
@@ -1024,33 +1249,23 @@ | |
type: integer | ||
description: Port to establish the connection | ||
minimum: 0 | ||
|
||
RequiredResources: | ||
description: | | ||
Fundamental hardware requirements to be provisioned by the | ||
Application Provider. | ||
type: object | ||
required: | ||
- numCPU | ||
- memory | ||
- storage | ||
properties: | ||
numCPU: | ||
type: integer | ||
description: Number of virtual CPUs | ||
example: 1 | ||
memory: | ||
type: integer | ||
example: 10 | ||
description: Memory in giga bytes | ||
storage: | ||
type: integer | ||
example: 60 | ||
description: Storage in giga bytes | ||
gpu: | ||
type: array | ||
description: Number of GPUs | ||
items: | ||
$ref: '#/components/schemas/GpuInfo' | ||
type: array | ||
I don't think you really want an array here, right? That would imply the resources could include multiple Kubernetes clusters plus multiple VirtualMachines plus multiple Containers. I think you really only want one resource request: either a Kubernetes cluster, a VirtualMachine, or a Container. |
||
items: | ||
oneOf: | ||
- $ref: "#/components/schemas/Kubernetes" | ||
As a note, EdgeXR allows the user to pre-create a (kubernetes) cluster and then specify the cluster during AppInstance create, in addition to specifying the cluster-resources to create one on-the-fly (as per this spec). This allows users to, over time, manage multiple AppInstances in the same cluster. That could be supported here by additionally adding a ClusterRef as one of the
The idea is to manage applications as self-contained units, including the resources they need. This way, the application itself collects all the resources required to work properly. If a second application needs to be deployed on the same cluster, it might indicate that the resource requirements for the first application were overestimated. Here are two options for the developer: a) Modify the Helm chart to add the application there (even modify the resources to fit - this would require an API for application Update) or |
||
- $ref: "#/components/schemas/VirtualMachine" | ||
- $ref: "#/components/schemas/Container" | ||
I find the schema names misleading. For example
Right! I'll modify it. |
||
discriminator: | ||
propertyName: infraKind | ||
mapping: | ||
kubernetes: "#/components/schemas/Kubernetes" | ||
virtualMachine: "#/components/schemas/VirtualMachine" | ||
container: "#/components/schemas/Container" | ||
|
||
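One detail worth checking in the schema above: the `infraKind` enum values declared by the three schemas (`kubernetes`, `VirtualMachine`, `containers`) do not all match the explicit keys of `discriminator.mapping` (`kubernetes`, `virtualMachine`, `container`). The sketch below is a simplified, strict lookup written for illustration only; it does not model OpenAPI's implicit mapping by schema name, and real tooling may behave differently. The function names are hypothetical.

```python
# Enum value declared by each schema's infraKind property in the diff.
DECLARED_INFRA_KINDS = {
    "Kubernetes": "kubernetes",
    "VirtualMachine": "VirtualMachine",
    "Container": "containers",
}

# discriminator.mapping from RequiredResources in the diff.
DISCRIMINATOR_MAPPING = {
    "kubernetes": "#/components/schemas/Kubernetes",
    "virtualMachine": "#/components/schemas/VirtualMachine",
    "container": "#/components/schemas/Container",
}


def resolve_schema(resource: dict) -> str:
    """Return the schema reference selected by the resource's infraKind."""
    kind = resource.get("infraKind")
    if kind not in DISCRIMINATOR_MAPPING:
        raise ValueError(f"no explicit mapping for infraKind {kind!r}")
    return DISCRIMINATOR_MAPPING[kind]


def kinds_without_explicit_mapping() -> list[str]:
    """List schemas whose declared infraKind is not an explicit mapping key."""
    return sorted(
        name
        for name, kind in DECLARED_INFRA_KINDS.items()
        if DISCRIMINATOR_MAPPING.get(kind) != f"#/components/schemas/{name}"
    )
```

With the values as written in the diff, only `Kubernetes` resolves through the explicit mapping; aligning the enum values with the mapping keys would remove the ambiguity.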
SubmittedApp: | ||
description: Information about the submitted app | ||
|
We should add a GET API to get a list of flavors so that the user knows what the possible flavor names are.
Agree with @gainsley. We decided in #220 not to use flavours, but if implementing them is the better solution, GET /edge-cloud-zones should return in its response the available flavours for each edge-cloud-zone of interest.
I also tend to agree :-) Maybe with GET /edge-cloud-zones we can add query parameters to retrieve the list of flavors, and in future we can also expose other resources via query parameters. Just a suggestion though.
Agree, I'll add an entry in GET /edge-cloud-zones to report the flavors.