Nano GPU Agent is a Kubernetes device plugin implementation for GPU allocation and use in containers. It runs as a DaemonSet on Kubernetes nodes. It works as follows:
- Registers GPU core and memory resources on the node
- Allocates and shares GPU resources among containers
- Supports GPU resource QoS and isolation with a specific GPU driver (e.g. nano gpu)
For the complete solution and further details, please refer to Nano GPU Scheduler.
- Kubernetes v1.17+
- golang 1.16+
- NVIDIA drivers
- nvidia-docker
- Set nvidia as the Docker default-runtime: add "default-runtime": "nvidia" to /etc/docker/daemon.json, and restart the Docker daemon.
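As a rough illustration of the default-runtime setting, the sketch below writes a sample daemon.json to a scratch file instead of touching /etc directly. The DAEMON_JSON variable is a hypothetical name used only here, and the runtimes entry assumes a typical nvidia-docker install where nvidia-container-runtime is on the PATH; adjust it to match your host.

```shell
# Sketch only: DAEMON_JSON is a hypothetical variable, defaulting to a scratch file
# so that nothing under /etc is modified by running this block.
DAEMON_JSON="${DAEMON_JSON:-$(mktemp)}"

# Assumption: nvidia-docker registered a runtime named "nvidia" backed by
# nvidia-container-runtime. Merge these keys into any existing daemon.json
# rather than overwriting it.
cat > "$DAEMON_JSON" <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF

echo "sample written to $DAEMON_JSON"

# After merging into /etc/docker/daemon.json, restart Docker, for example:
#   sudo systemctl restart docker
```

Once Docker has restarted, docker info should list nvidia as the default runtime.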
Run make or TAG=<image-tag> make to build the nano-gpu-agent image.
- Deploy Nano GPU Agent
$ kubectl apply -f deploy/nano-gpu-agent.yaml
Note: To run on specific nodes instead of all nodes, add an appropriate spec.template.spec.nodeSelector or spec.template.spec.affinity.nodeAffinity to the yaml file.
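To make the node restriction concrete, here is a minimal sketch that prints the nodeSelector fragment you might merge into deploy/nano-gpu-agent.yaml. The label key and value (nano-gpu-node, enabled) and the variable names are hypothetical, chosen only for illustration; use whatever label your GPU nodes actually carry.

```shell
# Hypothetical label; not defined by Nano GPU Agent itself.
LABEL_KEY="nano-gpu-node"
LABEL_VALUE="enabled"

# Scratch file holding the fragment to merge into the DaemonSet manifest.
SELECTOR_FILE="${SELECTOR_FILE:-$(mktemp)}"

cat > "$SELECTOR_FILE" <<EOF
spec:
  template:
    spec:
      nodeSelector:
        ${LABEL_KEY}: "${LABEL_VALUE}"
EOF

cat "$SELECTOR_FILE"

# The chosen nodes need the matching label before the agent lands on them:
#   kubectl label nodes <node-name> ${LABEL_KEY}=${LABEL_VALUE}
```

A nodeSelector is the simpler of the two options; nodeAffinity is worth the extra verbosity only when you need expressions such as matching several label values.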
- Deploy Nano GPU Scheduler
$ kubectl apply -f https://raw.githubusercontent.com/nano-gpu/nano-gpu-scheduler/master/deploy/nano-gpu-scheduler.yaml
For more information, please refer to Nano GPU Scheduler.
- Enable Kubernetes scheduler extender
Add the following configuration to the extenders section in the --policy-config-file file:
{
  "urlPrefix": "http://<kube-apiserver-svc>/api/v1/namespaces/kube-system/services/nano-gpu-scheduler/proxy/scheduler",
  "filterVerb": "filter",
  "prioritizeVerb": "priorities",
  "bindVerb": "bind",
  "weight": 1,
  "enableHttps": false,
  "nodeCacheCapable": true,
  "managedResources": [
    {
      "name": "nano-gpu/gpu-percent"
    }
  ]
}
You can set a scheduling policy by running kube-scheduler --policy-config-file <filename> or kube-scheduler --policy-configmap <ConfigMap>. Here is a scheduler policy config sample.
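For orientation, a complete policy file wrapping the extender entry might look like the sketch below, written to a scratch path. It assumes the legacy scheduler Policy format (kind Policy, apiVersion v1), which applies only to kube-scheduler versions that still accept --policy-config-file; POLICY_FILE is a hypothetical variable name, and <kube-apiserver-svc> is left as a placeholder to fill in.

```shell
# Sketch only: POLICY_FILE is a hypothetical variable pointing at a scratch file.
POLICY_FILE="${POLICY_FILE:-$(mktemp)}"

# Assumption: legacy Policy format with a single extender and no other
# predicates or priorities configured.
cat > "$POLICY_FILE" <<'EOF'
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    {
      "urlPrefix": "http://<kube-apiserver-svc>/api/v1/namespaces/kube-system/services/nano-gpu-scheduler/proxy/scheduler",
      "filterVerb": "filter",
      "prioritizeVerb": "priorities",
      "bindVerb": "bind",
      "weight": 1,
      "enableHttps": false,
      "nodeCacheCapable": true,
      "managedResources": [
        { "name": "nano-gpu/gpu-percent" }
      ]
    }
  ]
}
EOF

echo "policy written to $POLICY_FILE"

# Then point the scheduler at it:
#   kube-scheduler --policy-config-file "$POLICY_FILE"
```

Listing nano-gpu/gpu-percent under managedResources means the extender is consulted only for pods that request that resource.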
- Create GPU pod
cat <<EOF | kubectl create -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cuda-gpu-test
  labels:
    app: gpu-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-test
  template:
    metadata:
      labels:
        app: gpu-test
    spec:
      containers:
        - name: cuda
          image: nvidia/cuda:10.0-base
          command: [ "sleep", "100000" ]
          resources:
            limits:
              nano-gpu/gpu-percent: "20"
EOF
- Support GPU share
- Support GPU monitor at pod and container level
- Support single container multi-card scheduling
- Support GPU topology-aware scheduling
- Support GPU load-aware scheduling
- Migrate to Kubernetes scheduler framework
Distributed under the Apache License.