
Add support for Kubernetes metadata (labels+annotations) #189

Merged — 2 commits, Oct 12, 2021

Conversation

0x2b3bfa0
Member

@0x2b3bfa0 0x2b3bfa0 commented Sep 5, 2021

Part of #185

Labels can be used in queries and are also used for system metadata; annotations are meant for users but can't be queried, so the Solomonic solution was to apply the user-supplied metadata as both.

@0x2b3bfa0 0x2b3bfa0 mentioned this pull request Sep 5, 2021
@0x2b3bfa0 0x2b3bfa0 added enhancement New feature or request vendor-kubernetes labels Sep 5, 2021
@0x2b3bfa0 0x2b3bfa0 self-assigned this Sep 5, 2021
@0x2b3bfa0 0x2b3bfa0 added the resource-machine iterative_machine TF resource label Sep 5, 2021
@0x2b3bfa0 0x2b3bfa0 changed the title Add support for Kubernetes metadata (labels/annotations) Add support for Kubernetes metadata (labels+annotations) Sep 5, 2021
@0x2b3bfa0
Member Author

0x2b3bfa0 commented Sep 10, 2021

Setting up Azure Kubernetes Service

Installing the Azure command-line interface tool

First of all, we need to install the az tool by following the official instructions on the Microsoft documentation portal: https://docs.microsoft.com/en-us/cli/azure/install-azure-cli

Enabling experimental features

In order to automatically provision GPU nodes for our cluster, we'll also need to register the following preview features through the aks-preview extension:

az extension add \
  --name aks-preview

az provider register \
  --namespace Microsoft.ContainerService

az feature register \
  --namespace Microsoft.ContainerService \
  --name GPUDedicatedVHDPreview
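Feature registration isn't instant and can take several minutes. A minimal sketch (assuming bash and the az commands above; the function name is ours) that waits until the preview feature reports `Registered`:

```shell
# Poll the registration state of the GPUDedicatedVHDPreview feature until
# it reports "Registered"; the cluster can be created only after that.
wait_for_feature() {
  local state
  while true; do
    state="$(az feature show \
      --namespace Microsoft.ContainerService \
      --name GPUDedicatedVHDPreview \
      --query properties.state \
      --output tsv)"
    echo "Registration state: $state"
    [ "$state" = Registered ] && break
    sleep 30
  done
}
# wait_for_feature
```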

Creating a test cluster

The following commands will create an AKS cluster with a single node, keeping everything in a new resource group for easier deletion of the resources:

az group create \
  --name testKubernetesResourceGroup \
  --location eastus
az aks create \
  --resource-group testKubernetesResourceGroup \
  --name testKubernetesCluster \
  --node-vm-size Standard_NC6 \
  --node-count 1 \
  --aks-custom-headers UseGPUDedicatedVHD=true
A budget-friendly cluster configuration without GPU:
az aks create \
  --resource-group testKubernetesResourceGroup \
  --name testKubernetesCluster \
  --node-vm-size Standard_A2_v2 \
  --node-count 1

Retrieving the credentials

Azure provides wrappers for Kubernetes authentication and will generate the required credentials for us. The following command produces a full-fledged kubeconfig string that can be stored directly in the KUBERNETES_CONFIGURATION secret of your continuous integration system of choice:

az aks get-credentials \
  --resource-group testKubernetesResourceGroup \
  --name testKubernetesCluster \
  --file -

💡 If you skip the --file option, the settings will be saved to your computer's ~/.kube/config file, which kubectl will use automatically if you ever need to run a manual sanity check on the cluster.
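Some CI systems don't handle multi-line secrets well. A small sketch (assuming bash and GNU coreutils base64; function names and file paths are placeholders) that round-trips the kubeconfig through base64 so it fits in a single-line secret:

```shell
# Encode a kubeconfig file into a single base64 line suitable for a CI
# secret, and decode it back to the original bytes on the runner.
encode_kubeconfig() { base64 -w0 "$1"; }  # file -> one-line string
decode_kubeconfig() { base64 -d; }        # stdin -> original bytes
# encode_kubeconfig ~/.kube/config                               # store the output as the secret
# decode_kubeconfig <<< "$KUBERNETES_CONFIGURATION" > ~/.kube/config
```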

Deleting the test cluster

Once you've finished testing, you can run the following command to delete the entire resource group, including the cluster and all of its nodes:

az group delete \
  --name testKubernetesResourceGroup

⚠️ Please delete the entire resource group as soon as you finish using the cluster. GPU nodes are really pricey, and you may end up spending a lot of money just on testing. 🔥 💵
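To double-check that nothing was left behind, a quick sketch (assuming az; the group-name prefix matches the examples above and the function name is ours) that lists any surviving test resource groups:

```shell
# List resource groups whose names start with the test prefix, using a
# JMESPath query; an empty result means everything was cleaned up.
leftover_groups() {
  az group list \
    --query "[?starts_with(name, 'testKubernetes')].name" \
    --output tsv
}
# leftover_groups
```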

@0x2b3bfa0
Member Author

Manual test report

main.tf

terraform {
  required_providers {
    iterative = {
      source = "github.com/iterative/iterative"
    }
  }
}

provider "iterative" {}

resource "iterative_machine" "machine" {
  cloud    = "kubernetes"
  metadata = { key = "value" }
}
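For reference, the report above can be reproduced with the usual Terraform workflow; a sketch (assuming the provider binary is installed locally under the github.com/iterative/iterative source, and that main.tf sits in the working directory):

```shell
# Initialize the working directory, then apply main.tf non-interactively.
run_manual_test() {
  terraform init
  terraform apply -auto-approve
}
# run_manual_test
```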

kubectl describe jobs

Name:           iterative-3pkyvx5rk6jsn
Namespace:      default
Selector:       controller-uid=07126986-6f96-461b-a3a3-08ce50fad9bb
Labels:         key=value
Annotations:    key: value
Parallelism:    1
Completions:    1
Start Time:     Fri, 10 Sep 2021 22:33:34 +0000
Completed At:   Fri, 10 Sep 2021 22:33:37 +0000
Duration:       3s
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=07126986-6f96-461b-a3a3-08ce50fad9bb
           job-name=iterative-3pkyvx5rk6jsn
  Containers:
   iterative-3pkyvx5rk6jsn:
    Image:      dvcorg/cml:0-dvc2-base1-gpu
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
      -c
      #!/bin/bash
    Limits:
      cpu:                8
      ephemeral-storage:  35G
      memory:             32Gi
    Requests:
      cpu:        0
      memory:     0
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  14s   job-controller  Created pod: iterative-3pkyvx5rk6jsn-vjs9n
  Normal  Completed         11s   job-controller  Job completed
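As noted at the top of the thread, only the labels are usable in selector queries; the annotations are informational. A quick sketch (assuming kubectl is configured against the test cluster; the function name is ours) selecting jobs by the label set through `metadata`:

```shell
# Select jobs by label selector; annotations can't be used in selectors.
jobs_with_label() {
  kubectl get jobs --selector "$1" --output name
}
# jobs_with_label key=value
```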

@0x2b3bfa0
Member Author

It works as expected as per the manual and automated tests above, but this could be a good opportunity for reviewers (hello, reviewers) to play with Kubernetes and our Terraform provider. Feel free to take a look before merging.

@casperdcl
Contributor

ping @iterative/cml

@0x2b3bfa0
Member Author

ping @iterative/cml ++

Contributor

@DavidGOrtega DavidGOrtega left a comment


👍 lgtm

@0x2b3bfa0
Member Author

😅 🙏🏼 :shipit:

@DavidGOrtega DavidGOrtega merged commit 82d1b60 into master Oct 12, 2021
@DavidGOrtega DavidGOrtega deleted the kubernetes-metadata branch October 12, 2021 10:07