Spot nodepool loses "kubernetes.azure.com/scalesetpriority: spot" label #2460

Closed
NovemberZulu opened this issue Jul 7, 2022 · 7 comments · Fixed by #2604

@NovemberZulu

/kind bug

[Before submitting an issue, have you checked the Troubleshooting Guide?] yes

What steps did you take and what happened:

  1. Create a cluster with a managed spot nodepool @ k8s 1.22.4. The nodepool automatically gets the kubernetes.azure.com/scalesetpriority: spot label and the kubernetes.azure.com/scalesetpriority=spot:NoSchedule taint (see the node excerpt below).
  2. Upgrade the control plane and then the nodepool to k8s 1.23.5.
  3. The nodepool no longer has the kubernetes.azure.com/scalesetpriority: spot label; it is gone as soon as the upgrade starts (the taint stays).
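For reference, a correctly labeled spot node carries the label and taint below. This is only an illustrative excerpt of the Node object, assuming nothing beyond the label and taint named in step 1:

# Illustrative Node object excerpt (kubectl get node <name> -o yaml)
metadata:
  labels:
    kubernetes.azure.com/scalesetpriority: spot   # this label disappears during the upgrade
spec:
  taints:
  - key: kubernetes.azure.com/scalesetpriority    # this taint survives the upgrade
    value: spot
    effect: NoSchedule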

What did you expect to happen:
Nodepool is upgraded to 1.23.5 with labels and taints intact.

Anything else you would like to add:

  1. If I create a new spot managed nodepool after the control plane is upgraded, it is labeled properly.
  2. Azure support believes the issue is with CAPI/CAPZ.

Environment:

  • cluster-api-provider-azure version:
  • Kubernetes version: (use kubectl version): 1.22.4 -> 1.23.5
  • OS (e.g. from /etc/os-release): AKSUbuntu-1804gen2containerd-2022.02.01
@k8s-ci-robot added the kind/bug label on Jul 7, 2022
@CecileRobertMichon
Contributor

AFAIK CAPZ does not yet support spot instances for AKS: #1925

How are you doing this with CAPZ? What does your AzureManagedMachinePool look like?

> Create a cluster with a managed spot nodepool @ k8s 1.22.4.

cc @zmalik @alexeldeib

@CecileRobertMichon
Contributor

/area managedcluster

@k8s-ci-robot
Contributor

@CecileRobertMichon: The label(s) area/managedcluster cannot be applied, because the repository doesn't have them.

In response to this:

/area managedcluster

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@CecileRobertMichon
Contributor

/area managedclusters

@k8s-ci-robot added the area/managedclusters label on Jul 7, 2022
@NovemberZulu
Author

@CecileRobertMichon

I might be using a fork, I'll need to double-check, sorry 😊
The AzureManagedMachinePool looks quite ordinary:

Name:  shared003
Namespace:    cluster-registry
Labels:       azuremanagedmachinepool.infrastructure.cluster.x-k8s.io/agentpoolmode=User
              cluster.x-k8s.io/cluster-name=aks-eastus2-dev-main-002
Annotations:  <none>
API Version:  infrastructure.cluster.x-k8s.io/v1beta1
Kind:         AzureManagedMachinePool
Metadata:
  Creation Timestamp:  2022-03-01T09:41:19Z
  Finalizers:
    azurecluster.infrastructure.cluster.x-k8s.io
  Generation:  4
  Managed Fields:
    <skipped>
  Owner References:
    API Version:           cluster.x-k8s.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  MachinePool
    Name:                  shared003
    UID:                   8c4ad431-3e60-4a60-8d83-aafd02d00316
  Resource Version:        106745044
  UID:                     45f15f58-865d-4c9d-a3bc-944f6bb51c69
Spec:
  Additional Tags:
    Cluster - Autoscaler - Enabled:  true
    Max:                             1000
    Min:                             0
  Max Pods:                          60
  Mode:                              User
  Name:                              shared003
  Node Labels:
    Spot:              true
    Type:              shared
  Os Disk Size GB:     128
  Os Disk Type:        Managed
  Scale Set Priority:  Spot
  Scaling:
    Max Size:  1000
    Min Size:  0
  Sku:         Standard_D4as_v4
  Taints:
    Effect:  NoSchedule
    Key:     type
    Value:   shared
    Effect:  NoSchedule
    Key:     spot
    Value:   true
Status:
  Ready:    true
  Version:  1.23.5  <-- this was changed from 1.22.4
Events:
  Type    Reason                             Age                     From                                       Message
  ----    ------                             ----                    ----                                       -------
  Normal  AzureManagedMachinePool available  63s (x1177 over 2d17h)  azuremanagedmachinepoolmachine-reconciler  agent pool successfully reconciled
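(For reference, the same spec expressed as a manifest would look roughly like the sketch below. Field names follow the v1beta1 AzureManagedMachinePool CRD to the best of my knowledge, the tag keys are reconstructed from the describe output above, and scaleSetPriority is the fork-only field discussed in the next comment.)

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedMachinePool
metadata:
  name: shared003
  namespace: cluster-registry
spec:
  additionalTags:                 # keys reconstructed from the describe output; illustrative only
    cluster-autoscaler-enabled: "true"
    max: "1000"
    min: "0"
  maxPods: 60
  mode: User
  name: shared003
  nodeLabels:
    spot: "true"
    type: shared
  osDiskSizeGB: 128
  osDiskType: Managed
  scaleSetPriority: Spot          # not in the released upstream CRD at the time; fork-only
  scaling:
    maxSize: 1000
    minSize: 0
  sku: Standard_D4as_v4
  taints:
  - effect: NoSchedule
    key: type
    value: shared
  - effect: NoSchedule
    key: spot
    value: "true"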

@CecileRobertMichon
Contributor

Got it, yes, looks like it:

> Scale Set Priority:  Spot

is not supported in the latest released CAPZ AzureManagedMachinePool CRD... Seems like a bug in the fork implementation of Spot instances.

Side note: we'd love to get that feature merged into upstream CAPZ if that's something you/your team want to contribute. #1925 was in progress but it has gone stale.

@jackfrancis
Contributor

/assign
