Update group identifier to use for Cluster API annotations #3161

Merged: 2 commits, Sep 25, 2020
49 changes: 39 additions & 10 deletions cluster-autoscaler/cloudprovider/clusterapi/README.md
@@ -10,14 +10,6 @@ cluster.
The cluster-api provider requires Kubernetes v1.16 or greater to run the
v1alpha3 version of the API.

## Cluster API Prerequisites

Please be aware that currently the cluster autoscaler only supports CAPI
clusters that have joined their management and workload clusters into a single
cluster. For more information about this please see the
[Cluster API Concepts documentations](https://cluster-api.sigs.k8s.io/user/concepts.html)
and the [`clusterctl move` command documentation](https://cluster-api.sigs.k8s.io/user/concepts.html).

## Starting the Autoscaler

To enable the Cluster API provider, you must first specify it in the command
@@ -62,6 +54,43 @@ in the staging namespace, belonging to the purple cluster, with the label owner=
--node-group-auto-discovery=clusterapi:namespace=staging,clusterName=purple,owner=jim
```

## Connecting cluster-autoscaler to Cluster API management and workload Clusters

You will also need to provide the path to the kubeconfig(s) for the management
and workload clusters you wish cluster-autoscaler to run against. To specify the
kubeconfig path for the workload cluster to monitor, use the `--kubeconfig`
option and supply the path to the kubeconfig. If the `--kubeconfig` option is
not specified, cluster-autoscaler will attempt to use an in-cluster configuration.
To specify the kubeconfig path for the management cluster to monitor, use the
`--cloud-config` option and supply the path to the kubeconfig. If the
`--cloud-config` option is not specified, the autoscaler will fall back to using
the kubeconfig that was provided with the `--kubeconfig` option.

Use in-cluster config for both management and workload cluster:
```
cluster-autoscaler --cloud-provider=clusterapi
```

Use in-cluster config for workload cluster, specify kubeconfig for management cluster:
```
cluster-autoscaler --cloud-provider=clusterapi --cloud-config=/mnt/kubeconfig
```

Use in-cluster config for management cluster, specify kubeconfig for workload cluster:
```
cluster-autoscaler --cloud-provider=clusterapi --kubeconfig=/mnt/kubeconfig --clusterapi-cloud-config-authoritative
```

Use separate kubeconfigs for both management and workload cluster:
```
cluster-autoscaler --cloud-provider=clusterapi --kubeconfig=/mnt/workload.kubeconfig --cloud-config=/mnt/management.kubeconfig
```

Use a single provided kubeconfig for both management and workload cluster:
```
cluster-autoscaler --cloud-provider=clusterapi --kubeconfig=/mnt/workload.kubeconfig
```
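
The selection behind these combinations is small enough to express as a standalone helper. The Go sketch below is illustrative only, not the autoscaler's actual API (the type and function names are assumptions), but it follows the rules described above and implemented in `BuildClusterAPI` later in this diff: an empty path means in-cluster config, and the management cluster reuses the workload kubeconfig unless `--cloud-config` is given or `--clusterapi-cloud-config-authoritative` is set.

```go
package main

import "fmt"

// options mirrors the three inputs that drive kubeconfig selection; the field
// names are illustrative, not the autoscaler's real configuration struct.
type options struct {
	kubeConfigPath           string // --kubeconfig
	cloudConfigPath          string // --cloud-config
	cloudConfigAuthoritative bool   // --clusterapi-cloud-config-authoritative
}

// resolveKubeconfigs returns the kubeconfig paths used for the workload and
// management clusters. An empty string means "use the in-cluster config".
func resolveKubeconfigs(o options) (workload, management string) {
	workload = o.kubeConfigPath
	management = o.cloudConfigPath
	if management == "" && !o.cloudConfigAuthoritative {
		// Without --cloud-config, the management cluster falls back to the
		// same kubeconfig used for the workload cluster.
		management = o.kubeConfigPath
	}
	return workload, management
}

func main() {
	w, m := resolveKubeconfigs(options{kubeConfigPath: "/mnt/workload.kubeconfig"})
	fmt.Printf("workload=%q management=%q\n", w, m)
}
```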

## Enabling Autoscaling

To enable the automatic scaling of components in your cluster-api managed
@@ -72,12 +101,12 @@ resources depending on the type of cluster-api mechanism that you are using.

There are two annotations that control how a cluster resource should be scaled:

* `cluster.k8s.io/cluster-api-autoscaler-node-group-min-size` - This specifies
* `cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size` - This specifies
the minimum number of nodes for the associated resource group. The autoscaler
will not scale the group below this number. Please note that currently the
cluster-api provider will not scale down to zero nodes.

* `cluster.k8s.io/cluster-api-autoscaler-node-group-max-size` - This specifies
* `cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size` - This specifies
the maximum number of nodes for the associated resource group. The autoscaler
will not scale the group above this number.
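
As a concrete illustration of the two annotations above, the Go sketch below builds a MachineDeployment-style object as unstructured content, the same representation the clusterapi provider uses for scalable resources. The object name, namespace, and API version are assumptions for the example, and applying the object to a cluster is omitted.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

func main() {
	// A MachineDeployment represented as unstructured content, as the
	// clusterapi provider sees scalable resources.
	md := &unstructured.Unstructured{}
	md.SetAPIVersion("cluster.x-k8s.io/v1alpha3")
	md.SetKind("MachineDeployment")
	md.SetNamespace("default")
	md.SetName("workers")

	// The bounds the autoscaler reads when managing this node group.
	md.SetAnnotations(map[string]string{
		"cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size": "1",
		"cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size": "6",
	})

	fmt.Println(md.GetAnnotations())
}
```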

@@ -258,7 +258,12 @@ func (c *machineController) findMachineByProviderID(providerID normalizedProvide
if node == nil {
return nil, nil
}
return c.findMachine(node.Annotations[machineAnnotationKey])

machineID, ok := node.Annotations[machineAnnotationKey]
if !ok {
machineID = node.Annotations[deprecatedMachineAnnotationKey]
}
return c.findMachine(machineID)
}

func isFailedMachineProviderID(providerID normalizedProviderID) bool {
@@ -514,24 +514,32 @@ func deleteTestConfigs(t *testing.T, controller *machineController, testConfigs

func TestControllerFindMachine(t *testing.T) {
type testCase struct {
description string
name string
namespace string
lookupSucceeds bool
description string
name string
namespace string
useDeprecatedAnnotation bool
lookupSucceeds bool
}

var testCases = []testCase{{
description: "lookup fails",
lookupSucceeds: false,
name: "machine-does-not-exist",
namespace: "namespace-does-not-exist",
description: "lookup fails",
lookupSucceeds: false,
useDeprecatedAnnotation: false,
name: "machine-does-not-exist",
namespace: "namespace-does-not-exist",
}, {
description: "lookup fails in valid namespace",
lookupSucceeds: false,
name: "machine-does-not-exist-in-existing-namespace",
description: "lookup fails in valid namespace",
lookupSucceeds: false,
useDeprecatedAnnotation: false,
name: "machine-does-not-exist-in-existing-namespace",
}, {
description: "lookup succeeds",
lookupSucceeds: true,
description: "lookup succeeds",
lookupSucceeds: true,
useDeprecatedAnnotation: false,
}, {
description: "lookup succeeds with deprecated annotation",
lookupSucceeds: true,
useDeprecatedAnnotation: true,
}}

test := func(t *testing.T, tc testCase, testConfig *testConfig) {
@@ -569,6 +577,19 @@ func TestControllerFindMachine(t *testing.T) {
if tc.namespace == "" {
tc.namespace = testConfig.machines[0].GetNamespace()
}
if tc.useDeprecatedAnnotation {
for i := range testConfig.machines {
n := testConfig.nodes[i]
annotations := n.GetAnnotations()
val, ok := annotations[machineAnnotationKey]
if !ok {
t.Fatal("node did not contain machineAnnotationKey")
}
delete(annotations, machineAnnotationKey)
annotations[deprecatedMachineAnnotationKey] = val
n.SetAnnotations(annotations)
}
}
test(t, tc, testConfig)
})
}
@@ -27,9 +27,13 @@ import (
)

const (
machineDeleteAnnotationKey = "cluster.k8s.io/delete-machine"
machineAnnotationKey = "cluster.k8s.io/machine"
debugFormat = "%s (min: %d, max: %d, replicas: %d)"
// deprecatedMachineDeleteAnnotationKey should not be removed until minimum cluster-api support is v1alpha3
deprecatedMachineDeleteAnnotationKey = "cluster.k8s.io/delete-machine"
// TODO: determine what currently relies on deprecatedMachineAnnotationKey to determine when it can be removed
Member Author: It wasn't actually clear to me where this is being used. @elmiko do you know? I would expect any v1alpha3 compatible infrastructure providers to be setting ProviderID. We'll still likely need to keep this for backward compatibility, but would like to have a target removal in mind.

Contributor: Not sure I completely understand, but AFAIK we should be using the const variable names and not the actual string values anywhere. As for how it is used in the context of the autoscaler, the DeleteNodes function wraps this all to be used during the scale down attempts from the core.

I was under the impression that these annotations are then used to help inform during the scale down, but perhaps we don't need them from the CAPI side? (If so, it certainly seems like we could investigate removing them.)

Member Author: I think where I added the comment with the GitHub UI might be messing up the context I was trying to set a bit. My comment (and the TODO) is for the deprecatedMachineAnnotationKey rather than deprecatedMachineDeleteAnnotationKey.

From what I can tell, we are using that annotation for matching a Node to a Machine (it's set on the Node) when a providerID is not present on the Machine to use for matching. It is currently being used by some tests in this repo, but I wasn't sure if there is some known usage externally from pre-v1alpha3 cluster-api infrastructure providers.

Contributor: OK, that makes much more sense. =)

I think you are correct, this might be an artifact from previous generations. It definitely seems like we are using it for storage of the id. As for the external usage, I'm not 100% sure, but I can do a little research. It certainly seems like a good candidate to clean up.

deprecatedMachineAnnotationKey = "cluster.k8s.io/machine"
machineDeleteAnnotationKey = "cluster.x-k8s.io/delete-machine"
machineAnnotationKey = "cluster.x-k8s.io/machine"
debugFormat = "%s (min: %d, max: %d, replicas: %d)"
)
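
To ground the review discussion above: the deprecated `cluster.k8s.io/machine` annotation is only consulted when a Node carries no `cluster.x-k8s.io/machine` annotation, mirroring the findMachineByProviderID change earlier in this diff. The sketch below is a hedged illustration rather than the provider's code, and it assumes the annotation value has the `namespace/name` form.

```go
package main

import (
	"fmt"
	"strings"
)

// machineRefForNode returns the namespace and name of the Machine recorded on
// a Node, preferring the cluster.x-k8s.io key over the deprecated one.
func machineRefForNode(nodeAnnotations map[string]string) (namespace, name string, ok bool) {
	ref := nodeAnnotations["cluster.x-k8s.io/machine"]
	if ref == "" {
		ref = nodeAnnotations["cluster.k8s.io/machine"]
	}
	parts := strings.SplitN(ref, "/", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	ns, name, ok := machineRefForNode(map[string]string{
		"cluster.k8s.io/machine": "default/purple-workers-abc123",
	})
	fmt.Println(ns, name, ok)
}
```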

type nodegroup struct {
@@ -649,6 +649,9 @@ func TestNodeGroupDeleteNodes(t *testing.T) {
if _, found := machine.GetAnnotations()[machineDeleteAnnotationKey]; !found {
t.Errorf("expected annotation %q on machine %s", machineDeleteAnnotationKey, machine.GetName())
}
if _, found := machine.GetAnnotations()[deprecatedMachineDeleteAnnotationKey]; !found {
t.Errorf("expected annotation %q on machine %s", deprecatedMachineDeleteAnnotationKey, machine.GetName())
}
}

gvr, err := ng.scalableResource.GroupVersionResource()
24 changes: 18 additions & 6 deletions cluster-autoscaler/cloudprovider/clusterapi/clusterapi_provider.go
@@ -140,30 +140,42 @@ func newProvider(

// BuildClusterAPI builds CloudProvider implementation for machine api.
func BuildClusterAPI(opts config.AutoscalingOptions, do cloudprovider.NodeGroupDiscoveryOptions, rl *cloudprovider.ResourceLimiter) cloudprovider.CloudProvider {
externalConfig, err := clientcmd.BuildConfigFromFlags("", opts.KubeConfigPath)
managementKubeconfig := opts.CloudConfig
if managementKubeconfig == "" && !opts.ClusterAPICloudConfigAuthoritative {
managementKubeconfig = opts.KubeConfigPath
}

managementConfig, err := clientcmd.BuildConfigFromFlags("", managementKubeconfig)
if err != nil {
klog.Fatalf("cannot build management cluster config: %v", err)
}

workloadKubeconfig := opts.KubeConfigPath

workloadConfig, err := clientcmd.BuildConfigFromFlags("", workloadKubeconfig)
if err != nil {
klog.Fatalf("cannot build config: %v", err)
klog.Fatalf("cannot build workload cluster config: %v", err)
}

// Grab a dynamic interface that we can create informers from
managementClient, err := dynamic.NewForConfig(externalConfig)
managementClient, err := dynamic.NewForConfig(managementConfig)
if err != nil {
klog.Fatalf("could not generate dynamic client for config")
}

workloadClient, err := kubernetes.NewForConfig(externalConfig)
workloadClient, err := kubernetes.NewForConfig(workloadConfig)
if err != nil {
klog.Fatalf("create kube clientset failed: %v", err)
}

managementDiscoveryClient, err := discovery.NewDiscoveryClientForConfig(externalConfig)
managementDiscoveryClient, err := discovery.NewDiscoveryClientForConfig(managementConfig)
if err != nil {
klog.Fatalf("create discovery client failed: %v", err)
}

cachedDiscovery := memory.NewMemCacheClient(managementDiscoveryClient)
managementScaleClient, err := scale.NewForConfig(
externalConfig,
managementConfig,
restmapper.NewDeferredDiscoveryRESTMapper(cachedDiscovery),
dynamic.LegacyAPIPathResolverFunc,
scale.NewDiscoveryScaleKindResolver(managementDiscoveryClient))
@@ -129,15 +129,12 @@ func (r unstructuredScalableResource) UnmarkMachineForDeletion(machine *unstruct
}

annotations := u.GetAnnotations()
if _, ok := annotations[machineDeleteAnnotationKey]; ok {
delete(annotations, machineDeleteAnnotationKey)
u.SetAnnotations(annotations)
_, updateErr := r.controller.managementClient.Resource(r.controller.machineResource).Namespace(u.GetNamespace()).Update(context.TODO(), u, metav1.UpdateOptions{})

return updateErr
}
delete(annotations, machineDeleteAnnotationKey)
delete(annotations, deprecatedMachineDeleteAnnotationKey)
u.SetAnnotations(annotations)
_, updateErr := r.controller.managementClient.Resource(r.controller.machineResource).Namespace(u.GetNamespace()).Update(context.TODO(), u, metav1.UpdateOptions{})

return nil
return updateErr
}

func (r unstructuredScalableResource) MarkMachineForDeletion(machine *unstructured.Unstructured) error {
@@ -154,6 +151,7 @@ func (r unstructuredScalableResource) MarkMachineForDeletion(machine *unstructur
}

annotations[machineDeleteAnnotationKey] = time.Now().String()
annotations[deprecatedMachineDeleteAnnotationKey] = time.Now().String()
u.SetAnnotations(annotations)

_, updateErr := r.controller.managementClient.Resource(r.controller.machineResource).Namespace(u.GetNamespace()).Update(context.TODO(), u, metav1.UpdateOptions{})
16 changes: 12 additions & 4 deletions cluster-autoscaler/cloudprovider/clusterapi/clusterapi_utils.go
@@ -26,10 +26,12 @@ import (
)

const (
nodeGroupMinSizeAnnotationKey = "cluster.k8s.io/cluster-api-autoscaler-node-group-min-size"
nodeGroupMaxSizeAnnotationKey = "cluster.k8s.io/cluster-api-autoscaler-node-group-max-size"
clusterNameLabel = "cluster.x-k8s.io/cluster-name"
deprecatedClusterNameLabel = "cluster.k8s.io/cluster-name"
deprecatedNodeGroupMinSizeAnnotationKey = "cluster.k8s.io/cluster-api-autoscaler-node-group-min-size"
deprecatedNodeGroupMaxSizeAnnotationKey = "cluster.k8s.io/cluster-api-autoscaler-node-group-max-size"
nodeGroupMinSizeAnnotationKey = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
nodeGroupMaxSizeAnnotationKey = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"
clusterNameLabel = "cluster.x-k8s.io/cluster-name"
deprecatedClusterNameLabel = "cluster.k8s.io/cluster-name"
)

var (
@@ -60,6 +62,9 @@ type normalizedProviderID string
// value is not of type int.
func minSize(annotations map[string]string) (int, error) {
val, found := annotations[nodeGroupMinSizeAnnotationKey]
if !found {
val, found = annotations[deprecatedNodeGroupMinSizeAnnotationKey]
}
if !found {
return 0, errMissingMinAnnotation
}
@@ -76,6 +81,9 @@ func maxSize(annotations map[string]string) (int, error) {
// value is not of type int.
func maxSize(annotations map[string]string) (int, error) {
val, found := annotations[nodeGroupMaxSizeAnnotationKey]
if !found {
val, found = annotations[deprecatedNodeGroupMaxSizeAnnotationKey]
}
if !found {
return 0, errMissingMaxAnnotation
}
@@ -112,6 +112,24 @@ func TestUtilParseScalingBounds(t *testing.T) {
},
min: 0,
max: 1,
}, {
description: "deprecated min/max annotations still work, result is min 0, max 1",
annotations: map[string]string{
deprecatedNodeGroupMinSizeAnnotationKey: "0",
deprecatedNodeGroupMaxSizeAnnotationKey: "1",
},
min: 0,
max: 1,
}, {
description: "deprecated min/max annotations do not take precedence over non-deprecated annotations, result is min 1, max 2",
annotations: map[string]string{
deprecatedNodeGroupMinSizeAnnotationKey: "0",
deprecatedNodeGroupMaxSizeAnnotationKey: "1",
nodeGroupMinSizeAnnotationKey: "1",
nodeGroupMaxSizeAnnotationKey: "2",
},
min: 1,
max: 2,
}} {
t.Run(tc.description, func(t *testing.T) {
machineSet := unstructured.Unstructured{
3 changes: 3 additions & 0 deletions cluster-autoscaler/config/autoscaling_options.go
@@ -142,4 +142,7 @@ type AutoscalingOptions struct {
AWSUseStaticInstanceList bool
// Path to kube configuration if available
KubeConfigPath string
// ClusterAPICloudConfigAuthoritative tells the Cluster API provider to treat the CloudConfig option as authoritative and
// not use KubeConfigPath as a fallback when it is not provided.
ClusterAPICloudConfigAuthoritative bool
}