
AKS (azurerm_kubernetes_cluster, azurerm_kubernetes_cluster_node_pool): add support for version aliases #17084

Merged · 6 commits · Jun 28, 2022

Conversation

weisdd
Contributor

@weisdd weisdd commented Jun 2, 2022

Issue

Recently, AKS moved alias minor version support to GA (changelog), which allows users to specify, say, 1.20 instead of 1.20.1.

The azurerm provider does not currently support this feature, hence this PR:

Error: creating Cluster: (Managed Cluster Name "llama-k8s" / Resource Group "llama-k8s-resources"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=500 -- Original Error: Code="InternalOperationError" Message="Error parsing agent pool version \"1.21\": No Major.Minor.Patch elements found"

Fixes: #17085

Suggested changes

  1. Bump the API version used for interacting with container services from 2022-01-02-preview to 2022-03-02-preview, as the former demands x.y.z version syntax whereas the latter allows aliases;
  2. validateNodePoolSupportsVersion needs to be adjusted to allow not only exact versions, but also aliases. My code in that part is unlikely to be considered the best way of doing things, so feel free to suggest a different approach:
	// when creating a new cluster or upgrading the desired version should be supported
	if !versionExists && versions.AgentPoolAvailableVersionsProperties != nil && versions.AgentPoolAvailableVersionsProperties.AgentPoolVersions != nil {
		for _, version := range *versions.AgentPoolAvailableVersionsProperties.AgentPoolVersions {
			if version.KubernetesVersion == nil {
				continue
			}
			supportedVersions = append(supportedVersions, *version.KubernetesVersion)
			if *version.KubernetesVersion == desiredNodePoolVersion {
				versionExists = true
			}
		}
	}

=>

	if !versionExists && versions.AgentPoolAvailableVersionsProperties != nil && versions.AgentPoolAvailableVersionsProperties.AgentPoolVersions != nil {
		for _, version := range *versions.AgentPoolAvailableVersionsProperties.AgentPoolVersions {
			if version.KubernetesVersion == nil {
				continue
			}
			supportedVersions = append(supportedVersions, *version.KubernetesVersion)
			if *version.KubernetesVersion == desiredNodePoolVersion {
				versionExists = true
			} else if strings.Count(desiredNodePoolVersion, ".") == 1 && strings.HasPrefix(*version.KubernetesVersion, fmt.Sprintf("%s.", desiredNodePoolVersion)) {
				// alias versions (major.minor) are also fine as the latest supported GA patch version is chosen automatically in this case
				versionExists = true
			}
		}
	}
  3. resourceKubernetesClusterRead should not update kubernetes_version to props.CurrentKubernetesVersion, because doing so creates a state drift that cannot be resolved:
d.Set("kubernetes_version", props.KubernetesVersion)
if v := props.CurrentKubernetesVersion; v != nil {
    d.Set("kubernetes_version", v)
}

=>

d.Set("kubernetes_version", props.KubernetesVersion)
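The alias check in suggested change 2 can be isolated into a small predicate. The following is a hypothetical helper (not part of the PR) that demonstrates the matching rule on its own:

```go
package main

import (
	"fmt"
	"strings"
)

// aliasOrExactMatch reports whether desired is either exactly equal to
// supported, or a major.minor alias (e.g. "1.22") of a supported patch
// version (e.g. "1.22.4"). Hypothetical helper mirroring the PR's logic.
func aliasOrExactMatch(desired, supported string) bool {
	if desired == supported {
		return true
	}
	// Alias versions (major.minor) are fine: AKS resolves them to the
	// latest supported GA patch version automatically.
	return strings.Count(desired, ".") == 1 &&
		strings.HasPrefix(supported, fmt.Sprintf("%s.", desired))
}

func main() {
	fmt.Println(aliasOrExactMatch("1.22", "1.22.4"))   // alias match: true
	fmt.Println(aliasOrExactMatch("1.22.4", "1.22.4")) // exact match: true
	fmt.Println(aliasOrExactMatch("1.2", "1.22.4"))    // "1.2." is not a prefix: false
}
```

Note the trailing dot appended to the alias before the prefix check: it prevents "1.2" from accidentally matching "1.22.4".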

Questions / points of discussion

  1. Since this is my first time working with the azurerm provider, I'm not sure whether all of the automatically generated code should come as part of the PR or whether I was supposed to omit it. Please let me know if anything needs to be deleted;

  2. I know that there are some acceptance tests with cluster upgrades that currently rely on these versions:

var (
	olderKubernetesVersion   = "1.21.7"
	currentKubernetesVersion = "1.22.4"
)

If the expectation is to cover version aliases with tests, what exactly should be done? Should I just change the versions specified above, or create another test that relies on different variables?

@weisdd weisdd changed the title azurerm_kubernetes_cluster, azurerm_kubernetes_cluster_node_pool: add support for version aliases AKS (azurerm_kubernetes_cluster, azurerm_kubernetes_cluster_node_pool): add support for version aliases Jun 7, 2022
Member

@stephybun stephybun left a comment


Thanks for this PR @weisdd! Overall this is off to a good start, I have a few suggestions/questions in addition to the comments left in-line:

  1. Could we add an additional test for version aliases? I think it's acceptable to hard-code the value of the aliased version into the test config, since they aren't being used in the other tests.
  2. It would be helpful to update the docs as well so that users know it's possible to omit the patch version.
  3. What would happen if a user specified a version alias for the node pool but supplied a patch version for the control plane, e.g. 1.22 for the node pool and 1.22.4 for the control plane? Perhaps some additional validation should be added to prevent this? I'm thinking this could be a good place to add that:
    https://github.com/weisdd/terraform-provider-azurerm/blob/124a8d9fa66eb952a1658b9ea0dd389dc2ffbd68/internal/services/containers/kubernetes_cluster_resource.go#L1028-L1036
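As a sketch of the kind of guard being discussed here: a plain string comparison between the two configured versions already rejects a mixed alias/patch combination. This is a simplified stand-alone illustration with hypothetical names, not the provider's actual code:

```go
package main

import "fmt"

// checkVersionMatch is a simplified, stand-alone sketch of the validation at
// the linked lines: because it compares the two configured strings directly,
// it also rejects mixing an alias ("1.22") with a patch version ("1.22.4").
func checkVersionMatch(controlPlaneVersion, defaultNodePoolVersion string) error {
	if controlPlaneVersion != defaultNodePoolVersion {
		return fmt.Errorf("version mismatch between the control plane running %s and default node pool running %s, they must use the same kubernetes versions",
			controlPlaneVersion, defaultNodePoolVersion)
	}
	return nil
}

func main() {
	if err := checkVersionMatch("1.22.4", "1.22"); err != nil {
		fmt.Println("Error:", err)
	}
}
```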

@weisdd weisdd force-pushed the feature/aks-version-alias branch from 4a167e3 to b21488e Compare June 22, 2022 12:35
@weisdd
Contributor Author

weisdd commented Jun 22, 2022

@stephybun Thanks for the review, I appreciate it! You can find my replies to your questions below.

1. Could we add an additional test for version aliases? I think it's acceptable to hard-code the value of the aliased version into the test config, since they aren't being used in the other tests.
I've added TestAccKubernetesCluster_upgradeControlPlaneAndAllPoolsTogetherVersionAlias. Since there are plenty of tests that cover upgrade scenarios and the only difference is in the requested AKS versions, I think a simple test is enough. It creates a cluster with an additional node pool and then tries to upgrade all versions at once.

git:(feature/aks-version-alias) ✗ make acctests SERVICE='containers' TESTARGS='-run=TestAccKubernetesCluster_upgradeControlPlaneAndAllPoolsTogetherVersionAlias' TESTTIMEOUT='60m'
==> Checking that code complies with gofmt requirements...
==> Checking that Custom Timeouts are used...
==> Checking that acceptance test packages are used...
TF_ACC=1 go test -v ./internal/services/containers -run=TestAccKubernetesCluster_upgradeControlPlaneAndAllPoolsTogetherVersionAlias -timeout 60m -ldflags="-X=github.com/hashicorp/terraform-provider-azurerm/version.ProviderVersion=acc"
=== RUN   TestAccKubernetesCluster_upgradeControlPlaneAndAllPoolsTogetherVersionAlias
=== PAUSE TestAccKubernetesCluster_upgradeControlPlaneAndAllPoolsTogetherVersionAlias
=== CONT  TestAccKubernetesCluster_upgradeControlPlaneAndAllPoolsTogetherVersionAlias
--- PASS: TestAccKubernetesCluster_upgradeControlPlaneAndAllPoolsTogetherVersionAlias (1969.08s)
PASS
ok  	github.com/hashicorp/terraform-provider-azurerm/internal/services/containers	1969.103s

2. It would be helpful to update the docs as well so that users know it's possible to omit the patch version.
Updated the docs. I hope the suggested wording is fine.

3. What would happen if a user specified a version alias for the node pool but supplied a patch version for the control plane, e.g. 1.22 for the node pool and 1.22.4 for the control plane? Perhaps some additional validation should be added to prevent this? I'm thinking this could be a good place to add that:
https://github.com/weisdd/terraform-provider-azurerm/blob/124a8d9fa66eb952a1658b9ea0dd389dc2ffbd68/internal/services/containers/kubernetes_cluster_resource.go#L1028-L1036

As I can see from that code, it aims to prevent the version drift that could be observed a few months ago, before the additional validation was introduced. It actually covers the version alias case as well, because it only checks whether the two strings are equal, not their version semantics. So I think we're covered here:

azurerm_kubernetes_cluster.example: Creating...
╷
│ Error: version mismatch between the control plane running 1.22.4 and default node pool running 1.22, they must use the same kubernetes versions
│ 
│   with azurerm_kubernetes_cluster.example,
│   on main.tf line 18, in resource "azurerm_kubernetes_cluster" "example":
│   18: resource "azurerm_kubernetes_cluster" "example" {
│ 
╵

Please let me know if there are any other changes you'd like to see implemented.

Member

@stephybun stephybun left a comment


Thanks for making those changes @weisdd! I have one last remark about the test but otherwise this should be good to go!

@weisdd
Contributor Author

weisdd commented Jun 27, 2022

@stephybun I've just made the requested changes. Thanks for your help with making it all better!

Member

@stephybun stephybun left a comment


Other than the usual test failures, which should be fixed by #17124, the test results look good.

Thanks again for this @weisdd, LGTM 🎉

@stephybun stephybun merged commit 16b9fa0 into hashicorp:main Jun 28, 2022
@github-actions github-actions bot added this to the v3.12.0 milestone Jun 28, 2022
stephybun added a commit that referenced this pull request Jun 28, 2022
@github-actions

github-actions bot commented Jul 1, 2022

This functionality has been released in v3.12.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

github-actions bot commented Aug 1, 2022

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 1, 2022
Development

Successfully merging this pull request may close these issues.

Support for version aliases in AKS (azurerm_kubernetes_cluster and azurerm_kubernetes_cluster_node_pool)