[SPIKE] Research upgrade for Terraform. #2825

Closed
5 tasks
seriva opened this issue Dec 30, 2021 · 6 comments

@seriva
Collaborator

seriva commented Dec 30, 2021

Is your spike related to a problem or idea? Please describe.
The Terraform in Epicli has not been updated or maintained for a long time. This has already caused, and continues to cause, a number of issues:

  1. [FEATURE REQUEST] Upgrade Terraform to at least v0.12.31 #2706
  2. [BUG] Azure network security groups: eplicli apply makes changes when configuration file is not changed #1570
  3. [BUG] Azure unmanaged disks not suppored by Epiphany but there is misleading setting in the default configuration #1569
  4. [BUG] Issue creating service principal on Azure. #2774

Besides that, support for data disks will need to be added for the upcoming support for Cloud Native Storage (#13).

Describe the outcome you'd like
This spike should research what is needed to get our current Terraform up to date. A few points of interest:

  1. Check the work needed to upgrade Terraform to the latest version. Also, we would like to stop using our terraform-bin package and install Terraform straight in the (dev)container.
  2. Check the work needed to upgrade the Azure provider to the latest version and check what is needed to upgrade our current templates.
  3. Check the work needed to upgrade the AWS provider to the latest version and check what is needed to upgrade our current templates.
  4. Check if we can provide an upgrade path from pre-2.0.0 Epicli to the latest Terraform + providers.
  5. Check if it's worth removing the use of autoscaling groups in AWS in favor of plain VMs. As AWS is currently mostly used internally for testing, this is not really a requirement.
  6. Check if we can upgrade the azure-cli package to the latest version.

We expect this spike to result in a list of issues we can pull in one by one, but if it turns out to be easy, the spike might also result in a (partial) implementation.

Additional context


DoD checklist

  • Reader is able to understand the results of spike
  • The results of the spike are presented in a table (to show clearly which parameters were compared or researched) / not applicable
  • Each value / cell in the results table is described more deeply below
  • Demo of the spike (automated as much as possible)
  • Design doc updated
@atsikham
Contributor

@seriva #2001 seems to be a candidate to close as the same was done for #2000.

@seriva
Collaborator Author

seriva commented Dec 30, 2021

@seriva #2001 seems to be a candidate to close as the same was done for #2000.

Good point, I'll remove it and close it. We can open a new issue if this team comes forward again.

@seriva
Collaborator Author

seriva commented Jan 10, 2022

For this spike I researched and implemented two possible upgrade strategies.

Minor upgrade

Changes can be viewed in the following draft pull: #2851

This upgrades the Terraform components as follows:

  • Terraform from 0.12.6 to 0.12.31.
  • azurerm provider from 1.38 to 1.44.
  • AWS provider from 2.26 to 2.70.1.

This pull would resolve the following issues: #2706, #1570, #1569

This pull would not solve #2774 long-term, as we still could not upgrade azure-cli past 2.29 or adopt the new authentication mechanism.

The main benefit of this approach is that the Terraform stays compatible with Epicli 1.x.

The main downside is that the versions we would upgrade to are already quite old and might cause issues long-term, as Epiphany 2.0 will be an LTS version.

Major upgrade

Changes can be viewed in the following draft pull: #2852

This upgrades the Terraform components as follows:

  • Terraform from 0.12.6 to 1.1.3.
  • azurerm provider from 1.38 to 2.91.
  • AWS provider from 2.26 to 3.70.1.

This pull would resolve the following issues: #2706, #1570, #1569, #2774

The main benefit of this approach is that Terraform and the providers would be at their latest versions and should be stable for long-term LTS use.

The main downside is that this would break compatibility with Epiphany 1.x clusters. However, a manual upgrade of the Terraform is possible.
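
To make the difference between the two strategies concrete, here is a minimal sketch that checks which components cross a major version number in each draft pull. The version pairs are the ones listed above. The function and variable names are hypothetical and not part of Epicli, and treating a major-number jump as a sign of a breaking change is only a rough heuristic.

```python
# Version jumps taken from the two draft pulls described above.
STRATEGIES = {
    "minor": {
        "terraform": ("0.12.6", "0.12.31"),
        "azurerm": ("1.38", "1.44"),
        "aws": ("2.26", "2.70.1"),
    },
    "major": {
        "terraform": ("0.12.6", "1.1.3"),
        "azurerm": ("1.38", "2.91"),
        "aws": ("2.26", "3.70.1"),
    },
}


def parse_version(version):
    """Turn a dotted version string such as '2.70.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def crosses_major(old, new):
    """Return True when the leading version number increases."""
    return parse_version(new)[0] > parse_version(old)[0]


def breaking_components(strategy):
    """List components whose upgrade crosses a major version (heuristic only)."""
    return [
        name
        for name, (old, new) in STRATEGIES[strategy].items()
        if crosses_major(old, new)
    ]


if __name__ == "__main__":
    for strategy in STRATEGIES:
        print(strategy, breaking_components(strategy))
```

Under this heuristic the minor strategy reports no components and the major strategy reports all three, which matches the compatibility trade-off described for each pull.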

Final thought

IMO we should go for the major upgrade to save us trouble moving forward with the 2.0 LTS. Also, as this LTS marks a major version change, I don't think a breaking change would be unexpected. In the end, a manual upgrade path is available, as documented in the major draft pull.

As for the upgrade of the AWS Terraform, it would be nice to remove the 'autoscaling_group' approach in favor of something similar to what we do for Azure, where we create the VMs ourselves. This will make it easier later when we implement properly scalable components. Also, as AWS is not used for production anywhere, I don't think we need to worry about backwards compatibility. The issue has already been created: #2853

@atsikham
Contributor

Agree with the 'final thought' part. In addition, the documentation could point to --no-infra as a way to leave existing infrastructure untouched even with the upgraded version of Terraform.

@cicharka
Contributor

cicharka commented Jan 24, 2022

Tested with:
provider: Azure:
✔️ apply v2.0 (all components and apps enabled)
✔️ apply v1.3, manual upgrade of Terraform, re-apply v2.0 (all components and apps enabled/minimal cluster)
✔️ delete v2.0
provider: AWS:
✔️ apply v2.0 (all components and apps enabled)
✔️ delete v2.0

@cicharka
Contributor

cicharka commented Jan 26, 2022

More tests:

  • provider Azure:
    • Ubuntu and RedHat (all components and apps enabled):
      • init v2.0dev -> apply v2.0dev -> delete v2.0dev ✔️
      • upgrade from v1.0.1 -> v2.0dev: ✔️
        • includes manual Terraform upgrade ✔️
        • re-applying and re-upgrade to test idempotency ✔️
        • backup and recovery commands ✔️
        • delete ✔️
  • provider AWS:
    • Ubuntu and RedHat (all components and apps enabled):
      • init v2.0dev -> apply v2.0dev -> delete v2.0dev ✔️
  • All scenarios verified with epicli test. ✔️
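
The combinations covered in this test round can be enumerated with a small sketch, which might help if the runs are automated later. The data structure and function name are hypothetical. The sketch only lists the provider, OS and scenario combinations named above and does not invoke epicli.

```python
from itertools import product

# Operating systems and scenarios as listed in the test report above.
OPERATING_SYSTEMS = ["Ubuntu", "RedHat"]

SCENARIOS = {
    "Azure": [
        "init v2.0dev -> apply v2.0dev -> delete v2.0dev",
        "upgrade from v1.0.1 -> v2.0dev",
    ],
    "AWS": [
        "init v2.0dev -> apply v2.0dev -> delete v2.0dev",
    ],
}


def build_matrix():
    """Return every (provider, os, scenario) combination that was tested."""
    return [
        (provider, os_name, scenario)
        for provider, scenarios in SCENARIOS.items()
        for os_name, scenario in product(OPERATING_SYSTEMS, scenarios)
    ]


if __name__ == "__main__":
    for provider, os_name, scenario in build_matrix():
        print(f"{provider} / {os_name}: {scenario}")
```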

seriva closed this as completed Jan 27, 2022