This repository is a mono-repo containing multiple Terraform- and Terragrunt-based stacks that provision and manage Azure cloud resources.
For both Terragrunt and Terraform stacks, this repository provisions infrastructure via Atlantis, a pull request automation tool that is hosted internally within Electrolux. For a detailed demo of how Atlantis works, please watch this video. The Atlantis config can be seen in this file.
We use both Terraform and Terragrunt to provision infrastructure resources. The `terraform` and `terragrunt` directories contain the infrastructure for each specific tool. Terragrunt is mainly used to provision resources/stacks effectively in multiple regions. The following explains when to use one over the other.
Terraform should be the de facto choice for provisioning resources for domain teams. The `terraform` directory already contains multiple Terraform infrastructure stacks, which are provisioned per business domain. Each business domain has its own state file. Note that we use workspaces with a 1:1 relationship between workspace and environment, so within a single stack/project there is one state file per environment. Please read the detailed Terraform documentation and how to work with it here.
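As an illustration of the 1:1 workspace-to-environment relationship, a stack can derive its environment from the selected workspace. This is a minimal sketch, not the repository's actual code: the environment names (`dev`, `prod`) and the `node_count` setting are assumptions made for the example.

```hcl
# Sketch only: environment names and values below are hypothetical.
locals {
  # With a 1:1 workspace-to-environment mapping, the selected workspace
  # name doubles as the environment name.
  environment = terraform.workspace

  # Hypothetical per-environment settings, keyed by workspace name.
  settings = {
    dev  = { node_count = 1 }
    prod = { node_count = 3 }
  }

  # Evaluation fails if the workspace has no entry here, which guards
  # against planning in an unexpected workspace.
  current = local.settings[local.environment]
}

output "environment" {
  value = local.environment
}
```

Keying the settings by workspace name keeps a single code path per stack while each workspace keeps its own state file.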
Terragrunt is a thin wrapper around Terraform. In our tooling stack, it is mainly used to provision and maintain resources across multiple regions. For instance, it is currently used within PE to provision AKS clusters across multiple regions and environments. Since Terragrunt adds complexity, it should be used pragmatically. Please read the detailed Terragrunt documentation and how to work with it here.
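To show why Terragrunt suits multi-region stacks, here is a minimal sketch of a per-region `terragrunt.hcl`. The folder layout, the `root.hcl` file name, the module path, and the `location` input are all assumptions for illustration and may not match this repository.

```hcl
# <region>/terragrunt.hcl (hypothetical layout: one folder per region)

# Pull in shared settings, such as remote state, from a parent folder.
# The file name "root.hcl" is an assumption.
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# Hypothetical module path shared by every region.
terraform {
  source = "../../modules/aks"
}

# Only the values that differ per region live here.
inputs = {
  location = "westeurope"
}
```

Each additional region then needs only another small file with different `inputs`, rather than a copy of the whole stack.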
- Before you begin, please read our guidelines about Terraform and Terragrunt.
- Branch out from `main`, add your changes, and raise a PR against the `main` branch.
- Atlantis should run the plan for you and display the plan output on your PR. You may also run the plan at any time by commenting `atlantis plan` on your PR.
- Carefully check the plan output and ensure everything looks as expected before you ask for a review.
- Ask for a review. Once the PR is approved and the mergeable conditions are satisfied, you may apply your changes by commenting `atlantis apply <specific stack>`. Do NOT use the global `atlantis apply`. Instead, apply one environment/stack at a time, starting from the lowest environment. At the end of the plan output, Atlantis shows the command that applies that particular plan. Comment that command on your PR.
- The merge of the PR is also controlled by Atlantis, meaning you must not merge your PRs yourself. Once the changes are applied by Atlantis, your PR will automatically be merged and completed.
Ideally, we should provision infrastructure using modules, as they provide benefits such as reusability, consistency, and testability. We need to be pragmatic in deciding whether to develop our own custom modules or use open-source ones. General guidelines are listed below:
- We may use an open-source Terraform module directly, provided we check a few aspects. The module should be:
  - Open source and developed by a verified provider like Azure
  - Well maintained (check the latest release date)
  - Small in size, with a particular scope, and doing its job well
  - Well tested and widely used by the community
  - Released under a permissive open-source license like Apache-2.0 or MIT
- For complex components like AKS, we must use open-source modules provided by verified providers like Azure itself. This ensures that we get updates and maintenance from the wider open-source community.
The following guidelines need to be kept in mind when developing Terraform modules.
- The module should have its own repository under the infra-modules project in Azure DevOps.
- For Azure, the module must be named `terraform-azurerm-<name-of-module>`, for instance `terraform-azurerm-managed-identity`. Existing modules may keep their existing names.
- The repo/code structure must be similar to that of open-source modules such as terraform-azurerm-aks. A few important things to consider:
  - The code should be available at the root level
  - There must exist an `examples` folder containing code showing how to consume the module
  - `outputs.tf` must exist to expose the required outputs
  - Tests and release pipelines are created
  - Documentation: terraform-docs should be used to autogenerate module documentation
- Only the default `main` branch is used as a long-lived branch. PRs need to be raised against the `main` branch only.
- We should ideally use conventional commits with auto-released [semantic](https://semver.org/) versioning. Since we don't currently have a release pipeline, for the time being we may use a specific HEAD SHA hash from the `main` branch in the consumer. An example can be found here.
This section describes how to provision various Azure components via Terraform and, in particular, which Terraform modules should be used. It is a work in progress, and all team members are encouraged to contribute to it.
Azure Service | Recommended Module | Example Usage | Comments |
---|---|---|---|
Regions & Availability Zones | terraform-azurerm-regions | usage examples | Opensource by Azure, can be directly consumed |
Resource Group | infra-mod-rg | usage examples | Custom internal module |
Virtual network | terraform-azurerm-avm-res-network-virtualnetwork | usage examples | Opensource module by Azure, can be directly consumed |
Network Security Group | terraform-azurerm-avm-res-network-networksecuritygroup | usage examples | Opensource module by Azure, can be directly consumed |
Network Route Table | terraform-azurerm-avm-res-network-routetable | usage examples | Opensource module by Azure, can be directly consumed |
Storage Accounts | terraform-azurerm-avm-res-storage-storageaccount | usage examples | Opensource module by Azure, can be directly consumed |
AKS | terraform-azurerm-aks | usage examples | custom wrapper module which internally calls Opensource AKS module by Azure |
Keyvault | terraform-azurerm-avm-res-keyvault-vault | usage examples | Opensource module by Azure, can be directly consumed |
Application Insights | terraform-azurerm-avm-res-insights-component | usage examples | Opensource module by Azure, can be directly consumed |
Log Analytics | infra-mod-law | usage examples | needs refactoring |
Cosmos DB | terraform-azurerm-avm-res-documentdb-databaseaccount | usage examples | Opensource module by Azure, can be directly consumed |
Role Assignment | terraform-azurerm-roleassignment branch [role-assignment] | usage examples | customized based on https://github.com/Azure/terraform-azurerm-avm-res-authorization-roleassignment |
Logic App | infra-mod-logicapp branch [logicapp-prod-r2d2-refactor] | usage examples | needs consumption mode to be added as well. |
Event Hub | infra-mod-eh | usage examples | Needs refactoring, check and compare against OS module here |
Azure API management | terraform-azurerm-api-management | usage examples | OS module, can be directly consumed |
Azure Container Registry (ACR) | infra-mod-acr | usage examples | needs refactoring, check and compare against this OS module |
Azure Data Factory (ADF) | infra-mod-ADF | usage examples | needs refactoring, check and compare against this OS module |
Azure Managed Identity | terraform-azurerm-managed-identity | usage examples | custom internal module |
Event grid event topic | infra-mod-eventgrid | usage examples | needs refactoring, use for custom topic creation only, subscriptions to be extracted in separate module |
Event grid event subscription | terraform-azurerm-eventgrid-event-subscription | usage examples | WIP, subscriptions can be created on various scopes therefore should have its own module, logic similar to this OS module |
App Gateway | terraform-azurerm-app-gateway | usage examples | OS module, can be directly consumed |
This is a how-to section about granting different teams access to Azure resources. The following general points must be considered:
- The role assignment must be done via code. The SPs used in atlantis for both non-prod and prod have access to grant roles.
- Principle of least privilege must be followed.
- Use a narrow scope, for instance, if AKS access is needed, the scope should be on the cluster/namespace level instead of subscription/resource group level.
- Use AD group names instead of object IDs. Use a data source to fetch the corresponding object ID. Please see here.
- Use existing Azure built-in roles where possible, as they cover most needs. Custom roles may be created when needed.
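The points above can be sketched as a single role assignment. This is an illustrative example rather than this repository's code: the variable name, the group display name, and the chosen built-in role are assumptions.

```hcl
# Hypothetical input: resource ID of the AKS cluster to grant access to.
variable "aks_cluster_id" {
  type = string
}

# Look the group up by display name instead of hard-coding its object ID.
data "azuread_group" "developers" {
  display_name = "example-developers" # hypothetical group name
}

resource "azurerm_role_assignment" "aks_user" {
  # Narrow scope: the cluster itself, not the subscription or resource group.
  scope                = var.aks_cluster_id
  role_definition_name = "Azure Kubernetes Service Cluster User Role"
  principal_id         = data.azuread_group.developers.object_id
}
```

Referencing the group by name keeps the code readable in review, while the data source resolves the object ID at plan time.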
In this repo, role assignment has already been done for various infra components like AKS and APIM. For AKS, please see this page for the RBAC recommendations.
When a request comes in, simply raise a PR that adds the respective group to the already existing component, as for AKS here, or to the environment-specific files.
If the infra component was created manually, we must still do the role assignment via code. We may use a data source to fetch the resource information and use it in the role assignment.
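For a manually created component, the pattern could look like the sketch below, which reads an existing API Management instance through a data source and assigns a role on it. The resource names, the group name, and the chosen built-in role are hypothetical.

```hcl
# Read the manually created resource instead of managing it in Terraform.
data "azurerm_api_management" "existing" {
  name                = "example-apim" # hypothetical name
  resource_group_name = "example-rg"   # hypothetical resource group
}

data "azuread_group" "api_readers" {
  display_name = "example-api-readers" # hypothetical group name
}

resource "azurerm_role_assignment" "apim_reader" {
  # Scope the assignment to the existing resource's ID.
  scope                = data.azurerm_api_management.existing.id
  role_definition_name = "API Management Service Reader Role"
  principal_id         = data.azuread_group.api_readers.object_id
}
```

Only the role assignment is tracked in state here; the API Management instance itself stays outside Terraform's management.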
Here is an example PR to grant API gateway access to ODL developers.