Azure Pipelines includes Microsoft-hosted agents as a managed service. If you can use these agents, I recommend you do so, as they provide the best managed experience.
However, there may be scenarios where you need to manage your own agents:
- Network access to your private resources e.g. geekzter/azure-aks
- Configuration can't be met with any of the hosted agents (e.g. Linux distribution, Windows version)
- Improve build times by caching artifacts
The first point is probably the most common reason to set up your own agents. With the advent of Private Link it is more common to deploy Azure Services so that they can only be accessed from a virtual network. This requires an agent hosting model that fits that constraint.
This repository provisions Virtual Network-integrated Azure Pipelines scale set agents and self-hosted agents, which can in turn build the VM images the agents themselves use.
Azure services used include:
- Bastion
- Compute Gallery
- Firewall
- NAT Gateway
- Pipelines
- Storage File Share
- Virtual Network
Tools used are:
- Azure CLI
- cloud-init
- Packer
- PowerShell
- Terraform
This repo will provision an Azure Virtual Machine Scale Set in a Virtual Network, along with an egress device (Firewall or NAT Gateway) and remote access (Bastion). A choice can be made between a NAT Gateway (optimizing for cost) or Azure Firewall (optimizing for control) via the `deploy_azure_firewall` Terraform variable. This choice also affects the extent to which resources are connected via Private Endpoints.
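As a minimal sketch, assuming variables are overridden via a `.auto.tfvars` file as described in the configuration section below, the egress choice could look like this:

```hcl
# Optimize for control: route egress through Azure Firewall
# (set to false to use the cheaper NAT Gateway instead)
deploy_azure_firewall = true
```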
To enable Virtual Network integrated image builds with build-image-isolated.yml, a separate Virtual Network (and resource group, optionally in a different subscription) is created for Packer to use. For the image build VMs themselves, yet another resource group is created. A policy is assigned to this resource group to prevent VM extension installation at image build time, which would render the image unusable (extensions should be installed at deploy time, not build time).
The build-image.yml pipeline uses the method and scripts described in the actions/runner-images GitHub repo to build a managed image with the same configuration Azure DevOps and GitHub Actions use for Microsoft-hosted agents and GitHub-hosted runners. The GenerateResourcesAndImage.ps1 script does the heavy lifting of building the managed image with Packer. This pipeline can run on Microsoft-hosted agents (the 'Azure Pipelines' pool).
In an enterprise setting you will have isolation requirements: no public endpoints, building inside a Virtual Network, protecting the identity used for the build, etc. To accommodate such requirements, the build-image-isolated.yml pipeline takes the Packer templates and provides the variables required to customize the VM the image is created from. This pipeline needs to run on a self-hosted agent, such as the scale set agents deployed by this repository.
Note that by building an image you are accepting licenses pertaining to the tools installed at software installation (i.e. build) time.
With the aforementioned image template created by actions/runner-images, or an Azure Marketplace image, you can make sure you're always on the latest version. Instead of patching VMs after deployment, an immutable infrastructure approach is taken: new versions of the image are built and rolled out.
Lifecycle steps are:
- A Virtual Machine Scale Set (VMSS) is created with the (at that time) latest version of an image
- Adding the VMSS as a scale set agent pool ensures the Azure Pipelines agent is installed
- When a pipeline job needs to be run, a VMSS instance is assigned to run the job
- When the pipeline completes, the VMSS instance is destroyed
- When the Virtual Machine Scale Set needs a new instance, it is created from the latest VM image version
The above ensures VM instances are kept up to date. How fast this happens is controlled by the minimum and maximum number of instances of the scale set agent pool (as configured in Azure DevOps).
To customize provisioning, see configuration.
The easiest method is to use a GitHub Codespace. Just create one from the Code menu or page; this creates a Codespace with prerequisites installed. Wait until Codespace preparation, including the post-create commands, has completed before starting a clean shell (pwsh).
If your prompt looks like this, post creation has not yet finished:
```
PS /workspaces/azure-pipeline-agents>
```
Instead, the terminal should look like this:
```
/workspaces/azure-pipeline-agents/scripts [master ≡]>
```
Follow the instructions shown in the terminal to provision infrastructure.
If you fork this repository on GitHub, you can define Codespaces secrets. These will be surfaced as environment variables with the same name. Defining secrets for `ARM_TENANT_ID` and `ARM_SUBSCRIPTION_ID` will make sure you target the right Azure subscription.
You can reconnect to disconnected terminal sessions using tmux. This blog post explains how that works. Just type `ct <terraform workspace>` to enter a tmux session with the Terraform workspace environment variable `TF_WORKSPACE` set. Type the same to get back into a previously disconnected session. This can be done up to the timeout configured in Codespaces.
If you set this up locally, make sure you have the following pre-requisites:
- Azure CLI
- PowerShell
- Terraform (to get it you can use tfenv on Linux & macOS, Homebrew on macOS, or Chocolatey on Windows)
Run:
```powershell
scripts/deploy.ps1 -Apply
```
This will also log into Azure and let you select a subscription in case `ARM_SUBSCRIPTION_ID` is not set.
This repo contains a pipeline that can be used for CI/CD. You'll need the Azure Pipelines Terraform Tasks extension installed. To be able to create Self-Hosted Agents, the 'Project Collection Build Service (org)' group needs to be given 'Administrator' permission to the Agent Pool, and 'Limit job authorization scope to current project for non-release pipelines' disabled. For this reason, it is recommended to have a dedicated project for this pipeline.
Self-hosted Agents are the predecessor to Scale Set Agents. They also provide the ability to run agents anywhere (including outside Azure). However, you have to manage the full lifecycle of each agent instance. I still include this approach as separate Terraform modules for Ubuntu & Windows. It involves installing the VM agent as described on this page for Linux.
Set Terraform variable `deploy_self_hosted` to `true` to provision self-hosted agents. You will also need to set `azdo_pat` and `azdo_org`.
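A minimal sketch of these variables in a `.auto.tfvars` file (the organization and PAT values below are placeholders, not real values):

```hcl
deploy_self_hosted = true
azdo_org           = "my-organization"          # placeholder: your Azure DevOps organization name
azdo_pat           = "<personal-access-token>"  # placeholder: a PAT that can manage agent pools
```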
Scale Set Agents leverage Azure Virtual Machine Scale Sets. The lifecycle of individual agents is managed by Azure DevOps, therefore I recommend Scale Set Agents over Self-hosted agents.
Set Terraform variable `deploy_azure_scale_set` to `true` to provision scale set agents.
The software in the scale set (I use Ubuntu only), is installed using cloud-init.
Note this also sets up some environment variables on the agent, e.g. `PIPELINE_DEMO_AGENT_VIRTUAL_NETWORK_ID`, that can be used in pipelines to set up a peering connection from the agent's Virtual Network (see example below).
Feature toggles are declared in `variables.tf` and can be overridden by creating a `.auto.tfvars` file (see `config.auto.tfvars.sample`) or by setting environment variables, e.g. `TF_VAR_deploy_self_hosted="true"`.
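For illustration, a minimal `.auto.tfvars` sketch combining a few of the toggles from the table below (values are examples, not defaults):

```hcl
deploy_azure_scale_set = true   # provision scale set agents
deploy_azure_firewall  = false  # use NAT Gateway instead of Azure Firewall for egress
deploy_azure_bastion   = true   # deploy managed bastion host
linux_tools            = true   # let cloud-init install tools (not for pre-baked images)
```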
| Terraform variable | Feature |
|---|---|
| `configure_azure_cidr_allow_rules` | Configure allow rules for IP ranges documented here. When enabled, traffic allowed by this rule will not have FQDNs shown in the logs. |
| `configure_azure_crl_oscp_rules` | Allow traffic to TLS recommended locations. This is plain HTTP (port 80) traffic used by Certificate Revocation List (CRL) download and/or Online Certificate Status Protocol (OCSP). |
| `configure_azure_wildcard_allow_rules` | Configure generic wildcard FQDN rules e.g. `*.blob.core.windows.net`. |
| `deploy_azure_bastion` | Deploy managed bastion host. |
| `deploy_azure_files_share` | Deploy SMB file share, mount it on agents and configure Pipeline Agent diagnostics (`_diag` directory) to use it. |
| `deploy_azure_firewall` | Use Azure Firewall instead of NAT Gateway for network egress traffic. This allows you to control outbound traffic e.g. by FQDN, as well as monitor it. Setting this value to `true` will also create private endpoints for storage used, Azure Monitor, etc. |
| `deploy_non_essential_azure_vm_extensions` | Deploy monitoring extensions. These extensions generate their own network traffic; this variable allows you to turn them off. |
| `deploy_azure_scale_set` | Deploy Scale Set agents. |
| `deploy_azure_self_hosted_vms` | Deploy Self-Hosted agent VMs. |
| `deploy_azdo_self_hosted_vm_agents` | Deploy Self-Hosted agent VM extensions. |
| `linux_tools` | Uses cloud-init to install tools (e.g. AzCopy, Packer, PowerShell, PowerShell Azure modules). Should not be used with a pre-baked image. |
| `azure_linux_os_image_id` | Use a pre-baked image by specifying the resource id of a VM image, e.g. `/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/Shared/providers/Microsoft.Compute/galleries/SharedImages/images/Ubuntu2204/versions/latest`. |
| `azure_log_analytics_workspace_id` | Providing the id of an existing Log Analytics workspace allows you to retain logs after the infrastructure is destroyed. |
| `azure_windows_os_image_id` | Use a pre-baked image by specifying the resource id of a VM image, e.g. `/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/Shared/providers/Microsoft.Compute/galleries/SharedImages/images/Windows2022/versions/latest`. |
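When using a pre-baked image, the image-related variables above could be combined as in this sketch (the subscription id and gallery path are placeholders taken from the example in the table):

```hcl
# Use a pre-baked Linux image from a Compute Gallery instead of cloud-init provisioning
azure_linux_os_image_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/Shared/providers/Microsoft.Compute/galleries/SharedImages/images/Ubuntu2204/versions/latest"
linux_tools             = false # tools are already baked into the image
```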
This yaml snippet shows how to reference the scale set pool and use the environment variables set by the agent:
```yaml
pool:
  name: 'Scale Set Agents 1' # Name of the Scale Set Agent Pool you created

steps:
- pwsh: |
    # Use pipeline agent virtual network as VNet to peer from
    $env:TF_VAR_peer_network_id = $env:PIPELINE_DEMO_AGENT_VIRTUAL_NETWORK_ID
    # Terraform will use $env:PIPELINE_DEMO_AGENT_VIRTUAL_NETWORK_ID as value for input variable 'peer_network_id'
    # Create on-demand peering... (e.g. https://github.com/geekzter/azure-aks)
```
If you are using Azure Firewall and find things are failing, you can monitor allowed & blocked traffic with Log Analytics queries, e.g.:
```kusto
AzureDiagnostics
| where Category == "AzureFirewallApplicationRule" or Category == "AzureFirewallNetworkRule"
| where msg_s contains "Deny"
| order by TimeGenerated desc
| project TimeGenerated, msg_s
```
For more elaborate queries, check the kusto directory.