ADR - Infrastructure-as-Code Sharing Approach #104

Open
wants to merge 5 commits into
base: master
113 changes: 113 additions & 0 deletions docs/adr/0009-infrastructure-as-code-sharing-approach.md
@@ -0,0 +1,113 @@
# infrastructure-as-code-sharing-approach

## Status

Proposed

## Context

* The current version of Marain uses ARM templates to automate the infrastructure deployment for each Marain service.
* The deployment templates reside alongside the code for each service
* There is significant commonality/duplication of ARM templates between services
* We wish to migrate to using Bicep to define the Marain infrastructure deployments
* We want to minimise duplication by using shared Bicep modules
* Ideally, the overall Marain deployment process should be as low-friction as the current "'git clone' & deploy" approach

Given this context and these goals, the challenge is how to combine shared Bicep modules (to de-duplicate the infrastructure-as-code) with a low-friction means of deploying the Marain stack. The absence of a de facto public Bicep module repository that can host the Marain-specific modules means there isn't an obvious way to share these modules without additional work.

***NOTE**: A more general, but related, Bicep reuse [ADR](https://github.com/endjin/Endjin.RecommendedPractices.Bicep/blob/main/docs/adr/0002-sharing-bicep-modules.md) was written before it became clear that the public Bicep registry would not be 'open to all', for publishing, in the same way that nuget.org or Docker Hub are.*

In an ideal world, the Marain deployment would use Bicep modules from the [Endjin.RecommendedPractices.Bicep repository](https://github.com/endjin/Endjin.RecommendedPractices.Bicep) to avoid duplication, together with a set of Marain-specific modules that are reusable across different Marain services. The envisioned deployment process would therefore need to interact with three sets of Bicep modules:

1. Shared/Common - non-Marain specific functionality
1. Marain shared - functionality common to multiple Marain services
1. Marain service-specific - functionality tailored to an individual Marain service

This ADR considers the following options:

1. Hosting a public Bicep registry
1. Supporting internal Bicep registries
1. Shipping pre-compiled ARM templates

## Options

### Hosting a public Bicep registry

This option involves us hosting an Azure Container Registry that we make publicly available and publishing the Marain Bicep modules to it as part of the overall Marain release process.
Contributor

Is this an officially supported use of an ACR?

I don't know what Bicep's relationship with a registry looks like. Here are some things I don't know:

  • is Bicep designed with the intention that it will be used with a mixture of "the public Bicep registry" referred to earlier and other registries (such as a private one, or a non-Microsoft public one)? E.g., does its configuration fully support an idea of "here are all the registries we're using" or do you end up doing something more hacky like pointing it to a registry which somehow defers back to the public registry (e.g., as you can do with Azure DevOps 'artifact feeds')?
  • is it just making simple HTTP requests based on a presumed URL structure? (E.g., could you in principle use a blob store as a registry?) Or are there particular things it requires of a repository?
  • what are the expectations around auth when the bicep tools talk to a registry? might these change over time?

In particular, I'm thinking about the fact that when NuGet first launched, it really did treat the repo as just a file store, so you could use pretty much any web server as a package server, but that's not exactly true any more.

So I'm wondering: does an ACR work as a Bicep store just because it happens to resemble a generic static HTTP service enough to keep bicep happy? (Bicep is not Docker. A Bicep module is not a container image. So it's not obvious to me that a container registry should necessarily work as a Bicep registry.) And if so, might Bicep's expectations change in the future in a way that means this won't always be true?

Contributor Author

Good questions:


#### Advantages
* Simplest and lowest friction option for the Marain consumer.
* Easy to integrate with our existing release process.
* Some flexibility for where the source for the shared Bicep modules is stored, as they won't be primarily consumed as 'source' artefacts.
Contributor

> they won't be primarily consumed as 'source' artefacts.

I don't understand what this means

Contributor Author

I've pushed up some changes that hopefully describe this better.

* Since the deployment tooling can reference a public distribution point, the modules would not have to be available alongside the source repository in order to support the "'git clone' & deploy" approach
* The ACR could also be used to release & host Marain container images in the future.

#### Disadvantages
* Costs associated with operating the ACR, both resource and bandwidth costs
  * Basic service: ~£20 per month (based on ACR Standard SKU)
  * Resilient service: ~£85 per month (based on ACR Premium SKU with geo-replication)
* The Marain deployment process becomes dependent on network connectivity to the public registry - although this is arguably no different to the current requirement for access to GitHub (needed to download release artefacts).


### Supporting internal Bicep registries
Contributor

Has any thought been given to the kinds of deployment topologies in which everything that can be locked off on a restrictive VNET has been?

There at least are a couple of dimensions to that:

  • bicep registries accessible only via a VNET
  • deployment mechanisms running on agents that are on a VNET in which outbound requests are restricted to prevent exfiltration attacks

I don't know enough about how multi-module bicep deployments work to know what's viable here. (I believe that with nested classic ARM deployments, Azure needs to be able to access the storage account containing the child templates, which imposes some limitations on the extent to which you can lock things down. I don't know if that also applies to Bicep; I believe Bicep has a mode in which you can pre-compile to an ARM template, so I can imagine it might be possible to run tools on a build agent that does have access to a locked-down Bicep store and which fetches all the modules and builds them into one huge monolithic ARM template, which would mean Azure itself wouldn't need access to the stores containing the modules. But while that seems conceivable, I have no idea whether you can actually do it in practice.)

Contributor Author

In either scenario it's always the system invoking the deployment that needs to have access to any module registries - the ARM runtime still only deals with JSON-based templates.

As per your suggestion, what we typically do now is generate the ARM template as part of the build process and then treat it as an artefact to feed into the deployment.


This option assumes that a Marain consumer will have access to an Azure Container Registry (ACR) that can be populated with the required Bicep modules.

* We would provide tooling to provision the basic ACR infrastructure and/or publish the required Bicep modules.
* The deployment tooling would need to support customising the ACR details.
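
As a rough sketch of what such tooling might involve, the Python below builds the `az bicep publish` invocations needed to populate a consumer-supplied registry. The registry name, the module file paths and the `bicep/marain` repository prefix are all hypothetical and are not defined by Marain; only the `az bicep publish --file ... --target br:...` command shape comes from the Azure CLI.

```python
from pathlib import PurePosixPath


def publish_commands(registry, version, module_files, prefix="bicep/marain"):
    """Build one 'az bicep publish' argv per Bicep module file.

    'registry' is the consumer's ACR login server. 'prefix' is a
    hypothetical repository layout used only for illustration.
    """
    commands = []
    for module_file in module_files:
        # 'modules/storage.bicep' -> 'storage'
        name = PurePosixPath(module_file).stem
        target = f"br:{registry}/{prefix}/{name}:{version}"
        commands.append(
            ["az", "bicep", "publish", "--file", module_file, "--target", target]
        )
    return commands
```

Because the registry is a parameter rather than a constant, the same function could serve both a maintainer-hosted registry and a consumer's internal one. The returned lists could be passed to `subprocess.run` on a machine with the Azure CLI installed and signed in.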

#### Advantages
* No ongoing hosting costs for the Marain maintainers.
* Provides consumers with an internalised deployment solution.

#### Disadvantages
* Additional steps/friction for the consumer before they can deploy Marain.
* Additional effort required to provide the tooling to provision/populate the Bicep module registry.
* Requires a method of bundling the required modules to support the process of populating the internal registry.
  * The need to bundle may conflict with the desire to keep the Bicep source alongside the associated service.
  * The above may lead us to using Git submodules


### Shipping pre-compiled ARM templates

This option relies on using ARM templates for the deployment, whilst using Bicep for their development.

The release process for each service would include generating an ARM template from its Bicep source files, which the deployment tooling would use.
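
A minimal sketch of that release step, assuming the Azure CLI is available on the build agent: `az bicep build` compiles a Bicep file, including any modules it references, into a single ARM JSON template. The file paths in the usage note are hypothetical.

```python
import subprocess


def build_arm_template_argv(bicep_file, outfile):
    """Return the Azure CLI argv that compiles a Bicep file to ARM JSON."""
    return ["az", "bicep", "build", "--file", bicep_file, "--outfile", outfile]


def build_arm_template(bicep_file, outfile):
    """Run the build.

    Requires the Azure CLI, plus access to any module registries
    that the Bicep file references.
    """
    subprocess.run(build_arm_template_argv(bicep_file, outfile), check=True)
```

For example, `build_arm_template("deploy/main.bicep", "artefacts/main.json")` would produce the template that the deployment tooling then consumes. Only the build agent needs registry access; whoever deploys the generated JSON does not.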

#### Advantages

* No ongoing hosting costs for the Marain maintainers or consumers.
* The required Bicep modules need only be accessible during the build/release process, so no bundling required.

#### Disadvantages

* The Bicep source files in the various repos would not be useable by someone cloning the repo (without access to a populated module registry).
* The generated ARM templates would potentially need to be committed to the git repo as part of the build/release process to ensure they were available to someone cloning the repo (alternatively they could be treated as release artefacts like the current ZIP deployment packages).
* Marain consumers needing to troubleshoot a deployment issue would have to debug a single large ARM template.


## Decision

The following hybrid approach will be taken:

* A monolithic generated ARM template will be provided as a release artefact for each service, to facilitate OSS deployment scenarios
* Shared Bicep modules will be hosted on a non-public ACR
  * Source references to such modules will use an [alias](https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/azure-resource-manager/bicep/bicep-config-modules.md#aliases-for-modules) to de-couple them from the hosted ACR infrastructure
  * The hosted ACR will adopt a repository structure that will support logical grouping of modules with different security requirements (e.g. public, restricted, private etc.)
  * 3rd parties entitled to access the hosted modules will do so via ACR [repository-scoped](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-repository-scoped-permissions) access tokens
* OSS repositories that consume these hosted modules should contain guidance on:
  * Where to find the source code for these modules
  * How to request access to our 'convenience' hosted Bicep module registry (exact process and requirements TBC)
* Closed-source scenarios that require access to these hosted modules will be handled as part of the underlying engagement or procured service. For example:
  * Access to hosted ACR provided by default
  * Access to underlying private GitHub repos provided on request
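
To illustrate how an alias de-couples source references from the registry, the sketch below generates a `bicepconfig.json` document with a `moduleAliases` entry and shows how an aliased reference expands. The alias name `MarainModules`, the registry `example.azurecr.io` and the module path are hypothetical. The `resolve` function only mimics the expansion that the Bicep tooling performs itself and is here for explanation, not for use in a deployment.

```python
import json


def bicep_config(alias, registry, module_path=None):
    """Build a bicepconfig.json document defining one 'br' module alias."""
    entry = {"registry": registry}
    if module_path:
        entry["modulePath"] = module_path
    return {"moduleAliases": {"br": {alias: entry}}}


def resolve(reference, config):
    """Expand 'br/<alias>:<path>:<tag>' to 'br:<registry>/<path>:<tag>'.

    Illustrative only: Bicep performs this expansion internally.
    """
    scheme_and_alias, rest = reference.split(":", 1)
    alias = scheme_and_alias.split("/", 1)[1]
    entry = config["moduleAliases"]["br"][alias]
    base = entry["registry"]
    if "modulePath" in entry:
        base += "/" + entry["modulePath"]
    return f"br:{base}/{rest}"


config = bicep_config("MarainModules", "example.azurecr.io", "bicep/marain")
print(json.dumps(config, indent=2))
print(resolve("br/MarainModules:storage:1.0.0", config))
```

With this arrangement a Bicep file would reference `br/MarainModules:storage:1.0.0`, and moving the modules to a different registry would only require changing `bicepconfig.json`.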

## Consequences

### Positive
* Everything will work the same as currently developed (i.e. referencing our internal ACR)
* Minimises upfront work to support what may be a lightly used scenario (i.e. unknown parties using our OSS IP), by not needing to build a complex re-packaging solution
* Only requires a single container registry to support external access as there is no need for a truly public/anonymous ACR - this minimises costs.

### Negative
* Requires assigning each repository to a [scope map](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-repository-scoped-permissions#concepts) as ACR does not currently support a wildcard-style approach. This would likely be added to the Bicep module CI/CD process.
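
A hedged sketch of how the CI/CD process might handle this: since every repository has to be named explicitly, the pipeline could generate the `az acr scope-map create` arguments from its list of published modules. The registry, scope map and repository names below are hypothetical, and `content/read` is assumed to be the only action that consumers pulling modules need.

```python
def scope_map_create_argv(registry, scope_map, repositories, actions=("content/read",)):
    """Build an 'az acr scope-map create' argv naming every repository.

    The ADR notes that scope maps offer no wildcard matching, so each
    module repository gets its own '--repository <name> <action>...' entry.
    """
    argv = [
        "az", "acr", "scope-map", "create",
        "--name", scope_map,
        "--registry", registry,
    ]
    for repository in sorted(repositories):
        argv += ["--repository", repository, *actions]
    return argv
```

A token bound to the resulting scope map could then be issued to a third party, which limits that party to the listed repositories.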