This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Proposal: Distribute Bicep providers using OCI registry #10662

Closed
asilverman opened this issue May 10, 2023 · 0 comments · Fixed by #10624 or #10868
Assignees
Labels
design approved The team has reviewed and signed off on this design proposal story: dynamic type loading Collects all work items related to decoupling of Bicep types from compiler
Milestone

Comments

@asilverman
Contributor

Summary

TL;DR: We propose to change the means of distribution of Bicep provider types from NuGet (static compilation dependency) to OCI registry (dynamic loading).

Motivation

The current implementation of Bicep providers, which is based on NuGet packages, has the following limitations:

  1. Bicep customers must wait for a Bicep release before they can consume Azure service provider features made available in new api-versions.

  2. When a customer upgrades to the latest version of the Bicep compiler, Bicep files that previously compiled may fail to compile because of linter or compilation checks added in the newer version.

As a consequence, Bicep customers that pin their Bicep compiler to a specific version can't consume new resource api-versions or benefit from bug fixes and new features without a significant investment in upgrading their Bicep compiler version.

Because the az provider definitions are generated, the Bicep team must update them and release a new version of the az provider NuGet package whenever a new api-version is released for an Azure resource. If there are defects in the generator or the generated code, the Bicep team must release a new version of Bicep to fix the issue. This degrades the customer experience because of the added time and effort a customer must invest to consume the new Bicep release.

  • As a Bicep user, I want to use the latest api-versions of Azure resources without having to upgrade my Bicep compiler

  • As a Bicep user, I want resource definitions to be immutable so that I control what definitions are used by the Bicep compiler during the compilation process

Design

Current State

Currently, az resource type definitions are serialized into types.json in a cron job that's run on a weekly basis.

The current implementation based on NuGet packages uses the following workflow:

  1. On a weekly basis, a cron job is triggered to generate the az provider types from the azure-rest-api-specs definitions. The generated artifacts are JSON blobs conforming to the types.json serialization format. The JSON files rely on a directory structure and are kept under a folder named generated (hereafter referred to as the generated-package). The cron job publishes the updated contents to a branch named autogenerated in the azure/bicep-types-az repository.
  2. On a manual basis, the Bicep team reviews the PR and merges the autogenerated branch into the main branch.
  3. On each Bicep release, the Bicep team starts a manual workflow to publish the az provider NuGet package to NuGet.org from a mirror repository (BicepMirror-Types-Az) in ADO.
  4. On each Bicep release, the Bicep team must update the version of the NuGet package used in the Bicep compiler to match the latest az provider NuGet package.

A similar workflow is used for the kubernetes extensibility provider.

Next, we describe how the provider resource type data is loaded into the Bicep compiler.

  1. The generated-package is turned into an embedded resource for Azure.Bicep.Types.Az.dll. This library implements a single class, AzTypeLoader, that extends the abstract class TypeLoader from a different NuGet package, Azure.Bicep.Types. TypeLoader knows how to deserialize types.json into runtime objects.
  2. The Bicep compiler consumes Azure.Bicep.Types NuGet to deserialize the embedded resource into runtime objects that it processes into internal data structures that are used during the compilation process.
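
The layering described in the two steps above can be sketched as follows. This is an illustrative Python analogue, not the actual C# code: the class name `TypeLoader` comes from the text, but `EmbeddedTypeLoader`, the method names `read_file` and `load_json`, and the in-memory dictionary standing in for embedded resources are assumptions made for this example.

```python
import json
from abc import ABC, abstractmethod


class TypeLoader(ABC):
    """Knows how to deserialize JSON type files; subclasses supply the bytes."""

    @abstractmethod
    def read_file(self, relative_path: str) -> bytes:
        """Return the raw bytes of one file inside the generated-package."""

    def load_json(self, relative_path: str):
        # Deserialization is shared; only the byte source differs per subclass.
        return json.loads(self.read_file(relative_path).decode("utf-8"))


class EmbeddedTypeLoader(TypeLoader):
    """Hypothetical stand-in for AzTypeLoader: serves files bundled with the library."""

    def __init__(self, embedded_files: dict[str, bytes]):
        self._files = embedded_files

    def read_file(self, relative_path: str) -> bytes:
        return self._files[relative_path]


# Placeholder content; the real index.json schema is not reproduced here.
loader = EmbeddedTypeLoader({"index.json": b'{"placeholder": true}'})
index = loader.load_json("index.json")
```

The point of the split is that a loader backed by the OCI cache would only need to provide a different `read_file`, leaving the deserialization logic untouched.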

Proposed State

We propose replacing the NuGet package with an OCI manifest. The OCI registry will contain the same artifacts as the generated-package. The artifacts will be immutable and will be published on a weekly basis by a similar cron job in azure/bicep-types-az. The Bicep compiler will be updated to load the provider resource type data from the OCI registry instead of the NuGet package.

The OCI artifact will be a manifest that conforms to the following structure:

{
	"schemaVersion": 2,
	"mediaType":  "application/vnd.ms.bicep.provider.v1+json",
	"manifests": [
		// The type definitions gzipped. Follows the file structure of current types.json definitions,
		// that is, a top-level index.json and index.md that point to nested types.json and types.md
		// definitions relative to that top level directory.
		// 
		// This is like the current file structure found in https://github.com/Azure/bicep-types-az/tree/main/generated
		{
			// The mediaType is a custom media type that we define so that we can distinguish between versions of the serialization format.
			// We can evolve the serialization format and its version, and the compiler will be able to maintain backwards compatibility
			// through conditional logic based on the version of the serialization format.
			"mediaType": "application/vnd.ms.bicep.provider.types.v1.tar+gzip",
			"digest": "sha256:9834876dcfb05cb167a5c24953eba58c4ac89b1adf57f28f2f9d09af107ee8f0",
			"size": 32654,
			"annotations": {}
		}
	],
	"annotations": {}
}

The cron job will publish the az manifest to the repository mcr.microsoft.com/bicep/providers/az, hosted by the Microsoft Container Registry (MCR). The Bicep compiler will be updated to generate the ByteStream from the OCI manifest cache instead of the NuGet package.
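
As a rough illustration of how the entry in the manifest could be produced, the Python sketch below packs a set of files into a tar+gzip blob and derives the `digest` and `size` fields from it. The two media type strings are taken from the structure shown above; the function names (`pack_types`, `describe_blob`, `build_manifest`) and the placeholder file content are assumptions for this example and don't describe the actual publishing job.

```python
import hashlib
import io
import tarfile

MANIFEST_MEDIA_TYPE = "application/vnd.ms.bicep.provider.v1+json"
TYPES_MEDIA_TYPE = "application/vnd.ms.bicep.provider.types.v1.tar+gzip"


def pack_types(files: dict[str, bytes]) -> bytes:
    """Pack files (relative path -> content) into an in-memory tar+gzip blob."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path in sorted(files):
            info = tarfile.TarInfo(name=path)
            info.size = len(files[path])
            archive.addfile(info, io.BytesIO(files[path]))
    return buffer.getvalue()


def describe_blob(blob: bytes) -> dict:
    """Build the descriptor entry: media type, content digest, and byte size."""
    return {
        "mediaType": TYPES_MEDIA_TYPE,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
        "annotations": {},
    }


def build_manifest(blob: bytes) -> dict:
    """Wrap the descriptor in the manifest structure shown above."""
    return {
        "schemaVersion": 2,
        "mediaType": MANIFEST_MEDIA_TYPE,
        "manifests": [describe_blob(blob)],
        "annotations": {},
    }


# Placeholder content; a real package holds index.json plus nested types.json files.
blob = pack_types({"index.json": b"{}"})
manifest = build_manifest(blob)
```

Because the digest is computed over the blob's bytes, a client can verify a pulled artifact against the manifest before expanding it into the cache.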

Implementation Details

  • Loading provider data from the cache

The current implementation for Bicep modules loads module data from a filesystem cache under ~/.bicep. We propose to extend this logic to load provider manifests from the same cache. The CreateCompilation method of the BicepCompiler class is the entry point for this new logic. The method manipulates a sourceFileGrouping object that traverses the Bicep files describing the deployment, inspects their syntax for module declarations, and loads the data from the filesystem when a module declaration is found. We propose to extend this process to also inspect the syntax for provider declarations and restore their data from the filesystem cache.

  • Loading provider data from the network

The current implementation for modules uses a class OciModuleRegistry to load module data from the network when there is a cache miss. We propose to rename and generalize this class to support loading provider data as well. The caching strategy for provider data will align with the current implementation for modules: provider versions that aren't in the cache will be restored on demand during bicep build processing.
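
The cache-first behavior described in the two items above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the OciModuleRegistry logic: `restore_provider`, the `fetch` callback that stands in for the registry pull, and the `types.tgz` file name are all hypothetical.

```python
from pathlib import Path
from typing import Callable


def restore_provider(cache_dir: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return the provider artifact, pulling from the registry only on a cache miss."""
    artifact = cache_dir / "types.tgz"  # hypothetical file name
    if artifact.is_file():
        # Cache hit: no network access is needed.
        return artifact.read_bytes()
    # Cache miss: pull from the registry, then store the result for later builds.
    data = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    artifact.write_bytes(data)
    return data
```

With this shape, only the first build that references a given provider version pays the network cost; subsequent builds read from disk.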

  • Embedded az provider

To maintain backwards compatibility with the current implementation, we propose to ship the Bicep compiler with an embedded version of the provider manifests for az and kubernetes. The embedded providers will be expanded into the filesystem cache and will be used when the user doesn't specify an az provider in the import directive.

Sample expanded folder structure in the local Bicep cache for az provider:

$USERPROFILE\.BICEP
└───br
    └───mcr.microsoft.com
        └───providers$az
            └───1.0.0$
                │   lock
                │   manifest
                │   index.json
                │   index.md
                └───resources
                    │   log.out
                    │
                    └───microsoft.resources
                        ├───2022-11-01-preview
                        │       types.json
                        │       types.md
                        └─── // some other API version
                    // ... more nested directories ...
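
To make the layout above concrete, here is a small Python sketch of how an OCI reference might map to a cache directory. The `$` separator is inferred from the folder names shown above; the function name `provider_cache_dir` and the exact escaping rules (including the trailing `$` on the version folder, which this sketch omits) are assumptions.

```python
from pathlib import PurePosixPath


def provider_cache_dir(cache_root: str, registry: str, repository: str, version: str) -> PurePosixPath:
    """Map an OCI reference to a directory under the local Bicep cache."""
    # Assumption: '/' in the repository path is flattened to '$' so that the
    # whole repository becomes a single folder name.
    flat_repository = repository.replace("/", "$")
    return PurePosixPath(cache_root) / "br" / registry / flat_repository / version


path = provider_cache_dir("~/.bicep", "mcr.microsoft.com", "bicep/providers/az", "1.0.0")
print(path)  # ~/.bicep/br/mcr.microsoft.com/bicep$providers$az/1.0.0
```

Flattening the repository into one folder keeps every provider version at a fixed depth below the registry folder, which makes cache lookups a simple path check.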

Scalability or cost concerns with using ACR to store provider data

  • The az provider generated-package is ~150MB uncompressed; after applying tar+gzip, the compressed size is ~20MB
  • The generated-package would be hosted by MCR and mirrored to AirGapped and Government clouds using syndication in the same way that modules are made available in these clouds
  • We expect customers to pull the provider data once and restore it from the cache in subsequent builds, so the cost of pulling the data from the network should stay low. We estimate ~500 USD a year for hosting the az provider data in ACR with geo-replication (see here for costs).

Out of scope

- Support for "un-importing" of providers (e.g. removing symbols from the implicitly imported az provider)

The current proposal is based on the provider model introduced by the extensibility providers feature. It uses the import gesture to load symbols of a provider into the current file context. This provider mechanism is also used for Microsoft resources, which constitute a special case of a provider known as the az provider.

Encapsulating az in its own provider presents its own challenges. In the current implementation of Bicep, symbols such as targetScope, resourceId(), environment(), reference() and others are hardcoded into the global scope of the Bicep compiler ([see here](https://github.com/Azure/bicep/blob/main/src/Bicep.Core/Semantics/Namespaces/AzNamespaceType.cs#L288-L443)). Other implications of encapsulating these relate to the ability to express them in types.json (see the next bullet). Hence, we defer the ability to "un-import" symbols from the az provider until a separate work item to encapsulate az is implemented.

- Serialization of provider scoped functions (e.g. resourceId(), environment(), reference()) in types.json

It's conceivable for providers to expose provider-scoped functions similar to the functions above; however, the current serialization format (types.json) doesn't support encoding such symbols. The current implementation hardcodes az provider functions in the Bicep compiler. The current serialization format must be enhanced to allow encoding provider function signatures before we can dynamically load them from a provider definition. This will be handled in a separate proposal.

- Aliasing or configuring aliases for provider registries in the bicepconfig.json

In this iteration we restrict the import gesture to fetch providers from OCI-compliant registries only. In a separate proposal, we plan to enhance the importing mechanism to use registry aliasing (similar to Bicep module handling) so users can load provider definitions from offline sources such as disk.

- Support for third party extensibility providers

The design proposal can be extended to support third party extensibility providers that are composed from type definitions as well as a server-side component. That said, there is more work necessary to define exactly how the server-side component of an extensibility provider is packaged, how it's hosted and other details that should be elaborated and described in full as independent capabilities of the system. For this proposal we will defer this conversation and manage it as an incremental feature on top of this proposal.

- Signing of provider manifests

The current proposal doesn't address the signing of provider manifests. We expect to add signed manifests as an incremental feature that uses notary. We plan to address this in a separate proposal.

- Automated publishing of provider manifests to a registry

The current proposal doesn't address the automated publishing of provider manifests to a registry. We plan to manually publish the az provider manifest to the MCR registry as part of the initial implementation. We will address a process to publish manifests as a separate proposal in the future.

Advantages of this proposal

  • Implements the use-cases described above
  • Leverages pre-existing infrastructure related to modules
  • Is extensible to support the server-side components of third-party extensibility providers
  • Decouples the serialization format of provider data from the Bicep compiler and enables the evolution to other serialization formats in the future

Open Questions

@github-project-automation github-project-automation bot moved this to Todo in Bicep May 10, 2023
@ghost ghost added the Needs: Triage 🔍 label May 10, 2023
@asilverman asilverman moved this from Todo to In Review in Bicep May 16, 2023
@asilverman asilverman self-assigned this May 16, 2023
@stephaniezyen stephaniezyen added this to the v0.18 milestone May 17, 2023
@asilverman asilverman added proposal design approved The team has reviewed and signed off on this design labels May 17, 2023
@github-project-automation github-project-automation bot moved this from In Review to Done in Bicep May 25, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 25, 2023
asilverman added a commit that referenced this issue Jul 20, 2023
## Overview
Adds support for loading the 'az' provider dynamically sourced from an
OCI artifact registry. The artifact must be restored (downloaded to the
appropriate location) manually under

`$BicepCacheRootDir/br/mcr.microsoft.com/bicep$providers$az/${providerVersion}`
which is the target location in the Bicep cache for the `az` provider.

The work to restore the provider data to the cache will be handled in a
separate PR for convenience so that reviewing PRs is easier.

To enable the feature, `dynamicTypeLoadingEnabled` must be set to
true in `bicepconfig.json`:
```json
{
    "experimentalFeaturesEnabled": {
        "extensibility": true,
        "dynamicTypeLoadingEnabled": true
    },
    "cacheRootDirectory": "~/.bicep"
}
```
## Changes
- Adds a new experimental feature flag `DynamicTypeLoadingEnabled`
- Adds a new factory `AzResourceTypeLoader` that decides which provider
loader to use based on the feature flag and the presence of an import
declaration syntax in the Bicep file being processed.
- Serializes the provider version by inspecting the
`ImportDeclarationSyntax` vs using a hardcoded value

Fixes #10662

## Contributing a feature

* [x] I have opened a new issue for the proposal, or commented on an
existing one, and ensured that the Bicep maintainers are good with the
design of the feature being implemented
* [x] I have included "Fixes #{issue_number}" in the PR description, so
GitHub can link to the issue and close it when the PR is merged
* [x] I have appropriate test coverage of my new feature

---------

Co-authored-by: Ariel Silverman <[email protected]>
@asilverman asilverman added the story: dynamic type loading Collects all work items related to decoupling of Bicep types from compiler label Jul 27, 2023
@asilverman asilverman converted this issue into discussion #12111 Oct 9, 2023
