Refactor: Separate Pipeline Management into Serverless Application (#424)

* Refactor: Pipeline Management State Machine

Separate Pipeline Management out of CodePipeline and remove
the major bottleneck of building each deployment map sequentially.

Starts to isolate the CodeCommit repository and moves the source of
deployment maps into an S3 bucket.

Sets the groundwork for future refactoring of Pipeline Management and
moves towards enabling decentralised deployment maps.

Authored-by: Stewart Wallace

* Updating documentation

* Fixing yamllint errors

* Fixing final tox issues

* Update docs/technical-guide.md

Co-authored-by: Simon Kok

* Fixing some issues with testing

* Enabling create repo based on config value

* Code Review Suggestions

* Copy Paste Error

* removing packaged.yml and fixing trailing spaces

* Missing default scm config value

* Apply suggestions from code review

Co-authored-by: Simon Kok

* Apply suggestions from code review

Co-authored-by: Simon Kok

* Changing permissions of src/lambda_codebase/initial_commit/bootstrap_repository/adf-bootstrap/deployment/global.yml

* Passing through log level param

* Restricting IAM permissions to prefix

* Updating Function names

Adding ADF Prefix to make it clearer that they are part of the ADF ecosystem

Co-authored-by: Simon Kok

* Apply suggestions from code review

* Code Review Suggestions

* removing condition

* Add wave generation test from #484

Co-authored-by: Simon Kok

* Make use of Temp.Directory to fetch the deployment map files

Co-authored-by: Simon Kok
StewartW authored Aug 26, 2022
1 parent b9d045a commit 0032fab
Showing 31 changed files with 1,823 additions and 114 deletions.
Binary file removed docs/images/TechnicalGuide-BootstrapRepo.png
22 changes: 19 additions & 3 deletions docs/technical-guide.md
@@ -1,8 +1,24 @@
## Technical Guide
### Introduction
# Technical Guide
## Introduction
This document is intended to give insight into how the AWS Deployment Framework works under the hood.

### High Level Overview - AWS Deployment Framework Bootstrap Repository
## High Level Overview - AWS Deployment Framework Bootstrap Repository
The AWS Deployment Framework Bootstrap Repository aka "Bootstrap Repo" is where the source code used by ADF lives. The bootstrap repo is also where your accounts, OU layout and base templates are defined.
The flow below is a high level overview of what happens when a change is committed to this repository.
![bootstrap-repo-overview](./images/TechnicalGuide-BootstrapRepo.png)

### Account Management State Machine
The Account Management State Machine is triggered by S3 PUT events to the ADF Accounts bucket.
Below is a diagram detailing the components of the standard state machine. This state machine is defined in `src/account_processing.yml` and the Lambda function code is located in `src/lambda_codebase/account_processing`.
![account-management-state-machine](./images/TechnicalGuide-AccountManagementStateMachine.drawio.png)
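
To make the trigger concrete, the sketch below shows one common way an S3 PUT notification can be forwarded to a Step Functions state machine. This is an illustration of the mechanism only, not necessarily ADF's actual wiring; the environment variable name and the execution input payload are assumptions.

```python
# Illustrative sketch only: forward an S3 PUT event to a Step Functions
# state machine. The environment variable name and input payload shape are
# assumptions, not ADF's actual implementation.
import json
import os

import boto3

SFN = boto3.client("stepfunctions")
STATE_MACHINE_ARN = os.environ["ACCOUNT_MANAGEMENT_STATEMACHINE_ARN"]  # hypothetical name


def lambda_handler(event, _context):
    """Start one state machine execution per account file uploaded to the bucket."""
    for record in event.get("Records", []):
        SFN.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps({
                "bucket": record["s3"]["bucket"]["name"],
                "key": record["s3"]["object"]["key"],
            }),
        )
```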


## High Level Overview - AWS Deployment Framework Pipeline Repository
The AWS Deployment Framework Pipeline Repository aka "Pipeline Repo" is where the deployment map definitions live. It typically exists in CodeCommit within your Deployment Account(s).
The diagram below details what happens when a commit is pushed to this repository.
![pipeline-repo-overview](./images/TechnicalGuide-PipelineRepo.drawio.png)

### Pipeline Management State Machine
The Pipeline Management State Machine is triggered by S3 PUT events to the ADF Pipelines bucket. This state machine is responsible for expanding the deployment map, resolving the targets, creating pipeline definitions (JSON objects that detail the source(s), the stages involved, and the targets), and then generating CDK stacks from those definitions.

It also covers the deletion of stale pipelines. A stale pipeline is any pipeline that has a definition but no longer appears in any deployment map.
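
For illustration, the sketch below shows a single, minimal pipeline entry in the shape the downstream Lambda functions in this change receive it (each handler takes one pipeline object as its event). Only fields that those functions read are included; real deployment maps support many more options, and the concrete values here are placeholders.

```python
# Minimal illustrative pipeline entry; field names mirror the lookups in the
# Lambda functions in this change (name, default_providers.source.properties,
# targets). Values are placeholders, not a complete deployment map schema.
pipeline = {
    "name": "sample-vpc",
    "description": "Sample VPC deployment pipeline",
    "default_providers": {
        "source": {
            "provider": "codecommit",
            "properties": {"account_id": "111111111111"},
        },
    },
    "targets": [
        {"path": ["/banking/testing"], "regions": ["eu-west-1"]},
    ],
}
```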
@@ -170,6 +170,25 @@ Resources:
BlockPublicPolicy: true
IgnorePublicAcls: true
RestrictPublicBuckets: true
PipelineManagementApplication:
Type: AWS::Serverless::Application
DeletionPolicy: Delete
UpdateReplacePolicy: Retain
Properties:
Location: pipeline_management.yml
Parameters:
LambdaLayer: !Ref LambdaLayerVersion
ADFVersion: !Ref ADFVersion
OrganizationID: !Ref OrganizationId
CrossAccountAccessRole: !Ref CrossAccountAccessRole
PipelineBucket: !Ref PipelineBucket
RootAccountId: !Ref MasterAccountId
CodeBuildImage: !Ref Image
CodeBuildComputeType: !Ref ComputeType
SharedModulesBucket: !Ref SharedModulesBucket
PipelinePrefix: !Ref PipelinePrefix
StackPrefix: !Ref StackPrefix
ADFLogLevel: !Ref ADFLogLevel

CodeCommitRole:
Type: AWS::IAM::Role
@@ -260,6 +279,8 @@ Resources:
- !Sub arn:${AWS::Partition}:s3:::${PipelineBucket}/*
- !Sub arn:${AWS::Partition}:s3:::${SharedModulesBucket}
- !Sub arn:${AWS::Partition}:s3:::${SharedModulesBucket}/*
- !Sub arn:${AWS::Partition}:s3:::${PipelineManagementApplication.Outputs.Bucket}
- !Sub arn:${AWS::Partition}:s3:::${PipelineManagementApplication.Outputs.Bucket}/*
- Effect: Allow
Sid: "KMS"
Action:
@@ -354,6 +375,8 @@ Resources:
- !Sub arn:${AWS::Partition}:s3:::${PipelineBucket}/*
- !Sub arn:${AWS::Partition}:s3:::${SharedModulesBucket}
- !Sub arn:${AWS::Partition}:s3:::${SharedModulesBucket}/*
- !Sub arn:${AWS::Partition}:s3:::${PipelineManagementApplication.Outputs.Bucket}
- !Sub arn:${AWS::Partition}:s3:::${PipelineManagementApplication.Outputs.Bucket}/*
- Effect: Allow
Sid: "KMS"
Action:
@@ -716,6 +739,8 @@ Resources:
Value: !Ref PipelineBucket
- Name: SHARED_MODULES_BUCKET
Value: !Ref SharedModulesBucket
- Name: ADF_PIPELINES_BUCKET
Value: !GetAtt PipelineManagementApplication.Outputs.Bucket
- Name: ADF_PIPELINE_PREFIX
Value: !Ref PipelinePrefix
- Name: ADF_STACK_PREFIX
@@ -744,23 +769,8 @@ Resources:
- pip install -r adf-build/requirements.txt -q -t ./adf-build
build:
commands:
- cdk --version
- >-
chmod 755
adf-build/cdk/execute_pipeline_stacks.py
adf-build/cdk/generate_pipeline_inputs.py
adf-build/cdk/generate_pipeline_stacks.py
adf-build/cdk/clean_pipelines.py
- python adf-build/cdk/generate_pipeline_inputs.py
- >-
cdk synth
--no-version-reporting
--app adf-build/cdk/generate_pipeline_stacks.py
1> /dev/null
- python adf-build/cdk/execute_pipeline_stacks.py
post_build:
commands:
- python adf-build/cdk/clean_pipelines.py
- aws s3 cp deployment_map.yml s3://$ADF_PIPELINES_BUCKET/deployment_map.yml
- aws s3 sync deployment_maps/* s3://$ADF_PIPELINES_BUCKET
ServiceRole: !GetAtt PipelineProvisionerCodeBuildRole.Arn
Tags:
- Key: "Name"
@@ -0,0 +1,60 @@
"""
Pipeline Management Lambda Function
Creates or Updates an Event Rule for forwarding events
If the source account != the Deployment account
"""

import os
import boto3

from cache import Cache
from rule import Rule
from logger import configure_logger
from cloudwatch import ADFMetrics


LOGGER = configure_logger(__name__)
DEPLOYMENT_ACCOUNT_REGION = os.environ["AWS_REGION"]
DEPLOYMENT_ACCOUNT_ID = os.environ["ACCOUNT_ID"]
PIPELINE_MANAGEMENT_STATEMACHINE = os.getenv("PIPELINE_MANAGEMENT_STATEMACHINE_ARN")
CLOUDWATCH = boto3.client("cloudwatch")
METRICS = ADFMetrics(CLOUDWATCH, "PIPELINE_MANAGEMENT/RULE")

_cache = None


def lambda_handler(pipeline, _):
"""Main Lambda Entry point"""

# pylint: disable=W0603
# Global variable here to cache across lambda execution runtimes.
global _cache
if not _cache:
_cache = Cache()
METRICS.put_metric_data(
{"MetricName": "CacheInitalised", "Value": 1, "Unit": "Count"}
)

LOGGER.info(pipeline)

_source_account_id = (
pipeline.get("default_providers", {})
.get("source", {})
.get("properties", {})
.get("account_id", {})
)
if (
_source_account_id
and int(_source_account_id) != int(DEPLOYMENT_ACCOUNT_ID)
and not _cache.check(_source_account_id)
):
rule = Rule(pipeline["default_providers"]["source"]["properties"]["account_id"])
rule.create_update()
_cache.add(
pipeline["default_providers"]["source"]["properties"]["account_id"], True
)
METRICS.put_metric_data(
{"MetricName": "CreateOrUpdate", "Value": 1, "Unit": "Count"}
)

return pipeline
@@ -0,0 +1,56 @@
"""
Pipeline Management Lambda Function
Creates or Updates a CodeCommit Repository
"""

import os
import boto3
from repo import Repo

from logger import configure_logger
from cloudwatch import ADFMetrics
from parameter_store import ParameterStore


CLOUDWATCH = boto3.client("cloudwatch")
METRICS = ADFMetrics(CLOUDWATCH, "PIPELINE_MANAGEMENT/REPO")
LOGGER = configure_logger(__name__)
DEPLOYMENT_ACCOUNT_REGION = os.environ["AWS_REGION"]
DEPLOYMENT_ACCOUNT_ID = os.environ["ACCOUNT_ID"]


def lambda_handler(pipeline, _):
"""Main Lambda Entry point"""
parameter_store = ParameterStore(DEPLOYMENT_ACCOUNT_REGION, boto3)
auto_create_repositories = parameter_store.fetch_parameter(
"auto_create_repositories"
)
LOGGER.info(auto_create_repositories)
if auto_create_repositories == "enabled":
code_account_id = (
pipeline.get("default_providers", {})
.get("source", {})
.get("properties", {})
.get("account_id", {})
)
has_custom_repo = (
pipeline.get("default_providers", {})
.get("source", {})
.get("properties", {})
.get("repository", {})
)
if (
auto_create_repositories
and code_account_id
and str(code_account_id).isdigit()
and not has_custom_repo
):
repo = Repo(
code_account_id, pipeline.get("name"), pipeline.get("description")
)
repo.create_update()
METRICS.put_metric_data(
{"MetricName": "CreateOrUpdate", "Value": 1, "Unit": "Count"}
)

return pipeline
@@ -0,0 +1,117 @@
"""
Pipeline Management Lambda Function
Generates Pipeline Inputs
"""

import os
import boto3

from pipeline import Pipeline
from target import Target, TargetStructure
from organizations import Organizations
from parameter_store import ParameterStore
from sts import STS
from logger import configure_logger
from partition import get_partition


LOGGER = configure_logger(__name__)
DEPLOYMENT_ACCOUNT_REGION = os.environ["AWS_REGION"]
DEPLOYMENT_ACCOUNT_ID = os.environ["ACCOUNT_ID"]
ROOT_ACCOUNT_ID = os.environ["ROOT_ACCOUNT_ID"]


def store_regional_parameter_config(pipeline, parameter_store):
"""
Responsible for storing the region information for specific
pipelines. These regions are defined in the deployment_map
either as top level regions for a pipeline or stage specific regions
"""
if pipeline.top_level_regions:
parameter_store.put_parameter(
f"/deployment/{pipeline.name}/regions",
str(list(set(pipeline.top_level_regions))),
)
return

parameter_store.put_parameter(
f"/deployment/{pipeline.name}/regions",
str(list(set(Pipeline.flatten_list(pipeline.stage_regions)))),
)


def fetch_required_ssm_params(regions):
output = {}
for region in regions:
parameter_store = ParameterStore(region, boto3)
output[region] = {
"s3": parameter_store.fetch_parameter(
f"/cross_region/s3_regional_bucket/{region}"
),
"kms": parameter_store.fetch_parameter(f"/cross_region/kms_arn/{region}"),
}
if region == DEPLOYMENT_ACCOUNT_REGION:
output[region]["modules"] = parameter_store.fetch_parameter(
"deployment_account_bucket"
)
output['default_scm_branch'] = parameter_store.fetch_parameter('default_scm_branch')
return output


def generate_pipeline_inputs(pipeline, organizations, parameter_store):
data = {}
pipeline_object = Pipeline(pipeline)
regions = []
for target in pipeline.get("targets", []):
target_structure = TargetStructure(target)
for step in target_structure.target:
regions = step.get(
"regions", pipeline.get("regions", DEPLOYMENT_ACCOUNT_REGION)
)
paths_tags = []
for path in step.get("path", []):
paths_tags.append(path)
if step.get("tags") is not None:
paths_tags.append(step.get("tags", {}))
for path_or_tag in paths_tags:
pipeline_object.stage_regions.append(regions)
pipeline_target = Target(
path_or_tag, target_structure, organizations, step, regions
)
pipeline_target.fetch_accounts_for_target()
# Targets should be a list of lists.

# Note: This is a big shift away from how ADF handles targets natively.
# Previously this would be a list of [account_id(s)]; it now returns a list of
# [[account_ids], [account_ids]]. For the sake of consistency, think of a
# target as consisting of multiple "waves", where a wave is an individual
# batch of account ids.
pipeline_object.template_dictionary["targets"].append(
list(target_structure.generate_waves()),
)

if DEPLOYMENT_ACCOUNT_REGION not in regions:
pipeline_object.stage_regions.append(DEPLOYMENT_ACCOUNT_REGION)

pipeline_object.generate_input()
data["ssm_params"] = fetch_required_ssm_params(
pipeline_object.input["regions"] or [DEPLOYMENT_ACCOUNT_REGION]
)
data["input"] = pipeline_object.input
data['input']['default_scm_branch'] = data["ssm_params"].get('default_scm_branch')
store_regional_parameter_config(pipeline_object, parameter_store)
return data


def lambda_handler(pipeline, _):
"""Main Lambda Entry point"""
parameter_store = ParameterStore(DEPLOYMENT_ACCOUNT_REGION, boto3)
sts = STS()
role = sts.assume_cross_account_role(
f'arn:{get_partition(DEPLOYMENT_ACCOUNT_REGION)}:iam::{ROOT_ACCOUNT_ID}:role/{parameter_store.fetch_parameter("cross_account_access_role")}-readonly',
"pipeline",
)
organizations = Organizations(role)

output = generate_pipeline_inputs(pipeline, organizations, parameter_store)

return output
