
chore(tests): refactor E2E test mechanics to ease maintenance, writing tests and parallelization #1444

Merged
merged 20 commits into aws-powertools:develop from chore/refactor-e2e on Aug 12, 2022

Conversation

heitorlessa
Contributor

@heitorlessa heitorlessa commented Aug 11, 2022

Issue number: #1435

Summary

This PR refactors the E2E test mechanics to make tests more explicit, easier to reason about and write, and to improve the maintainability of the internal CDK logic.

I’ll rename the V2 infrastructure and delete the older one after all tests are rewritten (future PRs).

Changes

Please provide a summary of what's being changed

See the follow-up comments below to keep the PR body from becoming too big.

Remaining tasks

  • Update cold start test: Ensure cold start function is executed twice and a single metric is created
  • Encapsulate parallelization logic

User experience

Please share what the user experience looks like before and after this change

Checklist

If your change doesn't seem to apply, please leave them unchecked.

Is this a breaking change?

RFC issue number:

Checklist:

  • Migration process documented
  • Implement warnings (if it can live side by side)

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

@heitorlessa heitorlessa requested a review from a team as a code owner August 11, 2022 16:38
@heitorlessa heitorlessa requested review from rubenfonseca and removed request for a team August 11, 2022 16:38
@boring-cyborg boring-cyborg bot added dependencies Pull requests that update a dependency file tests labels Aug 11, 2022
@pull-request-size pull-request-size bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 11, 2022
@heitorlessa heitorlessa requested review from mploski and removed request for rubenfonseca August 11, 2022 16:38
@heitorlessa
Contributor Author

Encapsulates CDK Assets logic

I had challenges understanding our drop-in replacement for CDK, largely due to not understanding the structure of CDK Assets. For that reason, I've created:

  • Asset models: tests/e2e/utils/models.py contains Pydantic models describing the internal structure of an Asset. This makes it easier to understand what's inside and also prevents the common mistake of manually accessing deeply nested dictionary keys (see the sketch after this list)
  • Encapsulated assets logic: tests/e2e/utils/asset.py contains all drop-in replacement logic under Assets and Asset. I simplified the code to foster readability (pathlib.Path over os) and composition for easier maintenance.
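
For illustration, a minimal sketch of what such Pydantic models could look like, assuming the usual CDK asset manifest (assets.json) layout; the class and field names below are illustrative and not necessarily the ones used in tests/e2e/utils/models.py:

from typing import Dict, Optional

from pydantic import BaseModel


class AssetSource(BaseModel):
    path: str  # file or directory produced by synth, relative to the asset root
    packaging: str  # e.g. "zip" for directories, "file" for single files


class AssetDestination(BaseModel):
    bucketName: str  # CDK staging bucket the asset is uploaded to
    objectKey: str
    assumeRoleArn: Optional[str] = None


class AssetFile(BaseModel):
    source: AssetSource
    destinations: Dict[str, AssetDestination]  # keyed by "account-region"


class AssetManifest(BaseModel):
    version: str
    files: Dict[str, AssetFile]  # keyed by asset hash

Parsing the manifest into models like these surfaces typos and structural changes at parse time, instead of a KeyError deep inside the upload logic.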

Result

BEFORE

    def deploy(self, Stack: Type[BaseInfrastructureStack]) -> Dict[str, str]:

        stack = Stack(handlers_dir=self.handlers_dir, stack_name=self.stack_name, config=self.config)
        template, asset_root_dir, asset_manifest_file = stack()
        self._upload_assets(asset_root_dir, asset_manifest_file)

        response = self._deploy_stack(self.stack_name, template)

        return self._transform_output(response["Stacks"][0]["Outputs"])

AFTER

    def deploy(self) -> Dict[str, str]:
        """Creates CloudFormation Stack and return stack outputs as dict

        Returns
        -------
        Dict[str, str]
            CloudFormation Stack Outputs with output key and value
        """
        template, asset_manifest_file = self._synthesize()
        assets = Assets(cfn_template=asset_manifest_file, account_id=self.account_id, region=self.region)
        assets.upload()
        return self._deploy_stack(self.stack_name, template)
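
For context, uploading a file asset essentially means zipping its source directory and putting the archive in the CDK staging bucket listed in the asset manifest. A rough, hypothetical sketch of that step (the actual logic lives in tests/e2e/utils/asset.py and may differ):

import io
import zipfile
from pathlib import Path

import boto3


def upload_asset(source_dir: Path, bucket_name: str, object_key: str) -> None:
    """Zip a directory asset in memory and upload it to the CDK staging bucket."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in source_dir.rglob("*"):
            if file.is_file():
                archive.write(file, arcname=file.relative_to(source_dir))
    buffer.seek(0)
    boto3.client("s3").upload_fileobj(buffer, bucket_name, object_key)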

@heitorlessa
Contributor Author

Give infrastructure control to each feature group

This allows each feature group to control its infrastructure within tests/e2e/<feature-group>/infrastructure.py. For this PR, I've changed Metrics so it's now independent of the stack resources that Tracer and Logger would create.

Result

BEFORE

tests/e2e/conftest.py

@dataclass
class InfrastructureOutput:
    arns: Dict[str, str]
    execution_time: datetime.datetime

    def get_lambda_arns(self) -> Dict[str, str]:
        return self.arns

    def get_lambda_function_arn(self, cf_output_name: str) -> Optional[str]:
        return self.arns.get(cf_output_name)

    def get_lambda_function_name(self, cf_output_name: str) -> Optional[str]:
        lambda_arn = self.get_lambda_function_arn(cf_output_name=cf_output_name)
        return lambda_arn.split(":")[-1] if lambda_arn else None

    def get_lambda_execution_time(self) -> datetime.datetime:
        return self.execution_time

    def get_lambda_execution_time_timestamp(self) -> int:
        return int(self.execution_time.timestamp() * 1000)


@pytest.fixture(scope="module")
def create_infrastructure(config, request) -> Generator[Dict[str, str], None, None]:
    stack_name = f"test-lambda-{uuid.uuid4()}"
    test_dir = request.fspath.dirname
    handlers_dir = f"{test_dir}/handlers/"

    infra = infrastructure.Infrastructure(stack_name=stack_name, handlers_dir=handlers_dir, config=config)
    yield infra.deploy(Stack=infrastructure.InfrastructureStack)
    infra.delete()

AFTER

tests/e2e/metrics/infrastructure.py

from pathlib import Path

from tests.e2e.utils.infrastructure import BaseInfrastructureV2


class MetricsStack(BaseInfrastructureV2):
    def __init__(self, handlers_dir: Path, feature_name: str = "metrics") -> None:
        super().__init__(feature_name, handlers_dir)

    def create_resources(self):
        self.create_lambda_functions()

The stack is now deployed within each feature's conftest.py. The infrastructure fixture now conveys more explicitly that it provides the outputs of that deployed infrastructure for use within tests.

tests/e2e/metrics/conftest.py

import json
from pathlib import Path
from typing import Dict

import pytest
from _pytest import fixtures
from filelock import FileLock

from tests.e2e.metrics.infrastructure import MetricsStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure(request: fixtures.SubRequest, tmp_path_factory: pytest.TempPathFactory, worker_id) -> MetricsStack:
    """Setup and teardown logic for E2E test infrastructure

    Parameters
    ----------
    request : fixtures.SubRequest
        test fixture containing metadata about test execution

    Returns
    -------
    MetricsStack
        Metrics Stack to deploy infrastructure

    Yields
    ------
    Iterator[MetricsStack]
        Deployed Infrastructure
    """
    stack = MetricsStack(handlers_dir=Path(f"{request.fspath.dirname}/handlers"))
    try:
        yield stack.deploy()
    finally:
        stack.delete()

@heitorlessa
Contributor Author

Make tests more explicit and easier to write

This change gives control back to each test definition to decide how it wants to invoke its functions. It also makes it more explicit what information needs to be created as part of the test: explicit helper functions over fixtures, so the logic is easier to trace.

To confirm this was easier, I've also increased coverage to ensure Lambda cold starts are captured.

Result

BEFORE

import datetime
import uuid

import boto3
import pytest
from e2e import conftest
from e2e.utils import helpers


@pytest.fixture(scope="module")
def config() -> conftest.LambdaConfig:
    return {
        "parameters": {},
        "environment_variables": {
            "POWERTOOLS_METRICS_NAMESPACE": "powertools-e2e-metric",
            "POWERTOOLS_SERVICE_NAME": "test-powertools-service",
            "METRIC_NAME": f"business-metric-{str(uuid.uuid4()).replace('-','_')}",
        },
    }


def test_basic_lambda_metric_visible(execute_lambda: conftest.InfrastructureOutput, config: conftest.LambdaConfig):
    # GIVEN
    start_date = execute_lambda.get_lambda_execution_time()
    end_date = start_date + datetime.timedelta(minutes=5)

    # WHEN
    metrics = helpers.get_metrics(
        start_date=start_date,
        end_date=end_date,
        namespace=config["environment_variables"]["POWERTOOLS_METRICS_NAMESPACE"],
        metric_name=config["environment_variables"]["METRIC_NAME"],
        service_name=config["environment_variables"]["POWERTOOLS_SERVICE_NAME"],
        cw_client=boto3.client(service_name="cloudwatch"),
    )

    # THEN
    assert metrics.get("Timestamps") and len(metrics.get("Timestamps")) == 1
    assert metrics.get("Values") and len(metrics.get("Values")) == 1
    assert metrics.get("Values") and metrics.get("Values")[0] == 1

AFTER

import json

import pytest

from tests.e2e.utils import helpers


@pytest.fixture
def basic_handler_fn(infrastructure: dict) -> str:
    return infrastructure.get("BasicHandler", "")


@pytest.fixture
def basic_handler_fn_arn(infrastructure: dict) -> str:
    return infrastructure.get("BasicHandlerArn", "")


@pytest.fixture
def cold_start_fn(infrastructure: dict) -> str:
    return infrastructure.get("ColdStart", "")


@pytest.fixture
def cold_start_fn_arn(infrastructure: dict) -> str:
    return infrastructure.get("ColdStartArn", "")


METRIC_NAMESPACE = "powertools-e2e-metric"


def test_basic_lambda_metric_is_visible(basic_handler_fn: str, basic_handler_fn_arn: str):
    # GIVEN
    metric_name = helpers.build_metric_name()
    service = helpers.build_service_name()
    dimensions = helpers.build_add_dimensions_input(service=service)
    metrics = helpers.build_multiple_add_metric_input(metric_name=metric_name, value=1, quantity=3)

    # WHEN
    event = json.dumps({"metrics": metrics, "service": service, "namespace": METRIC_NAMESPACE})
    _, execution_time = helpers.trigger_lambda(lambda_arn=basic_handler_fn_arn, payload=event)

    metrics = helpers.get_metrics(
        namespace=METRIC_NAMESPACE, start_date=execution_time, metric_name=metric_name, dimensions=dimensions
    )

    # THEN
    metric_data = metrics.get("Values", [])
    assert metric_data and metric_data[0] == 3.0


def test_cold_start_metric(cold_start_fn_arn: str, cold_start_fn: str):
    # GIVEN
    metric_name = "ColdStart"
    service = helpers.build_service_name()
    dimensions = helpers.build_add_dimensions_input(function_name=cold_start_fn, service=service)

    # WHEN
    event = json.dumps({"service": service, "namespace": METRIC_NAMESPACE})
    _, execution_time = helpers.trigger_lambda(lambda_arn=cold_start_fn_arn, payload=event)

    metrics = helpers.get_metrics(
        namespace=METRIC_NAMESPACE, start_date=execution_time, metric_name=metric_name, dimensions=dimensions
    )

    # THEN
    metric_data = metrics.get("Values", [])
    assert metric_data and metric_data[0] == 1.0
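
To make the helper calls above concrete, here is a minimal sketch of what build_metric_name, trigger_lambda, and get_metrics could look like on top of boto3; signatures, defaults, and the two-minute query window are assumptions for illustration, not the actual tests/e2e/utils/helpers.py implementation:

import datetime
import uuid
from typing import Tuple

import boto3


def build_metric_name() -> str:
    # unique metric name per test run so concurrent and parallel runs don't clash
    return f"test-metric-{uuid.uuid4()}"


def trigger_lambda(lambda_arn: str, payload: str) -> Tuple[dict, datetime.datetime]:
    # invoke synchronously and record the execution time for the metric query window
    execution_time = datetime.datetime.utcnow()
    response = boto3.client("lambda").invoke(
        FunctionName=lambda_arn, InvocationType="RequestResponse", Payload=payload
    )
    return response, execution_time


def get_metrics(namespace: str, start_date: datetime.datetime, metric_name: str, dimensions: list) -> dict:
    # fetch datapoints emitted by the Lambda function via CloudWatch GetMetricData
    response = boto3.client("cloudwatch").get_metric_data(
        MetricDataQueries=[
            {
                "Id": "m1",
                "MetricStat": {
                    "Metric": {"Namespace": namespace, "MetricName": metric_name, "Dimensions": dimensions},
                    "Period": 60,
                    "Stat": "Sum",
                },
            }
        ],
        StartTime=start_date,
        EndTime=start_date + datetime.timedelta(minutes=2),
    )
    return response["MetricDataResults"][0]  # contains "Timestamps" and "Values"

In practice the real helpers likely also retry until CloudWatch returns datapoints, since metrics are delivered asynchronously.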

@heitorlessa
Contributor Author

heitorlessa commented Aug 11, 2022

Parallelize tests

As of today, we parallelize CFN Stack creation, but tests run sequentially. Now that each feature group can control its infrastructure, this change parallelizes tests too, in a safe manner. It also handles a leftover task: deleting stacks in case of failure (e.g., test runner crashes, exceptions, etc.).

This also takes into account step-through debugging in any IDE, allowing you to run one test at a time or in parallel.

Better yet, our tests were taking 175s before and now take ~94s, even with an additional Lambda function.

Result

BEFORE

import json
from pathlib import Path
from typing import Dict

import pytest
from _pytest import fixtures
from filelock import FileLock

from tests.e2e.metrics.infrastructure import MetricsStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure(request: fixtures.SubRequest, tmp_path_factory: pytest.TempPathFactory, worker_id) -> MetricsStack:
    """Setup and teardown logic for E2E test infrastructure

    Parameters
    ----------
    request : fixtures.SubRequest
        test fixture containing metadata about test execution

    Returns
    -------
    MetricsStack
        Metrics Stack to deploy infrastructure

    Yields
    ------
    Iterator[MetricsStack]
        Deployed Infrastructure
    """
    stack = MetricsStack(handlers_dir=Path(f"{request.fspath.dirname}/handlers"))
    try:
        yield stack.deploy()
    finally:
        stack.delete()

AFTER

import json
from pathlib import Path
from typing import Dict

import pytest
from _pytest import fixtures
from filelock import FileLock

from tests.e2e.metrics.infrastructure import MetricsStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure(request: fixtures.SubRequest, tmp_path_factory: pytest.TempPathFactory, worker_id) -> MetricsStack:
    """Setup and teardown logic for E2E test infrastructure

    Parameters
    ----------
    request : fixtures.SubRequest
        test fixture containing metadata about test execution

    Returns
    -------
    MetricsStack
        Metrics Stack to deploy infrastructure

    Yields
    ------
    Iterator[MetricsStack]
        Deployed Infrastructure
    """
    stack = MetricsStack(handlers_dir=Path(f"{request.fspath.dirname}/handlers"))
    try:
        if worker_id == "master":
            # no parallelization, deploy stack and let fixture be cached
            yield stack.deploy()
        else:
            # tmp dir shared by all workers
            root_tmp_dir = tmp_path_factory.getbasetemp().parent

            cache = root_tmp_dir / "cache.json"
            with FileLock(f"{cache}.lock"):
                # If cache exists, return stack outputs back
                # otherwise it's the first run by the main worker
                # deploy and return stack outputs so subsequent workers can reuse
                if cache.is_file():
                    stack_outputs = json.loads(cache.read_text())
                else:
                    stack_outputs: Dict = stack.deploy()
                    cache.write_text(json.dumps(stack_outputs))
            yield stack_outputs
    finally:
        stack.delete()
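
Worth noting: the worker_id fixture comes from pytest-xdist and is "master" when tests run without parallelization, which is why that branch simply deploys and relies on the module-scoped fixture cache. When multiple workers are used (e.g. pytest -n 3 tests/e2e), each worker process evaluates the fixture independently, so the FileLock-guarded cache.json is what ensures only the first worker deploys the stack while the others reuse its outputs.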

@heitorlessa heitorlessa changed the title refactor(e2e): encapsulate assets logic; add sample stack chore(tests): refactor end-to-end test mechanism to ease maintenance and parallelization Aug 11, 2022
@heitorlessa heitorlessa changed the title chore(tests): refactor end-to-end test mechanism to ease maintenance and parallelization chore(tests): refactor E2E test mechanics to ease maintenance and parallelization Aug 11, 2022
@heitorlessa heitorlessa changed the title chore(tests): refactor E2E test mechanics to ease maintenance and parallelization chore(tests): refactor E2E test mechanics to ease maintenance, test writing and parallelization Aug 11, 2022
@heitorlessa heitorlessa changed the title chore(tests): refactor E2E test mechanics to ease maintenance, test writing and parallelization chore(tests): refactor E2E test mechanics to ease maintenance, writing tests and parallelization Aug 11, 2022
@github-actions github-actions bot added the internal Maintenance changes label Aug 11, 2022
* develop:
  chore(deps-dev): bump types-requests from 2.28.7 to 2.28.8 (aws-powertools#1423)
  chore(ci): update changelog with latest changes
  fix(jmespath_util): snappy as dev dep and typing example (aws-powertools#1446)

Signed-off-by: heitorlessa <[email protected]>
@mploski
Contributor

mploski commented Aug 12, 2022

Parallelize tests

Better yet, our tests were taking 175s before and now take ~94s, even with an additional Lambda function.

One remark here :-) In E2E tests version 1, tests took 175 seconds because we were building the Powertools layer at the beginning of the test run. This step is heavy and accounts for a visible portion of the overall test latency. In this PR you use an already deployed layer, so those numbers are not comparable. My assumption is that the overall numbers will end up similar, as you compensated for the lack of Lambda result caching with parallelization.

@codecov-commenter

Codecov Report

Merging #1444 (f5c0bf4) into develop (0e20ee0) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop    #1444   +/-   ##
========================================
  Coverage    99.89%   99.89%           
========================================
  Files          121      121           
  Lines         5479     5479           
  Branches       627      627           
========================================
  Hits          5473     5473           
  Misses           2        2           
  Partials         4        4           


@heitorlessa
Contributor Author

One remark here :-) In E2E tests version 1, tests took 175 seconds because we were building the Powertools layer at the beginning of the test run. This step is heavy and accounts for a visible portion of the overall test latency. In this PR you use an already deployed layer, so those numbers are not comparable. My assumption is that the overall numbers will end up similar, as you compensated for the lack of Lambda result caching with parallelization.

So true, I missed that. Fat Layer eh? :D

@heitorlessa heitorlessa merged commit 3464ee9 into aws-powertools:develop Aug 12, 2022
@heitorlessa heitorlessa deleted the chore/refactor-e2e branch August 12, 2022 14:20
heitorlessa added a commit to heitorlessa/aws-lambda-powertools-python that referenced this pull request Aug 12, 2022