
chore(tests): refactor E2E test mechanics to ease maintenance, writing tests and parallelization #1444

Merged
merged 20 commits into aws-powertools:develop from chore/refactor-e2e on Aug 12, 2022

Conversation

heitorlessa
Contributor

@heitorlessa heitorlessa commented Aug 11, 2022

Issue number: #1435

Summary

This PR refactors the E2E test mechanics to make tests more explicit, easier to reason about and write, and to improve the maintainability of the internal CDK logic.

I’ll rename the V2 infrastructure and delete the older one after all tests are rewritten (future PRs).

Changes

Please provide a summary of what's being changed

See the follow-up comments below to keep the PR body from becoming too big.

Remaining tasks

  • Update cold start test: Ensure cold start function is executed twice and a single metric is created
  • Encapsulate parallelization logic

User experience

Please share what the user experience looks like before and after this change

Checklist

If your change doesn't seem to apply, please leave them unchecked.

Is this a breaking change?

RFC issue number:

Checklist:

  • Migration process documented
  • Implement warnings (if it can live side by side)

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

@heitorlessa heitorlessa requested a review from a team as a code owner August 11, 2022 16:38
@heitorlessa heitorlessa requested review from rubenfonseca and removed request for a team August 11, 2022 16:38
@boring-cyborg boring-cyborg bot added dependencies Pull requests that update a dependency file tests labels Aug 11, 2022
@pull-request-size pull-request-size bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 11, 2022
@heitorlessa heitorlessa requested review from mploski and removed request for rubenfonseca August 11, 2022 16:38
@heitorlessa
Contributor Author

Encapsulates CDK Assets logic

I had challenges understanding our drop-in replacement for CDK, largely due to not understanding the structure of CDK Assets. For that reason, I've created:

  • Asset models: tests/e2e/utils/models.py contains Pydantic models describing the internal structure of an Asset. This makes it easier to understand what's inside and also prevents the common mistake of manually accessing deeply nested dictionary keys (see the sketch after this list)
  • Encapsulated assets logic: tests/e2e/utils/asset.py contains all drop-in replacement logic under Assets and Asset. I simplified the code to foster readability (pathlib.Path over os) and composition for easier maintenance.
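
For illustration, a minimal sketch of what such Pydantic models could look like, assuming the usual CDK asset manifest (assets.json) layout; the class and field names below are illustrative and not necessarily the ones used in tests/e2e/utils/models.py:

from typing import Dict, Optional

from pydantic import BaseModel


class AssetSource(BaseModel):
    path: str  # file or directory produced by synth, relative to the asset root
    packaging: str  # e.g. "zip" for directories, "file" for single files


class AssetDestination(BaseModel):
    bucketName: str  # CDK staging bucket the asset is uploaded to
    objectKey: str
    assumeRoleArn: Optional[str] = None


class AssetFile(BaseModel):
    source: AssetSource
    destinations: Dict[str, AssetDestination]  # keyed by "account-region"


class AssetManifest(BaseModel):
    version: str
    files: Dict[str, AssetFile]  # keyed by asset hash

Parsing the manifest into models like these surfaces typos and structural changes at parse time, instead of a KeyError deep inside the upload logic.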

Result

BEFORE

    def deploy(self, Stack: Type[BaseInfrastructureStack]) -> Dict[str, str]:

        stack = Stack(handlers_dir=self.handlers_dir, stack_name=self.stack_name, config=self.config)
        template, asset_root_dir, asset_manifest_file = stack()
        self._upload_assets(asset_root_dir, asset_manifest_file)

        response = self._deploy_stack(self.stack_name, template)

        return self._transform_output(response["Stacks"][0]["Outputs"])

AFTER

    def deploy(self) -> Dict[str, str]:
        """Creates CloudFormation Stack and return stack outputs as dict

        Returns
        -------
        Dict[str, str]
            CloudFormation Stack Outputs with output key and value
        """
        template, asset_manifest_file = self._synthesize()
        assets = Assets(cfn_template=asset_manifest_file, account_id=self.account_id, region=self.region)
        assets.upload()
        return self._deploy_stack(self.stack_name, template)
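
For context, uploading a file asset essentially means zipping its source directory and putting the archive in the CDK staging bucket listed in the asset manifest. A rough, hypothetical sketch of that step (the actual logic lives in tests/e2e/utils/asset.py and may differ):

import io
import zipfile
from pathlib import Path

import boto3


def upload_asset(source_dir: Path, bucket_name: str, object_key: str) -> None:
    """Zip a directory asset in memory and upload it to the CDK staging bucket."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in source_dir.rglob("*"):
            if file.is_file():
                archive.write(file, arcname=file.relative_to(source_dir))
    buffer.seek(0)
    boto3.client("s3").upload_fileobj(buffer, bucket_name, object_key)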

@heitorlessa
Contributor Author

Give infrastructure control to each feature group

This allows each feature group to control its infrastructure within tests/e2e/<feature-group>/infrastructure.py. For this PR, I've changed Metrics so it's now independent of the stack resources that Tracer and Logger would create.

Result

BEFORE

tests/e2e/conftest.py

@dataclass
class InfrastructureOutput:
    arns: Dict[str, str]
    execution_time: datetime.datetime

    def get_lambda_arns(self) -> Dict[str, str]:
        return self.arns

    def get_lambda_function_arn(self, cf_output_name: str) -> Optional[str]:
        return self.arns.get(cf_output_name)

    def get_lambda_function_name(self, cf_output_name: str) -> Optional[str]:
        lambda_arn = self.get_lambda_function_arn(cf_output_name=cf_output_name)
        return lambda_arn.split(":")[-1] if lambda_arn else None

    def get_lambda_execution_time(self) -> datetime.datetime:
        return self.execution_time

    def get_lambda_execution_time_timestamp(self) -> int:
        return int(self.execution_time.timestamp() * 1000)


@pytest.fixture(scope="module")
def create_infrastructure(config, request) -> Generator[Dict[str, str], None, None]:
    stack_name = f"test-lambda-{uuid.uuid4()}"
    test_dir = request.fspath.dirname
    handlers_dir = f"{test_dir}/handlers/"

    infra = infrastructure.Infrastructure(stack_name=stack_name, handlers_dir=handlers_dir, config=config)
    yield infra.deploy(Stack=infrastructure.InfrastructureStack)
    infra.delete()

AFTER

tests/e2e/metrics/infrastructure.py

from pathlib import Path

from tests.e2e.utils.infrastructure import BaseInfrastructureV2


class MetricsStack(BaseInfrastructureV2):
    def __init__(self, handlers_dir: Path, feature_name: str = "metrics") -> None:
        super().__init__(feature_name, handlers_dir)

    def create_resources(self):
        self.create_lambda_functions()

The stack is now deployed within each feature's conftest.py. The infrastructure fixture now conveys more explicitly that it provides the outputs of that deployed infrastructure for use within tests.

tests/e2e/metrics/conftest.py

import json
from pathlib import Path
from typing import Dict

import pytest
from _pytest import fixtures
from filelock import FileLock

from tests.e2e.metrics.infrastructure import MetricsStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure(request: fixtures.SubRequest, tmp_path_factory: pytest.TempPathFactory, worker_id) -> MetricsStack:
    """Setup and teardown logic for E2E test infrastructure

    Parameters
    ----------
    request : fixtures.SubRequest
        test fixture containing metadata about test execution

    Returns
    -------
    MetricsStack
        Metrics Stack to deploy infrastructure

    Yields
    ------
    Iterator[MetricsStack]
        Deployed Infrastructure
    """
    stack = MetricsStack(handlers_dir=Path(f"{request.fspath.dirname}/handlers"))
    try:
        yield stack.deploy()
    finally:
        stack.delete()

@heitorlessa
Contributor Author

Make tests more explicit and easier to write

This change gives control back to each test definition to decide how it wants to invoke its functions. It also makes it more explicit what information needs to be created as part of the test: explicit helper functions over fixtures, so the logic is easier to trace.

To confirm this was easier, I've also increased coverage to ensure Lambda cold starts are captured.

Result

BEFORE

import datetime
import uuid

import boto3
import pytest
from e2e import conftest
from e2e.utils import helpers


@pytest.fixture(scope="module")
def config() -> conftest.LambdaConfig:
    return {
        "parameters": {},
        "environment_variables": {
            "POWERTOOLS_METRICS_NAMESPACE": "powertools-e2e-metric",
            "POWERTOOLS_SERVICE_NAME": "test-powertools-service",
            "METRIC_NAME": f"business-metric-{str(uuid.uuid4()).replace('-','_')}",
        },
    }


def test_basic_lambda_metric_visible(execute_lambda: conftest.InfrastructureOutput, config: conftest.LambdaConfig):
    # GIVEN
    start_date = execute_lambda.get_lambda_execution_time()
    end_date = start_date + datetime.timedelta(minutes=5)

    # WHEN
    metrics = helpers.get_metrics(
        start_date=start_date,
        end_date=end_date,
        namespace=config["environment_variables"]["POWERTOOLS_METRICS_NAMESPACE"],
        metric_name=config["environment_variables"]["METRIC_NAME"],
        service_name=config["environment_variables"]["POWERTOOLS_SERVICE_NAME"],
        cw_client=boto3.client(service_name="cloudwatch"),
    )

    # THEN
    assert metrics.get("Timestamps") and len(metrics.get("Timestamps")) == 1
    assert metrics.get("Values") and len(metrics.get("Values")) == 1
    assert metrics.get("Values") and metrics.get("Values")[0] == 1

AFTER

import json

import pytest

from tests.e2e.utils import helpers


@pytest.fixture
def basic_handler_fn(infrastructure: dict) -> str:
    return infrastructure.get("BasicHandler", "")


@pytest.fixture
def basic_handler_fn_arn(infrastructure: dict) -> str:
    return infrastructure.get("BasicHandlerArn", "")


@pytest.fixture
def cold_start_fn(infrastructure: dict) -> str:
    return infrastructure.get("ColdStart", "")


@pytest.fixture
def cold_start_fn_arn(infrastructure: dict) -> str:
    return infrastructure.get("ColdStartArn", "")


METRIC_NAMESPACE = "powertools-e2e-metric"


def test_basic_lambda_metric_is_visible(basic_handler_fn: str, basic_handler_fn_arn: str):
    # GIVEN
    metric_name = helpers.build_metric_name()
    service = helpers.build_service_name()
    dimensions = helpers.build_add_dimensions_input(service=service)
    metrics = helpers.build_multiple_add_metric_input(metric_name=metric_name, value=1, quantity=3)

    # WHEN
    event = json.dumps({"metrics": metrics, "service": service, "namespace": METRIC_NAMESPACE})
    _, execution_time = helpers.trigger_lambda(lambda_arn=basic_handler_fn_arn, payload=event)

    metrics = helpers.get_metrics(
        namespace=METRIC_NAMESPACE, start_date=execution_time, metric_name=metric_name, dimensions=dimensions
    )

    # THEN
    metric_data = metrics.get("Values", [])
    assert metric_data and metric_data[0] == 3.0


def test_cold_start_metric(cold_start_fn_arn: str, cold_start_fn: str):
    # GIVEN
    metric_name = "ColdStart"
    service = helpers.build_service_name()
    dimensions = helpers.build_add_dimensions_input(function_name=cold_start_fn, service=service)

    # WHEN
    event = json.dumps({"service": service, "namespace": METRIC_NAMESPACE})
    _, execution_time = helpers.trigger_lambda(lambda_arn=cold_start_fn_arn, payload=event)

    metrics = helpers.get_metrics(
        namespace=METRIC_NAMESPACE, start_date=execution_time, metric_name=metric_name, dimensions=dimensions
    )

    # THEN
    metric_data = metrics.get("Values", [])
    assert metric_data and metric_data[0] == 1.0
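
To make the helper calls above concrete, here is a minimal sketch of what build_metric_name, trigger_lambda, and get_metrics could look like on top of boto3; signatures, defaults, and the two-minute query window are assumptions for illustration, not the actual tests/e2e/utils/helpers.py implementation:

import datetime
import uuid
from typing import Tuple

import boto3


def build_metric_name() -> str:
    # unique metric name per test run so concurrent and parallel runs don't clash
    return f"test-metric-{uuid.uuid4()}"


def trigger_lambda(lambda_arn: str, payload: str) -> Tuple[dict, datetime.datetime]:
    # invoke synchronously and record the execution time for the metric query window
    execution_time = datetime.datetime.utcnow()
    response = boto3.client("lambda").invoke(
        FunctionName=lambda_arn, InvocationType="RequestResponse", Payload=payload
    )
    return response, execution_time


def get_metrics(namespace: str, start_date: datetime.datetime, metric_name: str, dimensions: list) -> dict:
    # fetch datapoints emitted by the Lambda function via CloudWatch GetMetricData
    response = boto3.client("cloudwatch").get_metric_data(
        MetricDataQueries=[
            {
                "Id": "m1",
                "MetricStat": {
                    "Metric": {"Namespace": namespace, "MetricName": metric_name, "Dimensions": dimensions},
                    "Period": 60,
                    "Stat": "Sum",
                },
            }
        ],
        StartTime=start_date,
        EndTime=start_date + datetime.timedelta(minutes=2),
    )
    return response["MetricDataResults"][0]  # contains "Timestamps" and "Values"

In practice the real helpers likely also retry until CloudWatch returns datapoints, since metrics are delivered asynchronously.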

@heitorlessa
Contributor Author

heitorlessa commented Aug 11, 2022

Parallelize tests

As of today, we parallelize CFN Stack creation, but tests run sequentially. Now that each feature group can control its infrastructure, this change parallelizes tests too, in a safe manner. It also handles a leftover task: deleting stacks in case of failure (e.g., test runner crashes, exceptions, etc.).

This also takes into account step-through debugging in any IDE, allowing you to run one test at a time or in parallel.

Better yet, our tests were taking 175s before and now take ~94s, even with an additional Lambda function.

Result

BEFORE

import json
from pathlib import Path
from typing import Dict

import pytest
from _pytest import fixtures
from filelock import FileLock

from tests.e2e.metrics.infrastructure import MetricsStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure(request: fixtures.SubRequest, tmp_path_factory: pytest.TempPathFactory, worker_id) -> MetricsStack:
    """Setup and teardown logic for E2E test infrastructure

    Parameters
    ----------
    request : fixtures.SubRequest
        test fixture containing metadata about test execution

    Returns
    -------
    MetricsStack
        Metrics Stack to deploy infrastructure

    Yields
    ------
    Iterator[MetricsStack]
        Deployed Infrastructure
    """
    stack = MetricsStack(handlers_dir=Path(f"{request.fspath.dirname}/handlers"))
    try:
        yield stack.deploy()
    finally:
        stack.delete()

AFTER

import json
from pathlib import Path
from typing import Dict

import pytest
from _pytest import fixtures
from filelock import FileLock

from tests.e2e.metrics.infrastructure import MetricsStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure(request: fixtures.SubRequest, tmp_path_factory: pytest.TempPathFactory, worker_id) -> MetricsStack:
    """Setup and teardown logic for E2E test infrastructure

    Parameters
    ----------
    request : fixtures.SubRequest
        test fixture containing metadata about test execution

    Returns
    -------
    MetricsStack
        Metrics Stack to deploy infrastructure

    Yields
    ------
    Iterator[MetricsStack]
        Deployed Infrastructure
    """
    stack = MetricsStack(handlers_dir=Path(f"{request.fspath.dirname}/handlers"))
    try:
        if worker_id == "master":
            # no parallelization, deploy stack and let fixture be cached
            yield stack.deploy()
        else:
            # tmp dir shared by all workers
            root_tmp_dir = tmp_path_factory.getbasetemp().parent

            cache = root_tmp_dir / "cache.json"
            with FileLock(f"{cache}.lock"):
                # If cache exists, return stack outputs back
                # otherwise it's the first run by the main worker
                # deploy and return stack outputs so subsequent workers can reuse
                if cache.is_file():
                    stack_outputs = json.loads(cache.read_text())
                else:
                    stack_outputs: Dict = stack.deploy()
                    cache.write_text(json.dumps(stack_outputs))
            yield stack_outputs
    finally:
        stack.delete()
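
Worth noting: the worker_id fixture comes from pytest-xdist and is "master" when tests run without parallelization, which is why that branch simply deploys and relies on the module-scoped fixture cache. When multiple workers are used (e.g. pytest -n 3 tests/e2e), each worker process evaluates the fixture independently, so the FileLock-guarded cache.json is what ensures only the first worker deploys the stack while the others reuse its outputs.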

@heitorlessa heitorlessa changed the title refactor(e2e): encapsulate assets logic; add sample stack chore(tests): refactor end-to-end test mechanism to ease maintenance and parallelization Aug 11, 2022
@heitorlessa heitorlessa changed the title chore(tests): refactor end-to-end test mechanism to ease maintenance and parallelization chore(tests): refactor E2E test mechanics to ease maintenance and parallelization Aug 11, 2022
@heitorlessa heitorlessa changed the title chore(tests): refactor E2E test mechanics to ease maintenance and parallelization chore(tests): refactor E2E test mechanics to ease maintenance, test writing and parallelization Aug 11, 2022
@heitorlessa heitorlessa changed the title chore(tests): refactor E2E test mechanics to ease maintenance, test writing and parallelization chore(tests): refactor E2E test mechanics to ease maintenance, writing tests and parallelization Aug 11, 2022
@github-actions github-actions bot added the internal Maintenance changes label Aug 11, 2022
* develop:
  chore(deps-dev): bump types-requests from 2.28.7 to 2.28.8 (aws-powertools#1423)
  chore(ci): update changelog with latest changes
  fix(jmespath_util): snappy as dev dep and typing example (aws-powertools#1446)

Signed-off-by: heitorlessa <[email protected]>
@mploski
Contributor

mploski commented Aug 12, 2022

Parallelize tests

Better yet, our tests were taking 175s before and now take ~94s, even with an additional Lambda function.

One remark here :-) In E2E tests version 1, tests took 175 seconds because we were building the Powertools layer at the beginning of the test run. This step is heavy and accounts for a visible portion of the overall test latency. In this PR you use an already deployed layer, so those numbers are not comparable. My assumption is that the overall numbers will end up similar, as you compensated for the lack of Lambda result caching with parallelization.

@codecov-commenter

Codecov Report

Merging #1444 (f5c0bf4) into develop (0e20ee0) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop    #1444   +/-   ##
========================================
  Coverage    99.89%   99.89%           
========================================
  Files          121      121           
  Lines         5479     5479           
  Branches       627      627           
========================================
  Hits          5473     5473           
  Misses           2        2           
  Partials         4        4           


@heitorlessa
Contributor Author

One remark here :-) In E2E tests version 1, tests took 175 seconds because we were building the Powertools layer at the beginning of the test run. This step is heavy and accounts for a visible portion of the overall test latency. In this PR you use an already deployed layer, so those numbers are not comparable. My assumption is that the overall numbers will end up similar, as you compensated for the lack of Lambda result caching with parallelization.

So true, I missed that. Fat Layer eh? :D

@heitorlessa heitorlessa merged commit 3464ee9 into aws-powertools:develop Aug 12, 2022
@heitorlessa heitorlessa deleted the chore/refactor-e2e branch August 12, 2022 14:20
heitorlessa added a commit to heitorlessa/aws-lambda-powertools-python that referenced this pull request Aug 12, 2022