Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use one forge namespace for each branch #12633

Merged
merged 1 commit into from
Apr 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 29 additions & 14 deletions .github/workflows/forge-stable.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ jobs:
outputs:
IMAGE_TAG: ${{ steps.get-docker-image-tag.outputs.IMAGE_TAG }}
BRANCH: ${{ steps.determine-test-branch.outputs.BRANCH }}
BRANCH_HASH: ${{ steps.hash-branch.outputs.BRANCH_HASH }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using branch hash instead of just branch name?

Is it because the length would exceed the underlying kubernetes namespace length? (we might also wanna enforce that namespace length in the forge.py so we fail earlier)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was running into issues when the namespace was too long. I can add a comment to document this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just fail forge.py, otherwise you end up failing later with a strange error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment and an assertion for namespace length to forge.py

steps:
- uses: actions/checkout@v3

Expand All @@ -69,6 +70,20 @@ jobs:
echo "BRANCH=${{ inputs.GIT_SHA }}" >> $GITHUB_OUTPUT
fi

# Use the branch hash instead of the full branch name to stay under kubernetes namespace length limit
- name: Hash the branch
id: hash-branch
run: |
# If BRANCH is empty, default to "main"
if [ -z "${{ steps.determine-test-branch.outputs.BRANCH }}" ]; then
BRANCH="main"
else
BRANCH="${{ steps.determine-test-branch.outputs.BRANCH }}"
fi

# Hashing the branch name
echo "BRANCH_HASH=$(echo -n "$BRANCH" | sha256sum | cut -c1-10)" >> $GITHUB_OUTPUT

- uses: aptos-labs/aptos-core/.github/actions/check-aptos-core@main
with:
cancel-workflow: ${{ github.event_name == 'schedule' }} # Cancel the workflow if it is scheduled on a fork
Expand Down Expand Up @@ -110,7 +125,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-max-load-long-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-max-load-long-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 7200 # Run for 2 hours
FORGE_TEST_SUITE: realistic_env_max_load_large
POST_TO_SLACK: true
Expand All @@ -122,7 +137,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-load-sweep-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-load-sweep-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 1500 # Run for 25 minutes (5 tests, each for 300 seconds)
FORGE_TEST_SUITE: realistic_env_load_sweep
POST_TO_SLACK: true
Expand All @@ -134,7 +149,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-workload-sweep-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-workload-sweep-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 1600 # Run for 26 minutes (4 tests, each for 400 seconds)
FORGE_TEST_SUITE: realistic_env_workload_sweep
POST_TO_SLACK: true
Expand All @@ -146,7 +161,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-graceful-overload-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-env-graceful-overload-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 1200 # Run for 20 minutes
FORGE_TEST_SUITE: realistic_env_graceful_overload
POST_TO_SLACK: true
Expand All @@ -158,7 +173,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-network-tuned-for-throughput-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-realistic-network-tuned-for-throughput-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 900 # Run for 15 minutes
FORGE_TEST_SUITE: realistic_network_tuned_for_throughput
FORGE_ENABLE_PERFORMANCE: true
Expand All @@ -173,7 +188,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-consensus-stress-test-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-consensus-stress-test-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 2400 # Run for 40 minutes
FORGE_TEST_SUITE: consensus_stress_test
POST_TO_SLACK: true
Expand All @@ -185,7 +200,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-workload-mix-test-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-workload-mix-test-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 900 # Run for 15 minutes
FORGE_TEST_SUITE: workload_mix
POST_TO_SLACK: true
Expand All @@ -197,7 +212,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-continuous-e2e-single-vfn-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-continuous-e2e-single-vfn-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 480 # Run for 8 minutes
FORGE_TEST_SUITE: single_vfn_perf
POST_TO_SLACK: true
Expand All @@ -209,7 +224,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-haproxy-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-haproxy-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 600 # Run for 10 minutes
FORGE_ENABLE_HAPROXY: true
FORGE_TEST_SUITE: realistic_env_max_load
Expand All @@ -222,7 +237,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-fullnode-reboot-stress-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-fullnode-reboot-stress-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 1800 # Run for 30 minutes
FORGE_TEST_SUITE: fullnode_reboot_stress_test
POST_TO_SLACK: true
Expand All @@ -235,7 +250,7 @@ jobs:
uses: aptos-labs/aptos-core/.github/workflows/workflow-run-forge.yaml@main
secrets: inherit
with:
FORGE_NAMESPACE: forge-compat-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-compat-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 300 # Run for 5 minutes
# This will upgrade from testnet branch to the latest main
FORGE_TEST_SUITE: compat
Expand All @@ -252,7 +267,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-changing-working-quorum-test-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-changing-working-quorum-test-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 1200 # Run for 20 minutes
FORGE_TEST_SUITE: changing_working_quorum_test
POST_TO_SLACK: true
Expand All @@ -265,7 +280,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-changing-working-quorum-test-high-load-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-changing-working-quorum-test-high-load-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 900 # Run for 15 minutes
FORGE_TEST_SUITE: changing_working_quorum_test_high_load
POST_TO_SLACK: true
Expand All @@ -279,7 +294,7 @@ jobs:
secrets: inherit
with:
IMAGE_TAG: ${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-pfn-const-tps-with-realistic-env-${{ needs.determine-test-metadata.outputs.IMAGE_TAG }}
FORGE_NAMESPACE: forge-pfn-const-tps-with-realistic-env-${{ needs.determine-test-metadata.outputs.BRANCH_HASH }}
FORGE_RUNNER_DURATION_SECS: 900 # Run for 15 minutes
FORGE_TEST_SUITE: pfn_const_tps_with_realistic_env
POST_TO_SLACK: true
8 changes: 7 additions & 1 deletion testsuite/forge.py
Original file line number Diff line number Diff line change
Expand Up @@ -1249,6 +1249,11 @@ async def run_multiple(
)


def seeded_random_choice(namespace: str, cluster_names: Sequence[str]) -> str:
random.seed(namespace)
return random.choice(cluster_names)


@main.command()
# output files
@envoption("FORGE_OUTPUT")
Expand Down Expand Up @@ -1366,6 +1371,7 @@ def test(
forge_namespace = f"forge-{processes.user()}-{time.epoch()}"

assert forge_namespace is not None, "Forge namespace is required"
assert len(forge_namespace) <= 63, "Forge namespace must be 63 characters or less"

forge_namespace = sanitize_forge_resource_name(forge_namespace)

Expand Down Expand Up @@ -1418,7 +1424,7 @@ def test(
# Perform cluster selection
if not forge_cluster_name or balance_clusters:
cluster_names = config.get("enabled_clusters")
forge_cluster_name = random.choice(cluster_names)
forge_cluster_name = seeded_random_choice(forge_namespace, cluster_names)

assert forge_cluster_name, "Forge cluster name is required"

Expand Down
Loading