Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

load-test: fix failing tests due to role session timeout, clean up code #676

Merged
merged 1 commit into from
Jun 8, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 0 additions & 8 deletions buildspec_load_test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -52,21 +52,13 @@ phases:
- python -m venv venv
- source venv/bin/activate
- pip install -r ./load_tests/requirements.txt
# AWS configure sts temp credentials
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may wonder, why did he remove this? Is this safe?

In load tests, everything runs via the python script, and as you can see in the other code changes, that code assumes the LOAD_TEST_CFN_ROLE_ARN role itself. This means we were previously unnecessarily chaining sessions (buildspec assumes the role, then the python uses that role to assume the same role).

- |
CREDS=`aws sts assume-role --role-arn ${LOAD_TEST_CFN_ROLE_ARN} --duration-seconds 900 --role-session-name load-test-cfn`
aws configure set aws_access_key_id `echo $CREDS | jq -r .Credentials.AccessKeyId`
aws configure set aws_secret_access_key `echo $CREDS | jq -r .Credentials.SecretAccessKey`
aws configure set aws_session_token `echo $CREDS | jq -r .Credentials.SessionToken`
- |
if [ "${PLATFORM}" == "EKS" ]; then
aws eks update-kubeconfig --name $EKS_CLUSTER_NAME
fi
# Create and set up related testing resources
- python ./load_tests/load_test.py create_testing_resources
- source ./load_tests/setup_test_environment.sh
# Remove sts temp credentials, delegate responsibility to py script
- rm ~/.aws/credentials
# Run load tests on corresponding platform
- python ./load_tests/load_test.py ${PLATFORM}
finally:
Expand Down
17 changes: 13 additions & 4 deletions load_tests/load_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -200,7 +200,6 @@ def run_ecs_tests():
session = get_sts_boto_session()

client = session.client('ecs')
waiter = client.get_waiter('tasks_stopped')

processes = []

Comment on lines 200 to 205
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we till need this for loop after logic change? I don't understand what's the work for this for-loop.
At least should we remove processes = [] out of the for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh you're right, thanks, processes needs to move to below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This for loop runs the tasks, then 2nd one validates.

Expand All @@ -224,7 +223,13 @@ def run_ecs_tests():
# Validation input type banner
print(f'\nTest {input_logger["name"]} to {OUTPUT_PLUGIN} in progress...')

# wait for tasks and validate
for input_logger in INPUT_LOGGERS:
Comment on lines +226 to +227
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should speed up the tests slightly. If you read the lines above and below here, we first start the tasks for each input (tcp vs stdout) and then validate... but each of our tests uses a unique resource (log stream, firehose, kinesis, etc) for each test... so there's no reason why we can't start all of them first and then loop over them and check for results.

# Wait until task stops and start validation
session = get_sts_boto_session()

client = session.client('ecs')
waiter = client.get_waiter('tasks_stopped')
for throughput in THROUGHPUT_LIST:
task_arn = names[f'{OUTPUT_PLUGIN}_{throughput}_task_arn']
waiter.wait(
Expand Down Expand Up @@ -266,6 +271,9 @@ def run_ecs_tests():
os.environ['LOG_PREFIX'] = resource_resolver.get_destination_s3_prefix(test_configuration["input_configuration"], OUTPUT_PLUGIN)
os.environ['DESTINATION'] = 's3'

# integ test sessions have expired due to long waits for tasks to complete
session = get_sts_boto_session()

# Go script environment with sts cred variables
credentials = session.get_credentials()
auth_env = {
Expand All @@ -286,8 +294,8 @@ def run_ecs_tests():
for p in processes:
p["process"].wait()
stdout, stderr = p["process"].communicate()
print(f'raw validator stdout: {stdout}')
print(f'raw validator stderr: {stderr}')
print(f'{input_logger["name"]} to {OUTPUT_PLUGIN} raw validator stdout: {stdout}')
print(f'{input_logger["name"]} to {OUTPUT_PLUGIN} raw validator stderr: {stderr}')
p["result"] = stdout
print(f'Test {input_logger["name"]} to {OUTPUT_PLUGIN} complete.')

Expand Down Expand Up @@ -513,7 +521,8 @@ def get_sts_boto_session():
# ARN and a role session name.
assumed_role_object = sts_client.assume_role(
RoleArn=os.environ["LOAD_TEST_CFN_ROLE_ARN"],
RoleSessionName="load-test-cfn"
RoleSessionName="load-test-cfn",
DurationSeconds=3600
)

# From the response that contains the assumed role, get the temporary
Expand Down