fix(ci): calculate parallel jobs based on infrastructure needs #1475
Issue number: #1435
Summary
This addresses the unbalanced work split across CPU Cores when coordinating CloudFormation stacks, as seen in: https://github.com/awslabs/aws-lambda-powertools-python/actions/runs/2917670742
Successful test after the change: https://github.com/awslabs/aws-lambda-powertools-python/actions/runs/2919001216
Background
The issue was that we automatically detected the number of CPU Cores and distributed jobs across them. That worked fine locally, where we ran a single interpreter. When we introduced a matrix CI with multiple interpreters, the job queue per core no longer matched - here's the scenario:
This causes the last jobs to fail for each additional interpreter: the Layer stack gets deleted, so those jobs can't finish creating their Lambda Function because the Layer no longer exists.
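One way to avoid depending on the detected core count is to size the worker pool from the number of infrastructure definitions instead. The sketch below is illustrative only and is not the script from this PR: the function names are hypothetical, it assumes each feature keeps its stack in a file named infrastructure.py with one extra file at the root of the e2e tests, and it assumes pytest-xdist is used with `-n` and `--dist loadfile`.

```python
from pathlib import Path
from typing import List


def count_infrastructure_workers(e2e_root: Path) -> int:
    """Return one worker per feature-level infrastructure.py under e2e_root."""
    definitions = list(e2e_root.rglob("infrastructure.py"))
    # Assumption: exactly one of the matches is the root e2e file, so it is
    # subtracted. Never return fewer than one worker.
    return max(len(definitions) - 1, 1)


def build_pytest_command(workers: int, test_path: str = "tests/e2e") -> List[str]:
    """Build a pytest-xdist invocation with a fixed worker count."""
    # -n sets the number of workers; --dist loadfile keeps tests from the
    # same file on the same worker (both are pytest-xdist options).
    return ["pytest", "-n", str(workers), "--dist", "loadfile", test_path]


if __name__ == "__main__":
    workers = count_infrastructure_workers(Path("tests/e2e"))
    print(" ".join(build_pytest_command(workers)))
```

With a fixed count like this, a 2-core CI runner and an 8-core laptop would schedule the same number of workers for the same test tree.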
Changes
I created parallel_run_e2e.py to calculate how many infrastructure jobs must be scheduled per worker regardless of how many CPU Cores a machine has. It uses our infrastructure.py convention - 1 (root e2e).
I've also included tox.ini should we ever run into a similar situation with multiple interpreters and need to reproduce it locally. I didn't add dependencies but added notes on how to run it if anyone ever needs it.
User experience
Checklist
If your change doesn't seem to apply, please leave them unchecked.
Is this a breaking change?
RFC issue number:
Checklist:
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.