feat: add failover to on-demand in case request is failing #3409

Merged: 3 commits, Nov 29, 2023

Conversation

npalm
Member

@npalm npalm commented Aug 4, 2023

Adding the option to create on-demand instances in case spot is failing.

Problem

This module supports either spot or on-demand instances. When using spot (the default), instance creation can fail for several reasons. If AWS has insufficient capacity, it makes sense to request on-demand instances instead. AWS does not support this kind of request via the fleet API. This PR addresses this problem and adds the option (opt-in) to create on-demand instances in case of insufficient capacity.
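
To illustrate the failover described above, here is a minimal TypeScript sketch. It is not the code in this PR: `createInstancesWithFailover`, `RequestFleet`, `FleetResult`, and `FailoverOptions` are hypothetical names, and the fleet request is injected as a function so the sketch runs without the AWS SDK. `InsufficientInstanceCapacity` is the EC2 error code for a capacity shortage; which error codes should trigger failover is an assumption left to the caller.

```typescript
// Hypothetical types for illustration only; not the module's actual API.
type CapacityType = 'spot' | 'on-demand';

interface FleetResult {
  instanceIds: string[]; // IDs of the instances that were created
  errorCodes: string[]; // error codes reported for the request
}

// Assumed abstraction over the real fleet request.
type RequestFleet = (capacityType: CapacityType) => Promise<FleetResult>;

interface FailoverOptions {
  enabled: boolean; // opt-in switch
  failoverErrorCodes: string[]; // errors that justify retrying with on-demand
}

async function createInstancesWithFailover(
  requestFleet: RequestFleet,
  options: FailoverOptions,
): Promise<string[]> {
  // Always try spot first.
  const spot = await requestFleet('spot');
  if (spot.instanceIds.length > 0) {
    return spot.instanceIds;
  }

  // Fail over only when enabled and the error is one we opted in for.
  const shouldFailover =
    options.enabled && spot.errorCodes.some((code) => options.failoverErrorCodes.includes(code));
  if (!shouldFailover) {
    throw new Error(`Spot request failed: ${spot.errorCodes.join(', ') || 'unknown error'}`);
  }

  // Retry the same request as on-demand.
  const onDemand = await requestFleet('on-demand');
  if (onDemand.instanceIds.length === 0) {
    throw new Error(`On-demand request failed: ${onDemand.errorCodes.join(', ') || 'unknown error'}`);
  }
  return onDemand.instanceIds;
}
```

Because the retry happens only for listed error codes, unrelated failures (for example a misconfigured launch template) still surface as errors instead of silently switching to on-demand pricing.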

Migrations

No migrations required

Opt-in

Opt in by setting the variable enable_runner_on_demand_failover

Verification

The failover itself is not easy to test. However, because of changes in the multi-runner and runner modules as well as the scale-up and pool lambdas, the following checks are required:

  • Ephemeral example with pool
  • Multi runner example

@npalm npalm marked this pull request as draft August 4, 2023 13:10
@npalm npalm marked this pull request as ready for review August 4, 2023 13:44

@m-beelman m-beelman left a comment

As far as I can tell, it all looks good.

Do you want to address the points from CodeScene?

@npalm
Member Author

npalm commented Aug 4, 2023

As far as I can tell, it all looks good.

Do you want to address the points from CodeScene?

No, and actually the PR decreases the complexity by splitting the method into functions. I think we can ignore CodeScene here.

@npalm npalm force-pushed the faeat/support-on-dmand-failback branch from 4db0434 to b3bdd0e Compare August 23, 2023 12:21
lambdas/functions/control-plane/src/aws/runners.ts (outdated, resolved)
lambdas/functions/control-plane/src/aws/runners.test.ts (outdated, resolved)
lambdas/functions/control-plane/src/aws/runners.ts (outdated, resolved)
lambdas/functions/control-plane/src/aws/runners.ts (outdated, resolved)
@npalm
Member Author

npalm commented Aug 24, 2023

@navdeepg2021 thx, review comments are addressed

@github-actions
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Sep 24, 2023
@github-actions github-actions bot closed this Oct 5, 2023
@npalm npalm reopened this Nov 4, 2023
@npalm npalm marked this pull request as draft November 4, 2023 09:13
@npalm npalm added stale:exempt and removed Stale labels Nov 4, 2023
@npalm npalm force-pushed the faeat/support-on-dmand-failback branch from 3ded640 to 3f46965 Compare November 6, 2023 12:21
@npalm npalm marked this pull request as ready for review November 6, 2023 20:22
feat: Add failover to on-demand on (spot) errors
@npalm npalm force-pushed the faeat/support-on-dmand-failback branch from 7dc7c7c to dd1b293 Compare November 26, 2023 14:22
@npalm
Member Author

npalm commented Nov 26, 2023

@GuptaNavdeep1983 @ScottGuymer please can you have a look at this PR?


it('Filter instances with status undefined, fall back to defaults.', async () => {
mockEC2Client.on(DescribeInstancesCommand).resolves(mockRunningInstances);
await listEC2Runners({ statuses: undefined });
Contributor

Why is this test required? Who is calling listEC2Runners with that status?

@npalm npalm merged commit d71e631 into main Nov 29, 2023
37 of 38 checks passed
@npalm npalm deleted the faeat/support-on-dmand-failback branch November 29, 2023 16:16
npalm pushed a commit that referenced this pull request Dec 2, 2023
🤖 I have created a release *beep* *boop*
---


## [5.5.0](v5.4.2...v5.5.0) (2023-11-30)


### Features

* add failover to on-demand in case request is failing ([#3409](#3409)) ([d71e631](d71e631))


### Bug Fixes

* add runner name prefix to context of scale-up lambda ([#3644](#3644)) ([2936edd](2936edd))
* **lambda:** bump the aws group in /lambdas with 5 updates ([#3635](#3635)) ([9615e53](9615e53))
* **lambda:** bump the octokit group in /lambdas with 1 update ([#3636](#3636)) ([876db0c](876db0c))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: forest-releaser[bot] <80285352+forest-releaser[bot]@users.noreply.github.com>
3 participants