Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Adding capacityType discount as a percentage #4697

Closed

Conversation

rkirk-r7
Copy link

Fixes #N/A

Description
Reimplementation of this PR: #3896

Added code to allow a custom multiplier to be applied to on-demand and spot pricing.
This allows you to add a multiplier to prices that will be taken into consideration on which instance to use.
This take's the same approach as autospotting.

For Enterprise customers, instance prices can come from the Payer account which is not accessible outside of this account - this leads to a problem as we cannot get the accurate price data in AWS accounts running the Kubernetes clusters where it may be that running on-demand instance types would be cheaper than spot instance types.

When Karpenter retrieves the up-to-date pricing from AWS, the multiplier will be applied to the returned data.
The instance getCapacityType function has been updated so that it will be price aware - only if
offering.CapacityType == corev1beta1.CapacityTypeSpot && offering.Price == cheapestOffering will the capacityType be set to Spot

How was this change tested?

  • Tests written around the settings change
  • Tests written around the multiplier being applied
  • Tested in a Kubernetes cluster:
    • Verify multiplier set to default value does not modify prices and karpenter launched expected capacity
    • Verify when multiplier set prices were correctly modified and karpenter launched expected capacity
    • Verify when multiplier set prices were correctly modified and karpenter launched on-demand capacity type when cheaper than spot pricing
    • Verify spot instances are consolidated to cheaper on-demand
    • Verify on-demand instances are consolidated to cheaper spot

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@rkirk-r7 rkirk-r7 requested a review from a team as a code owner September 26, 2023 13:27
@rkirk-r7 rkirk-r7 requested a review from bwagner5 September 26, 2023 13:27
@netlify
Copy link

netlify bot commented Sep 26, 2023

Deploy Preview for karpenter-docs-prod ready!

Name Link
🔨 Latest commit fabd0e0
🔍 Latest deploy log https://app.netlify.com/sites/karpenter-docs-prod/deploys/657af43806731a000848220e
😎 Deploy Preview https://deploy-preview-4697--karpenter-docs-prod.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@rkirk-r7
Copy link
Author

When testing consolidation to verify karpenter would consolidate more expensive spot instance types to on-demand instance types we noticed that the following event log message was output

Normal   Unconsolidatable         6s                 karpenter              Can't replace with a cheaper node

Which we believe is from this https://github.com/aws/karpenter-core/blob/cbc03e98ef1457bf4c7e4397a67cc892fa4719b1/pkg/controllers/deprovisioning/helpers.go#L226-L252.
This appears to only consider the spot pricing if the provisioner supports both capacity types, rather than the cheaper price for on-demand, which this change would allow us to launch as a replacement node.

Looking for some advice on best way to proceed here, should we be submitting a additional PR to change the pricing decision making in the deprovisioning module too?
Is there any other areas of the code which we should also be reviewing/checking that might prioritise spot instances rather than the cheapest instance type?

@njtran
Copy link
Contributor

njtran commented Sep 26, 2023

Thanks @rkirk-r7, can you come to working group so we can review the design? Ideally it would be nice to have reviewed the design before you started working on the implementation so that we could align on the details/direction of this PR.

@rkirk-r7
Copy link
Author

@njtran Yes of course, do I just need to attend one of the regular meetings listed here https://karpenter.sh/docs/contributing/working-group ?

@njtran
Copy link
Contributor

njtran commented Oct 3, 2023

Hey @rkirk-r7, checking in here. Trying to remember if you have a path forward here for your issue? Was there a bug?

@rkirk-r7
Copy link
Author

rkirk-r7 commented Oct 4, 2023

@njtran Hey, sorry for being a bit tardy have just put a thread into the slack channel to detail the issues we had run into hopefully more clearly than on the call. Since the working group call we also went back and revisited the change and also put an alternative idea into that thread as well that we think would be a much less invasive change but will wait on some feedback about that approach before updating this PR


## Overview

Karpenter is currently unaware of any discounted pricing, such as volume discounts or reserved instances/savings plans, which can lead to more expensive instances being chosen. For example an on demand instance with a savings plan discount may cost less than a spot instance of the same type. This pricing is apparently only available from the "payer" account, not any other child account API's for pricing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Karpenter is currently unaware of any discounted pricing, such as volume discounts or reserved instances/savings plans, which can lead to more expensive instances being chosen. For example an on demand instance with a savings plan discount may cost less than a spot instance of the same type. This pricing is apparently only available from the "payer" account, not any other child account API's for pricing.
Karpenter is currently unaware of any discounted pricing, such as volume discounts or reserved instances/savings plans, which can lead to more expensive instances being chosen. For example an on demand instance with a savings plan discount may cost less than a spot instance of the same type. On AWS, this pricing is apparently only available from the "payer" account, not any other child account API's for pricing.

💭 could Karpenter learn to assume an IAM role into the AWS account where discounts are visible?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If something like that was implemented it could resolve the issue of up-to-date govcloud pricing in AWS as well, as it could assume an IAM role in the commercial account
#2706 (comment)


### Spot pricing

The Spot price will be multiplied with the SpotPriceMultiplier to determine the real cost for Spot instances.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The Spot price will be multiplied with the SpotPriceMultiplier to determine the real cost for Spot instances.
The Spot price will be multiplied with the `aws.spotPriceMultiplier` to determine the real cost for Spot instances.

Copy link
Contributor

@ellistarn ellistarn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in the latest working group, I think we want to pursue an instance type overlay approach.

@github-actions
Copy link
Contributor

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

@njtran
Copy link
Contributor

njtran commented Nov 8, 2023

@rkirk-r7, if I remember correctly, we discussed the instance overlay approach rather than the spot discount factor as this PR explores, and cutting a bug issue/fix related to pricing if you were able to. I'm noticing that this PR is close to being staled out as I don't think this was the agreed upon direction. Did you want to discuss more here? Or is this good to close?

@rkirk-r7
Copy link
Author

@njtran I can update this as I split out some defects this was trying to solve as well as the new feature (#4923 / #4924). So does need cleaned up now. We were asked to follow up with our TAM in last working group we attended and after meeting with EKS team there did seem to be some interest in the multiplier approach so I am waiting to hear more from that so would like to keep open whilst those discussions are on going if thats okay?

@njtran
Copy link
Contributor

njtran commented Nov 13, 2023

there did seem to be some interest in the multiplier approach

Are you talking about from other users? Would be great if we could track these conversations somewhere. Maybe #3860 ?

Copy link
Contributor

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

@rkirk-r7
Copy link
Author

rkirk-r7 commented Dec 14, 2023

there did seem to be some interest in the multiplier approach

Are you talking about from other users? Would be great if we could track these conversations somewhere. Maybe #3860 ?

Sorry for late reply, stuck with day job work. It was call with our TAM which we were advised to setup from the last working group call I attended.

The person who raised #3860 is a colleague of mine so should cover the issues we are seeing. I'll get someone to post an update to that thread as well.

…ounts to apply them via price modifications.

Co-authored-by: Ryan Williams <[email protected]>
Co-authored-by: Adam Robinson <[email protected]>
@rkirk-r7 rkirk-r7 force-pushed the adding-discount-as-percentage-design branch from 6f6cbeb to fabd0e0 Compare December 14, 2023 12:25
@rkirk-r7
Copy link
Author

Have removed the changes related to launching as #4924 has been raised instead. Will need to revisit to update for the v1beta1 changes I had not realised the global settings had been changed so much

Copy link
Contributor

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants