Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for additional ARM/graviton instance types #969

Closed
wants to merge 1 commit into from

Conversation

yob
Copy link
Contributor

@yob yob commented Dec 1, 2021

Announced at re:Invent 2021:

Announced earlier in 2021 and we missed it:

While I was adding to the list I also re-orderde it to be alphabetical. Mainly so it's easier to scan and check for missing types.

Note that I haven't actually tested the stack will boot on all these instance types. I think it'll be fine, based on my assumption that the AMI we bake for the graviton instances types we already support will Just Work.

Announced at re:Invent 2021:

* c7g.* new graviton3 instances in preview [1]
* g5g.* graviton2 with nvidia GPU [2]
* Im4gn.*  graviton2 with local SSDs [3]
* Im4gen.* graviton2 with local SSDs and extra memory [3]

Announced earlier in 2021 and we missed it:

* x2gd.* graviton2 with super extra sized memory [4]

While I was adding to the list I also re-orderde it to be alphabetical.
Mainly so it's easier to scan and check for missing types.

[1] https://aws.amazon.com/blogs/aws/join-the-preview-amazon-ec2-c7g-instances-powered-by-new-aws-graviton3-processors/
[2] https://aws.amazon.com/blogs/aws/new-amazon-ec2-g5g-instances-powered-by-aws-graviton2-processors-and-nvidia-t4g-tensor-core-gpus/
[3] https://aws.amazon.com/blogs/aws/new-storage-optimized-amazon-ec2-instances-im4gn-and-is4gen-powered-by-aws-graviton2-processors/
[4] https://aws.amazon.com/blogs/aws/new-amazon-ec2-x2gd-instances-graviton2-power-for-memory-intensive-workloads/
@yob yob requested a review from keithduncan December 1, 2021 12:08
@yob
Copy link
Contributor Author

yob commented Dec 1, 2021

oh no:

Template error: every Fn::Or object requires a list of at least 2 and at most 10 boolean parameters.

@eleanorakh eleanorakh self-requested a review December 1, 2021 22:44
@eleanorakh
Copy link
Contributor

Neat! Let's chat about that build failure in our 1:1 ? 💃

@keithduncan
Copy link
Contributor

🤔 I have previously thought about whether there was a better way to detect the instance architecture in a way that doesn’t require updating this map for each new instance type

@eleanorakh
Copy link
Contributor

@keithduncan can we have a pair on that next week? 🤔

@yob yob changed the title add support for additional ARM/graaviton instance types add support for additional ARM/graviton instance types Dec 2, 2021
@yob
Copy link
Contributor Author

yob commented Dec 2, 2021

I have previously thought about whether there was a better way to detect the instance architecture in a way that doesn’t require updating this map for each new instance type

That would be great, the map is a giant hack.

@keithduncan
Copy link
Contributor

Some research ahead of looking at how we might implement this.

aws-cdk uses a regex to examine the instance family being a or the instance additional capabilities including g

The image seems to come from Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (CMP307-R1) - AWS re:Invent 2018 slide 8 and matches what the hardware specifications docs say.

@acatighera
Copy link

Is there a workaround for now?

@pda
Copy link
Member

pda commented Jan 11, 2022

@acatighera Out of interest, what instance type are you trying to launch?

A workaround at the moment would be to edit the template for your specific use-case, e.g. delete some instance types from that list that you don't use. But that's not a good solution.

@pda
Copy link
Member

pda commented Jan 11, 2022

I wonder if we can use Fn::FindInMap.
We'd still need to maintain a list of all the instance types, but it would avoid the current static limit on Fn::Or.

@toothbrush
Copy link
Contributor

toothbrush commented Jan 12, 2022

@acatighera Out of interest, what instance type are you trying to launch?

Not @acatighera but personally i'm keen to switch to c7g.* for our ARM builders.

@pda
Copy link
Member

pda commented Jan 12, 2022

By my glance, c7g looks like the most generally useful CI instance type of the new ones, and we have one slot free before reaching the limit of 10. So… I guess we can add that one :)

@pda
Copy link
Member

pda commented Jan 13, 2022

I wonder if we can use Fn::FindInMap.

Ugh I thought I had it in #980 but Fn::FindInMap frustratingly can't use a derived value as its lookup key.

@pda
Copy link
Member

pda commented Jan 25, 2022

I am loath to add an architecture parameter; partly because it is derivable from the instance type, and partly because the stack is broken if the wrong value is selected.

The only other solution I can think of is a CloudFormation Custom Resource, backed by a Lambda which returns the architecture for a given instance type. That feels like a lot of complexity for a simple problem. However, it's almost exactly the same as one of their documented use-cases:

AWS CloudFormation templates that declare an Amazon Elastic Compute Cloud (Amazon EC2) instance must also specify an Amazon Machine Image (AMI) ID, which includes an operating system and other software and configuration information used to launch the instance. The correct AMI ID depends on the instance type and region in which you're launching your stack. And IDs can change regularly, such as when an AMI is updated with software updates.

It still feels overly complex for what should be a simple regexp.

If we do pursue this option, we might want to consider also using it for our region:AMI mapping, which is currently a slightly awkward part of the build process. Then again, there's something nice about knowing that a particular stack template version references particular AMIs, and that can't change over time.

@toothbrush
Copy link
Contributor

It's super off the wall and i should probably head home but i wonder if you could hack around the 10-argument limit something like this:

Instead of:

( a || b || c || ...)

Write this?

( (a || b) || (c || d) || ...)

This might not work and is probably really ugly. I'm writing this as a reminder to myself to try... 🤔

@pda
Copy link
Member

pda commented Jan 25, 2022

Further to that, the Custom Resources page has this banner:

ℹ️ Note

The CloudFormation registry offers several advantages over custom resources, such advantages include:

  • Supports the modeling, provisioning, and managing of third-party application resources
  • Supports the Create, Read, Update, Delete, and List (CRUDL) operations
  • Supports drift detection on private and third-party resource types

Unlike custom resources, registry based resources won't need to associate an Amazon SNS topic or a Lambda function to perform CRUDL operations. For more information about the CloudFormation registry, see Using the AWS CloudFormation registry.

I haven't used the CloudFormation Registry before, but maybe we could make one of these?

Third-party public extensions
These are extensions made available for general use by publishers other than Amazon. To use a public extension, you must first activate it in your account and region.
You can publish your own third-party extensions to make them available to general CloudFormation users. For more information, see Publishing extensions in the CloudFormation Command Line Interface User Guide.

It all feels a bit janky, though. Does that introduce an extra step, where users must activate this extension prior to deploying our stack? Doesn't seem ideal.

@toothbrush
Copy link
Contributor

toothbrush commented Jan 25, 2022

It all feels a bit janky, though. Does that introduce an extra step, where users must activate this extension prior to deploying our stack? Doesn't seem ideal.

I for one wouldn't like that, but i'm a rando on the internet 😛

This is ... "creative" in the worst possible way but appears to work?

    # Unfortunately, Cloudformation's !Or intrinsic function only accepts between 2 and 10 arguments.  
    # To get around this, we're grouping the instance families in sub-conditionals.  At least this doesn't 
    # force us into using a Custom Resource.
    UsingArmInstances:
      !Or
        - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "a1" ]
        - !Or
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "c6g" ]
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "c6gd" ]
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "c6gn" ]
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "c7g" ]
        - !Or
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "m6g" ]
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "m6gd" ]
        - !Or
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "r6g" ]
          - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "r6gd" ]
        - !Equals [ !Select [ 0, !Split [ ".", !Ref InstanceType ] ], "t4g" ]

WDYT? 🤪

EDIT: I went ahead and made the example change in https://github.com/buildkite/elastic-ci-stack-for-aws/compare/master...toothbrush:paul/arm-instances?expand=1. Not the most beautiful, but gets the job done today with minimal hassle.

@pda
Copy link
Member

pda commented Jan 25, 2022

Oh that's much better :)

@toothbrush
Copy link
Contributor

Oh that's much better :)

Hackiness is in the eye of the beholder? 😉

I went ahead and made a PR: #981.

@pda pda closed this Jan 26, 2022
@pda pda deleted the new-arm-instance-types branch January 26, 2022 23:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants