Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Skip 'hugepages' resources on ceiling #206

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions pkg/utils/resources/resources.go
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,9 @@ func Merge(resources ...v1.ResourceList) v1.ResourceList {
result := make(v1.ResourceList, len(resources[0]))
for _, resourceList := range resources {
for resourceName, quantity := range resourceList {
if resourceName == "hugepages-2Mi" || resourceName == "hugepages-1Gi" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a similar ask here but we had to reject the idea because ignoring the resource means that we can't know the best instance type to launch your workload pod on. You can imagine that we launch an instance type which doesn't actually support the resource, or we schedule a node that doesn't have enough of the resource to support your pod. Ideally, we would be able to return the exact resources each InstanceType has through the GetInstanceTypes API

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathan-innis Thanks for the review.
Regarding -

You can imagine that we launch an instance type which doesn't actually support the resource

That's what I tried to describe above, such resource as hugepages isn't a HW based (instance type based) resource, but a SW based resource, i.e. you'll never know if your node/instance type supports the resource.

Anyway, how that behavior can be achieved? We want to migrate to Karpenter but we have to support 'hugepages' resource.

Thank you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a SW-based component, but I believe that it scales based on the size of the instance, which means it's difficult to achieve a fixed sizing that we could pass through. Is this something that should be consistent on your kernel across all instance types or just the few that you've associated with a given image and NodeTemplate?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second option.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathan-innis, since its a specific request for a specific instance type (DL1) which is always needed (there is no point running habanlabs sw pod on DL1 without hugepages - it will not work), What if we add this condition/ignore case only for DL1 instance type will that be, Ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to assume hugepages to be some large number in instance type provider? Is this something that we should specify in the AWSNodeTemplate or Provisioner.spec.resources?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ellistarn generally speaking - Provisioner.spec.resources

continue
}
current := result[resourceName]
current.Add(quantity)
result[resourceName] = current
Expand Down Expand Up @@ -93,6 +96,9 @@ func MaxResources(resources ...v1.ResourceList) v1.ResourceList {
resourceList := v1.ResourceList{}
for _, resource := range resources {
for resourceName, quantity := range resource {
if resourceName == "hugepages-2Mi" || resourceName == "hugepages-1Gi" {
continue
}
if value, ok := resourceList[resourceName]; !ok || quantity.Cmp(value) > 0 {
resourceList[resourceName] = quantity
}
Expand Down