fix: For issue 429 "Unable to deploy llama2 on Eks/Ray Serve/inf2" #430
@harishvs, thank you for your efforts in testing and creating the PR. It's great to see that some of the fixes you've identified align with those we've implemented in the Stable Diffusion model. Could you please rebase your code with the latest updates from the main branch and then resubmit your PR, particularly focusing on the
@vara-bonthu I will create a separate PR for the Gradio format changes. It is still a work in progress. For now, I will rebase this PR with a narrow focus on fixing issue 429.
@vara-bonthu I rebased on the latest main. Please review and merge.
@harishvs As of now, you can run the Llama2 inference example with Karpenter. If you want to run the model with managed node groups, you have to change the Ray deployment YAML to match the managed node group labels.
ai-ml/trainium-inferentia/eks.tf
Outdated
instanceType = "mixed-x86"
provisionerType = "Karpenter"
These labels are for Karpenter to spin up the nodes, not for the managed node groups with Cluster Autoscaler (CA). I would say remove these or update as below.
provisionerType = "ClusterAutoscaler"
If you want to deploy the Llama2 model on managed node groups with these instances, then you have to update the Ray deployment YAML with a unique label that is used only by this node group.
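As a rough illustration of the suggestion above, the labels on a managed node group might look like the fragment below. This is only a sketch: it assumes the `labels` map sits inside the managed node group definition in eks.tf, and it reuses the key names and values that appear in the diff in this thread. Whether this pair is actually unique to one node group is an assumption to verify, and the node selector in the Ray deployment YAML would need to reference the same key/value pair.

```hcl
# Sketch only: labels for a managed node group scaled by Cluster Autoscaler.
# Keys and values are copied from the diff in this review thread; confirm
# that the combination is used by this node group alone before relying on it.
labels = {
  instanceType    = "inf2-48xl"
  provisionerType = "cluster-autoscaler"
}
```

Keeping the Karpenter labels off the managed node groups avoids pods intended for Karpenter-provisioned nodes being scheduled onto them.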
OK, addressed.
ai-ml/trainium-inferentia/eks.tf
Outdated
instanceType = "inf2-24xl"
provisionerType = "cluster-autoscaler"
instanceType = "inferentia-inf2"
provisionerType = "Karpenter"
Same as the comment above.
OK, addressed.
ai-ml/trainium-inferentia/eks.tf
Outdated
instanceType = "inf2-48xl"
provisionerType = "cluster-autoscaler"
instanceType = "inferentia-inf2"
provisionerType = "Karpenter"
Same as the comment above.
OK, addressed.
What does this PR do?
This is a fix for issue 429 - "Unable to deploy llama2 on Eks/Ray Serve/inf2"
🛑 Please open an issue first to discuss any significant work and flesh out details/direction - we would hate for your time to be wasted.
Consult the CONTRIBUTING guide for submitting pull-requests.
Motivation
I could not complete the tutorial as it is outlined here -> https://awslabs.github.io/data-on-eks/docs/gen-ai/inference/Llama2
More
website/docs or website/blog section for this feature
pre-commit run -a with this PR. Link for installing pre-commit locally
For Moderators
Additional Notes
I don't know if I broke any other workflow!