-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add internal CLI to generate instance descriptions for CSPs #1137
Add internal CLI to generate instance descriptions for CSPs #1137
Conversation
…iles for csps Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
I want to discuss: is it a good approach to add a new |
I would rather keep it under the |
what would be the goal of keeping it under the same cli an end user would use without having a useful info message for anyone including dev to see? |
@mattahrens |
Thanks @cindyyuanjiang ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cindyyuanjiang !
Overall, this looks going in the right direction.
I may need to take a deeper look to see if we can use more implementations from the spark_rapids_tools package.
OK, then we can have a dev CLI then for separation. |
user_tools/src/spark_rapids_pytools/rapids/dev/instance_description.py
Outdated
Show resolved
Hide resolved
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cindyyuanjiang.
Should we use a class to encapsulate the instance description structure instead of using hardcoded dictionary keys? This could improve maintainability across multiple files.
{
"MemoryInfo": {
"SizeInMiB": 49152
},
"GpuInfo": {
"GPUs": [
{
"Name": "NVIDIA",
"Manufacturer": "NVIDIA",
"Count": 1,
"MemoryInfo": {}
}
]
},
"VCpuInfo": {
"DefaultVCpus": 12
}
}
Added Gpu info for n1-standard series for Dataproc. New entry looks like:
|
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently n1-standard
have array type for GpuInfo.Count
whereas it is an int for others. Do we want to keep this separate for any special handling for n1-standard
?
Example,
"n1-standard-64": {
"VCpuCount": 64,
"MemoryInMB": 245760,
"GpuInfo": [
{
"Name": "T4",
"Count": [
4
]
},
{
"Name": "P4",
"Count": [
4
]
}
]
},
vs
"g2-standard-4": {
"VCpuCount": 4,
"MemoryInMB": 16384,
"GpuInfo": [
{
"Name": "L4",
"Count": 1
}
]
},
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Thanks @parthosa! I have updated the GPU Count field to a list for consistency across instances and platforms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cindyyuanjiang !
LGTME
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cindyyuanjiang. LGTM
"GpuInfo": [ | ||
{ | ||
"Name": gpu_name, | ||
"Count": 000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment needs updated to be array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit on the comment, otherwise looks great. Thanks Cindy!
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Fixes #1123.
This PR is first step to remove dependency on CSP CLIs. Ideally after we added this CLI, we can generate instance description json files for each platform and store them as resources in tools repo. Then we can add logic to use these instance description files when running tools.
Changes
Added an internal CLI
spark_rapids_dev generate_instance_description [options]
.Options:
emr
,dataproc
anddatabricks-azure
.Databricks-aws
platform can reuse the file generated byemr
, anddataproc-gke
can reuse the one fordataproc
.The generated json file has the following format (which is inspired by EMR CLI output):
For CPU instance, the entry will not have "GpuInfo".
Example json file entry for EMR platform
Testing
spark_rapids_dev generate_instance_description --platform emr/dataproc/databricks-azure