-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix invalid values in cluster creation script #935
Conversation
Signed-off-by: Partho Sarthi <[email protected]>
Signed-off-by: Partho Sarthi <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @parthosa ! LGTM.
@@ -165,6 +165,9 @@ def get_matching_executor_instance(self, cores_per_executor): | |||
def generate_cluster_configuration(self, render_args: dict): | |||
executor_names = ','.join([f'"test-node-e{i}"' for i in range(render_args['NUM_EXECUTOR_NODES'])]) | |||
render_args['EXECUTOR_NAMES'] = f'[{executor_names}]' | |||
image_version = self.configs.get_value('clusterInference', 'defaultImage') | |||
render_args['IMAGE'] = f'"{image_version}"' | |||
render_args['ZONE'] = f'"{self.cli.get_zone()}"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The zone should be picked up at the beginning of the execution.
I remember it was part of the getting the platform information during initialization.
My worry in this part that we are going to rely on the SDK CLI being installed which get us more restricted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we set env_vars
dict during platform setup from the SDK config file. However, zone
is not available in AWS' config file. Only region
is available.
Now that makes me think, if we need zone
in the cluster creation script for EMR? Why dont we use region
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it depends on what are the accepted keys under Ec2InstanceAttributes
If region is not one of the attributes, then we cannot use region there.
I don't remember exactly why did I have the availabilityZone part of the script. MAybe I tried to create a cluster and it required me to set the availability zone..I cannot really remember. It has been long time since I wrote that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am ok with the PR.
Thanks @parthosa
@@ -165,6 +165,9 @@ def get_matching_executor_instance(self, cores_per_executor): | |||
def generate_cluster_configuration(self, render_args: dict): | |||
executor_names = ','.join([f'"test-node-e{i}"' for i in range(render_args['NUM_EXECUTOR_NODES'])]) | |||
render_args['EXECUTOR_NAMES'] = f'[{executor_names}]' | |||
image_version = self.configs.get_value('clusterInference', 'defaultImage') | |||
render_args['IMAGE'] = f'"{image_version}"' | |||
render_args['ZONE'] = f'"{self.cli.get_zone()}"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it depends on what are the accepted keys under Ec2InstanceAttributes
If region is not one of the attributes, then we cannot use region there.
I don't remember exactly why did I have the availabilityZone part of the script. MAybe I tried to create a cluster and it required me to set the availability zone..I cannot really remember. It has been long time since I wrote that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @parthosa!
Fixes #934. This PR fixes the invalid values of zone and image version for dataproc and emr cluster creation scripts.
Changes:
Image Version:
image version
in mustache template of dataproc and emr.image version
in config files.Zone:
{{ ZONE }}-a
from template file.aws cli
and select the first one.Testing
CMD:
Manually tested on Dataproc and EMR.